Last month we requested a report from the IT department. We expected nice charts: averages, maximum and minimum requests, peak hours, days, weeks and months. What we got was a nice 65 MB RTF file with a copy and paste of the Apache access.log.
As professors, my colleagues were outraged, to say the least: “How are we supposed to make any sense of this data?” Well, I just took a copy on my USB stick, since we had been handed a whole CD with the info.
Since in 2006 I maintained a political discussion site with really heavy traffic, this sort of analysis was not unknown to me. I know there are better tools for the job, but one that is fast, great and usable straight from the CLI is webalizer. So I checked that it was still installed on my system, re-read the help for some of the switches, and after about four attempts ended up firing off something like:
webalizer Bigreport.log -v -i -j -b -T -n http://fu.bar/ -D.cache -N 5 -o tmp/
As you may notice, I first saved the RTF file as plain text, just to be on the safe side.
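However the conversion is done, it may be worth checking that the result really looks like an access log before handing it to webalizer. Here is a minimal sketch of such a check. The function name `looks_like_access_log` is made up for this example, and the pattern assumes the log is in Apache's common or combined format, which the IT department's file may or may not follow exactly.

```shell
# Hypothetical helper: report how many lines of a file look like
# Apache common/combined log entries. Assumes the usual layout:
#   host ident user [timestamp] "request" status bytes
looks_like_access_log() {
    file=$1
    # Total number of lines in the file.
    total=$(awk 'END { print NR }' "$file")
    # Lines matching the assumed layout; grep -c exits non-zero when
    # nothing matches, so "|| true" keeps the count of 0 without failing.
    matched=$(grep -Ec '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} ([0-9]+|-)' "$file" || true)
    echo "$matched of $total lines look like access-log entries"
}
```

Something like `looks_like_access_log Bigreport.log` should then report that nearly every line matches. If only a few do, the file probably still carries RTF markup and needs another pass.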
And voilà! We have better insight from the data… but next time, I am just going to flat out ask for a full backup of the site, DB and logs.