Log file analysis

Log file analysis is important. It shows which content interests your visitors most, and even which pages they tried to reach that you deleted by accident. Hosting a website without those statistics is a bit like driving a car without mirrors: it works, up to a point. Here I compare the tools I have used for the job.


One of the coolest website log viewers is goaccess. Its documentation is currently a little incomplete, but I got it working with Nginx using this goaccess configuration:

time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h - %^ [%d:%t %^] %^ "%m %U %H" %s %b "%R" "%u"

Note: the documentation does not mention that the log-format must use either %U or %r, never both.

On the Nginx side, this is the matching log configuration:

log_format    main    '$remote_addr - $remote_user [$time_local] $upstream_cache_status '
                      '"$request_method $scheme://$server_name$uri $server_protocol" $status $body_bytes_sent "$http_referer" "$http_user_agent"';
access_log    /var/log/nginx-access.log main;

This stays close to the default Nginx log file format, but the logged request URL is not quite right: it is rebuilt from $scheme, $server_name and $uri instead of being the request line the client actually sent. The advantage is that you can see which address (virtual host) each request was made to, whereas with the default config you have no chance of finding the virtual host behind a 404 error.

Figure: log analysis with goaccess
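To illustrate why the virtual host in the request URL is useful, here is a minimal Python sketch that counts 404 errors per virtual host from lines written in the log format above. It is not part of goaccess. The names `LINE_RE`, `not_found_by_host` and `SAMPLE_LINES` are made up for this example, the sample lines are invented, and the pattern assumes that URLs contain no spaces.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Pattern for the custom "main" log_format shown above (an assumption:
# fields are separated by single spaces and the URL contains no spaces).
LINE_RE = re.compile(
    r'(?P<addr>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] (?P<cache>\S+) '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def not_found_by_host(lines):
    """Count 404 responses per virtual host; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None or match.group("status") != "404":
            continue
        # The host is available because the log format writes a full URL.
        host = urlsplit(match.group("url")).hostname or "-"
        counts[host] += 1
    return counts


# Invented sample lines, only for demonstration.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0200] MISS "GET http://example.org/old-page HTTP/1.1" 404 162 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2023:13:56:01 +0200] - "GET http://blog.example.org/gone HTTP/1.1" 404 162 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2023:13:56:12 +0200] HIT "GET http://example.org/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:57:40 +0200] MISS "GET http://example.org/missing.png HTTP/1.1" 404 162 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for host, hits in not_found_by_host(SAMPLE_LINES).most_common():
        print(host, hits)
```

With the default Nginx format the request field holds only the path, so a grouping like this would not be possible without adding the host to the log some other way.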

awstats

This was the most eye-catching one before Web 2.0 arrived, and it is still very usable. On the other hand, it is comparatively slow.

webalizer

This is the default log file viewer used by many web hosting providers. It is very fast, produces informative results and is very easy to set up.