AWStats logfile analyzer 6.95 Documentation

 


Log analyzer comparisons



Comparison between AWStats and other famous statistics tools

Features/Software | AWStats | Analog | Webalizer | Sawmill Analytics
Version - Date | 6.95 - October 2009 | 6.0 - December 2004 | 2.01-10 - April 2002 | 7.2.15 - May 2008
Language | Perl | C | C | C++/Salang
Available on all platforms | Yes | Yes | Yes | Yes
Readable sources available | Yes | Yes | Yes | No (obfuscated sources for compilation only)
Price/Licence | Free/GPL | Free/GPL | Free/GPL | From $99 Per Profile; Lite/Pro/Ent
Works with Apache combined (XLF/ELF) | Yes | Yes | Yes | Yes
Works with Apache common (CLF) log format | All features available with log format (b) | All features available with log format (b) | All features available with log format (b) | All features available with log format (b)
Works with IIS (W3C) log format | Yes | Yes | Need a patch | Yes
Works with personalized log format | Yes | Yes | No | Yes
Analyze Web/Ftp/Mail log files | Yes/Yes/Yes | Yes/No/No | Yes/No/No | Yes/Yes/Yes
Update of statistics from | Command line (CLI) and/or a browser (CGI) | Command line (CLI) and/or a browser (CGI) | Command line | Command line (CLI) and/or a browser (CGI)
Scheduler | External (crontab, windows task manager) | External (crontab, windows task manager) | External (crontab, windows task manager) | Built-in
Internal reverse DNS lookup | Yes | Yes | Yes | Yes
DNS cache file | Static and dynamic | Static or dynamic | Static or dynamic | Yes (per update, or custom)
Process logs split by load balancing systems | Yes | Yes | No | Yes
Report number of "human" visits | Yes | No | Yes | Yes (Sessions)
Report unique "human" visitors | Yes | No | No | Yes (Visitors)
Report session duration | Yes | No | No | Yes
Not ordered records tolerance and reorder for visits | Yes | Visits not supported | No | Yes
Statistics for visits are based on | Pages ***** | Not supported | Pages ***** | Pages *****
Statistics for unique visitors are based on | Pages ***** | Not supported | Not supported | Client IP / Cookie / Custom *****
Report countries | From IP location or domain name | Domain name | Domain name | From IP location or domain name
Report regions (US and Canada states) | Need Maxmind Regions database | No | No | Yes, GeoLite City included
Report cities and major countries regions | Need Maxmind Cities database | No | No | Yes, GeoLite City included
Report ISP | Need Maxmind ISP database | No | No | Need Maxmind ISP database
Report Organizations name | Need Maxmind Org database | No | No | Need Maxmind Org database
Report hosts | Yes | Yes | Yes | Yes
Report WhoIs information on hosts | Yes | No | No | No
Report authenticated users | Yes | Yes | No | Yes
Report/Filter robots (nb detected) | Yes/Yes (642**) | Yes/Yes (8**) | No/No | Yes/Yes (250**)
Report/Filter worms (nb of families detected) | Yes/Yes (5) | No/No | No/No | Yes/Yes (4)
Report rush hours | Yes | Yes | Yes | Yes
Report days of week | Yes | Yes | Yes | Yes
Report most often viewed pages | Yes | Yes | Yes | Yes
Report entry pages | Yes | No | Yes | Yes
Report exit pages | Yes | No | Yes | Yes
Not ordered records tolerance and reorder for entry/exit pages | Yes | Entry/Exit not supported | No | Yes
Detection of CGI pages as pages (and not just hits) | Yes | Only if prog ends by a defined value | Only if prog ends by a defined value | Yes
Report pages by directory | No | Yes | No | Yes
Report pages with last access time/average size | Yes/Yes | Yes/No | No/No | Yes/No
Dynamic filter on hosts/pages/referers report | Yes/Yes/Yes | No/No/No | No/No/No | Yes/Yes/Yes
Report web compression statistics (mod_gzip, mod_deflate) | Yes | No | No | ?
Report file types | Yes | Yes | No | Yes
Report by file size | No | Yes | No | Yes
Report OS (nb detected) | Yes (71) | Yes (29) | No (0) | Yes
Report browsers (nb detected) | Yes (208*) | Yes (9*) | Yes (4*) | Yes (~20*)
Report details of browsers versions | Major and minor versions | Major versions by default, minor with SUBBROW option | Major and minor versions | Major and minor versions
Report screen sizes | Yes | No | No | Yes & Depths
Report tech supported by browser for Java/Flash/PDF | Yes/Yes/Yes | No/No/No | No/No/No | No/No/No
Report audio format supported by browser for Real/QuickTime/Mediaplayer | Yes/Yes/Yes | No/No/No | No/No/No | No/No/No
Report search engines used (nb detected) | Yes (228***) | Yes (24) | No (0) | Yes (67***)
Report keywords/keyphrases used on search engines (nb detected) | Yes/Yes (118***) | Yes/No (29***) | No/Yes (14***) | Yes/Yes (67***)
Report external referring web page with/without query | Yes/Yes | No/No | No/Yes | Yes/Yes
Report HTTP Errors | Yes | Yes | Yes | Yes
Report 404 Errors | Nb + List last date/referer | Nb only | Nb only | Nb + List last date/referer
Report 'Add to favorites' statistics | Yes | No | No | No
Other personalized reports for miscellaneous/marketing purpose | Yes | No | No | Yes
Daily statistics | Yes | Yes | Yes | Yes
Weekly statistics | No | No | No | Yes
Monthly statistics | Yes | Yes | Yes | Yes
Yearly statistics | Yes | Yes | Yes | Yes
Custom date range | No | No | No | Yes
Benchmark with no DNS lookup in lines/seconds (full features enabled, with XLF format, cygwin Perl 5.8, Athlon 1Ghz) | 5200**** | 39000**** | 12000**** | Not calculated
Benchmark with DNS lookup in lines/seconds (full features enabled, with XLF format, cygwin Perl 5.8, Athlon 1Ghz) | 80**** | 80**** | 80**** | Not calculated
Analyzed data save format (to use with third tools) | Structured text file or XML | Text files with OUTPUT option | Flat text file | Flat text file/MySQL
Export statistics to PDF | Experimental | No | No | No; HTML (static/email) & CSV
Graphical statistics in one page / several / or frames | Yes/Yes/Yes | Yes/No/No | Yes/Yes/No | Yes/Yes/Yes

* This number is not really the number of browsers detected. All browsers (known and unknown) can be detected by products that support user agent listing (AWStats, Analog, Webalizer, Sawmill). The 'browser detection feature' number is the number of known browsers for which different versions/IDs of the same browser are grouped by default under one browser name.

** AWStats can detect robot visits: all of the most common robots are detected, and the list is in robotslist.txt (250Kb). Products that cannot do this give you false information, above all if your site has few visitors. For example, if your site was submitted to all the famous search engines, robots can make 500 visits a month to find updates or to check whether your site is still online. So, if you have only 2000 human visits a month, products with no robot detection capability will report 2500 visits (a 25% error!). AWStats will report 500 visits from robots and 2000 visits from human visitors. Sawmill Analytics uses a "currently active" list of robots based on the robotstxt.org database.
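
The arithmetic behind the 25% figure can be sketched as follows. This is only an illustration of the example numbers in this note; the function name and its parameters are hypothetical and do not correspond to anything in AWStats.

```python
def reported_visits(human_visits, robot_visits, filters_robots):
    """Visits a tool would report, depending on whether it filters out robots."""
    return human_visits if filters_robots else human_visits + robot_visits

human, robots = 2000, 500  # figures taken from the example above
naive = reported_visits(human, robots, filters_robots=False)
filtered = reported_visits(human, robots, filters_robots=True)

# Relative overcount of a tool without robot detection.
error_pct = 100 * (naive - filtered) / filtered
print(naive, filtered, error_pct)  # 2500 2000 25.0
```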

*** AWStats has URL syntax rules for the most popular search engines (that's the 'number detected'). Those rules are updated with AWStats updates. AWStats also has an algorithm to detect keywords from unknown search engines with unknown URL syntax rules. Sawmill uses unique syntax to detect 67 search engines, and you can add any number of custom search engines.
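
As a rough illustration of what "URL syntax rules" plus a fallback for unknown engines could look like, here is a minimal Python sketch. It is not the AWStats algorithm: the rule table, the fallback parameter names and the function name are all assumptions made for this example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical rule table: host fragment -> query parameter holding the keyphrase.
KNOWN_ENGINES = {"google.": "q", "yahoo.": "p"}
# Hypothetical parameter names tried when the engine is not in the table.
FALLBACK_PARAMS = ("q", "query", "search", "keywords")

def keyphrase_from_referer(referer):
    """Return the search keyphrase found in a referer URL, or None."""
    parts = urlsplit(referer)
    query = parse_qs(parts.query)  # decodes '+' and %xx escapes
    host = parts.netloc.lower()
    for fragment, param in KNOWN_ENGINES.items():
        if fragment in host and param in query:
            return query[param][0]
    for param in FALLBACK_PARAMS:  # guess for unknown engines
        if param in query:
            return query[param][0]
    return None

print(keyphrase_from_referer("http://www.google.com/search?q=log+analyzer&hl=en"))
print(keyphrase_from_referer("http://search.example.org/find?query=awstats"))
```

The first call uses a known rule; the second falls back to guessing the parameter name, which is the general idea behind detecting keywords for engines that have no explicit rule.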

**** Most log analyzers have poor (or no) robot, search engine, OS or browser detection capabilities and fewer features (no or poor visit counts, no filter rules, etc.).
It is not possible to add all AWStats features to other log analyzers, so don't forget that benchmark results are for 'different features'. For this benchmark, I only completed the Webalizer and Analog robot and search engine databases with part of the AWStats database. So Webalizer config file was completed with this file, Analog config file was completed with this file. Note that without this very light addition (using the default conf file), Webalizer is 3 times faster and Analog is 15% faster.
Benchmark was made on a combined (XLF/CLF) log record on an Athlon 1GHz.
You must keep in mind that all these times are without reverse DNS lookup. DNS lookup speed depends on your system, network and Internet connection, not on the log analyzer you use. For this reason, DNS lookup is disabled in all log analyzer benchmarks. Don't forget that DNS lookup is 95% (even with a lookup cache) of the time used by a log analyzer, so if your hosts are not already resolved in the log file and DNS lookup is enabled, the total processing time will be nearly the same whatever the speed of the log analyzer.
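
The claim that DNS lookup dominates the run time can be checked with a rough calculation from the benchmark figures in the table. The sketch below assumes a simple additive model (time per line = parsing time + a fixed DNS cost) and derives that DNS cost from the AWStats figures; both the model and the variable names are assumptions for illustration.

```python
# Rates from the table, in lines/second without DNS lookup.
no_dns_rate = {"AWStats": 5200, "Analog": 39000, "Webalizer": 12000}

# Assumption: the per-line DNS cost is the same for every tool. It is derived
# here from the AWStats figures (80 lines/s with lookup, 5200 without).
dns_seconds_per_line = 1 / 80 - 1 / 5200

def rate_with_dns(lines_per_second):
    """Throughput once the fixed DNS cost is added to each line."""
    return 1 / (1 / lines_per_second + dns_seconds_per_line)

with_dns = {tool: round(rate_with_dns(r), 1) for tool, r in no_dns_rate.items()}
dns_share = dns_seconds_per_line / (1 / 80)  # share of AWStats time spent on DNS

print(with_dns)             # every tool ends up close to 80 lines/s
print(round(dns_share, 3))  # about 0.985 under this model
```

Under this model a tool that parses 7.5 times faster gains about one line per second once lookups are enabled, which is consistent with the identical "80" figures reported in the table.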

***** Some visitors use a lot of proxy servers to surf (e.g. AOL users). This means that several hosts (with several IP addresses) may be used to reach your site for only one visitor (e.g. one proxy server downloads the page and 2 other servers download all the images). Because of this, if unique visitor stats are based on "Hits", 3 users are reported, which is wrong. So AWStats considers only HTML pages when counting unique visitors. This decreases the error (though not totally, because it's always possible that one proxy server downloads one HTML frame and another downloads another frame).
Sawmill Analytics allows you to choose what you define as a visitor: by default the client IP is used, but you can use a cookie (persistent or session), any custom string, or a combination of strings from the log data.
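
The proxy example can be reproduced with a few toy records. This is a minimal sketch, not AWStats code: the host names, the record layout and the rule for what counts as a "page" (`PAGE_SUFFIXES`) are hypothetical.

```python
# Toy records: (host, requested path). One human visitor behind three proxies.
records = [
    ("proxy1.example.net", "/index.html"),
    ("proxy2.example.net", "/logo.png"),
    ("proxy3.example.net", "/banner.gif"),
]

PAGE_SUFFIXES = (".html", ".htm", "/")  # hypothetical definition of a "page"

def unique_hosts(records, pages_only):
    """Distinct hosts, counting either every hit or only page requests."""
    return {host for host, path in records
            if not pages_only or path.endswith(PAGE_SUFFIXES)}

by_hits = len(unique_hosts(records, pages_only=False))
by_pages = len(unique_hosts(records, pages_only=True))
print(by_hits, by_pages)  # 3 1
```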

(a) Data were provided by Sawmill company (Graham Smith).

(b) With this log format, there is no user agent information in the log file, so some reports are broken. For example, it's not possible to produce reports on browsers or OS (the information is not stored in the log file). To solve this, use another log format (such as the combined format).
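
The difference between the two formats is easy to see when parsing a line. The sketch below uses a simplified regular expression written for this example (it does not handle escaped quotes inside fields) and made-up sample lines; it is not how any of the compared tools parse logs.

```python
import re

# Common (CLF) fields, optionally followed by the two extra combined fields.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)

def user_agent(line):
    """Return the user agent field, or None when the line does not carry one."""
    match = LOG_LINE.match(line)
    return match.group("agent") if match else None

common = '192.0.2.1 - - [10/Oct/2009:13:55:36 +0200] "GET /index.html HTTP/1.0" 200 2326'
combined = common + ' "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98)"'

print(user_agent(common))    # None: nothing to build browser/OS reports from
print(user_agent(combined))  # Mozilla/4.08 [en] (Win98)
```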