The ROI Revolution Blog
Collecting Web Data: A Look at Web Analytics Methodology
May 1, 2006
A few months back, I posted briefly on Script-Based Versus Log-Based Tracking, discussing the differences between various web analytics data collection methods. With more and more questions cropping up about reporting discrepancies between the two types, I felt the time was right to revisit the topic and put some key concerns to rest.
Logfile Analysis, the older of the two methods, simply counts the hits recorded in the web server logs and stores the data in an easily readable, easily manageable format. This method is based on server-side data collection; nothing is stored on the visitor's computer, and nothing runs in their browser.
In the late 1990s, search engine spiders were increasingly present on the web and made a considerable impact on the logfiles of the sites they crawled. Together with web proxies, the popularity of consumer Internet service (and the subsequent rise in dynamic IP assignment), and browser caching, this made it apparent that logfile analysis needed a breath of fresh air. Supplementing logfile analysis with cookie tracking and robot exclude lists helped to solve some of the problem, but a second method was already being developed.
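To make the idea of "counting hits in the server logs" concrete, here is a minimal sketch in Python. It assumes the logs are in the Common Log Format and that only successful (2xx) requests should count; the sample lines, IP addresses, and the `count_hits` helper are made up for illustration and are not taken from any particular analytics package.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_hits(log_lines):
    """Count successful (2xx) requests per path from Common Log Format lines."""
    hits = Counter()
    for line in log_lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip malformed lines rather than guessing at their meaning
        if match.group("status").startswith("2"):
            hits[match.group("path")] += 1
    return hits


# Illustrative sample data (assumption: not from a real server).
sample_log = [
    '203.0.113.5 - - [01/May/2006:16:45:00 -0400] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.5 - - [01/May/2006:16:45:02 -0400] "GET /logo.gif HTTP/1.1" 200 2048',
    '198.51.100.7 - - [01/May/2006:16:46:10 -0400] "GET /index.html HTTP/1.1" 200 5120',
    '198.51.100.7 - - [01/May/2006:16:46:30 -0400] "GET /missing.html HTTP/1.1" 404 312',
]

hits = count_hits(sample_log)
for path, count in hits.most_common():
    print(path, count)
```

Note that the image request counts as a hit just like the page request does, which is one reason raw hit counts tend to overstate page views.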
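A robot exclude list can be as simple as a set of User-Agent substrings that are filtered out before counting. The sketch below is only an illustration of that idea: the signature list is a short, hypothetical sample, and real exclude lists are much longer and maintained over time.

```python
# Hypothetical exclude list: lowercase substrings matched against the User-Agent.
ROBOT_SIGNATURES = ("googlebot", "slurp", "msnbot", "crawler", "spider")


def is_robot(user_agent):
    """Return True if the User-Agent matches any entry on the exclude list."""
    agent = user_agent.lower()
    return any(signature in agent for signature in ROBOT_SIGNATURES)


def split_traffic(requests):
    """Split (path, user_agent) pairs into human and robot request paths."""
    human_paths, robot_paths = [], []
    for path, user_agent in requests:
        if is_robot(user_agent):
            robot_paths.append(path)
        else:
            human_paths.append(path)
    return human_paths, robot_paths


# Illustrative requests (assumption: sample data, not real traffic).
sample_requests = [
    ("/index.html", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
    ("/index.html", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("/products.html", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20060111 Firefox/1.5"),
    ("/robots.txt", "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"),
]

human_paths, robot_paths = split_traffic(sample_requests)
print("human:", human_paths)
print("robot:", robot_paths)
```

A filter like this only catches robots that identify themselves, which is part of why it solved "some of the problem" rather than all of it.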
Page Tagging was meant to solve many of the accuracy concerns that had arisen with logfile analysis. You probably remember the iconic web counters from the mid-90s. These were some of the first examples of client-side web traffic analysis. Eventually, this method evolved into what it is today: script-based data collection which assigns a cookie to each user, analyzes their behavior on the website, and then processes the data remotely.
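The cookie step is the heart of page tagging: the collector recognizes a returning visitor by an ID cookie and issues one on the first visit. Below is a rough sketch of that logic using Python's standard `http.cookies` module. The cookie name `visitor_id`, the one-year lifetime, and the in-memory `store` are all assumptions for illustration; this is not how Google Analytics or any specific vendor implements it.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name
ONE_YEAR_SECONDS = 60 * 60 * 24 * 365  # assumed cookie lifetime


def identify_visitor(cookie_header):
    """Return (visitor_id, set_cookie_value); set_cookie_value is None for returning visitors."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    if COOKIE_NAME in cookies:
        return cookies[COOKIE_NAME].value, None

    # First visit: mint a new ID and build the Set-Cookie header value.
    visitor_id = uuid.uuid4().hex
    response = SimpleCookie()
    response[COOKIE_NAME] = visitor_id
    response[COOKIE_NAME]["path"] = "/"
    response[COOKIE_NAME]["max-age"] = ONE_YEAR_SECONDS
    return visitor_id, response[COOKIE_NAME].OutputString()


def record_hit(store, cookie_header, page):
    """Record a page view under the visitor's ID in a plain dict."""
    visitor_id, set_cookie = identify_visitor(cookie_header)
    store.setdefault(visitor_id, []).append(page)
    return visitor_id, set_cookie


store = {}
# First request arrives with no cookie, so a new ID is issued.
first_id, set_cookie = record_hit(store, None, "/index.html")
# The browser sends the cookie back on the next page view.
second_id, second_set_cookie = record_hit(
    store, f"{COOKIE_NAME}={first_id}", "/products.html"
)
print(store)
```

Because the visitor is keyed on the cookie rather than the IP address, both page views land under one ID even if the request passes through a proxy or the visitor's IP changes.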
The popularity of page tagging is due in large part to this outsourcing. In many cases, giving the job to a remote service provider makes configuration easier and vastly decreases overhead. Both of these tracking methods have their own advantages and disadvantages, and when used alone, each can fail to provide a complete picture of a website's performance.
Let's take a look at the methods' disadvantages, on their own:
| Logfile Analysis Disadvantages | Page Tagging Disadvantages |
|
|
So the natural assumption is, if we could just combine the two methods, we would get all of the advantages and bypass some of the drawbacks:
| Logfile Analysis Advantages | Page Tagging Advantages |
|
|
In fact, this is what many web analytics vendors are moving toward: hybrid data collection methods. While Google Analytics is itself a hosted page tagging solution, its predecessor, Urchin Software, uses cookie-enabled logfiles with page tagging to give its users the best of both worlds. This allows for more accurate tracking of sessions across multiple domains, eliminates the caching issues, and enables tracking of detailed web design metrics. Bandwidth and search engine spider data are still available, and, if necessary, all data can be reprocessed, since it is based on web server logfiles. While setting up a server-side package like Urchin could potentially involve more upfront overhead, Google Analytics Authorized Consultants like ROI Revolution can help you get over the finish line quickly and on your way to getting the most out of your web analytics data.
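To see why a cookie written into the logfile helps, compare counting visitors by IP address with counting them by cookie ID. The sketch below is a simplified illustration, not Urchin's actual processing: it assumes each log record has already been reduced to a hypothetical `(ip, cookie_id, path)` tuple, and the data is invented.

```python
def unique_visitors(records):
    """Return (count_by_ip, count_by_cookie) for (ip, cookie_id, path) records."""
    by_ip = {ip for ip, _cookie, _path in records}
    by_cookie = {cookie for _ip, cookie, _path in records if cookie}
    return len(by_ip), len(by_cookie)


# Illustrative records (assumption: sample data only).
records = [
    # Three different people behind the same proxy share one IP address.
    ("203.0.113.5", "a1", "/index.html"),
    ("203.0.113.5", "b2", "/index.html"),
    ("203.0.113.5", "d4", "/about.html"),
    # One person whose dynamically assigned IP changed between visits.
    ("198.51.100.7", "c3", "/index.html"),
    ("198.51.100.9", "c3", "/pricing.html"),
]

by_ip, by_cookie = unique_visitors(records)
print("visitors by IP:", by_ip)
print("visitors by cookie:", by_cookie)
```

Counting by IP merges the proxy users into one visitor and splits the dynamic-IP user into two, while the cookie-based count sees four distinct visitors. The same logfile still carries the spider and bandwidth data that a pure page tagging setup would miss.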
If you have any questions about the methods discussed above, please drop us a line or leave a comment below.
Sources:
- Web Analytics Demystified by Eric Peterson
- Web Traffic Data Sources & Vendor Comparison by Omega Digital Media
Posted by Michael Harrison, Analytics and Optimization Specialist at 4:45 PM
Filed under: Analytics
Tagged as: Analytics Basics, Analytics Technology
Comments
Thanks for the addenda, Robbin. I've added your very keen nominations to the list.
A good analysis. I would like to nominate three more problems to page-tagging analytics: If you forget to tag a page, you're SOL. And, the data doesn't start until you start tagging (so those ten years of server logs are useless to a page-tagging solution without the hybrid option.) And if the analytics vendor goes down, you lose the data you would have collected.
Robbin
May 1, 2006 11:03 PM