The ROI Revolution Blog

Google Analytics Subdomain Tracking

January 5, 2011

If you do a quick search on “Google Analytics Subdomain Tracking”, you may notice that many of the top results are either woefully out of date or rather confusing. The purpose of this post is to provide my recommendations for Google Analytics subdomain tracking as of the current version of the asynchronous Google Analytics Tracking Code.

Currently there’s no specific article on Google Code dedicated to Google Analytics subdomain tracking. The closest one recommends the following:

// Tracking code customizations only
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-12345-1']);
_gaq.push(['_setDomainName', '.example-petstore.com']);
_gaq.push(['_setAllowHash', false]);
_gaq.push(['_trackPageview']);

I propose that instead, for the vast majority of sites with subdomains, you should use the following:

// Tracking code customizations only
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-12345-1']);
_gaq.push(['_setDomainName', 'example-petstore.com']);
_gaq.push(['_addIgnoredRef', 'example-petstore.com']);
_gaq.push(['_trackPageview']);

So what’s wrong with the code recommended on Google Code? It turns out there are three issues with the code that cause unnecessary problems:


1. Turning off hashing is bad.

Turning off the hash, either by ['_setAllowHash', false] or ['_setDomainName', 'none'], is necessary for cross-domain tracking to work correctly with Google Analytics. It’s an unfortunate necessity, however, because domain hashing is actually quite useful.

By default, a script cannot identify the domain of a cookie; this information isn’t available unless it’s part of the cookie name or value itself. Including the hash provides that information so that the Google Analytics Code can read the correct set of cookies in situations where there might be more than one set.

Turning off the hash means the Google Analytics Tracking Code has no way to tell which set of cookies is the right set. Most of the time there is only one set of cookies, so it’s not that big of a deal.

But if you were previously using Google Analytics without subdomain tracking, you may end up with two sets of cookies for return visitors: one set created by your old code and one created by your new code. This happens most often on subdomains, but could also happen on your main domain if you use ['_setDomainName', 'none'] instead of ['_setAllowHash', false].
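To make the two-sets problem concrete, here is a minimal sketch of why the hash matters. It is not the actual ga.js logic. It assumes the __utma cookie value begins with the domain hash followed by a period, and the helper names (getCookieValues, pickByDomainHash) and the cookie values are made up for illustration.

```javascript
// Hypothetical helper, not part of ga.js: return the value of every cookie
// called `name` in a document.cookie-style string. Note that the string
// carries names and values only, never the domain a cookie was set on.
function getCookieValues(cookieString, name) {
  var values = [];
  var pairs = cookieString.split(/;\s*/);
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    if (eq > -1 && pairs[i].substring(0, eq) === name) {
      values.push(pairs[i].substring(eq + 1));
    }
  }
  return values;
}

// Assumption: each value starts with "<domain hash>.". Return the first
// value whose hash matches, or null if none does.
function pickByDomainHash(values, domainHash) {
  var prefix = domainHash + '.';
  for (var i = 0; i < values.length; i++) {
    if (values[i].indexOf(prefix) === 0) {
      return values[i];
    }
  }
  return null;
}

// Made-up example: a return visitor has two __utma cookies visible on the
// same page, one from the old code and one from the new code.
var cookieString = '__utma=111111111.1.1.1.1.1; __utma=222222222.2.2.2.2.2; other=x';
var candidates = getCookieValues(cookieString, '__utma');

// With hashing on, the matching set can be singled out.
var chosen = pickByDomainHash(candidates, '222222222');

// If every set carried the same hash (hashing off), the prefix could no
// longer tell them apart, and the first one found would simply win.
var ambiguous = pickByDomainHash(['1.5.5.5.5.5', '1.6.6.6.6.6'], '1');
```

In this sketch the hash is the only thing separating the two cookie sets; once both carry the same value, the choice between them is arbitrary.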

It’s also possible that instead of creating two sets of cookies, your new Google Analytics Tracking Code will destroy the cookies from your old Google Analytics Tracking Code because the hash codes don’t match. This would typically happen on your main domain rather than on a subdomain.

Eduardo Cereto has a post that looks into this issue in more detail and provides another use case where _setAllowHash causes issues. The bottom line here is that you need _setAllowHash to track across domains, but if you’re only doing subdomain tracking, it’s unnecessary and may cause problems.

2. The leading period causes cookie resets.

Google Code offers the following explanation for using the leading period when using _setDomainName:

“…if you want tracking across lower-level sub-domains:

* dogs.petstore.example.com and
* cats.petstore.example.com,

a leading period is required.”

If your site does use lower-level subdomains, then you definitely need to use a leading period in order for subdomain tracking to work. If your site does not use lower-level subdomains, however, then you’re actually better off not using a leading period.

The reason goes back to the hash again. The hash code that the Google Analytics Tracking Code generates when you use the leading period is different from the hash code generated when you don’t use the leading period. But the hash code generated when you don’t do any subdomain tracking on your main site is actually the same as the hash code generated when you use subdomain tracking without the leading period.

What this means is that if you weren’t doing subdomain tracking previously, using the leading period will cause your new Google Analytics Tracking Code to destroy your old cookies because the hash codes don’t match. This is similar to what happens when you turn hashing off.

Simply leaving out the leading period, if you don’t need it, means you’ll have fewer cookie resets, which will ease the transition to subdomain tracking.
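If you aren’t sure whether your site falls into the lower-level subdomain case, the rule can be sketched as a small check. This is only an illustration: needsLeadingPeriod and domainNameSetting are hypothetical helpers (not part of ga.js), and the hostnames are examples.

```javascript
// Hypothetical helper: returns true when any hostname sits more than one
// level below rootDomain (e.g. dogs.petstore.example.com under example.com).
function needsLeadingPeriod(hostnames, rootDomain) {
  var suffix = '.' + rootDomain;
  for (var i = 0; i < hostnames.length; i++) {
    var host = hostnames[i].toLowerCase();
    var cut = host.length - suffix.length;
    // Skip the root domain itself and hosts that don't end in the suffix.
    if (cut <= 0 || host.indexOf(suffix, cut) !== cut) {
      continue;
    }
    // Whatever precedes the suffix is the subdomain part; a period inside it
    // means there is more than one subdomain level.
    if (host.substring(0, cut).indexOf('.') > -1) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: build the value to pass to _setDomainName.
function domainNameSetting(hostnames, rootDomain) {
  return (needsLeadingPeriod(hostnames, rootDomain) ? '.' : '') + rootDomain;
}

var simpleSite = domainNameSetting(
  ['www.example-petstore.com', 'blog.example-petstore.com'],
  'example-petstore.com'
); // no lower-level subdomains, so no leading period

var nestedSite = domainNameSetting(
  ['dogs.petstore.example.com', 'cats.petstore.example.com'],
  'example.com'
); // lower-level subdomains, so the leading period is kept

var _gaq = _gaq || [];
_gaq.push(['_setDomainName', simpleSite]);
```

In practice you would work this out once and hard-code the result; the sketch just spells out the decision.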

3. Subdomain tracking without _addIgnoredRef causes self-referrals.

If your site has no subdomains, the Google Analytics Tracking Code is able to detect when a visitor’s session has expired between pageviews and avoid overwriting their existing referral information with a self-referral (an internal referral from your own site).

That safeguard is removed, however, when you have subdomains, even if your pages carry the standard subdomain tracking code. This can result in a rather high percentage of self-referrals, even though it seems like you’ve done everything right.

The solution is to use _addIgnoredRef, but how to use it is often misunderstood. The Google Code recommendation is to use something like this:

_gaq.push(['_addIgnoredRef', 'www.sister-site.com']);

I took a very close look at the ga.js code base and observed that something like this won’t actually work. The reason is that the Google Analytics Tracking Code considers www.sister-site.com to be the same as sister-site.com, so adding www.sister-site.com as an ignored referral doesn’t accomplish much. Using a leading period here also fails. But this works just fine:

_gaq.push(['_addIgnoredRef', 'sister-site.com']);

and in fact, so does this:

_gaq.push(['_addIgnoredRef', 'sister-site']);

The Google Analytics Tracking Code checks each ignored referral string you add and uses the indexOf method to determine whether or not that string is contained within the referring domain. If any of those checks returns true, then the referral is ignored. Since the root-level domain without the leading period is contained in every one of your subdomains, passing it to _addIgnoredRef works just fine. This also eliminates the need to add a separate _addIgnoredRef statement for each subdomain.
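The check described above can be approximated like this. It is a sketch based on my reading of the behavior, not the ga.js source: normalizeReferrerDomain and isIgnoredReferrer are hypothetical names, and the www-stripping step is an assumption drawn from the observation that www.sister-site.com is treated the same as sister-site.com.

```javascript
// Assumption: a leading "www." is dropped from the referring domain before
// the comparison, so www.sister-site.com is treated as sister-site.com.
function normalizeReferrerDomain(domain) {
  var d = domain.toLowerCase();
  return d.indexOf('www.') === 0 ? d.substring(4) : d;
}

// Hypothetical stand-in for the ignored-referral check: the referral is
// ignored if any ignored string is contained in the referring domain.
function isIgnoredReferrer(referringDomain, ignoredRefs) {
  var domain = normalizeReferrerDomain(referringDomain);
  for (var i = 0; i < ignoredRefs.length; i++) {
    if (domain.indexOf(ignoredRefs[i]) > -1) {
      return true;
    }
  }
  return false;
}

// 'www.sister-site.com' is not a substring of 'sister-site.com': no match.
var withWww = isIgnoredReferrer('www.sister-site.com', ['www.sister-site.com']);

// Neither is '.sister-site.com', so the leading period misses here too.
var withPeriod = isIgnoredReferrer('www.sister-site.com', ['.sister-site.com']);

// The bare root domain is contained in the main site and in every subdomain.
var rootOnMain = isIgnoredReferrer('www.sister-site.com', ['sister-site.com']);
var rootOnSub = isIgnoredReferrer('blog.sister-site.com', ['sister-site.com']);

// So is the even shorter fragment.
var fragment = isIgnoredReferrer('blog.sister-site.com', ['sister-site']);
```

Because this is a plain substring match, a short fragment such as 'sister-site' would also match any unrelated referring domain that happens to contain it, so the full root domain is the safer choice.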

You may still get some self-referrals with _addIgnoredRef, though no more than you would without subdomains. The reason is that _addIgnoredRef only works when the cookies contain existing referral information. If a new visitor comes to your site via a page without Google Analytics Tracking Code, then navigates to a page with Google Analytics Tracking Code, that will result in a self-referral, regardless of whether or not they crossed subdomains.

These types of self-referrals, however, can be avoided by making sure to tag every page of your site. Then, if any pages still have this issue, you can dig into your Google Analytics report data and determine exactly which pages are responsible. And, since you’re using _addIgnoredRef, it will be easier to find these pages because you won’t have to deal with the noise of self-referrals that occur for no apparent reason.

The key takeaway here is that _addIgnoredRef should be included as standard every time you do subdomain tracking, not just if you notice self-referrals. This will help you avoid needless self-referrals in the first place.

Hopefully this clears up some of the confusion surrounding subdomain tracking with Google Analytics. Feel free to leave any comments or questions.

