Wait a minute... those aren't my stats!

Using Google Analytics Include Filters to Remove the Other Guy's Data

Traffic here at Nerdliness is generally pretty level.  Sure, there's the occasional spike on days we post new content, and the overall trend is in the upward direction, but viewed over a monthly timeline the graphs are roughly flat.

A couple of weeks ago, I was doing my daily OCD-ish perusal of the Nerdliness.com Google Analytics reports and started seeing something odd:  a substantial bump in visitors.  My first thought was, of course, that Someone Important had recently discovered our sheer awesomeness and was preaching our gospel, but reality soon set in.

After investigating a little further, I noticed that the Content reports showed that all this traffic was directed at a page that doesn't exist.  I looked through our Apache access and error logs, trying to find any references to those URLs and came up with nothing.  
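For anyone who wants to repeat that log check, here is a minimal Python sketch. It assumes the access log uses the common/combined log format; the sample lines and the "/mystery-page" path are made up for illustration and are not from our actual logs.

```python
import re

# Pulls the requested URL out of the quoted request part of a log line,
# e.g. "GET /some/page HTTP/1.1" (assumes common/combined log format).
REQUEST_RE = re.compile(r'"(?:[A-Z]+) (\S+) [^"]*"')

def find_requests(log_lines, path_fragment):
    """Return the log lines whose requested URL contains path_fragment."""
    matches = []
    for line in log_lines:
        m = REQUEST_RE.search(line)
        if m and path_fragment in m.group(1):
            matches.append(line)
    return matches

# Made-up sample lines; in practice you would iterate over the real log file.
sample = [
    '203.0.113.5 - - [10/Oct/2010:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2010:13:56:01 -0700] "GET /about/ HTTP/1.1" 200 1042',
]

# "/mystery-page" is a hypothetical stand-in for the phantom URL in the GA report.
phantom_hits = find_requests(sample, "/mystery-page")
print(len(phantom_hits), "matching requests")
```

If the phantom page shows up in Google Analytics but a search like this finds no matching requests on your server, the traffic is probably being recorded somewhere other than your own site.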

On the off chance that I might have actually made some sort of mistake, I also compared the Google Analytics report to our Google AdSense data.  Our 404 page has a couple of AdSense blocks on it, so I figured that, if it was the result of a bad link and visitors getting a 404 error, we'd see a proportional increase in AdSense impressions.

Nothing.

Looked around at every report, log, and portent I could think of and found nothing.  Double-checked our AdSense code snippet, looked good.  At that point, I was reasonably sure that it wasn't on our end and started to suspect that, perhaps, someone else was using our GA code.

So I emailed Google Analytics support. 

(Quick side note:  The GA support contact link was a pain in the pooper to find.  If you ever need to contact them yourself, you first have to go through their Google Analytics Troubleshooter and jump through a few hoops.  Assuming you don't find your answer, you should get a contact form at the end.

Good news is that they were very quick to respond, replying to my ticket well within the promised 24 hours.  Good on you, Google.)

GA support didn't 100% confirm that someone was using our code, but their canned message did suggest that was the most likely cause:

Finding information about traffic to domains that are not yours in your reports is possibly the result of someone accidentally entering the wrong code on their own site, or borrowing/displaying some of your website's code for their website.

Even better, they included a possible fix:

If you're concerned about this data corrupting your own reports, Analytics can easily filter on a specific domain so you can avoid this problem. We recommend creating an 'Include' filter on your own domain:

Filter Type: Custom filter > Include
Filter Field: Hostname
Filter Pattern: your-domain-name.com
Case Sensitive: No

Sweet.
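To get a feel for what that Include filter does, here is a rough Python sketch of the idea. It is only an approximation of the behavior, not Google's implementation: it assumes the filter pattern is treated as a regular expression matched anywhere in the hostname, and the hit records and hostnames below are invented for illustration.

```python
import re

def include_by_hostname(hits, pattern, case_sensitive=False):
    """Keep only the hits whose hostname matches the pattern (an Include filter)."""
    flags = 0 if case_sensitive else re.IGNORECASE
    rx = re.compile(pattern, flags)
    return [hit for hit in hits if rx.search(hit["hostname"])]

# Invented hits: two from our own domain, one from a site borrowing our code.
hits = [
    {"hostname": "www.your-domain-name.com", "page": "/"},
    {"hostname": "Your-Domain-Name.com", "page": "/about/"},
    {"hostname": "some-other-site.example", "page": "/not-ours"},
]

# The dot is escaped so it matches a literal "." rather than any character.
kept = include_by_hostname(hits, r"your-domain-name\.com")
for hit in kept:
    print(hit["hostname"], hit["page"])
```

With "Case Sensitive: No", both spellings of our hostname survive and the other guy's hit is dropped. Escaping the dot is a small tightening on the support team's pattern; the unescaped version would also work, just a little more loosely.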

In case you don't know, you get to the Filter setup by:

  1. Log in to Google Analytics (duh).
  2. Click the Edit link for the domain you want to filter.  It's in the far right column, under the Actions heading.
  3. Click the "+Add Filter" link.  It's in the third section down with the heading "Filters Applied to Profile."  The link is on the far right, on the same grey background as the heading title.

Once there, just follow the Google Analytics Support instructions above.

Now, it doesn't look like the filter will do anything to historical data, just info that comes in after the filter is in place.  That said, I think our standard Google Analytics setup will now include adding this Include filter as soon as we set up a new domain, just in case.
