Let's talk about bot traffic in Google Analytics.
Most of the time, bot traffic in our analytics data gets a bad name. We think of it as spam, and we don't want this data anywhere near our reports.
Sure, there are spam bots sending hits to our analytics data. But there are also good bots. Or at least bots that we want to visit our websites, for testing, diagnostics, and even monitoring SEO results.
Whether or not you welcome this bot traffic, 99% of the time you don't want to see bot traffic in your analytics reports.
Why? Because bots are not real users, and they don't behave like humans. Heavy bot traffic (5% of sessions or more) can skew our data and pollute our analytics.
How do you keep the bots out of your Google Analytics reports?
In this post, you're going to learn how to identify bot traffic and keep it out of your Google Analytics reports.
Google doesn't always block the bot
I love Google Analytics… But sometimes their “one size fits all” tool misses the mark.
Now, I've been pretty outspoken about how Google handles spam traffic. And for a long time, my take was that Google wasn't doing much to keep the spam traffic out of our analytics reports.
Google was like Fredo in The Godfather Part II when it came to defending our data against spam: drastically underachieving!
And the community noticed. Like many Google Analytics users, I started checking out other analytics products. Maybe the grass was greener somewhere else?
When users started threatening to move away from Google Analytics, Google took notice. And since then things have gotten better. There's been a noticeable reduction of spam traffic in our analytics reports.
But spam is just one type of bot traffic that can pollute our analytics data. Many varieties of bots hit our websites, and sometimes we are the ones sending bots to our sites.
How do we identify the bot traffic?
Defending your data against bot traffic is a bit like playing whack-a-mole.
You have to identify the unwanted website hits and respond.
Let's look at how we can identify bot traffic. And let's walk through some strategies for blocking this traffic from our analytics reports.
Analytics Course Student Question
One of our Analytics Course students recently noticed a big issue with bot traffic in his reports. And he wants to know how to respond to the problem.
We recently sent an email blast out using an email list that we rented. The company is well known with a good reputation as far as we can tell. We received clicks to the site but no action once on the website, and looking through GA I see that 82% of the clicks can be located to the United States but are (not set) for state, city & metro. I have not seen such a high number of (not set); in fact, on another site we have, we had 300 (not set) in the last 250,000 sessions.
The email was targeted at Seattle and the Network/Service Provider dimension shows Microsoft Corporation. Looking at a number of other sites we manage, we don’t have anywhere near the % of location (not set), even for Network/Service Provider: Microsoft Corporation.
So what do you think? Is this a case of a narrow target to Seattle for a number of individuals at Microsoft that happen to block geolocation, or perhaps something fishy with bot clicks from the email list provider to show results? (No accusations, just haven’t seen numbers like this.)
Here's Keith's problem: His company sent out a targeted email campaign, and it resulted in a bunch of unwanted hits in their Google Analytics reports. This happens from time to time.
But then something interesting happened. The geo-location for the majority of this traffic was “(not set).” And all the (not set) traffic came from one ISP organization: Microsoft Corporation.
Keith wants to know: is this bot traffic?
It is most likely bot traffic
Here's why: Keith is getting a disproportionate amount of location (not set) data in his reports. Also, the traffic doesn't sound like it exhibits human behavior. It's doubtful that Microsoft employs a bunch of people to sit around and click email links, and then immediately bounce off his site once they click through.
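If you want to put a number on "disproportionate," here is a minimal sketch of the check. It assumes you have exported sessions by Service Provider and City into rows like the ones below. The row layout, the function name, and every number are made up for illustration, not taken from Keith's account.

```python
from collections import defaultdict

# Hypothetical rows, shaped like an export of sessions by
# Service Provider and City. The numbers are invented for illustration.
rows = [
    {"network": "microsoft corporation", "city": "(not set)", "sessions": 410},
    {"network": "microsoft corporation", "city": "Redmond", "sessions": 12},
    {"network": "comcast cable", "city": "Seattle", "sessions": 300},
    {"network": "comcast cable", "city": "(not set)", "sessions": 3},
]


def not_set_share_by_network(rows):
    """Return {network: share of sessions whose city is (not set)}."""
    total = defaultdict(int)
    not_set = defaultdict(int)
    for row in rows:
        total[row["network"]] += row["sessions"]
        if row["city"] == "(not set)":
            not_set[row["network"]] += row["sessions"]
    return {network: not_set[network] / total[network] for network in total}


shares = not_set_share_by_network(rows)

# Print the networks with the highest (not set) share first.
for network, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{network}: {share:.1%} (not set)")
```

A network whose (not set) share is far above the rest of your traffic is the one to investigate first.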
So how do we keep this traffic out of our reports?
The Google black-box solution
The easiest way to keep bot traffic out of your Analytics reports is to use Google's automatic filter. To set up this filter, go to your view settings and check the box that says “Exclude all hits from known bots and spiders.”
I've used Google's auto filter on almost every account I've analyzed.
And I'd say 60% of the time, it works all the time.
When this filter doesn't work, one of two things is happening: false positives or false negatives.
Google's black-box doesn't always exclude the traffic you want to exclude. And it doesn't always include the traffic you want to include.
So, you have to run tests. You want to be proactive about testing your filters, rather than trusting Google blindly.
Why? Because you can't remove bot traffic from your analytics data after the fact. So you want to put filters in place before this traffic becomes pervasive in your reports.
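To make the false positive and false negative idea concrete, here is a small sketch. It assumes you have hand-labelled a few traffic sources as bot or human, and noted whether each one disappeared from a filtered view. The sample data and the `filter_errors` helper are hypothetical.

```python
# Hypothetical hand-labelled sample: for each traffic source we record whether
# we believe it is a bot, and whether it disappeared from the filtered view.
sample = [
    {"source": "uptime monitor", "is_bot": True, "was_filtered": True},
    {"source": "rented email list clicks", "is_bot": True, "was_filtered": False},
    {"source": "office network", "is_bot": False, "was_filtered": False},
    {"source": "partner intranet", "is_bot": False, "was_filtered": True},
]


def filter_errors(sample):
    """Split sources into bots the filter missed and humans it removed."""
    # False negatives: bots that are still showing up in the reports.
    missed_bots = [
        s["source"] for s in sample if s["is_bot"] and not s["was_filtered"]
    ]
    # False positives: human traffic the filter threw away.
    removed_humans = [
        s["source"] for s in sample if not s["is_bot"] and s["was_filtered"]
    ]
    return missed_bots, removed_humans


missed_bots, removed_humans = filter_errors(sample)
print("Bots still in the reports:", missed_bots)
print("Human traffic removed by mistake:", removed_humans)
```

Both lists matter. Missed bots pollute your reports, and removed humans quietly shrink them.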
Here's how to set up bot filters in Google Analytics
Step #1 – Create a new Google Analytics View
When you create a view to test bot traffic, give your view a very specific name. That way other users in your account will know the view is only for testing bot filters. Google organizes views alphabetically. If you start the name of your view with “XX,” it should show up at the bottom of the view list, and most users won't see it in your account.
Step #2 – Uncheck your bot setting
In your new view, uncheck your bot filter. You want to let the bot traffic into this view.
Step #3 – Understand your bot traffic
In the example below I've tried to replicate Keith's problem.
If you look at the traffic coming from “microsoft corp,” you can see the average session duration is 2 seconds. The other behavior metrics are also different from the rest of the traffic.
The lousy traffic doesn't have all the bot qualities we usually see. Bot traffic typically has a 100% bounce rate and 1 page per session.
But it still looks like junk to me. So, I am calling it a bot!
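If you'd like that judgment call to be repeatable, you could write it down as a rough rule. The sketch below is only an assumption about how to do that: the thresholds are mine, not Google's, and apart from the 2-second session duration the metric values are invented.

```python
def looks_like_bot(bounce_rate, pages_per_session, avg_session_seconds,
                   max_seconds=5.0):
    """Rough heuristic. Every threshold here is an assumption, not a GA rule."""
    # The "classic" bot signature: everything bounces after one page.
    classic_bot = bounce_rate >= 0.99 and pages_per_session <= 1.0
    # Sessions this short are unlikely to be people reading a page.
    suspiciously_short = avg_session_seconds <= max_seconds
    return classic_bot or suspiciously_short


# Hypothetical per-network metrics. Only the 2-second duration mirrors the
# example above; the other values are made up.
segments = {
    "microsoft corp": {
        "bounce_rate": 0.85,
        "pages_per_session": 1.2,
        "avg_session_seconds": 2,
    },
    "comcast cable": {
        "bounce_rate": 0.45,
        "pages_per_session": 3.1,
        "avg_session_seconds": 140,
    },
}

flagged = [name for name, metrics in segments.items() if looks_like_bot(**metrics)]
print(flagged)
```

Treat the result as a list of suspects to review by hand, not as a verdict.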
Step #4 – Develop a filter pattern
In this case, we'll create a filter excluding traffic by ISP Organization.
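Before you type a pattern into the filter, it can help to try it against a handful of ISP names offline. This is a sketch that assumes the filter pattern field accepts a regular expression. The pattern and the sample names are hypothetical, so adjust them to what you see in your own Network report.

```python
import re

# Hypothetical exclude pattern for the ISP Organization field.
# The anchors (^ and $) keep it from matching longer, unrelated names.
ISP_EXCLUDE_PATTERN = re.compile(r"^microsoft corp(oration)?$", re.IGNORECASE)

# Made-up sample of ISP names, as they might appear in a Network report.
sample_isps = [
    "microsoft corporation",
    "Microsoft Corp",
    "comcast cable",
    "microsoft azure partner isp",
]


def would_exclude(isp_organization):
    """Return True if the pattern matches this ISP name."""
    return ISP_EXCLUDE_PATTERN.search(isp_organization.strip()) is not None


excluded = [isp for isp in sample_isps if would_exclude(isp)]
kept = [isp for isp in sample_isps if not would_exclude(isp)]

print("Would be excluded:", excluded)
print("Would be kept:", kept)
```

Keep in mind that a pattern like this drops every visit from that network, including any real people on it, which is one more reason to try it in a test view first.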
Step #5 – Verify your filter
In your filter settings, use the “filter verification” option to run a test.
Our verification test indicates that our filter should be useful.
Step #6 – See if it worked
You'll have to check your reports to see if your filter worked. It may take a little while to find out if you blocked the bot traffic. Be patient! You should know after a couple of days if your filter has removed the junk traffic from your reports.
If your filter was successful, you can add it to your main view. If it didn't work, try adjusting your exclusions again.
The waiting game is part of the life of an analyst. You're often waiting for the data to come in so that you can review results.
It's part of the cycle of reviewing your data quality.
The cycle works like this:
- You analyze the traffic in your reports.
- Then you identify anomalies in your traffic.
- You determine the cause of those anomalies.
- Next, you implement a fix for these problems.
- Then, you document the flaws, using annotations or other records.
- And you analyze your traffic again – a day, a week, a month later, to make sure your solutions are working.
Hopefully, you can apply this method to your future data quality analysis.
Filtering bot traffic: Questions or comments
We did our best to be thorough in explaining how to filter bot traffic from Google Analytics, but every situation is different. So let us know how we can help.
Do you have questions about identifying or excluding bot traffic? Leave a comment below, and I'll answer any questions you have.