- December 16, 2019
- Posted by: Nmcuong.91
- Categories: Data Analytics + Business Intelligence, Digital marketing
You may have noticed that some of your Google Analytics data isn’t entirely accurate.
Whether you saw a sudden, unwarranted change in user behavior, picked up on major differences after a redesign, or found some unexplainable information within a report, there are many things that can indicate issues with your data.
And that’s completely normal.
Google Analytics is one of the most popular (if not the most popular) platforms for monitoring site performance.
It can provide tons of valuable insight and is considered by many SEOs and site owners to be an indispensable tool.
But it isn’t perfect.
In fact, most site owners will run into a discrepancy at some point or another.
There are a few reasons for that.
These reasons fall into two broad categories:
- Issues you can fix
- Issues you can’t fix
That’s right: Some of the issues causing problems with your site are unavoidable because you have no control over them.
Fortunately, there are also plenty of issues that you can fix. And if you have any of those problems on your site, it’s important to do what you can to fix them.
Your data can provide valuable insight and help you improve your website and business as a whole.
But that’s only true if you’re basing your decisions on accurate information.
If your data is riddled with discrepancies and errors, you might as well be guessing.
In fact, making decisions based on faulty data is arguably worse than guessing, since it gives you a false sense of confidence in your decisions.
But before we get into the problems that could be causing the discrepancies you’ve noticed, it’s important to understand one that often goes undetected: dark traffic.
What is dark traffic?
If you’ve spent much time in Analytics, you’re familiar with the concept of “Direct” traffic.
In theory, this represents visitors that type your URL into their browsers, then navigate directly to your site.
But in reality, that’s not always the case.
For example, it makes complete sense for certain users to come directly to your homepage. Your domain name is likely short and easy to remember, so it’s easy to type in and visit your site.
But if you check out your Direct traffic report (by navigating to Acquisition > All Traffic > Channels, then selecting “Direct”), you’ll likely notice that a lot of your “direct” traffic is landing on pages other than your homepage.
In some cases, this makes sense, especially for pages with relatively short URLs.
But most visitors aren’t going to remember and return directly to deeper pages within your site. For example, how many users do you think would memorize a URL like this one from Groupon?
Not many, right?
That’s the logical answer.
But after digging into their data, Groupon found that a ton of their traffic to pages like this one was being attributed as direct.
They decided to determine why that was by temporarily de-indexing their site. As an aside, this is absolutely not something I recommend trying yourself.
But the idea here is that removing their site from search would temporarily eliminate their organic traffic. So if they also noticed a drop in direct traffic, they could safely assume that some of that traffic was actually coming from search.
So, is that what happened?
You can check out the results for yourself in the following graphs.
The purple lines represent the site’s direct and organic traffic during the week before the test, while the site was fully indexed.
The orange lines represent how both types of traffic fell during the test period.
Their organic traffic, as expected, dropped to almost nothing.
But the site’s direct traffic also decreased by 60%.
If all of the direct traffic reported during a normal week was really just from users directly typing a URL into their browsers, then that traffic level should’ve remained relatively stable even after the site was deindexed.
So, what’s the logical conclusion here?
That 60% of Groupon’s “direct” traffic was actually from organic search.
This is a huge discrepancy, but unfortunately not an uncommon scenario. In another example, The Atlantic determined that 25% of their traffic was being miscategorized as direct.
But how did they arrive at that figure, and how can you do the same?
The easiest way is to create a segment to isolate the dark traffic in your Google Analytics report. This segment will show traffic that’s being reported as direct, but landing on deep pages within your site — so you can logically conclude it’s not actually direct.
First, navigate to your Audience Overview report. Then, click “Add Segment.”
Then, you’ll need to select the “Traffic Sources” tab and choose “[direct].”
From here, you’ll need to add a filter to exclude the pages that users might actually be navigating to directly.
Select the “Conditions” tab and choose the “Landing Page” option. Then, select “is not one of” and enter a forward slash (“/”) into the field.
This way, your segment will only include users who arrive on pages other than your homepage.
In some cases, you’ll want to add additional filters if there are other pages on your site you expect visitors to navigate to directly.
For example, in Groupon’s case, they often include the URL “groupon.com/getaways” in campaigns that focus on their vacation deals.
As a result, it makes sense that users would arrive on those pages.
So if you use easy-to-remember page names in your offline advertising campaigns (like “yourdomain.com/radio” or “yourdomain.com/conference”), you’ll also want to filter out those visits.
You can also attribute those visits to specific channels or campaigns, so they’re not dark traffic.
Then, to the right of your settings, you’ll see a summary of how many of your users match the criteria you’ve set for your segment. This can give you a general idea of how much dark traffic is in your Analytics reports.
Finally, save your segment, and you’ll see a report that looks something like this:
This report will help you get an idea of how much of your site’s traffic is dark, as well as how many conversions are being falsely attributed to direct traffic.
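If you export your session data, the same segment logic can be sketched in a few lines of Python. This is only an illustration: the `Session` record, the `dark_traffic_share` helper, and the sample rows are hypothetical, and the pages you treat as "expected" direct landing pages are an assumption you would adjust for your own site.

```python
from dataclasses import dataclass


@dataclass
class Session:
    channel: str        # channel as reported, e.g. "Direct"
    landing_page: str   # path of the first page in the session


def dark_traffic_share(sessions, expected_direct_pages=("/",)):
    """Return the fraction of Direct sessions that land on unexpected pages."""
    direct = [s for s in sessions if s.channel == "Direct"]
    if not direct:
        return 0.0
    dark = [s for s in direct if s.landing_page not in expected_direct_pages]
    return len(dark) / len(direct)


# Hypothetical sample data, not real figures.
sessions = [
    Session("Direct", "/"),
    Session("Direct", "/radio"),
    Session("Direct", "/deals/some-long-product-url-12345"),
    Session("Direct", "/blog/another-deep-page"),
    Session("Organic Search", "/blog/another-deep-page"),
]

share = dark_traffic_share(sessions, expected_direct_pages=("/", "/radio"))
print(f"{share:.0%} of Direct sessions look like dark traffic")
```

Here the homepage and the "/radio" campaign page are treated as legitimate direct landing pages, so only the two deep pages count as dark traffic.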
And although you can’t retroactively determine where those visitors and traffic are really coming from, you can get a better understanding of how many of your digital marketing campaigns and strategies aren’t getting the credit they deserve for generating conversions and results.
As the Groupon example illustrates, a large chunk of this traffic could be from your SEO campaigns.
But if you run advertising campaigns on other online platforms, traffic from those campaigns could also be getting falsely lumped in with your direct traffic.
Fortunately, this is a fixable issue. In fact, it’s the first fixable issue we’ll get into in the next section.
So with that in mind, let’s jump into seven avoidable problems that might be causing inaccuracies in your Google Analytics reports, and what you can do to fix them.
Google Analytics errors you can fix
First, let’s start with the things you can do something about.
1. Failure to tag campaigns
One of the biggest causes of inaccuracies within Google Analytics is a lack of information about where your visitors are coming from.
And while some of that is unavoidable, you can address a large part of the problem by adding tracking information to the URLs you use in your online advertising campaigns.
If you aren’t tagging your campaigns, you’re missing out on a ton of helpful data in your Analytics reports.
Fortunately, tracking your campaigns is much easier than it might sound.
You’ll just need to start adding UTM parameters to your URLs.
UTM parameters are bits of data that tell Analytics where, exactly, the users who click a URL are coming from.
There are a few different parameters you can use to track this information:
- Medium: This indicates the channel the traffic is coming from. You can use standard channel names like Social, Paid, Email, and Referral.
- Source: This is the individual site within that channel, like Facebook.
- Campaign: This lets you include the name of a specific ad campaign.
- Content: If you’re running multiple ads within the same campaign, this parameter lets you track those ads individually.
- Term: This used to be a parameter for tracking keywords within paid search campaigns, but now that AdWords integrates with Analytics, you’ll rarely ever need it.
Whenever you launch a new campaign, you can use Google’s Campaign URL Builder to easily create a trackable URL.
First, enter your domain into the URL box. Then, add a source.
From there, the rest of the fields are optional — but in most cases, it makes sense to at least include a medium.
For example, let’s say you wanted to track visits to your site from users who click on your email signature. Clicks from email aren’t normally tracked, so they likely make up at least a portion of your dark traffic.
To track those clicks, you’d add “email” as the medium, and “signature” as the source.
Then, the builder will provide a URL you can copy and paste wherever relevant.
In this case, you’d use this link in your email signature. So whenever a user clicked that link, Analytics would be able to register exactly how they arrived on your site.
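If you create a lot of tagged links, you could also generate them in code. The following is a minimal Python sketch using only the standard library. The `add_utm` helper is hypothetical (it is not part of Google’s builder), but the `utm_` parameter names are the standard ones.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url, source, medium=None, campaign=None, content=None, term=None):
    """Append UTM parameters to a URL, keeping any existing query string."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    utm = {
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
        "utm_content": content,
        "utm_term": term,
    }
    # Only include the parameters that were actually provided.
    params += [(key, value) for key, value in utm.items() if value]
    return urlunsplit(parts._replace(query=urlencode(params)))


tagged = add_utm("https://yourdomain.com/", source="signature", medium="email")
print(tagged)  # https://yourdomain.com/?utm_source=signature&utm_medium=email
```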
It’s a good idea to get into the habit of creating these URLs for each new digital marketing campaign you launch.
Creating a URL only takes a few seconds and will allow you to get more accurate insight from your Analytics.
Plus, for every tracked URL you create, you eliminate a source of dark traffic — meaning you can be more confident in the validity of the data in your reports.
2. Missing tracking code
So far, I’ve focused on issues that can prevent you from collecting accurate referral data about your traffic.
But it’s also important to consider an issue that can prevent you from collecting any data at all: missing tracking codes on your pages.
Google Analytics relies on a JavaScript “tag” on each page of your site to track visits. If you set up your account yourself, you likely copied and pasted a snippet that looked something like the following screenshot into your site’s code.
This code is what lets Analytics register your traffic and collect data about your users. It needs to be present on each page of your site.
If it’s not included on one of your pages, there won’t be any data for that page in Analytics.
There are a few ways this error might become apparent. For example, if you redesign or make any major changes to your site’s code, then notice a drop that looks like this, an issue with your tags could be to blame.
The untagged pages won’t register data, so it will appear that you’ve experienced a huge drop in traffic.
Improper tagging can also lead to self-referrals.
A self-referral is when your own domain is listed as a source of referral traffic to your site. If these are an issue for your site, they’ll appear in your referral report like this:
This isn’t supposed to happen.
When users move through correctly-tagged pages, Analytics can track that movement and give you more insight into your traffic flows.
But when a user moves from a page without your tracking code to a page that does have the tracking code, that first page is registered as a source of referral traffic.
This prevents you from accessing valuable data about how your visitors navigate your site.
Plus, it means that your traffic numbers are inaccurate.
Fortunately, this is a relatively straightforward issue to fix. You just need to include your tracking code on every page.
But first, you’ll need to figure out which pages are creating the problem.
If you only have a five-page website, you can easily check if the script is present manually. But if you have hundreds or thousands of pages, it’s difficult (if not impossible) to check each page individually.
Fortunately, there are tools you can use to verify that your pages are tagged and find the ones that aren’t.
So if you think that some of your pages are missing a tracking code, but you aren’t sure which, check out this tutorial on using Screaming Frog’s SEO spider tool to learn how to search every page for the presence of your Google Analytics code.
Then, you can remedy the issue by properly implementing your tracking code on the pages that are missing it.
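For a smaller site, the check for missing tags could also be scripted. Here is a rough Python sketch. It assumes you already have each page’s HTML (from a crawl or an export) and that your property uses a classic “UA-” tracking ID. The `pages_missing_tag` helper and the sample pages are hypothetical.

```python
import re

# Classic Universal Analytics property IDs look like "UA-XXXXXXXX-Y".
TRACKING_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")


def pages_missing_tag(pages):
    """Return the URLs whose HTML contains no tracking ID at all."""
    return [url for url, html in pages.items() if not TRACKING_ID.search(html)]


# Hypothetical pages; in practice the HTML would come from a crawl.
pages = {
    "/": "<head><script>ga('create', 'UA-12345678-1', 'auto');</script></head>",
    "/pricing": "<head><title>Pricing</title></head>",
}
print(pages_missing_tag(pages))  # ['/pricing']
```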
The best way to do this is by including the code in your site’s header file.
Some site owners make the mistake of adding their code individually to each page. The reasoning behind this decision is often to avoid slowing down page load times.
They place it just above each page’s closing </body> tag so that the visual elements can load before the code executes. This way, the code doesn’t delay the elements a user sees from loading.
However, this is an unnecessary concern. Google Analytics tracking codes load asynchronously, meaning they don’t slow down page load times — regardless of where they appear.
As a result, the best place for your tracking code is in the header file that already appears on each of your pages. This way, you don’t have to worry about adding it manually when you create new pages.
It’ll be included in your template, so you can start collecting data as soon as you upload the new page to your site.
3. More than one Google Analytics tracking code on the page
So: not having a tracking code on a page will mess up your data.
And on the flip side, having multiple tracking codes on your page will also mess up your data.
For example, let’s say you update your site in a way that impacts a large chunk of your pages, like updating your main navigation bar.
Then, you see pageviews suddenly jump to roughly double their usual level, in a pattern that looks like this:
Did your update have that big of an impact on your site’s traffic?
Probably not.
The much more likely explanation is that your tracking code was duplicated on some (or all) of your pages.
Each script on your site records a new visit, even if those scripts are on the same page. This means that each visitor to a page with two scripts will register two pageviews.
So if you notice a quick, unexplainable spike in traffic, this could potentially be the cause.
Multiple scripts per page can also lead to inaccurate bounce rate reporting. That’s because when Analytics registers both scripts back-to-back, it sees this as a user moving between two pages on your site.
If you notice that your bounce rate suddenly drops to a number that seems too good to be true, it probably is.
So before you start celebrating your increased traffic and nonexistent bounce rate, double check your pages to make sure they each only contain one tracking script.
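One way to double check is to count how many times each page sends a pageview. The Python sketch below assumes your pages use the analytics.js snippet, where a pageview is sent with `ga('send', 'pageview')`. Other snippet versions would need a different pattern, and the `pages_with_duplicate_tags` helper is hypothetical.

```python
import re

# Matches ga('send', 'pageview') calls from the analytics.js snippet.
PAGEVIEW_CALL = re.compile(r"ga\(\s*['\"]send['\"]\s*,\s*['\"]pageview['\"]")


def pages_with_duplicate_tags(pages):
    """Return {url: count} for pages that send more than one pageview."""
    counts = {url: len(PAGEVIEW_CALL.findall(html)) for url, html in pages.items()}
    return {url: n for url, n in counts.items() if n > 1}


# Hypothetical pages: "/about" accidentally includes the snippet twice.
snippet = "<script>ga('create', 'UA-12345678-1', 'auto'); ga('send', 'pageview');</script>"
pages = {
    "/": snippet,
    "/about": snippet + "<nav>Menu</nav>" + snippet,
}
print(pages_with_duplicate_tags(pages))  # {'/about': 2}
```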
4. Improper tracking of subdomains
If you have any subdomains on your site, these can easily create issues within your Analytics tracking.
For example, if your blog is hosted at “blog.yourdomain.com,” it could be counted as a source of referral traffic — even though it’s part of your site.
To eliminate this issue, you’ll need to modify your tracking code. You can do this by following this step-by-step tutorial for accurate cross-domain tracking.
You can also prevent your subdomains from appearing as referral sources by setting up referral exclusions in your Google Analytics settings.
First, open the Admin tab and click “Tracking Info” in the Property Column. Then, select “Referral Exclusion List” and click “Add Referral Exclusion.”
Here, enter your subdomain URL and click “Create.”
This will prevent your subdomain from appearing as a referral source, and keep Analytics from inaccurately registering users who move from your subdomain to your main site as referral traffic.
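If you want to gauge how many of your existing referrals are really self-referrals, you could check exported referrer URLs against your own domain. This Python sketch is an approximation for auditing purposes, not the exact matching logic Analytics uses, and the `is_self_referral` helper is hypothetical.

```python
from urllib.parse import urlsplit


def is_self_referral(referrer_url, own_domain):
    """True if the referrer is the site's own domain or one of its subdomains."""
    host = (urlsplit(referrer_url).hostname or "").lower()
    own = own_domain.lower()
    return host == own or host.endswith("." + own)


print(is_self_referral("https://blog.yourdomain.com/post", "yourdomain.com"))  # True
print(is_self_referral("https://notyourdomain.com/", "yourdomain.com"))        # False
```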
5. Internal traffic appearing in reports
The entire point of using a tool like Google Analytics is to learn about visitors who are presumably part of your target audience.
This way, you can get a better understanding of how they interact with your content, and improve your site to convert more of them into customers or clients.
But it’s important to remember that not every visitor to your site is part of your target audience.
For example, you.
You likely visit your site on a regular basis for maintenance purposes, or to see the changes that other members of your team have made.
For data purposes, your visits are meaningless. So having them registered in Analytics is unnecessary.
If you’re the only person who works on your site, this isn’t a huge deal. A few extra pageviews here and there won’t have a major impact on your data.
But what if you have an entire dev team working on your site on a regular basis?
Or what if members of your sales team regularly consult your service pages to double-check the features included in different plans?
This could have a much more significant impact.
Fortunately, this is another easy issue to fix. All you have to do is filter out visits from your office so that they don’t appear in your reports.
Open your Admin tab, and under your View, select “Filters.” Then, click “Add Filter.”
Give your filter a logical name to indicate the exact location you’re excluding from your reports. If you only have one office location, you could simply name your filter, “Office.”
If you have multiple offices or workspaces, however, you’ll want to be a bit more specific in the filter names for each.
Then, set your filter type to “Exclude” and select “Traffic from the IP Addresses” and “That Are Equal To.”
Next, enter your office’s IP address in the IP Address field.
If you’re not sure what your IP address is, you can find it by performing a Google search for the phrase, “what’s my IP address.”
Once you click “Save,” traffic from within your office will no longer be included in your Analytics reports.
Then, you can repeat this process for any other IP addresses that members of your team often access your site from. If you regularly work from home, for example, you’ll likely want to exclude your home IP address, too.
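Conceptually, each of these filters just drops hits whose IP address matches one on your list. Here is a small Python sketch of that idea. The addresses are placeholders from the reserved documentation ranges, and `is_internal` is a hypothetical helper rather than anything Analytics exposes.

```python
import ipaddress

# Placeholder addresses from the reserved documentation ranges.
EXCLUDED_IPS = {
    ipaddress.ip_address("203.0.113.10"),  # office
    ipaddress.ip_address("198.51.100.7"),  # home
}


def is_internal(ip):
    """True if the visitor IP is equal to one of the excluded addresses."""
    return ipaddress.ip_address(ip) in EXCLUDED_IPS


hits = ["203.0.113.10", "198.51.100.7", "192.0.2.44"]
external = [ip for ip in hits if not is_internal(ip)]
print(external)  # ['192.0.2.44']
```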
6. Improper goal setup
Goals are an extremely important part of your Google Analytics reports.
They help you determine how often users are taking important actions on your site, like making purchases and filling out contact forms. They also tell you where those users are coming from, and which content on your site they interact with before taking action.
This information gives you a better idea of your site’s performance, as well as insight into how you can get even more of your visitors to convert.
As a result, it’s essential that your Analytics account accurately tracks each conversion on your site.
So if you notice any issues with your goal tracking, this is something you’ll want to address immediately.
In most cases, it’s easy to determine when your goal reporting isn’t properly functioning because your Analytics account isn’t the only place you can see conversion-related information.
For example, if a user submits a contact form, their information will either show up in your inbox or a third-party management tool like Nutshell or Salesforce.
Although you likely won’t spend time individually tallying up the submissions you receive, it’s fairly easy to tell whether those submissions are being accurately reported in Analytics.
If you know you’ve been getting form submissions, but your conversion report doesn’t show them, that’s a clear indicator that there’s an issue with your goal setup.
If this is the case, you’ll need to do some digging into your Goal setup to see what’s happening.
Open your Admin tab, navigate to your View, and select “Goals.”
If any of your goals are showing zero conversions, and you know that isn’t actually the case, you can follow Google’s goal troubleshooting article to identify and solve the problem.
7. Referral spam
If you notice a large jump in referral traffic, that could be a great sign!
But before you get too excited, you should make sure that it’s coming from legitimate websites.
In many cases, sudden spikes are an indicator of referral spam — which isn’t valuable and can skew your reporting data.
If you’re unfamiliar with the concept of referral spam, it’s essentially “fake” traffic from bots and crawlers from spam sites. This means that these pageviews aren’t from real visitors.
As a result, referral spam can lead to artificially inflated pageviews and useless referral data.
You can see if this is an issue for your site by opening your referral report and checking to see if there are any strange domains sending your site traffic.
If you notice any “spammy”-sounding domains, look at the bounce rate of the traffic they’re sending you. If it’s 100% or close to 100%, you can safely assume that the traffic from that domain is referral spam.
For example, take a look at the two spam domains in this screenshot:
The “traffic” from these two domains isn’t real, so it doesn’t add any value to this site’s Analytics reports.
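The bounce rate check described above is easy to apply to an exported referral report. In this Python sketch, the `likely_spam_referrers` helper, the thresholds, and the sample rows are all assumptions for illustration. A flagged domain is a candidate for review, not proof of spam.

```python
def likely_spam_referrers(rows, min_sessions=10, bounce_threshold=0.99):
    """Return referrer domains whose bounce rate is at or near 100%."""
    flagged = []
    for domain, sessions, bounces in rows:
        # Skip tiny samples, where a 100% bounce rate means very little.
        if sessions < max(min_sessions, 1):
            continue
        if bounces / sessions >= bounce_threshold:
            flagged.append(domain)
    return flagged


# Hypothetical rows: (referrer domain, sessions, bounced sessions).
rows = [
    ("partner-blog.example", 120, 54),
    ("free-traffic-now.example", 300, 300),
    ("seo-offers.example", 85, 85),
    ("tiny-forum.example", 2, 2),
]
print(likely_spam_referrers(rows))
```

In this sample, the two high-volume domains with a 100% bounce rate are flagged, while the small forum is skipped because two sessions aren’t enough to judge.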
If you find that referral spam is an issue for your site, there are a few ways to address it.
First, you can remove spam from your existing reports by setting up a custom segment that excludes data from the domains you’ve identified as spam.
Then, you can follow this tutorial on filtering referral spam in your View to prevent “traffic” from those sites from being collected in the future.
Google Analytics errors you have no control over
Now that I’ve covered the issues you can fix, it’s also important to be aware of the ones you can’t.
Although there aren’t any steps you can take to eliminate these discrepancies, being aware of them will help you better understand your data.
With that in mind, let’s jump into seven unavoidable issues that might be impacting your Analytics data.
1. Some browsers have JavaScript disabled
As I mentioned above, Analytics tracks your traffic using a JavaScript tag.
But if a user visits your site with JavaScript disabled in their browser, you won’t be able to register any data for that user.
Thankfully, this is an extremely minimal issue.
In 2010, Yahoo reported that just over 2% of US internet users had JavaScript disabled.
And Blockmetry reported that only 0.2% of pageviews from worldwide traffic in 2016 came from browsers with JavaScript disabled.
So although it’s beneficial to know that this is a possibility, it likely doesn’t have much of an impact on your Analytics data.
2. Some users don’t accept cookies
In addition to your tracking code, Google Analytics also collects user data with cookies.
Cookies are important because Analytics can use them to tag a visitor, then aggregate that visitor’s behavior over the course of multiple visits.
Here’s a basic overview of how that works:
So, for example, let’s say a user first arrives on your site by clicking on a Facebook advertisement. They browse a few pages, then leave.
The next day, they type your brand name into Google, click your site in the search results, and make a purchase.
Using cookie data, Analytics can tell that these visits are from the same user. They’ll register as separate sessions, but you’ll be able to tell from your attribution reports that the user first arrived as a result of that Facebook ad.
Without cookie data, the platform wouldn’t be able to do this. It would be impossible to gain insight into how users behave on multiple visits to your site.
So if Analytics is unable to collect cookie data, this can skew the information in your reports.
There are a few things that can cause this to happen.
- The visitor’s browser doesn’t accept cookies.
- The visitor’s firewall blocks or deletes cookies.
- The visitor deletes cookies manually.
Fortunately, in one study by Yell, they found that only 0.2% of users did not allow cookies.
Of course, this doesn’t account for cookie data being erased whenever a user clears their browsing history.
Still, it’s safe to assume that users who don’t accept cookies only have a minimal impact on your Analytics data, if any.
3. Cookies can time out
As we established in the last point, users who don’t accept cookies at all likely don’t have a huge impact on your reports.
But it’s also important to consider the possibility of cookies timing out.
Google Analytics uses two different types of cookies to track visitors:
- A persistent cookie. This cookie is placed on a user’s device the first time they visit. It remains for two years, or until the user deletes it or re-installs the browser.
- A session cookie. A new session cookie is placed each time a user arrives on your site.
Here’s a more in-depth explanation of the differences between the two:
So, why does this matter?
Consider the following scenario. A user:
- Visits a page
- Goes to dinner for two hours, leaving the page open
- Comes back and starts browsing
Now, keep in mind that Google Analytics ends a visitor session after 30 minutes of inactivity.
In the scenario described above, a new session cookie would be placed when the visitor starts browsing again. Even though they never actually navigated away from the page, it would be considered a brand new session by Google Analytics.
Now, in my opinion, this isn’t actually an “inaccurate” way to register data. It makes sense to consider two browsing sessions that take place 30+ minutes apart as separate occasions.
But it’s still a bit of a complicated reporting detail, and something to be aware of as you use Analytics.
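To make the dinner scenario concrete, here is a minimal Python sketch of the 30-minute inactivity rule. It only models that one rule and ignores any others Analytics may apply. The `count_sessions` helper and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)


def count_sessions(hit_times):
    """Count sessions for one visitor, given the timestamps of their hits."""
    sessions = 0
    last_hit = None
    for t in sorted(hit_times):
        # A gap longer than the timeout starts a new session.
        if last_hit is None or t - last_hit > SESSION_TIMEOUT:
            sessions += 1
        last_hit = t
    return sessions


hits = [
    datetime(2019, 12, 16, 18, 0),   # visits a page
    datetime(2019, 12, 16, 20, 5),   # back from dinner, starts browsing
    datetime(2019, 12, 16, 20, 12),  # keeps browsing
]
print(count_sessions(hits))  # 2
```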
4. The same user on different devices
Once again, an example is best to illustrate how users switching devices can impact your Analytics data.
Consider this scenario:
- Jane is standing in line at the grocery store.
- She starts researching a product on a website from her iPhone.
- A cookie is set on her iPhone.
- She gets home, unloads the groceries, and then purchases that same product from the same site on her laptop.
In a perfect world, Jane’s behavior would be tracked as she switches devices, and that site would be able to register her as the same user for both sessions.
But Analytics simply isn’t that advanced. It can’t track when a user switches between devices.
At least not yet.
As it is, the purchase will be attributed to a brand new visit.
Given the prevalence of cross-device usage, this can create discrepancies in your reports.
In fact, one survey found that 81% of Internet users browse on multiple devices, and 67% reported shopping on multiple devices.
Unfortunately, there’s nothing you can do to collect more accurate data on users who visit your site on multiple devices.
Still, being aware of this possibility can help you develop informed hypotheses for strange user behavior on your site.
5. Google Analytics doesn’t reprocess information
Let’s say you have been tracking data using Google Analytics for two years.
Then, you decide to gain more insight into how users are converting by setting up a funnel for one of your most important conversion goals.
You’ll be able to start collecting data on how users move through that funnel. But Google Analytics will not retroactively apply your funnel to the data it’s already collected.
Your goal funnel will be visible from the time you set it up forward, but no data will be available before that time.
To be fair, that would be a lot to ask of a free service that processes the massive amount of data that Analytics processes every day.
But it’s important to be aware that if you set up goal funnels or add new filters to fix your data, these actions won’t impact the data already in your reports.
So if you add a new filter, don’t expect that all of your data will automatically reflect that change, and give a more accurate picture of your site’s performance.
You can add segments to view your historical data the way you want, as I described in the section above about dark traffic.
But make sure to take this step — otherwise, you’ll be working with data that doesn’t reflect your new filter changes.
6. Google Analytics isn’t in real-time
Although this doesn’t necessarily make your reports inaccurate, it’s important to remember that Google Analytics doesn’t display information in “real-time.”
You can see some real-time data in your Real Time report.
But for all of your other reports, it’s best to assume that Google Analytics is 24 hours behind.
So if you make any major changes to your site, wait at least a day to start looking for changes in Analytics.
Otherwise, the impact of your changes likely won’t be appearing yet — so it’s too soon to start analyzing your results.
7. The platform could be sampling your data
If you have a very active website and you try to run a report that includes a large amount of data, Google Analytics might use data sampling.
In other words, it won’t include all of your data in the report.
Including large quantities of data is extremely resource-intensive, so the platform can speed up the process by providing you with a subset of that data.
So if you see a yellow box like this one in the upper right corner of your report, your data is being sampled.
In the screenshot above, the report is only including data from about 30% of visits.
If your site doesn’t attract huge levels of traffic, and you don’t see this little box, you don’t need to worry about data sampling at all.
But if you do see this box, it’s important to recognize that your report doesn’t reflect all of your site’s traffic, and could be skewed as a result.
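To see why a sampled report can drift from the real numbers, consider this rough Python simulation. It is not how Analytics actually samples. The `sampled_estimate` helper, the session counts, and the conversion rate are all made up for illustration.

```python
import random


def sampled_estimate(sessions, sample_rate, seed=0):
    """Estimate total conversions from a random sample of sessions."""
    rng = random.Random(seed)
    sample_size = round(len(sessions) * sample_rate)
    if sample_size == 0:
        raise ValueError("sample_rate is too small for this data set")
    sample = rng.sample(sessions, sample_size)
    # Scale the sampled conversions back up to the full data set.
    return sum(sample) * len(sessions) / sample_size


# Hypothetical data: 100,000 sessions, 1 = converted, 0 = did not convert.
sessions = [1] * 5_000 + [0] * 95_000
estimate = sampled_estimate(sessions, sample_rate=0.30)
print(f"actual: {sum(sessions)}, estimated from a 30% sample: {estimate:.0f}")
```

The estimate will usually land close to the real total, but it is still an estimate, and the gap tends to matter more for small segments with few conversions.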
Conclusion
There are many different factors that play a role in the accuracy of your Analytics reports.
Fortunately, you have control over many of those factors. So if you find that any of the first seven listed in this article are impacting your reports, you can take steps to remedy those issues.
And as for the rest of the factors potentially impacting your reports, all you can do is be aware of them.
Let’s put it this way: It’s not ideal to build a house using a tape measure that measures a foot at 10 inches.
But as long as you’re aware of that error, and the error is consistent, you can still build a reasonably solid house.
If you can fix the tape measure, do so. But if you have no control over it, learn to make the judgments that you can, understanding that the data is flawed.
Even with slight discrepancies, Analytics is still a valuable tool for any site owner. But the more aware you are of those inaccuracies, the more informed decisions you can make to improve your site.
Source: https://www.crazyegg.com/