Semalt Expert Elaborates On How To Fight Rogue Traffic In Your Google Analytics

There are instances where rogue traffic is sent to a Google Analytics property. Rogue traffic is traffic that does not belong in one's reporting statistics: it does not come from real visits to the site, so it produces false data that skews the statistics. The key to ending this menace is a hostname include filter.

Lisa Mitchell, the Customer Success Manager of Semalt Digital Services, explains how to avoid annoying rogue traffic and stay safe.

There are three common causes of rogue traffic:

  • Traffic from a test server may send data to the same Google Analytics property.
  • The same tracking code may be used accidentally on another website owned by the same user.
  • The property ID may be hijacked and used to send false data originating from unrelated websites.

Rogue traffic can be kept out of reporting data with either an exclude filter or an include filter. However, an include filter is recommended, because it ensures that reports contain only data from the correct hostname, whereas an exclude filter blocks only the hostnames that are already known. A hostname is the domain on which the website of interest is running. The following are required before setting up a hostname filter:

  • Confirmation from the developers of the domain on which the website runs.
  • A test view for monitoring the filter's effects before the filter is moved to the main reporting view.
  • Working knowledge of how to write regular expressions.

To find out whether a Google Analytics account is reporting unwanted sources, check the hostname report. In the Audience section, select Technology, then Network, and then choose Hostname as the primary dimension.
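Once the hostname report has been exported, its rows can be screened outside Google Analytics. The following is a minimal sketch, assuming the report has been reduced to (hostname, sessions) pairs; the sample rows, the `EXPECTED` set, and the function name `unexpected_hostnames` are hypothetical and only illustrate the idea.

```python
# Hypothetical rows taken from an exported hostname report: (hostname, sessions).
rows = [
    ("www.example.com", 1200),
    ("staging.example.com", 45),
    ("translate.googleusercontent.com", 12),
    ("spam-site.invalid", 300),
]

# Hostnames that are expected to send data (an assumption for this sketch).
EXPECTED = {
    "www.example.com",
    "translate.googleusercontent.com",
    "webcache.googleusercontent.com",
}


def unexpected_hostnames(rows, expected):
    """Return rows whose hostname is not expected, largest session count first."""
    rogue = [(host, sessions) for host, sessions in rows if host.lower() not in expected]
    return sorted(rogue, key=lambda row: row[1], reverse=True)


rogue_rows = unexpected_hostnames(rows, EXPECTED)
for hostname, sessions in rogue_rows:
    print(f"{hostname}: {sessions} sessions")
```

Any hostname this prints is a candidate for rogue traffic and is worth confirming with the developers before a filter is written.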

If no hostname filter has been applied, traffic from development servers may be received. Traffic from http://translate.googleusercontent.com/ and http://webcache.googleusercontent.com/ may also be received, and should be kept in the reports, because:

1. They show when the Google Translate service has been used on a website.

2. They show when the "cached" option has been selected on a search result.

How to Create the Regular Expression Pattern

  • An expression builder should be used to create the expression.
  • A syntax cheat sheet should be kept at hand.
  • Metacharacters should be escaped: a full stop matches any character unless a backslash (\) is placed in front of it. A forward slash (/) does not escape it.

An example of a regular expression is freshegg\.(co\.uk|com)|googleusercontent, which matches hostnames containing freshegg.co.uk, freshegg.com, or googleusercontent. The filter should be thoroughly tested because once data has been excluded from the reporting view, it can never be recovered.
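Before the pattern goes into a filter, it can be tried against a handful of hostnames. The sketch below uses Python's `re` module and assumes the filter matches the pattern anywhere in the hostname and ignores case; that is an assumption about the filter's behaviour, so it should be confirmed in the test view. The sample hostnames and the helper `is_included` are hypothetical.

```python
import re

# Pattern from the article. Partial, case-insensitive matching is an
# assumption about how the include filter evaluates it.
PATTERN = re.compile(r"freshegg\.(co\.uk|com)|googleusercontent", re.IGNORECASE)


def is_included(hostname):
    """Return True if the hostname would pass the include filter under these assumptions."""
    return PATTERN.search(hostname) is not None


# Hypothetical hostnames, chosen to exercise each branch of the pattern.
samples = [
    "www.freshegg.co.uk",
    "freshegg.com",
    "translate.googleusercontent.com",
    "fresheggXcom",          # rejected because the full stop is escaped
    "staging.example.net",   # unrelated domain
    "dev.freshegg.com",      # passes: the pattern is not anchored
]

for hostname in samples:
    verdict = "include" if is_included(hostname) else "drop"
    print(f"{hostname}: {verdict}")
```

Because the pattern is not anchored, a development subdomain under the same domain still passes. If the test server lives on such a subdomain, a stricter pattern may be needed.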

How to Set Up the Filter in the Test View

Give the filter a clear name that identifies the property it belongs to, so that it does not get carried to unwanted properties, and then follow this procedure:

Step 1

Go to the filter settings of the test view

Step 2

For the filter type, select "Custom" and then the "Include" option

Step 3

In the "Filter Field" option, select "Hostname"

Step 4

In the "Filter Pattern" field, enter the hostname regular expression

Testing the Hostname Filter

The following steps should be carried out before the filter is applied to the main reporting view:

1. Check the real-time reports to identify any issues as they occur

2. Wait 2 to 5 days, depending on the volume of traffic

3. Compare the data from the test view with that of the reporting view
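The comparison in step 3 can be sketched as a small script. It assumes daily session counts have been exported from both views; the figures, the 20% threshold `MAX_DROP`, and the function name `suspicious_days` are hypothetical and should be adapted to the site's normal traffic.

```python
# Hypothetical daily session counts exported from each view.
reporting_view = {"2024-01-01": 1000, "2024-01-02": 1100, "2024-01-03": 950}
test_view = {"2024-01-01": 940, "2024-01-02": 1045, "2024-01-03": 380}

# Largest share of sessions the filter is expected to remove (assumption).
MAX_DROP = 0.20


def suspicious_days(reporting, test, max_drop):
    """Return (day, drop) pairs where the filtered view lost more than max_drop of sessions."""
    flagged = []
    for day in sorted(reporting):
        total = reporting[day]
        kept = test.get(day, 0)
        drop = (total - kept) / total if total else 0.0
        if drop > max_drop:
            flagged.append((day, round(drop, 2)))
    return flagged


flagged = suspicious_days(reporting_view, test_view, MAX_DROP)
for day, drop in flagged:
    print(f"{day}: filter removed {drop:.0%} of sessions")
```

A day with an unexpectedly large drop suggests the pattern is excluding a legitimate hostname and should be reviewed before the filter reaches the main reporting view.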

Filters should not be applied to a reporting view on Fridays, because any problems they cause will not be noticed until Monday.