
How to Stop Ghost Spam in Google Analytics

Spam in Google Analytics (GA) is a serious problem. As spam has increased, many site owners have set up filter after filter to keep this useless data out of their reports. However, there is no need to worry. In this article, we cover the most common mistakes users make when dealing with spam and offer practical solutions. But first, let's explain how spam works. Jared Gardner wrote an article on spam a few months ago that examined its goals and gave many examples.

Types of spam:

There are two types of spam in Google Analytics:

1- Ghosts

2- Crawlers

Ghosts:

Most spam is of this type. These hits are called ghosts because they never reach your site. Keeping this in mind is the key to finding a suitable way to control them. The main purpose of Google Analytics is to track visitors on your site, so it may be surprising that this type of spam never interacts with your site at all. Instead, ghosts use the Measurement Protocol, which lets anyone send data directly to the Google Analytics servers. Using this method together with tracking IDs like UA-XXXXX-1, spammers send fake data to a property without even knowing which site they are hitting.
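To make this concrete, here is a minimal sketch of what such a hit looks like. It only builds the payload and sends nothing anywhere. The parameter names (v, tid, cid, t, dh, dp, dr) come from the Universal Analytics Measurement Protocol; the function name build_hit and every value below are made up for illustration.

```python
from urllib.parse import urlencode, parse_qs

def build_hit(tracking_id, hostname, page, referrer):
    """Return the URL-encoded body of a pageview hit (nothing is sent)."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID, e.g. UA-XXXXX-1
        "cid": "555",        # anonymous client ID (arbitrary value here)
        "t": "pageview",     # hit type
        "dh": hostname,      # document hostname: whatever the sender claims
        "dp": page,          # document path
        "dr": referrer,      # document referrer
    }
    return urlencode(params)

# A ghost hit: no real page was loaded, so the hostname is simply left empty.
payload = build_hit("UA-XXXXX-1", "", "/", "http://spam-referrer.example")
fields = parse_qs(payload, keep_blank_values=True)
print(fields["dh"], fields["dr"])
```

Notice that the hostname is just another field chosen by the sender. This detail matters later, when we build the filter.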

Crawlers:

Crawlers, unlike ghosts, do access your site. As the name suggests, they crawl your pages, even ignoring the rules (such as robots.txt) that tell robots not to read your site. When they leave, they leave behind a visit that looks legitimate. Crawlers are harder to identify because they know their target and use real data, but new ones rarely appear. So if you spot a suspicious visit in Analytics, search for it on Google, or check it against this list to see whether it is spam.

The most common mistakes in dealing with spam in Google Analytics:

We have been investigating this problem for the past few months. Based on the suggestions and discussions we have seen, users mainly make three mistakes when dealing with spam in Google Analytics:

1- Blocking ghost spam from the .htaccess file

The biggest mistake users make is trying to block ghost spam from the .htaccess file. This shows a misunderstanding of what the file does: its purpose is to allow or block access to your site. As we have seen, ghosts never access your site, so adding them to this file has no effect; it only adds unnecessary lines. Ghost spam usually appears in reports for a few days and then disappears. As a result, people think they have managed to block it, when in fact it stopped by itself and had nothing to do with their rules. Then, when the spammers come back, users conclude that their solution did not work and that the spam is getting past their blocks. The .htaccess file can only block crawlers such as buttons-for-website.com, because those actually access your site. Most spam cannot be blocked this way, and for that spam there is no solution other than filters.

2- Using the Referral Exclusion List to stop spam

The name of this list is misleading: its purpose is not to remove referrals from your reports. It serves a different goal. For example, when a customer makes an online purchase, they are sent to a third-party page for payment, and when they return Google Analytics records the return as a new visit. The Referral Exclusion List prevents this kind of visit from being recorded. If you use the list against spam, the referral field of those hits is left empty, because there is no earlier visit to attribute them to. As a result, the hits are recorded as direct visits. The spam is still there, now mixed in with direct traffic, where it is even harder to identify.
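The effect can be sketched with a deliberately simplified model. This is not how Google Analytics is implemented; it only illustrates the outcome described above for a hit with no earlier visit. The function name classify_source and the example domains are hypothetical.

```python
from urllib.parse import urlparse

def classify_source(referrer, exclusion_list):
    """Simplified model: an excluded or missing referrer is reported as direct."""
    host = urlparse(referrer).hostname if referrer else None
    excluded = host is not None and any(
        host == domain or host.endswith("." + domain)
        for domain in exclusion_list
    )
    if host is None or excluded:
        return "(direct)"
    return host

exclusions = ["spam-referrer.example"]  # hypothetical spam domain
print(classify_source("http://spam-referrer.example/", exclusions))
print(classify_source("http://news.example/post", exclusions))
```

In this model the spam hit is still counted; it has only moved from the referral report to direct traffic.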

3- Worrying about bounce rate and its effect on the site's credibility

Because of spam, users see changes in their bounce rate and start to worry that these changes will hurt their standing in the SERPs.

google-analytics-ghost-spam-2

Google does not use bounce rate changes from Google Analytics in its ranking algorithms. Here is an explanation of this point from Matt Cutts, the former head of Google's webspam team. His words should reassure you: although many sites have Google Analytics, not everyone uses it. Another common worry is being hacked. When people see new spam pages listed in their reports, they fear that their site has been compromised.

google-analytics-ghost-spam-3

The pages that spam shows in your reports do not exist at all. If you try to open them you get a 404 page, and your site has not been compromised. Still, make sure these pages really do not exist, because apart from spam there are real attacks that inject pages full of malicious keywords into a site to damage it.

When should you be concerned:

We will examine security problems and their impact on the site's credibility and rank later; for now, we focus on the importance of your data. The real concern is that spammers put fake information in your reports. How much damage this does depends on your site's traffic, but every site is attacked by spam. Small and medium sites are easily affected, because they are usually run by individuals with no analyst or webmaster watching over them. Large, high-traffic sites are affected too. Although the impact on them is very small, invalid traffic means false reports regardless of the website's size. As an analyst, you have to stay on top of what happens in your reports.

You need a filter to fight ghost spam.

The usual recommendation is to add a referral filter after a spam attack. Although this is a quick response, it has the following disadvantages:

  • Adding filters every week for new spam is tiring and time-consuming, especially if you manage many sites. Besides, by the time you add the filter, some of your data has already been polluted by spam.
  • Some spammers use direct visits along with referrals.
  • The filter does not stop these direct visits. Even if you exclude the referrals, you will still receive invalid traffic. This explains why some users see unusual spikes in direct traffic.

Fortunately, there is a solution that deals with all of these problems. Most spammers attack Google Analytics tracking IDs, meaning they do not know which site they are hitting. That is why the hostname is either not set or a fake one is used (see the table below).

google-analytics-ghost-spam-4

In this table, you can see that they used strange names or did not even bother to set one. A valid visit, by contrast, always uses a real hostname. In most cases this is your domain, but payment services, translation services, or any other site where you have placed your Google Analytics code can also appear as valid hostnames.

google-analytics-ghost-7

Based on this, you can create a filter that includes only hits with valid hostnames. Whether the ghost spam shows up as referrals, keywords, pageviews, or even direct visits, this filter automatically excludes all of it.
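The idea behind such an include filter can be sketched on a few sample hits. The records, field names, and domains below are invented for illustration; real hits do not come in this form.

```python
# Hypothetical valid domains for the property.
VALID_DOMAINS = {"yourmaindomain.com", "payingservice.com"}

def has_valid_hostname(hit):
    """Keep a hit only if its hostname is a valid domain or one of its subdomains."""
    host = (hit.get("hostname") or "").lower()
    return any(host == d or host.endswith("." + d) for d in VALID_DOMAINS)

hits = [
    {"hostname": "blog.yourmaindomain.com", "source": "google"},
    {"hostname": "(not set)", "source": "spam-referrer.example"},  # ghost referral
    {"hostname": "fake-host.example", "source": "(direct)"},       # ghost direct visit
]

kept = [h for h in hits if has_valid_hostname(h)]
print([h["hostname"] for h in kept])
```

The check never looks at the source of the hit, which is why the same rule removes ghost referrals and ghost direct visits alike.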

To create a filter, you need to find a report of hostnames as follows:

1- Go to the reporting section in Google Analytics.

2- In the left panel, click on the Audience option.

3- Under Technology, select the Network option.

4- Click on Hostname at the top of the report that opens.

google-analytics-ghost-5

You will see the list of all hostnames, including the ones used by spam. Make a list of all the valid hostnames, for example:

yourmaindomain.com
blog.yourmaindomain.com
es.yourmaindomain.com
payingservice.com
translatetool.com
anotheruseddomain.com

For any site, small or large, this list will include the main domain and possibly several subdomains. After making sure that you have collected all the valid hostnames, build a regular expression like the following:

yourmaindomain\.com|anotheruseddomain\.com|payingservice\.com|translatetool\.com

 

There is no need to enter every subdomain in the expression above; the main domain matches all of them. If you don't already have a view without filters, create one first. Then create a Custom Filter: select the Include option, choose Hostname as the Filter Field, and paste your expression into the Filter Pattern box.
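Before saving the filter, it can help to test the expression offline. The sketch below builds the same pattern from a list of domains and checks a few hostnames. It assumes the filter does a partial, case-insensitive match, which is what re.search with re.IGNORECASE imitates here; the helper name is_included is hypothetical.

```python
import re

valid_domains = [
    "yourmaindomain.com",
    "anotheruseddomain.com",
    "payingservice.com",
    "translatetool.com",
]

# re.escape turns each "." into "\." so that it matches a literal dot.
pattern = "|".join(re.escape(domain) for domain in valid_domains)
hostname_filter = re.compile(pattern, re.IGNORECASE)

def is_included(hostname):
    """Partial match: the pattern may appear anywhere in the hostname."""
    return hostname_filter.search(hostname) is not None

print(pattern)
for host in ["blog.yourmaindomain.com", "(not set)", "fake-host.example"]:
    print(host, is_included(host))
```

Because the match is partial, subdomains such as blog.yourmaindomain.com pass without being listed. The same property means that any hostname merely containing one of your domains would pass too, so keep the list limited to domains you actually use.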