How to Exclude Bot Visits/Pageloads in Statcounter Web Analytics

For those who prefer Statcounter for monitoring the day-to-day activity of their blog, a few adjustments can help narrow your stats down so that they reflect the actual visits and pageloads your site is getting, rather than a rough estimate.

By default, Statcounter does not only track human visitors to a webpage; it also tracks search engine bots and other bots, and counts them in the visit and pageload statistics on your dashboard. To show how much this can affect your tracking, we'll share some observations. We have been Statcounter users for years, and across most of our blogs we have seen that each time we publish a post, at least two Google bots visit the new page almost immediately to crawl it (which doesn't necessarily mean the post has been or will be indexed). Each bot has a different IP address and sometimes a different domain extension.

That is for a single post on a site receiving about 400 visits a day. In our experience, the more traffic a site gets, the more bots it attracts. So if you are a blogger getting a thousand visits and publishing about 10 articles a day, you can expect no fewer than 40 Google bot visits a day (excluding other bots), all of which are added to your visit and pageload stats in Statcounter.

For now, we will focus on Google bots only. There is one other thing you need to know: Google has hundreds (if not thousands) of crawlers (or bots) roaming the web, and each crawler has its own IP address. Statcounter can recognize a returning human visitor even when their IP address or browser changes, using techniques based on cache and cookies, but this doesn't seem to work for bot/crawler visits. A bot can be described as a lifeless being in a living body. "Ghost" would have been the perfect word, except that a bot leaves a footprint (IP address, region/location, etc.). Many bots, though not all, manage to avoid being screened the way real human visitors are.

In short, each Google bot visit is counted as a real visitor in Statcounter by default. So if Statcounter records 950 visits and 1,450 pageloads in a day, the real figures might be closer to 800 visits and 1,200 pageloads. Google Analytics is not exempt either, but that is a separate topic we won't cover here (Google Analytics has bot filtering, though you need to enable it). We believe every webmaster knows that analytics are the best way to judge how well a site is doing. They should never be underrated or overlooked, and ignoring this small-looking (but significant) miscalculation can prove costly. So, how do we solve the problem?
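Before getting to the fix, it helps to put the example figures above into perspective. The short Python sketch below works out how far the reported counts overstate the real ones. The function name is our own, and the numbers are only the illustrative ones from the paragraph above, not measurements.

```python
def overcount_percent(reported: int, actual: int) -> float:
    """Percentage by which the reported figure exceeds the actual one."""
    return (reported - actual) / actual * 100


# Illustrative figures from the example above (not real measurements).
visits_over = overcount_percent(950, 800)
pageloads_over = overcount_percent(1450, 1200)

print(f"Visits overstated by {visits_over:.1f}%")
print(f"Pageloads overstated by {pageloads_over:.1f}%")
```

With these example numbers, both figures are inflated by roughly a fifth, which is large enough to distort any growth trend you read from the dashboard.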

As said earlier, we will focus on Google bots only. After monitoring Google bot activity (via Statcounter) for a long time, we were able to identify the IP patterns of the Google bots that usually crawl our sites, and that gave us a way to exclude them from our analytics. Statcounter has an option that lets you filter out IPs you don't want it to track, and this option is what lets you reduce the miscalculation in your site analytics. Since we have already identified these Google bot IP ranges, the steps below explain how to apply them in your Statcounter settings.

1.) Log in to your Statcounter dashboard on PC or mobile.

2.) Scroll down and click on Project Settings.

3.) You will see IP Blocking in the list of options; click on it.

4.) Enter these IPs in the box (exactly as written below):

64.233.*.*
66.102.*.*
66.249.*.*

The asterisk (*) here functions as a wildcard: it stands for any value in that position, so each line covers every IP address in that range. There are hundreds of Google bots on the net, each with its own IP following nearly the same pattern, and a wildcard is the only practical way to capture them all. By "all" we do not mean every Google IP, only the most common ranges. Please note that different Google bot IPs apply to different sites and regions, so you may need to work out yours.
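To see what these wildcard lines cover, here is a minimal Python sketch that converts each pattern into an IP network and tests addresses against it. It assumes a trailing `*` means "any value from 0 to 255", which is how we read the patterns. It is not Statcounter's own code, and the helper names are ours.

```python
import ipaddress

# Patterns from the list above; "*" is assumed to mean any value 0-255.
BLOCK_PATTERNS = ["64.233.*.*", "66.102.*.*", "66.249.*.*"]


def wildcard_to_network(pattern: str) -> ipaddress.IPv4Network:
    """Turn a trailing-wildcard pattern such as 66.249.*.* into a network."""
    octets = pattern.split(".")
    if len(octets) != 4:
        raise ValueError(f"expected four octets: {pattern!r}")
    fixed = [o for o in octets if o != "*"]
    # Only trailing wildcards map cleanly onto a single network block.
    if octets[:len(fixed)] != fixed:
        raise ValueError(f"wildcards must come last: {pattern!r}")
    prefix_len = 8 * len(fixed)
    padded = fixed + ["0"] * (4 - len(fixed))
    return ipaddress.IPv4Network(".".join(padded) + f"/{prefix_len}")


NETWORKS = [wildcard_to_network(p) for p in BLOCK_PATTERNS]


def is_blocked(ip: str) -> bool:
    """True if the address falls inside any of the wildcard ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NETWORKS)


print(wildcard_to_network("66.249.*.*"))  # the equivalent CIDR block
print(is_blocked("66.249.66.1"))          # inside one of the ranges
print(is_blocked("8.8.8.8"))              # outside all of them
```

Each `x.y.*.*` line is equivalent to a /16 block of 65,536 addresses, so these ranges are broad. That is worth keeping in mind if you see legitimate visitors from nearby addresses.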

Doing the above will give you a more accurate result. If you notice other bots crawling your site and distorting your Statcounter analytics, just write down their IPs and add them to the block list. Also, make sure you create a Blocking Cookie (under Installation and configuration settings).

Basic Ways to Identify a Bot

Some might ask: how do we identify a bot? Identifying a bot is usually simple, and only occasionally complex. You might be wondering why we haven't mentioned Yandex and Bing bots. In our experience, these bots appear to work in stealth mode and don't reveal their true identity. It is easy to identify Google bots just by glancing at the label names in Statcounter (and other analytics tools), because they all wear a Google badge and are traceable. Bing bots don't wear a Bing badge; they show up under something else that is not easily identifiable. And unlike Google bots, they rarely crawl a site multiple times a day, so they pose little threat of inflating your statistics.
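The "Google badge" check can also be done in code when your log gives you a host name for the visitor. The sketch below is a rough illustration only: it tests whether a host name ends in `googlebot.com` or `google.com`, two domains Google's crawler documentation names for reverse DNS verification. The function name and sample host names are our own, and a thorough check would also look the host name up again to confirm it maps back to the same IP, which this sketch does not do.

```python
# Domain suffixes assumed here, based on Google's crawler verification guidance.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def looks_like_google_host(hostname: str) -> bool:
    """Suffix check only; it does not perform any DNS lookup."""
    host = hostname.strip().rstrip(".").lower()
    return host.endswith(GOOGLE_SUFFIXES)


print(looks_like_google_host("crawl-66-249-66-1.googlebot.com"))
print(looks_like_google_host("googlebot.com.example.net"))
```

Matching on the leading dot matters: it stops look-alike names such as `notgooglebot.com` from passing the check.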

Do not forget that there are hundreds of other search engines on the planet. Thankfully, as far as we can tell, more than half of them rely on Google's search results to deliver their own, meaning they do not send out bots to harvest the web. Also, some search engines are region-specific in gathering their information, which means they mainly cater for sites in a particular zone, e.g. yandex.ru.

How to Identify Bots

1.) They tend to land on the homepage rather than on individual pages of your site.

2.) They are persistent, very persistent (especially the dangerous ones).

3.) They show a 100% bounce rate in analytics.

4.) Most of the time they are anonymous.


Any visit that shows these characteristics should be examined.
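The four signs above can be turned into a simple checklist in code. The Python sketch below is only an illustration: the `VisitorSummary` record and its fields are hypothetical (Statcounter does not export data in this shape), and the persistence threshold of five visits a day is an arbitrary assumption you would tune for your own site.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VisitorSummary:
    """Hypothetical per-visitor summary built from your own log export."""
    landing_page: str      # path of the first page hit, e.g. "/"
    sessions_today: int    # number of separate visits today
    bounced_sessions: int  # visits that loaded only one page
    label: str             # ISP/host label from the log, "" if unknown


def bot_signals(v: VisitorSummary, persistence_threshold: int = 5) -> List[str]:
    """Return which of the four warning signs this visitor shows."""
    signals = []
    if v.landing_page == "/":
        signals.append("lands on homepage")
    if v.sessions_today >= persistence_threshold:
        signals.append("persistent")
    if v.sessions_today > 0 and v.bounced_sessions == v.sessions_today:
        signals.append("100% bounce rate")
    if not v.label.strip():
        signals.append("anonymous")
    return signals


suspect = VisitorSummary("/", 12, 12, "")
reader = VisitorSummary("/2017/05/some-post.html", 1, 0, "Example ISP")

print(bot_signals(suspect))
print(bot_signals(reader))
```

A visitor showing one sign is not necessarily a bot, since plenty of real readers land on the homepage and leave. It is the combination of several signs that makes a visit worth examining.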

About CCN World Tech

CCN World Tech is a platform dedicated to providing the latest tech-related news and articles from around the world. We also teach people how to get the best out of their handheld and other tech devices. You can follow us on Facebook, Twitter, and Google+.

