Referral Spammer Hitlist (.htaccess directives included)

Referral spam has been problematic for a long time; a quick search turned up an extensive list of referral spammers from 2009 on Perishable Press.

This thread from the Piwik repo on GitHub is more current (started in 2014) and mentions many of the sites currently contaminating my analytics reports.

My Referral Spammer Hitlist

[Screenshot: Google Analytics reports and chart showing referral spam in 2014]

The screenshot above is from Google Analytics (the last 6 months) for a site where I have access to the analytics data but not the codebase or server (otherwise I’d have remedied the situation already). My significant other’s sister is the creator/founder/owner/president of Megan Lee Designs and, after learning of my vocation, granted me access to her analytics.

While 2.5% of traffic may not seem especially significant, it amounted to 16% (yep, 1 in 6) of their referral traffic! 42 of 143 sources of referral traffic were actually this referral spammer garbage. (In the screenshot I included every website that was reported as having sent more than 1 visitor; another 35 domains (well, subdomains) list a single visit, and 34 of those are subdomains of semalt.com. Grumble, grumble.)

When I first really noticed the problem in my analytics reports (in mid-2014), I came up with a blacklist. I called it a hitlist because, as I told my colleague Charlie, “No workplace is complete without a hitlist” ;-)

In the spirit of open-source, I give you my current list of directives:

<IfModule mod_rewrite.c>
RewriteEngine on
# Block referral spammers
# Added September 2014
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} kambasoft\.com [NC,OR]
RewriteCond %{HTTP_REFERER} darodar\.com [NC,OR]
RewriteCond %{HTTP_REFERER} savetubevideo\.com [NC,OR]
RewriteCond %{HTTP_REFERER} descargar-musica-gratis\.net [NC,OR]
RewriteCond %{HTTP_REFERER} baixar-musicas-gratis\.com [NC,OR]
# Added in December 2014
RewriteCond %{HTTP_REFERER} 7makemoneyonline\.com [NC,OR]
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]
RewriteRule .* - [F,L]
</IfModule>

Be forewarned: I’m no back-end developer or SysAdmin. While I can piece together enough RegEx to get simple things done, it is entirely possible that the directives quoted later in this post are more efficient (and they are certainly more comprehensive).

My logic when I wrote them was simple:

  • I don’t care about protocols (http:// or https://)
  • I don’t care about subdomains (they often use lots of them)
  • I don’t care about case (hence the [NC] ‘no case’ flag, which makes the match case-insensitive)

If the referrer contains that series of characters (that string), I want the RewriteRule to kill the request before it hits my site (and my analytics), hence [F,L]: the ‘forbidden’ flag, which returns a 403 response, and the ‘last’ flag, which stops processing further rewrite rules.
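To sanity-check that matching logic without touching a live server, here is a minimal Python sketch. It assumes Python’s re module treats these simple escaped-domain patterns the same way Apache’s regex engine does, and the sample referrers are made up for illustration.

```python
import re

# Illustrative subset of the patterns from the RewriteCond lines.
PATTERNS = [r"semalt\.com", r"darodar\.com", r"buttons-for-website\.com"]

# re.IGNORECASE plays the role of [NC]; re.search is unanchored, so a
# pattern may match anywhere in the referrer (any protocol, any subdomain).
BLOCKLIST = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def is_spam_referrer(referrer: str) -> bool:
    """Return True if any blocklist pattern occurs anywhere in the referrer."""
    return any(rx.search(referrer) for rx in BLOCKLIST)


# Hypothetical referrer values, invented for this example.
samples = [
    "http://semalt.com/",
    "https://76.semalt.com/crawler.php",
    "HTTP://WWW.DARODAR.COM/",
    "https://www.google.com/",
]
results = {s: is_spam_referrer(s) for s in samples}

for referrer, blocked in results.items():
    print("BLOCK" if blocked else "allow", referrer)
```

Because this is a plain substring match, a referrer such as https://example.org/?ref=semalt.com would be blocked too. I’m fine with that trade-off, but it is worth knowing.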

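For anyone who might be at the copy-and-paste skill level (I myself was until fairly recently), be mindful of the ‘OR’ part of those directives. [NC,OR] is needed on every RewriteCond except the last one. Conditions without [OR] are combined with AND, so omitting it on an earlier condition means the rule will rarely, if ever, match, and adding it to the last condition can make the rule match every request, blocking legitimate visitors as well. A malformed directive (a typo inside the brackets, for example) will likely cause a 500-series error on your server.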
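One way to sidestep the [OR] bookkeeping is to generate the block from a plain list of domains. The following Python sketch is only an illustration: build_directives is a hypothetical helper name, and it only escapes dots, so it assumes domain names made of letters, digits, hyphens, and dots.

```python
def build_directives(domains):
    """Render a mod_rewrite block that returns 403 for the given referrers.

    Every RewriteCond gets [NC,OR] except the last, which gets [NC] only.
    """
    if not domains:
        raise ValueError("need at least one domain")

    lines = ["<IfModule mod_rewrite.c>", "RewriteEngine on"]
    last = len(domains) - 1
    for i, domain in enumerate(domains):
        pattern = domain.replace(".", r"\.")  # escape dots for the regex
        flags = "[NC]" if i == last else "[NC,OR]"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {pattern} {flags}")
    lines += ["RewriteRule .* - [F,L]", "</IfModule>"]
    return "\n".join(lines)


print(build_directives(["semalt.com", "darodar.com", "buttons-for-website.com"]))
```

Adding a new spammer then means appending one entry to the list and regenerating, rather than remembering to move the flags around by hand.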

A few months ago I addressed the most flagrant perpetrators—namely semalt.com—but suddenly I was getting visits from buttons-for-website.com.

Other Resources for Blocking Referral Spam

I recently tweeted about buttons-for-website.com and got a reply from @hbeckner who wrote a nice article (in German, but it translated well and the illustrations are in English so I had no problems) about it and the motivation behind it. He pointed me to this thread on the WordPress support forums.

In that thread someone shares the following .htaccess directives to block these spammers:

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://.*backgroundpictures\.net/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*embedle\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*extener\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*fbfreegifts\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*feedouble\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*feedouble\.net/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*joinandplay\.me/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*joingames\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*kambasoft\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*musicprojectfoundation\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*myprintscreen\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*openfrost\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*openmediasoft\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*savetubevideo\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*semalt\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*softomix\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*softomix\.net/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*softomix\.ru/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*soundfrost\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*srecorder\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*vapmedia\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*videofrost\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*videofrost\.net/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*youtubedownload\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*zazagames\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*buttons\-for\-website\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*darodar\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*7makemoneyonline\.com/ [NC]
RewriteRule ^(.*)$ - [F,L]
</IfModule>

That list is a bit more extensive than mine and I also find it interesting to see how people write the directives in different ways.
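The practical difference between the two styles shows up when both patterns are run over a few referrer strings. This Python sketch assumes Python’s re module behaves like Apache’s regex engine for these patterns, and the sample referrers are invented for illustration.

```python
import re

# Forum style: anchored to http:// and requiring a slash after the domain.
anchored = re.compile(r"^http://.*semalt\.com/", re.IGNORECASE)
# My style: the bare domain, matched anywhere in the referrer.
unanchored = re.compile(r"semalt\.com", re.IGNORECASE)

# Hypothetical referrer values.
samples = [
    "http://semalt.com/",
    "http://semalt.semalt.com/crawler.php?u=http://example.com",
    "https://semalt.com/",
    "http://semalt.com",
]

comparison = [
    (s, bool(anchored.search(s)), bool(unanchored.search(s))) for s in samples
]

for referrer, hit_anchored, hit_unanchored in comparison:
    print(f"{referrer!r}: anchored={hit_anchored}, unanchored={hit_unanchored}")
```

In this sketch the anchored pattern misses the https:// referrer and the one without a trailing slash, while the bare-domain pattern catches all four.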

I’ll try to keep this post up-to-date for my own use and yours if so inclined.

Data-driven digital marketing on a shoestring budget

A DIY guide to making the most of Google services through data sharing

This post expands on a short presentation entitled ‘Making the most of Google Services through data sharing’ I will be giving at the Web Analytics Wednesday Meetup in Columbus, Ohio on February 19, 2014.

The primary reasons I use Google services are low cost and staggering reach.

Most of the tools mentioned are free, and the others can be set up for less than $100, at least initially; their current evergreen AdWords introductory offer is ‘Spend $25 and get an additional $75 credit’.

In the presentation I touch on utilizing Google Tag Manager (GTM), Google Analytics (GA), Google Webmaster Tools (GWT) and AdWords, but there are other tools that I regularly use in conjunction with those.

Those include:

  • Google Trends
  • Google Consumer Surveys
  • Google+ Local (aka Google Places)
  • Google+ (‘profiles’ for people, ‘pages’ for businesses)

All of the tools above are free.


Analytics Checklist

These are listed in the order you will come across them when proceeding top to bottom, left to right (not by importance).

Admin Settings

[Screenshot: Google Analytics admin interface]

    • Account
        • Data Sharing: share with other Google products
        • AdWords Linking: link to AdWords account
    • Property
        • Link to Webmaster Tools: yes
        • Enhanced Link Attribution: enable
        • Session Settings: change default Campaign Tracking from 6 months to 24 months
        • Organic Search Sources: add duckduckgo.com
    • View
        • Timezone: set
        • Default Page: set if applicable
        • E-Commerce: enable if applicable
        • (On)Site Search: enable if applicable
        • Goals: IMPORTANT: configure Macro (and possibly Micro) Goals
        • Content Grouping, Segments, Custom Alerts, Scheduled Emails, Shortcuts: as appropriate

Reporting

[Screenshot: Google Analytics custom report gallery for importing reports]

In the past, I have published compilations of links to custom report templates that I found useful so I would have them in a single location for easy reference.

Thankfully, that is no longer needed, as Google now offers a gallery of custom reports that can be imported directly.

    • Import From Gallery
        • Occam’s Razor Awesomeness: cream of the crop from Avinash Kaushik
        • New Google Analytics User Starter Bundle: from the Google Analytics team
        • Justin Cutroni’s…everything: view all by him and import whatever might be relevant

There’s a fair amount of overlap between these (and some, like those analyzing the number of keywords in a query, are antiquated now that the majority of search is encrypted), so I recommend importing them all and cherry-picking the ones that you like best.


References

Data Layer Specification Draft:

www.w3.org/2013/12/ceddl-201312.pdf

Track Keyword Ranking as an event

cutroni.com/blog/2013/01/14/a-new-method-to-track-keyword-ranking-using-google-analytics/

Rename the Global Object

[Screenshot: Google Tag Manager setting to rename the global object]

developers.google.com/analytics/devguides/collection/analyticsjs/advanced

Prevent data loss with remarketing tag

www.blastam.com/blog/index.php/2013/04/google-analytics-remarketing-tag-concerns-solved/

CRM integration (Universal)

groups.google.com/forum/#!topic/google-analytics-data-export-api/l7-b5FoyW5Q

CRM integration (Async)

cutroni.com/blog/2009/03/18/updated-integrating-google-analytics-with-a-crm/

Encrypted Search and Google Hummingbird: My Ideas

The collective SEO and analytics communities have a lot to talk about right now: Google released Hummingbird, its biggest algorithm change in over a decade, and made it clear that it is rapidly moving towards 100% encrypted search (which means no keyword referral data for those of us who work with analytics).

I started this post in response to a thread on LinkedIn, but I realized that I had more to say than could fit there. If you searched these terms and found this post you probably have a job like mine, so feel free to skim, skip to the possible solutions, act like you knew it all along, or share with others…please! :)

Why Semantic Search Is Better than Keywords

Keyword data can be valuable but let me explain a real world pitfall: I’d like this site to rank for ‘SEO company columbus ohio’, yet a larger proportion of queries are for ‘SEO companies…’

Which version (or both) should I put in the title of the home page? What about the h1 and that precious first line of the first paragraph? And what about ‘firm(s)’ or ‘consultant(s)?’

‘Companies’ is inaccurate, but has twice the volume of ‘company’…hmm.

I think we all see where this is headed.

What modern inbound marketer hasn’t faced such a dilemma? Who among us hasn’t competed with a large organization with no local presence that has optimized for exactly the most commonly searched product or service phrasing followed by a city or state?

That is keyword search and it kinda sucks; it’s dumb and easy to manipulate. For another good explanation check out this Hummingbird FAQ by Danny Sullivan, especially the examples about halfway down the article.

Semantic (linguistic or Semantic Web) and intent-based search (Hummingbird) seem a little more natural and genuine. Search engines had already gotten pretty good at understanding intent, interests, misspellings…in short, natural language. You know, build sites for users, not for search engines and all that jazz.

Potential Solutions to the Lack of Keyword Data

[Screenshot: search result stating that Yahoo! got more traffic than Google in 2013]

First of all, let’s not forget there are other search engines, and I haven’t heard that they plan to stop passing keyword data. Even if Bing and Yahoo! continue at a humble 20% search market share, that is enough to provide a meaningful sample, and both of them have been more aggressive in their marketing lately (recall Bing’s ‘Bing It On’ campaign, and that Yahoo! got more visitors than Google for the first time since 2011).

As for the rest of these, I’ll spare you the prose; here’s the list:

Note: I am always careful to sort queries by Google property, and then filter out searches that are likely to have lower ‘buying intent’ (e.g. image searches). If you don’t do this, it is entirely possible that pages which receive a lot of image/video traffic will skew the data, making it harder to divine searched phrases.

Lastly, use controlled experiments. Maybe you simultaneously create 2 very similar pages, each with a phrase variation to test. Give them a month to get indexed and see which does better (rank, conversions, visitors, etc.). Granted, that seems a little spammy and is a pain in the butt. However, if you are feeling bold, you could alternate phrasing every month or few, and after a few cycles you could probably get a decent idea of which performs better.

Google Analytics Custom Reports, Advanced Segments via Avinash Kaushik

I’m a huge fan of Avinash Kaushik’s digital marketing blog. He mostly writes about analytics and he explains very clearly how it is relevant to real world business.

One of the things I find most useful is that he often shares assets such as custom reports and advanced segments.

As a matter of fact, I find it so useful that I found myself repeatedly going back to the blog and hunting through his articles for those assets (and that is time consuming).

Sometimes they work, sometimes they don’t (I have only ever had one report in the comments feed work—no idea why) and I assume it is due to incompatible versions of the assets among different versions of Google Analytics.

This is my cheatsheet to analytics configuration assets that I have found there.

Custom Reports:

All 14 Custom Reports
While signed into GA click on this link: https://www.google.com/analytics/web/template?uid=EKH3li44SlqD_BMlbRUd9g

Advanced Segments:

All 8 Advanced Segments
Just click right here: https://www.google.com/analytics/web/template?uid=Yiyp_RDNRTO7f3wwqr0bWg

Dashboard:

WordPress Dashboard
Available here: https://www.google.com/analytics/web/template?uid=PWVDHCrqTUCrMqg2zxH3FA

I want to give credit where credit is due but also tell anyone who finds this where they can find more details. I spent a few hours today trying to find every post that contained the links to these assets, but I was unable to. However, I did find most, so here’s the detailed, individual list.

Custom Reports:

www.kaushik.net/avinash/best-downloadable-custom-web-analytics-reports
AK: Visitor Acquisition Efficiency Analysis
https://www.google.com/analytics/web/template?uid=KrFOxvPiS_65PrygiZ-caw
AK Content Efficiency Analysis Report v2
https://www.google.com/analytics/web/template?uid=btamH0_tSVqiNy8LhglWjQ
AK: Paid Search “Micro-Ecosystem” Report
https://www.google.com/analytics/web/template?uid=zqq78eNkSByiiIcIGnMOYA
AK: Content Efficiency Analysis Report
https://www.google.com/analytics/web/template?uid=LwyPayPLQGW9sexqO1WnJA
www.kaushik.net/avinash/google-analytics-custom-reports-paid-search-campaigns-analysis
AK: Search Traffic (Excluding Not Set, Not Provided)
https://www.google.com/analytics/web/template?uid=8o_sz196RmqFIn_ctfswSQ
AK: PPC Keyword/Matched Query Report
https://www.google.com/analytics/web/template?uid=6eW4itYMQvWNtZpCwXxAng
AK: E2E Paid Search Report
https://www.google.com/analytics/web/template?uid=d_oDoGmIQ56JyOgDf9gGtw
www.kaushik.net/avinash/actionable-web-analytics-custom-reports-advanced-segments
AK:Content Efficiency & KW Drilldown Ecommerce Rpt
https://www.google.com/analytics/web/template?uid=r_aF6Vh7TRidFIKXh_xrJg
www.kaushik.net/avinash/google-secure-search-keyword-data-analysis
AK: Google httpS change Impact
https://www.google.com/analytics/web/permalink?type=custom_report&uid=I3_ojx0zRYycZcCjbcrxzg
AK: Key Word Performance Analysis
https://www.google.com/analytics/web/permalink?type=custom_report&uid=rTrR8e_8QXiM_y5lkl2zSA
AK: All Search Performance
https://www.google.com/analytics/web/template?uid=6AA5gc2tT9qVQTTY00Ch3g
AK: All Traffic Sources e2e
https://www.google.com/analytics/web/template?uid=VPau_Vm4TRqpiEwqQ9FkoQ
AK: Complete Mobile Performance Report
https://www.google.com/analytics/web/template?uid=6fG5KWDZQNaLHg3FaPdSmA
AK: Landing Pages Analysis
https://www.google.com/analytics/web/template?uid=ok26ZT0BStyTYT-Xkv6Esw

Advanced Segments (by blog post):

www.kaushik.net/avinash/advanced-analytics-visitor-segments-engagement-social-media-search-long-tail
AK: All Social Media Visits
https://www.google.com/analytics/web/template?uid=aRzB5vouQoqibP-cAzGw0A
My Social Media Visits
https://www.google.com/analytics/web/template?uid=lrmmK5eUQ4eVL5DqzGzb6Q
AK: Non-Flirts, Potential Lovers
https://www.google.com/analytics/web/template?uid=ME1AXzOIRyiq1UPw_lHvVw
AK: Visits w/ 3, 4, 5, 10, 20, 20+ Words in Search Query
https://www.google.com/analytics/web/template?uid=aTPyE-d7SMqxz--QubsbZg
www.kaushik.net/avinash/actionable-web-analytics-custom-reports-advanced-segments
1-2 Word Searchers Traffic
https://www.google.com/analytics/web/template?uid=j13_pTFYQf-E3gpuLOzlYg
3 or more words in a search query
https://www.google.com/analytics/web/template?uid=XHGSfAk6TtqdLRh6H3Ijiw
Source post unknown:
AK: Visits via Search Queries w/ more than 4 words
https://www.google.com/analytics/web/template?uid=K70EUc_QRW24JAvOngXErw
AK: Visits via search queries w/ 4 words.
https://www.google.com/analytics/web/template?uid=DJ4jshiBTQq0Ab76UJYLew

WordPress Blog Dashboard

www.kaushik.net/avinash/amazing-bar-charts-content-consumption-share-of-search
Dashboard
https://www.google.com/analytics/web/template?uid=PWVDHCrqTUCrMqg2zxH3FA