Referral Spammer Hitlist (.htaccess directives included)

Referral spam has been problematic for a long time; a quick search turned up an extensive list of referral spammers from 2009 on Perishable Press.

This thread from the Piwik repo on GitHub is more current (started in 2014) and mentions many of the sites currently contaminating my analytics reports.

My Referral Spammer Hitlist

Screenshot from Google Analytics Showing Referral Spam

The screenshot above is from Google Analytics (the last 6 months) for a site where I have access to the analytics data but not to the codebase/server (otherwise I’d have remedied the situation already). My significant other’s sister is the creator/founder/owner/president of Megan Lee Designs and, after learning of my vocation, granted me access to her analytics.

While 2.5% of traffic may not seem especially significant, it amounted to 16% (yep, 1 in 6) of their referral traffic! 42 of 143 sources of referral traffic were actually this referral spammer garbage. (In the screenshot I included every website that was reported as having sent more than 1 visitor; there are another 35 domains (well, subdomains) that list a single visit, and 34 of those are subdomains of semalt.com. Grumble, grumble.)

When I first really noticed the problem in my analytics reports (in mid-2014) I came up with a blacklist. I called it a hitlist because, as I told my colleague Charlie, “No workplace is complete without a hitlist” ;-)

In the spirit of open-source, I give you my current list of directives:

Be forewarned, I’m no back-end developer or SysAdmin, so while I can piece together enough RegEx to get simple things done, it is entirely possible that my directives could be written more efficiently (and they could certainly be more comprehensive).

My logic when I wrote them was simple:

  • I don’t care about protocols (http:// or https://)
  • I don’t care about subdomains (they often use lots of them)
  • I don’t care about case (hence the [NC] No Case [sensitivity] flag)

If the domain listed as a referrer contains that series of characters (that string), I want the RewriteRule to kill it before it hits my site (and my analytics), hence [F,L]: the ‘forbidden’ flag (which returns a 403) and the ‘last’ flag on the rewrite.

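For anyone who might be of the copy and paste skill level (I myself was until fairly recently), be mindful of the ‘OR’ part of those directives, [NC,OR]. An [OR] is needed on all but the last RewriteCond. Omitting it on an earlier condition means that condition gets ANDed with the next one, so the rule will almost never match; adding it to the last condition can make the rule fire on requests it shouldn’t. A typo in the flags themselves will likely cause a 500-series error on your server.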
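
As a rough illustration of the logic and flags described above, here is a minimal sketch of what such a block might look like. It is not the full hitlist: it uses only the two domains named in this post, and it assumes mod_rewrite is enabled and that your host allows rewrite directives in .htaccess.

```apache
# Sketch only: the two domains below are just the ones named in this post.
# Assumes mod_rewrite is available and permitted in .htaccess.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Unanchored match against the Referer header, so the protocol
    # and any subdomain are ignored. [NC] makes the match
    # case-insensitive; [OR] joins this condition to the next one.
    RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]

    # Last condition in the chain: [NC] only, no [OR].
    RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]

    # "-" means no substitution; [F] answers 403 Forbidden and
    # [L] stops processing further rewrite rules.
    RewriteRule ^ - [F,L]
</IfModule>
```

Because the patterns are unanchored, they match anywhere in the referrer string, which is what makes the protocol and subdomain irrelevant. To add another domain, insert a new RewriteCond with [NC,OR] above the last condition.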

A few months ago I addressed the most flagrant perpetrators—namely semalt.com—but suddenly I was getting visits from buttons-for-website.com.

Other Resources for Blocking Referral Spam

I recently tweeted about buttons-for-website.com and got a reply from @hbeckner who wrote a nice article (in German, but it translated well and the illustrations are in English so I had no problems) about it and the motivation behind it. He pointed me to this thread on the WordPress support forums.

In that thread someone shares the .htaccess directives they use to block these spammers:

That list is a bit more extensive than mine and I also find it interesting to see how people write the directives in different ways.

I’ll try to keep this post up-to-date for my own use and yours if so inclined.

GTM & AdWords Calls From Site Conversion Tutorial

Tonight I came across a really nice tutorial by Red Fly Marketing on setting up Google AdWords call conversion tracking on a website using Google Tag Manager.

If you end up coming across this first, go read it. You may notice I left a comment about CDATA comments. I found that blog post so useful that I am writing this and sharing it before they have replied or I have had a chance to test the code. I have every reason to think it will work but again, I have not tested it. If you’re using Google AdWords call conversion tracking on your website and using Google Tag Manager (or considering implementing it) I think you’ll find it useful.

Google Doodle for Kandinsky, Why Not?

On the surface, it seems the Independent is questioning today’s Google Doodle celebrating Wassily Kandinsky.

Google Search Results Page with result from The Independent questioning Wassily Kandinsky's Doodle

Kandinsky is Awesome!

Working in search, I’m adept at ignoring clickbait; I couldn’t ignore this. He is one of my favorite artists. There is no reason to question his being honored with a Google Doodle; he is one of the most talented and notable artists I know of.

The Independent’s Clickbait Title

Upon reading the article, it seems they are not questioning his Google Doodle. The article itself (which I am not linking to because I don’t want to support such tactics) never mentions anything to suggest the Doodle should be questioned; it is just a tepid and impotent summary. The article is so weak, I wonder if it was programmatically generated.

What’s So Great About Kandinsky?

I’m glad you asked. Allow me to share some of my favorites (the first painting is the work featured in the Google Doodle):

Wassily Kandinsky's Composition VIII

Wassily Kandinsky painting featuring a black background with lighter colored circles scattered about the canvas

Wassily Kandinsky Painting

Wassily Kandinsky Painting

Yeah, that’s what’s so great about Wassily Kandinsky.