
adam infinitum

analytics, optimization, marketing and a dash of web development

Referral Spammer Hitlist (.htaccess directives included)

Referral spam has been problematic for a long time; a quick search turned up an extensive list of referral spammers from 2009 on Perishable Press.

This thread from the Piwik repo on GitHub is more current (started in 2014) and mentions many of the sites currently contaminating my analytics reports.

My Referral Spammer Hitlist

reports and chart in google analytics show referral spam in 2014

The screenshot above is from Google Analytics (the last 6 months) for a site whose analytics data I can access but whose codebase and server I cannot (otherwise I’d have remedied the situation already). My significant other’s sister is the creator/founder/owner/president of Megan Lee Designs and, after learning of my vocation, granted me access to her analytics.

While 2.5% of traffic may not seem especially significant, it amounted to 16% (yep, 1 in 6) of their referral traffic! 42 of 143 sources of referral traffic were actually this referral spammer garbage. (In the screenshot I included every website that was reported as having sent more than 1 visitor; there are another 35 domains (well, subdomains) that list a single visit. 34 of those are subdomains of semalt.com (grumble, grumble).)
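For the share of sources (a different measure from the 16% share of referral traffic), here is a quick sketch of the arithmetic in Python. It uses only the two counts quoted above; the variable names are mine.

```python
# Counts quoted in the post; the variable names are illustrative.
spam_sources = 42     # referral sources identified as spam
total_sources = 143   # all referral sources in the report

share_of_sources = spam_sources / total_sources
print(f"{share_of_sources:.1%} of referral sources were spam")
```

So a bit under a third of the domains in the referral report were spammers, even though they account for a smaller slice of the visits.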

When I first really noticed the problem in my analytics reports (in mid-2014) I came up with a blacklist. I called it a hitlist because, as I told my colleague Charlie, “No workplace is complete without a hitlist” ;-)

In the spirit of open-source, I give you my current list of directives:

<IfModule mod_rewrite.c>
RewriteEngine on
# Block referral spammers
# Added September 2014
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} kambasoft\.com [NC,OR]
RewriteCond %{HTTP_REFERER} darodar\.com [NC,OR]
RewriteCond %{HTTP_REFERER} savetubevideo\.com [NC,OR]
RewriteCond %{HTTP_REFERER} descargar-musica-gratis\.net [NC,OR]
RewriteCond %{HTTP_REFERER} baixar-musicas-gratis\.com [NC,OR]
# Added in December 2014
RewriteCond %{HTTP_REFERER} 7makemoneyonline\.com [NC,OR]
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]
RewriteRule .* - [F,L]
</IfModule>

Be forewarned, I’m no back-end developer or SysAdmin, so while I can piece together enough RegEx to get simple things done, it is entirely possible that the directives from the WordPress thread further down are more efficient (and they are certainly more comprehensive).

My logic when I wrote them was simple:

  • I don’t care about protocols (http:// or https://)
  • I don’t care about subdomains (they often use lots of them)
  • I don’t care about case (hence the [NC] No Case [sensitivity] flag)

If the referrer URL contains that series of characters (that string) anywhere, I want the RewriteRule to kill the request before it hits my site (and my analytics), hence [F,L]: the ‘forbidden’ flag (the server answers with a 403) and the ‘last’ flag (stop processing further rewrite rules).
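To see what that logic does to a few referrers, here is a minimal sketch in Python. It is an approximation, not Apache itself: Python’s re module stands in for mod_rewrite’s regex engine, re.IGNORECASE stands in for [NC], and the sample referrer values and the function name is_spam_referrer are made up for illustration.

```python
import re

# Patterns copied from the RewriteCond lines above.
PATTERNS = [
    r"semalt\.com",
    r"kambasoft\.com",
    r"darodar\.com",
    r"savetubevideo\.com",
    r"descargar-musica-gratis\.net",
    r"baixar-musicas-gratis\.com",
    r"7makemoneyonline\.com",
    r"buttons-for-website\.com",
]
# re.IGNORECASE plays the role of the [NC] flag.
SPAM_RE = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def is_spam_referrer(referrer: str) -> bool:
    """True if any pattern occurs anywhere in the referrer (unanchored search)."""
    return any(rx.search(referrer) for rx in SPAM_RE)


# Hypothetical referrer values, for illustration only.
for ref in [
    "http://semalt.com/",                 # plain http
    "https://12.SEMALT.com/crawler.php",  # https, subdomain, upper case
    "https://www.google.com/",            # legitimate referrer
    "https://example.org/?q=semalt.com",  # string appears in the query
]:
    print(ref, is_spam_referrer(ref))
```

The last sample shows the trade-off of not anchoring the pattern: any referrer that merely contains the string is treated as spam, which I can live with for a list like this.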

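For anyone who might be of the copy and paste skill level (I myself was until fairly recently), be mindful of the ‘OR’ part of those directives [NC,OR]. An [OR] is needed on all but the last RewriteCond. Conditions without it are combined with AND, so omitting it on prior conditions means the rule only fires when a single referrer matches all of those domains at once (in other words, never), and adding it to the last condition can make the rule fire on requests it should not. A typo in the directives themselves, on the other hand, will likely cause a 500-series error on your server.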
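One way to avoid misplacing an [OR] is to generate the block from a plain list of domains. A minimal sketch in Python, assuming the domains contain only letters, digits, hyphens and dots; the function name build_htaccess is hypothetical and the output mirrors the style of my directives above.

```python
def build_htaccess(domains):
    """Build a mod_rewrite block that forbids requests referred by the given domains."""
    if not domains:
        raise ValueError("at least one domain is required")
    lines = ["<IfModule mod_rewrite.c>", "RewriteEngine on"]
    last = len(domains) - 1
    for i, domain in enumerate(domains):
        # Escape the dots so they match a literal "." rather than any character.
        pattern = domain.replace(".", r"\.")
        # Every condition except the last one carries the OR flag.
        flags = "[NC]" if i == last else "[NC,OR]"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {pattern} {flags}")
    lines.append("RewriteRule .* - [F,L]")
    lines.append("</IfModule>")
    return "\n".join(lines)


print(build_htaccess(["semalt.com", "buttons-for-website.com"]))
```

Adding a new spammer then means appending one domain to the list and pasting the regenerated block, rather than editing flags by hand.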

A few months ago I addressed the most flagrant perpetrators—namely semalt.com—but suddenly I was getting visits from buttons-for-website.com.

Other Resources for Blocking Referral Spam

I recently tweeted about buttons-for-website.com and got a reply from @hbeckner who wrote a nice article (in German, but it translated well and the illustrations are in English so I had no problems) about it and the motivation behind it. He pointed me to this thread on the WordPress support forums.

In that thread someone shares the .htaccess directives they use to block these spammers:

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://.*backgroundpictures\.net/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*embedle\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*extener\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*fbfreegifts\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*feedouble\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*feedouble\.net/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*joinandplay\.me/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*joingames\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*kambasoft\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*musicprojectfoundation\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*myprintscreen\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*openfrost\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*openmediasoft\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*savetubevideo\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*semalt\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*softomix\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*softomix\.net/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*softomix\.ru/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*soundfrost\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*srecorder\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*vapmedia\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*videofrost\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*videofrost\.net/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*youtubedownload\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*zazagames\.org/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*buttons\-for\-website\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*darodar\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*7makemoneyonline\.com/ [NC]
RewriteRule ^(.*)$ - [F,L]
</IfModule>

That list is a bit more extensive than mine and I also find it interesting to see how people write the directives in different ways.
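The two styles do not behave identically. Here is a rough comparison in Python for a single domain, again with Python’s re module standing in for Apache’s regex engine and re.IGNORECASE for [NC]; the referrer values are hypothetical.

```python
import re

# One domain written in each of the two styles shown above.
unanchored = re.compile(r"semalt\.com", re.IGNORECASE)
anchored = re.compile(r"^http://.*semalt\.com/", re.IGNORECASE)

# Hypothetical referrer values, for illustration only.
referrers = [
    "http://semalt.com/",
    "http://12.semalt.com/crawler.php",
    "https://semalt.com/",
    "http://semalt.com",
]

# Map each referrer to (matched by unanchored, matched by anchored).
results = {
    ref: (bool(unanchored.search(ref)), bool(anchored.search(ref)))
    for ref in referrers
}
for ref, (short_style, thread_style) in results.items():
    print(f"{ref:35} unanchored={short_style} anchored={thread_style}")
```

In this sketch the anchored style misses the https referrer and the one without a trailing slash, while the unanchored style catches all four. The anchored style is stricter about where the domain appears, so which one is preferable depends on how much you trust the spammers to stay on plain http.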

I’ll try to keep this post up-to-date for my own use and yours if so inclined.

GTM & AdWords Calls From Site Conversion Tutorial

Tonight I came across a really nice tutorial by Red Fly Marketing on setting up Google AdWords call conversion tracking on a website using Google Tag Manager.

If you end up coming across this first, go read it.

You may notice I left a comment about CDATA comments. I found that blog post so useful that I am writing this and sharing it before they have replied or I have had a chance to test the code. I have every reason to think it will work but again, I have not tested it.

If you’re using Google AdWords call conversion tracking on your website and using Google Tag Manager (or considering implementing it) I think you’ll find it useful.

Google Doodle for Kandinsky, Why Not?

On the surface, it seems the Independent is questioning today’s Google Doodle celebrating Wassily Kandinsky.
news headline questioning why a google doodle celebrated wassily kandinsky

Kandinsky is Awesome!

Working in search, I’m adept at ignoring clickbait; I couldn’t ignore this.

He is one of my favorite artists. There is no reason to question his being honored with a Google Doodle; he is one of the most talented and notable artists I know of.

The Independent’s Clickbait Title

Upon reading the article, it seems they are not questioning his Google Doodle. The article itself (which I am not linking to due to not wanting to support such tactics) never even mentions anything which seems to indicate it should be questioned and is a tepid and impotent summary. The article is so weak, I wonder if it is programmatically generated.

What’s So Great About Kandinsky?

I’m glad you asked. Allow me to share some of my favorites: (the first painting is the work featured in the Google Doodle)

Wassily Kandinsky painting Composition VIII which is geometric and colorful
Wassily Kandinsky painting featuring a black background with lighter colored circles scattered about the canvas

Wassily Kandinsky Painting

Yeah, that’s what’s so great about Wassily Kandinsky.

XML is a gateway drug to proprietary file formats (HA!)

I am sometimes easily, albeit fleetingly, entertained.

Especially when I have spent 20 of the last 24 hours trying to hack together enough Python and XPath to get a Scrapy web crawler to be more co-operative than I was as a surly 13-year-old.

In my teenage-like angst I came across a blog post about XML parsing libraries and trying to use them with HTML.

When it comes to the web I tend to read for information, so I take special delight in authors who can convey useful (to me at least) technical information clearly with just enough flair to make it fun.

Simon Timms succeeded when he coined this line:

…You don’t want to get involved in XML anyway—it is a gateway drug to proprietary file formats.

Whether you find it funny or not, it earned him an organic inbound link. And I’ll go out of my way to read his stuff in the future; it’s just the right mix of technically sophisticated commentary, information and irreverence.

My Favorite File Renaming Tool

If you have ever worked with images, especially in bulk as is often needed for SEO, you know that cameras name images with meaningless hashes.

Granted they are generally sequential, so that makes sense within the context of the camera but file names like IMG_63826151.JPEG are meaningless to anyone else in any other context.

I mean, imagine your parents are standing next to you as you search for a certain photograph that you know is on your computer. That’s no time to discover that meaningless hashes force you to review the file to gauge its contents.

No Siree, I’d rather have that photo named naughty-picture-of-my-girlfriend.jpg; then I can be sure not to preview that one when Mom’s in the room.

name changer app options interface

Photo Naming Conventions

I like image file names to be all lower case, with words separated by dashes, descriptively named and with a 3 letter extension (i.e. .jpg not .JPEG).
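For the script-inclined, that convention can be sketched in a few lines of Python. This is only an illustration of the naming rule, not how any particular app does it; the function name normalize_image_name, the extension mapping and the sample description are all made up.

```python
import re
from pathlib import PurePath

# Hypothetical mapping from longer extensions to 3-letter ones.
SHORT_EXT = {".jpeg": ".jpg", ".tiff": ".tif"}


def normalize_image_name(description: str, original: str) -> str:
    """Return a lower-case, dash-separated name with a 3-letter extension."""
    # Take the extension from the camera's file name and lower-case it.
    ext = PurePath(original).suffix.lower()
    ext = SHORT_EXT.get(ext, ext)
    # Lower-case the description, then turn each run of other characters into one dash.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{slug}{ext}"


print(normalize_image_name("Sunset Over the Lake", "IMG_63826151.JPEG"))
```

The description still has to come from a human who has looked at the photo; the script only takes care of the case, the dashes and the extension.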

The Name Changer App by MRR Software makes changes like that easy and quick across any number of files.

Need your photos appended with a number?

No problem, it handles RegEx and more with ease and yet still has a pleasant intuitive GUI.

I suppose there might be better utilities for this out there but this one is good enough that I quit looking when I found it.

Plus it’s free (one of these days I’ll donate).