Both style and national security are impacted by the use of passive voice, the NSA said today. Having spent many billions of taxpayer dollars to capture all private electronic communication, the agency is frustrated that poor writing habits are making this data difficult to analyze. “We strongly prefer short declarative sentences where the actor is clearly identified,” said an NSA spokesperson. “Instead of writing, ‘The protest will be attended by many activists,’ it would be better to write, ‘Known dissidents Amy Goodman, Laura Poitras, and Glenn Greenwald will travel by bus to the protest in Washington Square Park, New York, and will arrive at approximately 1:04 p.m. on April 1st, 2015.’” The NSA further suggested that instead of composing private email, citizens could fill out a webform at NSA.gov or travel to Bluffdale, Utah, and share all of their most private secrets with the NSA in person.
See the original email here: https://supporters.eff.org/civicrm/mailing/view
I know HTML well and I always author it with semantic elements like paragraphs. I don’t insert blank paragraphs between paragraphs for spacing, and why in the world would anyone wrap an image or a figure (both block elements) in a paragraph?
I recently came up with a partial solution that doesn’t require a plugin or editing functions.php.
It’s not perfect but it’s easy to use and it doesn’t require access to the codebase.
Without further ado I give you (drum roll please):
My Auto-Formatting Solution: class="wp"
Have a look at the code of this post.
You’ll notice that all the <p> elements have a class of wp like this: <p class="wp">.
I don’t use forced line breaks very often (<br>) but it works to preserve those as well.
Anyone who comes after me and works on a site where I have used this method, and isn’t aware of it, could easily be tricked, just like WordPress is, into thinking there’s some significance to this class; there isn’t.
As soon as a class is added, WordPress stops wantonly deleting my semantic HTML, presumably because the software is intelligent enough to realize that a class generally indicates an element no longer has only its default properties and must have additional properties or styling associated with it.
I don’t really care why it works, and I don’t really care why WordPress innately strips paragraphs (and then adds them programmatically when it renders the post‽).
What I care about is that I don’t have blank lines (i.e. new lines \n) in between lines of text that aren’t wrapped in an appropriate element like a <p>.
Other Solutions to Auto-Formatting
With varying degrees of success, I have used two plugins (my current favorite pictured above) to help prevent auto-formatting, and my current starter theme, this theme (open-source and freely available on GitHub), includes functions for some of this.
The problem is they are hit or miss.
I don’t know why; I’m not really a developer. I loathe error logs and debug modes.
On some sites, the <img> tags end up wrapped in perfect little <figure> tags and any captions show up wrapped in <figcaption> tags with no despicable wrapping paragraphs standing on every street corner proclaiming their right to be there.
On other sites it doesn’t work.
Same goes for the plugins, especially the first one I found, Don’t Muck My Markup.
No, I haven’t reported a bug because, as I said, it’s hit or miss. Sometimes it works like a charm and sometimes it fails miserably. Given the complexity of most sites I work on (think of all the variables: server software, PHP version, WordPress version, other plugins, performance enhancements [i.e. cache], proxies, human error), I don’t even try to figure it out. In my recent experience, it works correctly about 2/3 of the time.
PS Disable Auto-Formatting has been more reliable for me so it’s now my plugin of choice (it has taken the place of Don’t Muck My Markup in the script that automagically installs plugins for me).
I logged into Google Webmaster Tools (GWT) today and noticed something new: More granular reporting of traffic, all the way down to single clicks.
For anyone familiar with GWT, that’s quite a change because, at least for the last 3 years, it hasn’t reported anything less than 5 clicks/impressions.
Not that I felt it wouldn’t register anything unless you had at least 5 clicks/impressions; my gut feeling was that when it said 5 clicks it meant more than 0 but less than 6.
Why Did GWT Change Its Reporting of Clicks/Impressions?
It seems so long ago now that Google announced it would start obfuscating the vast majority of its keyword referral data: the (not provided) keyword mass extinction.
This thread from the Piwik repo on GitHub is more current (started in 2014) and mentions many of the sites currently contaminating my analytics reports.
My Referral Spammer Hitlist
The screenshot above is from Google Analytics (the last 6 months) for a site where I have access to the analytics data but not the codebase/server (otherwise I’d have remedied the situation already). My significant other’s sister is the creator/founder/owner/president of Megan Lee Designs and, after learning of my vocation, granted me access to her analytics.
While 2.5% of traffic may not seem especially significant, it amounted to 16% (yep, 1 in 6) of their referral traffic! 42 of 143 sources of referral traffic were actually this referral spammer garbage. In the screenshot I included every website that was reported as having sent more than 1 visitor; there are another 35 domains (well, subdomains) that list a single visit, and 34 of those are subdomains of semalt.com (grumble, grumble).
When I first really noticed the problem in my analytics reports (in mid-2014) I came up with a blacklist. I called it a hitlist because, as I told my colleague Charlie, “No workplace is complete without a hitlist” ;-)
In the spirit of open-source, I give you my current list of directives:
Be forewarned: I’m no back-end developer or SysAdmin, so while I can piece together enough RegEx to get simple things done, it is entirely possible that the directives written below could be more efficient (and certainly more comprehensive).
My logic when I wrote them was simple:
I don’t care about protocols (http:// or https://)
I don’t care about subdomains (they often use lots of them)
I don’t care about case (hence the [NC] No Case [sensitivity] flag)
If the domain listed as a referrer contains that series of characters (that string), I want the RewriteRule to kill the request before it hits my site (and my analytics), hence [F,L]: the ‘forbidden’ flag (the server answers with a 403) and the ‘last’ flag on the rewrite.
For anyone who might be at the copy-and-paste skill level (I myself was until fairly recently), be mindful of the ‘OR’ part of those directives, [NC,OR]. An [OR] is needed on all but the last RewriteCond; omitting it on an earlier condition or adding it to the last condition will break the rule and may cause a 500-series error on your server.
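To make the logic above concrete, here is a minimal sketch of what such directives can look like. It is an illustration and not my full list: it assumes Apache with mod_rewrite enabled and an .htaccess file at the site root, and it blocks only the two domains named in this post. Treat it as a starting point and test it on a non-production site first.

```apache
# Illustrative sketch only; assumes Apache with mod_rewrite available.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Each pattern is matched anywhere in the Referer header, so the
    # protocol (http:// or https://) and any subdomain are ignored.
    # [NC] makes the match case-insensitive; [OR] chains the conditions.
    RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
    # The last condition takes [NC] only, with no [OR].
    RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]

    # "^" matches any requested path and "-" means "do not rewrite the URL".
    # [F] answers with 403 Forbidden; [L] stops processing further rules.
    RewriteRule ^ - [F,L]
</IfModule>
```

To block another domain, add a new RewriteCond line above the last one and give it [NC,OR]. As far as I can tell from the Apache documentation, [F] already implies [L], so the L is there for clarity more than necessity.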
A few months ago I addressed the most flagrant perpetrators—namely semalt.com—but suddenly I was getting visits from buttons-for-website.com.