How to disagree on Twitter, machine-readable

With standard hyperlinks you can add a rel="crap nofollow" attribute to your A elements. But how do you tell search engine crawlers and other Web robots that you disagree with a link’s content when you post the URI on Twitter or elsewhere?

You cannot rely on the HTML presentation layer of social media sites. Most of them add a rel-nofollow condom to all UGC links, yet crawlers follow those links anyway. Nowadays crawlers grab tweets and their embedded links long before they bother to fetch the HTML pages. They fatten their indexes with content scraped from feeds, where no link attributes exist. That means indexers don’t (really) take the implicit disagreement into account.

As long as you operate your own URI shortener, there’s a solution.

Condomize URIs, not A elements

Here’s how to nofollow a plain link drop, where you’ve no control over link attributes like rel-nofollow:

  • Prerequisite: understanding the anatomy of a URI shortener.
  • Add an attribute like shortUri.suriNofollowed, boolean, default=false, to your shortened URIs database table. In the Web form where you create and edit short URIs, add a corresponding checkbox and update your affected scripts.
  • Make sure your search engine crawler detection is up-to-date.
  • Change the piece of code that redirects to the original URI:
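    // Search engine crawlers get a 403 for condomized URIs;
    // everybody else gets the usual permanent redirect.
    if ($isCrawler && $suriNofollowed) {
        header("HTTP/1.1 403 Forbidden redirect target", TRUE, 403);
        // Escape the target URI before echoing it into the error page.
        $safeUri = htmlspecialchars($suriUri, ENT_QUOTES);
        print "<html><head><title>This link is condomized!</title></head><body><p>Search engines are not allowed to follow this link: <code>$safeUri</code></p></body></html>";
    }
    else {
        header("HTTP/1.1 301 Here you go", TRUE, 301);
        header("Location: $suriUri");
    }
    exit;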
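
The crawler detection step is the fragile part, because a User-Agent string alone is trivially faked. Here is a minimal sketch of one common approach, forward-confirmed reverse DNS: look up the host name of the requesting IP, check that it belongs to a crawler domain, then resolve that host name again and make sure it leads back to the same IP. The function name isVerifiedCrawler(), the suffix list, and the two injectable lookup parameters are hypothetical and only serve the illustration; check each engine's documentation for the domains it really uses. gethostbynamel() covers IPv4 only.

```php
<?php
/**
 * Sketch: forward-confirmed reverse DNS check for a crawler IP.
 * Function name, suffix list and lookup parameters are illustrative
 * assumptions, not part of the shortener code shown above.
 */
function isVerifiedCrawler(
    string $ip,
    array $allowedSuffixes = ['.googlebot.com', '.google.com', '.search.msn.com'],
    ?callable $reverseLookup = null,
    ?callable $forwardLookup = null
): bool {
    // Default to PHP's built-in DNS functions; tests can inject fakes.
    $reverseLookup = $reverseLookup ?? 'gethostbyaddr';
    $forwardLookup = $forwardLookup ?? 'gethostbynamel';

    // Step 1: reverse lookup. gethostbyaddr() returns the IP unchanged
    // when the lookup fails, and false on malformed input.
    $host = $reverseLookup($ip);
    if (!is_string($host) || $host === '' || $host === $ip) {
        return false;
    }
    $host = strtolower(rtrim($host, '.'));

    // Step 2: the host name must end in one of the allowed crawler domains.
    $suffixMatches = false;
    foreach ($allowedSuffixes as $suffix) {
        if (substr($host, -strlen($suffix)) === $suffix) {
            $suffixMatches = true;
            break;
        }
    }
    if (!$suffixMatches) {
        return false;
    }

    // Step 3: the forward lookup of that host must lead back to the same IP.
    $addresses = $forwardLookup($host);
    return is_array($addresses) && in_array($ip, $addresses, true);
}
```

In the redirect script this could feed the flag used above, for example `$isCrawler = isVerifiedCrawler($_SERVER['REMOTE_ADDR']);`. DNS lookups are slow, so caching the verdict per IP is worth considering.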

Here’s an example: This shortened URI takes you to a Bing SEO tip. Search engine crawlers get bagged in a 403 link condom.

Since you can’t test it yourself (user agent spoofing doesn’t work), here’s a header reported by Googlebot (requesting the condomized URI above) today:


HTTP/1.1 403 Forbidden
Date: Thu, 07 Jan 2010 10:19:16 GMT
...
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html

The error page just says:
Title + H1: Link is nofollow'ed
P: Sorry, this shortened URI must not get followed by search engines.

If you can’t roll your own, feel free to make use of my URI Condomizer. Have fun condomizing crappy links on Twitter.

If you check “Nofollow” in the Condomizer form, your URI gets condomized. That means search engines can’t reach the target through the shortened URI, but users and other Web robots get redirected.
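
Spelled out, only one of the four possible combinations gets blocked. A tiny sketch, assuming the 403/301 behavior described in this post; statusFor() is a hypothetical helper for illustration, not the Condomizer's actual code:

```php
<?php
// Sketch only: statusFor() is a hypothetical helper. It returns the HTTP
// status that a request for a shortened URI would receive.
function statusFor(bool $nofollowed, bool $isCrawler): int
{
    // Only search engine crawlers requesting a condomized URI are blocked.
    return ($nofollowed && $isCrawler) ? 403 : 301;
}

// Walk through all four combinations of the two flags.
foreach ([false, true] as $nofollowed) {
    foreach ([false, true] as $isCrawler) {
        printf(
            "nofollow=%s crawler=%s -> %d\n",
            $nofollowed ? 'yes' : 'no',
            $isCrawler ? 'yes' : 'no',
            statusFor($nofollowed, $isCrawler)
        );
    }
}
```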




6 Comments to "How to disagree on Twitter, machine-readable"

  1. Everfluxx on 7 January, 2010

    $isCrawler? I smell cloaking. :D Why not redirect through a robots.txt’ed out location instead?

  2. Sebastian on 7 January, 2010

    Heh. $isCrawler is totally innocent WRT cloaking. Like in “if ($isCrawler) {logThisRequest();}”. ;-)
    Not trusting search engines is a good habit. I know redirect chains involving disallow’ed scripts work with Google, but I wouldn’t bet that the minor players get their semantics. Also, it’s way too complex, and slower because it adds a totally useless redirect to the chain. Serving users the requested location whilst blocking crawlers is clean, elegant, and 100% safe.

  3. Jane on 7 January, 2010

    I agree with Sebastian, I don’t really trust search engines.

  4. Inside the Webb on 10 January, 2010

    This is a great article, man! I just found your site through Sphinn and I really like your posts. Keep up the great content, I’ll be following.

  5. Max on 30 January, 2010

    Condomizer tool really rocks! I will try to condomize all my URIs ))))

    [It seems you didn’t understand it.]

  6. Moris on 20 December, 2010

    It’s a good term you used, “Condomizer”. You are doing good. Good luck to you.
