Nofollow still means don’t follow, and how to instruct Google to crawl nofollow’ed links nevertheless

Posted on 23 February, 2008

What was meant as a quick test of rel-nofollow once again (inspired by Michelle’s post stating that nofollow’ed comment author links result in rankings) turned out to yield some interesting observations:

Google uses sneaky JavaScript links (that mask nofollow’ed static links) for discovery crawling, and indexes the link destinations even though there’s no hard coded link to them on any page on the whole Web.
Google doesn’t crawl URIs found in nofollow’ed links only.
Google most probably doesn’t use anchor text outputted client sided in rankings for the page that carries the JavaScript link.
Google most probably doesn’t pass anchor text of JavaScript links to the link destination.
Google doesn’t pass anchor text of (hard coded) nofollow’ed links to the link destination.

As for my inspiration, I guess not all links in Michelle’s test were truly nofollow’ed. However, she’s spot on stating that condomized author links aren’t useless because they bring in traffic, and can result in clean links when a reader copies the URI from the comment author link and drops it elsewhere. Don’t pay too much attention to REL attributes when you spread your links.

As for my quick test explained below, please consider it an inspiration too. It’s not a full blown SEO test, because I’ve checked one single scenario for a short period of time. However, looking only at results from the first 24 hours after uploading the test makes it fairly certain that the test isn’t influenced by external noise, for example scraped links and such stuff.

On 2008-02-22 06:20:00 I put a new nofollow’ed link onto my sidebar: Zilchish Crap
<a href="http://sebastians-pamphlets.com/repstuff/something.php" id="repstuff-something-a" rel="nofollow"><span id="repstuff-something-b">Zilchish Crap</span></a>
<script type="text/javascript">
var handle = document.getElementById('repstuff-something-b');
handle.firstChild.data = 'Nillified, Nil';
handle = document.getElementById('repstuff-something-a');
handle.href = 'http://sebastians-pamphlets.com/repstuff/something.php?nil=js1';
handle.rel = 'dofollow';
</script>
(The JavaScript code changes the link’s HREF, REL and anchor text.)

The purpose of the JavaScript crap was to mask the anchor text, fool CSS that highlights nofollow’ed links (to avoid clean links to the test URI during the test), and to separate requests from crawlers and humans with different URIs.

Google crawls URIs extracted from somewhat sneaky JavaScript code

Less than half an hour later Googlebot requested the ?nil=js1 URI from the JavaScript code and totally ignored the hard coded URI in the A element’s HREF:
66.249.72.5 2008-02-22 06:47:07 200-OK Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) /repstuff/something.php?nil=js1

Roughly three hours after this visit Googlebot fetched a URI provided only in JS code on the test page:
var handle = document.getElementById('a1'); handle.href = 'http://sebastians-pamphlets.com/repstuff/something.php?nil=js2'; handle.rel = 'dofollow';
From the log: 66.249.72.5 2008-02-22 09:37:11 200-OK Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) /repstuff/something.php?nil=js2
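
If you want to redo this kind of log check yourself, here is a minimal sketch. It assumes the log layout of the two lines quoted above (IP, date, time, status, user agent, path); the field layout is inferred from those samples only, and the function names are made up for illustration. It tells the link variants apart by their nil parameter and computes how long after the link went live each request came in.

```javascript
// Sketch only: the log format is inferred from the two sample lines in the post.
// parseLogLine and minutesAfterLaunch are hypothetical helper names.

// Moment the sidebar link went live; log times are assumed to share its time zone.
const LINK_LIVE_AT = Date.parse("2008-02-22T06:20:00Z");

// ip, date, time, status code, status text, user agent (may contain spaces), path
const LOG_LINE = /^(\S+) (\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (\d{3})-(\S+) (.+) (\S+)$/;

function parseLogLine(line) {
  const m = LOG_LINE.exec(line.trim());
  if (m === null) return null; // not in the expected format
  const [, ip, date, time, status, , userAgent, path] = m;
  const url = new URL(path, "http://sebastians-pamphlets.com");
  return {
    ip,
    timestamp: Date.parse(`${date}T${time}Z`),
    status: Number(status),
    userAgent,
    path,
    // "js1", "js2", ... for the JavaScript variants, "static" for the hard coded URI
    variant: url.searchParams.get("nil") || "static",
  };
}

function minutesAfterLaunch(entry) {
  return Math.round((entry.timestamp - LINK_LIVE_AT) / 60000);
}

const sampleLines = [
  "66.249.72.5 2008-02-22 06:47:07 200-OK Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) /repstuff/something.php?nil=js1",
  "66.249.72.5 2008-02-22 09:37:11 200-OK Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) /repstuff/something.php?nil=js2",
];

const entries = sampleLines.map(parseLogLine);
for (const entry of entries) {
  console.log(`${entry.variant}: ${minutesAfterLaunch(entry)} minutes after launch`);
}
```

For the two quoted lines this reports the js1 request about 27 minutes and the js2 request about 197 minutes after the link was placed. A request for the URI without query string would show up as "static".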

So far Google ignored the hidden JavaScript link to /repstuff/something.php?nil=js3 on the test page. Its code doesn’t change a static link, so that makes sense in the context of repeated statements like “Google ignores JavaScript links / treats them like nofollow’ed links” by Google reps.

Of course the JS code above is easy to analyze, but don’t think that you can fool Google with concatenated strings, external JS files or encoded JavaScript statements!

Google indexes pages that have only JavaScript links pointing to them

The next day I checked the search index, and the results are interesting:

rel-nofollow-test search results

The first search result is the content of the URI with the query string parameter ?nil=js1, which is outputted with a JavaScript statement on my sidebar, masking the hard coded URI /repstuff/something.php without query string. There’s not a single real link to this URI elsewhere.

The second search result is a post URI where Google recognized the hard coded anchor text “zilchish crap”, but not the JS code that overwrites it with “Nillified, Nil”. With the SERP-URI parameter “&filter=0” Google shows more posts that are findable with the search term [zilchish]. (Hey Matt and Brian, here’s room for improvement!)

Google doesn’t pass anchor text of nofollow’ed links to the link destination

A search for [zilchish site:sebastians-pamphlets.com] doesn’t show the test page, which doesn’t carry this term. In other words, so far the anchor text “zilchish crap” of the nofollow’ed sidebar link hasn’t impacted the test page’s rankings.

Google doesn’t treat anchor text of JavaScript links as textual content

A search for [nillified site:sebastians-pamphlets.com] doesn’t show any URIs that have “Nillified, Nil” as client sided anchor text on the sidebar, just the test page:

rel-nofollow-test search results

Results, conclusions, speculation

This test wasn’t intended to evaluate whether JS outputted anchor text gets passed to the link destination or not. Unfortunately “nil” and “nillified” appear both in the JS anchor text and on the test page itself, so that’s for another post. However, it seems the JS anchor text isn’t indexed for the pages carrying the JS code, at least they don’t appear in search results for the JS anchor text, so most likely it will not be assigned to the link destination’s relevancy for “nil” or “nillified” either.

Maybe Google’s algos dealing with client sided outputs need more than 24 hours to assign JS anchor text to link destinations; time will tell if nobody ruins my experiment with links, and that includes unavoidable scraping and its sometimes undetectable links that Google knows but never shows.

However, Google can assign static anchor text pretty fast (less than 24 hours after link discovery), so I’m quite confident that condomized links still pass neither reputation nor topical relevance. My test page is unfindable for the nofollow’ed [zilchish crap]. If that changes later on, that will be the result of other factors, for example scraped pages that link without condom.

How to safely strip a link condom

And what’s the actual “news”? Well, say you’ve links that you must condomize because they’re paid or whatever, but you want Google to discover the link destinations nevertheless. To accomplish that, just output a nofollow’ed link server sided, and change it to a clean link with JavaScript. Google told us for ages that JS links don’t count, so that’s perfectly in line with Google’s guidelines. And if you keep your anchor text as well as URI, title text and such identical, you don’t cloak with deceitful intent. Other search engines might even pass reputation and relevance based on the client sided version of the link. Isn’t that neat?
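
To make the “nofollow’ed server sided, clean client sided” idea concrete, here is a rough sketch of a server side helper that emits such a link. The function names, the id check and the example URI are assumptions for illustration, not code from this blog. URI and anchor text stay identical in both versions; only the REL value gets flipped. Note that “dofollow” has no special meaning as a REL value, it simply isn’t “nofollow”.

```javascript
// Sketch only: escapeHtml and renderMaskedLink are hypothetical helpers.

function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Returns a static nofollow'ed link followed by a script that flips its REL.
function renderMaskedLink({ id, href, text }) {
  // Restricting ids to plain identifiers keeps them safe to embed in HTML and JS.
  if (!/^[A-Za-z][A-Za-z0-9_-]*$/.test(id)) {
    throw new Error("id must be a plain identifier: " + id);
  }
  const anchor =
    `<a href="${escapeHtml(href)}" id="${id}" rel="nofollow">` +
    `${escapeHtml(text)}</a>`;
  const script =
    `<script type="text/javascript">` +
    `document.getElementById("${id}").rel = "dofollow";` +
    `</script>`;
  return anchor + script;
}

console.log(
  renderMaskedLink({
    id: "paid-link-1",
    href: "http://example.com/landing?src=ad&pos=1",
    text: "Example sponsor",
  })
);
```

The script is emitted right after the link so that the element already exists when the code runs, the same ordering constraint that applies to any routine that rewrites links placed above it.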

Link condoms with juicy taste faking good karma

Of course you can use the JS trick without SEO in mind too. E.g. to prettify your condomized ads and paid links. If a visitor uses CSS to highlight nofollow, they look plain ugly otherwise.

Here is how you can do this for a complete Web page. This link is nofollow’ed. The JavaScript code below changed its REL value to “dofollow”. When you put this code at the bottom of your pages, it will un-condomize all your nofollow’ed links.
<script type="text/javascript">
// Flip every link whose REL value contains "nofollow" (in any letter case) to "dofollow".
if (document.getElementsByTagName) {
  var aElements = document.getElementsByTagName("a");
  for (var i = 0; i < aElements.length; i++) {
    var relvalue = aElements[i].rel.toUpperCase();
    // match() returns null when the pattern is absent, so compare against null itself.
    if (relvalue.match("NOFOLLOW") !== null) {
      aElements[i].rel = "dofollow";
    }
  }
}
</script>

(You’ll find still condomized links on this page. That’s because the JavaScript routine above changes only links placed above it.)
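
A variation, in case your links carry more than one REL value (say “external nofollow”): the sketch below removes only the nofollow token and keeps the rest, instead of overwriting the whole attribute. That’s a different technique than the routine above, and the function names are made up. It works on anything with a rel property, so it can be tried outside a browser too.

```javascript
// Sketch only: a token-wise variant, not the routine used on this page.

function relTokens(rel) {
  return String(rel || "").split(/\s+/).filter(Boolean);
}

function hasNofollow(rel) {
  return relTokens(rel).some((token) => token.toLowerCase() === "nofollow");
}

function withoutNofollow(rel) {
  return relTokens(rel)
    .filter((token) => token.toLowerCase() !== "nofollow")
    .join(" ");
}

// Strips the nofollow token from every link-like object; returns how many were changed.
function uncondomize(links) {
  let changed = 0;
  for (const link of links) {
    if (hasNofollow(link.rel)) {
      link.rel = withoutNofollow(link.rel);
      changed += 1;
    }
  }
  return changed;
}

// Plain objects stand in for A elements here. In a browser you could pass
// Array.from(document.getElementsByTagName("a")) instead.
const sampleLinks = [
  { rel: "nofollow" },
  { rel: "external NoFollow" },
  { rel: "external" },
  { rel: "" },
];
console.log(uncondomize(sampleLinks), sampleLinks.map((link) => link.rel));
```

With the four sample links, two get changed: the plain nofollow link ends up with an empty REL value, and “external NoFollow” becomes “external”.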

When you add JavaScript routines like that to your pages, you’ll increase their page loading time. IOW you slow them down. Also, you should add a note to your linking policy to avoid confused advertisers who chase toolbar PageRank.

Updates: Obviously Google distrusts me, how come? Four days after the link discovery the search quality archangel requested the nofollow’ed URI -without query string- possibly to check whether I serve different stuff to bots and people. As if I’d cloak, laughable. (Or an assclown linked the URI without condom.)
Day five: Google’s crawler requested the URI from the totally hidden JavaScript link at the bottom of the test page. Did I hear Google reps stating quite often they aren’t interested in client-sided links at all?


Sebastian | Paid Links, Testing, Anchor Text, Cloaking, Google, SEO, Nofollow | Related posts

19 Comments to "Nofollow still means don't follow, and how to instruct Google to crawl nofollow'ed links nevertheless"

Michael VanDeMar on 23 February, 2008 #link

Sebastian,

What’s really kewl is that there is a child test that you could spin off of this very easily… see how strict Googlebot’s js interpreter is, ie. construct some code that works in IE or Safari, but bombs in FF, see if Gbot still interprets it correctly.
Btw, as of now it looks like Y! is very aware of the link, but is choosing not to follow it (at least, has not cached it at all):

http://tinyurl.com/2vu69p
Miguel on 23 February, 2008 #link

So what I am pulling out of this is that nofollow links are in fact not passing authority at all like they are supposed to. Am I right or am I confused? I have been struggling with this idea as I am starting to think that nofollow links are passing link juice. I was recently told by a certain link broker selling a PR 9 link on statcounter.com for an ungodly amount of $$ that even though the links were nofollowed that they were still passing authority.

I am very interested to hear your thoughts on this. I plan on grilling this question into many people’s heads at SMX West this week too.
Sebastian on 23 February, 2008 #link

Michael, that would be a nice test, and I bet that Google figures out even code that bombs FF (I don’t buy that they use (only) the Mozilla engine). Google not only interprets JS to look at the rendered stuff AFAIK, they’ve their own JS analysis running too.

As for Yahoo, they’ve recognized even the JS links in Threadwatch’s sidebar while Nick was testing the del.icio.us “threadwatch” tag in 2005 or so, just to use a somewhat prominent example, and that behavior wasn’t new back then. Yahoo just indexed the URI, but didn’t crawl it yet (see the crawler log on the test page), maybe they never do although from my experience Slurp visits URIs from condomized links at least when they count a lot of them (in this case they might count the many links on the sidebars as one ROS link or so).

BTW I wouldn’t be surprised to find the nofollow’ed link, without QS parameters, in my GWC link stats. Knowing of a link and showing it in reverse citation results doesn’t mean those links count for anything.
Sebastian on 23 February, 2008 #link

Miguel, nofollow’ed links never passed reputation, at least not at Google. They were never supposed to pass link juice, but in the beginning all engines used them for their discovery crawling. The same goes for Yahoo and MSN/LiveSearch. Ask doesn’t support rel-nofollow because they think their Teoma algos are good enough so that they don’t need the publisher’s help. Google stopped crawling condomized links with the “BigDaddy” change of their infrastructure.

The sole reputation you can gain from condomized links is when someone picks the link and drops it at a place that doesn’t nofollow it. Feel free to grill your link broker, and please invite us to the BBQ.
Miguel on 23 February, 2008 #link

Awesome, you have just validated my stance on the subject. And I already roasted her, sorry. I immediately responded back with an “are you crazy” type of email. But I appreciate the analogy!

I do see Google Webmaster tools picking up nofollowed links in their report along with most other link analysis tools. So that had always bewildered me.

Something else that has always mystified me is the idea of real page rank. Everyone talks about it and how the toolbar is obviously showing “fake” page rank. Well I would LOVE to know how one can determine “real” page rank. Do you know how? There has to be some sort of way to isolate this.
Michael VanDeMar on 23 February, 2008 #link

Sebastian… you should change the behavior of the js real quick, and have it modify the anchor text to something not currently used on the landing page, some other fictional or extremely low occurrence phrase.
Sebastian on 24 February, 2008 #link

Miguel, to “isolate real PageRank” check these offerings. Seriously, real PageRank changes by the minute and Google doesn’t leak it out. Neither toolbar PR nor directory sliders show the real thing, and the same goes for the PageRank info in GWC, although that’s probably more recent and maybe more accurate.
Sebastian on 24 February, 2008 #link

Michael, I seldom change experiments, that’s somewhat semi-optimal. This one was to show that Google doesn’t crawl condomized links and doesn’t pass juice from them. The observations with regard to JS code changing static links, as well as whether or not such outputs pass anchor text, are enough stuff for a dedicated test scenario. I highly doubt that client sided stuff passes PageRank, so I probably won’t test that.
Tim Nash on 24 February, 2008 #link

This sort of contradicts a lot of real world results, for example the whole methodology in the recent link Bomb for clueless numpties was based on using nofollows. So not only did Google rank a company for a (very uncompetitive) term under text found only in anchor text, all the links with that anchor text were nofollowed.
Sebastian on 24 February, 2008 #link

Tim, provided all your links were truly condomized and nobody has put a clean link at places not under your control, that’s astonishing at first sight.

Think outside the box. What if your tiny link grenade worked although none of your links passed PageRank and anchor text? Google found a couple somewhat trusted pages talking about “clueless numpties” and “websearchpr dot com”. The pages on Websearchpr are not indexable and the URI’s linkpop is pretty weak, hence Google needs other signals to figure out what the heck these assclowns do (perhaps they’ve even checked the signals provided by you guys against the crawled but not indexed contents of websearchpr dot com). Like with sterile low-life directory links that some folks suspect to pass imaginary juice, or UGC mentions in sterilized SM environments, this could lead to the SERP you’ve linked. Maybe it’s the textual content, including condomized anchor text, not the nofollow’ed links that created Websearchpr’s relevance for “clueless numpties”?
Tim Nash on 24 February, 2008 #link

First of all, as this was out in the real world I cannot guarantee a splog didn’t pick up one of the linkers’ pages and remove the nofollow. In fact it’s for this reason all these tests rarely work 100%.

However following your idea I did a search for clueless Google Bomb. If you were right and it wasn’t picking direct anchor text we should see the page in the SERPs for other really uncompetitive terms.
Nada. I’m all for another explanation and will continue to test various ideas. But thought I would throw the hand grenade into the works.
p.s thanks for correcting the previous link
Miguel on 24 February, 2008 #link

Sebastian, thanks for this test and your comments. It is always refreshing to see such technical people playing in the world of organic SEO, testing things and using methodologies to prove certain points.

I have to say that the idea of nofollow links and whether or not they are in fact passing authority is probably one of the hottest topics in our field right now. I hope that it gets a lot of attention at SMX West. And if you are going to be there it would be great to get a chance to meet you.
Sebastian on 24 February, 2008 #link

Thanks Miguel, unfortunately I can’t make it to SMX West.
Sebastian on 24 February, 2008 #link

Ok Tim, let’s discuss What If Nofollow * thingies.

When Google spots a condomized link, the algo tries to figure out the publisher’s intention:
- The publisher is too lazy to moderate user generated links.
- The publisher has no clue that some or all links are condomized.
- The publisher wholeheartedly disagrees with the link destination’s content, or hates its publisher.
- The publisher fell for search engine FUD and nofollow’s everything except google.com and the W3C.
- The link destination is crap.
- What else?

Next the algo looks at the publisher’s authority, editorial competence as well as trust gathered so far, and checks whether the link is somewhat related and how relevant the publisher’s textual contents are for the link destination.

Now if the link destination as well as the source page are not too dubious, none of the two is red flagged for spammy practices or has other negative signals like assumed payment for the hypervote or so on file, the algo starts to distrust the necessity of a link condom.

Of course Matt told the algo that a condomized link must not pass PageRank nor anchor text, but it’s smart. It thinks that even when it doesn’t factor in the PageRank the link could carry, nor the net power of its anchor text, it has spotted a valuable relevancy signal, so why not use it accordingly? Thus the algo scrubs words and phrases which it considers relevant to the link destination from the source page and assigns them along with a scoring to the link destination. It logs a short “bah humbug” note gibing Google’s public rel-nofollow definition and doubles the relevancy score for the phrase the publisher has put in the condomized anchor text.

The algo grins like a Cheshire cat. No harm done to the official nofollow procedure, because neither PageRank nor anchor text were passed (e.g. the link destination’s cached page copy will not show a hint like “the search term appears only in links to this page”), but mission outsmart nofollow for the sake of our search quality accomplished.

Sounds at least somewhat plausible. What do you guys think?
Search Engine Land: News About Search Engines & Search Marketing on 25 February, 2008 #link

SearchCap: The Day In Search, February 25, 2008…

Below is what happened in search today, as reported on Search Engine Land and from other places across the web…….
I Call Them Gonzo SEO’s for a Reason | Gonzo SEO on 5 March, 2008 #link

[…] Sebastian. Sebastian’s Pamphlets, in my opinion, is one of the top resources for technical SEO information. With each topic he goes into an in-depth analysis of why something works the way it does. The best part about his blog posts: he tests everything and shows you the results! Instead of just saying that he tried something, he shows you what he did. It is a great blog to learn advanced SEO tips and tricks from. My favorite post on his blog: Nofollow Still Means Don’t Follow, and How to Instruct Google to Crawl Nofollow’ed Links Neverth… […]
Sunday SEO / Search Supplemental Week 14 « SEO Company | North South Media on 6 April, 2008 #link

[…] I use SEO for Firefox to highlight nofollowed links but as Sebastian showed us all you can even fool plugins and CSS with a cool piece of Javascript (I know Sebastian’s post was not put up in the last week but it is useful to know that these […]
seo company on 31 July, 2008 #link

Nice analysis, I was searching for something like this since a long time and at last.. got it here at your blog.. really appreciate your analysis and the javascript code you have provided..

if (document.getElementsByTagName) {
var aElements = document.getElementsByTagName(”a”);
for (var i=0; i

I’m gonna check it with one of my site.. Thanks a lot dude..
keep up the good work..

Sebastian’s Pamphlets