<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.2.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>Sebastian's Pamphlets &#187; Paid Links</title>
	<link>http://sebastians-pamphlets.com</link>
	<description>If you've read my articles somewhere on the Internet, expect something different here.</description>
	<pubDate>Mon, 30 Jun 2008 20:12:40 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.3</generator>
	<language>en</language>
			<item>
		<title>You can&#8217;t escape from Google-Jail when &#8230;</title>
		<link>http://sebastians-pamphlets.com/stuck-in-google-jail/</link>
		<comments>http://sebastians-pamphlets.com/stuck-in-google-jail/#comments</comments>
		<pubDate>Wed, 27 Feb 2008 11:50:18 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Reciprocal Links]]></category>

		<category><![CDATA[Webspam]]></category>

		<category><![CDATA[Risky Linkage]]></category>

		<category><![CDATA[Spam Report]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/stuck-in-google-jail/</guid>
		<description><![CDATA[&#8230; you&#8217;ve boosted your business Web site&#8217;s rankings with shitloads of crappy links. The 11th SEO commandment: Don&#8217;t promote your white hat sites with black hat link building methods! It may work for a while, but once you find your butt in Google-jail, there&#8217;s no way out. Not even a reconsideration request can help because [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://sebastians-pamphlets.com/img/posts/spammers-stuck-in-google-jail.png" width="200" height="301" align="right" style="margin-left:4px;" alt="spammers stuck in google jail" title="Stuck in Google-Jail" />&#8230; you&#8217;ve boosted your business Web site&#8217;s rankings with shitloads of crappy links. <abbr title="Don't halloo till you're out of the wood!">The 11th SEO commandment</abbr>: <b>Don&#8217;t promote your white hat sites with black hat link building methods!</b> It may work for a while, but once you find your butt in Google-jail, there&#8217;s no way out. Not even a reconsideration request can help because you can&#8217;t provide its prerequisites.</p>
<p>When you&#8217;re caught eventually &#8211;penalized for tons of stinky links&#8211; and have to file a reinclusion request, Google wants you to remove all the shady links you&#8217;ve spread on the Web before they lift your penalty. Here is an <a href="http://groups.google.com/group/Google_Webmaster_Help-Indexing/browse_thread/thread/59e796b85327a4b2/">example</a>, well documented in a Google Groups thread started by a penalized site owner with official statements from <a href="http://mattcutts.com/blog/">Matt Cutts</a> and <a href="http://johnmu.com/">John Müller</a> from Google.</p>
<p>The <a id="a-farawayfurniture" href="http://www.farawayfurniture.co.uk/" rel="nofollow until unpenalized">site in question</a>, a small family business from the UK, has used more or less every tactic from a lazy link builder&#8217;s textbook to create 40,000+ inbound links. Sponsored WordPress themes, paid links, comment spam, artificial link exchanges and whatnot. <script type="text/javascript">handle=document.getElementById("a-farawayfurniture"); handle.rel="client-sided dofollow"; </script></p>
<p>Most sites that carry these links are in no way related to the penalized site, which deals with <em>modern teak garden furniture and home furniture sets</em>, for example porn galleries, Web designers, US city guides, obscure oriental blogs, job boards, or cat masturbation guides. (Don&#8217;t get me wrong. Of course not every link has to be topically related. Every link from a trusted page can pass PageRank, and can improve crawling, indexing, and so on.) </p>
<p>Google has absolutely no problem with unrelated links, unless a site&#8217;s link profile consists of way too many spammy and/or unrelated links. That does not mean that spreading a gazillion low-life links pointing to a competitor will get this site penalized or even banned. Negative SEO is not that simple. For an innocent site Google just ignores spammy inbound links, but most probably flags it for further investigations, both manually as well as algorithmically.</p>
<p>If on the other hand Google finds evidence that a site is actively involved in link monkey business of any kind, that&#8217;s a completely different story. Such evidence could be massively linking out to spammy places, hosting reciprocal links pages or FFA directories, unskillful (manual|automated) comment spam, signature links and mentions at places that trade links, textual contents made for (paid) link campaigns when reused too often, buying links from trackable services, (link request emails forwarded via) paid-link/spam reports, and so on. </p>
<p>Below is the &#8220;how to file a successful reconsideration request when your sins include link spam&#8221; from Googlers.</p>
<p><a href="http://groups.google.com/group/Google_Webmaster_Help-Indexing/msg/32db3e8e1fbf54e8">Matt Cutts</a>:</p>
<blockquote><p>The recommendation from your SEO guy led you directly into a pretty high-risk area; I doubt you really want pages like <a rel="nofollow" onclick="return false;" href="http://www.fuckingfilthy.com/filthy-hardcore/amateur-black-teen-sex-freak-is-a-closet-lesbian/"><script type="text/javascript">document.write("http:// www.fuckingfilthy.com/ filthy-hardcore/ amateur-black-teen-sex-freak-is-a-closet-lesbian/");</script> (<b>NSAW</b>)</a> having sponsored links to your furniture site anyway. It&#8217;s definitely possible to extricate your site, but I would make an effort to contact the sites with your sponsored links and request that they remove the links, and then do a reconsideration request. Maybe in the text of your reconsideration request, I&#8217;d include a pointer to this thread as well.</p>
</blockquote>
<p><a href="http://groups.google.com/group/Google_Webmaster_Help-Indexing/msg/6ac5fb93035e9735">John Müller</a>:</p>
<blockquote><p>You may want to consider what you can do to help clean up similar [=spammy] links on other people&#8217;s sites. Blogs and newspaper sites such as <a href="http://media.www.dailypennsylvanian.com/media/storage/paper882/news/2002/09/30/Opinion/Kelly.Lynch.And.Dennis.Tupper.A.Focus.On.Integrity-2156926.shtml">http://media.www.dailypennsylvanian.com</a> sometimes receive short comments such as &#8220;dont agree&#8221;, apparently only for a link back to a site. These comments often use keywords from that site instead of a user name, perhaps &#8220;tree bench&#8221; for a furniture site or &#8220;sexy shoes&#8221; for a footwear site. If this kind of behavior might have taken place for your site, you may want to work on rectifying it and include some information on it in your reconsideration request. Given your situation, the person considering your reconsideration request might be curious about links like that.</p>
</blockquote>
<p>Translation: <b>We&#8217;ll ignore your weekly reconsideration requests unless you&#8217;ve removed all artificial links pointing to your site</b>. You&#8217;re stuck in Google&#8217;s dungeon because they&#8217;ve thrown away the keys.</p>
<p>I&#8217;d guess that for a site that has filed a reinclusion request stating the site was involved in some sort of link monkey business, Google applies a more strict policy than with a site that was attacked by negative SEO methods. I highly doubt that when caught red-handed a lame excuse like &#8220;I didn&#8217;t create those links&#8221; is a tactic I could recommend, because Googlers hate it when an applicant lies in a reinclusion request.</p>
<p>Once caught and penalized, the &#8220;since when do inbound links count as negative votes&#8221; argument doesn&#8217;t apply. It&#8217;s quite clear that removing the traces (admitted as well as not admitted shady links) is a prerequisite for a penalty lift. And that even though Google has already discounted these links. That&#8217;s the same as with penalized doorway pages. Redirecting doorways to legit landing pages doesn&#8217;t count, Google wants to see a 410-Gone HTTP response code (or at least a 404) before they un-penalize a site.</p>
<p>I doubt that&#8217;s common knowledge to folks who promote their white hat sites with black hat methods. Getting links wiped out at places that didn&#8217;t check the intention of inserted links in the first place is a royal PITA, in other words, it&#8217;s impossible to get all shady links removed once you find your butt in Google-jail. That&#8217;s extremely uncomfortable for site owners who fell for questionable forum advice or hired a promotional service (no, I don&#8217;t call such assclowns SEOs) applying shady marketing methods without a clear and written warning that those are extremely risky, fully explained and signed by the client.</p>
<p>Maybe in some cases Google will un-penalize a great site although not all link spam was wiped out. However, the costs and efforts of preparing a successful reconsideration request are immense, not to speak of the massive loss of traffic and income.</p>
<p>As <a href="http://cartoonbarry.com/">Barry</a> mentioned, the thread linked above might be interesting for folks keen on an official confirmation that <a href="http://www.seroundtable.com/archives/016342.html">Google -60 penalties</a> exist. I&#8217;d say such SERP penalties (aka <a href="http://googlewebmastercentral.blogspot.com/2007/03/update-on-spam-reporting.html">red &amp; yellow cards</a>) aren&#8217;t exactly new, and it plays no role to which position a site penalized for guideline violations gets downranked. When I&#8217;ve lost a top spot for gaming Google, that&#8217;s kismet. I&#8217;m not interested in figuring out that 20k spammy links get me a -30 penalty, 40k shady links result in a -60 penalty, and 100k unnatural links qualify me for the famous -950 bashing (the numbers are made up of course). If I&#8217;d spam, then I&#8217;d just move on because I&#8217;d have already launched enough other projects to compensate the losses.</p>
<p>PS: While I was typing, Barry Schwartz posted his <a href="http://www.seroundtable.com/archives/016380.html">Google-Jail story at SE Roundtable</a>.</p>
<hr />Copyright &copy; 2008 <strong><a href="http://sebastians-pamphlets.com/">Sebastian`s Pamphlets</a></strong>. This Feed is for personal non-commercial use only. If you are not reading this material in your news aggregator/feed reader, the site you are looking at is guilty of copyright infringement and will be put down immediately. Please contact sebastians-pamphlets.com so we can take legal action immediately.<br /><span style="float: right;font-size: 7pt"><a href="http://blog.taragana.com/index.php/archive/wordpress-plugins-provided-by-taraganacom/">Plugin</a> by <a href="http://www.taragana.com/">Taragana</a></span>]]></content:encoded>
			<wfw:commentRss>http://sebastians-pamphlets.com/stuck-in-google-jail/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Nofollow still means don&#8217;t follow, and how to instruct Google to crawl nofollow&#8217;ed links nevertheless</title>
		<link>http://sebastians-pamphlets.com/how-to-dynamically-change-nofollow-to-dofollow/</link>
		<comments>http://sebastians-pamphlets.com/how-to-dynamically-change-nofollow-to-dofollow/#comments</comments>
		<pubDate>Sat, 23 Feb 2008 14:51:14 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[Testing]]></category>

		<category><![CDATA[Anchor Text]]></category>

		<category><![CDATA[Cloaking]]></category>

		<category><![CDATA[Google]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[Nofollow]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/how-to-dynamically-change-nofollow-to-dofollow/</guid>
		<description><![CDATA[What was meant as a quick test of rel-nofollow once again (inspired by Michelle&#8217;s post stating that nofollow&#8217;ed comment author links result in rankings), turned up some interesting observations:

Google uses sneaky JavaScript links (that mask nofollow&#8217;ed static links) for discovery crawling, and indexes the link destinations even though there&#8217;s no hard coded link on any [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://sebastians-pamphlets.com/img/posts/painting-nofollow-dofollow.png" width="250" height="220" align="right" alt="painting a nofollow'ed link dofollow" style="margin-left:4px;" title="How to paint a nofollow'ed link dofollow" />What was meant as a quick test of <a href="http://sebastians-pamphlets.com/links/categories/&amp;cat=nofollow">rel-nofollow</a> once again (inspired by <a href="http://www.michellemacphearson.com/do-nofollow-links-count-redux/">Michelle&#8217;s post</a> stating that nofollow&#8217;ed comment author links result in rankings), turned up some interesting observations:</p>
<ul>
<li>Google uses sneaky JavaScript links (that mask nofollow&#8217;ed static links) for discovery crawling, and indexes the link destinations even though there&#8217;s no hard coded link on any page on the whole Web.</li>
<li>Google doesn&#8217;t crawl URIs found only in nofollow&#8217;ed links.</li>
<li>Google most probably doesn&#8217;t use anchor text that is output client-side in rankings for the page that carries the JavaScript link.</li>
<li>Google most probably doesn&#8217;t pass anchor text of JavaScript links to the link destination.</li>
<li>Google doesn&#8217;t pass anchor text of (hard coded) nofollow&#8217;ed links to the link destination.</li>
</ul>
<p>As for my inspiration, I guess not all links in Michelle&#8217;s test were truly nofollow&#8217;ed. However, she&#8217;s spot on stating that condomized author links aren&#8217;t useless because they bring in traffic, and can result in clean links when a reader copies the URI from the comment author link and drops it elsewhere. Don&#8217;t pay too much attention to REL attributes when you spread your links.</p>
<p>As for my quick test explained below, please consider it an inspiration too. It&#8217;s not a full blown SEO test, because I&#8217;ve checked one single scenario for a short period of time. However, looking only at the results from the first 24 hours after uploading the test makes it fairly certain that they aren&#8217;t influenced by external noise, for example scraped links and such stuff.</p>
<p>On 2008-02-22 06:20:00 I put a new nofollow&#8217;ed link onto my sidebar: <a href="http://sebastians-pamphlets.com/repstuff/something.php" id="repstuff-something-2-a" rel="nofollow"><span id="repstuff-something-2-b">Zilchish Crap</span></a> <script type="text/javascript"> handle=document.getElementById("repstuff-something-2-b"); handle.firstChild.data="Nillified, Nil"; handle=document.getElementById("repstuff-something-2-a"); handle.href="http://sebastians-pamphlets.com/repstuff/something.php?nil=js1"; handle.rel="dofollow"; </script><code><small><br />
&lt;a href=&quot;http://sebastians-pamphlets.com/repstuff/something.php&quot; id=&quot;repstuff-something-a&quot; rel=&quot;nofollow&quot;&gt;&lt;span id=&quot;repstuff-something-b&quot;&gt;Zilchish Crap&lt;/span&gt;&lt;/a&gt;<br />
&lt;script type=&quot;text/javascript&quot;&gt;<br />
handle=document.getElementById(&#39;repstuff-something-b&#39;);<br />
handle.firstChild.data=&#39;Nillified, Nil&#39;;<br />
handle=document.getElementById(&#39;repstuff-something-a&#39;);<br />
handle.href=&#39;http://sebastians-pamphlets.com/repstuff/something.php?nil=js1&#39;;<br />
handle.rel=&#39;dofollow&#39;;<br />
&lt;/script&gt; </small></code><br />
(The JavaScript code changes the link&#8217;s HREF, REL and anchor text.)</p>
<p>The purpose of the JavaScript crap was to mask the anchor text, fool CSS that highlights nofollow&#8217;ed links (to avoid clean links to the test URI during the test), and to separate requests from crawlers and humans with different URIs.</p>
<h3>Google crawls URIs extracted from somewhat sneaky JavaScript code</h3>
<p>Less than half an hour later Googlebot requested the ?nil=js1 URI from the JavaScript code and totally ignored the hard coded URI in the A element&#8217;s HREF: <code><br />
66.249.72.5 	2008-02-22 06:47:07 	200-OK 	Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) 	/repstuff/something.php?nil=js1</code></p>
<p>Roughly three hours after this visit Googlebot fetched a URI provided only in JS code on the test page: <code><small><br />
handle=document.getElementById(&#39;a1&#39;);<br />
handle.href=&#39;http://sebastians-pamphlets.com/repstuff/something.php?nil=js2&#39;;<br />
handle.rel=&#39;dofollow&#39;; </small></code><br />
From the log: <code><br />
66.249.72.5 	2008-02-22 09:37:11 	200-OK 	Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) 	/repstuff/something.php?nil=js2</code></p>
<p>So far Google ignored the hidden JavaScript link to <code>/repstuff/something.php?nil=js3</code> on the test page. Its code doesn&#8217;t change a static link, so that makes sense in the context of repeated statements like &#8220;Google ignores JavaScript links / treats them like nofollow&#8217;ed links&#8221; by Google reps.</p>
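<p>If you want to repeat this kind of observation, here is a minimal JavaScript sketch for pulling Googlebot requests out of a log. It assumes log lines shaped like the excerpts above, with tab-separated fields in the order IP, timestamp, status, user agent, request URI. The function names <code>parseLogLine</code> and <code>googlebotUris</code> are made up for illustration, and the user agent test alone proves nothing about who really sent the request.</p>

```javascript
// Split one tab-separated log line into named fields.
// Field order is an assumption read off the log excerpts in this post.
function parseLogLine(line) {
    var fields = String(line).split("\t").map(function (field) {
        return field.trim();
    });
    return {
        ip: fields[0] || "",
        time: fields[1] || "",
        status: fields[2] || "",
        userAgent: fields[3] || "",
        uri: fields[4] || ""
    };
}

// Collect the URIs requested by clients whose user agent mentions Googlebot.
function googlebotUris(lines) {
    return lines.map(function (line) {
        return parseLogLine(line);
    }).filter(function (entry) {
        return entry.userAgent.indexOf("Googlebot") !== -1;
    }).map(function (entry) {
        return entry.uri;
    });
}

var exampleLines = [
    "66.249.72.5 \t2008-02-22 06:47:07 \t200-OK \tMozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) \t/repstuff/something.php?nil=js1"
];
console.log(googlebotUris(exampleLines));
```

<p>Giving each link variant its own query string, as done in this test, is what makes such a filter useful: the requested URI alone tells you which version of the link the crawler picked up.</p>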
<p class="excursus">Of course the JS code above is easy to analyze, but don&#8217;t think that you can fool Google with concatenated strings, external JS files or encoded JavaScript statements!</p>
<h3>Google indexes pages that have only JavaScript links pointing to them</h3>
<p>The next day I checked the search index, and the <a href="http://www.google.com/search?num=100&#038;hl=en&#038;safe=off&#038;q=zilchish%7Cnillyfiable+site%3Asebastians-pamphlets.com">results</a> are interesting:</p>
<p><img src="http://sebastians-pamphlets.com/img/google/nofollow-zilchish-nullifable-google-serp-24h.png" width="498" height="421" alt="rel-nofollow-test search results" title="Google indexes JS manipulated anchor text and content referenced only in JS links" /></p>
<p>The first search result is the content of the URI with the query string parameter <code>?nil=js1</code>, which is outputted with a JavaScript statement on my sidebar, masking the hard coded URI <code>/repstuff/something.php</code> without query string. There&#8217;s not a single real link to this URI elsewhere.</p>
<p>The second search result is a post URI where Google recognized the hard coded anchor text &#8220;zilchish crap&#8221;, but not the JS code that overwrites it with &#8220;Nillified, Nil&#8221;. With the SERP-URI parameter &#8220;&amp;filter=0&#8243; Google shows more posts that are findable with the search term [zilchish]. (Hey <a href="http://mattcutts.com/blog/">Matt</a> and <a href="http://brianwhite.org/">Brian</a>, here&#8217;s room for improvement!)</p>
<h3>Google doesn&#8217;t pass anchor text of nofollow&#8217;ed links to the link destination</h3>
<p>A search for [<a href="http://www.google.com/search?q=zilchish+site:sebastians-pamphlets.com&#038;num=100&#038;hl=en&#038;filter=0&#038;safe=off">zilchish site:sebastians-pamphlets.com</a>] doesn&#8217;t show the testpage that doesn&#8217;t carry this term. In other words, so far the anchor text &#8220;zilchish crap&#8221; of the nofollow&#8217;ed sidebar link didn&#8217;t impact the test page&#8217;s rankings yet. </p>
<h3>Google doesn&#8217;t treat anchor text of JavaScript links as textual content</h3>
<p>A search for [<a href="http://www.google.com/search?num=100&#038;hl=en&#038;safe=off&#038;q=nillified+site%3Asebastians-pamphlets.com">nillified site:sebastians-pamphlets.com</a>] doesn&#8217;t show any URIs that have &#8220;nil, nillified&#8221; as client sided anchor text on the sidebar, just the test page:</p>
<p><img src="http://sebastians-pamphlets.com/img/google/nofollow-nillified-google-serp-24h.png" width="498" height="277" alt="rel-nofollow-test search results" title="Google indexes content from JS manipulated URIs" /></p>
<h3>Results, conclusions, speculation</h3>
<p>This test wasn&#8217;t intended to evaluate whether JS outputted anchor text gets passed to the link destination or not. Unfortunately &#8220;nil&#8221; and &#8220;nillified&#8221; appear both in the JS anchor text as well as on the page, so that&#8217;s for another post. However, it seems the JS anchor text isn&#8217;t indexed for the pages carrying the JS code, at least they don&#8217;t appear in search results for the JS anchor text, so most likely it will not be assigned to the link destination&#8217;s relevancy for &#8220;nil&#8221; or &#8220;nillified&#8221; as well. </p>
<p>Maybe Google&#8217;s algos dealing with client sided outputs need more than 24 hours to assign JS anchor text to link destinations; time will tell if nobody ruins my experiment with links, and that includes unavoidable scraping and its sometimes undetectable links that Google knows but never shows. </p>
<p>However, Google can assign static anchor text pretty fast (within less than 24 hours after link discovery), so I&#8217;m quite confident that condomized links still don&#8217;t pass reputation, nor topical relevance. My test page is unfindable for the nofollow&#8217;ed [zilchish crap]. If that changes later on, that will be the result of other factors, for example scraped pages that link without condom.</p>
<h3>How to safely strip a <a href="http://link-condom.com/">link condom</a></h3>
<p><b>And what&#8217;s the actual &#8220;news&#8221;?</b> Well, say you&#8217;ve links that you must condomize because they&#8217;re paid or whatever, but you want Google to discover the link destinations nevertheless. To accomplish that, just output a nofollow&#8217;ed link server sided, and change it to a clean link with JavaScript. Google told us for ages that JS links don&#8217;t count, so that&#8217;s perfectly in line with Google&#8217;s guidelines. And if you keep your anchor text as well as URI, title text and such identical, you don&#8217;t cloak with deceitful intent. Other search engines might even pass reputation and relevance based on the client sided version of the link. Isn&#8217;t that neat?</p>
<h3>Link condoms <strike>with juicy taste</strike> faking good karma</h3>
<p>Of course you can use the JS trick without SEO in mind too. E.g. to prettify your condomized ads and paid links. If a visitor uses CSS to highlight nofollow, they <i style="border: medium dotted firebrick; color:navy; background:pink;">look plain ugly</i> otherwise.</p>
<p>Here is how you can do this for a complete Web page. <a href="http://example.com/" rel="nofollow example" title="Nofollow'ed and unclickable link example, use 'view source' to check it out" onclick="return false;">This link is nofollow&#8217;ed</a>. The JavaScript code below changed its REL value to &#8220;dofollow&#8221;. When you put this code <em>at the bottom of your pages</em>, it will un-condomize all your nofollow&#8217;ed links. <code><br />
&lt;script type=&quot;text/javascript&quot;&gt;<br />
    if (document.getElementsByTagName) {<br />
        var aElements = document.getElementsByTagName(&quot;a&quot;);<br />
        for (var i=0; i&lt;aElements.length; i++) {<br />
            var relvalue = aElements[i].rel.toUpperCase();<br />
            // match() returns null when nothing matches, so compare against null, not the string &quot;null&quot;<br />
            if (relvalue.match(&quot;NOFOLLOW&quot;) !== null) {<br />
                aElements[i].rel = &quot;dofollow&quot;;<br />
            }<br />
        }<br />
    }<br />
&lt;/script&gt;   </code></p>
<p><script type="text/javascript">
    if (document.getElementsByTagName) {
        var aelements = document.getElementsByTagName("a");
        for (var i=0; i<aelements.length; i++) {
            var relvalue = aelements[i].rel.toUpperCase();
            // match() returns null when nothing matches; compare against null, not the string "null"
            if (relvalue.match("NOFOLLOW") !== null) {
                aelements[i].rel = "dofollow";
            }
        }
    }
</script></p>
<p>(You&#8217;ll still find condomized links on this page. That&#8217;s because the JavaScript routine above changes only links placed above it.)</p>
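<p>The routine above overwrites the whole REL value, so a link marked <code>rel="nofollow example"</code> loses its second token too. A possible variation, sketched here under my own assumptions, removes only the nofollow token and leaves everything else alone. The helper name <code>stripNofollow</code> is hypothetical, and the sketch drops the token instead of writing &#8220;dofollow&#8221;, which isn&#8217;t a defined REL value anyway.</p>

```javascript
// Hypothetical helper: remove only the "nofollow" token from a REL value,
// keeping other tokens such as "example" or "external" intact.
function stripNofollow(rel) {
    var kept = [];
    String(rel).split(/\s+/).forEach(function (token) {
        // Skip empty strings caused by leading or trailing spaces.
        if (token === "") { return; }
        // Skip the nofollow token itself, whatever its letter case.
        if (token.toLowerCase() === "nofollow") { return; }
        kept.push(token);
    });
    return kept.join(" ");
}

// Browser-only part: apply the helper to every link found so far.
// Outside a browser (no document object) this block does nothing.
if (typeof document !== "undefined") {
    var links = document.getElementsByTagName("a");
    Array.prototype.forEach.call(links, function (link) {
        link.rel = stripNofollow(link.rel);
    });
}
```

<p>As with the original routine, only links placed above the script are touched, so it still belongs at the bottom of the page.</p>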
<p>When you add JavaScript routines like that to your pages, you&#8217;ll increase their page loading time. IOW you slow them down. Also, you should add a note to your <a href="http://sebastians-pamphlets.com/links/full-disclosure/">linking policy</a> to avoid confused advertisers who chase toolbar PageRank.</p>
<p><b>Updates:</b> Obviously Google distrusts me, how come? Four days after the link discovery the <abbr title="Googlebot coming from another IP">search quality archangel</abbr> requested the nofollow&#8217;ed URI &#8211;without query string&#8211; possibly to check whether I serve different stuff to bots and people. As if I&#8217;d cloak, laughable. (Or an assclown linked the URI without condom.)<br />
Day five: Google&#8217;s crawler requested the URI from the totally hidden JavaScript link at the bottom of the test page. Did I hear Google reps stating quite often they aren&#8217;t interested in client-sided links at all?</p>
]]></content:encoded>
			<wfw:commentRss>http://sebastians-pamphlets.com/how-to-dynamically-change-nofollow-to-dofollow/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Act out your sophisticated affiliate link paranoia</title>
		<link>http://sebastians-pamphlets.com/linking-guide-for-paranoid-affiliate-marketers/</link>
		<comments>http://sebastians-pamphlets.com/linking-guide-for-paranoid-affiliate-marketers/#comments</comments>
		<pubDate>Tue, 13 Nov 2007 07:09:30 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Search Quality]]></category>

		<category><![CDATA[Risky Linkage]]></category>

		<category><![CDATA[Web development]]></category>

		<category><![CDATA[X-Robots-Tag]]></category>

		<category><![CDATA[Redirects]]></category>

		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[Crawler Directives]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[Google]]></category>

		<category><![CDATA[robots.txt]]></category>

		<category><![CDATA[E-Commerce]]></category>

		<category><![CDATA[Cloaking]]></category>

		<category><![CDATA[Nofollow]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/linking-guide-for-paranoid-affiliate-marketers/</guid>
		<description><![CDATA[My recent posts on managing affiliate links and nofollow cloaking paid links led to so many reactions from my readers that I thought explaining possible protection levels could make sense. Google&#8217;s request to condomize affiliate links is a bit, well, thin when it comes to technical tips and tricks:
Links purchased for advertising should be designated [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://sebastians-pamphlets.com/img/posts/paranoid-affiliate-link.png" width="250" height="231" border="0" align="right" style="margin-left:4px;" alt="GOOD: paranoid affiliate link" title="Paranoid on affiliate links" />My recent posts on <a href="http://sebastians-pamphlets.com/google-recommends-screwing-affiliates-in-exchange-for-better-serp-positioning/">managing affiliate links</a> and <a href="http://sebastians-pamphlets.com/a-pragmatic-defense-against-googles-anti-paid-links-campaign/">nofollow cloaking</a> <a href="http://sebastians-pamphlets.com/text-link-broker-woes-smart-paid-links-sniffers-fromgoogle/">paid links</a> led to so many reactions from my readers that I thought explaining possible protection levels could make sense. <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66736">Google&#8217;s request to condomize affiliate links</a> is a bit, well, thin when it comes to technical tips and tricks:<br />
<blockquote>Links purchased for advertising should be designated as such. This can be done in several ways, such as:<br />
    * Adding a rel=&#8221;nofollow&#8221; attribute to the &lt;a&gt; tag<br />
    * Redirecting the links to an intermediate page that is blocked from search engines with a robots.txt file</p></blockquote>
<p> Also, Google doesn&#8217;t define <a href="http://sebastians-pamphlets.com/links/categories/?cat=paid-links">paid links</a> that clearly, so try this <a href="http://www.stonetemple.com/blog/?p=196">paid link definition</a> instead before you read on. <b>Here is my linking guide for the paranoid affiliate marketer.</b></p>
<p><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=76465">Google recommends hiding of any content provided by affiliate programs from their crawlers</a>. That means not only links and banner ads, so think about tactics to hide content pulled from a merchant&#8217;s data feed too. Linked graphics along with text links, testimonials and whatnot copied from an affiliate program&#8217;s sales tools page count as duplicate content (snippet) in its worst occurrence.</p>
<p>Pasting code copied from a merchant&#8217;s site into a page&#8217;s or template&#8217;s HTML is not exactly a smart way to put ads. Those ads are neither manageable nor trackable, and when anything must be changed, editing tons of files is a royal PITA. Even when you&#8217;re just running a few ads on your blog, a simple ad management script allows flexible administration of your adverts. </p>
<p>There are tons of such scripts out there, so I don&#8217;t post a complete solution, but just the code which saves your ass when a search engine hating your ads and paid links comes by. To keep it simple and stupid my code snippets are mostly taken from this blog, so when you&#8217;ve a WordPress blog you can adapt them with ease. </p>
<h3>Cover your ass with a linking policy</h3>
<p>Googlers as well as hired guns do review Web sites for violations of Google&#8217;s guidelines, and competitors might be in the mood to turn you in with a spam report or paid links report. A (prominently linked) <a href="http://sebastians-pamphlets.com/links/full-disclosure/">full disclosure of your linking attitude</a> can help to pass a human review by search engine staff. By the way, having a <a href="http://sebastians-pamphlets.com/about/policies/#commenting">policy for dofollowed blog comments</a> is also a good idea.</p>
<p>Since crawler directives like <a href="http://sebastians-pamphlets.com/links/categories/?cat=nofollow">link condoms</a> are for search engines (only), and those pay attention to your source code and hints addressing search engines like <a href="http://sebastians-pamphlets.com/links/categories/?cat=robotstxt">robots.txt</a>, you should leave a note <a href="http://sebastians-pamphlets.com/robots.txt" rel="nofollow nocontent">there</a> too, look into the source of this page for an example. <a onclick="showContent('sample-code-disclosure'); this.style.display = 'none'; return false;">View sample HTML comment.</a> <b id="sample-code-disclosure" style="display:none;">Sample HTML comment: <code>&lt;&#33;--</code>This site serves machine-readable disclosures, e.g. crawler directives like rel-nofollow applied to links with commercial intent, to Web robots only.<code>--&gt;</code></b> </p>
<h3>Block crawlers from your propaganda scripts</h3>
<p>Put all your stuff related to advertising (scripts, images, movies&#8230;) in a subdirectory and disallow search engine crawling in your <a href="http://www.smart-it-consulting.com/article.htm?node=140&#038;page=46">/robots.txt</a> file: <code><br />
User-agent: *<br />
Disallow: /propaganda/ </code><br />
Of course you&#8217;ll use an innocuous name like &#8220;gnisitrevda&#8221; for this folder, which lacks a default document and can&#8217;t get browsed because you&#8217;ve a <code><br />
Options -Indexes </code><br />
statement in your .htaccess file. (Watch out, Google knows what &#8220;gnisitrevda&#8221; means, so be creative or cryptic.)</p>
<p>Crawlers sent out by major search engines do respect robots.txt, hence regular spiders won&#8217;t fetch anything from this directory. As long as you don&#8217;t cheat too much, you&#8217;re not haunted by those legendary anti-webspam bots sneakily accessing your site via AOL proxies or Level3 IPs. A robots.txt block doesn&#8217;t protect you from search engine staff surfing your site, but I don&#8217;t tell you things you&#8217;d better hide from Matt&#8217;s gang.</p>
<h3>Detect search engine crawlers</h3>
<p>Basically there are three common methods to detect requests by search engine crawlers.</p>
<ol>
<li>Testing the user agent name (HTTP_USER_AGENT) for strings like &#8220;Googlebot&#8221;, &#8220;Slurp&#8221;, &#8220;MSNbot&#8221; or so which identify crawlers. That&#8217;s easy to spoof, for example <a href="http://sebastians-pamphlets.com/referrer-spoofing-with-prefbar-341/">PrefBar for FireFox</a> lets you choose from a list of user agents.</li>
<li>Checking the user agent name, and only when it indicates a crawler, verifying the requestor&#8217;s IP address with a reverse DNS lookup or against a cache of verified crawler IP addresses and host names.</li>
<li>Maintaining a list of all search engine crawler IP addresses known to man,  checking the requestor&#8217;s IP (REMOTE_ADDR) against this list. (That alone isn&#8217;t bullet-proof, but I&#8217;m not going to write a tutorial on industrial-strength <strike>cloaking</strike> IP delivery, I leave that to the real <a href="http://fantomaster.com/fantomNews">experts</a>.)</li>
</ol>
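<p>Method 3) boils down to a list lookup. Here is a minimal sketch, not the code running on this blog: the function names ipInCidr() and isListedCrawlerIp() as well as the $crawlerRanges list are made up for this example, the ranges are documentation placeholders (not real crawler IPs), and I assume PHP 5 or newer and IPv4 addresses. <code><br />
// TRUE if $ip lies within $cidr (e.g. &quot;192.0.2.0/24&quot;); a bare IP counts as /32<br />
function ipInCidr ($ip, $cidr) {<br />
    $parts  = explode(&quot;/&quot;, $cidr);<br />
    $bits   = isset($parts[1]) ? (int) $parts[1] : 32;<br />
    $ipLong = ip2long($ip);<br />
    $net    = ip2long($parts[0]);<br />
    if ($ipLong === FALSE || $net === FALSE) return FALSE;<br />
    if ($bits &lt; 0 || $bits &gt; 32) return FALSE;<br />
    if ($bits == 0) return TRUE; // a /0 range matches every address<br />
    $mask = -1 &lt;&lt; (32 - $bits);  // network bits set, host bits cleared<br />
    return (($ipLong &#038; $mask) == ($net &#038; $mask));<br />
}<br />
// TRUE if $ip matches any entry of $ranges<br />
function isListedCrawlerIp ($ip, $ranges) {<br />
    foreach ($ranges as $range) {<br />
        if (ipInCidr($ip, $range)) return TRUE;<br />
    }<br />
    return FALSE;<br />
}<br />
// placeholder ranges, replace them with ranges you have verified yourself<br />
$crawlerRanges = array(&quot;192.0.2.0/24&quot;, &quot;198.51.100.17&quot;);<br />
</code><br />
Calling isListedCrawlerIp() with the value of REMOTE_ADDR and $crawlerRanges then tells you whether the requestor is on your list. Keeping such a list complete and current is the hard part.</p>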
<p>For our purposes we use methods 1) and 2). When it comes to outputting ads or other paid links, checking the user agent is safe enough. Also, this allows your business partners to evaluate your linkage using a crawler as user agent name. Some affiliate programs won&#8217;t activate your account without testing your links. When crawlers try to follow affiliate links on the other hand, you need to verify their IP addresses for two reasons. First, you should be able to upsell spoofing users too. Second, if you allow crawlers to follow your affiliate links, this may have an impact on the merchants&#8217; search engine rankings, and that&#8217;s evil in Google&#8217;s eyes.  </p>
<p>We use two PHP functions to detect search engine crawlers. checkCrawlerUA() returns TRUE and sets an expected crawler host name, if the user agent name identifies a major search engine&#8217;s spider, or FALSE otherwise. checkCrawlerIP($string) verifies the requestor&#8217;s IP address and returns TRUE if the user agent is indeed a crawler, or FALSE otherwise. checkCrawlerIP() does a primitive caching in a flat file, so that once a crawler was verified on its very first content request, it can be detected from this cache to avoid pretty slow DNS lookups. The input parameter is any string which will make it into the log file. checkCrawlerIP() does not verify an IP address if the user agent string doesn&#8217;t match a crawler name. </p>
<p><b id="grab-php-code-check-crawler"><a onclick="showContent('php-code-check-crawler'); return false;">View</a>|<a onclick="hideContent('php-code-check-crawler'); return false;">hide</a> PHP code.</b> (If you&#8217;ve disabled JavaScript you can&#8217;t grab the PHP source code!)<br />
<code id="php-code-check-crawler" style="display:none;"><b><br />
// file system path to crawler IP log, scripts etc.,<br />
// without trailing slash:<br />
$includePath   = $_SERVER[&quot;DOCUMENT_ROOT&quot;] . &quot;/propaganda&quot;;<br />
// edit &quot;propaganda&quot; and CHMOD 777 the directory !<br />
// file names:<br />
$crawlerIps  = $includePath .&quot;/crawler-ip-addresses.txt&quot;;<br />
// misc. stuff:<br />
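// use straight double quotes here, curly quotes break the date() call<br />
$timestamp     = date(&quot;Y-m-d H:i:s&quot;);<br />
$ipAddy        = $_SERVER[&quot;REMOTE_ADDR&quot;];<br />
// referrer and user agent are optional request headers<br />
$referrer      = isset($_SERVER[&quot;HTTP_REFERER&quot;]) ? $_SERVER[&quot;HTTP_REFERER&quot;] : &quot;&quot;;<br />
$userAgent     = isset($_SERVER[&quot;HTTP_USER_AGENT&quot;]) ? $_SERVER[&quot;HTTP_USER_AGENT&quot;] : &quot;&quot;;<br />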
$requestUri    = $_SERVER[&quot;REQUEST_URI&quot;];<br />
$queryString   = $_SERVER[&quot;QUERY_STRING&quot;];<br />
$isCrawler     = FALSE;<br />
$crawlerServer = &quot;&quot;;<br />
$delimiter     = &quot;|&quot;;<br />
$idString      = &quot;&quot;;<br />
if (empty($includePath)) {<br />
   $includePath = $_SERVER[&quot;DOCUMENT_ROOT&quot;] . &quot;/propaganda&quot;; // CHMOD 777<br />
}<br />
// Write a file to disk<br />
if (!function_exists(&quot;writeLocalFile&quot;)) {<br />
function writeLocalFile ($file, $content) {<br />
   if (!is_writable($file)) {<br />
      $lok = @chmod ( $file, 0777 );<br />
   }<br />
   // file_put_contents() not avail in PHP 4.3x<br />
   $fp = @fopen(&quot;$file&quot;,&quot;w+&quot;);<br />
   if ($fp) {<br />
       $lOk = @fwrite($fp, $content, strlen($content));<br />
       @fclose($fp);<br />
       // make sure file may get overwritten or removed later on<br />
       $lok = @chmod ( $file, 0777 );<br />
       return TRUE;<br />
   } // endif $fp<br />
   return FALSE;<br />
} // end function writeLocalFile<br />
}<br />
if (!function_exists(&quot;checkCrawlerUA&quot;)) {<br />
function checkCrawlerUA () {<br />
    GLOBAL $userAgent;<br />
    GLOBAL $crawlerServer;<br />
    $crawlerServer = &quot;&quot;;<br />
    $crawlers  = array(&quot;Googlebot&quot;,&quot;Mediapartners&quot;,&quot;Slurp&quot;,&quot;MSNbot&quot;,&quot;Ask&quot;,&quot;Teoma&quot;);<br />
    foreach ($crawlers as $crawler) {<br />
        if (stristr($userAgent,$crawler)) {<br />
            if (stristr($crawler,&quot;Googlebot&quot;) ||<br />
                stristr($crawler,&quot;Mediapartners&quot;)) {<br />
                $crawlerServer = &quot;.googlebot.com&quot;;<br />
            } // Google<br />
            if (stristr($crawler,&quot;Slurp&quot;)) {<br />
                $crawlerServer = &quot;.crawl.yahoo.net&quot;;<br />
            } // Yahoo<br />
            if (stristr($crawler,&quot;MSNbot&quot;)) {<br />
                $crawlerServer = &quot;.search.live.com&quot;;<br />
            } // MSN/Live<br />
            if (stristr($crawler,&quot;Ask&quot;) ||<br />
                stristr($crawler,&quot;Teoma&quot;)) {<br />
                $crawlerServer = &quot;.ask.com&quot;;<br />
            } // Ask<br />
        }<br />
    } // foreach crawlers<br />
    if (!empty($crawlerServer)) return TRUE;<br />
    return FALSE;<br />
} // end function checkCrawlerUA<br />
}<br />
if (!function_exists(&quot;checkCrawlerIP&quot;)) {<br />
function checkCrawlerIP ($idString) {<br />
    GLOBAL $ipAddy;<br />
    GLOBAL $crawlerIps;<br />
    GLOBAL $delimiter;<br />
    GLOBAL $timestamp;<br />
    GLOBAL $userAgent;<br />
    GLOBAL $crawlerServer;<br />
    $isCrawler = checkCrawlerUA();<br />
    if ($isCrawler === FALSE)  return FALSE;<br />
    if (empty($crawlerServer)) return FALSE;<br />
//<br />
// DEBUG: $crawlerServer = &quot;.national-net.com&quot;;<br />
// Use your ISPs host name for testing with a spoofed user agent name<br />
//<br />
    $crawlerIpsContent = @file_get_contents($crawlerIps);<br />
    if (!empty($crawlerIpsContent)) {<br />
        if (stristr($crawlerIpsContent, &quot;\n$ipAddy$delimiter&quot;)) {<br />
            return TRUE;<br />
        }<br />
    }<br />
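    $crawlerHost = @gethostbyaddr($ipAddy);<br />
    // the host name must END with the expected crawler domain, a plain substring<br />
    // test would also accept hosts like crawl.googlebot.com.example.com<br />
    if (strtolower(substr($crawlerHost, -strlen($crawlerServer))) != strtolower($crawlerServer)) {<br />
        return FALSE;<br />
    }<br />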
    if (&quot;$crawlerHost&quot; == &quot;$ipAddy&quot;) {<br />
        return FALSE;<br />
    }<br />
    $ipAddyRev = @gethostbyname($crawlerHost);<br />
    if (&quot;$ipAddyRev&quot; != &quot;$ipAddy&quot;) {<br />
        return FALSE;<br />
    }<br />
    $crawlerIpsContent .= &quot;\n&quot; .$ipAddy .$delimiter<br />
                          .$timestamp   .$delimiter<br />
                          .$crawlerHost .$delimiter<br />
                          .$idString    .$delimiter<br />
                          .$userAgent   .$delimiter;<br />
    $lOk = writeLocalFile ($crawlerIps, $crawlerIpsContent);<br />
    return TRUE;<br />
} // end function checkCrawlerIP<br />
}<br />
</b></code><br />
Grab and implement the PHP source, then you can code statements like <code><br />
$isSpider = checkCrawlerUA ();<br />
...<br />
if ($isSpider) {<br />
    $relAttribute = &quot; rel=\&quot;nofollow\&quot; &quot;;<br />
}<br />
...<br />
$affLink = &quot;&lt;a href=\&quot;$affUrl\&quot; $relAttribute&gt;call for action&lt;/a&gt;&quot;;<br />
</code><br />
or <code><br />
$isSpider = checkCrawlerIP ($sponsorUrl);<br />
...<br />
if ($isSpider) {<br />
    // don't redirect to the sponsor, return a 403 or 410 instead<br />
}</code><br />
More on that later.</p>
<h3>Don&#8217;t deliver your advertising to search engine crawlers</h3>
<p>It&#8217;s possible to serve totally clean pages to crawlers, that is without any advertising, not even JavaScript ads like AdSense&#8217;s script calls. Whether you go that far or not depends on the grade of your paranoia. Suppressing ads on a (thin|sheer) affiliate site can make sense. Bear in mind that hiding all promotional links and related content can&#8217;t guarantee indexing, because Google doesn&#8217;t index shitloads of templated pages which hide duplicate content as well as ads from crawling, without carrying a single piece of somewhat compelling content.</p>
<p>Here is how you could output a totally uncrawlable banner ad: <code><br />
...<br />
$isSpider = checkCrawlerIP ($_SERVER[&quot;PHP_SELF&quot;]);<br />
...<br />
print &quot;&lt;div class=\&quot;css-class-sidebar robots-nocontent\&quot;&gt;&quot;;<br />
// output RSS buttons or so<br />
if (!$isSpider) {<br />
    print &quot;&lt;script type=\&quot;text/javascript\&quot; src=\&quot;http://sebastians-pamphlets.com/propaganda/output.js.php?adName=seobook&#038;adServed=banner\&quot;&gt;&lt;/script&gt;&quot;;<br />
    ...<br />
}<br />
...<br />
print &quot;&lt;/div&gt;\n&quot;;<br />
...</code><br />
Let&#8217;s look at the code above. First we detect crawlers &#8220;without doubt&#8221; (well, in some rare cases it can still happen that a suspected Yahoo crawler comes from a non-&#8217;.crawl.yahoo.net&#8217; host but another IP owned by Yahoo, Inktomi, Altavista or AllTheWeb/FAST, and I&#8217;ve seen similar reports of such misbehavior for other engines too, but that might have been employees surfing with a crawler-UA).</p>
<p>Currently the <em>robots-nocontent</em>&nbsp; class name in the DIV is not supported by Google, MSN and Ask, but it tells Yahoo that everything in this DIV shall not be used for ranking purposes. That doesn&#8217;t conflict with class names used with your CSS, because each X/HTML element can have an unlimited list of space delimited class names. Like Google&#8217;s section targeting that&#8217;s a <a href="http://sebastians-pamphlets.com/yahoo-search-going-to-torture-webmasters/">crappy crawler directive</a>, though. However, it doesn&#8217;t hurt to make use of this Yahoo feature with all sorts of screen real estate that is not relevant for search engine ranking algos, for example RSS links (use autodetect and pings to submit), &#8220;buy now&#8221;/&#8221;view basket&#8221; links or references to TOS pages and alike, templated text like terms of delivery (but not the street address provided for local search) &#8230; and of course ads.</p>
<p>Ads aren&#8217;t outputted when a crawler requests a page. Of course that&#8217;s cloaking, but unless the united search engine geeks come out with a standardized procedure to handle code and contents which aren&#8217;t relevant for indexing, that&#8217;s not <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66355">deceitful cloaking</a> in my opinion. Interestingly, in many cases cloaking is the last weapon in a webmaster&#8217;s arsenal that s/he can fire up to comply with search engine rules when everything else fails, because the crawlers behave more and more like browsers. </p>
<p>Delivering user specific contents in general is fine with the engines, for example geo targeting, profile/logout links, or buddy lists shown to registered users only and stuff like that, aren&#8217;t penalized. Since Web robots can&#8217;t pull out the plastic, there&#8217;s no reason to serve them ads just to waste bandwidth. In some cases search engines even require cloaking, for example to prevent their crawlers from fetching URLs with tracking variables and unavoidable duplicate content. (<a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35769">Example from Google</a>: &#8220;Allow search bots to crawl your sites without session IDs or arguments that track their path through the site&#8221; is a call for <a href="http://www.smart-it-consulting.com/article.htm?node=148&#038;page=103">search engine friendly URL cloaking</a>.) </p>
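<p>To illustrate the session ID case, here is a minimal sketch of how tracking variables could get stripped from a requested URI. It is not taken from this blog&#8217;s code: the function name stripTrackingParams() and the variable names in the sample call are made up, and I assume PHP 5 or newer because of http_build_query(). <code><br />
// remove the named query string variables from a path like /page.php?id=7&#038;sid=abc;<br />
// returns the URI unchanged when none of them is present<br />
function stripTrackingParams ($requestUri, $trackingParams) {<br />
    $parts = parse_url($requestUri);<br />
    if ($parts === FALSE || empty($parts[&quot;query&quot;])) return $requestUri;<br />
    parse_str($parts[&quot;query&quot;], $query);<br />
    $removed = FALSE;<br />
    foreach ($trackingParams as $name) {<br />
        if (isset($query[$name])) {<br />
            unset($query[$name]);<br />
            $removed = TRUE;<br />
        }<br />
    }<br />
    if (!$removed) return $requestUri;<br />
    $clean = isset($parts[&quot;path&quot;]) ? $parts[&quot;path&quot;] : &quot;/&quot;;<br />
    if (!empty($query)) $clean .= &quot;?&quot; . http_build_query($query);<br />
    return $clean;<br />
}<br />
</code><br />
When checkCrawlerIP() has verified a crawler and stripTrackingParams($requestUri, array(&quot;sid&quot;, &quot;PHPSESSID&quot;)) returns something different from $requestUri, a 301 redirect to the cleaned URI keeps the tracking variables out of the index, while human visitors keep their session.</p>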
<h3>Is hiding ads from crawlers &#8220;safe with Google&#8221; or not?</h3>
<p><img src="http://sebastians-pamphlets.com/img/posts/uncloaked-affiliate-link.png" width="200" height="188" border="0" align="right" style="margin-left:4px;" alt="BAD: uncloaked affiliate link" title="Uncloaked affiliate link" />Cloaking ads away is a double edged sword from a search engine&#8217;s perspective. Way too strictly interpreted that&#8217;s against the cloaking rule which states &#8220;don&#8217;t show crawlers other content than humans&#8221;, and search engines like to be aware of advertising in order to rank estimated user experiences algorithmically. On the other hand they provide us with mechanisms (Google&#8217;s section targeting or Yahoo&#8217;s robots-nocontent class name) to disable such page areas for ranking purposes, and they code their own ads in a way that crawlers don&#8217;t count them as on-the-page contents.</p>
<p>Although Google says that AdSense text link ads are content too, they ignore their textual contents in ranking algos. Actually, their crawlers and indexers don&#8217;t render them, they just notice the number of script calls and their placement (at least if above the fold) to identify <acronym title="Made For AdSense/Advertising">MFA</acronym> pages. In general, they ignore ads as well as other content outputted with client sided scripts or hybrid technologies like AJAX, at least when it comes to rankings. </p>
<p>Since in theory the contents of JavaScript ads aren&#8217;t considered food for rankings, cloaking them completely away (suppressing the JS code when a crawler fetches the page) can&#8217;t be wrong. Of course these script calls as well as on-page JS code are ranking factors. Google possibly counts ads, maybe even calculates ratios like screen size used for advertising etc. vs. space used for content presentation to determine whether a particular page provides a good surfing experience for their users or not, but they can&#8217;t argue seriously that hiding such tiny signals &#8211;which they use for the sole purposes of possible downranks&#8211; is against their guidelines.</p>
<p>For ages search engine reps used to encourage webmasters to obfuscate all sorts of stuff they want to hide from crawlers, like commercial links or redundant snippets, by linking/outputting with JavaScript instead of crawlable X/HTML code. Just because their crawlers evolve, that doesn&#8217;t mean that they can take back this advice. All this JS stuff is out there, on gazillions of sites, often on pages which will never be edited again.</p>
<p><b>Dear search engines, if it does not count, then you cannot demand to keep it crawlable.</b> Well, a few super mega white hat <acronym title="Dougie ...">trolls</acronym> might disagree, and depending on the implementation on individual sites maybe hiding ads isn&#8217;t totally riskless in any case, so decide yourself. I just cloak machine-readable disclosures because crawler directives are not for humans, but don&#8217;t try to hide the fact that I run ads on this blog.</p>
<p>Usually I don&#8217;t argue with fair vs. unfair, because we talk about <strike>war</strike> business here, which means that anything goes. However, Google does everything to talk the whole Internet into <strike>obfuscating</strike> disclosing ads with link condoms of any kind, and they take a lot of flak for such campaigns, hence I doubt they would cry foul today when webmasters hide both client sided as well as server sided delivery of advertising from their crawlers. Penalizing for delivery of sheer contents would be unfair. <img src='http://sebastians-pamphlets.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> (Of course that&#8217;s stuff for a great debate. If Google decides that hiding ads from spiders is evil, they will react and won&#8217;t care about bad press. So please don&#8217;t take my opinion as professional advice. I might change my mind tomorrow, because actually I can imagine why Google might raise their eyebrows over such statements.)</p>
<h3>Outputting ads with JavaScript, preferably in iFrames</h3>
<p>Delivering adverts with JavaScript does not mean that one can&#8217;t use server sided scripting to adjust them dynamically. With content management systems it&#8217;s not always possible to use PHP or so. In WordPress for example, PHP is executable in templates, posts and pages (requires a plugin), but not in sidebar widgets. A piece of JavaScript on the other hand works (nearly) everywhere, as long as it doesn&#8217;t come with single quotes (WordPress escapes them for storage in its MySQL database, and then fails to output them properly, that is single quotes are converted to fancy symbols which break eval&#8217;ing the PHP code).</p>
<p>Let&#8217;s see how that works. Here is a banner ad created with a PHP script and delivered via JavaScript:<br />
<script type="text/javascript" src="http://sebastians-pamphlets.com/ads/output.js.php?adName=seobook&#038;adServed=banner"></script><br />
And here is the JS call of the PHP script: <code><br />
&lt;script type=&quot;text/javascript&quot; src=&quot;http://sebastians-pamphlets.com/propaganda/output.js.php?adName=seobook&#038;adServed=banner&quot;&gt;&lt;/script&gt;</code></p>
<p>The PHP script <code>/propaganda/output.js.php</code> evaluates the query string to pull the requested ad&#8217;s components. In case it&#8217;s expired (e.g. promotions of conferences, affiliate program went belly up or so) it looks for an alternative (there are tons of neat ways to deliver different ads dependent on the requestor&#8217;s location and whatnot, but that&#8217;s not the point here, hence the lack of more examples). Then it checks whether the requestor is a crawler. If the user agent indicates a spider, it adds rel=nofollow to the ad&#8217;s links. Once the HTML code is ready, it outputs a JavaScript statement: <code><br />
document.write(&#39;&lt;a href=&quot;http://sebastians-pamphlets.com/propaganda/router.php?adName=seobook&#038;adServed=banner&quot; title=&quot;DOWNLOAD THE BOOK ON SEO!&quot;&gt;&lt;img src=&quot;http://sebastians-pamphlets.com/propaganda/seobook/468-60.gif&quot; width=&quot;468&quot; height=&quot;60&quot; border=&quot;0&quot; alt=&quot;The only current book on SEO&quot; title=&quot;The only current book on SEO&quot; /&gt;&lt;/a&gt;&#39;); </code> which the browser executes within the <code>script</code> tags (replace single quotes in the HTML code with double quotes). A static ad for surfers using ancient browsers goes into the noscript tag. </p>
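<p>The output step described above can be sketched like this. It is a simplified stand-in, not the real output.js.php: the function names buildAdHtml() and jsDocumentWrite() are made up for this example, and json_encode() requires PHP 5.2 or newer. <code><br />
// build the HTML of a banner ad; crawlers get the link condom, humans do not<br />
function buildAdHtml ($href, $imgSrc, $altText, $isSpider) {<br />
    $rel = $isSpider ? &quot; rel=\&quot;nofollow\&quot;&quot; : &quot;&quot;;<br />
    return &quot;&lt;a href=\&quot;&quot; . htmlspecialchars($href) . &quot;\&quot;&quot; . $rel . &quot;&gt;&quot;<br />
         . &quot;&lt;img src=\&quot;&quot; . htmlspecialchars($imgSrc) . &quot;\&quot; alt=\&quot;&quot;<br />
         . htmlspecialchars($altText) . &quot;\&quot; border=\&quot;0\&quot; /&gt;&lt;/a&gt;&quot;;<br />
}<br />
// wrap the HTML in a document.write() call; json_encode() produces a JavaScript<br />
// string literal and escapes quotes, backslashes, line breaks and slashes<br />
function jsDocumentWrite ($html) {<br />
    return &quot;document.write(&quot; . json_encode($html) . &quot;);&quot;;<br />
}<br />
</code><br />
The script would send a <code>Content-Type: text/javascript</code> header, call buildAdHtml() with the result of checkCrawlerUA() as last argument, and print what jsDocumentWrite() returns. Letting json_encode() do the quoting avoids the single quote trouble mentioned above.</p>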
<p>Matt Cutts <a href="http://www.stonetemple.com/articles/interview-matt-cutts.shtml">said</a> that <a href="http://www.mattcutts.com/blog/bot-obedience-herding-googlebot/#comment-45561">JavaScript links don&#8217;t prevent Googlebot from crawling</a>, but that <a href="http://www.seomoz.org/blog/the-paid-links-debate-rages-on-ses-san-jose-2007">those links</a> <a href="http://www.mattcutts.com/blog/how-to-report-paid-links/#comment-101482">don&#8217;t count for rankings</a> (not long ago I read a more recent quote from Matt where he stated that this is future-proof, but I can&#8217;t find the link right now). We know that Google can interpret internal and external JavaScript code, as long as it&#8217;s fetchable by crawlers, so I wouldn&#8217;t say that delivering advertising with client sided technologies like JavaScript or Flash is a bullet-proof procedure to hide ads from Google, and the same goes for other major engines. That&#8217;s why I use rel-nofollow &#8211;on crawler requests&#8211; even in JS ads.</p>
<p>Change your user agent name to Googlebot or so, install <a href="http://www.mattcutts.com/blog/seeing-nofollow-links/">Matt&#8217;s show nofollow hack</a> or something similar, and you&#8217;ll see that the affiliate-URL gets nofollow&#8217;ed for crawlers. The dotted border in firebrick is extremely ugly, detecting condomized links this way is pretty popular, and I want to serve nice looking pages, thus I really can&#8217;t offend my readers with nofollow&#8217;ed links (although I don&#8217;t care about crawler spoofing, actually that&#8217;s a good procedure to let advertisers check out my linking attitude).</p>
<p>We&#8217;ll look at the affiliate URL from the code above later on. First, let&#8217;s discuss other ways to make ads more search engine friendly. Search engines don&#8217;t count pages displayed in iFrames as on-page contents, especially not when the iFrame&#8217;s content is hosted on another domain. Here is an example straight from the horse&#8217;s mouth: <code><br />
&lt;iframe name=&quot;google_ads_frame&quot; src=&quot;http://pagead2.googlesyndication.com/pagead/ads?very-long-and-ugly-query-string&quot; marginwidth=&quot;0&quot; marginheight=&quot;0&quot; vspace=&quot;0&quot; hspace=&quot;0&quot; allowtransparency=&quot;true&quot; frameborder=&quot;0&quot; height=&quot;90&quot; scrolling=&quot;no&quot; width=&quot;728&quot;&gt;&lt;/iframe&gt;</code> In a noframes tag we could put a static ad for surfers using browsers which don&#8217;t support frames/iFrames. </p>
<p>If for some reasons you don&#8217;t want to detect crawlers, or it makes sound sense to hide ads from other Web robots too, you could encode your JavaScript ads. This way you deliver totally and utterly useless gibberish to anybody, and just browsers requesting a page will render the ads. Example: any sort of text or html block that you would like to encrypt and hide from snoops, scrapers, parasites, or bots, can be run through Michael&#8217;s <a href="http://www.bad-neighborhood.com/htmlhashing.htm">Full Text/HTML Obfuscator Tool</a> (hat tip to <a href="http://www.seo-scoop.com/2007/09/13/new-tool-to-hide-stuff/">Donna</a>).</p>
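<p>I don&#8217;t know how the tool linked above works internally, so take the following as a much simpler substitute which only shows the principle: plain percent-encoding, decoded by the browser. The function name obfuscateForJs() is made up, the sketch assumes ASCII-only markup, and it is obfuscation, not encryption. <code><br />
// turn every byte of $html into %XX and let JavaScript decode it again<br />
function obfuscateForJs ($html) {<br />
    $encoded = &quot;&quot;;<br />
    $length  = strlen($html);<br />
    for ($i = 0; $i &lt; $length; $i++) {<br />
        $encoded .= sprintf(&quot;%%%02X&quot;, ord($html[$i]));<br />
    }<br />
    return &quot;document.write(unescape(\&quot;&quot; . $encoded . &quot;\&quot;));&quot;;<br />
}<br />
</code><br />
A robot which doesn&#8217;t execute JavaScript sees only a string of hex codes, whereas anything that does execute scripts gets the ad. That means smart crawlers can still decode it.</p>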
<h3>Always redirect to affiliate URLs</h3>
<p>There&#8217;s absolutely no point in using ugly affiliate URLs on your pages. Actually, that&#8217;s the last thing you want to do for various reasons.</p>
<ul>
<li>For example, affiliate URLs as well as source codes can change, and you don&#8217;t want to edit tons of pages if that happens.</li>
<li>When an affiliate program doesn&#8217;t work for you, goes belly up or bans you, you need to route all clicks to another destination when the shit hits the fan. In an ideal world, you&#8217;d replace outdated ads completely with one mouse click or so.</li>
<li>Tracking ad clicks is no fun when you need to pull your stats from various sites, all of them in another time zone, using their own &#8211;often confusing&#8211; layouts, providing different views on your data, and delivering program specific interpretations of impressions or click throughs. Also, if you don&#8217;t track your outgoing traffic, some sponsors will cheat and you can&#8217;t prove your gut feelings.</li>
<li>Scrapers can steal revenue by replacing affiliate codes in URLs, but may overlook hard coded absolute URLs which don&#8217;t smell like affiliate URLs.</li>
<li><b>&#8230;</b></li>
</ul>
<p>When you replace all affiliate URLs with the URL of a smart redirect script on one of your domains, you can really <b>manage your affiliate links</b>. There are many more good reasons for utilizing ad-servers, for example smart search engines which might think that your advertising is overwhelming.</p>
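<p>What &#8220;managing&#8221; can mean in code: a lookup table which maps ad names to destinations, with an optional expiry date and a replacement. This is a sketch under my own assumptions, not the real script. The function name resolveSponsorUrl(), the array layout, and the example.com/example.org URLs are made up. <code><br />
// $ads maps ad names to arrays with &quot;url&quot; and optionally &quot;expires&quot;<br />
// (Unix timestamp) and &quot;fallback&quot; (name of the ad to serve instead)<br />
function resolveSponsorUrl ($adName, $ads, $now) {<br />
    $seen = array();<br />
    while (isset($ads[$adName]) &#038;&#038; !isset($seen[$adName])) {<br />
        $seen[$adName] = TRUE; // guards against circular fallbacks<br />
        $ad = $ads[$adName];<br />
        if (empty($ad[&quot;expires&quot;]) || $ad[&quot;expires&quot;] &gt; $now) {<br />
            return $ad[&quot;url&quot;];<br />
        }<br />
        if (empty($ad[&quot;fallback&quot;])) break;<br />
        $adName = $ad[&quot;fallback&quot;];<br />
    }<br />
    return &quot;&quot;; // unknown ad, or expired without replacement<br />
}<br />
$ads = array(<br />
    &quot;conference&quot; =&gt; array(&quot;url&quot;      =&gt; &quot;http://example.com/conf?aff=123&quot;,<br />
                          &quot;expires&quot;  =&gt; mktime(0, 0, 0, 1, 1, 2008),<br />
                          &quot;fallback&quot; =&gt; &quot;ebook&quot;),<br />
    &quot;ebook&quot;      =&gt; array(&quot;url&quot;      =&gt; &quot;http://example.org/ebook?partner=abc&quot;)<br />
);<br />
</code><br />
Because the destination comes from your own table and never from the query string, the redirect script can&#8217;t be abused as an open redirector, and swapping a dead program for a living one is a one line edit.</p>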
<p>Affiliate links provide great footprints. Unique URL parts or <b>query string variable names</b> gathered by Google from all affiliate programs out there are one clear signal they use to identify affiliate links. The <b>values</b> identify the single affiliate marketer. Google loves to identify networks of ((thin) affiliate) sites by affiliate IDs. That does not mean that Google detects each and every affiliate link at the time of the very first fetch by Ms. Googlebot and the possibly following indexing. Processes identifying pages with (many) affiliate links and sites plastered with ads instead of unique contents can run afterwards, utilizing a well indexed database of links and linking patterns, reporting the findings to the search index or delivering minus points to the query engine. Also, that doesn&#8217;t mean that affiliate URLs are the one and only trackable footmark Google relies on. But that&#8217;s one trackable footprint you can avoid to some degree. </p>
<p>If the redirect-script&#8217;s location is on the same server (in fact it&#8217;s not thanks to symlinks) and not named &#8220;adserver&#8221; or so, chances are that a heuristic check won&#8217;t identify the link&#8217;s intent as promotional. Of course statistical methods can discover your affiliate links by analyzing patterns, but those might be similar to patterns which have nothing to do with advertising, for example click tracking of editorial votes, links to contact pages which aren&#8217;t crawlable with parameters, or similar &#8220;legit&#8221; stuff. You can&#8217;t fool smart algos forever, but if you&#8217;ve a good reason to hide ads every little bit might help. Of course, providing lots of great contents countervails lots of ads (from a search engine&#8217;s point of view, and users might agree on this).</p>
<p>Besides all these (pseudo) black hat thoughts and reasoning, there is a way more important advantage of redirecting links to sponsors: blocking crawlers. Yup, search engine crawlers must not follow affiliate URLs, because it doesn&#8217;t benefit you (<a href="http://sebastians-pamphlets.com/google-recommends-screwing-affiliates-in-exchange-for-better-serp-positioning/">usually</a>). Actually, every affiliate link is a useless PageRank leak. Why should you boost the merchant&#8217;s search engine rankings? Better take care of your own rankings by hiding such outgoing links from crawlers, and stopping crawlers before they spot the redirect, in case they accidentally found an affiliate link without link condom.</p>
<h3>The behavior of an adserver URL masking an affiliate link</h3>
<p>Let&#8217;s look at the redirect-script&#8217;s URL from my code example above:<br />
<a href="http://sebastians-pamphlets.com/ads/router.php?adName=seobook&#038;adServed=banner">/propaganda/router.php?adName=seobook&#038;adServed=banner</a><br />
On request of router.php the $adName variable identifies the affiliate link, $adServed tells which sort/type/variation of ad was clicked, and all that gets stored with a timestamp under title and URL of the page carrying the advert. </p>
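<p>The storage step could be as simple as appending one delimited line per click to a flat file. A minimal sketch, assuming PHP 5 or newer: the function name logAdClick() and the field order are my own choice for this example, not what router.php really does. <code><br />
// append one click record to $logFile; the delimiter and line breaks are<br />
// replaced inside the fields, so a record always stays on one line<br />
function logAdClick ($logFile, $fields, $delimiter) {<br />
    $clean = array();<br />
    foreach ($fields as $field) {<br />
        $clean[] = str_replace(array($delimiter, &quot;\r&quot;, &quot;\n&quot;), &quot; &quot;, $field);<br />
    }<br />
    $line = implode($delimiter, $clean) . &quot;\n&quot;;<br />
    $fp = @fopen($logFile, &quot;a&quot;);<br />
    if (!$fp) return FALSE;<br />
    flock($fp, LOCK_EX);   // concurrent clicks must not interleave<br />
    fwrite($fp, $line);<br />
    flock($fp, LOCK_UN);<br />
    fclose($fp);<br />
    return TRUE;<br />
}<br />
</code><br />
A call like logAdClick($clickLog, array($timestamp, $adName, $adServed, $referrer), $delimiter) records the click, where $clickLog is a hypothetical file in the protected directory. Appending in &quot;a&quot; mode avoids rewriting the whole file on every request.</p>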
<p>Now that we&#8217;ve covered the statistical requirements, router.php calls the checkCrawlerIP() function, which sets $isSpider to TRUE only when the user agent identifies a search engine crawler, the reverse DNS lookup of the requestor&#8217;s IP address returns a host name of that search engine, and a forward lookup of this host name resolves back to the requestor&#8217;s IP addy.</p>
<p>If the requestor is not a verified crawler, router.php does a 307 redirect to the sponsor&#8217;s landing page: <code><br />
$sponsorUrl      = &quot;http://www.seobook.com/262.html&quot;;<br />
$requestProtocol = $_SERVER[&quot;SERVER_PROTOCOL&quot;];<br />
$protocolArr     = explode(&quot;/&quot;,$requestProtocol);<br />
$protocolName    = trim($protocolArr[0]);<br />
$protocolVersion = trim($protocolArr[1]);<br />
if (stristr($protocolName,&quot;HTTP&quot;)<br />
    &#038;&#038; strtolower($protocolVersion) > &quot;1.0&quot; ) {<br />
    $httpStatusCode = 307;<br />
}<br />
else {<br />
    $httpStatusCode = 302;<br />
}<br />
$httpStatusLine = &quot;$requestProtocol $httpStatusCode Temporary Redirect&quot;;<br />
@header($httpStatusLine, TRUE, $httpStatusCode);<br />
@header(&quot;Location: $sponsorUrl&quot;);<br />
exit;</code><br />
A 307 redirect avoids caching issues, because user agents must not cache a 307 response unless a Cache-Control or Expires header explicitly allows it. That means that changes of sponsor URLs take effect immediately, even when the user agent has cached the destination page from a previous redirect. If the request came in via HTTP/1.0, we must perform a 302 redirect, because the 307 response code was introduced with HTTP/1.1 and some older user agents might not be able to handle 307 redirects properly. Some user agents cache the locations provided by 302 redirects, so when they run into a page known to redirect, they might request the outdated location. For obvious reasons we can&#8217;t use the 301 response code, because 301 redirects are cacheable by default. (<a href="http://sebastians-pamphlets.com/the-anatomy-of-http-redirects-301-302-307/">More information on HTTP redirects</a>.)</p>
<p>If the requestor is a major search engine&#8217;s crawler, we perform the most brutal bounce back known to man: <code><br />
if ($isSpider) {<br />
    @header(&quot;HTTP/1.1 403 Sorry Crawlers Not Allowed&quot;, TRUE, 403);<br />
    @header(&quot;X-Robots-Tag: nofollow,noindex,noarchive&quot;);<br />
    exit;<br />
}</code><br />
The 403 response code translates to &#8220;kiss my ass and get the fuck outta here&#8221;. The X-Robots-Tag in the HTTP header instructs crawlers that the requested URL must not be indexed, doesn&#8217;t provide links the poor beast could follow, and must not be publicly cached by search engines. In other words the HTTP header tells the search engine &#8220;forget this URL, don&#8217;t request it again&#8221;. Of course we could use the 410 response code instead, which tells the requestor that a resource is irrevocably dead, gone, vanished, non-existent, and further requests are forbidden. Both the 403-Forbidden response as well as the 410-Gone return code prevent you from URL-only listings on the SERPs (once the URL was crawled). Personally, I prefer the 403 response, because it perfectly and unmistakably expresses my opinion on this sort of search engine guidelines, although currently nobody except Google understands or supports X-Robots-Tags in HTTP headers.</p>
<p>If you don&#8217;t use URLs provided by affiliate programs, your affiliate links can never influence search engine rankings, hence the engines are happy because you did their job so obediently. Not that they otherwise would count (most of) your affiliate links for rankings, but forcing you to castrate your links yourself makes their life much easier, and you don&#8217;t need to live in fear of penalties.</p>
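<p>If you want to switch between both variants, the header lines can come from a tiny helper. This is just a sketch with a made-up function name, crawlerBounceHeaders(), using the standard reason phrases instead of my custom one: <code><br />
// header lines for bouncing a crawler with 403 or 410; anything else falls back to 403<br />
function crawlerBounceHeaders ($statusCode) {<br />
    $reasons = array(403 =&gt; &quot;Forbidden&quot;, 410 =&gt; &quot;Gone&quot;);<br />
    if (!isset($reasons[$statusCode])) $statusCode = 403;<br />
    return array(&quot;HTTP/1.1 &quot; . $statusCode . &quot; &quot; . $reasons[$statusCode],<br />
                 &quot;X-Robots-Tag: nofollow,noindex,noarchive&quot;);<br />
}<br />
</code><br />
Inside the <code>if ($isSpider)</code> block, pass each line returned by crawlerBounceHeaders(410) to header() and exit afterwards.</p>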
<h3 id="recap-hide-afflinks">Recap</h3>
<p><img src="http://sebastians-pamphlets.com/img/posts/prospering-affiliate-link.png" width="200" height="200" border="0" align="right" style="margin-left:4px;" alt="NICE: prospering affiliate link" title="Prospering affiliate link" />Before you output a page carrying ads, paid links, or other selfish links with commercial intent, check if the requestor is a search engine crawler, and act accordingly.</p>
<p>Don&#8217;t deliver different (editorial) contents to users and crawlers, but also don&#8217;t serve ads to crawlers. They just don&#8217;t buy your eBook or whatever you sell, unless a search engine sends out Web robots with credit cards able to understand Ajax, respectively authorized to fill out and submit Web forms.</p>
<p>Your ads look plain ugly with dotted borders in firebrick, hence don&#8217;t apply rel=&#8221;nofollow&#8221; to links when the requestor is not a search engine crawler. The engines are happy with machine-readable disclosures, and you can discuss everything else with the FTC yourself.</p>
<p>No nay never use links or content provided by affiliate programs on your pages. Encapsulate this kind of content delivery in AdServers. </p>
<p>Do not allow search engine crawlers to follow your affiliate links, paid links, or other disliked votes as per search engine guidelines. Of course condomizing such links is not your responsibility, but getting penalized for not doing Google&#8217;s job is not exactly funny.</p>
<p>I admit that some of the stuff above is for extremely paranoid folks only, but knowing how to be paranoid might prevent you from making silly mistakes. Just because you believe that you&#8217;re not paranoid, that does not mean Google will not chase you down. You really don&#8217;t need to be a so-called black hat to displease Google. Not knowing, or not understanding, <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35769">Google&#8217;s 12 commandments</a> doesn&#8217;t prevent you from being spanked for sins you&#8217;ve never heard of. If you&#8217;re keen on Google&#8217;s nicely targeted traffic, better play by Google&#8217;s rules, leastwise on crawler requests.</p>
<p>Feel free to contribute your tips and tricks in the comments.</p>
<hr />Copyright &copy; 2008 <strong><a href="http://sebastians-pamphlets.com/">Sebastian`s Pamphlets</a></strong>. This Feed is for personal non-commercial use only. If you are not reading this material in your news aggregator/feed reader, the site you are looking at is guilty of copyright infringement and will be put down immediately. Please contact sebastians-pamphlets.com so we can take legal action immediately.<br /><span style="float: right;font-size: 7pt"><a href="http://blog.taragana.com/index.php/archive/wordpress-plugins-provided-by-taraganacom/">Plugin</a> by <a href="http://www.taragana.com/">Taragana</a></span>]]></content:encoded>
			<wfw:commentRss>http://sebastians-pamphlets.com/linking-guide-for-paranoid-affiliate-marketers/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Text link broker woes: Google&#8217;s smart paid link sniffers</title>
		<link>http://sebastians-pamphlets.com/text-link-broker-woes-smart-paid-links-sniffers-fromgoogle/</link>
		<comments>http://sebastians-pamphlets.com/text-link-broker-woes-smart-paid-links-sniffers-fromgoogle/#comments</comments>
		<pubDate>Tue, 06 Nov 2007 16:44:19 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Risky Linkage]]></category>

		<category><![CDATA[Link Building]]></category>

		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[Anchor Text]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/text-link-broker-woes-smart-paid-links-sniffers-fromgoogle/</guid>
		<description><![CDATA[After the recent toolbar PageRank massacre, link brokers are in the spotlight. One of them, TNX beta1, asked me to post a paid review of their service. It took a while to explain that nobody can buy a sales pitch here. I offered to write a pitilessly honest review for a low hourly fee, provided [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://sebastians-pamphlets.com/img/google/googles-smart-paid-link-sniffer.png" width="200" height="192" border="0" align="right" style="margin-left:4px;" alt="Google's smart paid link sniffer at work" title="Google's smart paid link sniffer at work" />After the recent toolbar PageRank massacre, link brokers are in the spotlight. One of them, <a href="http://sebastians-pamphlets.com/ads/router.php?adName=paidreviews&#038;adServed=http://www.tnx.net/">TNX beta</a><sup><a href="http://sebastians-pamphlets.com/links/full-disclosure/">1</a></sup>, asked me to post a paid review of their service. It took a while to explain that nobody can buy a sales pitch here. I offered to write a pitilessly honest review for a low hourly fee, provided a sample on their request, but got no order or payment yet. Never mind. Since the topic is hot, here&#8217;s my review, paid or not.</p>
<p>So what does TNX offer? Basically it&#8217;s a semi-automated link exchange where everybody can sign up to sell and/or purchase text links. TNX takes 25% commission, 12.5% from the publisher, and 12.5% from the advertiser. They calculate the prices based on Google&#8217;s toolbar PageRank and link popularity pulled from Yahoo. For example a <a href="http://www.resourceshelf.com/" title="Example I've used for this calculation">site</a> putting five blocks of four links each on one page with toolbar PageRank 4/10 and four pages with a toolbar PR 3/10 will earn $46.80 monthly.</p>
<p>TNX provides a <a href="http://forums.digitalpoint.com/showpost.php?s=65a5dfb0898e19d51b87fdb14d456457&#038;p=4940146&#038;postcount=1">tool</a> to vary the links, so that when an advertiser purchases for example 100 links it&#8217;s possible to output those in 100 variations of anchor text as well as surrounding text before and after the A element, on possibly 100 different sites. Also TNX has a solution to increase the number of links slowly, so that search engines can&#8217;t find a gazillion uniform links to a (new) site all of a sudden. Whether or not that&#8217;s sufficient to simulate natural link growth remains an unanswered question, because I&#8217;ve no access to their algorithm.</p>
<p>Links as well as participating sites are reviewed by TNX staff, and frequently checked with bots. Links shouldn&#8217;t appear on pages which aren&#8217;t indexed by search engines or viewed by humans, or on 404 pages, pages with long and ugly URLs and such. They don&#8217;t accept <acronym title="Porn, Pills, Casinos">PPC</acronym> links or offensive ads. </p>
<p>All links are output server-side, which requires PHP or Perl (ASP/ASPX coming soon). There is a cache option, so it&#8217;s not necessary to download the links from the TNX servers for each page view. TNX recommends renaming the /cache/ directory to avoid an easily detectable sign of the occurrence of TNX paid links on a Web site. Links are <a href="http://sebastians-pamphlets.com/ads/router.php?adName=paidreviews&#038;adServed=http://members.tnx.net/generated_test.txt">stored</a> as plain HTML; besides the <code>target="_blank"</code> attribute there is no obvious footprint or pattern on link level. Example: <code><br />
Have a website? See this &lt;a href="http://www.example.com" target="_blank"&gt;free affiliate program&lt;/a&gt;.<br />
Have a blog? Check this &lt;a href="http://www.example.com" target="_blank"&gt;affiliate program with high comissions&lt;/a&gt; for publishers. </code><br />
Webmasters can enter any string as delimiter, for example <code>&lt;br /&gt;</code> or &#8220;&bull;&#8221;:<br />
<blockquote>Have a website? See this <a href="http://www.sebastians-pamphlets.com/" title="EXAMPLE">free affiliate program</a>. &bull; Have a blog? Check this <a href="http://www.sebastians-pamphlets.com/"  title="EXAMPLE">affiliate program with high comissions</a> for publishers. </p></blockquote>
<p>Publishers can choose from 17 niches, 7 languages, 5 linkpop levels, and 7 toolbar PageRank values to target their ads.</p>
<p>According to the system stats in the members area, the service is widely used:<br />
<blockquote>
<ul>
<li>As of today [2007-11-06] we have 31,802 users (daily growth: +0.62%)</li>
<li>Links in the system: 31,431,380</li>
<li>Links created in last hour: 1,616</li>
<li>Number of pages indexed by TNX: 37,221,398</li>
</ul>
</blockquote>
<p>Long story short, TNX jumped through many hoops to develop a system which is supposed to trade paid links that are undetectable by search engines. Is that so?</p>
<p>The major weak point is the system&#8217;s growth and that its users are humans. Even if such a system were perfect, users would make mistakes and reveal the whole network to search engines. <b>Here is how Google has identified most if not all of the TNX paid links</b>:</p>
<p>Some Webmasters put their TNX links in sidebars under a label that identifies them as paid links. Google crawled those pages, and stored the link destinations in its paid links database. Also, they devalued at least the labelled links; possibly the whole page, or even the complete site, lost its ability to pass link juice because the few paid links weren&#8217;t condomized.</p>
<p>Many Webmasters implemented their TNX links in templates, so that they appear on a large number of pages. Actually, that&#8217;s recommended by TNX. Even if the advertisers have used the text variation tool, their URLs appeared multiple times on each site. Google can detect site wide links, even if not each and every link appears on all pages, and flags them accordingly.</p>
<p>Maybe even a few Googlers have signed up and served the TNX links on their personal sites to gather examples, although that wasn&#8217;t necessary because so many Webmasters with URLs in their signatures have told Google in this <a href="http://forums.digitalpoint.com/showthread.php?t=477444">DP thread</a> that they&#8217;ve signed up and at least tested TNX links on their pages.</p>
<p>Next Google compared the anchor text as well as the surrounding text of all flagged links, and found some patterns. Of course putting text before and after the linked anchor text seems to be a smart way to fake a natural link, but in fact Webmasters applied a bullet-proof procedure to outsmart themselves. With multiple occurrences of the same text constellations pointing to a URL, especially when found on unrelated sites (different owners, hosts etc.; topical irrelevancy plays no role in this context), paid link detection is a breeze. Linkage like that may be &#8220;natural&#8221; with regard to patterns like site wide advertising or navigation, but a lookup in Google&#8217;s links database revealed that the same text constellations and URLs were found on <em>n</em>&nbsp; other sites too.</p>
<p>Now that Google had compiled the seed, each and every instance of Googlebot delivered more evidence. It took Google only one crawl cycle to identify most sites carrying TNX links, and all TNX advertisers. Paid link flags from pages on sites with a low crawling frequency were delivered later on. Meanwhile Google has drawn a comprehensive picture of the whole TNX network.</p>
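<p>To illustrate why repeated text constellations are such an easy target, here is a minimal sketch in PHP. It is not Google&#8217;s algorithm, which nobody outside Google knows. The function name <code>flagLinkPatterns</code>, the layout of the input array, and the threshold of three hosts are assumptions made up for this example. The sketch groups crawled links by destination URL plus anchor and surrounding text, and flags every constellation that shows up on at least <em>n</em> different hosts: <code><br />
&lt;?php<br />
// Hypothetical sketch, not a real search engine algorithm.<br />
// $links: list of arrays with the keys host, url, before, anchor, after<br />
function flagLinkPatterns (array $links, $minHosts = 3) {<br />
    $hostsPerPattern = array();<br />
    foreach ($links as $link) {<br />
        // normalize the text, so that case and padding can not hide a match<br />
        $text = strtolower(trim($link[&quot;before&quot;]) .&quot;|&quot; .trim($link[&quot;anchor&quot;]) .&quot;|&quot; .trim($link[&quot;after&quot;]));<br />
        $key  = $link[&quot;url&quot;] .&quot;\n&quot; .$text;<br />
        // using the host as array key counts each host only once per pattern<br />
        $hostsPerPattern[$key][$link[&quot;host&quot;]] = TRUE;<br />
    }<br />
    $flagged = array();<br />
    foreach ($hostsPerPattern as $key => $hosts) {<br />
        if (count($hosts) >= $minHosts) {<br />
            list($url, $text) = explode(&quot;\n&quot;, $key, 2);<br />
            $flagged[] = array(&quot;url&quot; => $url, &quot;text&quot; => $text, &quot;hosts&quot; => array_keys($hosts));<br />
        }<br />
    }<br />
    return $flagged;<br />
}<br />
?&gt;   </code></p>
<p>Feeding this function the links found on a handful of crawled pages returns each suspicious URL along with the hosts carrying it. A site wide link on a single host is counted once, so only constellations spread over several hosts get flagged.</p>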
<p>I developed such a link network many years ago (it&#8217;s defunct now). It was successful because only very experienced Webmasters controlling a fair amount of squeaky clean sites were invited. Allowing newbies to participate in such an organized link swindle is the kiss of death, because newbies do make newbie mistakes, and Google makes use of newbie mistakes to catch all participants. By the way, with the capabilities Google has today, my former approach to manipulate rankings with artificial linkage would be detectable with statistical methods similar to the algo outlined above, despite the closed circle of savvy participants.</p>
<p>From reading the various DP threads about TNX as well as their sales pitches, I&#8217;ve recognized a very popular misunderstanding of Google&#8217;s mentality. Folks are worrying whether an algo can detect the intention of links or not, usually focusing on particular links or linking methods. Google on the other hand looks at the whole crawlable Web. When they develop a paid link detection algo, they have a copy of the known universe to play with, as well as a complete history of each and every hyperlink crawled by Ms. Googlebot since 1998 or so. Naturally, their statistical methods will catch massive artificial linkage first, but fine tuning the sensitivity of paid link sniffers, or creating variants to cover different linking patterns, is no big deal. Of course there is always a way to hide a paid link, but nobody can hide millions of them. </p>
<p>Unfortunately, the unique selling point of the TNX service &#8211;that goes for all link brokers by the way&#8211; is manipulation of search engine rankings, hence if they offered nofollow&#8217;ed links to trade traffic instead of PageRank, most probably they would be forced to reduce the prices. Since TNX links are rather cheap, I&#8217;m not sure that would pay. It would be a shame if they decided to change the business model and it didn&#8217;t pay off for TNX, because the underlying concept is great. It just shouldn&#8217;t be used to exchange clean links. All the tricks developed to outsmart Google, like the text variation tool or not putting links on not exactly trafficked pages, are suitable to serve non-repetitive ads (coming with attractive CTRs) to humans.</p>
<p>I&#8217;ve asked TNX: I&#8217;ve decided to review your service on my blog, regardless whether you pay me or not. The result of my research is that I can&#8217;t recommend TNX in its current shape. If you still want a paid review, and/or a quote in the article, I&#8217;ve a question: <strong>Provided Google has drawn a detailed picture of your complete network, are you ready to switch to nofollow&#8217;ed links in order to trade traffic instead of PageRank, possibly with slightly reduced prices?</strong> Their answer:<br />
<blockquote>We would be glad to accept your offer of a free review, because we don&#8217;t want to pay for a negative review.<br />
Nobody can draw a detailed picture of our network - it&#8217;s impossible for one advertiser to buy links from all or a majority sites of our network. Many webmasters choose only relevant advertisers.<br />
We will not switch to nofollow&#8217;ed links, but we are planning not to use Google PR for link pricing in the near future - we plan to use our own real-time page-value rank.</p></blockquote>
<p>Well, it&#8217;s not necessary to find one or more links on all sites to identify a network. </p>
]]></content:encoded>
			<wfw:commentRss>http://sebastians-pamphlets.com/text-link-broker-woes-smart-paid-links-sniffers-fromgoogle/feed/</wfw:commentRss>
		</item>
		<item>
		<title>A pragmatic defence against Google&#8217;s anti paid links campaign</title>
		<link>http://sebastians-pamphlets.com/a-pragmatic-defense-against-googles-anti-paid-links-campaign/</link>
		<comments>http://sebastians-pamphlets.com/a-pragmatic-defense-against-googles-anti-paid-links-campaign/#comments</comments>
		<pubDate>Fri, 26 Oct 2007 14:39:00 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[Risky Linkage]]></category>

		<category><![CDATA[Search Quality]]></category>

		<category><![CDATA[Crawler Directives]]></category>

		<category><![CDATA[Cloaking]]></category>

		<category><![CDATA[Google]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[Nofollow]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/a-pragmatic-defense-against-googles-anti-paid-links-campaign/</guid>
		<description><![CDATA[Google&#8217;s recent shot across the bows of a gazillion sites handling paid links, advertising, or internal cross links not compliant with Google&#8217;s imagination of a natural link is a call for action. Google&#8217;s message is clear: &#8220;condomize your commercial links or suffer&#8221; (from deducted toolbar PageRank, links without the ability to pass real PageRank and [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://sebastians-pamphlets.com/google-pagerank-deductions-october-2007/">Google&#8217;s recent shot across the bows</a> of a gazillion sites handling <a href="http://sebastians-pamphlets.com/links/categories/?cat=paid-links">paid links</a>, <a href="http://sebastians-pamphlets.com/google-recommends-screwing-affiliates-in-exchange-for-better-serp-positioning/">advertising</a>, or <a href="http://sebastians-pamphlets.com/links/categories/?cat=risky-linkage">internal cross links</a> not compliant with <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66736">Google&#8217;s imagination of a natural link</a> is a call for action. Google&#8217;s message is clear: &#8220;<a href="http://sebastians-pamphlets.com/links/categories/?cat=nofollow">condomize</a> your commercial links or suffer&#8221; (from deducted toolbar PageRank, links without the ability to pass real PageRank and relevancy signals, or perhaps even penalties).</p>
<p><img src="http://sebastians-pamphlets.com/img/posts/paid-links-evil-versus-good.png" width="250" height="116" align="right" style="margin-left:4px;" alt="Paid links: good versus evil" title="Paid links: Google versus Web" />Of course that&#8217;s somewhat evil, because applying <a href="http://sebastians-pamphlets.com/links/categories/?cat=nofollow">nofollow values</a> to all sorts of links is not exactly a natural thing to do; visitors don&#8217;t care about invisible link attributes and sometimes they&#8217;re even pissed when they get redirected to a URL not displayed in their status bar. Also, this requirement forces Webmasters to invest enormous efforts in code maintenance for the sole purpose of satisfying search engines. The argument &#8220;if Google doesn&#8217;t like these links, then they can discount them in their system, without bothering us&#8221; has its merits, but unfortunately that&#8217;s not the way Google&#8217;s cookie crumbles for various reasons. Hence let&#8217;s develop a pragmatic procedure to handle those links.</p>
<h3>The problem</h3>
<p>Google thinks that uncondomized paid links as well as commercial links to sponsors or affiliated entities aren&#8217;t natural, because the terms &#8220;sponsor|pay for review|advertising|my other site|sign-up|&#8230;&#8221; and &#8220;editorial vote&#8221; are not compatible in the sense of <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35769">Google&#8217;s guidelines</a>. This view of the Web&#8217;s linkage is pretty black vs. white.</p>
<p>Either you link out because a sponsor bought ads, or you don&#8217;t sell ads and link out for free because you honestly think your visitors will like a page. Links to sponsors without condom are black, links to sites you like and which you don&#8217;t label &#8220;sponsor&#8221; are white. </p>
<p>There&#8217;s nothing in between; gray areas like links to hand-picked sponsors on a page with a gazillion links count as black. Google doesn&#8217;t care whether or not your clean links actually pass a reasonable amount of PageRank to link destinations which buy ad space too; the sole possibility that those links <em>could</em>&nbsp; influence search results is enough to qualify you as sort of a link seller. </p>
<p>The same goes for paid reviews on blogs and whatnot, see for example <a href="http://andybeard.eu/2007/10/penalty-confirmed-but-i-dont-sell-pagerank.html">Andy&#8217;s problem</a> with his honest reviews which Google classifies as paid links, and of course all sorts of traffic deals, <a href="http://sebastians-pamphlets.com/google-recommends-screwing-affiliates-in-exchange-for-better-serp-positioning/">affiliate links</a>, banner ads and stuff like that. </p>
<p>You don&#8217;t even need to label a clean link as advert or sponsored. If the link destination matches a domain in Google&#8217;s database of on-line advertisers, link buyers, e-commerce sites / merchants etcetera, or Google figures out that you link too much to affiliated sites or other sites you own or control, then your toolbar PageRank is toast and most probably your outgoing links will be penalized. Possibly these penalties have an impact on your internal links too, which results in less PageRank landing on subsidiary pages. Less PageRank gathered by your landing pages means less crawling, less ranking, fewer SERP referrers, less revenue.</p>
<h3>The solution</h3>
<p>You&#8217;re absolutely right when you say that such search engine nitpicking should not force you to throw nofollow crap on your links like confetti. From your and my point of view condomizing links is wrong, but sometimes it&#8217;s better to pragmatically comply with such policies in order to stay in the game.  </p>
<p>Although uncrawlable redirect scripts have advantages in some cases, the simplest procedure to condomize a link is the <a href="http://sebastians-pamphlets.com/links/categories/?cat=nofollow">rel-nofollow</a> <a href="http://sebastians-pamphlets.com/links/categories/?cat=microformats">microformat</a>. Here is an example of a googlified affiliate link:<code><br />
&lt;a href="http://sponsor.com/?affID=1" rel="nofollow"&gt;Sponsor&lt;/a&gt;</code></p>
<h3>Why serve your visitors search engine crawler directives?</h3>
<p>Complying with Google&#8217;s laws does not mean that you must deliver <a href="http://sebastians-pamphlets.com/links/categories/?cat=crawler-directives">crawler directives</a> like rel=&#8221;nofollow&#8221; to your visitors. Since Google is concerned about search engine rankings influenced by uncondomized links with commercial intent, serving crawler directives to crawlers and clean links to users is perfectly in line with Google&#8217;s goals. Actually, initiatives like the <a href="http://sebastians-pamphlets.com/links/categories/?cat=x-robots-tag">X-Robots-Tag</a> make clear that hiding crawler directives from users is fine with Google. To underline that, here is a quote from <a href="http://www.mattcutts.com/blog/hidden-links/">Matt Cutts</a>:<br />
<blockquote>[&#8230;] If you want to sell a link, <b>you should at least provide machine-readable disclosure</b> for paid links by making your link in a way that doesn’t affect search engines. [&#8230;]</p>
<p>The other best practice I’d advise is to provide human readable disclosure that a link/review/article is paid. You could put a badge on your site to disclose that some links, posts, or reviews are paid, but including the disclosure on a per-post level would better. Even something as simple as &#8220;This is a paid review&#8221; fulfills the human-readable aspect of disclosing a paid article. [&#8230;]</p>
<p><b>Google’s quality guidelines are more concerned with the machine-readable aspect of disclosing paid links/posts</b> [&#8230;]</p>
<p>To make sure that you’re in good shape, go with both human-readable disclosure and machine-readable disclosure, using any of the methods [uncrawlable redirects, rel-nofollow] I mentioned above.<br />
[emphasis mine]</p></blockquote>
<p>Since Google devalues paid links anyway, search engine friendly cloaking of rel-nofollow for Googlebot is a non-issue with advertisers, as long as this fact is disclosed. I bet most link buyers look at the magic green pixels anyway, but that&#8217;s their problem.</p>
<h3>How to cloak rel-nofollow for search engine crawlers</h3>
<p>I&#8217;ll discuss a PHP/Apache example, but this method is easily adaptable to other server-side scripting languages like ASP. If you&#8217;ve a static site and PHP is available on your (*ix) host, you need to tell Apache that you&#8217;re using PHP in .html (.htm) files. Put this statement in your root&#8217;s .htaccess file: <code><br />
AddType application/x-httpd-php .html .htm</code></p>
<p>Next create a plain text file, insert the code below, and upload it as &#8220;funct_nofollow.php&#8221; or so to your server&#8217;s root directory (or a subdirectory, but then you need to change some code below). <code><br />
&lt;?php<br />
function makeRelAttribute ($linkClass) {<br />
    // optional 2nd input parameter: $relValue<br />
    $relValue                   = &quot;&quot;;<br />
    if (func_num_args() >= 2) {<br />
        $relValue = func_get_arg(1) .&quot; &quot;;<br />
    }<br />
    // referrer and user agent can be missing, e.g. on direct requests<br />
    $referrer                   = isset($_SERVER[&quot;HTTP_REFERER&quot;]) ? $_SERVER[&quot;HTTP_REFERER&quot;] : &quot;&quot;;<br />
    $refUrl                     = parse_url($referrer);<br />
    $refHost                    = isset($refUrl[&quot;host&quot;]) ? $refUrl[&quot;host&quot;] : &quot;&quot;;<br />
    $isSerpReferrer             = FALSE;<br />
    if (stristr($refHost, &quot;google.&quot;) ||<br />
        stristr($refHost, &quot;yahoo.&quot;))<br />
        $isSerpReferrer         = TRUE;<br />
    $userAgent                  = isset($_SERVER[&quot;HTTP_USER_AGENT&quot;]) ? $_SERVER[&quot;HTTP_USER_AGENT&quot;] : &quot;&quot;;<br />
    $isCrawler                  = FALSE;<br />
    if (stristr($userAgent, &quot;Googlebot&quot;) ||<br />
        stristr($userAgent, &quot;Slurp&quot;))<br />
        $isCrawler              = TRUE;<br />
    // crawlers (and optionally visitors coming from SERPs) get the link condoms<br />
    if ($isCrawler  <b>/*</b>|| $isSerpReferrer<b>*/</b> ) {<br />
        if (&quot;$linkClass&quot; == &quot;ad&quot;)   $relValue .= &quot;advertising nofollow&quot;;<br />
        if (&quot;$linkClass&quot; == &quot;paid&quot;) $relValue .= &quot;sponsored nofollow&quot;;<br />
        if (&quot;$linkClass&quot; == &quot;own&quot;)  $relValue .= &quot;affiliated nofollow&quot;;<br />
        if (&quot;$linkClass&quot; == &quot;vote&quot;) $relValue .= &quot;editorial dofollow&quot;;<br />
    }<br />
    // no REL attribute at all when there is nothing to say<br />
    $relValue = trim($relValue);<br />
    if ($relValue == &quot;&quot;)<br />
        return &quot;&quot;;<br />
    return &quot; rel=\&quot;&quot; .$relValue .&quot;\&quot; &quot;;<br />
} // end function makeRelAttribute<br />
?&gt;   </code></p>
<p>Next put the code below in a PHP file you&#8217;ve included in all scripts, for example header.php. If you&#8217;ve static pages, then insert the code at the very top. <code><br />
&lt;?php<br />
@include($_SERVER[&quot;DOCUMENT_ROOT&quot;] .&quot;/funct_nofollow.php&quot;);<br />
?&gt;   </code><br />
Do not paste the function <code>makeRelAttribute</code> itself! If you spread code this way you&#8217;ve to edit tons of files when you need to change the functionality later on.</p>
<p>Now you can use the function <code>makeRelAttribute($linkClass,$relValue)</code> within the scripts or HTML pages. The function has an input parameter $linkClass and knows the (self-explanatory) values &#8220;ad&#8221;, &#8220;paid&#8221;, &#8220;own&#8221; and &#8220;vote&#8221;. The second (optional) input parameter is a value for the <a href="http://www.smart-it-consulting.com/article.htm?node=155&#038;page=90#a-rel">A element&#8217;s REL attribute</a> itself. If you provide it, the crawler directives get appended to it, or, if <code>makeRelAttribute</code> doesn&#8217;t detect a spider, it creates a REL attribute with this value alone. Examples below. You can add more user agents, or serve rel-nofollow to visitors coming from SERPs by enabling the <code>|| $isSerpReferrer</code> condition (remove the bold <code>/*</code>&amp;<code>*/</code>).</p>
<p>When you code a hyperlink, just add the function to the A tag. Here is a PHP example: <code><br />
print &quot;&lt;a href=\&quot;http://google.com/\&quot;&quot; .makeRelAttribute(&quot;ad&quot;) .&quot;&gt;Google&lt;/a&gt;&quot;; </code><br />
will output<br />
<code>&lt;a href=&quot;http://google.com/&quot; rel=&quot;advertising nofollow&quot; &gt;Google&lt;/a&gt;</code><br />
when the user agent is Googlebot, and<br />
<code>&lt;a href=&quot;http://google.com/&quot;&gt;Google&lt;/a&gt;</code><br />
to a browser.</p>
<p>If you can&#8217;t write nice PHP code, for example because you&#8217;ve to follow crappy guidelines and worst practices with a WordPress blog, then you can mix <span style="color:blue;">HTML</span> and <span style="color:green;">PHP</span> tags: <code><br />
<span style="color:blue;">&lt;a href=&quot;http://search.yahoo.com/&quot;</span><span style="color:green;">&lt;?php print makeRelAttribute(&quot;paid&quot;); ?&gt;</span><span style="color:blue;">&gt;Yahoo&lt;/a&gt;</span>   </code></p>
<p>Please note that this method is not safe with search engines or unfriendly competitors when you want to cloak for other purposes. Also, the link condoms are served to crawlers only, that means search engine staff reviewing your site with a non-crawler user agent name won&#8217;t spot the nofollow&#8217;ed links unless they check the engine&#8217;s cached page copy. An HTML comment in HEAD like &#8220;This site serves machine-readable disclosures, e.g. crawler directives like rel-nofollow applied to links with commercial intent, to Web robots only.&#8221; as well as a similar comment line in robots.txt would certainly help to pass reviews by humans.</p>
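<p>The user agent check is weak because everybody can fake a crawler&#8217;s user agent name. If that matters to you, a reverse DNS lookup followed by a forward lookup is a common way to verify a crawler&#8217;s IP address. The sketch below is not part of the function above, and it comes with assumptions: the function name <code>isVerifiedCrawler</code> is hypothetical, the list of crawler host name endings is just an example you&#8217;d have to maintain yourself, and the two resolver parameters exist only to make the code testable without DNS access. It handles IPv4 addresses only: <code><br />
&lt;?php<br />
// Hypothetical helper: verify a crawler by reverse and forward DNS lookup.<br />
function isVerifiedCrawler ($ip, $reverse = &quot;gethostbyaddr&quot;, $forward = &quot;gethostbyname&quot;) {<br />
    $host = call_user_func($reverse, $ip);<br />
    // gethostbyaddr() returns the unmodified IP when the lookup fails<br />
    if (!$host || $host == $ip)<br />
        return FALSE;<br />
    // assumption: verified crawlers resolve to hosts ending with these names<br />
    if (!preg_match(&#039;/\.(googlebot\.com|google\.com|crawl\.yahoo\.net)$/i&#039;, $host))<br />
        return FALSE;<br />
    // forward-confirm: the host name must resolve back to the requesting IP<br />
    return call_user_func($forward, $host) == $ip;<br />
}<br />
?&gt;   </code></p>
<p>You could call it with <code>$_SERVER[&quot;REMOTE_ADDR&quot;]</code> once the user agent claims to be a crawler. DNS lookups are slow, so caching the results per IP address is probably a good idea.</p>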
<h3>A Google-friendly way to handle paid links, affiliate links, and cross linking</h3>
<p>Load this page with different user agents and referrers. You can do this for example with a Firefox extension like <a href="http://sebastians-pamphlets.com/referrer-spoofing-with-prefbar-341/">PrefBar</a>. For testing purposes you can use these user agent names: <code><br />
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)<br />
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp) </code><br />
and these SERP referrer URLs: <code><br />
http://google.com/search?q=viagra<br />
http://search.yahoo.com/search?p=viagra&#038;ei=utf-8&#038;iscqry=&#038;fr=sfp </code><br />
Just enter these values in PrefBar&#8217;s user agent and referrer spoofing options (click &#8220;Customize&#8221; on the toolbar, select &#8220;User Agent&#8221; / &#8220;Referrerspoof&#8221;, click &#8220;Edit&#8221;, add a new item, label it, then insert the strings above). Here is the code above in action:</p>
<table style="margin-bottom:30px;">
<tr>
<td valign="top"><b>Referrer URL:</b></td>
<td valign="top"></td>
</tr>
<tr>
<td valign="top"><b>User Agent Name:</b></td>
<td valign="top">CCBot/1.0 (+http://www.commoncrawl.org/bot.html)</td>
</tr>
<tr>
<td valign="top"><b>Ad</b> makeRelAttribute(&quot;ad&quot;): </td>
<td valign="top"><a href="http://google.com/">Google</a> <code></code></td>
</tr>
<tr>
<td valign="top"><b>Paid</b> makeRelAttribute(&quot;paid&quot;): </td>
<td valign="top"><a href="http://search.yahoo.com/"  >Yahoo</a> <code></code></td>
</tr>
<tr>
<td valign="top"><b>Own</b> makeRelAttribute(&quot;own&quot;): </td>
<td valign="top"><a href="http://sebastians-pamphlets.com/"  >Sebastian&#8217;s Pamphlets</a> <code></code></td>
</tr>
<tr>
<td valign="top"><b>Vote</b> makeRelAttribute(&quot;vote&quot;): </td>
<td valign="top"><a href="http://link-condom.com/"  >The Link Condom</a> <code></code></td>
</tr>
<tr>
<td valign="top"><b>External</b> makeRelAttribute(&quot;&quot;, &quot;external&quot;): </td>
<td valign="top"><a href="http://w3.org/"  rel="external"  >W3C</a> <code> rel="external" </code></td>
</tr>
<tr>
<td valign="top"><b>Without parameters</b> makeRelAttribute(&quot;&quot;): </td>
<td valign="top"><a href="http://sphinn.com/"  >Sphinn</a> <code></code></td>
</tr>
</table>
<p>When you change your browser&#8217;s user agent to a crawler name, or fake a SERP referrer, the REL value will appear in the right column.</p>
<p>If you&#8217;ve developed a better solution, or if you&#8217;ve a nofollow-cloaking tutorial for other programming languages or platforms, please let me know in the comments. Thanks in advance!</p>
]]></content:encoded>
			<wfw:commentRss>http://sebastians-pamphlets.com/a-pragmatic-defense-against-googles-anti-paid-links-campaign/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Google Toolbar PageRank deductions make sense</title>
		<link>http://sebastians-pamphlets.com/google-pagerank-deductions-october-2007/</link>
		<comments>http://sebastians-pamphlets.com/google-pagerank-deductions-october-2007/#comments</comments>
		<pubDate>Wed, 24 Oct 2007 21:19:22 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Risky Linkage]]></category>

		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/google-pagerank-deductions-october-2007/</guid>
		<description><![CDATA[Toolbar PR has been stale since April, and now only a few sites were &#8220;updated&#8221; without any traffic losses, so I can imagine that&#8217;s just a &#8220;watch out&#8221; signal from Google, not yet a penalty. Of course it&#8217;s not a conventional toolbar PageRank update, because new pages aren&#8217;t affected. That means the deductions are not caused [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://sebastians-pamphlets.com/img/posts/google-warnings-via-toolbar-pagerank-deductions.png" width="200" height="337" align="right" style="margin-left:4px;" alt="Google policing the Web's linkage" title="Google policing the Web's linkage" />Toolbar PR has been stale since April, and now only a few sites were <a href="http://www.searchenginejournal.com/google-drops-pagerank-for-many-sites-paid-links-or-new-algorithm/5890/">&#8220;updated&#8221; without any traffic losses</a>, so I can imagine that&#8217;s just a &#8220;watch out&#8221; signal from Google, not yet a penalty. Of course it&#8217;s not a conventional toolbar PageRank update, because new pages aren&#8217;t affected. That means the deductions are not caused by a finite amount of PageRank spread over more pages discovered by Google since the last toolbar PR update. </p>
<p>Unfortunately, in the current <a href="http://www.gregboser.com/toolbar-hysteria/">toolbar PR hysteria</a> next to nobody tries to figure out Google&#8217;s message. Crying foul is not very helpful, since Google is not exactly known as a company revising such decisions based on Webmaster rants lashing &#8220;unfair penalties&#8221;.</p>
<p>By the way, I think <a href="http://andybeard.eu/2007/10/pagerank-update.html">Andy</a> is spot on. Paid links are <a href="http://searchengineland.com/071007-173841.php">definitely</a> a cause of toolbar PageRank downgrades. Artificial links of any kind are another issue. Google obviously has a different take on interlinking and crosslinking, for example. Site owners argue that it makes business sense, but Google might think most of these links come without value for their users. And there are tons more pretty common instances of &#8220;link monkey business&#8221;.</p>
<p>Maybe Google alerts all sorts of sites violating the <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35769#quality">SEO bible&#8217;s twelve commandments</a> with a few less green pixels, before they roll out new filters which would catch those sins and penalize the offending pages accordingly. Actually, this would make a lot of sense. </p>
<p>All site owners and Webmasters monitor their toolbar PR. Any significant changes are discussed in a huge community. If the crowd assumes that artificial links cause toolbar PR deductions, many sites will change their linkage. This happened already after the first shot across the bows two weeks ago. And it will work again. Google gets the desired results: less disliked linkage, fewer sites selling uncondomized links. </p>
<p>That&#8217;s quite smart. Google has learned that they can&#8217;t ban or overpenalize popular sites, because that leads to fucked up search results for not only navigational search queries, in other words pissed searchers. Taking back a few green pixels from the toolbar on the other hand is not an effective penalty, because toolbar PR is unrelated to everything that matters. It is however a message with guaranteed delivery.</p>
<p>Running algos in development stage on the whole index and using their findings to manipulate toolbar PageRank data hurts nobody, but might force many Webmasters to change their stuff in order to comply with Google&#8217;s laws. As a side effect, this procedure even helps to avoid too much collateral damage when the actual filters become active later on. </p>
<p>There seems to exist another pattern. Most sites targeted by the recent toolbar PageRank deductions are SEO aware to some degree. They will spread the word. And complain loudly. Google has quite a few folks on the payroll who monitor the blogosphere, SEO forums, Webmaster hangouts and whatnot. Analyzing such reactions is a great way to gather input usable to validate and fine tune not yet launched algos.</p>
<p>Of course that&#8217;s sheer speculation. What do you think, does Google use toolbar PR as a &#8220;change your stuff or find yourself kicked out soon&#8221; message? Or is it just an attempt to make link selling less attractive?</p>
<p><b>Update</b>  Insightful posts on Google&#8217;s toolbar PageRank manipulations:
<ul>
<li><a href="http://www.seroundtable.com/archives/015129.html">What Does This Google PageRank Message Mean?</a> SE Roundtable</li>
<li><a href="http://www.highrankings.com/advisor/paid-link-smack/">Google’s Paid-link Smack in the Face</a> Jill Whalen</li>
<li><a href="http://www.gregboser.com/toolbar-hysteria">Toolbar Hysteria – It Isn’t Really a Penalty</a> Greg Boser</li>
<li><a href="http://www.jlh-design.com/2007/10/digital-point-members-put-on-suicide-watch/">Digital Point Members put on Suicide Watch</a> John Honeck</li>
<li><a href="http://www.cornwallseo.com/search/index.php/2007/10/24/my-opinion-on-the-google-page-rank-massacre/">My Opinion on the Google Page Rank Massacre</a> Lyndon Antcliff</li>
<li><a href="http://www.searchengineguide.com/jennifer-laycock/anyone-have-some-boots-i-could-borrow.php">Anyone Have Some Boots I Could Borrow?</a> Jennifer Laycock</li>
<li><a href="http://www.searchenginejournal.com/matt-cutts-confirms-paid-links-google-pagerank-update/5906/">Matt Cutts confirms paid links message via Google Toolbar PageRank update</a> Loren Baker / Matt Cutts</li>
</ul>
<p><b>And here is a pragmatic answer to Google&#8217;s paid links requirements: <a href="http://sebastians-pamphlets.com/a-pragmatic-defense-against-googles-anti-paid-links-campaign/">Cloak the hell out of your links with commercial intent!</a></b></p>
]]></content:encoded>
			<wfw:commentRss>http://sebastians-pamphlets.com/google-pagerank-deductions-october-2007/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Google says you must manage your affiliate links in order to get indexed</title>
		<link>http://sebastians-pamphlets.com/google-recommends-screwing-affiliates-in-exchange-for-better-serp-positioning/</link>
		<comments>http://sebastians-pamphlets.com/google-recommends-screwing-affiliates-in-exchange-for-better-serp-positioning/#comments</comments>
		<pubDate>Wed, 12 Sep 2007 15:31:16 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Search Quality]]></category>

		<category><![CDATA[Duplicate Content]]></category>

		<category><![CDATA[Redirects]]></category>

		<category><![CDATA[Internet Marketing]]></category>

		<category><![CDATA[Risky Linkage]]></category>

		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[E-Commerce]]></category>

		<category><![CDATA[Cloaking]]></category>

		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/google-recommends-screwing-affiliates-in-exchange-for-better-serp-positioning/</guid>
		<description><![CDATA[I&#8217;ve worked hard to overtake the SERP positions of a couple of merchants allowing me to link to them with an affiliate ID, and now the almighty Google tells the sponsors they must screw me with internal 301 redirects to rescue their rankings. Bugger. Since I read the shocking news on Google&#8217;s official Webmaster blog this [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://sebastians-pamphlets.com/img/posts/screwing-affiliates.png" border="0" width="200" height="234" alt="Screwing affiliates recommended by Google ;=)" title="Screwing affiliates recommended by Google ;=)" align="right" />I&#8217;ve worked hard to overtake the SERP positions of a couple of merchants allowing me to link to them with an affiliate ID, and now the almighty Google tells the sponsors they must screw me with internal 301 redirects to rescue their rankings. Bugger. Since I read the <a href="http://googlewebmastercentral.blogspot.com/2007/09/google-duplicate-content-caused-by-url.html">shocking news</a> on Google&#8217;s official Webmaster blog this morning I worked on a counter strategy, with success. Affiliate programs will not screw me, not even with Google&#8217;s help. They&#8217;ll be hoist by their own petard. I&#8217;ll strike back with <a href="http://sebastians-pamphlets.com/links/categories/?cat=nofollow">nofollow</a> and I&#8217;ll take no prisoners.</p>
<p>Seriously, the story reads a little differently and is not breaking news at all. <a href="http://groups.google.com/groups/profile?enc_user=UUR88zMAAAC0ZCEBAysSlShC_gPAdXUZBk4iRL7ea_2wrzhBYOac1941dLVcXuhkYDUCu-gHSafWeW2F1jzNcUJkZz1jkOrt">Maile Ohye</a> from Google just endorsed best practices I&#8217;ve <a href="http://www.smart-it-consulting.com/article.htm?node=148&#038;page=103">recommended</a> for ages. Here is my recap.</p>
<h3>The problem</h3>
<p>Actually, there are problems on both sides of an affiliate link. The affiliate needs to hide these links from Google to avoid a so-called &#8220;thin affiliate site penalty&#8221;, and the affiliate program suffers from duplicate content issues, link juice dilution, and often even URL hijacking by affiliate links.</p>
<p>Diligent affiliates gathering tons of PageRank on their pages can &#8220;unintentionally&#8221; overtake URLs on the SERPs by fooling the canonicalization algos. When Google discovers lots of links from strong pages on different hosts pointing to <code>http://sponsor.com/?affid=me</code> and this page adds <code>?affid=me</code> to its internal links, my URL on the sponsor&#8217;s site can &#8220;outrank&#8221; the official home page, or landing page, <code>http://sponsor.com/</code>. When I choose the right anchor text, Google will feed my affiliate page with free traffic, whilst the affiliate program&#8217;s very own pages don&#8217;t exist on the SERPs.</p>
<h3>Managing incoming affiliate links (merchants)</h3>
<p>The best procedure is capturing all incoming traffic before a single byte of content is sent to the user agent, extracting the affiliate ID from the URL, storing it in a cookie, then 301-redirecting the user agent to the canonical version of the landing page, that is a page without affiliate or user specific parameters in the URL. That goes for all user agents (humans accepting the cookie and Web robots which don&#8217;t accept cookies and start a new session with every request).</p>
<p>Users not accepting cookies are redirected to a version of the landing page blocked by robots.txt, the affiliate ID sticks with the URLs in this case. Search engine crawlers, identified by their user agent name or whatever, are treated as users and shall never see (internal) links to URLs with tracking parameters in the query string. </p>
<p>This 301 redirect passes all the link juice, that is PageRank &amp; Co. as well as anchor text, to the canonical URL. Search engines can no longer index page versions owned by affiliates. (This procedure doesn&#8217;t prevent you from 302 hijacking where your content gets indexed under the affiliate&#8217;s URL.)</p>
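<p>The merchant-side procedure above can be sketched in a few lines. This is a minimal illustration in Python, not the code of any particular shop system: the function name <code>plan_redirect</code> and the cookie format are made up for the example, and the parameter name <code>affid</code> is simply taken from the sample URL above. It covers only the cookie-accepting path, so the robots.txt-blocked fallback for visitors without cookies is left out, and a real handler would validate the ID before putting it into a cookie.</p>

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the affiliate ID travels in a query parameter named "affid",
# as in the sample URL http://sponsor.com/?affid=me.
AFFILIATE_PARAM = "affid"


def plan_redirect(url):
    """Decide how to answer a request before any content is sent.

    Returns a (status, location, cookie) tuple. If the URL carries an
    affiliate ID, the ID moves into a cookie and the client gets a 301
    to the same URL without that parameter. Otherwise the URL is already
    canonical and (200, url, None) is returned.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)

    affiliate_ids = [value for key, value in query if key == AFFILIATE_PARAM]
    if not affiliate_ids:
        return 200, url, None

    # Keep every other parameter, drop only the tracking one.
    remaining = [(key, value) for key, value in query if key != AFFILIATE_PARAM]
    canonical = urlunsplit(
        (parts.scheme, parts.netloc, parts.path or "/", urlencode(remaining), "")
    )

    # Hypothetical cookie format; the first ID wins if several are present.
    cookie = f"{AFFILIATE_PARAM}={affiliate_ids[0]}; Path=/"
    return 301, canonical, cookie
```

<p>With this sketch, <code>plan_redirect("http://sponsor.com/?affid=me")</code> answers with a 301 to <code>http://sponsor.com/</code> plus a cookie carrying the ID, so crawlers and humans alike end up on the canonical URL.</p>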
<h3>Putting safe affiliate links (online marketers)</h3>
<p>Honestly, there&#8217;s no such thing as a safe affiliate link, at least not safe with regard to picky search engines. Masking complex URLs with redirect services like tinyurl.com or so doesn&#8217;t help, because the crawlers get the real URL from the redirect header and will leave a note in the record of the original link on the page carrying the affiliate link. Anyways, the tiny URL will fool most visitors, and if you own the redirect service it makes managing affiliate links easier.</p>
<p>Of course you can cloak the hell out of your thin affiliate pages by showing the engines links to authority pages whilst humans get the ads, but then better forget the Google traffic (I know, I know &#8230; cloaking still works if you can handle it properly, but not everybody can handle the risks so better leave that to the experts). </p>
<p>There&#8217;s only one official approach to make a page plastered with affiliate links safe with search engines: replace it with a content-rich page (Google wants unique and compelling content, and checks its uniqueness), then sensibly work in the commercial links. Ideally, link to the merchants from within the content, apply rel-nofollow to all affiliate links, and avoid banner farms in the sidebars and above the fold.</p>
<p><small><b>Update:</b> I&#8217;ve sanitized the title, &#8220;Google says you must screw your affiliates in order to get indexed&#8221; was not one of my best title baits.</small></p>
]]></content:encoded>
			<wfw:commentRss>http://sebastians-pamphlets.com/google-recommends-screwing-affiliates-in-exchange-for-better-serp-positioning/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Who is responsible for the paid link mess?</title>
		<link>http://sebastians-pamphlets.com/who-is-responsible-for-the-paid-link-mess/</link>
		<comments>http://sebastians-pamphlets.com/who-is-responsible-for-the-paid-link-mess/#comments</comments>
		<pubDate>Thu, 05 Jul 2007 17:13:00 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Fun]]></category>

		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/who-is-responsible-for-the-paid-link-mess/</guid>
		<description><![CDATA[Look at this graph showing the number of [buy link] searches since 2004: Interestingly this search term starts out in September or October 2004, and shows a quite stable trend until the recent paid links debate started. 
Who or what caused SEOs to massively buy links since 2004?

The Playboy interview with Google cofounders Larry Page [...]]]></description>
			<content:encoded><![CDATA[<p>Look at this graph showing the number of [<a href="http://www.google.com/trends?q=buy+link&#038;ctab=0&#038;hl=en&#038;geo=all&#038;date=all&#038;sort=0">buy link</a>] searches since 2004:<br /><a href="http://www.smart-it-consulting.com/img/misc/buy-link-trend.png"><img src="http://www.smart-it-consulting.com/img/misc/buy-link-trend.png" border="1" bordercolor="silver" width="99%" alt="" title=""  /></a> <br />Interestingly this search term starts out in September or October 2004, and shows a quite stable trend until the recent paid links debate started. </p>
<p><strong>Who or what caused SEOs to massively buy links since 2004?</strong>
<ul>
<li>The Playboy interview with Google cofounders Larry Page and Sergey Brin just before Google was about to go public?</li>
<li>Google&#8217;s IPO?</li>
<li>Rumors that Google ran out of index space and therefore might restrict the number of doorway pages in the search index?</li>
<li>Nick Wilson preparing the launch of Threadwatch?</li>
<li>AdWords and Overture no longer running gambling ads?</li>
<li>The Internet Advancement scandal?</li>
<li>Google&#8217;s shortage of beer at the SES Google dance?</li>
<li>A couple <a href="http://maps.google.com/maps?um=1&#038;tab=wl&#038;hl=en&#038;lr=&#038;rls=GGGL%2CGGGL%3A2004-07%2CGGGL%3Aen&#038;q=Ripon%20UK">UK</a> <a href="http://www.smart-it-consulting.com/img/misc/who-invented-bought-organic-rankings.png">based</a> SEOs invented <strong>bought organic rankings</strong>?</li>
</ul>
<p>Seriously, buying links for rankings was an established practice way before 2004. If you know the answer, or if you&#8217;ve a somewhat plausible theory, leave it in the comments. I&#8217;m really curious. Thanks.</p>
]]></content:encoded>
			<wfw:commentRss>http://sebastians-pamphlets.com/who-is-responsible-for-the-paid-link-mess/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Google to kill the power of links</title>
		<link>http://sebastians-pamphlets.com/google-to-kill-the-power-of-links/</link>
		<comments>http://sebastians-pamphlets.com/google-to-kill-the-power-of-links/#comments</comments>
		<pubDate>Fri, 08 Jun 2007 15:57:00 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Reciprocal Links]]></category>

		<category><![CDATA[Search Quality]]></category>

		<category><![CDATA[Risky Linkage]]></category>

		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[Google]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[Nofollow]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/google-to-kill-the-power-of-links/</guid>
		<description><![CDATA[Well, a few types of links will survive and don&#8217;t do evil in Google&#8217;s search index   &#160;&#160; I&#8217;ve updated my first take on Google&#8217;s updated guidelines stating paid links and reciprocal links are evil. Well, regardless of whether one likes or dislikes this policy, it&#8217;s already factored in - case closed by Google. There [...]]]></description>
			<content:encoded><![CDATA[<p>Well, a few types of links will survive and don&#8217;t do evil in Google&#8217;s search index <img src='http://sebastians-pamphlets.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  &nbsp;&nbsp; I&#8217;ve updated my first take on <a href="http://googlewebmastercentral.blogspot.com/2007/06/more-details-about-our-webmaster.html">Google&#8217;s updated guidelines</a> stating <a href="http://sebastianx.blogspot.com/2007/06/google-enhances-quality-guidelines.html#paid-links">paid links and reciprocal links are evil</a>. Well, regardless of whether one likes or dislikes this policy, it&#8217;s already factored in - case closed by Google. There are so many ways to generate natural links &#8230; </p>
<p>The <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66736&#038;topic=8524">official call for paid-link reports</a> is pretty much disliked across the boards:<br /><a href="http://www.threadwatch.org/node/15095">Google is Now The Morality Police on the Internet</a><br /><a href="http://fantomaster.com/fantomNews/archives/2007/06/05/googles-ideal-webmaster-snitch-rake-it-in-and-dont-deliver/">Google&#8217;s Ideal Webmaster: Snitch, Rake It In And Don&#8217;t Deliver</a><br /><a href="http://www.jlh-design.com/2007/06/other-sites-can-hurt-your-ranking/">Other sites can hurt your ranking</a><br /><a href="http://www.seroundtable.com/archives/013761.html">Google&#8217;s Updated Webmaster Guidelines Addresses Linking Practices</a><br /><a href="http://www.webmasterworld.com/google/3359535.htm">Google clarifies its stance on links</a></p>
<p>More information, and discussion of paid/exchanged links in my pamphlets:<br /><a href="http://sebastianx.blogspot.com/2007/05/google-hunts-paid-links-and-reciprocal.html">Matt Cutts and Adam Lasnik define &#8220;paid link&#8221;</a><br /><a href="http://sebastianx.blogspot.com/2007/04/where-is-precise-definition-of-paid.html">Where is the precise definition of a paid link?</a><br /><a href="http://sebastianx.blogspot.com/2007/04/full-disclosure-of-paid-links.html">Full disclosure of paid links</a><br /><a href="http://sebastianx.blogspot.com/2007/04/revise-your-linkage-now.html">Revise your linkage</a><br /><a href="http://sebastianx.blogspot.com/2007/04/link-monkey-business-is-not-worth.html">Link monkey business is not worth a whoop</a><br /><a href="http://sebastianx.blogspot.com/2006/02/is-buying-and-selling-text-links-risky.html">Is buying and selling links risky?</a> (02/2006)</p>
]]></content:encoded>
			<wfw:commentRss>http://sebastians-pamphlets.com/google-to-kill-the-power-of-links/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Google enhances the quality guidelines</title>
		<link>http://sebastians-pamphlets.com/google-enhances-the-quality-guidelines/</link>
		<comments>http://sebastians-pamphlets.com/google-enhances-the-quality-guidelines/#comments</comments>
		<pubDate>Tue, 05 Jun 2007 22:18:00 +0000</pubDate>
		<dc:creator>Sebastian</dc:creator>
		
		<category><![CDATA[Reciprocal Links]]></category>

		<category><![CDATA[Webspam]]></category>

		<category><![CDATA[Search Quality]]></category>

		<category><![CDATA[Risky Linkage]]></category>

		<category><![CDATA[Paid Links]]></category>

		<category><![CDATA[SEO]]></category>

		<category><![CDATA[Webmaster Central]]></category>

		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://sebastians-pamphlets.com/google-enhances-the-quality-guidelines/</guid>
		<description><![CDATA[Maybe today&#8217;s update of Google&#8217;s quality guidelines is the first phase of the Webmaster help system revamp project. I know there&#8217;s more to come, Google has great plans for the help center. So don&#8217;t miss out on the opportunity to tell Google&#8217;s Webmaster Central team what you&#8217;d like to have added or changed. Only 14 [...]]]></description>
			<content:encoded><![CDATA[<p>Maybe today&#8217;s update of Google&#8217;s quality guidelines is the first phase of the <a href="http://sebastianx.blogspot.com/2007/06/help-google-revealing-secret-sauce.html"><em>Webmaster help system revamp</em> project</a>. I know there&#8217;s more to come, <a href="http://googlewebmastercentral.blogspot.com/2007/06/revamping-webmaster-tools-help-center.html">Google has great plans for the help center</a>. So don&#8217;t miss out on the opportunity to <a href="http://groups.google.com/group/Google_Webmaster_Help-Requests/browse_thread/thread/fe39c698f998075d/">tell Google&#8217;s Webmaster Central team what you&#8217;d like to have added or changed</a>. Only 14 replies to this call for input is evidence of incapacity, shame on the Webmaster community.</p>
<p>I haven&#8217;t had the time to write a full-blown review of the updates, so here are just a few remarks from a Webmaster&#8217;s perspective. Scroll down to <strong><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35769#quality">Quality guidelines</a> - specific guidelines</strong> to view the updates, that means click the links to the new (sometimes overlapping) detail pages.</p>
<p><a href="http://sebastianx.blogspot.com/2007/06/help-google-revealing-secret-sauce.html">As always</a>, the guidelines outline best practices of Web development, refer to common sense, and don&#8217;t encourage over-interpretations (not that those are avoidable, nor utterly useless). Now providing Webmasters with more explanatory directives, detailed definitions and even examples in the &#8220;Don&#8217;ts&#8221; section is very much appreciated. Look at the over five years old <a href="http://sebastianx.blogspot.com/2007/06/help-google-revealing-secret-sauce.html">first version</a> of this document before you bitch <img src='http://sebastians-pamphlets.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p><strong style="font-size:12pt; margin-bottom:5px;">Avoid <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66353">hidden text or hidden links</a></strong><br />The new help page on hidden text and links is descriptive and comes with examples, well done. What I miss is a hint with regard to CSS menus and other content which is hidden until the user performs a particular action. Google states &#8220;Text (such as excessive keywords) can be hidden in several ways, including [&#8230;] Using CSS to hide text&#8221;. The same goes for links by the way. I wish they would add something along the lines of &#8220;&#8230; Using CSS to hide text in a way that a user can&#8217;t visualize it by a common action like moving the mouse over a pointer to a hidden element, or clicking a text link or descriptive widget or icon&#8221;. The hint at the bottom &#8220;If you do find hidden text or links on your site, either remove them or, if they are relevant for your site&#8217;s visitors, make them easily viewable&#8221; comes close to this but lacks an example. </p>
<p><a href="http://groups.google.com/groups/profile?enc_user=bfiHVTYAAAC0ZCEBAysSlShC_gPAdXUZVBJwbVrQrjnMBIELTfghYz0h8XMSxFluWFJdWS1BQ425JCaQw8q0rq28lNpPHZyv">Susan Moskwa</a> from Google <a href="http://groups.google.com/group/Google_Webmaster_Help-Indexing/browse_thread/thread/928aa76a1226cf89/07ff235c6aeae4ef#07ff235c6aeae4ef">clarifies</a> what one can hide with CSS, and what sorts of CSS hidden stuff is considered a violation of the guidelines, in the Google forum on June/11/2007:<br />
<blockquote>If your intent in hiding text is to deceive the search engines, we frown on that; if your intent is purely to improve the visual user experience (e.g. by replacing some text with a fancier image of that same text), you don&#8217;t need to worry. Of course, as with many techniques, there are shades of gray between &#8220;this is clearly deceptive and wrong&#8221; and &#8220;this is perfectly acceptable&#8221;. Matt [Cutts] did say that hiding text moves you a step further towards the gray area. But if you&#8217;re running a perfectly legitimate site, you don&#8217;t need to worry about it. If, on the other hand, your site already exhibits a bunch of other semi-shady techniques, hidden text starts to look like one more item on that list. [&#8230;] As the Guidelines say, focus on intent. If you&#8217;re using CSS techniques purely to improve your users&#8217; experience and/or accessibility, you shouldn&#8217;t need to worry. One good way to keep it on the up-and-up (if you&#8217;re replacing text w/ images) is to make sure the text you&#8217;re hiding is being replaced by an image with the exact same text.</p></blockquote>
<p><strong style="font-size:12pt; margin-bottom:5px;" id="cloaking">Don&#8217;t use <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66355">cloaking or sneaky redirects</a></strong><br />This sentence in bold red blinking uppercase letters should be pinned 5 pixels below the heading: &#8220;When examining [&#8230;] your site to ensure your site adheres to our guidelines, <b>consider the intent</b>&#8221; (emphasis mine). There are so many perfectly legit ways to present content that it is impossible to map particular techniques to good or bad intent, or vice versa. </p>
<p>I think this page leads to misinterpretations. The major point of confusion is that Google argues completely from a search engine&#8217;s perspective and doesn&#8217;t write for the targeted audience, that is Webmasters and Web developers. Instead of all the talk about users vs. search engines, it should distinguish plain user agents (crawlers, text browsers, JavaScript disabled &#8230;) from enhanced user agents (JS/AJAX enabled, installed and activated plug-ins &#8230;). Don&#8217;t get me wrong, this page gives the right advice, but the good advice is somewhat obfuscated in phrases like &#8220;Rather, you should consider visitors to your site who are unable to view these elements as well&#8221;. </p>
<p>For example &#8220;Serving a page of HTML text to search engines, while showing a page of images or Flash to users [is considered deceptive cloaking]&#8221; puts down a gazillion of legit sites which serve the same contents in different formats (and often under different URLs) depending on the ability of the current user agent to render particular stuff like Flash, and a bazillion of perfectly legit AJAX driven sites which provide crawlers and text browsers with a somewhat static structure of HTML pages, too. </p>
<p>&#8220;Serving different content to search engines than to users [is considered deceptive cloaking]&#8221; puts it better, because in reverse that reads &#8220;Feel free to serve identical contents under different URLs and in different formats to users and search engines. Just make sure that you accurately detect the capabilities of the user agent before you decide to alter a requested plain HTML page into a fancy conglomerate of flashing widgets with sound and other good vibrations, or vice versa&#8221;.  </p>
<p><strong style="font-size:12pt; margin-bottom:5px;">Don&#8217;t send <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66357">automated queries to Google</a></strong><br />This page doesn&#8217;t provide much more information than the paragraph on the main page, but there&#8217;s not that much to explain: don&#8217;t use WebPosition Gold&trade;. Period.</p>
<p><strong style="font-size:12pt; margin-bottom:5px;">Don&#8217;t <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66358">load pages with irrelevant keywords</a></strong><br />Tells why keyword stuffing is not a bright idea, nothing to note.</p>
<p><strong style="font-size:12pt; margin-bottom:5px;">Don&#8217;t create multiple pages, subdomains, or domains with substantially <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66359">duplicate content</a></strong><br />This detail page is a must read. It starts with a to the point definition &#8220;Duplicate content generally refers to substantive blocks of content within <em>or across</em> domains that either completely match other content or are appreciably similar&#8221;, followed by a ton of good tips and valuable information. And fortunately it expresses that there&#8217;s no such thing as a general duplicate content penalty.</p>
<p><strong style="font-size:12pt; margin-bottom:5px;">Don&#8217;t <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66354">create pages that install viruses, trojans, or other badware</a></strong><br />Describes <a href="http://blogs.stopbadware.org/articles/2007/02/26/google-expands-badware-notifications-for-webmasters">Google&#8217;s</a> <a href="http://googlewebmastercentral.blogspot.com/2007/02/better-badware-notifications-for.html">service</a> in <a href="http://www.stopbadware.org/home/thirdparties">partnership</a> with <a href="http://www.stopbadware.org/home/clearinghouse">StopBADware.org</a>, highlighting the <a href="http://stopbadware.org/home/review">quickest procedure to get Google&#8217;s malware warning removed</a>.  </p>
<p><strong style="font-size:12pt; margin-bottom:5px;">Avoid <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66355">&#8220;doorway&#8221; pages created just for search engines</a>, or other &#8220;cookie cutter&#8221; approaches such as affiliate programs with <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66361">little or no original content</a></strong><br />The info on doorway pages is just a paragraph on the &#8220;cloaking and sneaky redirect&#8221; page. I miss a few tips on how one can identify unintentional doorway pages created by just bad design, without any deceptive intent. Also, I think a few sentences on thin SERP-like pages would be helpful in this context.</p>
<p>&#8220;Little or no original content&#8221; targets thin affiliate sites, again doorway pages, auto-generated content, and scraped content. It becomes clear that Google does not love <acronym title="Made For AdSense">MFA</acronym> sites.</p>
<p><strong style="font-size:12pt; margin-bottom:5px;">If your site participates in an affiliate program, make s