<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.2.3" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: Crawling vs. Indexing</title>
	<link>http://sebastians-pamphlets.com/crawling-vs-indexing/</link>
	<description>If you've read my articles somewhere on the Internet, expect something different here.</description>
	<pubDate>Mon, 21 May 2012 23:50:26 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.3</generator>

	<item>
		<title>By: eUKhost</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-2967</link>
		<dc:creator>eUKhost</dc:creator>
		<pubDate>Wed, 23 Nov 2011 09:14:42 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-2967</guid>
		<description>Thanks for explaining the difference between Crawling and Indexing nicely. The way you have explained it is very easy and helpful.</description>
		<content:encoded><![CDATA[<p>Thanks for explaining the difference between Crawling and Indexing nicely. The way you have explained it is very easy and helpful.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: How to Spam Your Competitors&#8217; Search Results</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-2821</link>
		<dc:creator>How to Spam Your Competitors&#8217; Search Results</dc:creator>
		<pubDate>Sun, 06 Mar 2011 17:39:53 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-2821</guid>
		<description>[...] them from appearing in Google&#8217;s search results, then you need to study the difference between crawling and indexing. The only thing I found interesting about this particular SERP listing is the title. As far as I [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] them from appearing in Google&#8217;s search results, then you need to study the difference between crawling and indexing. The only thing I found interesting about this particular SERP listing is the title. As far as I [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: How do Majestic and LinkScape get their raw data?</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-2169</link>
		<dc:creator>How do Majestic and LinkScape get their raw data?</dc:creator>
		<pubDate>Thu, 21 Jan 2010 21:01:29 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-2169</guid>
		<description>[...] 63 million root index pages, carrying 700 billion links&#8221;. 13 links per page is plausible. Crawling 55 billion URIs requires sending out HTTP GET requests to fetch 55 billion Web resources within 45 [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] 63 million root index pages, carrying 700 billion links&#8221;. 13 links per page is plausible. Crawling 55 billion URIs requires sending out HTTP GET requests to fetch 55 billion Web resources within 45 [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: The perfect robots.txt for News Corp</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1997</link>
		<dc:creator>The perfect robots.txt for News Corp</dc:creator>
		<pubDate>Mon, 07 Dec 2009 08:25:36 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1997</guid>
		<description>[...] I appreciate Google&#8217;s brand new News User Agent. It is, however, not a perfect solution, because it doesn&#8217;t distinguish indexing and crawling. [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] I appreciate Google&#8217;s brand new News User Agent. It is, however, not a perfect solution, because it doesn&#8217;t distinguish indexing and crawling. [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Zen Cart SEO - 12 Steps to Success &#124; E-Commerce for All</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1778</link>
		<dc:creator>Zen Cart SEO - 12 Steps to Success &#124; E-Commerce for All</dc:creator>
		<pubDate>Thu, 10 Sep 2009 13:58:30 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1778</guid>
		<description>[...] to do it properly. Note, however, and I don&#8217;t care what anyone else has told you&#8230;. Your robots.txt WILL NOT prevent indexing or remove indexed pages. To achieve this you will need a noindex tag in the page in [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] to do it properly. Note, however, and I don&#8217;t care what anyone else has told you&#8230;. Your robots.txt WILL NOT prevent indexing or remove indexed pages. To achieve this you will need a noindex tag in the page in [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gregg Hilferding &#187; Blog Archive &#187; You too can have a web crawler and index!</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1465</link>
		<dc:creator>Gregg Hilferding &#187; Blog Archive &#187; You too can have a web crawler and index!</dc:creator>
		<pubDate>Tue, 21 Oct 2008 20:08:54 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1465</guid>
		<description>[...] not SEOmoz has their own crawler/spider. All the word play is confusing. If you haven&#8217;t yet, this explanation of Crawling vs. Indexing clarifies the terms [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] not SEOmoz has their own crawler/spider. All the word play is confusing. If you haven&#8217;t yet, this explanation of Crawling vs. Indexing clarifies the terms [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Doug Heil</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1464</link>
		<dc:creator>Doug Heil</dc:creator>
		<pubDate>Tue, 21 Oct 2008 19:04:23 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1464</guid>
		<description>&lt;p&gt;Goodness; is that so very true!&lt;/p&gt;

&lt;p&gt;I could give a rat's arse what kind of data the tool shows for me OR my clients, but that is not the point at all. It's one SEO's opinion about link data anyway, so I don't care about that. &lt;span style="color:black; background:black; font-size: 1pt;"&gt;&lt;strike&gt;If the firm had been totally honest and not deceitful at all about what they were doing and had been doing for quite a while now, not much would have been said, other than the fact the tool is really a rogue scraper. The fact his firm promotes themselves as do-gooders to the SEO industry and best practices, etc, makes what he has done and how they did it, all the more pathetic.&lt;/strike&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;[Again, off-topic rants blurred ...]&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Goodness; is that so very true!</p>
<p>I could give a rat&#8217;s arse what kind of data the tool shows for me OR my clients, but that is not the point at all. It&#8217;s one SEO&#8217;s opinion about link data anyway, so I don&#8217;t care about that. <span style="color:black; background:black; font-size: 1pt;"><strike>If the firm had been totally honest and not deceitful at all about what they were doing and had been doing for quite a while now, not much would have been said, other than the fact the tool is really a rogue scraper. The fact his firm promotes themselves as do-gooders to the SEO industry and best practices, etc, makes what he has done and how they did it, all the more pathetic.</strike></span></p>
<p>[Again, off-topic rants blurred &#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sebastian</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1463</link>
		<dc:creator>Sebastian</dc:creator>
		<pubDate>Tue, 21 Oct 2008 18:28:03 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1463</guid>
		<description>&lt;p&gt;The meta tag method is nice to have, but not practical, &lt;a href="http://sphinn.com/story/80187#c56318"&gt;IOW&lt;/a&gt; more or less propaganda. I'm not excited to add a shitload of daily code bloat to the HEAD section of all my pages just because someone releases a new tool. Actually, I won't do it except for indexing purposes addressing major search engines, and I'm in good company.&lt;/p&gt;
&lt;p&gt;Frankly, I don't care about the data they publish; those are available elsewhere for free. Technically speaking, the on-page 'noindex' directive offers a way to opt out partially, but that's IMHO just a legal thingy. BTW, it's the same legal butt covering that all major engines have practiced since Web search has existed, without much whining across the boards. So that's not the point.&lt;/p&gt;
&lt;p&gt;The point is that an SEO company should be way more sensible than a search engine when it comes to Webmaster concerns. Part of the canonical way to launch such SEO tools is a simple procedure to opt out without hassles, and in a timely manner. Probably very few folks would actually opt out when such an option comes with the first product release. Now every dog and its fleas, well, flee.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>The meta tag method is nice to have, but not practical, <a href="http://sphinn.com/story/80187#c56318">IOW</a> more or less propaganda. I&#8217;m not excited to add a shitload of daily code bloat to the HEAD section of all my pages just because someone releases a new tool. Actually, I won&#8217;t do it except for indexing purposes addressing major search engines, and I&#8217;m in good company.</p>
<p>Frankly, I don&#8217;t care about the data they publish; those are available elsewhere for free. Technically speaking, the on-page &#8216;noindex&#8217; directive offers a way to opt out partially, but that&#8217;s IMHO just a legal thingy. BTW, it&#8217;s the same legal butt covering that all major engines have practiced since Web search has existed, without much whining across the boards. So that&#8217;s not the point.</p>
<p>The point is that an SEO company should be way more sensible than a search engine when it comes to Webmaster concerns. Part of the canonical way to launch such SEO tools is a simple procedure to opt out without hassles, and in a timely manner. Probably very few folks would actually opt out when such an option comes with the first product release. Now every dog and its fleas, well, flee.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Doug Heil</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1462</link>
		<dc:creator>Doug Heil</dc:creator>
		<pubDate>Tue, 21 Oct 2008 17:30:37 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1462</guid>
		<description>Yep.

@Sebastian; you wrote:
"and whether or not the method(s) given to Webmasters are reasonable or not"

Any method that involves advertising SEOMOZ in every page's head tag is not reasonable at all. It's a "linkage" tool, so meta tag blocking is not possible. He knows that.

Besides all of that; did you see his latest statement about his tool and methods? He stated he was a rogue bot with not much anyone can do about it. I paraphrased.

But anyway; he chooses to run his firm in this fashion, so we can choose how we wish to continue to view his firm. The industry will decide as a whole.

Again Sebastian; thanks for putting the definitions out there so concisely and clearly.</description>
		<content:encoded><![CDATA[<p>Yep.</p>
<p>@Sebastian; you wrote:<br />
&#8220;and whether or not the method(s) given to Webmasters are reasonable or not&#8221;</p>
<p>Any method that involves advertising SEOMOZ in every page&#8217;s head tag is not reasonable at all. It&#8217;s a &#8220;linkage&#8221; tool, so meta tag blocking is not possible. He knows that.</p>
<p>Besides all of that; did you see his latest statement about his tool and methods? He stated he was a rogue bot with not much anyone can do about it. I paraphrased.</p>
<p>But anyway; he chooses to run his firm in this fashion, so we can choose how we wish to continue to view his firm. The industry will decide as a whole.</p>
<p>Again Sebastian; thanks for putting the definitions out there so concisely and clearly.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sebastian</title>
		<link>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1460</link>
		<dc:creator>Sebastian</dc:creator>
		<pubDate>Tue, 21 Oct 2008 17:02:17 +0000</pubDate>
		<guid>http://sebastians-pamphlets.com/crawling-vs-indexing/#comment-1460</guid>
		<description>Doug, 
 
thanks for the compliment. In the sense of my statement above I had to blur your rant. Everything you said can be found in endless variations at Sphinn and various blogs discussing the LinkScape launch. I don't think that it's helpful to spread repetitive rants. Rand has no chance to reply to every post on gazillions of SEO hangouts, and that doesn't fit my understanding of fairness in a professional discussion.

I want to keep the comments to this post on topic. The point is crawling vs. indexing, or rather whether or not it's possible to opt out of possible crawls that feed only the LinkScape index, and whether or not the method(s) given to Webmasters are reasonable or not. BTW, Rand has announced that his team will apply changes to the current system by the end of this year. From a developer's POV this timeline is reasonable, although I do think that this functionality should have been an essential part of the concept, delivered with the very first product release. For example, a refetch of outdated robots.txt files to check for (new) exclusionary statements before an index update should be doable in a system of that size.

Thanks for your understanding.
Sebastian</description>
		<content:encoded><![CDATA[<p>Doug, </p>
<p>thanks for the compliment. In the sense of my statement above I had to blur your rant. Everything you said can be found in endless variations at Sphinn and various blogs discussing the LinkScape launch. I don&#8217;t think that it&#8217;s helpful to spread repetitive rants. Rand has no chance to reply to every post on gazillions of SEO hangouts, and that doesn&#8217;t fit my understanding of fairness in a professional discussion.</p>
<p>I want to keep the comments to this post on topic. The point is crawling vs. indexing, or rather whether or not it&#8217;s possible to opt out of possible crawls that feed only the LinkScape index, and whether or not the method(s) given to Webmasters are reasonable or not. BTW, Rand has announced that his team will apply changes to the current system by the end of this year. From a developer&#8217;s POV this timeline is reasonable, although I do think that this functionality should have been an essential part of the concept, delivered with the very first product release. For example, a refetch of outdated robots.txt files to check for (new) exclusionary statements before an index update should be doable in a system of that size.</p>
<p>Thanks for your understanding.<br />
Sebastian</p>
]]></content:encoded>
	</item>
</channel>
</rss>

