Blogger to rule search engine visibility?
Via Google’s Webmaster Forum I found this curiosity:
http://www.stockweb.blogspot.com/robots.txt
User-agent: *
Disallow: /search
Disallow: /
A standard robots.txt at *.blogspot.com looks different:
User-agent: *
Disallow: /search
Sitemap: http://*.blogspot.com/feeds/posts/default?orderby=updated
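To make the difference between the two files concrete, here is a minimal sketch that feeds both rule sets to the robots.txt parser in Python's standard library and asks whether a crawler may fetch a post. The helper name `can_crawl`, the "Googlebot" user agent string and the `/search?q=stocks` URL are assumptions chosen for illustration. The Sitemap line is left out because it plays no part in allow/deny decisions.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt found at stockweb.blogspot.com (as quoted above).
BLOCKED = """\
User-agent: *
Disallow: /search
Disallow: /
"""

# The standard *.blogspot.com rules (Sitemap line omitted, see lead-in).
STANDARD = """\
User-agent: *
Disallow: /search
"""

def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given rules.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of downloading them.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

POST = "http://stockweb.blogspot.com/2007/07/prague-energy-exchange-starts-trading.html"
SEARCH = "http://stockweb.blogspot.com/search?q=stocks"  # assumed example URL

if __name__ == "__main__":
    for name, rules in (("blocked", BLOCKED), ("standard", STANDARD)):
        print(name, "post:", can_crawl(rules, POST),
              "search:", can_crawl(rules, SEARCH))
```

Because the rules sit under `User-agent: *`, "Disallow: /" shuts out every compliant crawler, not just Googlebot, while the standard file only keeps crawlers away from the blog's search result pages.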
According to the blogger, the blog is not private, which would have explained the crawler blocking:
It is a public blog. In the past it had a standard robots.txt, but 10 days ago it changed to “Disallow: /”
Copyscape thinks that the blog in question shares a fair amount of content with other Web pages. So does blog search: http://stockweb.blogspot.com/2007/07/ukraine-stock-index-pfts-gained-97-ytd.html
has a duplicate, posted by the same author, at http://business-house.net/nokia-nok-gains-from-n-series-smart-phones/, http://stockweb.blogspot.com/2007/07/prague-energy-exchange-starts-trading.html
is reprinted at http://business-house.net/prague-energy-exchange-starts-trading-tomorrow/
and so on. Further investigation would probably reveal more duplicated content.
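For readers who want to check such overlaps themselves, one common way to estimate how much two texts have in common is to compare their word shingles with the Jaccard index. The sketch below is only that, a sketch: it is not how Copyscape or blog search work (their methods are not public), and the function names and the two sample strings are made up for illustration.

```python
import re

def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams (shingles) in `text`."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of two texts: shared shingles / all shingles."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0  # nothing to compare
    return len(sa & sb) / len(sa | sb)

if __name__ == "__main__":
    # Hypothetical sample strings, not the actual post contents.
    original = "The Prague Energy Exchange starts trading tomorrow"
    reprint = original + " Originally posted at the blog"
    print(f"similarity: {jaccard(original, reprint):.2f}")
```

A score near 1 means the two texts are almost the same, a score near 0 means they share next to nothing. Real posts would need their HTML stripped before comparing.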
It’s understandable that Blogger is not interested in wasting Google’s resources by letting Ms. Googlebot crawl the same content from different sources. But why do they block other search engines too? And why do they block the source (the posts reprinted at business-house.net state “Originally posted at [blogspot URL]”)?
Is this really censorship, or just a software glitch, or is it all the blogger’s fault?
Update 07/26/2007: The robots.txt reverted to its standard contents for unknown reasons. However, with a link neighborhood as shabby as the one in the blog’s footer, I doubt the crawlers will enjoy their visits. At the very least, the indexers will find this sort of spider fodder nauseating.
11 comments Sebastian | Duplicate Content, Blogger, robots.txt, Google
