Googlebots go Fishing with Sitemaps

I’ve used Google Sitemaps since it was launched in June. Six weeks later I say ‘Kudos to Google’, because it works even better than expected. Making use of Google Sitemaps is definitely a must, at least for established Web sites (it doesn’t help much with new sites).

From my logging I found some patterns, here is how the Googlebot sisters go fishing:
· Googlebot-Mozilla downloads the sitemaps 6 times per day, every 8 hours 2 fetches like a clockwork (or every 12 hours lately, now up to 4 fetches within a few minutes from the same IP address). Since this behavior is not documented, I recommend the implementation of automated resubmit-pings however.
· Googlebot fetches new and updated pages harvested from the sitemap, at the latest 2 days after inclusion in the XML file, respectively after providing a current last modified value. Time to index is constantly maximal 2 days. There is just one fetch per page (as long as the sitemap doesn’t submit another update), resulting in a complete indexing (Title, snippets, and cached page). Sometimes she ‘forgets’ a sitemap-submitted URL, but fetches it later following links (this happens with very similar new URLs, especially when they differ only in a query string value). She crawls and indexes even (new) orphans (pages not linked from anywhere).
· Googlebot-Mozilla acts as a weasel in Googlebot’s backwash and is suspected to reveal her secrets to AdSense.

Tags: ()



Share/bookmark this: del.icio.usGooglema.gnoliaMixxNetscaperedditSphinnSquidooStumbleUponYahoo MyWeb
Subscribe to      Entries Entries      Comments Comments      All Comments All Comments
 

Be the first to add a Comment to "Googlebots go Fishing with Sitemaps"

Leave a reply


[If you don't do the math, or the answer is wrong, you'd better have saved your comment before hitting submit. Here is why.]

Be nice and feel free to link out when a link adds value to your comment. More in my comment policy.