Google Blog Search Banned Legit Webmaster Forum

I’ve been able to get all sorts of non-blog stuff onto the SERPs of Google’s blog search in the past. However, my attempt to get content hosted by Google into blog search is best described as a miserable failure. Although Google Blog Search BETA delivers results from all kinds of forums, it obviously can’t deal with threaded content from a source which recently got rid of its BETA stage.

First I tried to ping blog search, submitted feeds, and linked to threads from here and in a feed regularly fetched for blog search as well. No results. There are no robots.txt barriers or noindex tags, just badly malformed code, but Google’s bot can eat improperly closed alternate links pointing to an RSS feed … it drove me nuts. Must be a ban or at least a heavy troll-penalty, I thought, so I went to Yahoo, masked the feed URLs, and submitted again, but to no avail.

Try it yourself: submit a feed to Google Blog Search, then use a somewhat unique thread title and do a blog search. Got zilch too? Try a web search to double check that the content is crawlable. It is. Conclusion? Google banned its very own Google Groups.

Too sad, poor PageRank addicts running blog searches will miss out on tidbits like this quote from Google’s Adam Lasnik, asked why URLs blocked from crawlers show toolbar-PR:

As for the PR showing… it’s my understanding that the toolbar is using non-private info (PR data from other pages in that domain) to extrapolate/infer/guess a PR for that page :).


Code Monkey Very Simple Man/Woman

Rcjordan over at Threadwatch pointed me to a nice song perfectly explaining rumors like “Google’s verification tags get you into supplemental hell” and thoughtless SEO theories like “self-closing meta tags in HTML 4.x documents and uppercase element/attribute names in XHTML documents prevent search engine crawlers from indexing”. You don’t believe such crappy “advice” can make it to the headlines? Just wait for an appropriate thread at your preferred SEO forum to get picked up by a popular but technically challenged blogger. This wacky hogwash is the most popular lame excuse for MSSA issues (aka “Google is broke coz my site sitting at top10 positions since the stone age disappeared all of a sudden”) at Google’s very own Webmaster Central.

Here is a quote:

“The robot [search engine crawler] HAS to read syntactically … And I opt for this explanation exactly because it makes sense to me [the code monkey] that robots have to be diligent in crawling syntactically in order to do a good job of indexing … The old robots [Googlebot 2.x] did not actually parse syntactically - they sucked in all characters and sifted them into keywords - text but also tags and JS content if the syntax was broken, they didn’t discriminate. Most websites were originally indexed that way. The new robots [Mozilla compatible Googlebot] parse with syntax in mind. If it’s badly broken (and improper closing of a tag in the head section of a non-xhtml dtd is badly broken), they stop or skip over everything else until they find their bearings again. With a broken head that happens the </html> tag or thereabouts”.

Basically this means that the crawler ignores the remaining code in HEAD or even hops to the end of the document not reading the page’s contents.

In reality search engine crawlers are pretty robust and fault tolerant, designed to eat and digest the worst code one can provide. These Google’s Sandbox“).
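To make the fault tolerance point a bit more tangible, here is a minimal sketch using the lenient HTML parser from Python’s standard library. It is of course not Googlebot and says nothing about how Google’s parser is built; the page markup, the `verify-v1` value and the `LenientExtractor` class are made up for illustration. It only shows that an ordinary forgiving parser shrugs off an uppercase, self-closing META tag in an HTML 4 head and still reads the body.

```python
from html.parser import HTMLParser


class LenientExtractor(HTMLParser):
    """Collects meta tags and text, however sloppy the markup is."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased, whatever the source used.
        if tag == "meta":
            attributes = dict(attrs)
            self.meta[attributes.get("name")] = attributes.get("content")

    def handle_data(self, data):
        # Keep non-empty text chunks, including the body copy.
        if data.strip():
            self.text.append(data.strip())


# Hypothetical HTML 4 page: XHTML-style self-closing META tag in uppercase,
# an unclosed P element and no closing HTML tag.
PAGE = """<html><head>
<META NAME="verify-v1" CONTENT="abc123" />
<title>Test page</title>
</head><body>
<p>Body copy that should still get indexed.
</body>"""

parser = LenientExtractor()
parser.feed(PAGE)
parser.close()
print(parser.meta)
print(parser.text)
```

The parser neither stops at the “broken” head nor hops to the end of the document; both the verification tag and the body text come out the other side.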

Just hire code monkeys for code monkey tasks, and SEOs for everything else ;)


Finally dumping M$-Office…

… in favour of “Google Office”, err, Google Apps.


Hapless Structures and Weak Linkage

Michael Martinez over at SEO-Theory (moved!) has a nice write-up on how to get crawled and indexed. The post titled “Search engine love: now they crawl me, now they don’t” discusses the importance of internal linkage, PageRank distribution, and Google’s recent architectural changes, topics which are “hot” in Google’s Webmaster Help Center, where I hang out every now and then. I thought I’d blog Michael’s nice essay as a sort of multi-link bookmark to make link drops easier, so here is some of my stuff related to crawling and indexing:

About Google’s Toolbar-PageRank
High PageRank leads to frequent crawling, but nonetheless ignore green pixels.

The Top-5 Methods to Attract Search Engine Spiders
Get deep links to great content.

Supporting search engine crawling
The syntax of a search engine friendly Web site.

Web Site Structuring
Do’s and don’ts on information architectures.

Optimizing Web Site Navigation
Tweak your UI for users to make it crawler friendly.

Linking is All About Popularity and Authority
LOL: Link out loud.



Interested in buying a text link

Today I give up on answering emails like this one:


First of all I would like to introduce my company as one of the best web hosting service provider from [country] named [link]. We are in the hosting business since 2004 and have more than 3000 satisfied customers.

We are having PR -6 and an alexa ranking of 63,697

We are interested to purchase a link at your site, please provide us with a suitable quotation.

Waiting for your kind reply.

[Name, Company …]

Besides the fact that a page claiming a PageRank of minus six most probably is not that kind of neighborhood I’d tend to link out to, it’s a kinda stupid attempt.

Not only is the page where the contact link was clicked in no way related to web hosting services (it just triggers a few green pixels in the Google toolbar), each and every page on this topic also has a link leading to my take on paid links, which does not encourage link monkey business, so to say.

My usual reply to such emails was “Thanks for writing, you can buy a nofollow’ed link marked as advertising for as low as [tiny monthly fee] when you suggest a page on my site which is relevant to yours and I like what you provide to your visitors/users” plus an explanation of the link condom. No takers.

The message above is from a clown abusing my contact form today, so I guess it’s OK to quote it. It is symptomatic, however: there are lots of folks out there who still believe that fooling the engines is that simple. I admit it can be done, but I’m with Eric Ward who says it’s not worth it.


Priceless SEO Advice

Just stumbled upon: If you are too stupid to use a computer you might try giving SEO advice. Best business plan ever, for idiots ;)

My favourite:

Q: Why are SEO Consultants too expensive for webmasters?
A: I personally used a firm that did a 250,000 site submissions for my site, it worked great.


Google going to revamp the rel=nofollow microformat?

I’ve asked Adam Lasnik, Google’s search evangelist:

Adam, what is Google’s take on extending the nofollow functionality by working out a microformat that covers the existing mechanism w/o being that unclear and confusing, and which takes care of similar needs like section targeting on element level and qualified votes as well?

and he answered

Sebastian, nothing’s set in stone. Stuff is likely to evolve :)

That’s an elating signal, thank you Adam. And it leads to a bunch of questions.

Will Google continue to cook nofollow in its secret sauce, revealing morphed semantics (affiliate links), unpopular areas of application (paid links) and changed functionality (no longer fetching the linked resource) every now and then? From my interpretation of Google’s ongoing move to candidness I guess not.

Will Google gather a couple search companies to work out a new standard? I hope not, it would be a mistake not to involve content providers, webmasters, publishers, CMS vendors, even SEOs and opinion makers again.

Will Google ask for input? Will the process of defining a standard for micro crawler directives be an open and public discussion? Are we talking about an extended microformat, limited to the A element’s rel and rev attributes, or does Google think of a broader approach covering for example section targeting and other crawler directives in class attributes on block level too? Will a new or more powerful interfere other norms like , , , or drafts like the not yet that comprehensive microformat (also badly named because it covers inclusion too)? By the way, the links above lead you to interesting thoughts on reach, functionality and implementation of an extended norm replacing nofollow, and I, like many of you, have a couple more ideas and concepts in mind.

I take Adam’s tidbit as a call for participation. Dear no-to-nofollow-sayers and nofollow-supporters out there, join the crowd at the white board! Throw in your thoughts, concepts, wishes and ideas.

In the meantime make use of this catalogue of do-follow plugins.


Say No to NoFollow Follow-up

[Image: Say NO to NOFOLLOW - copyright jlh-design.com]

I don’t want to make this the nofollow-blog, but since more and more good folks don’t love the nofollow-beast any more, here is a follow-up on the recent nofollow discussion. Follow the no-to-nofollow trend here:

Loren Baker posts 13 very good reasons why rel=nofollow sucks. He got dugg, then buried, but received tons of responses in the comments, where most people state that rel=nofollow was a failure with regard to the current amount of comment spam, because the spammers spam for traffic, not link love. Well, that’s true, but rel=nofollow at least nullifies the impact spamming of unmoderated blogs had on search results, says Google. Good point, but is it fair to penalize honest comment authors by nofollow’ing their relevant links by default? Not really. The search engines should work harder on solving this problem algorithmically, and CMS vendors should go back to the white board to develop a reasonable solution. Matt Mullenweg from WordPress admits that “in hindsight, I don’t think nofollow had much of an effect [in fighting comment spam]”, and I hope this insight triggers a well thought out workflow replacing the unethical nofollow-by-default (see follow you, follow me).

At Google’s Webmaster Help Center regular posters nag Googlers with questions like “Is rel=nofollow becoming the norm?” Google’s search evangelist Adam Lasnik stepped in and stated “As you might have noticed, many of the world’s most successful sites link liberally to other sites, and this sort of thing is often appreciated by and rewarded by visitors. And if you’re editorially linking to sites you can personally vouch for, I can’t see a reason to no-follow those.” and “On the whole [nofollow thingie], while Matt’s been pretty forthcoming and descriptive, I do think we Googlers on the whole can do a better job in explaining and justifying nofollow”. Thanks Adam. While explaining Google’s take on rel=nofollow to the great unwashed, why not start a major clean-up to extend this microformat and to make it useful, usable and less confusing for the masses?

While waiting for actions promised by the nofollow inventor, here is a good summary of nofollow clarifications by Googlers. I’ve a ton of respect for Matt, I know he listens and picks reasonable arguments even from negative posts, so stay tuned (I do hope my tiny revamp-nofollow campaign is not seen as negative press by the way).

A very good starting point to examine the destructive impact rel=nofollow had, has, and will have if not revamped, is Carsten Cumbrowski’s essay explaining why rel=nofollow leverages mistrust among people. I do not provide quotes because I want you all to read and reread this great article.

Robert Scoble, rethinking his nofollow support, says “I was wrong about ‘NoFollow’ … I’m very concerned, for instance, about Wikipedia’s use of nofollow”. Scroll down, don’t miss out on the comments.

Michael Gray’s strong statement “Google’s policy on No follow and reviews is hypocritical and wrong” is worth a read; he backs his point of view by providing a complete nofollow-history along with many quotes and nofollow-tidbits.


The Nofollow-Universe of Black Holes

I pretty much dislike the rel=nofollow fiasco for various reasons, especially its ongoing semantic morphing and often unethical implementation. Recently I wrote about nofollow-confusion and beginning nofollow-insane. Meanwhile the nofollow-debacle took a major step forward: bloggers fight huge black holes (the completely link-condomized Wikipedia) with many tiny black holes (plug-ins castrating links leading to Wikipedia).

Folks, do you realize that actually you’ve joined the nofollow-nightmare you’re ranting about? Instead of trying to change things with constructive criticism addressing nofollow-supporters, you take the Old Testament approach, escalating an IMHO still remediable aberration. This senseless attitude supports the hapless nofollow-mechanism by the way. You’re acting like defiant kids crying “nofollow is sooooo unfair” while you strike back with tactical weapons unsuitable to solve the nofollow-problem. Devaluing Wikipedia links because Wikipedia is de facto an untrusted source of information OTOH makes sound sense, although semantically rel=nofollow is not the right way to go in this case.

I understand that losing the (imputed!) link juice of a couple Wikipedia links is not nice. However, I don’t buy that these links were boosting SE rankings in the first place (although a few sites having only Wikipedia inbound links currently drop out of the SERPs). Their real value is extremely well targeted traffic, and these links are still clickable.

I agree that Wikipedia’s decision to link-condomize all outbound links is a thoughtless, lazy, and pretty insufficient attempt to fight vandalizing link droppers. It is even “unfair”, because the black hole Wikipedia now sucks the whole Web’s link juice while giving nothing (except nicely targeted traffic) in return. But I must admit that there were not that many options, since there are no search engine crawler directives on link level providing the granularity Wikipedia probably needs.

Let’s imagine the hapless nofollow value of the REL attribute did not exist. In this scenario Wikipedia could implement 4-eyes link tagging as follows:
1. New outgoing links would get tagged rel=”unapproved”. Search engines would not count a vote for the link destination, but follow the link.
2. Later on, when a couple trusted users and/or admins have approved the link, “unapproved” would get removed forever (URL and REL values stored in combination with the article’s URL to automatically reinstate the link’s stage on edits where a link gets removed, added, removed and added again…). So far that would even work with the misguiding “nofollow” value, but an extended microformat would allow meaningful followup-tags like “example”, “source”, “inventor”, “norm”, “worstenemy”, “hownotto” or whatever.
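The two steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything Wikipedia or a search engine actually runs: the `LinkRegistry` class, the threshold of two reviewers and the example URLs are assumptions made up for this sketch, and only the `unapproved` value comes from the scenario described above.

```python
from dataclasses import dataclass, field
from html import escape

APPROVALS_NEEDED = 2  # "4-eyes": two trusted reviewers (assumed threshold)


@dataclass
class LinkRegistry:
    # (article URL, link URL) -> set of reviewers who approved the link
    approvals: dict = field(default_factory=dict)

    def approve(self, article, url, reviewer):
        self.approvals.setdefault((article, url), set()).add(reviewer)

    def is_approved(self, article, url):
        reviewers = self.approvals.get((article, url), set())
        return len(reviewers) >= APPROVALS_NEEDED

    def render(self, article, url, text):
        # The state is keyed by the URL pair, so it survives a link being
        # removed from the article and added again later.
        href = escape(url, quote=True)
        if self.is_approved(article, url):
            return f'<a href="{href}">{escape(text)}</a>'
        return f'<a href="{href}" rel="unapproved">{escape(text)}</a>'


registry = LinkRegistry()
ARTICLE = "https://wiki.example/Article"  # hypothetical article URL
LINK = "https://example.com/"             # hypothetical outgoing link

before = registry.render(ARTICLE, LINK, "Example")
registry.approve(ARTICLE, LINK, "alice")
registry.approve(ARTICLE, LINK, "bob")
after = registry.render(ARTICLE, LINK, "Example")
print(before)
print(after)
```

Storing the reviewers as a set means the same user approving twice still counts as one pair of eyes.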

Instead of ranting and vandalizing links we should begin to establish an RFC on crawler directives on HTML element level. That would be a really productive approach.


Dear search engines, please bury the rel=nofollow-fiasco

The misuse of the rel=nofollow initiative is getting out of control. Invented to fight comment spam, nowadays it is applied to commercial links, biased editorial links, navigational links, links to worst enemies (funny example: Matt Cutts links to an SEO-Blackhat with rel=nofollow) and whatever else. Gazillions of publishers and site owners add it to their links for the wrong reasons, simply because they don’t understand its intention, its mechanism, and especially not the ongoing morphing of its semantics. Even professional webmasters and search engine experts have a hard time following the nofollow-beast semantically. The more its initial usage gets diluted, the more folks suspect search engines cook their secret sauce with indigestible nofollow-ingredients.

Not only was rel=nofollow unable to stop blog-spam-bots, it also came with a built-in flaw: confusion.

The good news is that the nofollow-debate is currently getting stoked again. Threadwatch hosts a thread titled Nofollow’s Historical Changes and Associated Hypocrisy, folks are ranting on the questionable Wikipedia decision to nofollow all outbound links, Google video folks manipulated the PageRank algo by plastering most of their links with rel=nofollow by mistake, and even Yahoo’s top gun Jeremy Zawodny has not been that happy with the nofollow-debacle for a while now.

[Image: Say NO to NOFOLLOW - copyright jlh-design.com]

I say that it is possible to replace the unsuccessful nofollow-mechanism with an understandable and reasonable functionality to allow search engine crawler directives on link level. It can be done although there are shitloads of rel=nofollow links out there. Here is why, and how:

The value “nofollow” in the link’s REL attribute creates misunderstandings, recently even in the inventor’s company, because it is, hmmm, hapless.

In fact, back then it meant “passnoreputation” and nothing more. That is, search engines shall follow those links, and they shall index the destination page, and they shall show those links in reversed citation results. They just must not pass any reputation or topical relevancy with that link.
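For the code monkeys among us, the “passnoreputation” reading can be written down as a toy sketch. Everything here is assumed for illustration: the `process_links` function, the example URLs, and above all the plain inbound-vote counter, which merely stands in for whatever a real ranking algo does with links.

```python
def process_links(pages):
    """pages maps a source URL to a list of (target URL, rel value or None)."""
    to_fetch = set()   # link destinations the crawler would request
    backlinks = {}     # target -> sources, for reversed citation results
    votes = {}         # target -> number of links that pass reputation
    for source, links in pages.items():
        for target, rel in links:
            # Followed and reported as a backlink regardless of rel ...
            to_fetch.add(target)
            backlinks.setdefault(target, []).append(source)
            # ... but only links without rel=nofollow count as a vote.
            if rel != "nofollow":
                votes[target] = votes.get(target, 0) + 1
    return to_fetch, backlinks, votes


# Hypothetical link graph
pages = {
    "blog.example/post": [("a.example/", None), ("b.example/", "nofollow")],
    "wiki.example/x": [("b.example/", "nofollow"), ("a.example/", None)],
}
to_fetch, backlinks, votes = process_links(pages)
print(sorted(to_fetch))
print(backlinks["b.example/"])
print(votes)
```

In this reading b.example/ still gets crawled and still shows its two backlinks; it just collects no votes.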

There were microformats better suited to achieve the goal, for example Technorati’s votelinks, but unfortunately the united search geeks have chosen a value adapted from the robots exclusion standard, which is plainly misleading because it has absolutely nothing to do with its (intended) core functionality.

I can think of cases where a real nofollow-directive for spiders on link level makes perfect sense. It could tell the spider not to fetch a particular link destination, even if the page’s robots tag says “follow”, for example printer friendly pages. I’d use an “ignore this link” directive for example in crawlable horizontal popup menus to avoid theme dilution when every page of a section (or site) links to every other page. Actually, there is more need for spider directives on HTML element level, not only in links, for example to tag templated and/or navigational page areas like with Google’s section targeting.

There is nothing wrong with a mechanism to neutralize links in user input. It’s just that the value “nofollow” in the type-of-forward-relationship attribute is not suitable to label unchecked or not (yet) trusted links. If it is really necessary to adopt a well known value from the robots exclusion standard (and don’t misunderstand me, reusing familiar terms in the right context is a good idea in general), the “noindex” value would have been a better choice (although not perfect). “Noindex” describes way better what happens in a SE ranking algo: it doesn’t index (in its technical meaning) a vote for the target. Period.

It is not too late to replace the rel=nofollow-fiasco with a better solution which could take care of some similar use cases too. Folks at Technorati, the W3C and wherever have done the initial work already, so it’s just a tiny task left: extending an existing norm to enable a reasonable granularity of crawler directives on link level, or better for HTML elements in general. Rel=nofollow would get deprecated, replaced by suitable and standardized values, and for a couple years the engines could interpret rel=nofollow in its primordial meaning.

Ever since the rel=nofollow thingy came into existence, it has confused gazillions of non-geeky site owners, publishers and editors on the net. Last year I got a new client who added rel=nofollow to all his internal links because he saw nofollowed links on a popular and well ranked site in his industry and thought rel=nofollow could perhaps improve his own rankings. That’s just one example of many where I’ve seen intentional as well as mistaken misuse of the way too geeky nofollow-value. As Jill Whalen points out to Matt Cutts, that’s just the beginning of net-wide nofollow-insane.

Ok, we’ve learned that the “nofollow” value is a notional monster, so can we please have it removed from the search engine algos in favour of a well thought out solution, preferably asap? Thanks.
