Beware of the narrow-minded coders

or Ignorance is no excuse

Long-winded story about SEO-ignorant pommy coders putting their customers at risk. Hop away if e-commerce software vs. SEO dramas don’t thrill you.

Recently I answered a “Why did Google deindex my pages?” question in Google’s Webmaster Forum. It turned out that the underlying shopping cart software (EROL) maintains somewhat static pages as spider fodder, which redirect human visitors to another URL that serves the same content client-side. A silly thing to do, but pretty common for shopping carts. I used the case as an example of a nice shopping cart coming with destructive SEO in a post on flawed shopping carts in general.

Day by day, other site owners operating Erol-driven online shops popped up in the Google Groups or emailed me directly, so I realized there was a darn widespread problem: a very popular UK-based shopping cart software responsible for Google cloaking penalties. From my contacts I knew that Erol’s software engineers and self-appointed SEO experts believe in weird SEO theories and don’t consider that their software architecture itself could be the cause of the mess. So I wrote a follow-up addressing Erol directly. Google penalizes Erol-driven e-commerce sites explains Google’s take on cloaking and sneaky JavaScript redirects to Erol and its customers.

My initial post got linked and discussed in Erol’s support forum and kept my blog stats counter busy over the weekend. Accused of posting crap, I showed up and posted a short summary over there:

Howdy, I’m the author of the blog post you’re discussing here: Why eCommerce systems suck

As for crap or not crap, judge for yourself. That blog post addressed e-commerce systems in general; Erol was mentioned as an example of a nice shopping cart coming with destructive SEO. To avoid more misunderstandings, and to stress the issues Google has with Erol’s JavaScript redirects, I’ve posted a follow-up: Google deindexing Erol-driven ecommerce sites.

This post contains related quotes from Matt Cutts, head of Google’s web spam team, and from Google’s quality guidelines. I guess this piece should bring my point home:

If you’re keen on search engine traffic, then do not deliver one page to the crawlers and another page to users. Redirecting to another URL which serves the same content client-side gives Google an idea of your intent, but honest intent is not permission to cloak. Google says JS redirects are against the guidelines, so don’t cloak. It’s that simple.

If you have questions, post a comment on my blog or drop me a line. Thanks for listening.

Sebastian

Next, the links to this blog were edited out, and Erol posted a longish but pointless charade. Click the link to read it in full; in summary, it tells the worried Erol victims that Google has no clue at all, that frames and JS redirects are great for online shops, and that waiting for the next software release providing meaningful URLs will fix everything. Ok, that’s polemic, so here are at least a few quotes:

[…] A number of people have been asking for a little reassurance on the fact that EROL’s x.html pages are getting listed by Google. Below is a list of keyword phrases, with the number of competing pages and the x.html page that gets listed [4 examples provided].
[…]
EROL does use frames to display the store in the browser, however all the individual pages generated and uploaded by EROL are static HTML pages (x.html pages) that can be optimised for search engines. These pages are spidered and indexed by the search engines. Each of these x.html pages have a redirect that loads the page into the store frameset automatically when the page is requested.
[…]
EROL is a JavaScript shopping cart, however all the links within the store (links to other EROL pages) that are added using EROL Link Items are written to the static HTML pages as standard <a href=""> links - not a JavaScript link. This helps the search engines spider other pages in your store.

The ’sneaky re-directs’ being discussed most likely relate to an older SEO technique used by some companies to auto-forward from an SEO-optimised page/URL to the actual URL the site-owner wants you to see.

EROL doesn’t do this - EROL’s page load actually works more like an include than the redirect mentioned above. In its raw form, the ‘x123.html’ page carries visible content, readable by the search engines. In its rendered form, the page loads the same content but the JavaScript rewrites the rendered page to include page and product layout attributes and to load the frameset. You are never redirected to another html page or URL. [Not true: the JS function displayPage() changes the location of all pages indexed by Google, and property names like ‘hidepage’ speak for themselves. Example: x999.html redirects to erol.html#999x0&&]
[…]
We have, for the past 6 months, been working with search engine optimisation experts to help update the code that EROL writes to the web page, making it even more search engine friendly.

As part of the recommendations suggested by the SEO experts, page names will become more search engine friendly, moving away from page names such as ‘x123.html’ to ‘my-product-page-123.html’. […]

Still in friendly and helpful mood I wrote a reply:

With all due respect, if I understand your post correctly, that’s not going to solve the problem.

As long as a crawlable URL like http://www.example.com/x123.html or http://www.example.com/product-name-123.html resolves to http://www.example.com/erol.html#123x0&& or whatever, that’s a violation of Google’s quality guidelines. Whether or not you call that redirect sneaky (Google’s language) is not the point. It’s Google’s search engine, so their rules apply. These rules state clearly that pages which do a JS redirect to another URL (on the same server or not, delivering the same contents or not) do not get indexed, or, if discovered later on, get deindexed.

The fact that many x-pages are still indexed, and may even rank for their targeted keywords, means nothing. Google cannot discover and delist all pages utilizing a particular disliked technique overnight, and never could. Sometimes that’s a process lasting months or even years.

The problem is that these redirects put your customers at risk. Again, Google didn’t change its Webmaster guidelines, which have forbidden JS redirects since the stone age; it has recently improved its ability to discover violations in the search index. Google frequently improves its algos, so please don’t expect to get away with it. Quite the opposite: expect each and every page with these redirects to vanish over the years.

A good approach to avoid Google’s cloaking penalties is utilizing one single URL both as spider fodder and for content presentation in browsers. When a Googler loads such a page with a browser and compares the URL to the spidered one, you get away with nearly everything CSS and JS can accomplish, as long as the URLs are identical. If, OTOH, the JS code changes the location, you’re toast.
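This rule of thumb can be sketched in a few lines of JavaScript. The function name and URLs below are illustrative only (this is not Google’s code; the erol.html#123x0&& pattern is just the example from above):

```javascript
// Sketch of the rule of thumb: compare the URL the crawler fetched
// with the URL the browser ends up on after the page's scripts run.
// If the script changed the location to a DIFFERENT URL, that's the
// pattern the guidelines call a sneaky JavaScript redirect.

// Drop the fragment; client-side enhancement on the SAME URL is fine.
const stripFragment = (url) => url.split('#')[0];

function looksLikeSneakyRedirect(crawledUrl, renderedUrl) {
  return stripFragment(crawledUrl) !== stripFragment(renderedUrl);
}

// Hypothetical Erol-style case: crawlable page redirects elsewhere.
console.log(looksLikeSneakyRedirect(
  'http://www.example.com/x123.html',
  'http://www.example.com/erol.html#123x0&&'
)); // true: location changed, violates the guidelines

// Same URL, only client-side rendering on top: no redirect at all.
console.log(looksLikeSneakyRedirect(
  'http://www.example.com/x123.html',
  'http://www.example.com/x123.html#section'
)); // false: identical URLs, nothing sneaky
```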

Posting this response failed, because Erol’s forum admin banned me after censoring my previous post. By the way, according to posts outside their sphere, and from what I’ve seen watching the discussion, they censor posts from customers too. Well, that’s fine with me, since it’s Erol’s forum and they make the rules. However, still eager to help, I emailed my reply to Erol, and to the Erol customers asking for my take on Erol’s final statement.

You ask why I post this long-winded stuff? Well, it seems to me that Erol prefers a few fast bucks over satisfied customers, thus I fear they will not tell their customers the truth. Actually, they simply don’t get it. However, I don’t care whether their prevarication stems from greed or ignorance; all the store operators suffering from Google’s penalties deserve the information. A few of them have subscribed to my feed, so I hope my message gets spread. Continuation


Good Bye Nofollow: How to DOfollow comments with blogger

Andy Beard pointed me to a neat procedure to DOFOLLOW links in blog comments with blogger.com: Remove Nofollow Attribute on Blogger.com Blog Comments:

Edit the template’s HTML and remove rel='nofollow' in this line:
<a expr:href='data:comment.authorUrl' rel='nofollow'><data:comment.author/></a>
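After the edit, the same template line would read like this (a sketch of the result only; the rest of the Blogger template stays untouched):

```html
<a expr:href='data:comment.authorUrl'><data:comment.author/></a>
```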

Now I have a good reason to upgrade the software. Sadly, I’ve hacked the template so badly that I doubt it will work with the new version :(




Google deindexing Erol-driven ecommerce sites

Follow-up post - see why e-commerce software sucks.

Erol is a shopping cart software invented by DreamTeam, a UK-based Web design firm. One of its core features is the on-the-fly conversion of crawlable HTML pages to fancy JS-driven pages. Looks great in a JavaScript-enabled browser, and ugly without client-side formatting.

Erol, which itself offers not-that-cheap SEO services, claims that it is perfectly OK to show Googlebot a content page without gimmicks, whilst human users get redirected to another URL.

Erol victims suffer from deindexing of all Erol-driven pages; Google keeps only those pages in the index which do not contain Erol’s JS code. Considering how many online shops in the UK make use of Erol software, this massive traffic drop may have a visible impact on the gross national product ;) … Ok, sorry, kidding with so many businesses at risk does not amuse the Queen.

Dear “SEO experts” at Erol, could you please read Google’s quality guidelines:

· Don’t […] present different content to search engines than you display to users, which is commonly referred to as “cloaking.”
· Don’t employ cloaking or sneaky redirects.
· If a site doesn’t meet our quality guidelines, it may be blocked from the index.

Google did your customers a favour by not banning their whole sites, probably because the standard Erol content presentation technique is (SEO-wise) just amateurish, not driven by deceitful intent. So please stop whining:

We are currently still investigating the recent changes Google have made which have caused some drop-off in results for some EROL stores. It is as a result of the changes by Google, rather than a change we have made in the EROL code that some sites have dropped. We are investigating all possible reasons for the changes affecting some EROL stores and we will, of course, feedback any definitive answers and solutions as soon as possible.

and listen to your customers, who state:

Hey Erol Support
Maybe you should investigate doorway pages with sneaky redirects? I’ve heard that they might cause “issues” such as full bans.

Tell your victims customers the truth; they deserve it.

Telling your customers that Googlebot crawling their redirecting pages will soon result in those pages being reindexed is plain false, by the way. Just because the crawler fetches a questionable page doesn’t mean the indexing process reinstates its accessibility for the query engine. Googlebot is just checking whether the sneaky JavaScript code has been removed or not.

Go back to the whiteboard. See a professional SEO. Apply common sense. Develop a clean user interface pleasing human users and search engine robots alike. Without frames, sneaky or just superfluous JavaScript redirects, and amateurish BS like that. In the meantime, provide help and workarounds (for example a tutorial like “How to build an Erol shopping site without the page loading messages which will result in search engine penalties”), otherwise you won’t need the revamp, because your customer base will shrink to zilch.

Update: It seems that there’s a patch available. In Erol’s support forum, member Craig Bradshaw posts: “Erols new patch and instructions clearly tell customers not to use the page loading messages as these are no longer used by the software.”


Related links:
Matt Cutts August 19, 2005: “If you make lots of pages, don’t put JavaScript redirects on all of them … of course we’re working on better algorithmic solutions as well. In fact, I’ll issue a small weather report: I would not recommend using sneaky JavaScript redirects. Your domains might get rained on in the near future.”
Matt Cutts December 11, 2005: “A sneaky redirect is typically used to show one page to a search engine, but as soon as a user lands on the page, they get a JavaScript or other technique which redirects them to a completely different page.”
Matt Cutts September 18, 2005: “If […] you employ […] things outside Google’s guidelines, and your site has taken a precipitous drop recently, you may have a spam penalty. A reinclusion request asks Google to remove any potential spam penalty. … Are there […] pages that do a JavaScript or some other redirect to a different page? … Whatever you find that you think may have been against Google’s guidelines, correct or remove those pages. … I’d recommend giving a short explanation of what happened from your perspective: what actions may have led to any penalties and any corrective action that you’ve taken to prevent any spam in the future.”
Matt Cutts July 31, 2006: “I’m talking about JavaScript redirects used in a way to show users and search engines different content. You could also cloak and then use (meta refresh, 301/302) to be sneaky.”
Matt Cutts December 27, 2006 and December 28, 2006: “We have written about sneaky redirects in our webmaster guidelines for years. The specific part is ‘Don’t employ cloaking or sneaky redirects.’ We make our webmaster guidelines available in over 10 different languages … Ultimately, you are responsible for your own site. If a piece of shopping cart code put loads of white text on a white background, you are still responsible for your site. In fact, we’ve taken action on cases like that in the past. … If for example I did a search […] and saw a bunch of pages […], and when I clicked on one, I immediately got whisked away to a completely different url, that would be setting off alarm bells ringing in my head. … And personally, I’d be talking to the webshop that set that up (to see why on earth someone would put up pages like that) more than talking to the search engine.”

Matt Cutts heads Google’s Web spam team and has discussed these issues since the stone age, in many places. Look at the dates above: penalties for cloaking / JS redirects are not a new thing. The answer to “It is as a result of the changes by Google, rather than a change we have made in the EROL code that some sites have dropped.” (Erol statement) is: just because you’ve got away with it for so long does not mean that JS redirects are fine with Google. The cause of the mess is not a recent change of code; it’s the architecture itself which Google considers a “cloaking / sneaky redirect”. Google has recently improved its automated detection of client-side redirects, not its guidelines. Considering that both Erol-created pages (the crawlable static page and the contents served by the URL invoked by the JS redirect) present similar contents, Google will have sympathy for all reinclusion requests, provided that the sites in question were made squeaky-clean beforehand.




Google’s Anchor Text Reports Encourage Spamming and Scraping

Despite the title-bait, Google’s new Webmaster toy is an interesting tool, thanks! I’ll look at it from a Webmaster’s perspective, all SEO hats in the wardrobe.

Danny tells us more, for example the data source:

A few more details about the anchor text data. First, it comes only from external links to your site. Anchor text you use on your own site isn’t counted.
Second, links to any subdomains you have are NOT included in the data.

At a quick glance I doubted that Google filters internal links properly. So I checked a few sites and found internal anchor text leading the new anchor text stats. Well, it’s not unusual that folks copy and paste links including anchor text. Using page titles as anchor text is also business as usual. But why the heck do page titles and shortcuts from vertical menu bars appear as the most prominent external anchor text? With large sites having tons of inbounds, that seems not so easy to investigate.

Thus I looked at a tiny site of mine: this blog. It carries a few recent bollocks posts which were so useless that nobody bothered linking, I thought. Well, partly that’s the case: there were no links by humans, but enough fully automated links to get Ms. Googlebot’s attention. Thanks to scrapers, RSS aggregators, forums linking the author’s recent blog posts and so on, everything on this planet gets linked, so duplicated post titles make it into the anchor text stats.

Back to larger sites, I found that scraper sites and indexed search results were responsible for the extremely misleading ordering of Google’s anchor text excerpts. Both page types should not get indexed in the first place, and it’s ridiculous that crappy data, or rather irrelevantly accumulated data, dilutes a well-meant effort to report inbound linkage to Webmasters.

Unweighted ordering of inbound anchor text by commonness, and limiting the number of listed phrases to 100, makes this report utterly useless for many sites; worse, it sends a strong but wrong signal. Importance, in the eye of the beholder, gets expressed by the top of an ordered list or result set.

Joe Webmaster, tracking down the sources of his top-10 inbound links, finds a shitload of low-life pages, thinks “hey, linkage is after all a game of large numbers when Google says those links are most important”, launches a gazillion doorways and joins every link farm out there. Bummer. Next week Joe Webmaster pops up in the Google Forum telling the world that his innocent site got tanked because he followed Google’s suggestions, and we’re bothered with just another huge but useless thread discussing whether scraper links can damage rankings or not, finally inventing the “buffy blonde girl pointy stick penalty” on page 128.

Roundtrip to the hat rack, SEO hat attached. I perfectly understand that Google is not keen on disclosing trust/quality/reputation scores. I can read these stats because I understand their anatomy (intention, data, methods, context), and perhaps I can even get something useful out of them. Explaining this to an impatient site owner who unwillingly bought the link-quality-counts argument but still believes that nofollow’ed links carry some mysterious weight because they appear in reversed citation results is a completely different story.

Dear Google, if you really can’t hand out information on link weighting by ordering inbound anchor text by trust or other signs of importance, then can you please at least filter out all the useless crap? This shouldn’t be that hard to accomplish, since even simple site-search queries for “scraping” reveal tons of definitely not index-worthy pages which already do not pass any search engine love to their link destinations. Thanks in advance!

When I write “Dear Google”, can Vanessa hear me? Please :)


Update: Check your anchor text stats page every now and then, don’t miss out on updates :)




Spamming wannabe SEOs: SEO Affiliates, Cleveland, Ohio

Got this email from a site owner today:

I can put your site at the top of a search engines listings. This is no joke and I can show proven results from all our past clients.
If this is something you might be interested in, send me a reply with the web addresses you want to promote and the best way to contact you with some options.

Thanks in advance,

Sarah Lohman
SEO Affiliates
124 Middle Ave.
Cleveland, OH 44035

I told him long ago that he should dump that site if its affiliate earnings from MSN and Yahoo traffic drop too much, and now some spamming assclowns from Cleveland, Ohio, guarantee a #1 listing for crap. Sheesh. Here’s my polite reply:

I’m sure you’re joking, coz being that experienced an SEO you should have noticed that this site has suffered from Google’s death penalty, for various good reasons, since 2002.
Go figure, spammer

Sigh. Off to submit a ton of my email addresses to the spammer’s idiot collector.


Why eCommerce Systems Suck at SEO

Listening to whiners and disappointed site owners across the boards, I guess in a few weeks we’ll discuss Google’s brand-new e-commerce penalties in instances of -30, -900 and -supphell. NOT! A recent algo tweak may have figured out how to identify more crap, but I doubt Google has launched an anti-eCommerce campaign.

One doesn’t need an award-winning mid-range e-commerce shopping cart like Erol to earn the Google death penalty. Thanks to this award-winning software, sold as “search engine friendly” on the home page, or rather thanks to its crappy architecture (sneaky JS redirects as per Google’s Webmaster guidelines), many innocent shopping sites from Erol’s client list have vanished, or will be deindexed soon. Unbelievable when you read more about their so-called SEO Services. Oh well, so much for an actual example. The following comments do not address Erol shopping carts, but e-commerce systems in general.

My usual question when asked to optimize eCommerce sites is “are you willing to dump everything except the core shopping cart module?”. Unfortunately, that’s the best as well as the cheapest solution in most cases. The technical crux with eCommerce software is that it’s developed by programmers, not Web developers, and software shops don’t bother asking for SEO advice. The result is often fancy crap.

Another common problem is that the UI is optimized for shoppers (a subclass of ’surfers’; the latter is decently emulated by search engine crawlers). Navigation is mostly shortcut- and search-driven (POST-created results aren’t crawlable) and relies on variables stored in cookies and wherever else (invisible to spiders). All the navigational goodies which make the surfing experience are implemented with client-side technologies, or, if put server-side, served by ugly URLs with nasty session IDs (ignored by crawlers, or at least heavily downranked for various reasons). What’s left for the engines? Deep hierarchical structures of thin pages plastered with duplicated text and buy-now links. That’s not the sort of spider food Ms. Googlebot and her colleagues love to eat.

Guess why Google doesn’t crawl search results: because search results are inedible spider fodder, not worth indexing. The same goes for badly linked conglomerates of thin product pages. Think of a different approach: instead of trying to shove thin product pages into search indexes, write informative pages on product lines/groups/… and link to the product pages within the text. When these well-linked info pages provide enough product details, they’ll rank for product-related search queries. And you’ll generate linkworthy content. Don’t forget to disallow /shop, /search and /products in your robots.txt.
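For the record, the robots.txt fragment for that last piece of advice would look like this (the directory names are just the examples from above; adjust them to whatever paths your cart actually serves thin or duplicate content from):

```
User-agent: *
Disallow: /shop
Disallow: /search
Disallow: /products
```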

Disclaimer: I’ve checked essentialaids.com; Erol’s software does JavaScript redirects obfuscating the linked URLs to deliver the content client-side. I’ve followed this case over a few days, watching Google deindexing the whole site page by page. This kind of redirect is considered “sneaky” by Google, and Google’s spam filters detect it automatically. Although there is no bad intent, Google bans all sites using this technique. Since this is a key feature of the software, how can they advertise it as “search engine friendly”? From their testimonials (most are affiliates) I’ve looked at irishmusicmail.com and found that Google has indexed only 250 pages out of well over 800; it looks like the Erol shopping system was removed. The other non-affiliated testimonial is from heroesforkids.co.uk, a badly framed site which is also not viewable without JavaScript. Due to SE-unfriendliness, Google has indexed only 50 out of 190 pages (deindexing the site a few days later). Another reference, brambleandwillow.com, didn’t load at all; Google has no references, but I found Erol-styled URLs in Yahoo’s index. Next, pensdirect.co.uk suffers from the same flawed architecture as heroesforkids.co.uk, although the pages/indexed-URLs ratio is slightly better (15 of 40+). From a quick look at the Erol JS source, all these pages will get removed from Google’s search index. I didn’t write this to slander Erol and its inventor Dreamteam UK, however much these guys would deserve it. It’s just a warning that good-looking software, which might perfectly support all related business processes, can be extremely destructive from a SEO perspective.

Update: It’s probably possible to make Erol-driven shops compliant with Google’s quality guidelines by creating the pages without a software functionality called “page loading messages”. More information is provided in several posts in Erol’s support forums.


Q: Does Googlebot obey ftp://ftp.example.com/robots.txt?

Google states

Search engine robots, including our very own Googlebot, are incredibly polite. They work hard to respect your every wish regarding what pages they should and should not crawl.

A site owner posted logs where Googlebot fetches files although they are disallowed in robots.txt.

Anyone?




Google Webmasters Help FAQ launched

John Web has launched the Unofficial Google Webmaster FAQs; stop by and contribute your knowledge.

This FAQ was born in a Google Groups thread where regular posters discussed the handling of redundant questions in Google’s Webmaster Help Center.


Blogger.com Hates Digg

Ever tried to add a digg button to a blogspot blog? You get a thoughtless error message like “HTML tag <script type=’text/javascript’> is not allowed”. Since when is JavaScript code an HTML tag? BS. <polemic mode>Why does Blogger.com allow JavaScript code made by AdSense?

Conclusion: Blogger.com hates digg.com but loves Google’s AdSense.

Did I hear Googlers encouraging link building tactics like link baiting and social bookmarking? Perhaps I just misunderstood those messages … </polemic mode>

Here is the code downtrodden by Google:

<script type="text/javascript">digg_url = 'http://sebastianx.blogspot.com/2007/02/why-proper-error-handling-is-important.html';</script><script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script>


Ok, I admit that Google censors filters blog posts in the best interest of noobs not knowing what a piece of JS code can achieve. However, I’m not an ignorant noob, and I do want to make use of JavaScript in my blog posts.

Hey Google, please listen to my rant! Thanks.

PS: Dear Google, please don’t tell me that I can use JS within the template. I don’t think every post of mine is worth a digg button and I’m too lazy to code the conditions.




Does Adam Lasnik like Rel=Nofollow or not?

Spotting the headline “Google’s Lasnik Wishes ‘NoFollow Didn’t Exist’”, I was quite astonished. My first thought was “logic can’t explain such a reversal”. And it turned out to be kinda blog hoax.

Adam’s “I wish nofollow didn’t exist” put back in context clarifies Google’s position:

“My core point […] was that it’d be really nice if nofollow wasn’t necessary. As it stands, it’s an admittedly imperfect yet important indicator that helps maintain the quality of the Web for users.

It’d be nice if there was less confusion about what nofollow does and when it’s useful. It’d be great if we could return to a more innocent time when practically all links to other sites really WERE true votes, folks clearly vouching for a site on behalf of their users.

But we don’t live in perfect, innocent times, and at Google we’re dedicated to doing what it takes to improve the signal-to-noise ratio in search quality.”

I like the “admittedly imperfect” piece ;)

