Archived posts from the 'JavaScript Redirects' Category

Geo targeting without IP delivery is like throwing a perfectly grilled steak at a vegan

So Gareth James asked me to blather about the role of IP delivery in geo targeting. I answered “That’s a complex topic with gazillions of ‘depends’ and no panacea in sight”, and thought he’d just bugger off before I had to write a book to be published on his pathetic UK SEO blog. Unfortunately, it didn’t work according to plan A. This @seo_doctor dude is as persistent as a blowfly attacking a huge horse dump. He dared to reply “lol thats why I asked you!”. OMFG! Usually I throw insults at folks starting a sentence with “lol”, and I don’t communicate with native speakers who niggardly shorten “that’s” to “thats” and don’t capitalize any letter except “I” for egomaniac purposes.

However, I haven’t annoyed the Interwebz with a pamphlet for (perceived) ages, and the topic doesn’t exactly lack controversial discussion, so read on. By the way, Gareth James is a decent guy. It’s just not fair of me to make fun of his interesting question for the sake of a somewhat funny opening. (That’s why you’ve read this pamphlet on his SEO blog earlier.)

How to increase your bounce rate and get your site tanked on search engine result pages with IP delivery in geo targeting

A sure fire way to make me use my browser’s back button is any sort of redirect based on my current latitude and longitude. If you try it, you can measure my blood pressure in comparison to an altitude some light-years above mother earth’s ground. You’ve seriously fucked up my surfing experience, therefore you’re blacklisted back to the stone age, and even a few stones farther just to make sure your shitty Internet outlet can’t make it to my browser’s rendering engine any more. Also, I’ll report your crappy attempt to make me sick of you to all major search engines for deceptive cloaking. Don’t screw red crabs.

Related protip: Treat your visitors with due respect.

Geo targeted ads are annoying enough. When I’m in a Swiss airport’s transit area reading an article on any US news site about Congress’ latest fuck-up in foreign policy, it’s most probably not your best idea to plaster my cell phone’s limited screen real estate with ads recommending Zurich’s hottest brothel that offers a flat rate as low as 500 ‘fränkli’ (SFR) per night. It makes no sense to make me horny minutes before I enter a plane where I can’t smoke for fucking eight+ hours!

Then if you’re the popular search engine that in its almighty wisdom decides that I have to seek a reservation Web form of Boston’s best whorehouse for 10am local time (that’s ETA Logan + 2 hours) via google.ch in French, you’re totally screwed. In other words, because it’s not Google, I go search for it at Bing. (The “goto Google.com” thingy is not exactly reliable, and a totally obsolete detour when I come by with a google.com cookie.)

The same goes for a popular shopping site that redirects me to its Swiss outlet based on my location, although I want to order a book to be delivered to the United States. I’ll place my order elsewhere.

Got it? It’s perfectly fine with me to ask “Do you want to visit our Swiss site? Click here for its version in French, German, Italian or English language”. Just do not force me to view crap I can’t read and didn’t expect to see when I clicked a link!

Regardless of whether you redirect me server sided using a questionable ip2location lookup, or client sided evaluating the location I carelessly opened up to your HTML5 based code, you’re doomed coz I’m pissed. (Regardless of whether you do that under one URI, respectively the same URI with different hashbang crap, or via a chain of actual redirects.) I’ve just increased your bounce rate at lightning speed, and trust me, it’s not just yours truly who tells click tracking search engines that your site is scum.

How to fuck up your geo targeting with IP delivery, SEO-wise

Of course there’s no bulletproof way to obtain a visitor’s actual location based on the HTTP request’s IP address. Also, if the visitor is a search engine crawler, it requests your stuff from Mountain View, Redmond, or an undisclosed location in China, Russia, or some dubious banana republic. I bet that as a US based Internet marketer offering local services across all states you can’t serve a meaningful ad targeting Berlin, Paris, Moscow or Canton. Not that Ms Googlebot appreciates cloaked content tailored for folks residing at 1600 Amphitheatre Parkway, by the way.

There’s nothing wrong with delivering a cialis™ or viagra® peddler’s sales pitch to search engine users from a throwaway domain that appeared on a [how to enhance my sexual performance] SERP for undisclosable reasons, but you really shouldn’t do that (or something similar) from your bread and butter site.

When you’ve got content in different languages and/or you’re targeting different countries, regions, or whatever, you shall link that content together by language and geographical targets. Provide prominent but not obfuscating links to other areas of your site (or local domains) for visitors who might be interested in these versions, as indicated by browser language settings, search terms taken from the query string of the referring page, detected (well, guessed) location, or other available signals. Create kinda regional sites within your site which are easy to navigate for the targeted customers. You can and should group those site areas by sitemaps as well as reasonable internal linkage, and use other techniques that distribute link love to each localized version.
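A minimal sketch of such a visible, crawlable alternative, picking up the “Do you want to visit our Swiss site?” example from above (the URLs are made up for illustration):

<p class="regional-versions">
  Do you want to visit our Swiss site? Read it in
  <a href="http://www.example.ch/fr/">French</a>,
  <a href="http://www.example.ch/de/">German</a>,
  <a href="http://www.example.ch/it/">Italian</a> or
  <a href="http://www.example.ch/en/">English</a>.
</p>

Plain A elements like these are crawlable, pass link love to the localized versions, and leave the decision to the visitor instead of forcing a redirect.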

Thou shalt not serve more than one version of localized content under one URI! If you can’t resist, you’ll piss off your visitors and you’re asking for trouble with search engines. Most of your stuff will never see the daylight of a SERP by design.

This golden rule applies to IP delivery as well as to any other method that redirects users without their explicit agreement. Don’t rely on cookies and such to determine the user’s preferred region or language; always provide visible alternatives when you serve localized content based on previously collected user decisions.

But …

Of course there are exceptions to this rule. For example it’s not exactly recommended to provide content featuring freedom of assembly and expression in fascist countries like Iran, Russia or China, and bare boobs as well as Web analytics or Facebook ‘like’ buttons can get you into deep shit in countries like Germany, where last-century nazis make the Internet laws. So sometimes, IP delivery is the way to go.




How to fuck up click tracking with the JavaScript onclick trigger

There’s a somewhat heated debate over at Sphinn and many other places as well where folks calling each other guppy and dumbass try to figure out whether a particular directory’s click tracking sinks PageRank distribution or not. Besides interesting replies from Matt Cutts, an essential result of this debate is that Sphinn will implement a dumbass button.

Usually I wouldn’t write about desperate PageRank junkies going cold turkey, not even as a TGIF post, but the reason why this blog directory most probably doesn’t pass PageRank is interesting, because it has nothing to do with onclick myths. Of course the mere existence of an intrinsic event handler (aka onclick trigger) in an A element has nothing to do with Google’s take on the link’s intention, hence an onclick event by itself doesn’t kill a link’s ability to pass Google-juice.

To fuck up your click tracking you really need to forget everything you’ve ever read in Google’s Webmaster Guidelines. Unfortunately, Web developers usually don’t bother reading dull stuff like that and code the desired functionality in a way that makes Google as well as other search engines puke on the generated code. However, ignorance is no excuse when Google talks best practices.

Let’s look at the code. Code reveals everything, and not every piece of code is poetry. That’s crap:
.html: <a href="http://sebastians-pamphlets.com"
id="1234"
onclick="return o('sebastians-blog');">
http://sebastians-pamphlets.com</a>

.js: function o(lnk){ window.open('/out/'+lnk+'.html'); return false; }

The script /out/sebastians-blog.html counts the click and then performs a redirect to the HREF’s value.

Why can and most probably will Google consider the hapless code above deceptive? A human visitor using a JavaScript enabled user agent clicking the link will land exactly where expected. The same goes for humans using a browser that doesn’t understand JS, and users surfing with JS turned off. A search engine crawler ignoring JS code will follow the HREF’s value pointing to the same location. All final destinations are equal. Nothing wrong with that. Really?

Nope. The problem is that Google’s spam filters can analyze client sided scripting, but don’t execute JavaScript. Google’s algos don’t ignore JavaScript code, they parse it to figure out the intent of links (and other stuff as well). So what does the algo do, see, and how does it judge eventually?

It understands the URL in HREF as the definitive and ultimate destination. Then it reads the onclick trigger and fetches the external JS file to look up the o() function. It will notice that the function returns an unconditional FALSE. The algo knows that the return value FALSE prevents JS enabled user agents from loading the URL provided in HREF. Even if o() did nothing else, a human visitor with a JS enabled browser will not land at the HREF’s URL when clicking the link. Not good.

Next, the window.open statement loads http://this-blog-directory.com/out/sebastians-blog.html, not http://sebastians-pamphlets.com (truncating the trailing slash is a BS practice as well, but that’s not the issue here). The URLs put in HREF and built in the JS code aren’t identical. That’s a full stop for the algo. Probably it does not request the redirect script http://this-blog-directory.com/out/sebastians-blog.html to analyze its header, which sends a Location: http://sebastians-pamphlets.com line. (Actually, this request would tell Google that there’s no deceitful intent, just plain hapless and overcomplicated coding, which might result in a judgement like “unreliable construct, ignore this link” or so, depending on other signals available.)

From the algo’s perspective the JavaScript code performs a more or less sneaky redirect. It flags the link as shady and moves on. Guess what happens in Google’s indexing process with pages that carry tons of shady links … those links not passing PageRank sounds like a secondary problem. Perhaps Google is smart enough not to penalize legit sites for, well, hapless coding, but that’s sheer speculation.

However, shit happens, so every once in a while such a link will slip thru and may even appear in reverse citation results like link: searches or Google Webmaster Central link reports. That’s enough to fool even experts like Andy Beard (maybe Google even shows bogus link data to mislead SEO researchers of any kind? Never mind).

Ok, now that we know how not to implement onclick click tracking, here’s an example of a bullet-proof method to track user clicks with the onclick event:
<a href="http://sebastians-pamphlets.com/"
id="link-1234"
onclick="return trackclick(this.href, this.id);">
Sebastian's Pamphlets</a>
trackclick() is a function that calls a server sided script to store the click and returns TRUE without doing a redirect or opening a new window.
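A minimal sketch of what such a function might look like, using an old school image beacon so it works in any JS enabled browser (the tracking script /track.php and its parameter names are made up for illustration):

<script type="text/javascript">
// Sketch only: log the click via an image beacon, then return TRUE so
// the browser follows the link's HREF as usual. The endpoint /track.php
// and its parameters are hypothetical.
function trackclick(url, linkId) {
    var beacon = new Image();
    beacon.src = '/track.php?link=' + encodeURIComponent(linkId)
               + '&url=' + encodeURIComponent(url)
               + '&t=' + new Date().getTime(); // cache buster
    return true; // never return FALSE, never touch the HREF
}
</script>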

Here is more information on search engine friendly click tracking using the onclick event. The article is from 2005, but not outdated. Of course you can add onclick triggers to all links with a few lines of JS code, as sketched below. That’s good practice because it avoids clutter in the A elements and makes sure that every (external) link is trackable. For this more elegant way to track clicks the warnings above apply too: don’t return false and don’t manipulate the HREF’s URL.
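For instance, something along these lines would wire up the tracking handler unobtrusively (a sketch, assuming the trackclick() function from above):

<script type="text/javascript">
// Sketch only: attach the tracking handler to all external links after
// the page has loaded, keeping the A elements themselves free of clutter.
window.onload = function() {
    var links = document.getElementsByTagName('a');
    for (var i = 0; i < links.length; i++) {
        // skip internal links; adjust the hostname check to your needs
        if (links[i].hostname && links[i].hostname != window.location.hostname) {
            links[i].onclick = function() {
                return trackclick(this.href, this.id);
            };
        }
    }
};
</script>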




Google and Yahoo accept undelayed meta refreshes as 301 redirects

Although the meta refresh often gets abused to trick visitors into popup hells by sneaky pages on low-life free hosts (poor man’s cloaking), search engines don’t treat every instance of the meta refresh as Webspam. Folks moving their free hosted stuff to their own domains rely on it to redirect to the new location:
<meta http-equiv="refresh" content="0; url=http://example.com/newurl" />

Yahoo clearly states how they treat a zero meta refresh, that is a redirect with a delay of zero seconds:

META Refresh: <meta http-equiv="refresh" content=…> is recognized as a 301 if it specifies little or no delay or as a 302 if it specifies noticeable delay.

Google is in the process of rewriting their documentation; in the current version of their help documents the meta refresh is not (yet!) mentioned. The Google Mini treats all meta refreshes as 302:

A META tag that specifies http-equiv="refresh" is handled as a 302 redirect.

but Google’s Web search handles it differently. I’ve asked Google’s search evangelist Adam Lasnik and he said:

[The] best idea is to use 301/302s directly whenever possible; otherwise, next best is to do a metarefresh with 0 for a 301. I don’t believe we recommend or support any 302-alternative.

Thanks Adam! I’ll update the last meta refresh thread.

If you have the chance to do 301 redirects don’t mess with the meta refresh. Utilize this method only when there’s absolutely no other option.
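For the sake of completeness, on an Apache host a native 301 is a one-liner in .htaccess (mod_alias assumed; the URLs are placeholders):

Redirect 301 /oldurl http://example.com/newurl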

Full stop for search geeks. What follows is an explanation for less experienced Webmasters who need to move their stuff away from greedy Web content funeral services, aka free hosts of any sort.

Ok, now that we know the major search engines accept an undelayed meta refresh as a poor man’s 301 redirect, what should a page carrying this tag look like in order to act as a provisional permanent redirect? As plain and functional as possible:
<html>
<head>
<title>Moved to new URL: http://example.com/newurl</title>
<meta http-equiv="refresh" content="0; url=http://example.com/newurl" />
<meta name="robots" content="noindex,follow" />
</head>
<body>
<h1>This page has been moved to http://example.com/newurl</h1>
<p>If your browser doesn't redirect you to the new location please <a href="http://example.com/newurl"><b>click here</b></a>, sorry for the hassles!</p>
</body>
</html>

As long as the server delivers the content above under the old URL sending a 200-OK, Google’s crawl stats should not list the URL under 404 errors. If it does appear under “Not found”, something went awfully wrong, probably on the free host’s side. As long as you’ve got control over the account, you must not delete the page, because the search engines revisit it from time to time to check whether you still redirect with that URL or not.

[Excursus: When a search engine crawler fetches this page, the server returns a 200-OK because, well, it’s there. Acting as a 301/302 does not make it a standard redirect. That sounds confusing to some people, so here is the technical explanation. Server sided response codes like 200, 302, 301, 404 or 410 are sent by the Web server to the user agent in the HTTP header before the server delivers any page content to the user agent (Web browser, search engine crawler, …). The meta refresh OTOH is a client sided directive telling the user agent to disregard the page’s content and to fetch the given (new) URL to render it instead of the initially requested URL. The browser parses the redirect directive out of the file which was received with an HTTP response code 200 (OK). That’s why you don’t get a 302 or 301 when you use a server header checker.]
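To illustrate what a server header checker sees (generic headers, URLs are placeholders): the meta refresh page answers with a plain 200 and hides the redirect in the body, while a native 301 announces the new location in the header itself:

Old URL serving the meta refresh page:
HTTP/1.1 200 OK
Content-Type: text/html
(the redirect sits in the body, invisible to the header checker)

Old URL answering with a native server sided 301:
HTTP/1.1 301 Moved Permanently
Location: http://example.com/newurl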

When a search engine crawler fetches the page above, that’s just the beginning of a pretty complex process. Search engines are large scale systems which make use of asynchronous communication between tons of highly specialized programs. The crawler itself has nothing to do with indexing. Maybe it follows server sided redirects instantly, but that’s unlikely with meta refreshes because crawlers just fetch Web contents for unprocessed delivery to a data pool from where all sorts of processes like (vertical) indexers pull their fodder. Deleting a redirecting page in the search index might be done by process A running hourly, whilst process B instructing the crawler to fetch the redirect’s destination runs once a day, then the crawler may be swamped so that it delivers the new content a month later to process C which ran just five minutes before the content delivery and starts again not before next Monday if that’s not a bank holiday…

That means the old page may get deindexed way before the new URL makes it into the search index. If you change anything during this period, you just confuse the pretty complex chain of processes, which means that perhaps the search engine starts over by rolling back all transactions and refetching the redirecting page. Not good. Keep all kinds of permanent redirects forever.

Actually, a zero meta refresh works like a 301 redirect because the engines (shall) treat it as a permanent redirect, but it’s not a native 301. In fact, due to so much abuse by spammers it might be considered less reliable than a server sided 301 sent in the HTTP header. Hence you want to express your intention clearly to the engines. You do that with several elements of the meta refresh’ing page:

  • The page title says that the resource was moved and tells the new location. Words like “moved” and “new URL” without surrounding gimmicks make the message clear.
  • The zero (second) delay parameter shows that you don’t deliver visible content to (most) human visitors but switch their user agent right to the new URL.
  • The “noindex” robots meta tag telling the engines not to index the actual page’s contents is a signal that you don’t cheat. The “follow” value (referring to links in BODY) is just a fallback mechanism to ensure that engines having trouble understanding the redirect at least follow and index the “click here” link.
  • The lack of indexable content and keywords makes clear that you don’t try to achieve SE rankings for anything except the new URL.
  • The H1 heading repeating the title tag’s content on the page, visible for users surfing with meta refresh = off, reinforces the message and helps the engines to figure out the seriousness of your intent.
  • The same goes for the text message with its clear call to action, underlined by the URL introduced by the other elements.

Meta refreshes, like other client sided redirects (e.g. window.location = "http://example.com/newurl"; in JavaScript), can be found in every spammer’s toolbox, so don’t leave the outdated content on the page, and add a JavaScript redirect only to contentless pages like the sample above. Actually, you don’t need to do that, because the number of users surfing with meta-refresh=off is only a tiny fraction of your visitors, and using JavaScript redirects is way more risky (WRT picky search engines) than a zero meta refresh. Also, JavaScript redirects –if captured by a search engine– should count as 302, and you really don’t want to deal with all the disadvantages of soft redirects.

Another interesting question is whether removing the content from the outdated page makes a difference or not. Doing a mass search+replace to insert the meta tags (refresh and robots) with no further changes to the HTML source might seem attractive from a Webmaster’s perspective. It’s fault-prone, however. Creating a list mapping outdated pages to their new locations to feed a quick+dirty desktop program generating the simple HTML code above is actually easier and eliminates a couple of points of failure.

Finally: Make use of meta refreshes on free hosts only. Professional hosting firms let you do server sided redirects!




Erol ships patch fixing deindexing of online stores by Google

If you run an Erol driven store and you suffer from a loss of Google traffic, or you just want to make sure that your store’s content presentation is more compliant to Google’s guidelines, then patch your Erol software (*ix hosts / Apache only). For a history of this patch and more information click here.

Tip: Save your /.htaccess file before you publish the store. If it contains statements not related to Erol, then add the code shipped with this patch manually to your local copy of .htaccess and the .htaccess file in the Web host’s root directory. If you can’t see the (new) .htaccess file in your FTP client, then add “-a” to the external file mask. If your FTP client transfers .htaccess in binary mode, then add “.htaccess” to the list of ASCII files in the settings. If you upload .htaccess in binary mode, it may not exactly do what you expect it to accomplish.

I don’t know when/if Erol will ship a patch for IIS. (As a side note, I can’t imagine one single reason why hosting an online store under Windows could make sense. OTOH there are many reasons to avoid hosting of anything keen on search engine traffic on a Windows box.)




Follow-up: Erol’s patch fixing Google troubles

Erol’s developers are testing their first Google patch for sites hosted on UNIX boxes. You can preview it here: x55.html. When you request the page with a search engine crawler identifier as user-agent name, the JavaScript code redirecting to the frameset erol.html#55x0&& gets replaced with an HTML comment explaining why human visitors are treated differently from search engine spiders. The anatomy of this patch is described here, your feedback is welcome.

Erol told me they will be running tests on this site over the coming weeks, as they always do before going live with an update. So stay tuned for the release. When things run smoothly on UNIX hosts, a patch for Windows environments shall follow. On IIS the implementation is a bit trickier, because it needs changes to the server configuration. I’ll keep you updated.




Erol to ship a Patch Fixing Google Troubles

Background: read these four posts on Google penalizing and deindexing e-commerce sites. Long story short: Recently Google’s enhanced algos began to deindex e-commerce sites powered by Erol’s shopping cart software. The shopping cart maintains a static HTML file which redirects user agents executing JavaScript to another URL. This happens with each and every page, so it’s quite understandable that Ms. Googlebot was not amused. I got involved as a few worried store owners asked for help in Google’s Webmaster Forum. After lots of threads and posts on the subject Erol’s managing director got in touch with me and we agreed to team up to find a solution to help the store owners suffering from a huge traffic loss. Here’s my report of the first technical round.

Understanding how Erol 4.x (and all prior versions) works:

The software generates an HTML page offline, which functions as an XML-like content source (called “x-page”; I use that term because all Erol customers are familiar with it). The “x-page” gets uploaded to the server and is crawlable, but not really viewable. Requested by a robot it responds with 200-Ok. Requested by a human, it does a JavaScript redirect to a complex frameset, which loads the “x-page” and visualizes its contents. The frameset responds to browsers if called directly, but returns a 404-NotFound error to robots. Example:

“x-page”: x999.html
Frameset: erol.html#999x0&&

To view the source of the “x-page” disable JavaScript before you click the link.
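Purely for illustration, and not Erol’s actual code, the kind of unconditional client sided redirect such an “x-page” carries boils down to something like this hypothetical sketch:

<script type="text/javascript">
// Hypothetical sketch: every user agent that executes JavaScript gets
// bounced to the frameset URL; crawlers only see the static page.
// The frame test keeps the redirect from firing again once the page
// is loaded inside the frameset.
if (top == self) {
    top.location.replace("erol.html#999x0&&");
}
</script>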

Understanding how search engines handle Erol’s pages:

There are two major weak points with regard to crawling and indexing. The crawlable page redirects, and the destination does not exist if requested by a crawler. This leads to these scenarios:

  1. A search engine ignoring JavaScript on crawled pages fetches the “x-page” and indexes it. That’s the default behavior of yesterday’s crawlers, and it still works this way at several search engines.
  2. A search engine not executing JavaScript on crawled pages fetches the “x-page”, analyzes the client sided script, and discovers the redirect (please note that a search engine crawler may change its behavior, so this can happen all of a sudden to properly indexed pages!). Possible consequences:
    • It tries to fetch the destination, gets the 404 response multiple times, and deindexes the “x-page” eventually. That would mean that depending on the crawling frequency and depth per domain the pages disappear quite fast or rather slowly until the last page is phased out. Google would keep a copy in the supplemental index for a while, but this listing cannot return to the main index.
    • It’s trained to consider the unconditional JavaScript redirect “sneaky” and flags the URL accordingly. This can result in temporary as well as permanent deindexing.
  3. A search engine executing JavaScript on crawled pages fetches the “x-page”, performs the redirect (thus ignores the contents of the “x-page”), and renders the frameset for indexing. Chances are it gives up on the complexity of the nested frames, indexes the noframes tag of the frameset and perhaps a few snippets from subframes, considers the whole conglomerate thin, hence assigns the lowest possible priority for the query engine and moves on.

Unfortunately the search engine delivering the most traffic began to improve its crawling and indexing, hence many sites formerly receiving a fair amount of Google traffic began to suffer from scenario 2 — deindexing.

Outlining a possible workaround to get the deleted pages back in the search index:

In six months or so Erol will ship version 5 of its shopping cart, and this software dumps frames, JavaScript redirects and ugly stuff like that in favor of clean XHTML and CSS. By the way, Erol has asked me for my input on their new version, so you can bet it will be search engine friendly. So what can we do in the meantime to help legions of store owners running version 4 and below?

We’ve got the static “x-page” which should not get indexed because it redirects, and which cannot be changed to serve the contents itself. The frameset cannot be indexed because it doesn’t exist for robots, and even if a crawler could eat it, we wouldn’t consider it easy-to-digest spider fodder.

Let’s look at Google’s guidelines, which are the strictest around, thus applicable for other engines as well:

  1. Don’t […] present different content to search engines than you display to users, which is commonly referred to as “cloaking.”
  2. Don’t employ cloaking or sneaky redirects.

If we find a way to suppress the JavaScript code on the “x-page” when a crawler requests it, the now more sophisticated crawlers will handle the “x-page” like their predecessors, that is they would fetch the “x-pages” and hand them over to the indexer without vicious remarks. Serving identical content under different URLs to users and crawlers does not contradict the first prescript. And we’d comply with the second rule, because loading a frameset for human visitors but not for crawlers is definitely not sneaky.

Ok, now how to tell the static page that it has to behave dynamically, that is, output different contents server sided depending on the user agent’s name? Well, Erol’s desktop software which generates the HTML can easily insert PHP tags too. The browser would not render those on a local machine, but who cares, as long as it works after the upload to the server. Here’s the procedure for Apache servers:

In the root’s .htaccess file we enable PHP parsing of .html files:
AddType application/x-httpd-php .html

Next we create a PHP include file xinc.php which prevents crawlers from reading the offending JavaScript code:
<?php
$crawlerUAs = array("Googlebot", "Slurp", "MSNbot", "teoma", "Scooter", "Mercator", "FAST");
$isSpider = FALSE;
$userAgent = getenv("HTTP_USER_AGENT");
foreach ($crawlerUAs as $crawlerUA) {
    if (stristr($userAgent, $crawlerUA)) $isSpider = TRUE;
}
if (!$isSpider) {
    print "<script type=\"text/javascript\"> [a whole bunch of JS code] </script>\n";
}
if ($isSpider) {
    print "<!-- Dear search engine staff: we've suppressed the JavaScript code redirecting browsers to \"erol.html\", that's a frameset serving this page's contents more pleasant for human eyes. -->\n";
}
?>

Erol’s HTML generator now puts <?php @include("xinc.php"); ?> instead of a whole bunch of JavaScript code.

The implementation for other environments is quite similar. If PHP is not available we can do it with SSI and Perl. On Windows we can tell IIS to process all .html extensions as ASP (App Mappings) and use an ASP include. That would give three versions of this patch, which should help 99% of all Erol customers until they can upgrade to version 5.

This solution comes with two disadvantages. First, the cached page copies, clickable from the SERPs and toolbars, would render pretty ugly because they lack the JavaScript code. Second, perhaps automated tools searching for deceitful cloaking might red-flag the URLs for a human review. Hopefully the search engine executioner reading the comment in the source code will be fine with it and give it a go. If not, there’s still the reinclusion request. I think store owners can live with that when they get their Google traffic back.

Rolling out the patch:

Erol thinks the above makes sense and there is a chance of implementing it soon. While the developers are at work, please provide feedback if you think we didn’t interpret Google’s Webmaster Guidelines strictly enough. Keep in mind that this is an interim solution and that the new version will handle things in a more standardized way. Thanks.

Paid-Links-Disclosure: I do this pro bono job for the sake of the suffering store owners. Hence the links pointing to Erol and Erol’s customers are not nofollow’ed. Not that I’d nofollow them otherwise ;)




Follow-up on "Google penalizes Erol stores"

Background: these three posts on Google penalizing e-commerce sites.

Erol has contacted me and we will discuss the technical issues within the next days, or maybe weeks or so. I understand this as a positive signal, especially because previously my impression was that Erol is not willing to listen to constructive criticism, regardless of Google’s shot across the bow (more on that later). We agreed that before we come to the real (SEO) issues it’s a good idea to clarify a few points made in my previous posts. In the following I quote parts of Erol’s emails with permission:

Your blog has made for interesting reading but the first point I would like to raise with you is about the tone of your comments, not necessarily the comments themselves.

Question of personal style, point taken.

Your article entitled ‘Why eCommerce Systems Suck‘, dated March 12th, includes specific reference to EROL and your opinion of its SEO capability. Under such a generic title for an article, readers should expect to read about other shopping cart systems and any opinion you may care to share about them. In particular, the points you raise about other elements of SEO in the same article, (’Google doesn’t crawl search results’, navigation being ‘POST created results not crawlable’) are cited as examples of ways other shopping carts work badly in reference to SEO - importantly, this is NOT the way EROL stores work. Yet, because you do not include any other cart references by name or exclude EROL from these specific points, the whole article reads as if it is entirely aimed at EROL software and none others.

Indeed, that’s not fair. Navigation solely based on uncrawlable search results without crawler shortcuts or sheer POST results are definitely not issues I’ve stumbled upon while investigating penalized Erol driven online stores. Google’s problem with Erol driven stores is client sided cloaking without malicious intent. I’ve updated the post to make that clear.

Your comment in another article, ‘Beware of the narrow-minded coders‘ dated 26 March where you state: “I’ve used the case [EROL] as an example of a nice shopping cart coming with destructive SEO.” So by this I understand that your opinion is EROL is actually ‘a nice shopping cart’ but it’s SEO capabilities could be better. Yet your articles read through as EROL is generally bad all round. Your original article should surely be titled “Why eCommerce Systems Suck at SEO” and take a more rounded approach to shopping cart SEO capabilities, not merely “Why eCommerce Systems Suck”? This may seem a trivial point to you, but how it reflects overall on our product and clouds it’s capability to perform its main function (provide an online ecommerce solution) is really what concerns me.

Indeed, I meant that Erol is a nice shopping cart lacking SEO capabilities as long as the major SEO issues don’t get addressed asap. And I mean in the current version, which clearly violates Google’s quality guidelines. From what I’ve read in the meantime, the next version to be released in 6 months or so should eliminate the two major flaws with regard to search engine compatibility. I’ve changed the post’s title, the suggestion makes sense for me too.

I do not enjoy the Google.co.uk traffic from search terms like “Erol sucks” or “Erol is crap” because that’s simply not true. As I said before, I think that Erol is well rounded software nicely supporting the business processes it’s designed for, and the many store owners using Erol I’ve communicated with recently all tell me that too.

I noted with interest that your original article ‘Why eCommerce Systems Suck’ was dated 12th March. Coincidentally, this was the date Google began to re-index EROL stores following the Google update, so I presume that your article was originally written following the threads on the Google webmaster forums etc. prior to the 12th March where you had, no doubt, been answering questions for some of our customers about their de-listing during the update. You appear to add extra updates and information in your blogs but, disappointingly, you have not seen fit to include the fact that EROL stores are being re-listed in any update to your blog so, once again, the article reads as though all EROL stores have been de-listed completely, never to be seen again.

With all respect, nope. Google did not reindex Erol driven pages; Google had just lifted a “yellow card” penalty for a few sites. That is not a carte blanche but, on the contrary, Google’s last warning before the site in question gets the “red card”, that is a full ban lasting at least a couple of months or even longer. As said before, it means absolutely nothing when Google crawls penalized sites or when a couple of pages reappear on the SERPs. Here is the official statement: “Google might also choose to give a site a ‘yellow card’ so that the site can not be found in the index for a short time. However, if a webmaster ignores this signal, then a ‘red card’ with a longer-lasting effect might follow.”
(Yellow / red cards: soccer terminology, yellow is a warning and red the sending-off.)

I found your comments about our business preferring “a few fast bucks”, suggesting we are driven by “greed” and calling our customers “victims” particularly distasteful. Especially the latter, because you infer that we have deliberately set out to create software that is not capable of performing its function and/or not capable of being listed in the search engines and that we have deliberately done this in pursuit of monetary gain at the expense of reputation and our customers. These remarks I really do find offensive and politely ask that they be removed or changed. In your article “Google deindexing Erol driven ecommerce sites” on March 23rd, you actually state that “the standard Erol content presentation is just amateurish, not caused by deceitful intent”. So which is it to be - are we deceitful, greedy, victimising capitalists, or just amateurish and without deceitful intent? I support your rights to your opinions on the technical proficiency of our product for SEO, but I certainly do not support your rights to your opinions of our company and its ethics which border on slander and, at the very least, are completely unprofessional from someone who is positioning themselves as just that - an SEO professional.

To summarise, your points of view are not the problem, but the tone and language with which they are presented and I sincerely hope you will see fit to moderate these entries.

C’mon, now you’re getting polemic ;) In this post I’ve admitted to being polemic to bring my point home, and in the very first post on the topic I clearly stated that my intention was not to slander Erol. However, since you’ve agreed to an open discussion of the SEO flaws I think it’s no longer suitable to call your customers victims, so I’ve changed that. Also, in my previous post I’ll insert a link near “greed” and “fast bucks” pointing to this paragraph to make it absolutely clear that I did not mean what you insinuate when I wrote:

Ignorance is no excuse […] Well, it seems to me that Erol prefers a few fast bucks over satisfied customers, thus I fear they will not tell their customers the truth. Actually, they simply don’t get it. However, I don’t care whether their intention to prevaricate is greed or ignorance, I really don’t know, but all the store operators suffering from Google’s penalties deserve the information.

Actually, I still stand by my provocative comments, because at the time they perfectly described the impression you’ve created with your actions, or rather your lack of appropriate action in public.

  1. Critical customers asking in your support forums whether the loss of Google traffic might be caused by the way your software handles HTML output were put down and censored.
  2. Your public answers to worried customers were plain wrong, SEO-wise. Instead of “we take your hints seriously and will examine whether JavaScript redirects may cause Google penalties or not” you said that search engines index cloaking pages just fine, that Googlebot crawling penalized sites is a good sign, and that all the mess is kinda Google hiccup. At this point the truth had been out long enough, so your most probably unintended disinformation has worried a number of your customers, and gave folks like me the impression that you’re not willing to undertake the necessary steps.
  3. Offering SEO services yourself, as well as forum talks praising Erol’s SEO experts, doesn’t put you in a “we just make great shopping cart software and are not responsible for search engine weaknesses” position. Frankly, that doesn’t come across as responsible management of customer expectations. It’s great that your next version will dump frames and JavaScript redirects, but that’s a bit too late in the eyes of your customers, and way too late from a SEO perspective, because Google never permitted the use of JavaScript redirects, and all the disadvantages of frames have been public knowledge since the glory days of Altavista, Excite and Infoseek, long before Google overtook search.

To set the record straight: I don’t think and never thought that you’ve greedily or deliberately put your customers at risk in pursuit of monetary gain. You’ve just ignored Google’s guidelines and best practices of Web development for too long, but, as the sub-title of my previous post hints, ignorance is no excuse.

Now that we’ve handled the public relations stuff, I’ll look into the remaining information Erol sent over, hoping that I’ll be able to provide some reasonable input in the best interest of Erol’s customers.





Beware of the narrow-minded coders

or Ignorance is no excuse

Long winded story on SEO-ignorant pommy coders putting their customers at risk. Hop away if e-commerce software vs. SEO dramas don’t thrill you.

Recently I’ve answered a “Why did Google deindex my pages” question in Google’s Webmaster Forum. It turned out that the underlying shopping cart software (EROL) maintained somewhat static pages as spider fodder, which redirect human visitors to another URL serving the same contents client sided. Silly thing to do, but pretty common for shopping carts. I’ve used the case as an example of a nice shopping cart coming with destructive SEO in a post on flawed shopping carts in general.

Day by day other site owners operating Erol driven online shops popped up in the Google Groups or emailed me directly, so I realized that there is a darn widespread problem involving a very popular UK based shopping cart package responsible for Google cloaking penalties. From my contacts I knew that Erol’s software engineers and self-appointed SEO experts believe in weird SEO theories and don’t consider that their software architecture itself could be the cause of the mess. So I wrote a follow-up addressing Erol directly. Google penalizes Erol-driven e-commerce sites explains Google’s take on cloaking and sneaky JavaScript redirects to Erol and its customers.

My initial post got linked and discussed in Erol’s support forum and kept my blog stats counter busy over the weekend. Accused of posting crap, I showed up and posted a short summary over there:

Howdy, I’m the author of the blog post you’re discussing here: Why eCommerce systems suck

As for crap or not crap, judge yourself. This blog post was addressed to ecommerce systems in general. Erol was mentioned as an example of a nice shopping cart coming with destructive SEO. To avoid more misunderstandings and to stress the issues Google has with Erol’s JavaScript redirects, I’ve posted a follow-up: Google deindexing Erol-driven ecommerce sites.

This post contains related quotes from Matt Cutts, head of Google’s web spam team, and Google’s quality guidelines. I guess that piece should bring my point home:

If you’re keen on search engine traffic then do not deliver one page to the crawlers and another page to users. Redirecting to another URL which serves the same contents client sided gives Google an idea of intent, but honest intent is not a permission to cloak. Google says JS redirects are against the guidelines, so don’t cloak. It’s that simple.

If you’ve questions, post a comment on my blog or drop me a line. Thanks for listening

Sebastian

Next the links to this blog were edited out and Erol posted a longish but pointless charade. Click the link to read it in full; summarized, it tells the worried Erol victims that Google has no clue at all, frames and JS redirects are great for online shops, and waiting for the next software release providing meaningful URLs will fix everything. Ok, that’s polemic, so here are at least a few quotes:

[…] A number of people have been asking for a little reassurance on the fact that EROL’s x.html pages are getting listed by Google. Below is a list of keyword phrases, with the number of competing pages and the x.html page that gets listed [4 examples provided].
[…]
EROL does use frames to display the store in the browser, however all the individual pages generated and uploaded by EROL are static HTML pages (x.html pages) that can be optimised for search engines. These pages are spidered and indexed by the search engines. Each of these x.html pages have a redirect that loads the page into the store frameset automatically when the page is requested.
[…]
EROL is a JavaScript shopping cart, however all the links within the store (links to other EROL pages) that are added using EROL Link Items are written to the static HTML pages as a standard <a href=""> links - not a JavaScript link. This helps the search engines spider other pages in your store.

The ’sneaky re-directs’ being discussed most likely relate to an older SEO technique used by some companies to auto-forward from an SEO-optimised page/URL to the actual URL the site-owner wants you to see.

EROL doesn’t do this - EROL’s page load actually works more like an include than the redirect mentioned above. In its raw form, the ‘x123.html’ page carries visible content, readable by the search engines. In it’s rendered form, the page loads the same content but the JavaScript rewrites the rendered page to include page and product layout attributes and to load the frameset. You are never redirected to another html page or URL. [Not true, the JS function displayPage() changes the location of all pages indexed by Google, and property names like ‘hidepage’ speak for themselves. Example: x999.html redirects to erol.html#999x0&&]
[…]
We have, for the past 6 months, been working with search engine optimisation experts to help update the code that EROL writes to the web page, making it even more search engine friendly.

As part of the recommendations suggested by the SEO experts, pages names will become more search engine friendly, moving way from page names such as ‘x123.hml’ to ‘my-product-page-123.html’. […]

Still in a friendly and helpful mood, I wrote a reply:

With all respect, if I understand your post correctly that’s not going to solve the problem.

As long as a crawlable URL like http://www.example.com/x123.html or http://www.example.com/product-name-123.html resolves to
http://www.example.com/erol.html#123x0&& or whatever, that’s a violation of Google’s quality guidelines. Whether you call that redirect sneaky (Google’s language) or not, that’s not the point. It’s Google’s search engine, so their rules apply. These rules state clearly that pages which do a JS redirect to another URL (on the same server or not, delivering the same contents or not) do not get indexed, or, if discovered later on, get deindexed.

The fact that many x-pages are still indexed and may even rank for their targeted keywords means nothing. Google cannot discover and delist all pages utilizing a particular disliked technique overnight, and never has. Sometimes that’s a process lasting months or even years.

The problem is that these redirects put your customers at risk. Again, Google didn’t change its Webmaster guidelines, which have forbidden JS redirects since the stone age; it has recently improved its ability to discover violations in the search index. Google does frequently improve its algos, so please don’t expect to get away with it. Quite the opposite, expect each and every page with these redirects to vanish over the years.

A good approach to avoid Google’s cloaking penalties is utilizing one single URL as spider fodder as well as content presentation to browsers. When a Googler loads such a page with a browser and compares the URL to the spidered one, you get away with nearly everything CSS and JS can accomplish — as long as the URLs are identical. If OTOH the JS code changes the location you’re toast.

Posting this response failed, because Erol’s forum admin banned me after censoring my previous post. By the way, according to posts outside their sphere, and from what I’ve seen watching the discussion, they censor posts of customers too. Well, that’s fine with me, since it’s Erol’s forum and they make the rules. However, still eager to help, I emailed my reply to Erol, and to Erol customers asking for my take on Erol’s final statement.

You ask why I post this long winded stuff? Well, it seems to me that Erol prefers a few fast bucks over satisfied customers, thus I fear they will not tell their customers the truth. Actually, they simply don’t get it. However, I don’t care whether their intention to prevaricate is greed or ignorance, I really don’t know, but all the store operators suffering from Google’s penalties deserve the information. A few of them have subscribed to my feed, so I hope my message gets spread. Continuation





Google deindexing Erol driven ecommerce sites

Follow-up post - see why e-commerce software sucks.

Erol is shopping cart software invented by DreamTeam, a UK based Web design firm. One of its core features is the on-the-fly conversion of crawlable HTML pages to fancy JS driven pages. Looks great in a JavaScript-enabled browser, and ugly w/o client sided formatting.

Erol, offering not that cheap SEO services itself, claims that it is perfectly OK to show Googlebot a content page without gimmicks, whilst human users get redirected to another URL.

Erol victims suffer from deindexing of all Erol-driven pages; Google just keeps those pages in the index which do not contain Erol’s JS code. Considering how many online shops make use of Erol software in the UK, this massive traffic drop may have a visible impact on the gross national product ;) … Ok, sorry, joking with so many businesses at risk does not amuse the Queen.

Dear “SEO experts” at Erol, could you please read Google’s quality guidelines:

· Don’t […] present different content to search engines than you display to users, which is commonly referred to as “cloaking.”
· Don’t employ cloaking or sneaky redirects.
· If a site doesn’t meet our quality guidelines, it may be blocked from the index.

Google did your customers a favour by not banning their whole sites, probably because the standard Erol content presentation technique is (SEO-wise) just amateurish, not caused by deceitful intent. So please stop whining

We are currently still investigating the recent changes Google have made which have caused some drop-off in results for some EROL stores. It is as a result of the changes by Google, rather than a change we have made in the EROL code that some sites have dropped. We are investigating all possible reasons for the changes affecting some EROL stores and we will, of course, feedback any definitive answers and solutions as soon as possible.

and listen to your customers stating

Hey Erol Support
Maybe you should investigate doorway pages with sneaky redirects? I’ve heard that they might cause “issues” such as full bans.

Tell your victims customers the truth, they deserve it.

Telling your customers that Googlebot crawling their redirecting pages will soon result in reindexing those pages is plain false, by the way. Just because the crawler fetches a questionable page doesn’t mean that the indexing process reinstates its accessibility for the query engine. Googlebot is just checking whether the sneaky JavaScript code was removed or not.

Go back to the whiteboard. See a professional SEO. Apply common sense. Develop a clean user interface pleasing human users and search engine robots as well, without frames, sneaky or just superfluous JavaScript redirects, and amateurish BS like that. In the meantime provide help and workarounds (for example a tutorial like “How to build an Erol shopping site without the page loading messages which result in search engine penalties”), otherwise you won’t need the revamp because your customer base will shrink to zilch.

Update: It seems that there’s a patch available. In Erol’s support forum member Craig Bradshaw posts “Erols new patch and instructions clearly tell customers not to use the page loading messages as these are no longer used by the software.”.


Related links:
Matt Cutts August 19, 2005: “If you make lots of pages, don’t put JavaScript redirects on all of them … of course we’re working on better algorithmic solutions as well. In fact, I’ll issue a small weather report: I would not recommend using sneaky JavaScript redirects. Your domains might get rained on in the near future.”
Matt Cutts December 11, 2005: “A sneaky redirect is typically used to show one page to a search engine, but as soon as a user lands on the page, they get a JavaScript or other technique which redirects them to a completely different page.”
Matt Cutts September 18, 2005: “If […] you employ […] things outside Google’s guidelines, and your site has taken a precipitous drop recently, you may have a spam penalty. A reinclusion request asks Google to remove any potential spam penalty. … Are there […] pages that do a JavaScript or some other redirect to a different page? … Whatever you find that you think may have been against Google’s guidelines, correct or remove those pages. … I’d recommend giving a short explanation of what happened from your perspective: what actions may have led to any penalties and any corrective action that you’ve taken to prevent any spam in the future.”
Matt Cutts July 31, 2006: “I’m talking about JavaScript redirects used in a way to show users and search engines different content. You could also cloak and then use (meta refresh, 301/302) to be sneaky.”
Matt Cutts December 27, 2006 and December 28, 2006: “We have written about sneaky redirects in our webmaster guidelines for years. The specific part is ‘Don’t employ cloaking or sneaky redirects.’ We make our webmaster guidelines available in over 10 different languages … Ultimately, you are responsible for your own site. If a piece of shopping cart code put loads of white text on a white background, you are still responsible for your site. In fact, we’ve taken action on cases like that in the past. … If for example I did a search […] and saw a bunch of pages […], and when I clicked on one, I immediately got whisked away to a completely different url, that would be setting off alarm bells ringing in my head. … And personally, I’d be talking to the webshop that set that up (to see why on earth someone would put up pages like that) more than talking to the search engine.”

Matt Cutts heads Google’s Web spam team and has discussed these issues in many places since the stone age. Look at the dates above, penalties for cloaking / JS redirects are not a new thing. The answer to “It is as a result of the changes by Google, rather than a change we have made in the EROL code that some sites have dropped.” (Erol statement) is: Just because you’ve gotten away with it for so long doesn’t mean that JS redirects are fine with Google. The cause of the mess is not a recent change of code, it’s the architecture itself, which is considered “cloaking / sneaky redirect” by Google. Google recently has improved its automated detection of client sided redirects, not its guidelines. Considering that both Erol created pages (the crawlable static page and the contents served by the URL invoked by the JS redirect) present similar contents, Google will have sympathy for all reinclusion requests, provided that the sites in question have been made squeaky-clean first.




Why eCommerce Systems Suck at SEO

Listening to whiners and disappointed site owners across the boards I guess in a few weeks we’ll discuss Google’s brand new e-commerce penalties in instances of -30, -900 and -supphell. NOT! A recent algo tweak may have figured out how to identify more crap, but I doubt Google has launched an anti-eCommerce campaign.

One doesn’t need an award-winning mid-range e-commerce shopping cart like Erol to gain the Google death penalty. Thanks to this award winning software sold as “search engine friendly” on the home page, or rather thanks to its crappy architecture (sneaky JS redirects as per Google’s Webmaster guidelines), many innocent shopping sites from Erol’s client list have vanished, or will be deindexed soon. Unbelievable when you read more about their so-called SEO Services. Oh well, so much for an actual example. The following comments do not address Erol shopping carts, but e-commerce systems in general.

My usual question when asked to optimize eCommerce sites is “are you willing to dump everything except the core shopping cart module?”. Unfortunately, that’s the best as well as the cheapest solution in most cases. The technical crux with eCommerce software is that it’s developed by programmers, not Web developers, and software shops don’t bother asking for SEO advice. The result is often fancy crap.

Another common problem is that the UI is optimized for shoppers (that’s a subclass of ‘surfers’; the latter is decently emulated by search engine crawlers). Navigation is mostly shortcut- and search driven (POST created results are not crawlable) and relies on variables stored in cookies and wherever (invisible to spiders). All the navigational goodies which make the surfing experience are implemented with client sided technologies, or –if put server sided– served by ugly URLs with nasty session-IDs (ignored by crawlers or at least heavily downranked for various reasons). What’s left for the engines? Deep hierarchical structures of thin pages plastered with duplicated text and buy-now links. That’s not the sort of spider food Ms. Googlebot and her colleagues love to eat.

Guess why Google doesn’t crawl search results. Because search results are inedible spider fodder not worth indexing. The same goes for badly linked conglomerates of thin product pages. Think of a different approach. Instead of trying to shove thin product pages into search indexes, write informative pages on product lines/groups/… and link to the product pages within the text. When these well linked info pages provide enough product details they’ll rank for product related search queries. And you’ll generate linkworthy content. Don’t forget to disallow /shop, /search and /products in your robots.txt.
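A minimal robots.txt sketch along those lines (the paths /shop, /search and /products are just the examples from above; adjust them to your own URL structure):

User-agent: *
Disallow: /shop
Disallow: /search
Disallow: /products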

Disclaimer: I’ve checked essentialaids.com; Erol’s software does JavaScript redirects obfuscating the linked URLs to deliver the content client sided. I’ve followed this case over a few days, watching Google deindex the whole site page by page. This kind of redirect is considered “sneaky” by Google, and Google’s spam filters detect it automatically. Although there is no bad intent, Google bans all sites using this technique. Since this is a key feature of the software, how can they advertise it as “search engine friendly”? From their testimonials (most are affiliates) I’ve looked at irishmusicmail.com and found that Google has indexed only 250 pages from well over 800; it looks like the Erol shopping system was removed. The other non-affiliated testimonial is from heroesforkids.co.uk, a badly framed site which is also not viewable without JavaScript. Due to SE-unfriendliness Google has indexed only 50 out of 190 pages (deindexing the site a few days later). Another reference, brambleandwillow.com, didn’t load at all; Google has no references, but I found Erol-styled URLs in Yahoo’s index. Next, pensdirect.co.uk suffers from the same flawed architecture as heroesforkids.co.uk, although the pages/indexed-URLs ratio is slightly better (15 of 40+). From a quick look at the Erol JS source, all these pages will get removed from Google’s search index. I didn’t write this to slander Erol and its inventor Dreamteam UK; however, these guys would deserve it. It’s just a warning that good looking software, which might perfectly support all related business processes, can be extremely destructive from a SEO perspective.

Update: Probably it’s possible to make Erol driven shops compliant with Google’s quality guidelines by creating the pages without a software feature called “page loading messages”. More information is provided by several posts in Erol’s support forums.



