When it comes to cluelessness [silliness, idiocy, stupidity … you name it], you can’t beat CMS developers. You really can’t. There’s literally no way to kill search engine traffic that the average content management system (CMS) developer doesn’t implement. Poor publishers, probably you suffer from the top 10 issues on my shitlist. Sigh.
Imagine you’re the proud owner of a Web site that enables logged-in users customizing the look & feel and whatnot. Here’s how your CMS does the trick:
Unusable user interface
The user control panel offers a gazillion of settings that can overwrite each and every CSS property out there. To keep the user-cp pages lean and fast loading, the properties are spread over 550 pages with 10 attributes each, all with very comfortable Previous|Next-Page navigation. Even when the user has choosen a predefined template, the CMS saves each property in the
user table. Of course that’s necessary because the site admin could change a template in use.
Amateurish database design
Not only for this purpose each
user tuple comes with 512 mandatory attributes. Unfortunately, the underlaying database doesn’t handle tables with more than 512 columns, so the overflow gets stored in an array, using the large text column #512.
Since every database access is expensive, the login procedure creates a persistent cookie (today + 365 * 30) for each user property. Dynamic and user specific external CSS files as well as style-sheets served in the HEAD section could fail to apply, so all CMS scripts use a routine that converts the user settings into inline style directives like
style="color:red; text-align:bolder; text-decoration:none; ...". The developer consults the W3C CSS guidelines to make sure that not a single CSS property is left out.
Excessive query strings
Actually, not all user agents handle cookies properly. Especially cached pages clicked from SERPs load with a rather weird design. The same goes for standard compliant browsers. Seems to depend on the user agent string, so the developer adds a
if ($well_behaving_user_agent_string <> $HTTP_USER_AGENT) then [read the user record and add each property as GET variable to the URI’s querystring]) sanity check. Of course the
$well_behaving_user_agent_string variable gets populated with a constant containing the developer’s ancient IE user agent, and the GET inputs overwrite the values gathered from cookies.
Even more sanitizing
Some unhappy campers still claim that the CMS ignores some user properties, so the developer adds a routine that reads the
user table and populates all variables that previously were filled from GET inputs overwriting cookie inputs. All clients are happy now.
“Cached copy” links from SERPs still produce weird pages. The developer stumbles upon my blog and adds crawler detection. S/he creates a tuple for each known search engine crawler in the
user table of her/his local database and codes
if ($isSpider) then [select * from user where user.usrName = $spiderName, populating the current script's CSS property variables from the requesting crawler's user settings]. Testing the rendering with a user agent faker gives fine results: bug fixed. To make sure that all user agents get a nice page, the developer sets the output default to “printer”, which produces a printable page ignoring all user settings that assign
style="display:none;" to superfluous HTML elements.
Users are happy, they don’t spot the code bloat. But search engine crawlers do. They sneakily request a few pages as a crawler, and as a browser. Comparing the results they find the “poor” pages delivered to the feigned browser way too different from the “rich” pages serving as crawler fodder. The domain gets banned for poor-man’s-cloaking (as if cloaking in general could be a bad thing, but that’s a completely different story). The publisher spots decreasing search engine traffic and wonders why. No help avail from the CMS vendor. Must be unintentionally deceptive SEO copywritig or so. Crap. That’s self-banning by software design.
Ok, before you read on: get a calming tune.
How can I detect a shitty CMS?
Well, you can’t, at least not as a non-geeky publisher. Not really. Of course you can check the “cached copy” links from your SERPs all night long. If they show way too different results compared to your browser’s rendering you’re at risk. You can look at your browser’s address bar to check your URIs for query strings with overlength, and if you can’t find the end of the URI perhaps you’re toast, search engine wise. You can download tools to check a page’s cookies, then if there are more than 50 you’re potentially search-engine-dead. Probably you can’t do a code review yourself coz you can’t read source code natively, and your CMS vendor has delivered spaghetti code. Also, as a publisher, you can’t tell whether your crappy rankings depend on shitty code or on your skills as as a copywriter. When you ask your CMS vendor, usually the search engine algo is faulty (especially Google, Yahoo, Microsoft and Ask) but some exotic search engine from Togo or so sets the standards for state of the art search engine technology.
Last but not least, as a non-search-geek challenged by Web development techniques you won’t recognize most of the laughable –but very common– mistakes outlined above. Actually, most savvy developers will not be able to create a complete shitlist from my scenario. Also, there a tons of other common CMS issues that do resolve in different crawlability issues - each as bad as this one, or even worse.
Now what can you do? Well, my best advice is: don’t click on Google ads titled “CMS”, and don’t look at prices. The cheapest CMS will cost you the most at the end of the day. And if your budget exceeds a grand or two, then please hire an experienced search engine optimizer (SEO) or search savvy Web developer before you implement a CMS.
Share/bookmark this: del.icio.us • Google • ma.gnolia • Mixx • Netscape • reddit • Sphinn • Squidoo • StumbleUpon • Yahoo MyWeb
Subscribe to Entries Comments All Comments