Spamming wannabe SEOs: SEO Affiliates, Cleveland, Ohio

Got this email from a site owner today:

I can put your site at the top of a search engines listings. This is no joke and I can show proven results from all our past clients.
If this is something you might be interested in, send me a reply with the web addresses you want to promote and the best way to contact you with some options.

Thanks in advance,

Sarah Lohman
SEO Affiliates
124 Middle Ave.
Cleveland, OH 44035

I told him long ago to dump that site if its affiliate earnings from MSN and Yahoo traffic drop too much, and now some spamming assclowns from Cleveland, Ohio, guarantee a #1 listing for that crap. Sheesh. Here’s the polite reply:

I’m sure you’re joking, because an SEO as experienced as you should have noticed that this site has suffered from Google’s death penalty, for various good reasons, since 2002.
Go figure, spammer.

Sigh. Off to submit a ton of my email addresses to the spammer’s idiot collector.


Why eCommerce Systems Suck at SEO

Listening to whiners and disappointed site owners across the boards, I guess in a few weeks we’ll be discussing Google’s brand-new e-commerce penalties in flavors like -30, -900 and -supphell. NOT! A recent algo tweak may have figured out how to identify more crap, but I doubt Google has launched an anti-eCommerce campaign.

One doesn’t need an award-winning mid-range e-commerce shopping cart like Erol to earn the Google death penalty. Thanks to this award-winning software, sold as “search engine friendly” on its home page, and its crappy architecture (sneaky JS redirects as per Google’s Webmaster guidelines), many innocent shopping sites from Erol’s client list have vanished or will be deindexed soon. Unbelievable when you read more about their so-called SEO Services. Oh well, so much for an actual example. The following comments do not address Erol shopping carts, but e-commerce systems in general.

My usual question when asked to optimize eCommerce sites is: “Are you willing to dump everything except the core shopping cart module?” Unfortunately, that’s both the best and the cheapest solution in most cases. The technical crux with eCommerce software is that it’s developed by programmers, not Web developers, and software shops don’t bother asking for SEO advice. The result is often fancy crap.

Another common problem is that the UI is optimized for shoppers (a subclass of “surfers”, the latter being decently emulated by search engine crawlers). Navigation is mostly shortcut- and search-driven (POST-generated results aren’t crawlable) and relies on variables stored in cookies and wherever else (invisible to spiders). All the navigational goodies that make up the surfing experience are implemented with client-side technologies or, if implemented server-side, served via ugly URLs with nasty session IDs (ignored by crawlers, or at least heavily downranked for various reasons). What’s left for the engines? Deep hierarchical structures of thin pages plastered with duplicated text and buy-now links. That’s not the sort of spider food Ms. Googlebot and her colleagues love to eat.
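To illustrate the first point, here’s a stripped-down sketch of a hypothetical shop (made-up paths, not any particular cart): crawlers don’t submit forms, so products reachable only through the search box stay invisible, while a plain link to a static URL is spider food.

<!-- POST-driven product search: Googlebot won't fill this in,
     so results pages reachable only this way never get crawled -->
<form method="post" action="/shop/search">
  <input type="text" name="q" />
  <input type="submit" value="Find products" />
</form>
<!-- A plain link to a static URL, on the other hand, gets crawled -->
<a href="/garden-furniture/teak-benches.html">Teak benches</a>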

Guess why Google doesn’t crawl search results: because search results are inedible spider fodder not worth indexing. The same goes for badly linked conglomerates of thin product pages. Think of a different approach. Instead of trying to shove thin product pages into search indexes, write informative pages on product lines/groups/… and link to the product pages within the text. When these well-linked info pages provide enough product details, they’ll rank for product-related search queries. And you’ll generate link-worthy content. Don’t forget to disallow /shop, /search and /products in your robots.txt.
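Assuming your cart really does live under those paths (adjust to your own URL structure), the corresponding robots.txt entries would look like this:

User-agent: *
Disallow: /shop
Disallow: /search
Disallow: /products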

Disclaimer: I’ve checked essentialaids.com; Erol’s software does JavaScript redirects obfuscating the linked URLs to deliver the content client-side. I’ve followed this case over a few days, watching Google deindex the whole site page by page. This kind of redirect is considered “sneaky” by Google, and Google’s spam filters detect it automatically. Although there is no bad intent, Google bans all sites using this technique. Since this is a key feature of the software, how can they advertise it as “search engine friendly”?

From their testimonials (most are affiliates) I’ve looked at irishmusicmail.com and found that Google has indexed only 250 pages out of well over 800; it looks like the Erol shopping system was removed. The other non-affiliated testimonial is from heroesforkids.co.uk, a badly framed site which is also not viewable without JavaScript. Due to its SE-unfriendliness Google has indexed only 50 out of 190 pages (and deindexed the site a few days later). Another reference, brambleandwillow.com, didn’t load at all; Google has no references, but I found Erol-styled URLs in Yahoo’s index. Next, pensdirect.co.uk suffers from the same flawed architecture as heroesforkids.co.uk, although the pages/indexed-URLs ratio is slightly better (15 of 40+). From a quick look at the Erol JS source, all pages will get removed from Google’s search index.

I didn’t write this to slander Erol and its inventor Dreamteam UK, though these guys would deserve it. It’s just a warning that good-looking software which might perfectly support all related business processes can be extremely destructive from an SEO perspective.
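For readers wondering what such a redirect looks like, here is a purely illustrative sketch of the general pattern (made-up file name, not Erol’s actual code): the crawler indexes the static stub, while visitors get bounced client-side to a different URL, which is exactly what Google classifies as a sneaky redirect.

<!-- static stub page the crawler sees -->
<script type="text/javascript">
  // visitors with JS enabled never see this page's content,
  // they get redirected to the 'real' product page instead
  window.location.replace('x123-product-page.html');
</script>
<noscript>Please enable JavaScript to view this page.</noscript>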

Update: It’s probably possible to make Erol-driven shops compliant with Google’s quality guidelines by creating the pages without a software feature called “page loading messages”. More information is provided in several posts in Erol’s support forums.


Q: Does Googlebot obey ftp://ftp.example.com/robots.txt?

Google states:

Search engine robots, including our very own Googlebot, are incredibly polite. They work hard to respect your every wish regarding what pages they should and should not crawl.

A site owner posted logs showing Googlebot fetching files that are disallowed in robots.txt.

Anyone?




Google Webmasters Help FAQ launched

John Web has launched the Unofficial Google Webmaster FAQ; stop by and contribute your knowledge.

This FAQ was born in a Google Groups thread where regular posters discussed the handling of redundant questions in Google’s Webmaster Help Center.


Blogger.com Hates Digg

Ever tried to add a Digg button to a Blogspot blog? You get a thoughtless error message like “HTML tag <script type=’text/javascript’> is not allowed”. Since when is a piece of JavaScript code an HTML tag? BS. <polemic mode>And why does Blogger.com allow JavaScript code made by AdSense?

Conclusion: Blogger.com hates digg.com but loves Google’s AdSense.

Did I hear Googlers encouraging link building tactics like link baiting and social bookmarking? Perhaps I just misunderstood those messages … </polemic mode>

Here is the code rejected by Google:

<script type="text/javascript">digg_url = 'http://sebastianx.blogspot.com/2007/02/why-proper-error-handling-is-important.html';</script>
<script src="http://digg.com/tools/diggthis.js" type="text/javascript"></script>


Ok, I admit that Google censors, er, filters blog posts in the best interest of noobs who don’t know what a piece of JS code can achieve. However, I’m not an ignorant noob and I do want to make use of JavaScript in my blog posts.

Hey Google, please listen to my rant! Thanks.

PS: Dear Google, please don’t tell me that I can use JS within the template. I don’t think every post of mine is worth a digg button and I’m too lazy to code the conditions.




Does Adam Lasnik like Rel=Nofollow or not?

Spotting the headline “Google’s Lasnik Wishes ‘NoFollow Didn’t Exist’” I was quite astonished. My first thought was “logic can’t explain such a reversal”. And it turned out to be something of a blog hoax.

Adam’s “I wish nofollow didn’t exist” put back in context clarifies Google’s position:

“My core point […] was that it’d be really nice if nofollow wasn’t necessary. As it stands, it’s an admittedly imperfect yet important indicator that helps maintain the quality of the Web for users.

It’d be nice if there was less confusion about what nofollow does and when it’s useful. It’d be great if we could return to a more innocent time when practically all links to other sites really WERE true votes, folks clearly vouching for a site on behalf of their users.

But we don’t live in perfect, innocent times, and at Google we’re dedicated to doing what it takes to improve the signal-to-noise ratio in search quality.”

I like the “admittedly imperfect” piece ;)


Google pulls CIA data

Bollocks devoted to Daniel Brandt

Playing with Google’s 71 new search keywords, I noticed that many search queries get answered with CIA data. Try “national holiday” plus Canada, Germany, France, Italy and so on; in all cases you get directed to the CIA Factbook.




How Google’s Web Spam Team finds your link scheme

Natural Search Blog has a nice piece reporting that Matt’s team makes use of a proprietary tool to identify webspam trying to manipulate Google’s PageRank.

Ever wondered why Google catches the scams of PR-boosting services in no time?




Why proper error handling is important

Misconfigured servers can prevent search engines from crawling and indexing a site. I admit that’s yesterday’s news. However, standard setups and code copied from low-quality resources are underestimated (but very popular) points of failure. According to Google, a missing robots.txt file in combination with amateurish error handling can result in invisibility on Google’s SERPs. That’s a very common setup, by the way.

Googler Jonathon Simon said:

This way [correct setup] when the Google crawler or other search engine checks for a robots.txt file, they get a 200 response if the file is found and a 404 response if it is not found. If they get a 200 response for both cases then it is ambiguous if your site has blocked search engines or not, reducing the likelihood your site will be fully crawled and indexed.

That’s a very carefully written warning, so let me rephrase the message between the lines:

If you have no robots.txt and your server responds “Ok” (or with a 302 on the request for robots.txt followed by a 200 response for the error page) when Googlebot tries to fetch it, Googlebot might not be willing to crawl your stuff any further, hence your pages will not make it into Google’s search index.

If you don’t suffer from IIS (Windows hosting is a horrible nightmare coming with more pitfalls than there are countable objects in the universe: go find a reliable host), here is a bullet-proof setup.

If you don’t have a robots.txt file yet, create one and upload it today:

User-agent: *
Disallow:

This tells crawlers that your whole domain is spiderable. If you want to exclude particular pages, file-types or areas of your site, refer to the robots.txt manual.
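For instance, a few illustrative exclusions might look like this (the paths are made up, so adjust them to your own structure; the wildcard syntax in the last line is understood by Google but not by every crawler):

User-agent: *
Disallow: /admin/
Disallow: /print-me.html
Disallow: /*.pdf$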

Next, look at the .htaccess file in your server’s Web root directory. If your FTP client doesn’t show it, add “-a” to “external mask” in the settings and reconnect. If you find complete URLs in lines starting with “ErrorDocument”, your error handling is screwed up. What happens is that your server does a soft redirect to the given URL, which probably responds with “200 Ok”, and the actual error code gets lost in cyberspace. Sending 401 errors to absolute URLs will slow your server down to the performance of a single IBM XT hosting Google.com, and all other error directives pointing to absolute URLs result in crap. Here is a well-formed .htaccess sample:

ErrorDocument 401 /get-the-fuck-outta-here.html
ErrorDocument 403 /get-the-fudge-outta-here.html
ErrorDocument 404 /404-not-found.html
ErrorDocument 410 /410-gone-forever.html
Options -Indexes
<Files ".ht*">
deny from all
</Files>
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.canonical-server-name\.com [NC]
RewriteRule (.*) http://www.canonical-server-name.com/$1 [R=301,L]
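
For contrast, here’s the kind of broken line described above (example.com stands in for your domain): the absolute URL makes Apache send an external 302 redirect, so the client ends up with a 302/200 instead of the 404.

ErrorDocument 404 http://www.example.com/404-not-found.html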

With “ErrorDocument” directives you can capture other clumsiness as well, for example 500 errors with /server-too-buzzy.html or so. Or make the error handling comfortable using /error.php?errno=[insert err#]. In any case, avoid relative URLs (src attributes in IMG elements, CSS/feed links, href attributes of A elements …) on all error landing pages. You can test actual HTTP response codes with online header checkers.
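A quick sketch of both variants (the file names are just the examples from the paragraph above, and the error.php script is hypothetical):

ErrorDocument 500 /server-too-buzzy.html
# or route several status codes to one (hypothetical) handler:
ErrorDocument 500 /error.php?errno=500
ErrorDocument 503 /error.php?errno=503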

The other statements above do a few more neat things: Options -Indexes disables directory browsing, the <Files> block makes sure that nobody can read your server directives, and the last three lines redirect requests for invalid server names to your canonical server address.

.htaccess is a plain ASCII file; it can get screwed up when you upload it in binary mode or edit it with a word processor. Best edit it with an ASCII/ANSI editor (vi, notepad) as htaccess.txt on your local machine (most FTP clients choose ASCII mode for text files) and rename it to “.htaccess” on the server. Keep in mind that file names are case-sensitive.


Getting Help and Answers from Google

For webmasters and publishers who don’t have Googlers on their IM buddy list or in their email address book, Google has opened a communication channel for the masses. Google’s Webmaster Blog is open for webmaster comments, and Googlers answer crawling and indexing related questions in Google’s Webmaster Help Central. Due to the disadvantages of snowboarding, participation of Googlers in the forum has slowed down a bit lately, but as far as I can tell things are going to change for the better.

As great as all these honest efforts to communicate with webmasters are, large user groups come with disadvantages like trolling and more noise than signal. So I’ve tried to find ways to make Google’s Webmaster Forums more useful. Since the Google Groups platform doesn’t offer RSS feeds for search results, I tried to track particular topics and authors with Google’s blog search instead. This experiment turned out to be a miserable failure.

Tracking discussions via web search is way too slow, because time to index runs to a couple of days, not minutes or hours as with blog search or news search. The RSS feeds provided contain all the noise and trolling I don’t want to see, and they don’t even come with useful author tags, so I needed a simple and stupid procedure to filter RSS feeds with Google Reader. I thought I’d use Yahoo Pipes to create the filters, and this worked just fine as long as I viewed the RSS output as source code or formatted by Yahoo. Seems today is my miserable-failure day: Google Reader told me my famous piped feeds contain zero items, no title, nor any of the neat stuff I’d seen seconds ago in the feed’s source. Aaaahhhrrrrgggg … I’m going back to tracking threads (missing lots of valuable posts due to senseless thread titles or topic changes within threads) and profiles, for example Adam Lasnik (Google’s Search Evangelist), John Mueller (Softplus), Jonathan Simon (Google), Maile Ohye (Google), Thu Tu (Google) and Vanessa Fox (Google).

Google is awesome, not perfect but still awesome. Seems my intention (constructive criticism) got obscured by my sometimes weird sense of humor and my preference for snarky irony and exaggeration to bring a point home.

Update, July 5, 2007: Google has fixed the broken RSS feeds.

