MSN spam to continue says the Live Search Blog

It seems MSN/LiveSearch has tweaked their rogue bots and continues to spam innocent Web sites just in case they could cloak. I see a rant coming, but first the facts and news.

Since August 2007 MSN has been running a bogus bot that follows their crawler and fakes a human visitor coming from a search results page. This spambot downloads everything on a page: images and other objects, external CSS/JS files, and ad blocks, even rendering contextual advertising from Google and Yahoo. It fakes MSN SERP referrers, diluting the search term stats with generic and unrelated keywords. Webmasters running non-adult sites wondered why a database tutorial suddenly ranked for [oral sex] and why MSN sent visitors searching for [MILF pix] to a teenager’s diary. Webmasters assumed that MSN was after deceitful cloaking, and laughed out loud because their webspam detection method was that primitive and easy to fool.

Now MSN admits all their sins –except the launch of a porn affiliate program– and posted a vague excuse on their Webmaster Blog telling the world that they discovered the evil cloakers and their index is somewhat spam free now. Donna has chatted with the MSN spam team about their spambot and reports that blocking its IP addresses is a bad idea, even for sites that don’t cloak. Vanessa Fox summarized MSN’s poor man’s cloaking detection at Search Engine Land:

And one has to wonder how effective methods like this really are. Those savvy enough to cloak may be able to cloak for this new cloaker detection bot as well.

They say that they no longer spam sites that don’t cloak, but reverse this statement telling Donna

we need to be able to identify the legitimate and illegitimate content

and Vanessa

sites that are cloaking may continue to see some amount of traffic from this bot. This tool crawls sites throughout the web — both those that cloak and those that don’t — but those not found to be cloaking won’t continue to see traffic.

Here is an excerpt from yesterday’s referrer log of a site that does not cloak, and never did:
http://search.live.com/results.aspx?q=webmaster&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=smart&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=search&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=progress&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=google&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=google&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=domain&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=database&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=content&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=business&mrt=en-us&FORM=LIVSOP
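
If you want to see how much of this junk sits in your own logs, something like the following Python sketch might do. It assumes that the FORM=LIVSOP parameter seen in the excerpt above is a usable marker for the bot, which is only my guess from the log and nothing MSN has confirmed. The function names are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumption: host and FORM value are taken from the log excerpt above.
# Nothing official says that FORM=LIVSOP identifies the bot.
SUSPECT_HOST = "search.live.com"
SUSPECT_FORM = "LIVSOP"


def suspect_query(referrer):
    """Return the search term if the referrer looks like the pattern above, else None."""
    parts = urlparse(referrer)
    if parts.hostname != SUSPECT_HOST:
        return None
    params = parse_qs(parts.query)
    # parse_qs returns a list of values per parameter name
    if params.get("FORM", [""])[0] != SUSPECT_FORM:
        return None
    terms = params.get("q")
    return terms[0] if terms else None


def count_suspect_terms(referrers):
    """Count search terms across all referrers that match the suspect pattern."""
    terms = (suspect_query(r) for r in referrers)
    return Counter(t for t in terms if t is not None)
```

Feed it the referrer column of your access log. A pile of generic one-word terms that your pages have no reason to rank for is a hint, not proof, that the hits are artificial.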

Why can’t the MSN dudes tell the truth, not even when they apologize?

Another lie is “we obey robots.txt”. Of course the spambot doesn’t request the file itself, so that it can bypass bot traps, but according to MSN it uses a copy served to the LiveSearch crawler “msnbot”:

Yes, this robot does follow the robots.txt file. The reason you don’t see it download it, is that we use a fresh copy from our index. The tool does respect the robots.txt the same way that MSNBot does with a caveat; the tool behaves like a browser and some files that a crawler would ignore will be viewed just like real user would.

In reality, it doesn’t help to block CSS/JS files or images in robots.txt, because MSN’s spambot will download them anyway. The long winded statement above translates to “We promise to obey robots.txt, but if it fits our needs we’ll ignore it”.
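For comparison, here is what honoring such a block would look like. This is a minimal sketch using Python’s standard urllib.robotparser; the rules, paths, and example.com URLs are invented for illustration, and only the user agent name “msnbot” comes from MSN’s statement.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks stylesheet and script directories.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /css/
Disallow: /js/
"""


def compliant_may_fetch(robots_txt, user_agent, url):
    """Return True if a crawler that obeys robots_txt may request url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/css/style.css",
                "http://example.com/js/site.js"):
        print(url, compliant_may_fetch(EXAMPLE_RULES, "msnbot", url))
```

A crawler that obeys these rules would skip the CSS and JS files. If your logs show them being fetched from MSN’s addresses anyway, that is the caveat from the quote in action.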

Well, MSN is not the only search engine running stealthy bots to detect cloaking, but they aren’t clever enough to do it in a less abusive and detectable way.

Their insane spambot led all cloaking specialists out there straight to their not-so-obvious spam detection methods. They may have caught a few cloaking sites, but considering the short life cycle of Webspam on throwaway domains they shot themselves in both feet. What they have really achieved is that the cloaking scripts are immune to MSN’s spam detection now.

Was it really necessary to annoy and defraud the whole Webmaster community and to burn huge amounts of bandwidth just to catch a few cloakers who launched new scripts on new throwaway domains hours after the first appearance of the MSN spam bot?

Can cosmetic changes with regard to their useless spam activities restore MSN’s lost reputation? I doubt it. They’ve admitted their miserable failure five months too late. Instead of dumping the spambot, they announce that they’ll spam away for the foreseeable future. How silly is that? I thought Microsoft was somewhat profit-oriented. Why do they burn their money and ours on such amateurish projects?

Besides all this crap MSN has good news too. Microsoft Live Search told Search Engine Roundtable that they’ll spam our sites with keywords related to our content from now on, or at least they’ll try. And they have a forum and a contact form to gather complaints. Crap on, so much bureaucratic effort to administer their ridiculous spam fighting funeral. They’d better build a search engine that actually sends human traffic.




9 Comments to "MSN spam to continue says the Live Search Blog"

  1. SlightlyShadySEO on 5 December, 2007  #link

    Ugh. So now I get to update cloakers again. I’m starting to think “screw it”. I might just goatse all MSN traffic.

    But I’m also starting to think this isn’t so much a standard test, but more of a possible beta for a future(secondary) crawler that’s going to be around for a long, long time.

  2. Sebastian on 5 December, 2007  #link

    Look at IncrediBill’s piece The referrer spam from MSN is just a huge red herring, and this post from an ex-MS employee confirming that MSN does quality control with various user agents and tools from this IP range. At least log all requests from 131.107.0.* because most of them are bogus. If you don’t cloak for MSN, just block the IP range, the artificial traffic isn’t worth a dime. If they really delist you at Live Search because you’ve blocked their spambots, who cares, they don’t send enough human traffic to compensate the bot load.

  3. […] MSN spam to continue says the Live Search Blog, sebastians-pamphlets.com […]

  4. ryanol on 6 December, 2007  #link

    I read your post and couldn’t help myself.

    I had to create a cheap Shepard Fairey “obey” knock off.

    http://www.constantlycomplaining.com/blog/files/imagepicker/r/ryanol/obeymsn.gif

  5. david deangelo on 7 December, 2007  #link

    I actually don’t understand what MSN is trying to achieve out of all this.

  6. Tanner Christensen on 7 December, 2007  #link

    MSN has always had some wacky way of going about things. It’s no wonder MSN is ranking behind Google and Yahoo for search.

  7. Sebastian on 17 December, 2007  #link

    Today I found more fake referrers from MSN with a new “referrer URL” http://search.msn.com/results.aspx?q=index&scope=&first=0&FORM=PERE. There’s no reason my site could rank for [index], and I highly doubt that on a Sunday 32 MSN users search for “index” and land on a script URL that captures invalid URIs and redirects such requests to the best matching pages. Not that I did believe the lies that MSN will reduce the spam and that it will be better targeted …

  8. […] MSN LiveSearch is well known for spamming, it’s not a big surprise that they support hackers, scrapers and other content […]

  9. […] SE candidate is Bing’s spam bot that tries to manipulate stats on search engine usage. If you don’t approve such scams, block […]
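
Comment 2 above suggests logging all requests from 131.107.0.*. A minimal Python sketch of that check could look like this, assuming the wildcard means the /24 network 131.107.0.0/24. The function name is hypothetical, and whether you log or block on a match is up to you.

```python
import ipaddress

# Assumption: "131.107.0.*" from comment 2 read as a /24 network.
SUSPECT_NET = ipaddress.ip_network("131.107.0.0/24")


def is_suspect_ip(addr):
    """Return True if addr is a valid IP address inside the suspect range."""
    try:
        return ipaddress.ip_address(addr) in SUSPECT_NET
    except ValueError:
        # Not a parsable IP address, so it cannot match the range.
        return False
```

Call it with the client address of each request and write the matches to a separate log before deciding anything more drastic.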
