Archived posts from the 'Webspam' Category

You can’t escape from Google-Jail when …

… you’ve boosted your business Web site’s rankings with shitloads of crappy links. The 11th SEO commandment: Don’t promote your white hat sites with black hat link building methods! It may work for a while, but once you find your butt in Google-jail, there’s no way out. Not even a reconsideration request can help because you can’t provide its prerequisites.

When you’re eventually caught –penalized for tons of stinky links– and have to file a reinclusion request, Google wants you to remove all the shady links you’ve spread across the Web before they lift the penalty. Here is an example, well documented in a Google Groups thread started by a penalized site owner, with official statements from Matt Cutts and John Müller of Google.

The site in question, a small family business from the UK, has used more or less every tactic from a lazy link builder’s textbook to create 40,000+ inbound links. Sponsored WordPress themes, paid links, comment spam, artificial link exchanges and whatnot.

Most sites that carry these links (for example porn galleries, Web designers, US city guides, obscure oriental blogs, job boards, or cat masturbation guides) are in no way related to the penalized site, which deals with modern teak garden furniture and home furniture sets. (Don’t get me wrong: of course not every link has to be topically related. Every link from a trusted page can pass PageRank, and can improve crawling, indexing, and so on.)

Google has absolutely no problem with unrelated links, unless a site’s link profile consists of way too many spammy and/or unrelated links. That does not mean that spreading a gazillion low-life links pointing to a competitor will get the competitor’s site penalized or even banned; negative SEO is not that simple. For an innocent site, Google just ignores spammy inbound links, but most probably flags the site for further investigation, both manual and algorithmic.

If on the other hand Google finds evidence that a site is actively involved in link monkey business of any kind, that’s a completely different story. Such evidence could be: massively linking out to spammy places, hosting reciprocal links pages or FFA directories, unskilled (manual|automated) comment spam, signature links and mentions at places that trade links, textual content written for (paid) link campaigns and reused too often, buying links from trackable services, (link request emails forwarded via) paid-link/spam reports, and so on.

Below is the “how to file a successful reconsideration request when your sins include link spam” advice from Googlers.

Matt Cutts:

The recommendation from your SEO guy led you directly into a pretty high-risk area; I doubt you really want pages like (NSAW) having sponsored links to your furniture site anyway. It’s definitely possible to extricate your site, but I would make an effort to contact the sites with your sponsored links and request that they remove the links, and then do a reconsideration request. Maybe in the text of your reconsideration request, I’d include a pointer to this thread as well.

John Müller:

You may want to consider what you can do to help clean up similar [=spammy] links on other people’s sites. Blogs and newspaper sites such as http://media.www.dailypennsylvanian.com sometimes receive short comments such as “dont agree”, apparently only for a link back to a site. These comments often use keywords from that site instead of a user name, perhaps “tree bench” for a furniture site or “sexy shoes” for a footwear site. If this kind of behavior might have taken place for your site, you may want to work on rectifying it and include some information on it in your reconsideration request. Given your situation, the person considering your reconsideration request might be curious about links like that.

Translation: We’ll ignore your weekly reconsideration requests unless you’ve removed all artificial links pointing to your site. You’re stuck in Google’s dungeon because they’ve thrown away the keys.

I’d guess that for a site whose reinclusion request admits involvement in some sort of link monkey business, Google applies a stricter policy than for a site that was attacked with negative SEO methods. When caught red-handed, a lame excuse like “I didn’t create those links” is not a tactic I could recommend, because Googlers hate it when an applicant lies in a reinclusion request.

Once caught and penalized, the “since when do inbound links count as negative votes” argument doesn’t apply. It’s quite clear that removing the traces (admitted as well as unadmitted shady links) is a prerequisite for a penalty lift, even though Google has already discounted these links. That’s the same as with penalized doorway pages: redirecting doorways to legit landing pages doesn’t count, Google wants to see a 410-Gone HTTP response code (or at least a 404) before they un-penalize a site.
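For illustration, here’s a minimal sketch of what “gone means gone” looks like at the HTTP level (plain Python standard library; the doorway URLs are made up, and in real life you’d return the 410 from your server config or CMS rather than a toy server):

from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical list of penalized doorway URLs to bury for good.
RETIRED_DOORWAYS = {"/cheap-widgets-doorway.html", "/buy-widgets-now.html"}

class DoorwayKiller(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in RETIRED_DOORWAYS:
            # 410-Gone tells crawlers the page is dead for good;
            # a 404 works too, but 410 is the stronger signal.
            self.send_error(410, "Gone")
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(b"<html><body>A legit page.</body></html>")

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), DoorwayKiller).serve_forever()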

I doubt that’s common knowledge to folks who promote their white hat sites with black hat methods. Getting links wiped out at places that didn’t check the intention of inserted links in the first place is a royal PITA; in other words, it’s impossible to get all shady links removed once you find your butt in Google-jail. That’s extremely uncomfortable for site owners who fell for questionable forum advice, or hired a promotional service (no, I don’t call such assclowns SEOs) that applied shady marketing methods without a clear written warning that those are extremely risky, fully explained to and signed by the client.

Maybe in some cases Google will un-penalize a great site although not all link spam was wiped out. However, the costs and efforts of preparing a successful reconsideration request are immense, not to mention the massive loss of traffic and income.

As Barry mentioned, the thread linked above might be interesting for folks keen on an official confirmation that Google -60 penalties exist. I’d say such SERP penalties (aka red & yellow cards) aren’t exactly new, and it makes no difference to which position a site penalized for guideline violations gets downranked. When I’ve lost a top spot for gaming Google, that’s kismet. I’m not interested in figuring out that 20k spammy links get me a -30 penalty, 40k shady links result in a -60 penalty, and 100k unnatural links qualify me for the famous -950 bashing (the numbers are made up, of course). If I were to spam, I’d just move on, because I’d have already launched enough other projects to compensate for the losses.

PS: While I was typing, Barry Schwartz posted his Google-Jail story at SE Roundtable.




Dealing with spamming content thieves / plagiarists (oylinki.com)

When it comes to crap like plagiarism you shouldn’t consider me a gentleman.

If assclowns like Veronica Domb steal my content and publish it along with likewise stolen comments on their blatantly spamming site oylinki.com, I’m somewhat upset.

Then, when I leave a polite note asking the thief Veronica Domb from EmeryVille to remove my stuff asap, see my comment marked as “in moderation”, and within 24 hours neither my content gets removed nor my comment gets published, I stay annoyed.

When I’m annoyed, I write blog posts like this one. I’m sure it will rank high enough for [Veronica Domb] when the assclown’s banker or taxman searches for her name. I’m sure it’ll be visible on any SERP that any other (potential) business partner pulls up at a major search engine.

Content Thieves Veronica Domb et al, P.O.BOX 99800, EmeryVille, 94662, CA are blatant spammers

Hey, outing content thieves is way more fun than filing boring DMCA complaints, and way more effective. Plagiarists do ego searches too, and from now on Veronica Domb from EmeryVille will find the footsteps of her criminal activities on the Web with each and every ego search. Isn’t that nice?

Not. Of course Veronica Domb is a pseudonym of Slade Kitchens, Jamil Akhtar, … However, some plagiarists and scam artists aren’t smart enough to hide their identity, so watch out.

Maybe I’ve done some companies a little favor, because they certainly don’t need to send out money sneakily “earned” with Web spam and criminal activities that violate the TOS of most affiliate programs.

AdBrite will love to cancel the account for these affiliate links:
http://ads.adbrite.com/mb/text_group.php?sid=448245&br=1&dk=736d616c6c20627573696e6573735f355f315f776562
http://www.adbrite.com/mb/commerce/purchase_form.php?opid=448245&afsid=1

Google’s webspam team, as well as other search engines, will most likely delist oylinki.com, which comes with 100% stolen text and links, and faked whois info as well.

Spamcop and the like will happily blacklist oylinki.com (IP: 66.199.174.80, cwh2.canadianwebhosting.com) because the assclown’s blog software sends out email spam masked as trackbacks.

If anybody is interested, here’s a track of the real “Veronica Domb” from Canada clicking the link to this post from her WP admin panel:
74.14.107.36 - - [21/Jan/2008:07:50:40 -0500] "GET /outing-plagiarist-2008-01-21/ HTTP/1.1" 200 9921 "http://oylinki.com/blog/wp-admin/edit-comments.php" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SU 3.005; .NET CLR 1.1.4322; InfoPath.1; Alexa Toolbar; .NET CLR 2.0.50727)"
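In case you want to do the same, here’s a quick sketch of the kind of script I’d use (Python; the log format is Apache “combined”, and the file name and referrer pattern are just examples):

import re

# Matches Apache "combined" log lines like the one quoted above.
LOG_LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
                      r'"(?P<request>[^"]*)" \d+ \d+ "(?P<referrer>[^"]*)"')

def hits_with_referrer(logfile, referrer_substring):
    """Yield (ip, time, request) for hits referred from a given place."""
    with open(logfile) as log:
        for line in log:
            match = LOG_LINE.match(line)
            if match and referrer_substring in match.group("referrer"):
                yield match.group("ip"), match.group("time"), match.group("request")

for ip, time, request in hits_with_referrer("access.log", "oylinki.com/blog/wp-admin"):
    print(ip, time, request)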

Common sense is not as common as you think.

Disclaimer: I’ve outed plagiarists in the past, because it works. Whether you do that on ego-SERPs or not depends on your ethics. Some folks think that’s even worse than theft and spamming. I say that publishing plagiarisms in the first place deserves bad publicity.




MSN spam to continue says the Live Search Blog

It seems MSN/LiveSearch has tweaked their rogue bots and continues to spam innocent Web sites just in case they could cloak. I see a rant coming, but first the facts and news.

Since August 2007 MSN has been running a bogus bot that fakes a human visitor coming from a search results page, following in their crawler’s footsteps. This spambot downloads everything from a page: images and other objects, external CSS/JS files, and ad blocks, even rendering contextual advertising from Google and Yahoo. It fakes MSN SERP referrers, diluting the search term stats with generic and unrelated keywords. Webmasters running non-adult sites wondered why a database tutorial suddenly ranks for [oral sex] and why MSN sends visitors searching for [MILF pix] to a teenager’s diary. Webmasters assumed that MSN was after deceitful cloaking, and laughed out loud because the webspam detection method was so primitive and easy to fool.

Now MSN admits all their sins –except the launch of a porn affiliate program– and has posted a vague excuse on their Webmaster Blog telling the world that they discovered the evil cloakers and their index is somewhat spam-free now. Donna has chatted with the MSN spam team about their spambot and reports that blocking its IP addresses is a bad idea, even for sites that don’t cloak. Vanessa Fox summarized MSN’s poor man’s cloaking detection at Search Engine Land:

And one has to wonder how effective methods like this really are. Those savvy enough to cloak may be able to cloak for this new cloaker detection bot as well.

They say that they no longer spam sites that don’t cloak, but reverse this statement telling Donna

we need to be able to identify the legitimate and illegitimate content

and Vanessa

sites that are cloaking may continue to see some amount of traffic from this bot. This tool crawls sites throughout the web — both those that cloak and those that don’t — but those not found to be cloaking won’t continue to see traffic.

Here is an excerpt from yesterday’s referrer log of a site that does not cloak, and never did (a rough way to flag this junk in your own stats follows the excerpt):
http://search.live.com/results.aspx?q=webmaster&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=smart&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=search&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=progress&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=google&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=google&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=domain&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=database&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=content&mrt=en-us&FORM=LIVSOP
http://search.live.com/results.aspx?q=business&mrt=en-us&FORM=LIVSOP
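For what it’s worth, here’s a rough sketch of how one might flag this junk in referrer stats (Python; the heuristic –lots of distinct one-word queries tagged FORM=LIVSOP hitting the same page– is mine, not an official fingerprint):

from collections import Counter
from urllib.parse import urlparse, parse_qs

def live_search_queries(referrers):
    """Extract the q= term from Live Search SERP referrers tagged FORM=LIVSOP."""
    for ref in referrers:
        parsed = urlparse(ref)
        if parsed.netloc == "search.live.com":
            params = parse_qs(parsed.query)
            if params.get("FORM") == ["LIVSOP"]:
                yield params.get("q", [""])[0]

# Feed in whatever your log parser extracts; these are from the excerpt above.
referrers = [
    "http://search.live.com/results.aspx?q=webmaster&mrt=en-us&FORM=LIVSOP",
    "http://search.live.com/results.aspx?q=smart&mrt=en-us&FORM=LIVSOP",
    "http://search.live.com/results.aspx?q=google&mrt=en-us&FORM=LIVSOP",
]

# Many distinct generic one-word terms in a short time frame smells like
# the bogus bot rather than human searchers.
print(Counter(live_search_queries(referrers)))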

Why can’t the MSN dudes tell the truth, not even when they apologize?

Another lie is “we obey robots.txt”. Of course the spambot doesn’t request it –that’s how it bypasses bot traps– but according to MSN it works from a copy served to the LiveSearch crawler “msnbot”:

Yes, this robot does follow the robots.txt file. The reason you don’t see it download it, is that we use a fresh copy from our index. The tool does respect the robots.txt the same way that MSNBot does with a caveat; the tool behaves like a browser and some files that a crawler would ignore will be viewed just like real user would.

In reality, it doesn’t help to block CSS/JS files or images in robots.txt, because MSN’s spambot will download them anyway. The long-winded statement above translates to “We promise to obey robots.txt, but if it fits our needs we’ll ignore it”.
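Just to show how low the bar is, here’s what actually obeying robots.txt boils down to (Python standard library; host and paths are placeholders):

from urllib.robotparser import RobotFileParser

# Fetch the live robots.txt (MSN claims to use a cached copy instead).
robots = RobotFileParser("http://example.com/robots.txt")
robots.read()

# Check every URL -- including images and CSS/JS files -- before fetching it.
for path in ("/", "/images/logo.png", "/private/bot-trap.html"):
    if robots.can_fetch("msnbot", "http://example.com" + path):
        print("may fetch:", path)
    else:
        print("disallowed, skip:", path)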

Well, MSN is not the only search engine running stealthy bots to detect cloaking, but they weren’t clever enough to do it in a less abusive and less detectable way.

Their insane spambot pointed every cloaking specialist out there to their not-so-obvious spam detection methods. They may have caught a few cloaking sites, but considering the short life cycle of Webspam on throwaway domains, they shot themselves in both feet. What they’ve really achieved is that the cloaking scripts are now immune to MSN’s spam detection.

Was it really necessary to annoy and defraud the whole Webmaster community and to burn huge amounts of bandwidth just to catch a few cloakers who launched new scripts on new throwaway domains hours after the first appearance of the MSN spam bot?

Can cosmetic changes to their useless spam activities restore MSN’s lost reputation? I doubt it. They’ve admitted their miserable failure five months too late. Instead of dumping the spambot, they announce that they’ll spam away for the foreseeable future. How silly is that? I thought Microsoft was somewhat profit orientated; why do they burn their money, and ours, with such amateurish projects?

Besides all this crap, MSN has good news too. Microsoft Live Search told Search Engine Roundtable that from now on they’ll spam our sites with keywords related to our content, or at least they’ll try. And they have a forum and a contact form to gather complaints. Crap on: so much bureaucratic effort to administer their ridiculous spam-fighting funeral. They’d better build a search engine that actually sends human traffic.




Buying cheap viagra algorithmically

Since Google can’t manage to clean up [Buy cheap viagra], let’s do it ourselves. Go seek a somewhat trusted search blog mentioning “buy cheap viagra” somewhere in the archives and link to the post with a slightly diversified anchor text like “how to buy cheap viagra online”. Matt deserves a #1 spot by the way, so spread many links …

Then, when Matt is annoyed enough and Google has kicked the unrelated stuff out of this search, hopefully my viagra spam will rank as deserved again ;)

Update a few hours later: Matt ranks #1 for [buy cheap viagra algorithmically]:
(Screenshot: Matt Cutts at #1 for [buy cheap viagra algorithmically])
His ranking for [buy cheap viagra] fell about 10 positions to #17, but for [buy cheap viagra online] he’s still on the first SERP, now at position #10 (#3 yesterday). Interesting. It seems that Google’s newish turbo blog indexing influences the rankings of pages linked from blog posts rather quickly, but the effect isn’t exactly long lasting.

Related posts:
Negative SEO At Work: Buying Cheap Viagra From Google’s Very Own Matt Cutts - Unless You Prefer Reddit? Or Topix? by Fantomaster
Trust + keywords + link = Good ranking (or: How Matt Cutts got ranked for “Buy Cheap Viagra”) by Wiep




Danny Sullivan did not strip for Matt Cutts

Nope, this is not recycled news. I’m not referring to Matt asking Danny to strip off his business suit, although the video is really funny. I want to comment on something Matt didn’t say recently, but promised to do soon (again).

Danny Sullivan stripped perfectly legit code from Search Engine Land because he was accused of being a spammer, although the CSS code in question is in no way deceitful.

StandardZilla slams poor Tamar, who just reported a WebProWorld thread, but does an excellent job of explaining why image replacement is not search engine spam but a sound thing to do. Google’s recently updated guidelines need to state more clearly that optimizing for particular user agents is not considered deceitful cloaking per se. That would save Danny from stripping (code), not for Matt or Google but for lurid assclowns producing canards.




Google enhances the quality guidelines

Maybe today’s update of Google’s quality guidelines is the first phase of the Webmaster help system revamp project. I know there’s more to come; Google has great plans for the help center. So don’t miss out on the opportunity to tell Google’s Webmaster Central team what you’d like to have added or changed. Only 14 replies to this call for input is a sorry showing; shame on the Webmaster community.

I haven’t had the time to write a full-blown review of the updates, so here are just a few remarks from a Webmaster’s perspective. Scroll down to “Quality guidelines - specific guidelines” to view the updates, that is, click the links to the new (sometimes overlapping) detail pages.

As always, the guidelines outline best practices of Web development, refer to common sense, and don’t encourage over-interpretation (not that over-interpretations are avoidable, or utterly useless). That they now provide Webmasters with more explanatory directives, detailed definitions, and even examples in the “Don’ts” section is very much appreciated. Look at the first version of this document, now over five years old, before you bitch ;)

Avoid hidden text or hidden links
The new help page on hidden text and links is descriptive and comes with examples, well done. What I miss is a hint with regard to CSS menus and other content which stays hidden until the user performs a particular action. Google states “Text (such as excessive keywords) can be hidden in several ways, including […] Using CSS to hide text”. The same goes for links, by the way. I wish they would add something along the lines of “… Using CSS to hide text in a way that a user can’t reveal it by a common action, like moving the mouse pointer over a trigger element, or clicking a text link or descriptive widget or icon”. The hint at the bottom “If you do find hidden text or links on your site, either remove them or, if they are relevant for your site’s visitors, make them easily viewable” comes close to this but lacks an example.

Susan Moskwa from Google clarifies what one can hide with CSS, and what sorts of CSS-hidden stuff are considered a violation of the guidelines, in the Google forum on June 11, 2007:

If your intent in hiding text is to deceive the search engines, we frown on that; if your intent is purely to improve the visual user experience (e.g. by replacing some text with a fancier image of that same text), you don’t need to worry. Of course, as with many techniques, there are shades of gray between “this is clearly deceptive and wrong” and “this is perfectly acceptable”. Matt [Cutts] did say that hiding text moves you a step further towards the gray area. But if you’re running a perfectly legitimate site, you don’t need to worry about it. If, on the other hand, your site already exhibits a bunch of other semi-shady techniques, hidden text starts to look like one more item on that list. […] As the Guidelines say, focus on intent. If you’re using CSS techniques purely to improve your users’ experience and/or accessibility, you shouldn’t need to worry. One good way to keep it on the up-and-up (if you’re replacing text w/ images) is to make sure the text you’re hiding is being replaced by an image with the exact same text.
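As a toy illustration of why the “focus on intent” caveat matters: any mechanical check, like the naive sketch below (Python; the markup is made up), flags legit image replacement and screen-reader labels right along with keyword stuffing, so a machine alone can’t render the verdict:

import re

# Naively flag any text hidden with inline display:none.
HIDDEN_BLOCK = re.compile(
    r'<(?P<tag>\w+)[^>]*style="[^"]*display:\s*none[^"]*"[^>]*>(?P<text>.*?)</(?P=tag)>',
    re.IGNORECASE | re.DOTALL)

def hidden_text_snippets(html):
    for match in HIDDEN_BLOCK.finditer(html):
        text = match.group("text").strip()
        if text:
            yield text  # keyword stuffing -- or a perfectly legit accessibility aid

page = '<div style="display:none">cheap viagra cheap viagra cheap viagra</div>'
print(list(hidden_text_snippets(page)))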

Don’t use cloaking or sneaky redirects
This sentence should be pinned 5 pixels below the heading, in bold red blinking uppercase letters: “When examining […] your site to ensure your site adheres to our guidelines, consider the intent” (emphasis mine). There are so many perfectly legit ways to handle content presentation that it’s impossible to map particular techniques to good versus bad intent, or vice versa.

I think this page invites misinterpretation. The major point of confusion is that Google argues entirely from a search engine’s perspective and doesn’t write for the target audience, that is, Webmasters and Web developers. Instead of all the talk about users vs. search engines, it should distinguish plain user agents (crawlers, text browsers, JavaScript disabled …) from enhanced user agents (JS/AJAX enabled, installed and activated plug-ins …). Don’t get me wrong, this page gives the right advice, but the good advice is somewhat obfuscated in phrases like “Rather, you should consider visitors to your site who are unable to view these elements as well”.

For example, “Serving a page of HTML text to search engines, while showing a page of images or Flash to users [is considered deceptive cloaking]” puts down a gazillion legit sites which serve the same content in different formats (and often under different URLs) depending on the ability of the current user agent to render particular stuff like Flash, and a bazillion perfectly legit AJAX-driven sites which provide crawlers and text browsers with a somewhat static structure of HTML pages, too.

“Serving different content to search engines than to users [is considered deceptive cloaking]” puts it better, because in reverse that reads “Feel free to serve identical content under different URLs and in different formats to users and search engines. Just make sure that you accurately detect the capabilities of the user agent before you decide to alter a requested plain HTML page into a fancy conglomerate of flashing widgets with sound and other good vibrations, respectively vice versa”.
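Here’s a minimal sketch of that advice in code (Python; all names and the crude capability test are illustrative): decide on the user agent’s capabilities, not on whether it’s a search engine, and serve the same content either way:

# Treat crawlers like any other plain user agent: same content, plain format.
PLAIN_AGENT_HINTS = ("googlebot", "msnbot", "slurp", "lynx")

def wants_plain_html(user_agent, accepts_flash):
    """Crawlers and text browsers get the static HTML version -- not because
    they're search engines, but because they can't render the fancy one."""
    agent = user_agent.lower()
    return any(hint in agent for hint in PLAIN_AGENT_HINTS) or not accepts_flash

def serve(user_agent, accepts_flash):
    if wants_plain_html(user_agent, accepts_flash):
        return render_static_html()   # identical content, plain HTML structure
    return render_flash_version()     # identical content, fancy presentation

def render_static_html():
    return "<html><body><h1>Teak garden furniture</h1>…</body></html>"

def render_flash_version():
    return "<html><body><object data='furniture.swf'>…</object></body></html>"

print(serve("msnbot/1.0", accepts_flash=False))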

Don’t send automated queries to Google
This page doesn’t provide much more information than the paragraph on the main page, but there’s not that much to explain: don’t use WebPosition Gold™. Period.

Don’t load pages with irrelevant keywords
Tells why keyword stuffing is not a bright idea, nothing to note.

Don’t create multiple pages, subdomains, or domains with substantially duplicate content
This detail page is a must-read. It starts with a to-the-point definition, “Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar”, followed by a ton of good tips and valuable information. And fortunately it makes clear that there’s no such thing as a general duplicate content penalty.

Don’t create pages that install viruses, trojans, or other badware
Describes Google’s service in partnership with StopBADware.org, highlighting the quickest procedure to get Google’s malware warning removed.

Avoid “doorway” pages created just for search engines, or other “cookie cutter” approaches such as affiliate programs with little or no original content
The info on doorway pages is just a paragraph on the “cloaking and sneaky redirects” page. I miss a few tips on how one can identify unintentional doorway pages created by plain bad design, without any deceptive intent. Also, I think a few sentences on thin SERP-like pages would be helpful in this context.

“Little or no original content” targets thin affiliate sites, again doorway pages, auto-generated content, and scraped content. It becomes clear that Google does not love MFA sites.

If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first
The link points to the “Little or no original content” page mentioned above.


“Buying links in order to improve a site’s ranking is in violation of Google’s webmaster guidelines and can negatively impact a site’s ranking in search results. […] Google works hard to ensure that it fully discounts links intended to manipulate search engine results, such as link exchanges and purchased links.”

Basically that means: if you purchase a link, make dead sure it’s castrated (nofollow’ed, or otherwise unable to pass reputation), or Google will take away the ability to pass link love from the page (or even site) that links out for green. Or don’t get caught, respectively denounced by competitors (I doubt that’s a surefire tactic for the average Webmaster).
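A minimal sketch of the castration job (Python; the regex approach is quick and dirty, a real HTML parser is safer, and the paid domains are hypothetical):

import re

# Hypothetical destinations you sold links to.
PAID_DOMAINS = ("example-advertiser.com", "another-sponsor.net")

def nofollow_paid_links(html):
    """Add rel="nofollow" to anchors pointing at known paid-link destinations."""
    def fix(match):
        tag = match.group(0)
        if any(domain in match.group("href") for domain in PAID_DOMAINS) \
                and "nofollow" not in tag:
            return tag[:-1] + ' rel="nofollow">'
        return tag
    return re.sub(r'<a\s[^>]*href="(?P<href>[^"]+)"[^>]*>', fix, html)

print(nofollow_paid_links('<a href="http://example-advertiser.com/">sponsor</a>'))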

Note that in the second sentence quoted above Google states officially that link exchanges for the sole purpose of manipulating search engines are a waste of time and resources. That means reciprocal links of particular types nullify each other, and site links might have lost their power too. <speculation>Google may find it funny to increase the toolbar PageRank of pages involved in all sorts of link swap campaigns, but the real PageRank will remain untouched.</speculation>

There’s much confusion with regard to “paid link penalties”. To the best of my knowledge the link’s destination will not be penalized, but the paid link(s) will not (or no longer) increase its reputation, so if the link’s intention gets reported or discovered ex post, its rankings may suffer. Penalizing the link buyer would not make much sense, and Googlers are known as pragmatic folks, hence I doubt there is such a penalty. <speculation>Possibly Google has a flag applied to known link purchasers (sites as well as webmasters), which –if it exists– might result in more scrupulous judgements of other optimization techniques.</speculation>

What I really like is that the Googlers in charge honestly tried to write for their audience, that is Webmasters and Web developers, not (only) search geeks. Hence the news is: Google really cares. Since the revamp is a funded project, I guess the few paragraphs where the guidelines are still mysterious (to the great unwashed), or even potentially misleading, will get an update soon. I can’t wait for the next phase of this project.

Vanessa Fox creates buzz at SMX today, so I’ll update this post when (if?) she blogs about the updates later on (update: Vanessa’s post). Perhaps Matt Cutts will comment on the updated quality guidelines at the SMX conference today; look for Barry’s writeup at Search Engine Land, and for coverage of the SMX Penalty Box Summit at SEO Roundtable as well as the Bruce Clay blog. Marketing Pilgrim covered this session too. This post at Search Engine Journal provides related info, and more quotes from Matt. Just one SMX tidbit: according to Matt, they’re going to change the name of the re-inclusion request to something like a reconsideration request.



