How to spam the hell out of Google’s new source attribution meta elements

The moment you read Google’s announcement and Matt’s question “What about spam?”, you concluded “spamming it is a breeze”, right? You’re not alone.

Before we discuss how to abuse it, it might be a good idea to define it within its context, ok?

Playground

First of all, Google announced these meta tags on the official Google News blog for a reason. So if you plan to abuse them with your countless MFA proxies of Yahoo Answers, you have most probably jumped on the wrong bandwagon. Google supports the meta elements below in Google News only.

syndication-source

The first new indexer hint is syndication-source. It’s meant to tell Google the permalink of a particular news story, hence the author and all the folks spreading the word are asked to use it to point to the one –and only one– URI considered the source:

<meta name="syndication-source" content="http://outerspace.com/news/ubercool-geeks-launched-google-hotpot.html" />

The meta element above is for instances of the story served from
http://outerspace.com/breaking/page1.html
http://outerspace.com/yyyy-mm-dd/page2.html
http://outerspace.com/news/aliens-appreciate-google-hotpot.html
http://outerspace.com/news/ubercool-geeks-launched-google-hotpot.html
http://newspaper.com/main/breaking.html
http://tabloid.tv/rehashed/from/rss/hot:alien-pot-in-your-bong.html

Don’t confuse it with the cross-domain rel-canonical link element. It’s not about canning duplicate content; it marks a particular story, regardless of whether it’s somewhat rewritten or just reprinted with a different headline. It tells Google News to use the original URI when the story can be crawled from different URIs on the author’s server, and when syndicated stories on other servers are so similar to the initial piece that Google News prefers to use the original (the latter is my educated guess).
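To make the “one –and only one– URI” rule tangible, here is a minimal Python sketch of how a crawler-side check might read the hint from a page. It is not Google’s implementation; the class and function names are made up for illustration, and treating conflicting values as “no usable hint” is my assumption.

```python
from html.parser import HTMLParser


class SyndicationSourceParser(HTMLParser):
    """Collects the content of every <meta name="syndication-source"> element."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags like <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name == "syndication-source" and attributes.get("content"):
            self.sources.append(attributes["content"])


def syndication_source(html):
    """Return the single declared source URI, or None if absent or conflicting.

    Assumption: a page declaring two different URIs gives no usable hint.
    """
    parser = SyndicationSourceParser()
    parser.feed(html)
    unique = set(parser.sources)
    return unique.pop() if len(unique) == 1 else None
```

Fed any of the six instances listed above, such a check should come back with the same outerspace.com permalink, provided each page carries the element.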

original-source

The second new indexer hint is original-source. It’s meant to tell Google the origin of the news itself, so the author/enterprise digging it out of the mud, as well as all the folks using it later on, are asked to declare who broke the story:

<meta name="original-source" content="http://outerspace.com/news/ubercool-geeks-launched-google-hotpot.html" />

Say we’ve got two or more related news stories, like “Google fell from Mars” by cnn.com and “Google landed in Mountain View” by sfgate.com. It then makes sense for latimes.com to publish a piece like “Google fell from Mars and landed in Mountain View”. Because latimes.com is a serious newspaper, they credit their sources not only with a mention or even embedded links, they do it in machine-readable form, too:

<meta name="original-source" content="http://cnn.com/google-fell-from-mars.html" />
<meta name="original-source" content="http://sfgate.com/google-landed-in-mountain-view.html" />

It’s a matter of course that both cnn.com and sfgate.com provide such an original-source meta element on their pages, in addition to the syndication-source meta element, both pointing to their very own coverage.
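For publishers who render their head section from a template, the rules so far boil down to “one syndication-source, any number of original-source elements”. A rough Python sketch of that, where the function name and the latimes.com URL are hypothetical and only the cnn.com and sfgate.com URLs come from the example above:

```python
from html import escape


def attribution_meta(syndication_source, original_sources):
    """Build the meta elements for one story page as a list of HTML strings."""
    lines = [
        '<meta name="syndication-source" content="%s" />'
        % escape(syndication_source, quote=True)
    ]
    for uri in original_sources:
        lines.append(
            '<meta name="original-source" content="%s" />' % escape(uri, quote=True)
        )
    return lines


# cnn.com broke its story, so both elements point to its very own coverage.
cnn_story = "http://cnn.com/google-fell-from-mars.html"
print("\n".join(attribution_meta(cnn_story, [cnn_story])))

# latimes.com credits both sources; the latimes.com URL is made up.
print("\n".join(attribution_meta(
    "http://latimes.com/google-fell-from-mars-and-landed.html",
    [cnn_story, "http://sfgate.com/google-landed-in-mountain-view.html"],
)))
```

Escaping the URIs matters as soon as a permalink carries a query string with an ampersand.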

If a journalist grabbed his breaking news from a secondary source saying “CNN reported five minutes ago that Google’s mothership started from Venus, and the LA Times spotted it crashing on Jupiter”, he can’t be bothered with looking at the markup and locating those meta elements in the head section; he has a deadline for his piece “Why Web search left Planet Earth”. It’s just fine with Google News when he credits the domains only:

<meta name="original-source" content="http://cnn.com/" />
<meta name="original-source" content="http://latimes.com/" />

Fine print

As always, the most interesting stuff is hidden on a help page:

At this time, Google News will not make any changes to article ranking based on this tags.

If we detect that a site is using these metatags inaccurately (e.g., only to promote their own content), we’ll reduce the importance we assign to their metatags. And, as always, we reserve the right to remove a site from Google News if, for example, we determine it to be spammy.

As with any other publisher-supplied metadata, we will be taking steps to ensure the integrity and reliability of this information.

It’s a field test

We think it is a promising method for detecting originality among a diverse set of news articles, but we won’t know for sure until we’ve seen a lot of data. By releasing this tag, we’re asking publishers to participate in an experiment that we hope will improve Google News and, ultimately, online journalism. […] Eventually, if we believe they prove useful, these tags will be incorporated among the many other signals that go into ranking and grouping articles in Google News. For now, syndication-source will only be used to distinguish among groups of duplicate identical articles, while original-source is only being studied and will not factor into ranking. [emphasis mine]

Spam potential

Well, we do know that Google Web search has a spam problem, IOW even a few so-1999 webspam tactics still work to some extent. So we tend to classify a vague threat like “If we find sites abusing these tags, we may […] remove [those] from Google News entirely” as FUD, and spam away. Common sense and experience tell us that a smart marketer will make money from everything spammable.

But: we’re not talking about Web search. Google News is a clearly laid out environment. There are only so many sites covered by Google News. Even if Google weren’t able to develop algos analyzing all source attribution meta elements out there, they do have the resources to identify abuse using manpower alone. Most probably they will do both.

They clearly told us that they will compare that metadata to other signals. And those aren’t only very weak indicators like “timestamp first crawled” or “first heard of via pubsubhubbub”. It’s not that hard to isolate a particular news story, gather each occurrence as well as the source mentions within, and arrange those on a timeline with clickable links for QC folks who most certainly will identify the actual source. Even a few spot tests daily will soon reveal the sites whose source attribution meta tags are questionable, or even spammy.
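Just to illustrate how little machinery such a spot test needs, here is a hypothetical Python sketch. Nothing in it reflects how Google actually works: the `Occurrence` record, its fields and the flagging rule are assumptions of mine. It sorts the crawled instances of one story by first-seen time and queues for human review every later instance that credits only itself.

```python
from dataclasses import dataclass
from datetime import datetime
from urllib.parse import urlsplit


@dataclass
class Occurrence:
    """One crawled instance of a story (all fields are illustrative)."""
    url: str
    first_seen: datetime
    original_sources: list


def host(uri):
    """Lower-cased host name without a leading 'www.'."""
    netloc = urlsplit(uri).netloc.lower()
    return netloc[4:] if netloc.startswith("www.") else netloc


def questionable_claims(occurrences):
    """Return URLs that credit only themselves although an earlier instance exists.

    The result is a review queue for humans, not a verdict: first_seen is
    a weak signal on its own.
    """
    timeline = sorted(occurrences, key=lambda occ: occ.first_seen)
    if not timeline:
        return []
    earliest_host = host(timeline[0].url)
    flagged = []
    for occ in timeline[1:]:
        claimed = {host(src) for src in occ.original_sources}
        if claimed == {host(occ.url)} and host(occ.url) != earliest_host:
            flagged.append(occ.url)
    return flagged
```

A tabloid that rehashes the outerspace.com story two hours later and names itself as the original source would land in that queue; a newspaper crediting outerspace.com would not.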

If you’re still not convinced, fair enough. Go spam away. Once you’ve lost your entry on the whitelist, your free traffic from Google News, as well as from news-one-box results on conventional SERPs, is toast.

Last but not least, a fair warning

Now, if you still want to use source attribution meta elements on your non-newsworthy MFA sites to claim ownership of your scraped content, feel free to do so. Most probably Matt’s team will appreciate just another “I’m spamming Google” signal.

Not that reprinting scraped content is considered shady any more: even a former president does it shamelessly. It’s just the almighty Google in all of its evilness that penalizes you for considering all on-line content public domain.



 

8 Comments to "How to spam the hell out of Google's new source attribution meta elements"

  1. SERPD on 17 November, 2010

    How to spam the hell out of Google’s new source attribution meta elements…

    Sebastian had his (as always) side door approach and thoughts on the new Google (News) meta tags. Fun ride as always….

  2. […] How to spam the hell out of Google’s new source attribution meta elements, sebastians-pamphlets.com […]

  3. Andy Beard on 18 November, 2010

    It would have made much more sense if they had added a way to just use a data attribute with cite or blockquote.

    That also then allows analysis at a block level.

    Can you get away with using these meta elements within the body? I know that might not validate, but it can work with redirects and have a purpose (such as firing a pixel)

  4. SEO Freak Show on 18 November, 2010

    I have a feeling we will see a ton of abuse of the new meta elements, followed by a whole bunch of crying newbies who don’t understand the levels of risk. Especially since the newest craze is so-called “gurus” promoting the hell outta get-it-while-it’s-hot “rank on page 1 in under 3 minutes” Google News products. The good news is that $497 is a little steep for most, but that’s what torrents are for.

  5. Sebastian on 18 November, 2010

    I don’t know whether the GoogleNews indexer will process misplaced meta elements or not. I wouldn’t recommend it, though. I assume that the News thingy can create a graph with page level metadata with ease.

    You can always use the CITE attribute in Q (…) elements to credit a particular quote, although as of today SE support of CITE is, well, not really existent. That’ll change.

  6. […] How to spam the hell out of Google’s new source attribution meta elements – A highly entertaining post from everyone’s favourite crab (Sebastian). He’s really becoming the king of the tongue in cheek posts lately. […]

  7. SEO Book.com on 1 December, 2010

    … Recently Google created a news source attribution tag. If it works, it might be a good idea. But (even outside of spam) there are ways it can backfire. …

  8. Warner Carter on 8 December, 2010

    I am wondering if this tag would help an original article on a website or a blog post. Would Google see it as spam if I used that meta for an original blog post I placed in article directories, guest posts or on a Squidoo Lens? It seems to me with this tag I could have a link while showing a page that has no anchor text or get a link from an otherwise nofollow placement, but might get slapped for doing so.

    [Nope, wouldn’t help. That’s for Google News only. Sebastian]
