Archived posts from the 'Social Web' Category

Bizarre facets of war in social media

As a matter of fact, wars happen in social media, too. I don't mean flame wars. I don't refer to Arab dictators who, closely following the #ArabTyrantManual, during uprisings shut down Facebook, Twitter, or even the whole friggin Interwebs. I admit, those scumbags are somewhat creative. For example Syria's junior dictator Bashar al-Assad, who launched a huge number of hashtag spambots diluting every piece of information leaking out from cyber activists, while reforming his people with T-72 shellings and machine gun live rounds. With a little help from a fellow assclown based in Iran, he even managed to jam sat phones, cutting off the opposition's lifeline to YouTube.

So when even –alleged– 'third world' autocrats utilize highly sophisticated techniques to game social media in their war on their own people, we can safely assume that there's way more interesting stuff to know about the role of social media in today's wars. You've read the headlines announcing cyber squads and such. Of course that info was outdated for decades before it hit the mainstream press. Also, the average (that equals IT-wise clueless) journalist blathers about DoS attacks and such, usually ignoring the more subtle aspects of cyber war. I'm not exactly a fan of rehashed news, so I refuse to discuss the obvious.

Recently, I’ve stumbled upon a pretty sneaky cyber war tactic. Well thought out, although I can’t tell how effective it actually is. The setup is kinda minimalistic: one Facebook account, and a few hundred (Ok, as of today that’s 1.6k) blog comments written by PsyWarriors:

In North Africa, where peaceful Libyans turned freedom fighters are struggling in a bloody conflict with a ruthless regime that performs atrocities on a daily basis, NATO somewhat acts as the 'Free Libyan Air Force', officially just enforcing UNSC resolution #1973. Nothing wrong with that, since –despite the fact that some Gaddafi troops defected to the opposition– the so-called 'rebels' are civilians defending themselves, their families, neighbors, and even countless foreigners who weren't able to flee before Gaddafi's henchmen crawled all over the country in their brutal war on Libya's population.

Herein lies the problem. We've got epic amateurs barely able to handle an AK-47 on the ground, and professionals in the air, both fighting the mad dog's professional forces without direct lines of communication to each other. The rag-tag freedom fighters lacked structure, command, communication, experience, strategy, and everything else with regard to warfare. After the initial strikes by American, British and French armed forces, NATO joined the battlefield with a plan. Its step by step execution wasn't exactly compatible with the high expectations of the then still amateurish freedom fighters, who even suffered from occasional friendly fire after carelessly celebrating with AA tracer fire, and cruising towards liberated towns through the desert in seized tanks.

Of course the tourists carrying highly sophisticated gadgets in their huge olive green bags, brought in via tour operator helicopters from their shiny gray yachts sailing near the Libyan coastline, sorted out some of those misunderstandings. But since the Libyan freedom fighters totally lacked a chain of command, it didn't help much that the few savvy leaders who actually talked to these tourists got enlightened, because the rag-tag troops consisting of untrained citizens chaotically advancing and retreating in the desert were out of their reach. Qatari military advisers on the ground, helping Libyan citizens carrying seized weapons get into shape, as well as the very few consultants and military advisers from the UK, France, and Italy, who arrived later on, had just started to train freedom fighters.

Also, the message had to be carried out to the Libyan people, and to Libyans in the diaspora as well, without revealing too much sensitive info that Gaddafi's loyalists could find interesting. All that with most of the recipients on the ground cut off from all their information channels besides Libya State TV and a few other satellite channels, because cell phones and ISPs were jammed by the government, and land lines were insecure … a dilemma. The National Transitional Council (NTC) in Benghazi was the sole institution that was able to reach out to the people inside Libya.

Al Jazeera's Libya Live Blog (the URI changes often, so please click through from the index page) has been heavily trafficked since the uprising began (on 17 February, 2011), attracting gazillions of page views and receiving thousands of comments daily. And here we introduce Gerhard Heinz, perhaps a former NVA pilot or not, who frequently updates the audience with strategic as well as tactical information, written in very plain English with a heavy East German accent. Like: 'a good tip for tank comanders in tripoli stay away from your tanks ,conkret in the air' (this refers to smart, that is GPS and laser guided, 660-pound concrete bombs used by coalition fighter jets to destroy tanks in residential areas without much collateral damage).

He delivers spot-on reports of NATO sorties as well as clashes on the ground as they happen, allegedly based on timely sat images, SIGINT, HUMINT and whatnot, long before they appear in the (western) press after NATO announcements. Most of his stuff gets confirmed by other sources later on. He even makes predictions that come true, and not all of those are easily guessable and likely to happen. He explains NATO tactics in layman's terms, tells why NATO requested that the freedom fighters not advance towards Brega for weeks (to create a sneaky trap for an elite brigade and lots of reinforcements from Sirte), and so on. When NATO is dead sure that particular pro-Gaddafi troops can't communicate after air strikes on CCC infrastructure, so no warning can reach them in time, Gerhard Heinz addresses those, advising them to defect, or at least to run and hide quickly before 'fast flying silver birds lose some eggs' above their positions.

Obviously all that is insider knowledge, scraped from NATO and NTC/FF sources. Since NATO doesn’t act on this ‘leak’ they must be aware of, I’m jumping to the conclusion that Gerhard Heinz is a weapon of mass disinformation, and mass education as well. It’s not him alone, by the way, but he’s the most prominent case (Gerhard Heinz has a large fan club) I’ve spotted until now. He informs and educates Libyans hungry for every tiny bit of reliable info with regard to the conflict, scanning Al Jazeera’s website for updates 24/7, then spreading the word through all channels available, including social media.

I may be wrong in details, because I'm by no means an expert when it comes to all the military stuff. But I know that an organization like NATO has the capability to deal with sensitive information that has been leaking into the public domain for weeks. If it's not happening on purpose, they just lost my respect.

I do think that this dude mixes in personal information that might be true, for example his military background. Also, his strong opinions (for example about a weak German government and its cowardly FM who cares more for his personal political affairs than for the Libyan people, and the widespread opposition to the official politics within the German armed forces) are believable. At least it sounds authentic and consistent throughout more than 1,600 blog comments. And that’s doable even by a PsyOps team, considering that Gerhard Heinz posts at times when he should sleep. He openly admits that he’s backed by staff gathering and processing the facts from various sources, but denies all ties to NATO.

So, maybe, I should leave it to that with the words of a blog commenter on Al Jazeera’s website, who said:

@Gerhard Heinz
You have earned a lot of rep. back for Germany, they really owe you some thanks for your work and dedication in this.
It would be interesting to have an article in german newspapers about what you did, when all this is over, and more of it can be told.
For now its kind of a mystery (at least to me), what a german is doing in the middle of all this, and how he can be so well informed. I am very curious to hear how you did it.
Lots of respect from me.

Just make sure, dear reader, that you keep your natural scepticism when you read about a war, regardless of where, and that includes the mainstream press as well as social media. There might be an agenda behind every sentence.




How brain-amputated developers created the social media plague

The bot playground commonly referred to as "social media" is responsible for shitloads of absurd cretinism.

For example Twitter, where gazillions of bots [type A] follow other equally superfluous but nevertheless very busy bots [type B] that automatically generate 27% valuable content (links to penis enlargement tools) and 73% not exactly exciting girly chatter (breeding demand for cheap viagra).

Bazillions of other bots [type C] retweet bot [type B] generated crap and create lists of bots [type A, B, C]. In rare cases when a non-bot tries to participate in Twitter, the uber-bot [type T] prevents the whole bot network from negative impacts by serving a 503 error to the homunculus’ browser.

This pamphlet is about the idiocy of a particular subclass of bots [type S] that sneakily work in the underground stealing money from content producers, and about their criminal (though brain-dead) creators. May they catch the swine flu, or at least pox or cholera, for the pest they’ve brought to us.

The Twitter pest that costs you hard earned money

WTF am I ranting about? The technically savvy reader, familiar with my attitude, has already figured out that I've read way too many raw logs. For the sake of a common denominator, I encourage you to perform a tiny real-world experiment:

  • Publish a great and linkworthy piece of content.
  • Tweet its URI (not shortened - message incl. URI ≤ 139 characters!) with a compelling call for action.
  • Watch your server logs.
  • Puke. Vomit increases with every retweet.

So what happens on your server? A greedy horde of bots pounces on every tweet containing a link, requesting its content. That's because on Twitter all URIs are suspected to be shortened (learn why Twitter makes you eat shit). This uncalled-for –IOW abusive– bot traffic burns your resources, and (with a cheap hosting plan) it can hinder your followers from reading your awesome article and prevent them from clicking on your carefully selected ads.

Those crappy bots not only cost you money because they keep your server busy and increase your bandwidth bill, they actively decrease your advertising revenue, because your visitors hit the back button when your page isn't responsive due to the heavy bot traffic. Even if you've got great hosting, you probably don't want to burn money, not even pennies, right?

Bogus Twitter apps and their modus operandi

If only every Twitter&Crap mashup would look up each URI once, that wouldn't be such a mess. Actually, some of these crappy bots request your stuff 10+ times per tweet, and again for each and every retweet. That means the more popular your content becomes, the more bot traffic it attracts.

Most of these bots don't obey robots.txt, which means you can't even block them applying Web standards (learn how to block rogue bots). Topsy, for example, does respect the content producer, so morons using "Python-urllib/1.17" or "AppEngine-Google; (+http://code.google.com/appengine; appid: mapthislink)" could obey the Robots Exclusion Protocol (REP), too. Their developers are just too fucking lazy to understand such protocols that every respected service on the Web (search engines…) obeys.

Some of these bots even provide an HTTP_REFERER to lure you into viewing the website operated by their shithead of a developer when you're viewing your referrer stats. Others fake Web browsers in their user agent string, just in case you're not smart enough to smell shit that really stinks (IOW browser-like requests that don't fetch images, CSS files, and so on).

One of the worst offenders is outing itself as “ThingFetcher” in the user agent string. It’s hosted by Rackspace, which is a hosting service that obviously doesn’t care much about its reputation. Otherwise these guys would have reacted to my various complaints WRT “ThingFetcher”. By the way, Robert Scoble represents Rackspace, you could drop him a line if ThingFetcher annoys you, too.

ThingFetcher sometimes requests a (shortened) URI 30 times per second, from different IPs. It can get worse when a URI gets retweeted often. This malicious piece of code doesn't obey robots.txt, and doesn't cache results. It's also too dumb to follow chained redirects. It doesn't even publish its results anywhere; at least I couldn't find the fancy URIs I've fed it with in Google's search index.

In ThingFetcher's defense, its developer might say that it performs only HEAD requests. Well, it's true that HEAD requests provoke only an HTTP response header. But: the script invoked gets completely processed, just the output is trashed.

That means the Web server has to deal with the same load as with a GET request; it just drops the content portion (the completely formatted HTML page) when responding, after counting its size to send the Content-Length response header. Do you really believe that I don't care about machine time? For each of your utterly useless bogus requests I could have my server deliver ads to a human visitor, who pulls the plastic if I'm upselling the right way (I do, usually).
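
By the way, if you control the script on the receiving end, you can at least refuse to do the expensive work for HEAD requests yourself. A minimal sketch (the early exit is my own defensive hack, not something those bots deserve credit for):

// Sketch: bail out early on HEAD requests so database queries, templating
// and ad calls don't run just to have their output trashed anyway.
if ($_SERVER["REQUEST_METHOD"] === "HEAD") {
    header("Content-Type: text/html; charset=utf-8");
    header("Cache-Control: public, max-age=3600");
    exit; // no body, and no Content-Length worth computing
}
// ... expensive page generation continues here for GET requests ...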

Unfortunately, ThingFetcher is not the only bot that does a lookup for each URI embedded in a tweet, per tweet processed. Probably the overall number of URIs that appear only once is bigger than the number of URIs that appear quite often while a retweet campaign lasts. That may make skipping a cache cheaper for the bot's owner, but it's way more expensive for the content producer, and for the URI shortening services involved as well.

ThingFetcher update: The owners of ThingFetcher are now aware of the problem, and will try to fix it asap (more information). Now that I know who's operating the Twitter app owning ThingFetcher, I've removed some insults from above, because they'd no longer address an anonymous developer, but bright folks who've just failed once. Too sad that Brizzly didn't reply earlier to my attempts to identify ThingFetcher's owner.

As a content producer I don't care about the costs of any Twitter application that processes tweets to deliver anything to its users. I care about my costs, and I can perfectly live without such a crappy service. Liberally, I can allow one single access per (shortened) URI to figure out its final destination, but I can't tolerate such thoughtless abuse of my resources.

Every Twitter related “service” that does multiple requests per (shortened) URI embedded in a tweet is guilty of theft and pilferage. Actually, that’s an understatement, because these raids cost publishers an enormous sum across the Web.

These fancy apps shall maintain a database table storing the destination of each redirect (chain), accessible by its short URI. Or leave the Web, or pay the publishers. And by the way, Twitter should finally end URI shortening. Not only does it break the Internet, it's way too expensive for all of us.
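
Stripped down to its core, that fix is nothing but a cache lookup before any HTTP request. A rough sketch, assuming a PDO handle $db, a made-up uri_cache table, and a hypothetical resolveRedirectChain() helper that follows 30x responses:

// Return the final destination of a short URI, hitting the wire only once.
// $db is a PDO handle; uri_cache and resolveRedirectChain() are made up for
// this sketch: the helper follows 30x responses and returns the last Location.
function getDestination(PDO $db, $shortUri) {
    $stmt = $db->prepare("SELECT destination FROM uri_cache WHERE short_uri = ?");
    $stmt->execute(array($shortUri));
    $destination = $stmt->fetchColumn();
    if ($destination !== false) {
        return $destination; // cache hit: no request bothers the publisher's server
    }
    $destination = resolveRedirectChain($shortUri); // the one and only lookup
    $ins = $db->prepare("INSERT INTO uri_cache (short_uri, destination) VALUES (?, ?)");
    $ins->execute(array($shortUri, $destination));
    return $destination;
}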

A few more bots that need a revamp, or at least minor tweaks

I've added this section to express that, besides my prominent example above, there's more than one Twitter related app running not exactly squeaky clean bots. That's not a "worst offenders" list, it's not complete (I don't want to reprint Twitter's yellow pages), and bots are listed in no particular order (compiled from requests following the link in a test tweet, evaluating only a snapshot of less than 5 minutes, backed by historical logs).


Tweetmeme’s TweetmemeBot coming from eagle.favsys.net doesn’t fetch robots.txt. On their site they don’t explain why they don’t respect the robots exclusion protocol (REP). Apart from that it behaves.

OneRiot’s bot OneRiot/1.0 totally proves that this real time search engine has chosen a great name for itself. Performing 5+ GET as well as HEAD requests per link in a tweet (sometimes more) certainly counts as rioting. Requests for content come from different IPs, the host name pattern is flx1-ppp*.lvdi.net, e.g. flx1-ppp47.lvdi.net. From the same IPs comes another bot: Me.dium/1.0, me.dium.com redirects to oneriot.com. OneRiot doesn’t respect the REP.

Microsoft/Bing runs abusive bots following links in tweets, too. They fake browsers in the user agent, make use of IPs that don’t obviously point to Microsoft (no host name, e.g. 65.52.19.122, 70.37.70.228 …), send multiple GET requests per processed tweet, and don’t respect the REP. If you need more information, I’ve ranted about deceptive M$-bots before. Just a remark in case you’re going to block abusive MSN bot traffic:

MSN/Bing reps ask you not to block their spam bots when you’d like to stay included in their search index (that goes for real time search, too), but who really wants that? Their search index is tiny –compared to other search engines like Yahoo and Google–, their discovery crawling sucks –to get indexed you need to submit your URIs at their webmaster forum–, and in most niches you can count your yearly Bing SERP referrers using not even all fingers of your right hand. If your stats show more than that, check your raw logs. You’ll soon figure out that MSN/Bing spam bots fake SERP traffic in the HTTP_REFERER (guess where their “impressive” market share comes from).

FriendFeed’s bot FriendFeedBot/0.1 is well explained, and behaves. Its bot page even lists all its IPs, and provides you with an email addy for complaints (I never had a reason to use it). The FriendFeedBot made it on this list just because of its lack of REP support.

PostRank’s bot PostRank/2.0 comes from Amazon IPs. It doesn’t respect the REP, and does more than one request per URI found in one single tweet.

MarkMonitor operates a bot faking browser requests, coming from *.embarqhsd.net (va-71-53-201-211.dhcp.embarqhsd.net, va-67-233-115-66.dhcp.embarqhsd.net, …). Multiple requests per URI, no REP support.

Cuil's bot provides an empty user agent name when following links in tweets, but fetches robots.txt like Cuil's official crawler Twiceler. I didn't bother to test whether this Twitter bot can be blocked following Cuil's instructions for webmasters or not. It got included in this list for the suppressed user agent.

Twingly's bot Twingly Recon coming from *.serverhotell.net doesn't respect the REP and doesn't name its owner, but does only a few HEAD requests.

Many bots mimicking browsers come from Amazon, Rackspace, and other cloudy environments, so you can’t get hold of their owners without submitting a report-abuse form. You can identify such bots by sorting your access logs by IP addy. Those “browsers” which don’t request your images, CSS files, and so on, are most certainly bots. Of course, a human visitor having cached your images and CSS matches this pattern, too. So block only IPs that solely request your HTML output over a longer period of time (problematic with bots using DSL providers, AOL, …).

Blocking requests (with IPs belonging to consumer ISPs, or from Amazon and other dynamic hosting environments) with a user agent name like "LWP::Simple/5.808", "PycURL/7.18.2", "my6sense/1.0", "Firefox" (just these 7 characters), "Java/1.6.0_16" or "libwww-perl/5.816" is sound advice. By the way, these requests sum up to an amount that would top a "worst offenders" listing.
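
If you'd rather act on that advice in a script than in your server config, a crude filter does the trick. A sketch; the pattern list is just an example compiled from the user agents above, tune it against your own logs:

// Crude user agent blocklist; the patterns mirror the examples above.
$uaBlocklist = array("LWP::Simple", "PycURL", "my6sense", "Java/", "libwww-perl");
$ua = isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : "";
$blocked = ($ua === "Firefox"); // just these 7 characters, see above
foreach ($uaBlocklist as $pattern) {
    if (stripos($ua, $pattern) !== false) {
        $blocked = true;
    }
}
if ($blocked) {
    header("HTTP/1.1 403 Forbidden", TRUE, 403);
    exit;
}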

Then there are students doing research. I’m not sure I want to waste my resources on requests from Moscow’s “Institute for System Programming RAS”, which fakes unnecessary loads of human traffic (from efrate.ispras.ru, narva.ispras.ru, dvina.ispras.ru …), for example.

When you analyze bot traffic following a tweet with many retweets, you'll gather a way longer list of misbehaving bots. That's because you'll catch more 3rd party Twitter UIs when many Twitter users view their timeline. Not all Twitter apps route their short URI evaluation through their servers, so you might miss out on abusive requests coming from real users via client-side scripts.

Developers might argue that such requests "on behalf of the user" are neither abusive, nor count as bot traffic. I assure you, that's crap, regardless of a particular Twitter app's architecture, when you count more than one evaluation request per (shortened) URI. For example Googlebot acts on behalf of search engine users too, but it doesn't overload your server. It fetches each URI embedded in tweets only once. And yes, it processes all tweets out there.

How to do it the right way

Here is what a site owner can expect from a Twitter app’s Web robot:

A meaningful user agent

A Web robot must provide a user agent name that fulfills at least these requirements:

  • A unique string that identifies the bot. The unique part of this string must not change when the version changes ("somebot/1.0", "somebot/2.0", …).
  • A URI pointing to a page that explains what the bot is all about, names the owner, and tells how it can be blocked in robots.txt (like this or that).
  • A hint on the rendering engine used, for example “Mozilla/5.0 (compatible; …”.

A method to verify the bot

All IP addresses used by a bot should resolve to server names having a unique pattern. For example Googlebot comes only from servers named "crawl" + "-" + replace($IP, ".", "-") + ".googlebot.com", e.g. "crawl-66-249-71-135.googlebot.com". All major search engines follow this standard, which enables crawler detection that doesn't rely solely on the easily spoofable user agent name.
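
Here's a minimal sketch of that double DNS lookup for Googlebot; the same pattern works for every crawler whose host name suffix you know:

// Verify a claimed Googlebot IP with a reverse plus forward DNS lookup.
function isVerifiedGooglebot($ip) {
    $host = gethostbyaddr($ip); // e.g. crawl-66-249-71-135.googlebot.com
    if ($host === FALSE || $host == $ip) {
        return FALSE; // reverse lookup failed
    }
    if (!preg_match('/\.googlebot\.com$/i', $host)) {
        return FALSE; // wrong host name suffix
    }
    return gethostbyname($host) == $ip; // forward lookup must point back to the IP
}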

Obeying the robots.txt standard

Webmasters must be able to steer a bot with crawler directives in robots.txt like “Disallow:”. A Web robot should fetch a site’s /robots.txt file before it launches a request for content, when it doesn’t have a cached version from the same day.
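
On the bot's side that boils down to a file cache keyed by host with a 24 hour lifetime. A sketch, with error handling kept minimal and the cache directory being an assumption:

// Fetch a site's robots.txt at most once a day (per host).
function getRobotsTxt($host, $cacheDir = "/tmp/robots-cache") {
    $cacheFile = $cacheDir ."/" .md5($host) .".txt";
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < 86400) {
        return file_get_contents($cacheFile); // reuse today's copy
    }
    $robotsTxt = @file_get_contents("http://" .$host ."/robots.txt");
    if ($robotsTxt === FALSE) {
        $robotsTxt = ""; // a polite bot would back off on fetch errors instead
    }
    @mkdir($cacheDir, 0755, TRUE);
    file_put_contents($cacheFile, $robotsTxt);
    return $robotsTxt;
}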

Obeying REP indexer directives

Indexer directives like “nofollow”, “noindex” et cetera must be obeyed. That goes for HEAD requests just chasing for a 301/302/307 redirect response code and a “location” header, too.

Indexer directives can be served in the HTTP response header with an X-Robots-Tag, and/or in META elements like the robots meta tag, as well as in LINK elements like rel=canonical and its corresponding headers.
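
Reading the header variant is trivial. A sketch of what a well-behaved bot could check before reusing anything from a URI (the robots META element in the HTML still needs a separate check):

// Fetch the response headers and check the X-Robots-Tag for "noindex".
function uriAllowsIndexing($uri) {
    $headers = @get_headers($uri, 1);
    if ($headers === FALSE) {
        return FALSE; // unreachable: nothing to index anyway
    }
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (isset($headers["x-robots-tag"])) {
        $tag = $headers["x-robots-tag"];
        if (is_array($tag)) { $tag = implode(",", $tag); }
        if (stripos($tag, "noindex") !== FALSE || stripos($tag, "none") !== FALSE) {
            return FALSE;
        }
    }
    return TRUE;
}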

Responsible behavior

As outlined above, requesting the same resources over and over doesn’t count as responsible behavior. Fetching or “HEAD’ing” a resource no more than once a day should suffice for every Twitter app’s needs.

Reprinting a page's content, or just large quotes, doesn't count as fair use. It's Ok to grab the page title and a summary from a META element like "description" (or up to 250 characters from an article's first paragraph) to craft links, for example, but not more! Also, showing images or embedding videos from the crawled page violates copyrights.
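
A sketch of that fair-use variant, grabbing just the title and the META description (capped at 250 characters) from fetched HTML with DOMDocument:

// Extract the page title and a short summary from fetched HTML (fair use only).
function getLinkPreview($html) {
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings caused by sloppy markup
    $title = "";
    $titleNodes = $doc->getElementsByTagName("title");
    if ($titleNodes->length > 0) {
        $title = trim($titleNodes->item(0)->textContent);
    }
    $summary = "";
    foreach ($doc->getElementsByTagName("meta") as $meta) {
        if (strtolower($meta->getAttribute("name")) == "description") {
            $summary = trim($meta->getAttribute("content"));
            break;
        }
    }
    if ($summary == "") { // no description META element: fall back to the page text
        $summary = trim(preg_replace('/\s+/', ' ', $doc->textContent));
    }
    return array("title" => $title, "summary" => substr($summary, 0, 250));
}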

Conclusion, and call for action

If you suffer from rogue Twitter bot traffic, use the medium those bots live in to make their sins public knowledge. Identify the bogus bot’s owners and tweet the crap out of them. Lookup their hosting services, find the report-abuse form, and submit your complaints. Most of these apps make use of the Twitter-API, there are many spam report forms you can creatively use to ruin their reputation at Twitter. If you’ve an account at such a bogus Twitter app, then cancel it and encourage your friends to follow suit.

Don’t let the assclowns of the Twitter universe get away with theft!

I’d like to hear about particular offenders you’re dealing with, and your defense tactics as well, in the comments. Don’t be shy. Go rant away. Thanks in advance!




How to disagree on Twitter, machine-readable

With standard hyperlinks you can add a rel="crap nofollow" attribute to your A elements. But how do you tell search engine crawlers and other Web robots that you disagree with a link's content, when you post the URI at Twitter or elsewhere?

You cannot rely on the HTML presentation layer of social media sites. Despite the fact that most of them add a condom to all UGC links, crawlers do follow those links. Nowadays crawlers grab tweets and their embedded links long before they bother to fetch the HTML pages. They fatten their indexers with contents scraped from feeds. That means indexers don’t (really) take the implicit disagreement into account.

As long as you operate your own URI shortener, there’s a solution.

Condomize URIs, not A elements

Here’s how to nofollow a plain link drop, where you’ve no control over link attributes like rel-nofollow:

  • Prerequisite: understanding the anatomy of a URI shortener.
  • Add an attribute like shortUri.suriNofollowed, boolean, default=false, to your shortened URIs database table. In the Web form where you create and edit short URIs, add a corresponding checkbox and update your affected scripts.
  • Make sure your search engine crawler detection is up-to-date.
  • Change the piece of code that redirects to the original URI:
    if ($isCrawler && $suriNofollowed) {
        header("HTTP/1.1 403 Forbidden redirect target", TRUE, 403);
        print "<html><head><title>This link is condomized!</title></head><body><p>Search engines are not allowed to follow this link: <code>$suriUri</code></p></body></html>";
    }
    else {
        header("HTTP/1.1 301 Here you go", TRUE, 301);
        header("Location: $suriUri");
    }
    exit;

Here’s an example: This shortened URI takes you to a Bing SEO tip. Search engine crawlers get bagged in a 403 link condom.

Since you can’t test it yourself (user agent spoofing doesn’t work), here’s a header reported by Googlebot (requesting the condomized URI above) today:


HTTP/1.1 403 Forbidden
Date: Thu, 07 Jan 2010 10:19:16 GMT
...
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html

The error page just says:
Title + H1: Link is nofollow'ed
P: Sorry, this shortened URI must not get followed by search engines.

If you can’t roll your own, feel free to make use of my URI Condomizer. Have fun condomizing crappy links on Twitter.


If you check the tool's "Nofollow" box, your URI gets condomized. That means search engines can't reach the destination from the shortened URI, but users and other Web robots get redirected.




How to cleverly integrate your own URI shortener

This pamphlet is somewhat geeky. Don't necessarily understand it as a part of my ongoing holy war on URI shorteners.

Assuming you're slightly familiar with my opinions, you already know that third party URI shorteners (aka URL shorteners) are downright evil. You don't want to make use of unholy crap, so you need to roll your own. Here's how you can (could) integrate a URI shortener into your site's architecture.

Please note that my design suggestions ain’t black nor white. Your site’s architecture may require a different approach. Adapt my tips with care, or use my thoughts to rethink your architectural decisions, if they’re applicable.

At first sight, searching for a free URI shortener script to implement on a dedicated domain looks like a pretty simple solution. It's not. At least not in most cases. Standalone URI shorteners work fine when you want to shorten mostly foreign URIs, but that's a crappy approach when you want to submit your own stuff to social media. Why? Because you throw away the ability to totally control your traffic from social media, and the search engine traffic generated by social media as well.

So if you’re not running cheap-student-loans-with-debt-consolidation-on-each-payday-is-a-must-have-for-sexual-heroes-desperate-for-a-viagra-overdose-and-extreme-penis-length-enhancement.info and your domain’s name without the “www” prefix plus a few characters gives URIs of 20 (30) characters or less, you don’t need a short domain name to host your shortened URIs.

As a side note, when you’re shortening your URIs for Twitter you should know that shortened URIs aren’t mandatory any more. If your message doesn’t exceed 139 characters, you don’t need to shorten embedded URIs.

By integrating a URI shortener into your site architecture you gain the ability to perform way more than URI shortening. For example, you can transform your longish and ugly dynamic URIs into short (but keyword rich) URIs, and more.

In the following I'll walk you step by step through (not really) everything an incoming HTTP request might face. Of course the sequence of steps is a generalization, so perhaps you'll have to change it to fit your needs. For example when you operate a WordPress blog, you could code nearly everything below in your 404 page (consider alternatives). Actually, handling short URIs in your error handler is a pretty good idea when you suffer from a mainstream CMS.

Table of contents

To provide enough context to get the advantages of a fully integrated URI shortener vs. the stand-alone variant, I'll bore you with a ton of dull and totally unrelated stuff:

  • Introduction
  • Block rogue bots
  • Server name canonicalization
  • Deliver static stuff (images …)
  • Execute script (dynamic URI)
  • Resolve shortened URI
  • Excursus: URI shortener components
  • Redirect to destination (invalid request)
  • Guess destination (invalid request)
  • Serve a useful error page

Introduction

There’s a bazillion of methods to handle HTTP requests. For the sake of this pamphlet I assume we’re dealing with a well structured site, hosted on Apache with mod_rewrite and PHP available. That allows us to handle each and every HTTP request dynamically with a PHP script. To accomplish that, upload an .htaccess file to the document root directory:

RewriteEngine On
RewriteCond %{SERVER_PORT} ^80$
RewriteRule . /requestHandler.php [L]

Please note that the code above kinda disables the Web server's error handling. If /requestHandler.php exists in the root directory, all ErrorDocument directives (except some 5xx) et cetera will be ignored. You need to take care of errors yourself.

/requestHandler.php (Warning: untested and simplified code snippets below)
/* Initialization */
$serverName = strtolower($_SERVER["SERVER_NAME"]);
$canonicalServerName = "sebastians-pamphlets.com";
$scheme = "http://";
$rootUri = $scheme .$canonicalServerName; /* if used w/o path add a slash */
$rootPath = $_SERVER["DOCUMENT_ROOT"];
$includePath = $rootPath ."/src"; /* Customize that, maybe you've to manipulate the file system path to your Web server's root */
$requestIp = $_SERVER["REMOTE_ADDR"];
$reverseIp = NULL;
$requestReferrer = $_SERVER["HTTP_REFERER"];
$requestUserAgent = $_SERVER["HTTP_USER_AGENT"];
$isRogueBot = FALSE;
$isCrawler = NULL;
$requestUri = $_SERVER["REQUEST_URI"];
$absoluteUri = $scheme .$canonicalServerName .$requestUri;
$uriParts = parse_url($absoluteUri);
$requestScript = $_SERVER["PHP_SELF"];
$httpResponseCode = NULL;

Block rogue bots

You don't want to waste resources by serving your valuable content to useless bots. Here are a few ideas on how to block rogue (crappy, misbehaving, …) Web robots. If you need a top-notch nasty-bot-handler please contact the authority in this field: IncrediBill.

While handling bots, you should detect search engine crawlers, too:

/* lookup your crawler IP database to populate $isCrawler; then, if the IP wasn't identified as a search engine crawler: */
if ($isCrawler !== TRUE) {
    $crawlerName = NULL;
    $crawlerHost = NULL;
    $crawlerServer = NULL;
    if (stristr($requestUserAgent,"Baiduspider")) {$crawlerName = "Baiduspider"; $crawlerServer = ".crawl.baidu.com";}
    ...
    if (stristr($requestUserAgent,"Googlebot")) {$crawlerName = "Googlebot"; $crawlerServer = ".googlebot.com";}
    if ($crawlerName != NULL) {
        $reverseIp = @gethostbyaddr($requestIp);
        if (!stristr($reverseIp,$crawlerServer)) {
            $isCrawler = FALSE;
        }
        if ("$reverseIp" == "$requestIp") {
            $isCrawler = FALSE;
        }
        if ($isCrawler !== FALSE) {
            $chkIpAddyRev = @gethostbyname($reverseIp);
            if ("$chkIpAddyRev" == "$requestIp") {
                $isCrawler = TRUE;
                $crawlerHost = $reverseIp;
                // store the newly discovered crawler IP
            }
        }
    }
}

If Baidu doesn't send you any traffic, it makes sense to block its crawler. This piece of crap doesn't behave anyway.
if ($isCrawler && "$crawlerName" == "Baiduspider") {
    $isRogueBot = TRUE;
}

Another SE candidate is Bing's spam bot that tries to manipulate stats on search engine usage. If you don't approve of such scams, block incoming requests from the IP address range 65.52.0.0 to 65.55.255.255 (131.107.0.0 to 131.107.255.255 …) when the referrer is a Bing SERP. With this method you occasionally might block searching Microsoft employees who aren't aware of their company's spammy activities, so make sure you serve them a friendly GFY page that explains the issue.

Other rogue bots identify themselves by IP addy, user agent, and/or referrer. For example some bots spam your referrer stats, just in case when viewing stats you’re in the mood to consume porn, consolidate your debt, or buy cheap viagra. Compile a list of NSAW keywords and run it against the HTTP_REFERER:
if (notSafeAtWork($requestReferrer)) {$isRogueBot = TRUE;}

If you operate a porn site you should refine this approach.

As for blocking requests by IP addy I’d recommend a spamIp database table to collect IP addresses belonging to rogue bots. Doing a @gethostbyaddr($requestIp) DNS lookup while processing HTTP requests is way too expensive (with regard to performance). Just read your raw logs and add IP addies of bogus requests to your black list.
if (isBlacklistedIp($requestIp)) {$isRogueBot = TRUE;}

You won't believe how many rogue bots still out themselves by supplying you with a unique user agent string. Go search for [block user agent], then pick what fits your needs best from roughly two million search results. You should maintain a database table for ugly user agents, too. Or code
if (isBlacklistedUa($requestUserAgent) || stristr($requestUserAgent,"ThingFetcher")) {$isRogueBot = TRUE;}

By the way, the owner of ThingFetcher really should stand up now. I’ve sent a complaint to Rackspace and I’ve blocked your misbehaving bot on various sites because it performs excessive loops requesting the same stuff over and over again, and doesn’t bother to check for robots.txt.

Finally, serve rogue bots what they deserve:
if ($isRogueBot === TRUE) {
    header("HTTP/1.1 403 Go fuck yourself", TRUE, 403);
    exit;
}

If you’re picky, you could make some fun out of these requests. For example, when the bot provides an HTTP_REFERER (the page you should click from your referrer stats), then just do a file_get_contents($requestReferrer); and serve the slutty bot its very own crap. Or just 301 redirect it to the referrer provided, to http://example.com/go-fuck-yourself, or something funny like a huge image gfy.jpeg.html on a freehost (not that such bots usually follow redirects). I’d go for the 403-GFY response.

Server name canonicalization

Although search engines have learned to deal with multiple URIs pointing to the same piece of content, sometimes their URI canonicalization routines do need your support. At least make sure you serve your content under one server name:
if ("$serverName" != "$canonicalServerName") {
    header("HTTP/1.1 301 Please use the canonical URI", TRUE, 301);
    header("Location: $absoluteUri");
    header("X-Canonical-URI: $absoluteUri"); // experimental
    header("Link: <$absoluteUri>; rel=canonical"); // experimental
    exit;
}

Subdomains are so 1999, and 2010 is the year of non-'www' URIs. Keep your server name clean, uncluttered, memorable, and remarkable. By the way, you can use, alter, rewrite … the code from this pamphlet as you like. However, you must not change the $canonicalServerName = "sebastians-pamphlets.com"; statement. I'll appreciate the traffic. ;)

When the server name is Ok, you should add some basic URI canonicalization routines here. For example add trailing slashes –if necessary–, and remove clutter from query strings.

Sometimes even smart developers do evil things with your URIs. For example Yahoo truncates the trailing slash. And Google badly messes up your URIs for click tracking purposes. Here’s how you can ‘heal’ the latter issue on arrival (after all search engine crawlers have passed the cluttered URIs to their indexers :( ):
$testForUriClutter = $absoluteUri;
if (!empty($_GET)) {
    foreach ($_GET as $var => $crap) {
        if (stristr($var,"utm_")) {
            $testForUriClutter = str_replace("&$var=$crap", "", $testForUriClutter);
            $testForUriClutter = str_replace("&amp;$var=$crap", "", $testForUriClutter);
            unset($_GET[$var]);
        }
    }
    $uriPartsSanitized = parse_url($testForUriClutter);
    $qs = $uriPartsSanitized["query"];
    $qs = str_replace("?", "", $qs);
    if ("$qs" != $uriParts["query"]) {
        $canonicalUri = $scheme .$canonicalServerName .$requestScript;
        if (!empty($qs)) {
            $canonicalUri .= "?" .$qs;
        }
        if (!empty($uriParts["fragment"])) {
            $canonicalUri .= "#" .$uriParts["fragment"];
        }
        header("HTTP/1.1 301 URI messed up by Google", TRUE, 301);
        header("Location: $canonicalUri");
        exit;
    }
}

By definition, heuristic checks barely scratch the surface. In many cases only the piece of code handling the content can catch malformed URIs that need canonicalization.

Also, there are many sources of malformed URIs. Sometimes a 3rd party screws a URI of yours (see below), but some are self-made.

Therefore I’d encapsulate URI canonicalization, logging pairs of bad/good URIs with referrer, script name, counter, and a lastUpdate-timestamp. Of course plain vanilla stuff like stripped www prefixes don’t need a log entry.


Before you’re going to serve your content, do a lookup in your shortUri table. If the requested URI is a shortened URI pointing to your own stuff, don’t perform a redirect but serve the content under the shortened URI.

Deliver static stuff (images …)

Usually your Web server checks whether a file exists or not, and sends the matching Content-type header when serving static files. Since we’ve bypassed this functionality, do it yourself:
if (empty($uriParts["query"]) && empty($uriParts["fragment"]) && file_exists("$rootPath$requestUri")) {
    header("Content-type: " .getContentType("$rootPath$requestUri"), TRUE);
    readfile("$rootPath$requestUri");
    exit;
}
/* getContentType($filename) returns a MIME media type like 'image/jpeg', 'image/gif', 'image/png', 'application/pdf', 'text/plain' ... but never an empty string */

If your dynamic stuff mimics static files for some reason, and those files do exist, make sure you don't handle them here.

Some files should pretend to be static, for example /robots.txt. Making use of variables like $isCrawler, $crawlerName, etc., you can use your smart robots.txt to maintain your crawler-IP database and more.
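
A sketch of such a 'smart' robots.txt handler, reusing the variables initialized above; loggCrawlerIp() is a hypothetical helper writing to your crawler-IP table, and the Disallow rules are just placeholders:

if ($requestUri == "/robots.txt") {
    header("Content-type: text/plain", TRUE);
    if ($isCrawler === TRUE) {
        // hypothetical helper: remember verified crawler IPs for cheap future lookups
        loggCrawlerIp($crawlerName, $crawlerHost, $requestIp);
    }
    // serve the same rules to everybody; tailor them per $crawlerName if you must
    print "User-agent: *\nDisallow: /src/\n";
    exit;
}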

Execute script (dynamic URI)

Say you’ve a WP blog in /blog/, then you can invoke WordPress with
if (substr($requestUri, 0, 6) == "/blog/") {
    require("$rootPath/blog/index.php");
    exit;
}

(Perhaps the WP configuration needs a tweak to make this work.) There’s a downside, though. Passing control to WordPress disables the centralized error handling and everything else below.

Fortunately, when WordPress calls the 404 page (wp-content/themes/yourtheme/404.php), it hasn’t sent any output or headers yet. That means you can include the procedures discussed below in WP’s 404.php:
$httpResponseCode = "404";
$errSrc = "WordPress";
$errMsg = "The blog couldn't make sense out of this request.";
require("$includePath/err.php");
exit;

Like in my WordPress example, you’ll find a way to call your scripts so that they don’t need to bother with error handling themselves. Of course you need to modularize the request handler for this purpose.

Resolve shortened URI

If you’re shortening your very own URIs, then you should lookup the shortUri table for a matching $requestUri before you process static stuff and scripts. Extract the real URI belonging to your site and serve the content instead of performing a redirect.

Excursus: URI shortener components

Using the hints below you should be able to code your own URI shortener. You don't need all the bells and whistles (like stats) overloading most scripts available on the Web.

  • A database table with at least these attributes:

    • shortUri.suriId, bigint, primary key, populated from a sequence (auto-increment)
    • shortUri.suriUri, text, indexed, stores the original URI
    • shortUri.suriShortcut, varchar, unique index, stores the shortcut (not the full short URI!)

    Storing page titles and content (snippets) makes sense, but isn’t mandatory. For outputs like “recently shortened URIs” you need a timestamp attribute.

  • A method to create a shortened URI.
    Make that an independent script callable from a Web form's server procedure, via Ajax, SOAP, etc. (a minimal sketch of such a method follows this excursus).

    Without a given shortcut, use the primary key to create one. base_convert(intval($suriId), 10, 36); converts an integer into a short string. If you can’t do that in a database insert/create trigger procedure, retrieve the primary key’s value with LAST_INSERT_ID() or so and perform an update.

    URI shortening is bad enough, hence it makes no sense to maintain more than one short URI per original URI. Your create short URI method should return a previously created shortcut then.

    If you’re storing titles and such stuff grabbed from the destination page, don’t fetch the destination page on create. Better do that when you actually need this information, or run a cron job for this purpose.

    With the shortcut returned build the short URI on-the-fly $shortUri = getBaseUri() ."/" .$suriShortcut; (so you can use your URI shortener across all your sites).

  • A method to retrieve the original URI.
    Remove the leading slash (and other ballast like a useless query string/fragment) from REQUEST_URI and pull the shortUri record identified by suriShortcut.

    Bear in mind that shortened URIs spread via social media do get abused. A shortcut like ‘xxyyzz’ can appear as ‘xxyyz..’, ‘xxy’, and so on. So if the path component of a REQUEST_URI somehow looks like a shortened URI, you should try a broader query. If it returns one single result, use it. Otherwise display an error page with suggestions.

  • A Web form to create and edit shortened URIs.
    Preferably protected in a site admin area. At least for your own URIs you should use somewhat meaningful shortcuts, so make suriShortcut an input field.
  • If you want to use your URI shortener with a Twitter client, then build an API.
  • If you need particular stats for your short URIs pointing to foreign sites that your analytics package can’t deliver, then store those click data separately.
    // end excursus
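
Here's the minimal sketch of the create method announced above; table and attribute names follow the excursus, $db is a PDO handle, and everything else is illustrative:

// Sketch of a "create short URI" method; $db is a PDO handle, everything else
// follows the attribute names from the excursus above.
function createShortUri(PDO $db, $uri, $shortcut = NULL) {
    // one short URI per original URI is enough: reuse an existing shortcut
    $stmt = $db->prepare("SELECT suriShortcut FROM shortUri WHERE suriUri = ?");
    $stmt->execute(array($uri));
    $existing = $stmt->fetchColumn();
    if ($existing !== false) {
        return $existing;
    }
    $ins = $db->prepare("INSERT INTO shortUri (suriUri, suriShortcut) VALUES (?, ?)");
    $ins->execute(array($uri, $shortcut));
    if ($shortcut === NULL) {
        // no shortcut given: derive it from the primary key, then store it
        $suriId = intval($db->lastInsertId());
        $shortcut = base_convert($suriId, 10, 36);
        $upd = $db->prepare("UPDATE shortUri SET suriShortcut = ? WHERE suriId = ?");
        $upd->execute(array($shortcut, $suriId));
    }
    return $shortcut; // build the short URI on the fly: getBaseUri() ."/" .$shortcut
}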

If REQUEST_URI contains a valid shortcut belonging to a foreign server, then do a 301 redirect.
$suriUri = resolveShortUri($requestUri);
if ($suriUri === FALSE) {
    $httpResponseCode = "404";
    $errSrc = "sUri";
    $errMsg = "Invalid short URI. Shortcut resolves to more than one result.";
    require("$includePath/err.php");
    exit;
}
if (!empty($suriUri)) {
    if (!stristr($suriUri, $canonicalServerName)) {
        header("HTTP/1.1 301 Here you go", TRUE, 301);
        header("Location: $suriUri");
        exit;
    }
}

Otherwise ($suriUri is yours) deliver your content without redirecting.

Redirect to destination (invalid request)

From reading your raw logs (404 stats don’t cover 302-Found crap) you’ll learn that some of your resources get persistently requested with invalid URIs. This happens when someone links to you with a messed up URI. It doesn’t make sense to show visitors following such a link your 404 page.

Most screwed URIs are unique in a way that they still 'address' one particular resource on your server. You should maintain a mapping table for all identified screwed URIs, pointing to the canonical URI. When you can identify a resource from a lookup in this mapping table, then do a 301 redirect to the canonical URI.

When you feature a “product of the week”, “hottest blog post”, “today’s joke” or so, then bookmarkers will love it when its URI doesn’t change. For such transient URIs do a 307 redirect to the currently featured page. Don’t fear non-existing ‘duplicate content penalties’. Search engines are smart enough to figure out your intention. Even if the transient URI outranks the original page for a while, you’ll still get the SERP traffic you deserve.

Guess destination (invalid request)

For many screwed URIs you can identify the canonical URI on-the-fly. REQUEST_URI and HTTP_REFERER provide lots of hints, for example keywords from SERPs or fragments of existing URIs.

Once you’ve identified the destination, do a 307 redirect and log both REQUEST_URI and guessed destination URI for a later review. Use these logs to update your screwed URIs mapping table (see above).

When you can’t identify the destination free of doubt, and the visitor comes from a search engine, extract the search query from the HTTP_REFERER and pass it to your site search facility (strip operators like site: and inurl:). Log these requests as invalid, too, and update your mapping table.
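
A sketch of that referrer parsing; the query string parameters and the list of stripped operators are examples, not a complete inventory of search engines:

// Pull the search query out of a search engine referrer and strip operators.
function extractSearchQuery($referrer) {
    $parts = parse_url($referrer);
    if (empty($parts["query"])) {
        return "";
    }
    parse_str($parts["query"], $params);
    $query = "";
    foreach (array("q", "p", "query") as $key) { // Google/Bing, Yahoo, others
        if (!empty($params[$key])) {
            $query = $params[$key];
            break;
        }
    }
    // strip search operators like site:example.com or inurl:foo
    $query = preg_replace('/\b(site|inurl|intitle|inanchor):\S+/i', '', $query);
    return trim(preg_replace('/\s+/', ' ', $query));
}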

Serve a useful error page

Following the suggestions above, you got rid of most reasons to actually show the visitor an error page. However, make your 404 page useful. For example don't bounce out your visitor with a prominent error message in 24pt or so. Of course you should mention that an error has occurred, but your error page's prominent message should consist of hints on how the visitor can reach the estimated content.

A central error page gets invoked from various scripts. Unfortunately, err.php can't be sure that none of these scripts has outputted something to the user. With a previous output of just one single byte you can't send an HTTP response header. Hence prefix the header() statement with a '@' to suppress PHP error messages, and catch and log errors.

Before you output your wonderful error page, send a 404 header:
if ($httpResponseCode === NULL) {
    $httpResponseCode = "404";
}
if (empty($httpResponseCode)) {
    $httpResponseCode = "501"; // log for debugging
}
@header("HTTP/1.1 $httpResponseCode Shit happens", TRUE, intval($httpResponseCode));
logHeaderErr(error_get_last());

In rare cases you better send a 410-Gone header, for example when Matt’s team has discovered a shitload of questionable pages and you’ve filed a reconsideration request.

In general, do avoid 404/410 responses. Every URI indexed anywhere is an asset. Closely watch your 404 stats and try to map these requests to related content on your site.

Use possible input ($errSrc, $errMsg, …) from the caller to customize the error page. Without meaningful input, deliver a generic error page. A search for [* 404 page *] might inspire you (WordPress users click here).


All errors are mine. In other words, be careful when you grab my untested code examples. It’s all dumped from memory without further thoughts and didn’t face a syntax checker.

I consider this pamphlet kinda draft of a concept, not a design pattern or tutorial. It was fun to write, so go get the best out of it. I’d be happy to discuss your thoughts in the comments. Thanks for your time.




As if sloppy social media users ain’t bad enough … search engines support traffic theft

Prepare for a dose of techy tin foil hattery. Again, I'm going to rant about a nightmare that Twitter & Co created with their crappy, thoughtless and shortsighted software designs: URI shorteners (yup, it's URI, not URL).

Recap: Each and every 3rd party URI shortener is evil by design. Those questionable services do/will steal your traffic and your Google juice, mislead and piss off your potential visitors and customers, and hurt you in countless other ways. If you consider yourself south of sanity, do not make use of shortened URIs you don't own.

Actually, this pamphlet is not about sloppy social media users who shoot themselves in both feet, and it’s not about unscrupulous micro blogging platforms that force their users to hand over their assets to felonious traffic thieves. It’s about search engines that, in my humble opinion, handle the sURL dilemma totally wrong.

Some of my claims are based on experiments that I'm not willing to reveal (yet). For example I won't explain sneaky URI hijacking, or how I stole a portion of tinyurl.com's search engine traffic with a shortened URI, passing searchers to a charity site, although it seems the search engine I've gamed has closed this particular loophole now. There are still way too many playgrounds for deceptive tactics involving shortened URIs.

How should a search engine handle a shortened URI?

Handling a URI as a shortened URI requires a bulletproof method to detect shortened URIs. That's a breeze.

  • Redirect patterns: URI shorteners receive lots of external inbound links that get redirected to 3rd party sites. Linking pages, stopovers and destination pages usually reside on different domains. The method of redirection can vary. Most URI shorteners perform 301 redirects, some use 302 or 307 HTTP response codes, some frame the destination page displaying ads on the top frame, and I've even seen a few of them making use of meta refreshes and client-side redirects. Search engines can detect all those procedures.
  • Link appearance: redirecting URIs that belong to URI shorteners often appear on pages and in feeds hosted by social media services (Twitter, Facebook & Co).
  • Seed: trusted sources like LongURL.org provide lists of domains owned by URI shortening services. Social media outlets providing their own URI shorteners don’t hide server name patterns (like su.pr …).
  • Self exposure: the root index pages of URI shorteners, as well as other pages on those domains that serve a 200 response code, usually mention explicit terms like “shorten your URL” et cetera.
  • URI length: the length of a URI string, if 20 characters or less, is an indicator at most, because some URI shortening services offer keyword rich short URIs, and many sites provide natural URIs this short.

Search engine crawlers bouncing at short URIs should do a lookup, following the complete chain of redirects. (Some whacky services shorten everything that looks like an URI, even shortened URIs, or do a lookup themselves replacing the original short URI with another short URI that they can track. Yup, that’s some crazy insanity.)
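
Following such a chain of 30x hops without getting trapped in a loop is a ten-liner. A sketch using PHP's cURL extension, one header-only request per hop (meta refreshes and client-side redirects would need extra handling):

// Follow a redirect chain manually, recording every stopover.
function followRedirects($uri, $maxHops = 10) {
    $chain = array($uri);
    for ($hop = 0; $hop < $maxHops; $hop++) {
        $ch = curl_init($uri);
        curl_setopt($ch, CURLOPT_NOBODY, TRUE);          // headers only
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE); // we follow ourselves
        curl_exec($ch);
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $next = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
        curl_close($ch);
        if ($code < 300 || $code >= 400 || empty($next)) {
            break; // reached the destination (or something that isn't a redirect)
        }
        $uri = $next;
        $chain[] = $uri;
    }
    return $chain; // first element: the short URI, last element: the destination
}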

Each and every stopover (shortened URI) should get indexed as an alias of the destination page, but must not appear on SERPs unless the search query contains the short URI or the destination URI (that means not on [site:tinyurl.com] SERPs, but on a [site:tinyurl.com shortURI] or a [destinationURI] search result page). 3rd party stopovers mustn't gain reputation (PageRank™, anchor text, or whatever), regardless of the method of redirection. All the link juice belongs to the destination page.

In other words: search engines should make use of their knowledge of shortened URIs in response to navigational search queries. In fact, search engines could even solve the problem of vanished and abused short URIs.

Now let’s see how major search engines handle shortened URIs, and how they could improve their SERPs.

Bing doesn’t get redirects at all

Oh what a mess. The candidate from Redmond fails totally on understanding the HTTP protocol. Their search index is flooded with a bazillion of URI-only listings that all do a 301 redirect, more than 200,000 from tinyurl.com alone. Also, you'll find URIs in their index that do a permanent redirect and have nothing to do with URI shortening, too.

I can’t be bothered with checking what Bing does in response to other redirects, since the 301 test fails so badly. Clicking on their first results for [site:tinyurl.com], I’ve noticed that many lead to mailto://working-email-addy type of destinations. Dear Bing, please remove those search results as soon as possible, before anyone figures out how to use your SERPs/APIs to launch massive email spam campaigns. As for tips on how to improve your short-URI-SERPs, please learn more under Yahoo and Google.

Yahoo does an awesome job, with a tiny exception

Yahoo has done a better job. They index short URIs and show the destination page, at least via their site explorer. When I search for a tinyURL, the SERP link points to the URI shortener; that could get improved by linking to the destination page.

By the way, Yahoo is the only search engine that handles abusive short URIs totally right (I will not elaborate on this issue, so please don't ask for detailed information if you're not a SE engineer). Yahoo bravely passed the 301 test, as well as others (including pretty evil tactics). I so hope that MSN will adopt Yahoo's bright logic before Bing overtakes Yahoo search. By the way, that can be accomplished without sending out spammy bots (hint2bing).

Google does it by the book, but there’s room for improvements

As for tinyURLs, Google indexes only pages on the tinyurl.com domain, including previews. Unfortunately, the snippets don't provide a link to the destination page. Although that's the expected behavior (those URIs aren't linked on the crawled page), that's sad. At least Google didn't fail on the 301 test.

As for the somewhat evil tactics I've applied in my tests so far, Google fell in love with some abusive short URIs. Google –under particular circumstances– indexes shortened URIs that game Googlebot, and has sent SERP traffic to sneakily shortened URIs (that face the searcher with huge ads) instead of the destination page. Since I've begun to deploy sneaky sURLs, Google greatly improved their spam filters, but they're not yet perfect.

Since Google is responsible for most of this planet’s SERP traffic, I’ve put better sURL handling at the very top of my xmas wish list.

About abusive short URIs

Shortened URIs do poison the Internet. They vanish, alter their destination, mislead surfers … in other words they are abusive by definition. There’s no such thing as a persistent short URI!

A long time ago Tim Berners-Lee told you that fucking with URIs is a very bad habit. Did you listen? Do you make use of shortened URIs? If you post URIs that get shortened at Twitter, or if you make use of 3rd party URI shorteners elsewhere, consider yourself trapped into a low-life traffic theft scam. Shame on you, and shame on Twitter & Co.

Besides my somewhat shady experiments that hijacked URIs, stole SERP positions, and converted "borrowed" SERP traffic, there are so many other ways to abuse shortened URIs. Many of them are outright evil. Many of them do hurt your kids, and mine. Basically, that's not any search engine's problem, but search engines could help us get rid of the root of all sURL evil by handling shortened URIs with common sense, even when the last short URI has vanished.

Fight shortened URIs!

It’s up to you. Go stop it. As long as you can’t avoid URI shortening, roll your own URI shortener and make sure it can’t get abused. For the sake of our children, do not use or support 3rd party URI shorteners. Deprive the livelihood of these utterly useless scumbags.

Unfortunately, as a father and as a webmaster, I don't believe in common sense applied by social media services. Hence, I see a "Twitter actively bypasses safe-search filters tricking my children into viewing hardcore porn" post coming. Dear Twitter & Co. (and that addresses all services that make use of or transport shortened URIs): put an end to shortened URIs. Now!




Search engines should make shortened URIs somewhat persistent

URI shorteners are crap. Each and every shortened URI expresses a design flaw. All –or at least most– public URI shorteners will shut down sooner or later, because shortened URIs are hard to monetize. Making use of 3rd party URI shorteners translates to “put traffic at risk”. Not to speak of link love (PageRank, Google juice, link popularity) lost forever.

Search engines could provide a way out of the sURL dilemma that Twitter & Co. created with their crappy, thoughtless and shortsighted software designs. Here’s how:

Most browsers support search queries in the address bar, as well as suggestions (aka search results) on DNS errors, and sometimes even on 404s or other HTTP response codes besides 200/3xx. That means browsers “ask a search engine” when an HTTP request fails.

When a whole TLD is out of service –.yu, for example– search engines may have crawled a 301 or meta refresh from a page that formerly lived on one of its domains. They know the new address and can lead the user to this (working) URI.

The same goes for shortened URIs created ages ago by URI shortening services that have died in the meantime. Search engines have already transferred all the link juice from the shortened URI to the destination page, so why not point users who request a dead short URI to the right destination?
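The mechanics are trivial once the redirect has been recorded while the shortener was still alive. A rough sketch of the idea (the dead shortener and the mapping are made up; a search engine would pull the mapping from its crawl history):

    import urllib.request, urllib.error

    # Redirect targets recorded while the shortener still answered.
    RECORDED_REDIRECTS = {
        "http://dead-shortener.example/x1": "http://example.com/the-real-article",
    }

    def resolve(short_url):
        try:
            with urllib.request.urlopen(short_url, timeout=10) as resp:
                return resp.geturl()                   # shortener still works
        except urllib.error.URLError:
            return RECORDED_REDIRECTS.get(short_url)   # service gone: use the archive

    print(resolve("http://dead-shortener.example/x1"))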

Search engines have all the data required for rescuing short URIs that are out of service in their databases. Not de-indexing “outdated” URIs belonging to URI shorteners would be a minor tweak. At least Google has stored the attributes and behavior of all links on the Web since the past century, and most probably the other search engines are operated by data rats too.

URI shorteners can be identified by simple patterns. They gather tons of inbound links from foreign domains that get redirected (not always via a 301!) to URIs on other 3rd party domains. Of course that applies to some ad servers too, but rest assured search engines do know the difference.
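To make the pattern concrete, here’s a toy heuristic over a made-up crawl log: a host whose redirects fan out to lots of unrelated destination domains looks a lot more like a shortener than like an ad server pointing at a handful of advertisers. The threshold and the sample data are assumptions, not anybody’s production spam filter.

    from collections import defaultdict
    from urllib.parse import urlsplit

    # (requested URL, redirect target) pairs as a crawler might record them
    crawl_log = [
        ("http://tinyurl.example/a1", "http://blog-one.example/post"),
        ("http://tinyurl.example/b2", "http://shop-two.example/item"),
        ("http://tinyurl.example/c3", "http://news-three.example/story"),
        ("http://adserver.example/click", "http://shop-two.example/item"),
    ]

    targets_per_host = defaultdict(set)
    for src, dst in crawl_log:
        targets_per_host[urlsplit(src).hostname].add(urlsplit(dst).hostname)

    for host, targets in targets_per_host.items():
        if len(targets) >= 3:              # redirects fan out to many foreign domains
            print(host, "smells like a URI shortener")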

So why the heck haven’t Google, Yahoo, MSN/Bing, and Ask offered such a service yet? I thought it’s all about users, but I might have misread something. Sigh.

By the way, I’ve recorded search engine misbehavior with regard to shortened URIs that could arouse Jack The Ripper, but that’s a completely different story.




Thanks for all the ego food!

Define Ego Food: healthy, organic food for Sebastian’s ego, so it can grow up big and strong.

[Please note that only organic ego-food-burgers are healthy, so please refrain from any black-hat tactics when praising or flaming me. Also, don’t even think that the greedy guy on the right will answer to the name of Sebastian. Rednecks, crabby old farts, and insurrectionists are bald and wear a black hat.]

 

I’m not yet sure whether the old year ended with an asc(33) or the new year started with a U+0021. However, I want to shout out a loud Thank You! to you, my dear readers. Thanks to you, my pamphlets prosper.

I’m not only talking about your very much appreciated kind mentions1 on your blogs. What gets my lazy butt out of my bed to write more pamphlets is another highlight of my day: checking this blog’s MBL and Feedburner stats. In other words: I write because you read, sphinn and stumble my articles.

The 2007 Search Blog Awards

Despite my attempt to cheat my way to a search blog award with a single-candidate category, Loren over at SEJ decided to accept a nomination of my pamphlets in the Best SEO Blog category. It was an honor to play in that league, and it means a lot to me.

Congrats to Barry, and thanks to the 150 people who voted for me!

Yep, I’ve even counted the 1/2/3-votes –as constructive criticism, in fact. I’ve no clue whether the folks who gave me low ratings just didn’t know me or considered my blog that worthless. Anyway, I take it very seriously and will try to polish up Sebastian’s Pamphlets for the next round.

The 2007 Rubber Chicken Awards (SEM version)

In related good news, I, Google’s nightmare, almost won the 2007 Rubber Chicken Award for the dullest –er, most bizarre– SEO blog post.

Ranked in row two, I’m in good company with Geraldine, Jeff and David. Another post of mine made it into row three.

Congrats to Matt and Sandra who won the most wanted award on the Web!

More Ego Food

While inserting my daily load of blatant comment-author-link spam on several blogs last night, I stumbled upon a neat piece of linkbait from Shaun and couldn’t resist slapping and discrediting him. Eventually he banned me, but I can spam via email too. Read the result –more ego food– tonight: Sebastian’s sauced (read: idiot) version of robots.txt, pulled by Shaun from the UK’s –er, Scotland’s– great Hobo SEO Blog.

What can I improve?

I’m really proud of such a great readership. What do you want to see here this year? I’m blogging in my spare time, but I’ll try to fulfill as many wishes as possible. Please don’t hesitate to post your requests here. Consider the comments my to-do list for 2008. Thank you again, and have a great year!


1  It seems I’m suffering from an inbound link penalty: Technorati recently discovered my new URL but refuses to update my reputation, despite all my pings, so I’m stuck with a daily link count.




Gaming Sphinn is not worth it

OMFG, yet another post on Sphinn? Yup. I’ll tell you why gaming Sphinn is counterproductive, because I just don’t want to read another whiny rant along the lines of “why do you ignore my stuff whilst A listers [whatever this undefined term means] get their crap sphunn hot in no time”. Also, discussions assuming that success equals bad behavior, like this or this one, are neither funny nor useful. As for the whiners: Grow the fuck up and produce outstanding content, then network politely –but not obtrusively– to promote it. As for the gamers: Think before you ruin your reputation!

What motivates a wannabe Internet marketer to game Sphinn?

Traffic, of course –but that’s a myth. Sphinn sends very targeted traffic, but also very few visitors (see my stats below).

Free uncondomized links. Ok, that works: one can gain enough link love to get a page indexed by the search engines, but for that purpose it’s not necessary to push the submission to the home page.

Attention is up next. Yep, Sphinn is an eldorado for attention whores, but not everybody is an experienced high-class call girl. Most are amateurs giving it a (first) try, or wrecked hookers pushing too hard to attract positive attention.

The keyword is positive attention. Sphinners are smart, they know every trick in the book. Many of them make a living with gaming –er, creative use of– social media. Trying to cheat professional gamblers is a waste of time, and will not produce positive attention. Even worse, the shit sticks to the handle of the unsuccessful cheater (and in many cases to the real name). So if you want to burn your reputation, go found a voting club to feed your crap.

Fortunately, getting caught for artificial voting at Sphinn comes with devalued links too. The submitted stories are taken off the list, which means not a single link at Sphinn (besides profile pages) feeds them any more, so search engines forget them. Instead of a good link from an unpopular submission, you get zilch when you try to cheat your way to the popular links pages.

Although Sphinn doesn’t send shitloads of traffic, this traffic is extremely valuable. Many sphinners operate or control blogs and tend to link to outstanding articles they found at Sphinn. Many sphinners have accounts on other SM sites too, and bookmark/cross-submit good content. It’s not unusual that 10 visits from Sphinn result in hundreds or even thousands of hits from StumbleUpon & Co. –but sphinners don’t bookmark/blog/cross-submit/stumble crap.

So either write great content and play by the rules, or get nowhere with your crappy submission. The first “10 reasons why 10 tricks posts about 10 great tips to write 10 numbered lists” submission was fun. The 10,000 plagiarisms that followed were just boring noise. Nobody except your buddies or vote bots sphinns crap like that, so don’t bother to provide the community with footprints of your lousy gaming.

If you’re playing number games, here is why ruining a reputation by gaming Sphinn is not worth it. Look at my visitor stats from July to today. I got 3.6k referrals from Sphinn in 4 months because a few of my posts went hot. When a post sticks with 1-5 votes, you won’t attract many more click-throughs than from those 1-5 folks who sphunn it (that would give 100-200 hits or so with the same number of submissions). When you cheat, the story gets buried and you get nothing but flames. Think about that. Thanks.

Rank Last Date/Time Referral Site Count
1 Oct 09, 2007 @ 23:29 http://sphinn.com/story/1622 504
2 Oct 23, 2007 @ 14:53 http://sphinn.com/story/2764 419
3 Nov 01, 2007 @ 03:42 http://sphinn.com 293
4 Oct 08, 2007 @ 04:21 http://sphinn.com/story/5469 288
5 Nov 02, 2007 @ 13:35 http://sphinn.com/story/8883 192
6 Oct 09, 2007 @ 23:38 http://sphinn.com/story/4335 185
7 Oct 22, 2007 @ 23:55 http://sphinn.com/story/5362 139
8 Oct 29, 2007 @ 15:02 http://sphinn.com/upcoming 131
9 Nov 02, 2007 @ 13:34 http://sphinn.com/story/7170 131
10 Sep 10, 2007 @ 09:09 http://sphinn.com/story/1976 116
11 Oct 15, 2007 @ 22:40 http://sphinn.com/story/6122 113
12 Sep 22, 2007 @ 13:39 http://sphinn.com/story/3593 90
13 Oct 05, 2007 @ 21:56 http://sphinn.com/story/5648 87
14 Sep 22, 2007 @ 13:25 http://sphinn.com/story/4072 80
15 Oct 14, 2007 @ 17:24 http://sphinn.com/story/5973 77
16 Aug 30, 2007 @ 04:17 http://sphinn.com/story/1796 72
17 Oct 16, 2007 @ 05:46 http://sphinn.com/story/6761 61
18 Oct 11, 2007 @ 05:56 http://sphinn.com/story/1447 60
19 Sep 13, 2007 @ 12:27 http://sphinn.com/story/4548 54
20 Nov 02, 2007 @ 22:14 http://sphinn.com/story/11547 53
21 Sep 03, 2007 @ 09:34 http://sphinn.com/story/4068 44
22 Oct 09, 2007 @ 23:40 http://sphinn.com/story/5093 42
23 Nov 02, 2007 @ 01:46 http://sphinn.com/story/248 41
24 Sep 14, 2007 @ 05:58 http://sphinn.com/story/2287 36
25 Oct 31, 2007 @ 06:17 http://sphinn.com/story/11205 35
26 Oct 07, 2007 @ 12:07 http://sphinn.com/story/6124 25
27 Nov 01, 2007 @ 09:41 http://sphinn.com/user/view/profile/Sebastian 22
28 Aug 08, 2007 @ 10:52 http://sphinn.com/story/245 21
29 Sep 02, 2007 @ 19:17 http://sphinn.com/story/3877 17
30 Sep 22, 2007 @ 00:42 http://sphinn.com/story/4968 17
31 Oct 01, 2007 @ 12:49 http://sphinn.com/story/5310 17
32 Aug 30, 2007 @ 08:20 http://sphinn.com/story/4143 14
33 Sep 11, 2007 @ 21:38 http://sphinn.com/story/3783 13
34 Nov 01, 2007 @ 15:50 http://sphinn.com/published/page/2 11
35 Sep 01, 2007 @ 23:03 http://sphinn.com/story/597 10
36 Oct 24, 2007 @ 18:17 http://sphinn.com/story/1767 10
37 Sep 15, 2007 @ 08:26 http://sphinn.com/story.php?id=5469 8
38 Oct 30, 2007 @ 09:42 http://sphinn.com/upcoming/mostpopular 7
39 Oct 24, 2007 @ 18:38 http://sphinn.com/story/10881 7
40 Oct 30, 2007 @ 01:19 http://sphinn.com/upcoming/page/2 6
41 Sep 20, 2007 @ 07:09 http://sphinn.com/user/view/profile/login/Sebastian 5
42 Jul 22, 2007 @ 09:39 http://sphinn.com/story/1017 5
43 Oct 13, 2007 @ 08:34 http://sphinn.com/published/week 5
44 Sep 08, 2007 @ 04:17 http://sphinn.com/story/4653 5
45 Oct 31, 2007 @ 06:55 http://sphinn.com/story/11614 5
46 Aug 13, 2007 @ 03:06 http://sphinn.com/story/2764/editcomment/4018 4
47 Aug 23, 2007 @ 07:52 http://sphinn.com/story.php?id=3593 4
48 Sep 20, 2007 @ 06:21 http://sphinn.com/published/page/1 4
49 Oct 23, 2007 @ 15:01 http://sphinn.com/story/748 3
50 Jul 29, 2007 @ 10:47 http://sphinn.com/story/title/Google-launched-a-free-ranking-checker 3
51 Sep 30, 2007 @ 21:13 http://sphinn.com/category/Google/parent_name/Google 3
52 Aug 25, 2007 @ 04:47 http://sphinn.com/story.php?id=3735 3
53 Sep 15, 2007 @ 11:28 http://sphinn.com/story.php?id=5648 3
54 Sep 29, 2007 @ 01:35 http://sphinn.com/story/7058 3
55 Oct 28, 2007 @ 22:56 http://sphinn.com/greatesthits 3
56 Oct 23, 2007 @ 04:44 http://sphinn.com/story/10380 3
57 Oct 27, 2007 @ 04:10 http://sphinn.com/story/11233 3
58 Jul 13, 2007 @ 04:23 Google Search: http://sphinn.com 2
59 Jul 21, 2007 @ 03:19 http://sphinn.com/story.php?id=849 2
60 Jul 27, 2007 @ 10:06 http://sphinn.com/story.php?id=1447 2
61 Jul 30, 2007 @ 20:09 http://sphinn.com/story.php?id=1796 2
62 Aug 07, 2007 @ 10:01 http://sphinn.com/published/page/3 2
63 Aug 13, 2007 @ 11:20 http://sphinn.com/story.php?id=2764 2
64 Sep 05, 2007 @ 05:23 http://sphinn.com/story/3735 2
65 Aug 28, 2007 @ 01:56 http://sphinn.com/story.php?id=3877 2
66 Aug 27, 2007 @ 10:01 http://sphinn.com/submit.php?url=http://sebastians-pamphlets.com/links/categories 2
67 Aug 31, 2007 @ 14:13 http://sphinn.com/story.php?id=4335 2
68 Sep 02, 2007 @ 14:29 http://sphinn.com/story.php?id=1622 2
69 Sep 08, 2007 @ 19:48 http://sphinn.com/story.php?id=4548 2
70 Sep 05, 2007 @ 01:07 http://sphinn.com/submit.php?url=http://sebastians-pamphlets.com/why-ebay-and-wikipedia-rule-googles-serps 2
71 Sep 06, 2007 @ 13:22 http://sphinn.com/published/page/4 2
72 Sep 16, 2007 @ 13:30 http://sphinn.com/story.php?id=3783 2
73 Sep 18, 2007 @ 11:55 http://sphinn.com/story.php?id=5973 2
74 Sep 19, 2007 @ 08:15 http://sphinn.com/story.php?id=6122 2
75 Sep 19, 2007 @ 14:37 http://sphinn.com/story.php?id=6124 2
76 Oct 23, 2007 @ 00:07 http://sphinn.com/story/10387 2
77 Jul 16, 2007 @ 18:21 http://sphinn.com/upcoming/category/AllCategories/parent_name/All Categories 1
78 Jul 19, 2007 @ 20:19 http://sphinn.com/story/864 1
79 Jul 20, 2007 @ 15:57 http://sphinn.com/story/title/Buy-Viagra-from-Reddit 1
80 Jul 27, 2007 @ 10:48 http://sphinn.com/story/title/Blogger-to-rule-search-engine-visibility 1
81 Jul 31, 2007 @ 06:07 http://sphinn.com/story/title/The-Unavailable-After-tag-is-totally-and-utterly-useless 1
82 Aug 02, 2007 @ 14:45 http://sphinn.com/user/view/history/login/Sebastian 1
83 Aug 03, 2007 @ 10:59 http://sphinn.com/story.php?id=1976 1
84 Aug 06, 2007 @ 03:59 http://sphinn.com/user/view/commented/login/Sebastian 1
85 Aug 15, 2007 @ 08:27 http://sphinn.com/category/LinkBuilding 1
86 Aug 15, 2007 @ 14:17 http://sphinn.com/story/2764/editcomment/4362 1
87 Aug 28, 2007 @ 13:42 http://sphinn.com/story/849 1
88 Sep 09, 2007 @ 15:15 http://sphinn.com/user/view/commented/login/flyingrose 1
89 Sep 10, 2007 @ 05:15 http://sphinn.com/published/page/20 1
90 Sep 10, 2007 @ 05:55 http://sphinn.com/published/page/19 1
91 Sep 11, 2007 @ 12:22 http://sphinn.com/published/page/8 1
92 Sep 11, 2007 @ 23:13 http://sphinn.com/category/Blogging 1
93 Sep 12, 2007 @ 09:04 http://sphinn.com/story.php?id=5362 1
94 Sep 13, 2007 @ 06:36 http://sphinn.com/category/GoogleSEO/parent_name/Google 1
95 Sep 14, 2007 @ 08:21 http://hwww.sphinn.com 1
96 Sep 16, 2007 @ 14:52 http://sphinn.com/GoogleSEO/Did-Matt-Cutts-by-accident-reveal-a-sure-fire-procedure-to-identify-supplemental-results 1
97 Sep 18, 2007 @ 08:05 http://sphinn.com/story/5721 1
98 Sep 18, 2007 @ 09:08 http://sphinn.com/story/title/If-yoursquore-not-an-Amway-millionaire-avoid-BlogRush-like-the-plague 1
99 Sep 18, 2007 @ 10:02 http://sphinn.com/story/5973#wholecomment8559 1
100 Sep 19, 2007 @ 11:48 http://sphinn.com/user/view/voted/login/bhancock 1
101 Sep 19, 2007 @ 20:27 http://sphinn.com/published/page/5 1
102 Sep 20, 2007 @ 00:39 http://blogmarks.net/my/marks,new?title=How to get the perfect logo for your blog&url=http://sebastians-pamphlets.com/how-to-get-the-perfect-logo-for-your-blog/&summary=&via=http://sphinn.com/story/6122 1
103 Sep 20, 2007 @ 01:34 http://sphinn.com/user/page/3/voted/Wiep 1
104 Sep 24, 2007 @ 15:49 http://sphinn.com/greatesthits/page/3 1
105 Sep 24, 2007 @ 19:51 http://sphinn.com/story.php?id=6761 1
106 Sep 24, 2007 @ 22:32 http://sphinn.com/greatesthits/page/2 1
107 Sep 26, 2007 @ 15:13 http://sphinn.com/story.php?id=7170 1
108 Sep 29, 2007 @ 05:27 http://sphinn.com/category/SphinnZone 1
109 Oct 09, 2007 @ 11:44 http://sphinn.com/story.php?id=8883 1
110 Oct 10, 2007 @ 10:04 http://sphinn.com/published/month 1
111 Oct 24, 2007 @ 15:07 http://sphinn.com/story.php?id=10881 1
112 Oct 26, 2007 @ 09:53 http://sphinn.com/story.php?id=11205 1
113 Oct 30, 2007 @ 08:58 http://sphinn.com/upcoming/page/3 1
114 Oct 30, 2007 @ 12:31 http://sphinn.com/upcoming/most 1
Total  3,688
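If you want to pull numbers like these from your own logs, here’s a quick-and-dirty sketch. It assumes an Apache “combined” format access log named access.log; adjust the path and pattern to your own setup.

    import re
    from collections import Counter

    # Match the referrer field of a combined log line, Sphinn referrers only.
    referrer_re = re.compile(r'"[A-Z]+ [^"]+" \d+ \S+ "(http://sphinn\.com[^"]*)"')

    counts = Counter()
    with open("access.log") as log:
        for line in log:
            m = referrer_re.search(line)
            if m:
                counts[m.group(1)] += 1

    for url, n in counts.most_common(20):
        print(f"{n:5d}  {url}")
    print("Total:", sum(counts.values()))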



Share Your Sphinn Love!

Donna started a meme with a very much appreciated compliment –thanks, Donna! Like me, she discovered a lot of “new” folks at Sphinn and enjoyed their interesting blogs.

Savoring Sphinn comes with a duty, Donna thinks, so she appeals to everyone to share the love. She’s right: all of us benefit from Sphinn love, so it’s only fair to spread it a little. However, picking only three people I’d never have come across without Danny’s newest donation to the Internet marketing community is a tough task. Hence I wrote a long numbered list and diced. Alea iacta est. Here are three of the many nice people I’ve met at Sphinn:

Hamlet Batista – blog, feed, a post I like
Tadeusz Szewczyk – blog, feed, a post I like
Tinu Abayomi-Paul – blog, feed, a post I like

To those who didn’t make it on this list: That’s just kismet, not bad karma! I bet you’ll appear in someone’s share the sphinn love post in no time.

To you three: Write your own sphinn love post and choose three sphinners who write a feed-worthy blog, preferably people not yet featured elsewhere. I’ve subscribed to a couple of feeds of blogs discovered at Sphinn, and so have you. There’s so much great stuff at Sphinn that you’re spoilt for choice.




If you’re not an Amway millionaire avoid BlogRush like the plague!

Do not click BlogRush affiliate links before you’re fully awake. Oh no, you did it … now praise me (because I’ve sneakily disabled the link) and read on.

    My BlogRush Summary:

  1. You won’t get free targeted traffic to your niche blog.
  2. You’ll make other people rich.
  3. You’ll piss off your readers.
  4. You’ll promote BlogRush and get nothing in return.
  5. You shouldn’t trust a site fucking up the very first HTTP request.
  6. Pyramid schemes just don’t work for you.

You won’t get free targeted traffic to your niche blog

The niches you can choose from are way too broad. When you operate a niche blog like mine, you can choose “Marketing” or “Computers & Internet”. Guess what great traffic you gain with headlines about elegant click tracking or debunking meta refresh myths from blogs selling MySpace templates to teens or RFID chips to wholesalers? In reality you get hits via blogs selling diet pills to desperate housewives (from my referrer stats!) or viagra to old age pensioners, if you see a single BlogRush referrer in your stats at all. (I’ve read a fair amount of the hype about actually targeted headline delivery in BlogRush widgets. I just don’t buy it from what I see on blogs I visit.)

You’ll make other people rich

Look at the BlogRush widget in your or my sidebar, then visit lots of other niche blogs that focus more or less on marketing related topics. All these widgets carry ads for generic marketing blogs pitching just another “make me rich on the Internet while I sleep” scheme or their very own affiliate programs. These blogs, all early adopters, will hoard BlogRush’s traffic potential. Even if you could sign up at the root, placing yourself at the top of the pyramid’s referral structure, you can’t prevent the big boys with gazillions of owed impressions in BlogRush’s “marketing” queue from dominating all widgets out there, yours included. (I heard that John Reese will try to throw a few impressions at tiny blogs before niche bloggers get upset. I doubt that will be enough to keep his widgets up.)

You’ll piss off your readers

Even if some of your readers recognize your BlogRush widget, they’ll wonder why you recommend totally unrelated generic marketing gibberish on your nicely focused blog. Yes, every link you put on your site is a recommendation. You vouch for this stuff when you link out, even when you don’t control the widget’s content. Read Tamar’s Why the Fuss about BlogRush? to learn why this clutter is useless for your visitors. Finally, the widget slows your site down and your visitors hate long loading times.

You’ll promote BlogRush and get nothing in return

When you follow the advice handed out by BlogRush and pitch their service with posts and promotional links on your blog, you help BlogRush skyrocket at the search engines. That will bring them a lot of buzz, but you get absolutely nothing for your promotional efforts, because your referral link doesn’t land on the SERPs.

You shouldn’t trust a site fucking up the very first HTTP request

Ok, that’s a geeky issue and you don’t need to take it very seriously. Request your BlogRush affiliate link with a plain user agent that doesn’t accept cookies or execute client-side scripting, then read the headers. BlogRush does a 302 redirect to their home page, rescuing your affiliate ID in an unclosed base href directive. Chances are you’ll never get the promised credits from visitors using uncommon user agents or browser settings, because they don’t manage their affiliate traffic properly.
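If you want to see it for yourself, here’s a minimal sketch of that check (the affiliate URL below is a placeholder, not my referral link): fetch the link with a plain user agent, no cookies, no scripting, and dump the status line and headers.

    import http.client, urllib.parse

    url = "http://www.blogrush.com/r123456789"      # placeholder affiliate link
    parts = urllib.parse.urlsplit(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    # A bare request: no cookies, no JavaScript, just one plain user agent string.
    conn.request("GET", parts.path or "/", headers={"User-Agent": "curl/7.19"})
    resp = conn.getresponse()
    print(resp.status, resp.reason)                 # look for the 302 described above
    for name, value in resp.getheaders():
        print(f"{name}: {value}")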

Pyramid schemes just don’t work for you

Unfortunately, common sense is not as common as you might think. I’m guilty of that too, but I’ll leave my widget up for a while to monitor what it brings in. The promise of free traffic is just too alluring, and in fact you can’t lose much. If you want, experiment with it and waste some ad space, but pull it once you’ve realized that it’s not worth it.

Disclaimer

This post was inspired by common sense, experience of life, and a shitload of hyped crap posts on Sphinn’s upcoming list, where folks even created multiple accounts to vote their BlogRush sales pitches to the home page. If anything I’ve said here is not accurate or at least plausible, please submit a comment to set the record straight.



