Monthly archive: October, 2005

I want more Jaggers!

Jagger-1 was good to me, and what I’ve seen from Jagger-2 pleases me even more. I can’t wait for Jagger-3! Dear folks at Google, please continue the Jagger series and roll out a new Jagger weekly; I’d love to see my Google traffic double every week! In return, I’ll double the time I spend on Google user support in your groups. Thanks in advance!

Tags: Search Engine Optimization (SEO), Google




Smart Web Site Architects Provide Meaningful URLs

From a typical forum thread on user/search engine friendly Web site design:

Question: Should I provide meaningful URLs carrying keywords and navigational information?

Answer 1: Absolutely, if your information architecture and its technical implementation allow the use of keyword-rich, hyphenated URLs.

Answer 2: Bear in mind that URLs are permanent, so first consider developing a suitable information architecture and a flexible Web site structure. You’ll find that folders and URLs are the last things to think about.

Question: WTF do you mean?

Answer: Here you go: it makes no sense to paint a house before the architect has finished the blueprints.
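To illustrate the first answer: a meaningful URL is typically generated from the page title plus its position in the site hierarchy. Here’s a minimal sketch in Python (the function names and the example category are mine, not from the thread):

```python
import re

def slugify(title):
    """Lowercase the title, strip punctuation, and join words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

def page_url(category, title):
    """Build a URL that mirrors the site's navigation hierarchy."""
    return "/%s/%s/" % (slugify(category), slugify(title))

print(page_url("Search Engine Optimization", "Meaningful URLs"))
# /search-engine-optimization/meaningful-urls/
```

Note that the category slug comes from the information architecture, not from the file system, which is exactly why the architecture has to exist before the URLs do.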

Tags: Search Engine Optimization (SEO), Information Architecture




MySQL’s ODBC Driver 3.51 drives me nuts

ODBC drivers can drive me crazy. Especially when the ODBC driver is the last thing I look at, because I assumed this darn thing was a well-developed and well-tested piece of open source software.

A site I’m working on collects log data in a MySQL table, counting page views per landing page, referrer page and SE search terms. The stats are nice, but pretty much useless, because with such a structure it’s hard to create summaries and keyword analyses.
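For illustration, here’s roughly what a keyword analysis over such a flat log table amounts to, sketched in Python with made-up rows (the real table lives in MySQL, and its column names differ): every search phrase has to be split into keywords before anything can be summarized, which is exactly the work the flat structure pushes onto the reporting side.

```python
from collections import Counter

# Hypothetical rows as they might sit in the flat log table:
# (landing_page, referrer, search_phrase, page_views)
rows = [
    ("/widgets/", "google.com", "blue widgets cheap", 12),
    ("/widgets/", "google.com", "cheap widgets", 7),
    ("/gadgets/", "yahoo.com",  "blue gadgets", 3),
]

# Keyword analysis: split each phrase and weight keywords by page views.
keyword_counts = Counter()
for landing_page, referrer, phrase, views in rows:
    for keyword in phrase.split():
        keyword_counts[keyword] += views

print(keyword_counts.most_common(3))
```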

Luckily Progress OpenEdge was available, so I thought it should be possible to read the MySQL table via ODBC from the Web server and create all reports with Progress, which has great temp-table support, amazingly fast word indexing, and handles billions of large records with ease.

Well, I downloaded, installed and configured the MySQL ODBC Driver 3.51, and ran a successful connection test. So far, so good, but then the nightmare began. I couldn’t create the ODBC data server instance in Progress, and as always the error messages were misleading.

To make a long story short, the current MySQL ODBC driver lacks so much functionality that it cannot work. The answer is buried in the PEG mailing list archive, which is not fully indexed by Google. Gus Bjorklund of Progress Software states: “The ODBC dataserver will not work due to a variety of functions not implemented in the MySQL ODBC driver … As people who have tried it have discovered, MySQL does not yet have a complete enough implementation of the SQL DML”.

Frustrating. Back to the stone age. Oops. Transferring full table dumps failed due to the sheer amount of data. Aaahhhrrrggg. Developing a Web service in PHP that sends selected data in handy batches makes my day.
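The batching idea itself is simple enough. Here’s a rough sketch, in Python with SQLite standing in for MySQL (the table and column names are invented): fetch rows keyed on the last seen id, so no single transfer grows too large.

```python
import sqlite3

# In-memory stand-in for the real log table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, page TEXT)")
con.executemany("INSERT INTO log (page) VALUES (?)",
                [("/page-%d/" % i,) for i in range(10)])

BATCH_SIZE = 4

def fetch_batches(con):
    """Yield rows in small batches, keyed on the last seen id,
    so no single transfer chokes on the amount of data."""
    last_id = 0
    while True:
        batch = con.execute(
            "SELECT id, page FROM log WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, BATCH_SIZE)).fetchall()
        if not batch:
            break
        yield batch
        last_id = batch[-1][0]

batch_sizes = [len(b) for b in fetch_batches(con)]
print(batch_sizes)  # 10 rows in batches of 4: [4, 4, 2]
```

Keying on the last id instead of using OFFSET keeps each query cheap even when the log table grows large.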

However, does anybody have (or has anybody heard of) a fully working ODBC driver for MySQL?

Tags: MySQL, ODBC, Progress, Miserable Failure




Duplicate Content Filters are Sensitive Plants

In their everlasting war on link and index spam, search engines produce way too much collateral damage. Hierarchically structured content especially suffers from over-sensitive spam filters. The crux is that user-friendly pages need to duplicate information from upper levels. The old rule “what’s good for users will be honored by the engines” no longer applies.

In fact, the problem is not the legitimate duplication of key information from other pages; the problem is that duplicate content filters are sensitive plants, unable to distinguish useful repetition from the automated generation of artificial spider fodder. The engines won’t lower their spam thresholds, which means they will not fix this persistent bug in the near future, so Web site owners have to live with decreasing search engine traffic, or react. The question is: what can a Webmaster do to escape the dilemma without turning the site into a useless nightmare for visitors by eliminating all textual redundancy?

The major fault of Google’s newer dupe filters is that their block-level analysis often fails at categorizing page areas. Page elements in and near the body area that contain key information duplicated from upper levels are treated as content blocks, not as part of the page template to which they logically belong. As long as those text blocks reside in separate HTML block-level elements, it should be quite easy to rearrange them so that the duplicated text becomes part of the page template, which should be safe at least with somewhat intelligent dupe filters.
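As a minimal sketch of that rearrangement (the element ids are invented, and real templates are of course more involved): emit the duplicated upper-level blurb inside a template region, and keep only the unique copy in the content block.

```python
def render_page(template_blurb, unique_body):
    """Render the duplicated upper-level text in a template region,
    keeping only unique copy in the main content block."""
    return (
        '<div id="template-nav">%s</div>\n'
        '<div id="content">%s</div>'
    ) % (template_blurb, unique_body)

# The blurb repeats on every page of the branch; the body is unique per page.
category_blurb = "Our widget department sells blue widgets."
page = render_page(category_blurb, "The XL blue widget ships worldwide.")
print(page)
```

The point is purely structural: the same words appear on every page of the branch either way, but a block-level analyzer should classify the first div as template boilerplate and only weigh the second as content.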

Unfortunately, the raw data very often aren’t normalized; for example, the duplicated text lives inside a description field in a database’s products table. That’s a major design flaw, and it must be corrected in order to manipulate block-level elements properly, that is, to declare them as part of the template vs. part of the page body.

My article Feed Duplicate Content Filters Properly explains a method to revamp page templates of eCommerce sites on the block level. The principle outlined there can be applied to other hierarchical content structures too.

Tags: Search Engine Optimization (SEO), Google




New Google Dupe Filters?

Folks at WebmasterWorld, ThreadWatch and other hang-outs discuss a new duplicate content filter from Google. This odd thing seems to wipe out the SERPs, producing way more collateral damage than any other filter known to SEOs.

From what I’ve read, all threads concentrate on on-page and on-site factors, trying to find a way out of Google’s trash can. I admit that on-page/on-site factors like near-duplicates produced with copy, paste and modify operations, or excessive quoting, can trigger duplicate content filters. But I don’t buy that this is the whole story.

If a fair number of the vanished sites mentioned in the discussions are rather large, those sites are probably dedicated to popular themes. Popular themes are the subject of many Web sites, and the amount of unique information on a popular topic isn’t infinite. That is, many Web sites provide the same pieces of information. The wording may differ, but there are only so many ways to rewrite a press release. The core information is identical, so many pages get classified as near-duplicates, and inserting longer quotes even duplicates text snippets or whole blocks.
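One well-known way to quantify near-duplication, which the engines may or may not use in this exact form, is comparing overlapping word n-grams (“shingles”). A quick sketch in Python with made-up sentences:

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams ('shingles')."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def resemblance(a, b):
    """Jaccard overlap of the two shingle sets: 1.0 means identical."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)

original = "the new widget ships in october and costs ten dollars"
rewrite  = "the new widget ships in november and costs ten dollars"
print(round(resemblance(original, rewrite), 2))  # 5 of 11 shingles shared: 0.45
```

Changing one word only kills the few shingles that contain it, so a lightly rewritten press release still scores far above chance, which is all a filter needs to flag two pages as near-duplicates.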

Semantic block analysis of Web pages is not a new thing. What if Google just bought a few clusters of new machines and is now applying well-known filters to a broader set of data? This would perfectly explain why a year ago four very similar pages all ranked fine, then three of the four disappeared, and since yesterday all four are gone, because the page with the source bonus resides on a foreign Web site. To come to this conclusion, just expand the scope of the problem analysis to the whole Web. This makes sense, since Google says “Google’s mission is to organize the world’s information”.

Read more here: Thoughts on new Duplicate Content Issues with Google.

Tags: Search Engine Optimization (SEO), Google


