Friday, September 12, 2008

Metadata does a website good.

Maybe you remember that old tag line from the dairy industry here in the US. The advert claimed that "Milk does a body good."

Metadata on the web has a checkered past. Back when I was starting to put up websites in the mid '90s, metadata tags embedded in your webpage were what search crawlers used to help determine how significant a particular topic was to a page. So you'd overload the meta tags in your HTML to spam the page with the terms you wanted to rank highly for. The tags quickly became worthless as everyone stuffed in whatever the hot search terms were, even if the page didn't relate to that subject at all.

So why are we talking about metadata again? A picture is worth a thousand words.


That's United Airlines stock. You can't help but notice that it took a very hard hit on Monday. That's because a news story went out saying the company was close to declaring bankruptcy. Unfortunately for the good shareholders and employees of UAL, that story had been dredged up from earlier this decade on a Florida newspaper's website. Google News saw that it was getting a lot of links and play and picked it up, bringing it to the front page for millions of people. Once the stock started to drop, in rolled the automatic sells and voilà, you have the makings of a very bad day.

A mistake. But for UAL stockholders and employees, not all that funny. The stock, even after recovering most of its value that same day, is still trading over 10% lower than it was. That's a lot of value to lose because an old news story was mistakenly played as current.

That's where metadata comes in. If the Florida newspaper's website had used metadata to tag each of its news stories with a "published date", the crawler could have known to safely ignore that story. Hell, the newspaper could have even built its own crawler to stamp the archived content with simple pieces of metadata like "Last Published" and "Written By". Metadata can be held in a separate data structure that the publishing system then joins to the content, so you don't have to rebuild old legacy systems. If you're building a new content system for the web, metadata is a fantastically powerful tool for showing your users things like related content, manuals for your products, aftermarket items that connect to the product they are looking at, service programs for the product, etc.
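To make the idea concrete, here is a rough sketch in Python of both halves: a publishing step that joins a separate metadata record to an untouched legacy page, and a crawler-side check that reads the date back. Everything named here is an assumption for illustration: the `published-date` and `written-by` tag names, the `METADATA` store, the example path, date, and byline, and the seven-day freshness window. It is not how any real newspaper system or news crawler works.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from html import escape
from html.parser import HTMLParser


@dataclass
class StoryMetadata:
    published: date
    written_by: str


# Hypothetical sidecar store, kept apart from the legacy content and
# keyed by URL path. The path, date, and byline are made up.
METADATA = {
    "/archive/old-story.html": StoryMetadata(date(2002, 1, 15), "Staff Writer"),
}


def add_meta_tags(html: str, meta: StoryMetadata) -> str:
    """Publishing step: join the metadata to the page by adding <meta> tags."""
    tags = (
        f'<meta name="published-date" content="{meta.published.isoformat()}">\n'
        f'<meta name="written-by" content="{escape(meta.written_by, quote=True)}">\n'
    )
    # Insert right after the first <head>; the stored content itself is not edited.
    return html.replace("<head>", "<head>\n" + tags, 1)


class PublishedDateParser(HTMLParser):
    """Crawler side: pick the published date out of the <meta> tags."""

    def __init__(self):
        super().__init__()
        self.published = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") == "published-date" and attributes.get("content"):
            self.published = date.fromisoformat(attributes["content"])


def is_current(html: str, today: date, max_age_days: int = 7) -> bool:
    """Return True only if the page states a published date within the window."""
    parser = PublishedDateParser()
    parser.feed(html)
    if parser.published is None:
        return False  # unknown age: do not treat the story as fresh news
    return today - parser.published <= timedelta(days=max_age_days)


legacy_page = "<html><head><title>Old story</title></head><body>...</body></html>"
page = add_meta_tags(legacy_page, METADATA["/archive/old-story.html"])
print(is_current(page, today=date(2008, 9, 8)))
```

Treating a page with no date as "not current" is a deliberate choice in this sketch: when the age of a story is unknown, the safer default is to keep it off the front page.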

Got an old stale site? It doesn't take much to wrap a metadata system around it so you can flag content. If your content quality is high enough, you can even let your visitors help you tag some of the content groupings. Good luck!
