Q: Who has the most valuable information on the internet?
A: Del.icio.us has the most valuable information on the internet.
Why?
What amazon has done for its products, del.icio.us has done for the world wide web.
Years ago, the most valuable information on the internet was the millions of bookmarks located on people’s home machines. This is kind of how yahoo began way back in 1994. When you were finally connected to this “Internet” thing, you didn’t really have anywhere to go or anything to do. It was neat and it was cool, but after an hour or so of staring at the screen, and maybe inputting a few sites your friends had told you about, that was it. Without a “You are here” sign describing “Where You Can Go,” you went back to using cc:mail as your first instant messaging system.
Once a friend told you that you should go to yahoo.com to *find something of interest* you were all set. Yahoo became your gateway to the internet. The seed. The root node.
Some weeks later, you discovered that you could make yahoo your *homepage* - whatever that meant - and that’s what you did, and that’s where you started every day.
With a little surfing, you also discovered excite.com, and lycos.com, and netscape.com - but just about everyone started with yahoo.com, because Jerry Yang and David Filo had some text files with an “interesting web site” on each line, and they thought they should share that information with all the people at stanford - with the "Jerry and David's Guide to the World Wide Web" directory. A directory of the web sites that are cool and worth your time. This directory or list of sites became yahoo.
A few years later, alta vista, webcrawler, lycos, dogpile, excite and even yahoo itself became search engines. Alta vista was by far the best search engine until 1998, when google launched their beta service from their garage in Menlo Park. I remember trying out google somewhere around the end of 98 or early 99, and that was it. I searched for linux and php, got what I was looking for, and never went back to the other search engines.
Back in 1999-2000, to find information, you used google (or yahoo, powered by google search). To find links that were good, and passed some kind of human screening, you went to Slashdot, or kuro5hin (the precursor to digg), and places like news.com (cnet) and game central (cnet). In 1999, I went to news.com first, then Slashdot, then kuro5hin, and then a whole bunch of news sites that took me a long time to find and were in my “bookmarks” or “favorites” - places like feed magazine (feedmag.com, oh how i miss you), suck.com, moreover.com, napster.com, scour.net, etc.
Joshua Schachter had come across a problem that a lot of us were having: namely, that we have bookmarks and favorites scattered across different machines/floppies/cd’s/zip disks. They were hard to find … got lost … were destroyed … to send them to your friends, you had to paste them into emails or IM windows.
Joshua created a site named memepool to post links to the public. This was around 2001. memepool showed a descending list of posts, blog-like, that contained some snarky commentary and a link or two. It was similar to the current-day del.icio.us/popular … but was posted by Joshua and a few of his friends. He created memepool to archive his bookmarks. It was essentially a single user system, but it was the first step into a larger world.
In 2003, Joshua decided to create a “multi-user public bookmarking system” and del.icio.us was born. While building delicious, Joshua happened to introduce this little concept he called “tagging” – which, you may have heard, has changed the internet a little bit. His service was also one of the first to introduce human-readable URLs and a REST API. Delicious also led the way in the concept of a user "owning" their data and provided easy methods for exporting ALL of your data at any time. "Pundits" are just now starting to get this concept, but delicious was way ahead of everyone else in providing this essential ingredient to adoption.
So, delicious has been operating for about 4 years, and has some x millions of users and some y (b)(m)illions of links in its database. This represents perhaps hundreds of thousands of “people hours” spent finding the web sites on the Internet that are valuable and worth checking-out.
Delicious represents choices that have been made. Delicious is a learning machine, and it has done learnt a shitload about what is not shit on the internet.
If you were starting your own search engine, the best way to seed your spiders would be to point them at your delicious bookmarks. I’ve done this with Nutch. It works quite nicely and creates a pure, useful search engine. Why? Because your bookmarks are relevant and highly valuable. You've spent countless hours surfing and made a conscious and implicit choice to save and tag those specific urls. You took the time to say, “this site is worthy and I will now spend the next 2-30 seconds going through the ass pain of saving and tagging it for future consideration.”
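To make the seeding idea concrete, here is a minimal sketch of turning a pile of exported bookmarks into a one-URL-per-line text file, the kind of seed list you can point a crawler such as Nutch at. The function names (`build_seed_list`, `write_seed_file`) and the normalization rules are assumptions for illustration, not anything delicious or Nutch ships.

```python
from urllib.parse import urlsplit, urlunsplit


def build_seed_list(bookmarked_urls):
    """Return a sorted, deduplicated list of http(s) URLs.

    Hypothetical helper: the normalization here (lowercase host,
    drop the #fragment, default the path to "/") is an assumption.
    """
    seeds = set()
    for raw in bookmarked_urls:
        parts = urlsplit(raw.strip())
        # Skip anything a web crawler cannot fetch (ftp:, javascript:, junk).
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        normalized = urlunsplit(
            (parts.scheme, parts.netloc.lower(), parts.path or "/", parts.query, "")
        )
        seeds.add(normalized)
    return sorted(seeds)


def write_seed_file(bookmarked_urls, path):
    """Write one seed URL per line and return how many were written."""
    seeds = build_seed_list(bookmarked_urls)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(seeds) + "\n")
    return len(seeds)
```

Something like `write_seed_file(my_urls, "seed.txt")` would then give the spiders a starting point that has already passed a human filter.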
This is implicit relevance, authority, pagerank: you name it. All wrapped-up in one. These url’s are pristine as far as indexes go. The super-secret highly complex stealth-mode ranking algorithm filter they passed through was: the not-too-shabby human neo cortex.
Sure, the delicious database of links is not comprehensive in the “indexing every damn one of the billions and billions of websites on the intarweb” sense. It doesn’t have to be. Delicious points to the sites that matter for a given tag. A tag is essentially a key word search in reverse.
My mom is not a php programmer and is not tagging url's with the tag “php” – but lots of php programmers are, and thus the list is comprehensive, interesting, ranked and has implicit authority for php programmers by php programmers.
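The "key word search in reverse" idea can be sketched as a tiny inverted index: instead of a machine guessing which pages match "php", the people who saved the page have already filed it under that tag. This is only an illustration; the `(user, url, tags)` tuple layout and the function names are assumptions, and ranking by the number of distinct savers is just one plausible reading of "ranked".

```python
from collections import defaultdict


def build_tag_index(bookmarks):
    """Map tag -> url -> set of users who saved that url with that tag.

    bookmarks is an iterable of (user, url, tags) tuples (assumed layout).
    """
    index = defaultdict(lambda: defaultdict(set))
    for user, url, tags in bookmarks:
        for tag in tags:
            index[tag.lower()][url].add(user)
    return index


def urls_for_tag(index, tag):
    """Return (url, saver_count) pairs for a tag, most-saved first."""
    by_url = index.get(tag.lower(), {})
    return sorted(
        ((url, len(users)) for url, users in by_url.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
```

Counting distinct users rather than raw saves means one enthusiastic person re-saving a link does not push it up the list.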
So, delicious is the best platform and resource for what is good on the internet. It is the best database for discovering sites worth your time. It is simple: if a url was good enough to bookmark and tag, it is typically worth my time to look at it. And just like Amazon, it says to you, “if you liked this url, you may also like these urls and these tags and these users.” Not only do you find what you're looking for on delicious, but you discover new things all the time via related tags and users.
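The "if you liked this url, you may also like these urls" part can be approximated with simple co-occurrence counting: look at everyone who saved the url you liked and tally what else they saved. This is a rough sketch under my own assumptions (the `(user, url)` pair layout and the `related_urls` name are made up here), not a description of how delicious or Amazon actually compute it.

```python
from collections import Counter, defaultdict


def related_urls(bookmarks, liked_url, limit=5):
    """Suggest urls saved by the same users who saved liked_url.

    bookmarks is an iterable of (user, url) pairs (assumed layout).
    """
    urls_by_user = defaultdict(set)
    for user, url in bookmarks:
        urls_by_user[user].add(url)

    counts = Counter()
    for urls in urls_by_user.values():
        if liked_url in urls:
            # Every other url this user saved gets one vote.
            counts.update(urls - {liked_url})

    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [url for url, _ in ranked[:limit]]
```

The same tallying works for related tags or related users by swapping what gets counted.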
Sure, you run into different ideas of what is “worthy” by different users…and you may not agree with its worthiness; but, when it comes to tags coupled with urls, and urls that have been saved by many different people, you can be pretty confident that a given url is worth your time.
You might ask about spam. What about spammers on delicious? Won’t they ruin the pristine indexes and wreck the party like they have everywhere else? Perhaps. I’ve seen spam on delicious, but only recently…and it appears to be getting purged. However, since delicious is pretty much totally user generated, you can count on delicious users to help out with the spam problem. Basically adding a “this is spam” button next to every link would do the trick. The stuff you have to do on the backend to make this work is a little involved, but not impossible. I know: I’ve spent a lot of time at technorati coming up with ways in which this could be done.
You could also do it the delicious way: tag a spam url with the tag “thisisspam”. Delicious would get valuable human spam review for free, and could create filters that do not show urls with a thisisspam tag. Its index is cleaned by unwashed humans AND it has a valuable index of urls that are spam - which can perhaps be sold to companies that need such a list for their own indexes. In addition, you are not even purging the system of spam (thus making spammers cry their little eyes out about how unfair everything is), just placing a filter on it … so, if people really wanna look at the spam, they just check out http://del.icio.us/tag/thisisspam and the river of tears is silenced.
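Here is a minimal sketch of that filter, assuming bookmarks arrive as `(user, url, tags)` tuples. The `split_spam` function and its `min_reports` threshold are my own hypothetical additions (a threshold keeps a single grumpy user from hiding a link); nothing gets deleted, the spam just lands in its own list.

```python
from collections import defaultdict

SPAM_TAG = "thisisspam"


def split_spam(bookmarks, min_reports=1):
    """Return (visible_urls, spam_urls) without deleting anything.

    bookmarks is an iterable of (user, url, tags) tuples (assumed layout).
    A url counts as spam once min_reports distinct users tag it SPAM_TAG.
    """
    reporters = defaultdict(set)
    seen = set()
    for user, url, tags in bookmarks:
        seen.add(url)
        if SPAM_TAG in (tag.lower() for tag in tags):
            reporters[url].add(user)

    spam = {url for url, users in reporters.items() if len(users) >= min_reports}
    return sorted(seen - spam), sorted(spam)
```

The second list is the by-product worth having: a human-reviewed set of spam urls.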
When people talk about user generated search, or human edited search…they don't seem to realize that they have it right now with delicious. If you can imagine an application that combines delicious with google/powerset and wikipedia/answers.com, you can imagine what the next killer app is going to be. In fact, with these pieces of technology combined, all you need to do is add in a feedback mechanism, couple it to a yahoo answers–like interface and you have what everyone wants:
a global-brain/library-of-alexandria2.0/skynet/machine-that-has-learned-from-us creature which can answer any question we ask it.
…and has a bunch of answers for questions we have not even thought to ask it.
So, with the acquisition of delicious, yahoo has come back around to "Jerry and David's Guide to the World Wide Web" in a big way. If they can combine and enhance their search/knowledge/media technologies, they will be the first company to produce the web 3.0 app everyone has been breathlessly waiting for: an application/creature to which you say, “Just give me the good shit” - and then it does.