I love the idea behind Cuil, the latest in a long list of search engines (Mahalo, Ask, Powerset) that have tried and failed to challenge Google. As Mashable explains, Cuil is pulling out all the stops to hit Google from multiple directions across its core search competency:
Enter Cuil, a very serious competitor, packed with ex-Googlers (Tom Costello and Anna Patterson are the backbone of Cuil, and they’ve both worked at Google), and claiming to have the largest index of websites – 120 billion – in the world.
It doesn’t end there: Cuil pulls pretty much every trick in the book. Big claims about the biggest index, privacy concerns (IP addresses of users aren’t saved, making it impossible for a third party to request it from them), semi-semantic approach (Cuil’s engine recognizes the relations between certain words on a web site, which helps it rank pages better). Hell, they even pulled the energy-saving trick: the front page of Cuil is completely black, in contrast to Google’s eye-poking whiteness.
Check out the Slashdottie thread for more discussion. I’m not interested in going there; rather I’m more concerned with how relevant the results from Cuil are, compared to Google, in a stricter context of information retrieval. After all, a search engine is about finding information.
4 of the 9 total results are spam from Ebooksbay. An additional 4 are for converting MP3s. The final result (which is quite spammy) is for ripping DVDs to a variety of formats. Score: 11%.
Google gives you 7 DVD ripping guides and 3 spam sites for ripping software. Essentially, you have to give it a Score: 100%, since it’s pretty much the baseline in our test. Just based on what I’ve seen so far, this will be a comparison not of relative merits, but of how much less relevant the results from Cuil are compared to Google.
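For anyone who wants to redo the arithmetic, here is a minimal sketch of the scoring, assuming a score is simply the fraction of results judged relevant (plain precision). The `precision` helper and the judgment lists are hypothetical, reconstructed from the counts above; they are not actual output from either engine.

```python
def precision(judgments):
    """Return the fraction of results judged relevant (True)."""
    if not judgments:
        return 0.0
    return sum(judgments) / len(judgments)


# Hypothetical judgments rebuilt from the counts in this post.
# Cuil: 9 results, only the last one (the DVD ripper) is on topic.
cuil_dvd = [False] * 8 + [True]
# Google: 7 ripping guides plus 3 spam sites.
google_dvd = [True] * 7 + [False] * 3

cuil_score = precision(cuil_dvd)
google_score = precision(google_dvd)

print(f"Cuil:   {cuil_score:.0%}")    # 1 of 9 relevant
print(f"Google: {google_score:.0%}")  # 7 of 10 relevant
```

Counted this way, Cuil lands on the 11% above, while Google's raw precision comes out at 70%. The 100% I gave Google treats it as the baseline rather than as a raw count.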
Wait, what is that in the rightmost result!!!? Yes, that winsome young woman is carefully inspecting a ConcurrentHashMap! Ahem. Bad image-to-result matching aside, the search listings fail to include the authoritative Java documentation source (Sun’s website) and instead list 2 mirrors (Java 5 and 6), 4 bug reports, 3 mailing list discussions, and 2 random libraries with a similarly named class. Score: 50%.
Google nicely gives us the Sun Java page as the first result, 2 snippets of code using this class, 6 guides to using concurrent hash maps, a benchmark, one of the same random libraries as Cuil (Oswego), and a different random library (backport-util). I’d give them Score: 80% at this task.
Anyway, I’m getting tired of writing this. Cuil just doesn’t deliver fast, consistent, high-quality search results. The relevance is quite low, in spite of the interface improvements and searching / clustering / recommendation features.
This entry was posted on Tuesday, July 29th, 2008 at 9:16 pm.