Have you ever wanted to add the current Technorati Top 100 to your feed reader, but couldn’t find an XML or OPML feed? Well, today your fears are over. In a mere 71 lines of PHP, I’ve created a script that parses, caches, and gzips the top blogs of the day. This means that it’s nice on Technorati’s servers, nice on my servers, bandwidth-savvy, and fast for you. If you want to import the world’s most popular blogs into your RSS reader today, use this file:
What does it look like? A standard XML file, with UTF-8 encoding:
How do you import it? If you’re using Bloglines, visit this URL:
Then you should see a file upload form. Simply save my OPML file to your desktop, and upload it into Bloglines:
After that, you’ll be all set to enjoy reading the best blogs on the planet.
* For you standards fanatics, this validates
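For the curious, the cache-and-gzip idea mentioned at the top can be sketched roughly as follows. This is not the original 71-line script. The helper names `cached_fetch` and `gzip_if_accepted`, the cache file path, and the cache lifetime are all assumptions made up for illustration.

```php
<?php
// Hypothetical helper: return the cached copy if it is fresh enough,
// otherwise call $producer to rebuild the content and store it.
function cached_fetch($cacheFile, $maxAge, $producer)
{
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        return file_get_contents($cacheFile);
    }
    $fresh = call_user_func($producer);
    file_put_contents($cacheFile, $fresh, LOCK_EX);
    return $fresh;
}

// Hypothetical helper: gzip the body only when the client's
// Accept-Encoding header mentions gzip (and zlib is available).
// Returns array(body, extra response headers).
function gzip_if_accepted($body, $acceptEncoding)
{
    if (function_exists('gzencode') && stripos($acceptEncoding, 'gzip') !== false) {
        return array(gzencode($body), array('Content-Encoding: gzip'));
    }
    return array($body, array());
}
```

In a page script you would pass something like `$_SERVER['HTTP_ACCEPT_ENCODING']` as the second argument of `gzip_if_accepted` and send the returned headers with `header()` before echoing the body. The upstream site then only gets hit when the cache file is older than `$maxAge` seconds.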
I took this down to add RSS/Atom autodiscovery and caching, so that it adds the xmlUrl as well as the htmlUrl to the OPML file. This is troublesome, since the standard link tag for this:
is not implemented properly on the following blogs that otherwise have RSS:
I have sent the following email collectively to them:
I am contacting you because you are either the listed technical contact of your online blog, or you were the most prominent email contact address on your site. You are also on the technorati top 100 list currently, and have an RSS feed.
Unfortunately, your site does not properly define a LINK tag for your rss feed. You can read about the specification here:
The format of the link tag is simple. Just add:
<link rel="alternate" type="application/rss+xml" title="RSS" href="url/to/rss/file">
in the HEAD section of your blog. This allows rss readers and other web agents to autodiscover your feed url and present it to end-users. Autodiscovery leads to greater readership, more page hits, and even more popularity. It also enables web-developers and users to easily locate the XML feeds on your site. Everyone wins.
Sadly, there is also an equal or greater number of Technorati Top 100 sites completely lacking RSS…
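To show what autodiscovery means on the consuming side, here is a minimal sketch of how a script might look for that link tag in a page’s HTML. It is not the code my script uses. The function name `discover_feed_url` is hypothetical, and it assumes PHP’s DOM extension is available.

```php
<?php
// Hypothetical helper: return the first feed URL advertised through a
// <link rel="alternate"> tag, or null when the page advertises none.
function discover_feed_url($html)
{
    if (trim($html) === '') {
        return null; // nothing to parse
    }
    $feedTypes = array('application/rss+xml', 'application/atom+xml');

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('link') as $link) {
        $rels = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))));
        $type = strtolower(trim($link->getAttribute('type')));
        $href = trim($link->getAttribute('href'));
        if (in_array('alternate', $rels, true)
            && in_array($type, $feedTypes, true)
            && $href !== '') {
            return $href;
        }
    }
    return null;
}
```

Note that this sketch returns the href exactly as written in the page, so a relative URL would still need to be resolved against the blog’s address. A blog without the tag simply yields null, which is exactly the problem described above.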
I’ve fixed problems loading a file using PHP’s file_get_contents with Dreamhost, since Dreamhost prefers cURL:
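// Fetch $url with cURL instead of file_get_contents
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1); // give up if no connection within 1 second
$data = curl_exec($ch); // false on failure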
Everything should now be running smoothly. Entities are now properly escaped too, for XML well-formedness.
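The escaping part can be illustrated with a short sketch. Again, this is an assumption about how one might do it rather than a copy of my script: `opml_outline` is a made-up helper, and the `ENT_XML1` flag needs PHP 5.4 or later.

```php
<?php
// Hypothetical helper: build one OPML <outline> element with every
// attribute value escaped so the file stays well-formed XML.
function opml_outline($title, $htmlUrl, $xmlUrl = null)
{
    $attrs = array('text' => $title, 'htmlUrl' => $htmlUrl);
    if ($xmlUrl !== null) {
        $attrs['type'] = 'rss';
        $attrs['xmlUrl'] = $xmlUrl;
    }
    $out = '<outline';
    foreach ($attrs as $name => $value) {
        // Escapes &, <, > and both kinds of quotes.
        $escaped = htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $out .= ' ' . $name . '="' . $escaped . '"';
    }
    return $out . ' />';
}
```

A blog title with an ampersand or a URL with a query string is the usual culprit: without escaping, a single bare & is enough to make the whole OPML file fail to parse.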
Technorati changed their Top 100 format again, so I’ve updated the parser. Lemme know next time it breaks.
Someone just left a comment complaining that the feed broke. It did. It’s fixed (again). Let me know if it doesn’t work b/c Technorati changed something…
This entry was posted on Wednesday, August 3rd, 2005 at 3:12 am.