July 30, 2004
Notes on RSS Scaling Problem
I dropped by the RSS Scaling Problem BoF tonight. Even though I'm in a different ball game than Yahoo! or LiveJournal, I was interested in hearing the discussion. Thinking it over afterward has turned up a few thoughts on the ideas from the meeting.
Seems to me there are several different use cases that might demand different solutions. You've got the user who launches the aggregator every time they want to read vs. the user who opens it up once and leaves it open all day. You also have the company that provides a weblog or feed service, gets millions of repeated requests for unchanged feeds, and can afford to implement some large-scale solution vs. the individual on a DSL modem who can afford nothing.
There were a few ideas in the meeting. A few stand out:
- respect the ttl tag if it's in the feed
- when a client requests a feed the HTTP connection is kept open and entries are fed down the wire as they are updated
- create a DNS-like lookup where you can look up a feed and get a response with the latest update time for that feed - perhaps be able to ask for the last-modified times of several feeds at once
- peer-to-peer feeds
- use Jabber or a chat-like protocol so feeds are sent when available
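The first idea is the cheapest one for a client to adopt. As a rough sketch only, assuming an RSS 2.0 feed where ttl is a child of channel and is given in minutes, an aggregator could read it like this. The function name and the 60-minute fallback are my own placeholders, not anything agreed on at the BoF:

```python
import xml.etree.ElementTree as ET

# Assumed fallback for feeds that carry no usable <ttl>; not part of RSS itself.
DEFAULT_TTL_MINUTES = 60

def feed_ttl_minutes(rss_text, default=DEFAULT_TTL_MINUTES):
    """Return the channel's <ttl> in minutes, or `default` if absent or invalid."""
    channel = ET.fromstring(rss_text).find("channel")
    if channel is None:
        return default
    raw = channel.findtext("ttl")  # None when the feed has no <ttl>
    try:
        ttl = int(raw.strip())
    except (AttributeError, ValueError):
        return default
    return ttl if ttl > 0 else default
```

A feed with no ttl, or with junk in it, just falls back to the client's default, so respecting the tag never makes things worse than they are today.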
Maybe there can be some kind of combination:
- client respects the ttl tag
- if Yahoo, LJ, other weblog/feed services keep user posting trends they can tweak the ttl as they send the feeds out (allowing the user to override)
- client keeps track of last request persistently to preserve last access even if client is closed and relaunched
- client makes a conditional GET or a request for just the HTTP headers; if the server wants to send the feed over a persistent connection, the full feed is sent to the client and continuously updated
- does the client know if the HTTP connection closed? does it initiate a new connection automatically if it gets disconnected?
- this is great for news, or other have-to-know-now feeds, but for personal weblogs I'm not sure it's worth keeping the connection open if the next update won't be for several hours
- client makes a request for the HTTP headers, server returns a header, date is compared with last update and feed is grabbed only if the feed has been updated
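The client side of that combination might look something like the sketch below. It is only an illustration under my own assumptions: the state lives in a small JSON file, timestamps are stored as ISO 8601 strings in UTC, and the helper names (load_state, save_state, ttl_expired, feed_changed) are made up for this example. The network request itself is left out; the point is the two decisions the client makes before pulling a full feed.

```python
import json
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def load_state(path):
    """Read persisted per-feed state so it survives closing the client."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

def save_state(path, state):
    """Write per-feed state back to disk."""
    with open(path, "w") as f:
        json.dump(state, f)

def ttl_expired(last_request_iso, ttl_minutes, now):
    """True if the feed was never requested or its ttl has run out."""
    if last_request_iso is None:
        return True
    last = datetime.fromisoformat(last_request_iso)
    return now - last >= timedelta(minutes=ttl_minutes)

def feed_changed(last_modified_header, last_seen_iso):
    """Compare the server's Last-Modified header with our last known update.

    When either value is missing we cannot tell, so we assume the feed changed.
    """
    if last_modified_header is None or last_seen_iso is None:
        return True
    server_time = parsedate_to_datetime(last_modified_header)
    return server_time > datetime.fromisoformat(last_seen_iso)
```

The client would call ttl_expired first and make no request at all while the ttl is still running, then ask for headers only and fetch the full feed just when feed_changed says so.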
The more I think about it the more I like the ttl idea, where if you're a techy and have your own server and know how to control the ttl you can, but if you're using a service it sets it for you based on your previous behavior (and lets you specify in preferences). The longest you'd have to wait if you change the ttl is until the end of your current ttl. Is it unrealistic to think that the client designers could respect the ttl?
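To make the service side concrete, here is one way a hosted service could pick a ttl from a user's posting history. This is purely my guess at a heuristic: poll at half the median gap between posts, clamped between a floor and a ceiling, with the user's preference always winning. The function name, the 15-minute floor, and the one-day ceiling are assumptions for illustration, not something any service is known to do.

```python
from datetime import datetime
from statistics import median

def suggested_ttl_minutes(post_times, user_override=None, floor=15, ceiling=1440):
    """Suggest a ttl in minutes from past post times (hypothetical heuristic)."""
    # A ttl set in the user's preferences always takes precedence.
    if user_override is not None:
        return user_override
    times = sorted(post_times)
    # With fewer than two posts there is no trend, so use the longest ttl.
    if len(times) < 2:
        return ceiling
    gaps = [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
    # Half the typical gap between posts, kept within [floor, ceiling].
    ttl = int(median(gaps) / 2)
    return max(floor, min(ceiling, ttl))
```

Someone who posts every four hours would get a ttl of two hours, while a feed that updates every few minutes bottoms out at the floor so clients are never told to hammer the server.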
Posted by mike at July 30, 2004 1:32 AM