The module must be usable on devices with limited memory and disk space. Disk space policy is discussed elsewhere, this document discusses memory use.
The module will be optimised for typical use. We assume that by typical use, a subscriber will subscribe to less than 1000 feeds, and that each feed typically contains less than 200 entries.
It is also optimized for the typical use of reading a feed and its entries one at a time before moving on to the next feed and reading its entries. But not often going back and reread a feed where there are no new entries.
Although this is the typical use the module is optimized for it should still work with other use patterns, but will not be as fast.
Feeds and its entries are always kept and deleted simultaniously in memory.
Feeds are constructed in memory when parsing, so all feeds currently being parsed have to be kept in memory. It should be possible to load several feeds simultaniously. The number of feeds being loaded simultaniously should be tweakable. For deviced with limlited bandwidth and memory two or three simultaniously loads could be enough, for a desktop much more feeds may be loaded simultaniously.
Other modules (including javascript) may keep pointers to feeds and entries. The feeds should be kept in memory as long as other modules keep references to them.
After one feed has been parsed, and particulary after its entry list has been displayed, its entries are very likely to be needed shortly (user want to read entries from the feed). It should therefore be kept in memory for some time after being parsed. When updating all feeds it's much harder to know which feed is needed next.
One possible solution would be to always keep a limited number of feeds in memory after being parsed, and particulary after being displayed. A low number probably should suffice, particulary on devices with limited memory. Possibly as low as just one. Should be a tweakable number.
If a feed is not in memory when it is requested/needed it must be retrieved from disk before it can be used. From the data storage format we've chosen we keep one file for each feed. Since feeds are predicted to remain fairly small reading the whole feed from disk and creating a feed (and entry) object from it should be fast enough to do synchronously. As a result it should be totally transparent from the outside if the entry is in memory or has to be fetched from disk storage.
In addition to classes for feeds and entries there must be a general store class. This should record which feeds we are subscribed to, store all feeds which are in memory and be able to fetch feeds not in memory.
The store must be able to fetch feeds from their internal id if they are either in memory or in disk storage. Since it would be fairly common operation to fetch a feed by id it should be reasonably fast for a list of up to 1000 feeds.
It should be able to generate a list of subscribed feeds, sorted in some logical way (alphabetically by feed title?). Expected to be done regulary, but not very often.
Should be able to subscribe to new feeds and stop subscribing to existing feeds. Should be needed rarely and need not be particularly fast.
The easiest model would be a list of feeds. This would however require a linear search each time a feed is requested. Lists are fast at insertion and deletion which is expected to be a rare occurance. If sorted in same order as displayed it will be very fast to display sorted list. Probably only useful if totally number of feeds is low enough for the linear search not to be a factor.
An array sorted by id should be fast to look up feeds in (if doing binary search). It will however need to be sorted in a separate array before display, and insertions and deletions will be fairly slow.
A hash will be very fast at looking up by id, and fast at insertions and deletions. As with arrays it will have to be sorted in a separate array before display. It also has the most general overhead, so there has to be quite a few feeds for its benefits to show.
It's possible to use several in combination, e.g. a hash for fast lookup and an array for fast display. Both will then have to be updated simultaniously and both will have to be kept in memory. The overhead is probably not worth it except for on deviced with excessive memory.