Student projects/Feed aggregation library


From MoodleDocs



Warning: This page is no longer in use. The information contained on the page should NOT be seen as relevant or reliable.

Note: This page outlines ideas for the Feed aggregation library project. It's a specification under construction! If you have any comments or suggestions, please add them to the page comments.

This is a draft spec written as part of the Google Summer of Code submission of Chris Zubak-Skees (chriszs [at] gmail.com). It is preliminary and partial. The spec is based on the "Consuming RSS feeds" idea listed on Student projects. I welcome any and all feedback.

RSS/Atom feeds are becoming an important technology on the web, so it's crucial that Moodle has good support for consuming these feeds for a variety of uses. At present we generate RSS feeds for use by other applications, but consume feeds only in the RSS feeds block. It would be useful to have a core library that takes care of aggregating feeds (and the issues around it) and provides them in a simple format for plugins and other core parts of Moodle to use.

This project will involve creating a feed aggregation library which:

As a proof of concept, it will be necessary to refactor the RSS block to use it.

To test this library, it will be necessary to develop a large test corpus of valid and mildly invalid RSS feeds, drawn from popular websites and content management tools or constructed by the tester.

| Term | Definition |
|------|------------|
| RSS feed | A list of links in a machine-readable format. Often used to syndicate itemized and chronological content, such as blog posts. RSS refers to a particular technology, but we use it interchangeably with Atom here. |
| Atom | Another feed specification with characteristics similar to RSS feeds. |

Note: Database specifics are preliminary.

Stores a list of requested feed URLs and some associated information.

| Field | Type | Default | Info |
|-------|------|---------|------|
| id | int(10) | | autoincrementing |
| feedurl | varchar(255) | | The URL at which to fetch the feed |
| normalizedfeedurl | varchar(255) | | A URL stripped of some specifics, used to match against requested URLs (see get_feed()) |
| timefetched | int(10) | | The time this feed was last fetched. Used for caching and potential pruning |
| feedtitle | varchar(255) | | The name of this feed as retrieved when last fetched |
| siteurl | varchar(255) | | The URL of the site attached to this feed as retrieved when last fetched |
| feeddescription | text | | The description of the feed as retrieved when last fetched |

Stores a list of requested feed items.

| Field | Type | Default | Info |
|-------|------|---------|------|
| id | int(10) | | autoincrementing |
| feedurlsid | int(10) | | Ties the item to the feed |
| itemurl | varchar(255) | | The URL as retrieved in the item |
| itemtime | int(10) | | The time this item indicates or when it was first fetched (requires keeping track of individual feed items) |
| itemtitle | varchar(255) | | The name of this item as retrieved when last fetched |
| itemguid | varchar(255) | | The unique id of the element as retrieved when last fetched |
| itemdescription | text | | The description of the item as retrieved when last fetched |
| itemposition | int(10) | | The position of the item in the RSS feed channel's list |
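
To make the two tables above concrete, here is a rough sketch of them as an SQLite schema driven from Python. It is only an illustration: Moodle itself is written in PHP and defines tables through its own tooling, and the table names `feed_urls` and `feed_items` and the helper functions are assumptions, since this spec does not name the tables.

```python
import sqlite3

# Hypothetical table names; the spec lists the columns but not the table names.
SCHEMA = """
CREATE TABLE feed_urls (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    feedurl VARCHAR(255),
    normalizedfeedurl VARCHAR(255),
    timefetched INTEGER,
    feedtitle VARCHAR(255),
    siteurl VARCHAR(255),
    feeddescription TEXT
);
CREATE TABLE feed_items (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    feedurlsid INTEGER REFERENCES feed_urls(id),
    itemurl VARCHAR(255),
    itemtime INTEGER,
    itemtitle VARCHAR(255),
    itemguid VARCHAR(255),
    itemdescription TEXT,
    itemposition INTEGER
);
"""


def create_schema(conn):
    """Create both tables on the given SQLite connection."""
    conn.executescript(SCHEMA)


def item_titles_for_feed(conn, feed_id):
    """Return the item titles of one feed, in channel order (itemposition)."""
    rows = conn.execute(
        "SELECT itemtitle FROM feed_items "
        "WHERE feedurlsid = ? ORDER BY itemposition",
        (feed_id,),
    )
    return [row[0] for row in rows]
```

Storing `itemposition` lets a consumer reproduce the order in which the feed listed its items even when items lack dates.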

Note: API specifics are preliminary.

find_feed() returns a data structure with a list of RSS feeds found at the given URL. If the page itself is an RSS feed, it returns just that feed. If the page is HTML, it attempts to auto-discover RSS feeds declared in <link> elements in the page header or linked in the body.
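
The HTML auto-discovery step could look roughly like the following Python sketch, which scans a page for `<link rel="alternate">` elements that advertise an RSS or Atom feed. The names `FeedLinkParser` and `discover_feeds` are hypothetical, the returned list of dictionaries is an assumed data structure, and a real implementation in Moodle would be written in PHP. Fetching the page and scanning links in the body are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types conventionally used to advertise feeds in <link> elements.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects feeds advertised through <link rel="alternate"> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        mime_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and mime_type in FEED_TYPES and href:
            self.feeds.append({
                # Resolve relative links against the page URL.
                "url": urljoin(self.base_url, href),
                "title": attributes.get("title") or "",
            })


def discover_feeds(html, base_url):
    """Return a list of {"url", "title"} dicts for feeds declared in the page."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

A caller would pass the fetched page body together with the URL it was fetched from, so that relative feed links resolve correctly.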

A further possibility (and one that may be beyond the immediate scope) is to use some sort of search mechanism (such as Google Blog Search) to retrieve a list of RSS feeds that match a given search term. This might be fragile, because it would depend on the data provider maintaining consistency.

get_feed() returns a data structure representing an RSS feed. The URL should be normalized (e.g. http://www.example.com/feed.xml and http://example.com/feed.xml are probably the same), but not overly aggressively. Results should be transparently and centrally cached for subsequent calls within some time period. The data structure should be consistent, abstracting away details of the feed organization where possible.
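
As an illustration of the normalization and caching described above, here is a small Python sketch. The exact normalization rules (lowercasing scheme and host, dropping a leading "www.", default ports and fragments), the in-memory cache, the one-hour default lifetime, and the names `normalize_feed_url` and `FeedCache` are all assumptions made for the example. The spec itself stores `normalizedfeedurl` and `timefetched` in the database.

```python
import time
from urllib.parse import urlsplit, urlunsplit


def normalize_feed_url(url):
    """Conservatively normalize a feed URL for matching (assumed rules)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    port = parts.port
    default_port = (scheme == "http" and port == 80) or (scheme == "https" and port == 443)
    if port and not default_port:
        host = f"{host}:{port}"
    # Path and query are kept as-is; the fragment is dropped.
    return urlunsplit((scheme, host, parts.path or "/", parts.query, ""))


class FeedCache:
    """Caches fetched feeds by normalized URL for a fixed time period."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.time):
        self._fetch = fetch          # callable that retrieves and parses a feed
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # normalized URL -> (timefetched, feed)

    def get_feed(self, url):
        key = normalize_feed_url(url)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # still fresh, no network access
        feed = self._fetch(url)
        self._entries[key] = (now, feed)
        return feed
```

Keeping the path and query untouched is one way to avoid normalizing "overly aggressively": two URLs that differ only in host spelling share a cache entry, while URLs pointing at different resources do not.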

Returns a data structure representing multiple RSS feeds with items merged into one stream based on either provided or fetched date-time information and provided order. Should use get_feed() internally.
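
One possible reading of this merge, sketched in Python: items are ordered newest first by their time, and ties are broken by the order in which the feeds were provided and then by each item's position in its feed. The function name `merge_feeds`, the dictionary-based feed structure, and the newest-first ordering are assumptions; the spec does not fix them.

```python
def merge_feeds(feeds):
    """Merge the items of several feeds into one stream.

    Each feed is assumed to be a dict with an "items" list, and each item a
    dict with at least "itemtime" (Unix timestamp) and "itemposition".
    """
    keyed = []
    for feed_order, feed in enumerate(feeds):
        for item in feed["items"]:
            keyed.append((item["itemtime"], feed_order, item["itemposition"], item))
    # Newest first; ties fall back to provided feed order, then item position.
    keyed.sort(key=lambda entry: (-entry[0], entry[1], entry[2]))
    return [entry[3] for entry in keyed]
```

In a full implementation each input feed would come from get_feed(), so that merged streams benefit from the same caching.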

The RSS block needs to be refactored to use the new public API. If possible, parts of the existing database structure should be maintained for backwards compatibility. The block will need to call find_feed() when a user adds an RSS feed and get_feed() on every subsequent display. It may need a minimal additional or changed preferences UI to accommodate the result of find_feed().

Ideas for the future

See also
