Dave Johnson on open web technologies, social software and Java
This is the sixth in my series of Web Integration Patterns. Check out the intro at this URL http://rollerweblogger.org/roller/entry/web_integration_patterns
This pattern is about integrating web sites and applications by using standard feed formats to convey timely information, updates, status messages, events and other things from one web application to another.
A feed is a list of entries, each with a timestamp, ID, title, content and metadata like categories and tags. Entries are arranged in reverse chronological order. The entries in a feed can represent just about anything, from blog entries, Flickr photos and YouTube videos to source-code change sets or tasks in a change management system. A feed is an XML resource that is available at a URL. If you want updates, you poll that URL, ideally using HTTP Conditional GET so that you only pull down the feed when it has been updated.
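To make the polling idea concrete, here is a minimal Python sketch of HTTP Conditional GET using only the standard library. The function names, the state dictionary and the injectable opener parameter are my own assumptions for illustration, not part of any feed toolkit.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the request headers for an HTTP Conditional GET."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def poll_feed(url, state, opener=urllib.request.urlopen):
    """Fetch the feed only if it changed since the last poll.

    `state` is a plain dict (hypothetical) that remembers the validators
    between polls. Returns the feed bytes, or None on 304 Not Modified.
    """
    request = urllib.request.Request(
        url,
        headers=conditional_headers(state.get("etag"), state.get("last_modified")),
    )
    try:
        with opener(request) as response:
            # Remember the validators so the next poll can be conditional.
            state["etag"] = response.headers.get("ETag")
            state["last_modified"] = response.headers.get("Last-Modified")
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: nothing new to download
            return None
        raise
```

You would call poll_feed on a timer with the same state dictionary each time. A None result means the feed has not changed and no feed body was downloaded.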
Generally speaking, there are two standard feed formats in use on the web: RSS and Atom. Both are based on XML, but they use different element names. For example, what Atom calls "entries" RSS calls "items." Because these standard formats are so widely supported, providing a feed is an effective way to share updates from your web site or application.
Elements of Atom (from my 2006 presentation on Atom)
Another flavor of feeds is ActivityStrea.ms, which is essentially a feed format with a schema for representing about 70 different types of activities. These activities can be social network activities like share or friend and they can also be business activities like assign, resolve or schedule. One advantage of using the ActivityStrea.ms standard is that it has both an Atom and a JSON mapping.
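As a rough illustration of the JSON mapping, here is a small Python sketch that builds one activity and serializes it. The property names (published, actor, verb, object, objectType, displayName) reflect my understanding of the ActivityStrea.ms 1.0 JSON format, and the helper name and all of the values are made up for the example.

```python
import json


def make_activity(actor_name, verb, object_type, object_name, published):
    """Build a single activity as a plain dict (hypothetical helper)."""
    return {
        "published": published,  # when the activity happened
        "actor": {"objectType": "person", "displayName": actor_name},
        "verb": verb,  # e.g. a business verb such as "resolve"
        "object": {"objectType": object_type, "displayName": object_name},
    }


if __name__ == "__main__":
    # All values below are invented sample data.
    activity = make_activity(
        "Martin Smith", "resolve", "issue",
        "Login page times out", "2013-02-10T15:04:55Z",
    )
    print(json.dumps(activity, indent=2))
```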
Feed-based Integration is listed as a basic pattern because it can be very easy to implement. The ability to produce feeds is built in to many different types of web applications, from blogs and wikis to continuous integration servers. If you are writing your own web application, you can produce your feeds with XML tools, a templating engine or a dedicated feed toolkit like ROME. You’ll find plenty of XML tools and templating engines no matter what language you are using.
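Here is a minimal sketch of the "XML tools" option, using Python's built-in ElementTree to produce a small Atom feed. The function name and the shape of the entries argument are assumptions for illustration. A real application would more likely use a feed toolkit and would also add links and other metadata.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_id, title, author, updated, entries):
    """Return an Atom feed document as a string.

    `entries` is a list of dicts with id, title, updated and content keys,
    already sorted newest first (a hypothetical input shape).
    """
    ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace

    def child(parent, name, text=None, **attrs):
        element = ET.SubElement(parent, f"{{{ATOM_NS}}}{name}", attrs)
        element.text = text
        return element

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    child(feed, "id", feed_id)
    child(feed, "title", title)
    child(feed, "updated", updated)
    child(child(feed, "author"), "name", author)
    for item in entries:
        entry = child(feed, "entry")
        child(entry, "id", item["id"])
        child(entry, "title", item["title"])
        child(entry, "updated", item["updated"])
        child(entry, "content", item["content"], type="text")
    # ElementTree escapes characters such as & and < in text for us.
    return ET.tostring(feed, encoding="unicode")
```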
Which type of feed should you produce? That depends. Atom is the most complete specification and is a true IETF standard, so it is often the right choice. To make the right decision, you have to consider who is going to be consuming your feeds. If your consumers prefer RSS, then give them that. If your consumers prefer JSON over XML, then consider the JSON flavor of ActivityStrea.ms.
You can use a wide variety of tools to parse and process feed data. For example, there are many web sites and services that can digest feeds and trigger other events and processing. Services such as Yahoo Pipes and If This Then That can read feeds, process each item and perform other actions based on item values.
Processing a feed with Yahoo Pipes (from Marsh Gardiner's post)
If you need to add RSS/Atom reading features to your own software, you can use standard XML parsing tools and, for most languages, you’ll find open source libraries specifically designed for parsing feeds.
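As a sketch of the "standard XML parsing tools" route, the Python function below reads either an Atom or an RSS 2.0 document and maps Atom entries and RSS items onto one common shape. The function name and the output keys are my own choices, and it only handles the handful of elements shown. A dedicated feed library copes with far more real-world variation.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def parse_feed(xml_text):
    """Return a list of dicts with id, title, updated and link keys."""
    root = ET.fromstring(xml_text)
    entries = []
    if root.tag == ATOM + "feed":
        # Atom: entries are direct children of <feed>, links are attributes.
        for entry in root.findall(ATOM + "entry"):
            link = entry.find(ATOM + "link")
            entries.append({
                "id": entry.findtext(ATOM + "id"),
                "title": entry.findtext(ATOM + "title"),
                "updated": entry.findtext(ATOM + "updated"),
                "link": link.get("href") if link is not None else None,
            })
    elif root.tag == "rss":
        # RSS 2.0: items live under <channel>, and names differ from Atom.
        for item in root.findall("channel/item"):
            entries.append({
                "id": item.findtext("guid"),
                "title": item.findtext("title"),
                "updated": item.findtext("pubDate"),
                "link": item.findtext("link"),
            })
    else:
        raise ValueError(f"unrecognized feed root element: {root.tag}")
    return entries
```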
Feeds are a great way to do simple integrations, but there are limitations, and at times you’ll need to go beyond basic RSS and Atom feeds. Here's an example. Normally, clients have to poll the feed URL repeatedly for updates. This is annoying and inefficient, even with HTTP Conditional GET. To address this problem, you can set up a PubSubHubbub hub that will subscribe to feeds and then notify other subscribers as soon as updates are available, so that those subscribers don’t have to poll.
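To give a feel for the subscriber side, here is a rough Python sketch of a PubSubHubbub subscription request and of the answer to the hub's verification callback. It only builds the request and does not send it. The function names and URLs are hypothetical, and the sketch covers just the hub.mode, hub.topic, hub.callback and hub.challenge parameters as I understand them. Depending on the version of the spec your hub implements, other parameters may be required.

```python
import urllib.request
from urllib.parse import urlencode


def build_subscribe_request(hub_url, topic_url, callback_url):
    """Build (but do not send) a form-encoded subscription POST to the hub."""
    form = urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed we want updates for
        "hub.callback": callback_url,  # where the hub should notify us
    }).encode("ascii")
    return urllib.request.Request(
        hub_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def answer_verification(query, expected_topic):
    """Handle the hub's verification GET to our callback.

    `query` is a dict of the query parameters the hub sent. Returns the
    challenge to echo in the response body, or None to reject the request.
    """
    if query.get("hub.mode") == "subscribe" and query.get("hub.topic") == expected_topic:
        return query.get("hub.challenge")
    return None
```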
Another problem is that, if you don’t poll often enough, you might miss some updates because they “scroll” off the bottom of the feed before you see them. Feed providers can address this problem by supporting Feed Paging and Archiving (RFC 5005), which allows clients to use next and previous links to “page” back to feed items that are no longer in the first page of the feed.
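On the client side, paging back through a feed can be sketched in Python as below. This assumes an Atom feed whose pages carry a link with rel="next" pointing at the page of older entries. The function name, the fetch callable (anything that returns the XML text for a URL) and the max_pages safety limit are my own assumptions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "{http://www.w3.org/2005/Atom}"


def iter_entries(start_url, fetch, max_pages=50):
    """Yield Atom entry elements from every page, following rel="next" links."""
    url = start_url
    seen = set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)  # guards against link loops
        root = ET.fromstring(fetch(url))
        yield from root.findall(ATOM + "entry")
        # Find the link to the next page, if the provider supplies one.
        next_href = next(
            (link.get("href") for link in root.findall(ATOM + "link")
             if link.get("rel") == "next"),
            None,
        )
        # Resolve relative links against the page they appeared in.
        url = urljoin(url, next_href) if next_href else None
```

A client that has been offline for a while could keep consuming entries from this generator until it reaches one it has already seen, then stop.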
One more beyond-the-basics item to mention is the related pattern Web APIs, which we'll cover later. Web APIs are related to Feed-based Integration because feeds have been used as the basis for several "Web APIs" or protocols. These protocols specify how to use HTTP POST, GET, PUT and DELETE to create, retrieve, update and delete web resources that are represented as feed entries. Examples are the IETF's Atom Publishing Protocol, Microsoft's OData and Google's GData APIs.
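The verb-to-operation mapping can be sketched in Python like this, loosely following the Atom Publishing Protocol style: POST an entry to a collection URL to create it, then GET, PUT or DELETE the entry's own URL. The helper names and URLs are hypothetical, and the code only builds request objects. It does not send them or handle authentication.

```python
import urllib.request

# Media type for a single Atom entry document.
ENTRY_TYPE = "application/atom+xml;type=entry"


def create_entry(collection_url, entry_xml):
    """Create: POST the entry document (bytes) to the collection URL."""
    return urllib.request.Request(
        collection_url, data=entry_xml,
        headers={"Content-Type": ENTRY_TYPE}, method="POST",
    )


def retrieve_entry(entry_url):
    """Retrieve: GET the entry's URL."""
    return urllib.request.Request(entry_url, method="GET")


def update_entry(entry_url, entry_xml):
    """Update: PUT the revised entry document (bytes) to the entry's URL."""
    return urllib.request.Request(
        entry_url, data=entry_xml,
        headers={"Content-Type": ENTRY_TYPE}, method="PUT",
    )


def delete_entry(entry_url):
    """Delete: DELETE the entry's URL."""
    return urllib.request.Request(entry_url, method="DELETE")
```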
That’s it for Feed-based Integration. In my next posts, we'll move on to the Advanced Patterns.