Blogging Roller: How Atom Publishing Protocol works

Unlike the other RSS and Atom books that are hitting the shelves these days, RSS and Atom in Action is going to cover the Atom Publishing Protocol, so I'm following the protocol's development very closely. This blog entry is a summary of the Atom protocol as it stands today (based on Draft 04, released May 10, 2005). It's a follow-up to my earlier post, Atom in a nutshell, which explained Atom API 0.9.

Atom Publishing Protocol (a work in progress) is a new web services protocol for interacting with a blog, wiki or other type of content management system by simply sending XML over HTTP, no SOAP or XML-RPC required. Your code interacts with collections of web resources by using HTTP verbs GET, POST, PUT and DELETE as they are meant to be used. Here's how it works.

Discovering your workspaces and collections

To start out, you do an authenticated GET on a server's Atom URL to get a description of the services available to you. You get back an Atom Services XML document that lists the workspaces and, within each, the collections available. A workspace could be a blog, a wiki namespace or some other content collection that you have access to via your username/password.

Each workspace can contain two types of collections: entries and resources. Eventually, the spec will probably allow for (at least) five types of collections: entries, categories, templates, users, and generic resources.

Here is an example of a services document XML for a blog user with access to two blogs "My Blog" and "Marketing Team Blog":

<?xml version="1.0" encoding='utf-8'?>
<service xmlns="http://purl.org/atom/app#">
     <workspace title="My Blog" > 
        <collection contents="entries" title="Blog Entries" 
            href="http://localhost:8080/roller/atom/myblog/entries" />
        <collection contents="generic" title="File Uploads" 
            href="http://localhost:8080/roller/atom/myblog/resources" />
     </workspace>
     <workspace title="Marketing Team Blog">
         <collection contents="entries" title="Blog Entries" 
             href="http://localhost:8080/roller/atom/marketingblog/entries" />
         <collection contents="generic" title="File Uploads" 
             href="http://localhost:8080/roller/atom/marketingblog/resources" />
     </workspace>
</service>
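
To make the discovery step concrete, here is a minimal Python sketch that parses a services document and lists each workspace's collections. It assumes the Draft 04 style element and attribute names shown in the example above; the function name `list_collections` and the `SAMPLE` string are made up for illustration, and fetching the document over HTTP (with authentication) is left out.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the services document example above.
APP_NS = "http://purl.org/atom/app#"


def list_collections(service_xml):
    """Return (workspace title, collection title, contents, href) tuples."""
    root = ET.fromstring(service_xml)
    found = []
    for workspace in root.findall(f"{{{APP_NS}}}workspace"):
        for coll in workspace.findall(f"{{{APP_NS}}}collection"):
            found.append((
                workspace.get("title"),
                coll.get("title"),
                coll.get("contents"),
                coll.get("href"),
            ))
    return found


# Hypothetical sample: the "My Blog" workspace from the example above.
SAMPLE = """<service xmlns="http://purl.org/atom/app#">
  <workspace title="My Blog">
    <collection contents="entries" title="Blog Entries"
        href="http://localhost:8080/roller/atom/myblog/entries" />
    <collection contents="generic" title="File Uploads"
        href="http://localhost:8080/roller/atom/myblog/resources" />
  </workspace>
</service>"""

for row in list_collections(SAMPLE):
    print(row)
```

A client would typically show the workspace titles to the user and remember the `href` of the entries collection for later requests.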

Working with collections

So, a workspace is a blog and a blog contains collections of things like entries, uploaded file resources, categories, etc. All of these collections are handled the same way and respond the same way to the HTTP verbs. All support paging, so the server can return a collection in easy-to-digest chunks. All support query by date, so you can filter a collection by start and end date.

To get a collection, do a GET on its URI (that's the collection's 'href' as listed in the services document).

The services document specifies a URI for each collection. If you do a GET on a collection URI, you'll get back an Atom Collection XML document that lists the first X members in the collection. If there are more than X members, then the document will include a next URI which you can use to get the next batch of members. You can also specify an HTTP Range header to restrict the response to only those members that fall between specific start and end dates.

Here is an example of the collection document XML you might receive by doing a GET on the Blog Entries collection from My Blog above.

<?xml version="1.0" encoding='utf-8'?>
<collection xmlns="http://purl.org/atom/app#"
   next="http://localhost:8080/roller/atom/myblog/entry/77088a1" >
   <member title="The Connells: Fun and Games"
         href="http://localhost:8080/roller/atom/myblog/entry/7700a0" 
         updated="2005-04-16T23:07:08-0400" />
   <member title="The Connells: Boylan Heights"
         href="http://localhost:8080/roller/atom/myblog/entry/7700c9" 
         updated="2005-04-15T23:06:09-0400" />
   <member title="The Connells: Gladiator"
         href="http://localhost:8080/roller/atom/myblog/entry/7700c8" 
         updated="2005-04-14T14:05:31-0400" />
</collection>
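
The paging behavior can be sketched in a few lines of Python: read the members on one page, then follow the `next` attribute until it is absent. This is only a sketch under the assumption that collection documents look like the example above. The `fetch` parameter is a hypothetical callable (URI in, XML out) so the loop can be tried against the in-memory `PAGES` dictionary instead of a live server.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the collection document example above.
APP_NS = "http://purl.org/atom/app#"


def iter_members(first_uri, fetch):
    """Yield (title, href, updated) for every member, following 'next' URIs.

    fetch is a hypothetical callable: collection URI in, document XML out.
    """
    uri = first_uri
    seen = set()  # guard against a server that loops back to an earlier page
    while uri and uri not in seen:
        seen.add(uri)
        root = ET.fromstring(fetch(uri))
        for member in root.findall(f"{{{APP_NS}}}member"):
            yield member.get("title"), member.get("href"), member.get("updated")
        uri = root.get("next")  # None when there are no more pages


# Two made-up pages standing in for server responses.
FIRST = "http://localhost:8080/roller/atom/myblog/entries"
NEXT = "http://localhost:8080/roller/atom/myblog/entry/77088a1"
PAGES = {
    FIRST: f'<collection xmlns="{APP_NS}" next="{NEXT}">'
           '<member title="The Connells: Fun and Games"'
           ' href="http://localhost:8080/roller/atom/myblog/entry/7700a0"'
           ' updated="2005-04-16T23:07:08-0400" /></collection>',
    NEXT: f'<collection xmlns="{APP_NS}">'
          '<member title="The Connells: Gladiator"'
          ' href="http://localhost:8080/roller/atom/myblog/entry/7700c8"'
          ' updated="2005-04-14T14:05:31-0400" /></collection>',
}

titles = [title for title, href, updated in iter_members(FIRST, PAGES.__getitem__)]
print(titles)
```

Against a real server, `fetch` would do an authenticated GET on the URI and return the response body.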

Each member in a collection document has a title and a URI. To get a member of a collection, you do a GET on the member's URI. To delete it you use DELETE. To update it you use PUT. Here's an example of the entry XML you might receive by doing a GET on the first member of the Blog Entries collection example above:

<entry xmlns="http://purl.org/atom/ns#">
  <title>The Connells: Fun and Games</title>
  <id>7700a0</id>
  <updated>2005-06-01T19:07:45Z</updated>
  <content type="html">
     Let me tear down into your heart
     Let me take a seat and stay awhile
     Let me have a half of your whole  
     Let me keep it for myself awhile  
  </content>
</entry>

To add a member to a collection, you simply POST the member to the collection's URI. If you're POSTing a new entry, send the XML for the entry (example entry XML shown above). If you're POSTing a file upload, then send the file.
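
The mapping from verbs to operations can be sketched with Python's standard `urllib.request`. The helper names `build_request` and `send` are made up for illustration, and the `application/atom+xml` content type is an assumption rather than something taken from the draft. Authentication headers are omitted, and nothing is sent unless you call `send` against a running server.

```python
import urllib.request

# Assumed media type for entry XML; check the draft you are targeting.
ATOM_TYPE = "application/atom+xml"


def build_request(method, uri, body=None, content_type=ATOM_TYPE):
    """Build (but do not send) an HTTP request for a member or collection URI."""
    headers = {"Content-Type": content_type} if body is not None else {}
    return urllib.request.Request(uri, data=body, headers=headers, method=method)


def send(request):
    """Send a prepared request and return (status, body). Needs a live server."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read()


COLLECTION_URI = "http://localhost:8080/roller/atom/myblog/entries"
ENTRY_URI = "http://localhost:8080/roller/atom/myblog/entry/7700a0"
ENTRY_XML = (b'<entry xmlns="http://purl.org/atom/ns#">'
             b'<title>The Connells: Fun and Games</title>'
             b'<content type="html">Let me tear down into your heart</content>'
             b'</entry>')

requests = [
    build_request("GET", ENTRY_URI),              # read a member
    build_request("PUT", ENTRY_URI, ENTRY_XML),   # update a member
    build_request("DELETE", ENTRY_URI),           # delete a member
    build_request("POST", COLLECTION_URI, ENTRY_XML),  # add a member
]
for req in requests:
    print(req.get_method(), req.full_url)
```

Note that only POST targets the collection URI; GET, PUT and DELETE all act on the member's own URI.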

That's it. Pretty simple, huh?

What about authentication?

The Atom spec requires that a server support either HTTP digest authentication, which can be difficult for some bloggers to implement (depending on their ISP and web server), or CGI authentication, which can be a lot easier to implement (I believe WSSE qualifies as CGI authentication, and that's what my Atom implementation uses).
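
For readers who haven't seen WSSE, here is a minimal Python sketch of how a client might build the headers. It assumes the commonly described UsernameToken scheme, in which the password digest is Base64(SHA-1(nonce + created + password)) and is sent in an `X-WSSE` header. It is not taken from Draft 04 or from Roller's code, servers differ on details such as how the nonce is encoded, and the function name `wsse_headers` is made up for illustration.

```python
import base64
import hashlib
import os
from datetime import datetime, timezone


def wsse_headers(username, password, nonce=None, created=None):
    """Build WSSE UsernameToken headers (assumed scheme, see text above)."""
    # A fresh random nonce and timestamp per request, unless supplied.
    if nonce is None:
        nonce = os.urandom(16)
    if created is None:
        created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    # Digest over raw nonce bytes + timestamp + password, then Base64.
    raw = nonce + created.encode("utf-8") + password.encode("utf-8")
    digest = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    nonce_b64 = base64.b64encode(nonce).decode("ascii")

    token = ('UsernameToken Username="%s", PasswordDigest="%s", '
             'Nonce="%s", Created="%s"' % (username, digest, nonce_b64, created))
    return {
        "Authorization": 'WSSE profile="UsernameToken"',
        "X-WSSE": token,
    }


# Hypothetical credentials, for illustration only.
headers = wsse_headers("dave", "secret")
print(headers["X-WSSE"])
```

The appeal for bloggers on shared hosting is that the server side only needs to read a request header and recompute the digest, which a plain CGI script can do without web server configuration.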

What about devices with limited HTTP support?

Some devices have crippled HTTP client capabilities. For example, some Java J2ME powered cell phones can't do an HTTP PUT or DELETE. If you can't do a PUT or a DELETE, you can use POST instead with a SOAP wrapper that specifies a Web-Method of PUT or DELETE. Atom servers are required to support SOAP POSTs and to return results in SOAP envelopes.

What about draft vs. published states for entries?

Still undecided. Some folks suggest using different collections for entries in different states (draft, approved, published, etc.). But it's more likely that a new element will be introduced in the Atom format to specify the state of an entry.

Summary

That's my summary of Atom protocol as it stands today. I think it's a great improvement over existing blog APIs in terms of features and design. And it's not very difficult to implement. I know because I'm almost done with my server and client implementations. I hope to release them shortly for review. The release will probably be a standalone release of Roller (the Atom server) and my BlogClient UI (the Atom client).

If you see some opportunities for improvement in the protocol, please join the Atom Protocol mailing list and help out. Last call for spec changes is now slated for October. And for the Atom experts out there: what did I get wrong? Leave a comment.

Dave Johnson in Blogging • 🕒 07:00AM Jun 03, 2005

Tags: app atom blogapps