Posts tagged 'atom'



Atom Publishing Protocol, draft #10

APP draft #10 is available. I'm still reading it over, but the major changes appear to be:
  • Categories can be specified at the workspace and collection level. Multiple category schemes are allowed and both fixed and free-form categories (e.g. tags) are allowed.
  • Collection titles are now specified by an <atom:title> instead of an attribute on the <collection> element.
  • A new "slug" header has been added for media posts so that clients can specify the file-name to be used for the uploaded file.
I'm especially happy about the category support -- now Atom protocol can do everything that MetaWeblog API can do, and much more. I'll be updating my client and server implementations during the next week.
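To make the new slug header concrete, here is a minimal Python sketch of a media post. It only builds the request and does not send it. The collection URI, file name and placeholder bytes are assumptions for illustration, not values from the draft.

```python
from urllib.request import Request

def build_media_post(collection_uri, data, content_type, slug):
    """Build (but do not send) a POST that uploads a file to a collection."""
    return Request(
        collection_uri,
        data=data,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Slug": slug,  # the client's suggested file name for the upload
        },
    )

req = build_media_post(
    "http://localhost:8080/roller/atom/myblog/resources",  # hypothetical URI
    b"\x89PNG placeholder bytes",                           # stand-in for real image data
    "image/png",
    "beach-photo.png",                                      # hypothetical file name
)
print(req.get_method(), req.full_url)
print("Slug:", req.get_header("Slug"))
```

The server is free to treat the slug as a hint, so a client should read the location of the created resource from the response rather than assume the name was used.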

Tri-XML 2006 presentation


Here's the abstract of the talk I gave this morning at Tri-XML 2006:
Beyond blogging: Atom format and protocol. Like XML-RPC and SOAP before, feeds and publishing protocols were born in the blogosphere and quickly moved beyond blogging. Nowadays, web service providers are using RSS/Atom feeds and REST-based publishing protocols as lightweight alternatives to SOAP. And developers are finding new ways to combine web services from different sites into new applications, known as "mash-ups" in the lingo of Web 2.0. If you'd like to do the same, then attend this talk to learn about the new IETF Atom feed format (RFC-4287) and the soon-to-be-finalized Atom protocol, which together form a strong foundation for REST-based web services development.
Here's a rough outline of the talk:
  • Introduction
    • Beyond blogging
    • Blogs hit the big time
    • The web is bloggy
    • Atom as an alternative to WS-*
  • Understanding feeds
    • Birth of RSS
    • RSS 1.0: the RDF fork
    • The simple fork and RSS 2.0
    • Atom: the standard
  • Parsing feeds
    • Fetching and parsing feeds
    • Universal Feed Parser
    • ROME utilities
    • Windows RSS platform
  • Serving feeds
    • Approaches for generating and serving feeds
    • Feed autodiscovery
    • Styled feeds
  • Atom protocol
    • Compared to MetaWeblog
    • REST based approach
    • Introspection
    • Collections
    • Extending Atom
  • Atom protocol in action
    • Getting a service doc
    • Getting collections
    • Posting an entry
    • Posting an image
  • Demo: interacting with an Atom server via command-line
And here are the slides: TriXML2006-BeyondBlogging.pdf

Tags: topic:[Atom Publishing Protocol], topic:[Atom], topic:[APP], topic:[RSS], topic:[feeds]

Last two chapters to production


Over the weekend, I put my finishing touches on the (last) two new chapters for RSS and Atom in Action. Tomorrow they'll both be off to copy-editing, typesetting and then to the printers for publication in mid-June.

I really lucked out in the reviewer category. Thanks to Walter VonKoch of Microsoft's Windows RSS Platform team, who not only answered my questions but kindly offered to review the Windows RSS chapter. And thanks also to former co-workers Pat Chanezon and Alejandro Abdelnur, who reviewed the ROME chapter.

By the way, Alejandro is back from Asia, blogging again and already coming up with cool new APIs for ROME. Check out ROME.Mano, a pipeline framework for RSS and Atom feeds.

Atom protocol and WADL


Via The Aquarium I see that Mark Hadley's work on Web Application Description Language (WADL) is now a Sun Technical Report. WADL provides a way to describe a REST based web application or service so that tools can discover services, generate proxies, etc. As I understand it, WADL is to REST as WSDL is to SOAP.

There's also something new since the last time I looked at WADL. Mark has added a section on the Atom protocol and examples that show how to use a WADL file to replace an Atom introspection document. Looks like good stuff to me. If you need an introspection doc for your REST based web service, why not use WADL?

Via Google, I found that there's also a WADL presentation on-line.

Tip'o'the hat to the FeedValidator and crew


I'd like to thank the folks who developed and run the FeedValidator, a valuable service that lets you know if your feeds validate against the RSS, Atom and commonly used extension specs. The warnings that it issues may be irritating and some can be safely ignored, but they're valuable just the same. I don't particularly like the warnings about <content:encoded> (which Roller now uses, by the way) and the style attribute, but I understand why they're necessary. If you want to whine about something, whine about the crappy RSS specs that we're all stuck with, not the folks that are trying to help you understand them.

Update: In Roller RSS 2.0 feeds, we now use <atom:summary> for entry.summary (which is new) and <description> for entry.text (as we always have).

Re: Experimenting with the MS Feeds API


I'm seeing lots of interest in my MS Feeds API post yesterday, sparked by links from Sam Ruby, Dave Winer and Randy Morin. Some people might have gotten the impression that I was criticizing the decisions Microsoft made in mapping RSS elements and extension elements to the Feeds API object model. I wasn't.

I think Microsoft made pretty good choices, given the simplified object model that they're working with. If somebody is using funky RSS, then they mean it. For example, if somebody declares the Content Module namespace and uses the <content:encoded> element in their feed, then that's probably the content that they want folks to use. I think that's the philosophy Microsoft used in making those decisions, except for preferring <pubDate> over <dc:date>, which I don't understand.

The problem is, the Feeds API object model is a little too simple. Like RSS 2.0, it doesn't model the common things that bloggers do, like having both a summary and content for each item, or having a name and/or e-mail address for each author. That's why people use extensions like <content:encoded> and <dc:creator> (or prefer Atom, which does a better job of modeling those common things). I hope Microsoft will fix this by improving the object model and if they do, they won't have to make as many choices about which elements to use.

Experimenting with the MS Feeds API


The Windows RSS platform includes a Feeds API that parses all forms of RSS and Atom to a simplified object model.

For example, an Item object has an Author property and not an author name, author e-mail and author URI which are all possible in Atom. And, an Item object has a Description field and not description and content (as in Wordpress feeds) or summary and content (as in Atom feeds).

So, how does the Feeds API decide how to map elements to this simplified object model? I did some C# experiments and here are some of my findings. Note that the Feeds API is beta software and will certainly change for the better (I hope) by the time it is released in IE7 and Windows Vista.

Item contains | Feeds API returns
--- | ---
<dc:creator>dave</dc:creator> | item.Author = "dave"
<author>dave@example.com</author> | item.Author = "dave@example.com"
<author>dave@example.com</author> <dc:creator>dave</dc:creator> | item.Author = "dave" (prefers funky RSS)
<description>my desc</description> <content:encoded>my content</content:encoded> | item.Description = "my content" (prefers funky RSS)
<pubDate>Thu, 9 Mar 2006 23:13:04 -0500</pubDate> | item.Date = "3/10/2006 4:13:04 AM" (uses GMT)
<pubDate>Thu, 9 Mar 2006 23:13:04 -0500</pubDate> <dc:date>2004-08-19T11:54:37-08:00</dc:date> | item.Date = "3/10/2006 4:13:04 AM" (prefers core RSS element)
<atom:summary>my summary</atom:summary> <atom:content>my content</atom:content> | item.Description = "my content"

First, it's interesting that those funky RSS elements that Winer dislikes are preferred over the core RSS elements in important places. And second, what if you're not happy with Microsoft's mapping choices in this area?

For example, how do you get both description and content from those Wordpress feeds? Wordpress (and Typepad) uses the <description> element as a summary and the funky <content:encoded> element for the full content (see Winer's own Wordpress.com feed for example). You've got to parse the XML yourself. The Feeds API tries to make that easy by providing both the XML for the entire feed and the XML fragment for each item, but I think most developers would prefer to have a more complete object model.
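Parsing the item XML yourself is not much work. Here is a minimal sketch, in Python rather than the C# I used for the experiments, that pulls both the summary and the full content out of one item fragment. It assumes you already have the fragment as a string; how you get it from the Feeds API is not shown, and the sample item is made up.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS Content Module, which defines <content:encoded>.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"

# Made-up item fragment in the Wordpress style: summary plus full content.
ITEM_XML = """
<item xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <title>Example post</title>
  <description>my desc</description>
  <content:encoded>my content</content:encoded>
</item>
""".strip()

def summary_and_content(item_xml):
    """Return (summary, content) from an RSS item; either may be None."""
    item = ET.fromstring(item_xml)
    summary = item.findtext("description")
    content = item.findtext("{%s}encoded" % CONTENT_NS)
    return summary, content

print(summary_and_content(ITEM_XML))  # ('my desc', 'my content')
```

This keeps both values instead of choosing one, which is exactly what the simplified object model cannot do.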

See also: What's up with the Windows RSS Platform

Tags: topic:[rss], topic:[atom], topic:[feeds], topic:[ie7], topic:[vista]

The never ending story of RSS and Atom in Action


You know last week, when I said the book was ready to go to the printers and would be available this week as an e-book? Well, I was wrong.

While we waited for Atom protocol to stabilize, things changed in the world of C# and Java feed APIs. Microsoft introduced the Windows RSS platform and a pre-release of the Windows Feeds API is available in the IE7 beta. And ROME has come a long way too; now with Atom format 1.0 support and a growing list of extension modules. We decided that we just couldn't publish a book on RSS and Atom without covering the Windows RSS platform and ROME in-depth. So now I'm under the gun again, writing away into the wee hours of the night. I should be done by April 14th and, with luck, the book will be out in late May, just in time for JavaOne. That explains my sudden interest in the Windows RSS platform.

The kids hate it, but I think it's for the best. Manning will have the very first book that covers the Atom protocol (with a working client and server), the Windows RSS platform and ROME in-depth. It'll definitely be worth the wait.

Tags: topic:[atom], topic:[rss], topic:[ie7], topic:[atom protocol]

What's up with the Windows RSS platform?


The Windows RSS Platform (or Feeds API) is the feed handling engine that powers the new RSS features in IE7. It will also be included in Windows Vista for use by other applications. Note that here, RSS is a generic term meant to include both RSS and Atom -- the Feeds API supports both. The Feeds API is packaged in a DLL called msfeeds.dll and available to programmers as a set of dual-interface COM objects. Here are the features exposed via the Feeds API.
  • Common feed list: list of feeds for current user, organized as folder hierarchy.
  • Feed store: local cache of feeds, feeds available via abstract object model
  • Download engine: for managing and monitoring large enclosure downloads
  • RSS sharing extensions: new XML elements to support bi-directional sync via RSS

The Feeds API gives you access to the current user's feed subscription list, a feed parser that can handle any form of RSS and Atom, as well as the IE7 podcast download engine. The parser parses feeds to an abstract object model designed to represent any sort of feed. It handles funky RSS and in some cases prefers the funky elements (e.g. <content:encoded> over <description>).

I'd like to learn more about how the Feeds API decides which elements to use, how sync works, and how the whole package compares to the premier Java Feeds API ROME. So, I've downloaded IE7 and started experimenting with the API from C#. I'll be posting more on this topic in the next week or two.

Here are some of the references I've been using to understand the API:

Feeds API docs, specs and whitepapers from Microsoft 

Microsoft employee blogs about the RSS platform

Other blogs about it

Update1: added a couple of new links suggested by Mark Woodman
Update2: added reference to Simple List Extensions
Update3: added link to RSS in Windows Vista presentation

Tags: topic:[atom], topic:[rss], topic:[ie7], topic:[atom protocol]

Pebble and Blojsom and Atom protocol


I've used code from the excellent Pebble and Blojsom blog servers in the past (and given credit in the Roller CREDITS file). I'd love to be able to contribute back and now there's an opportunity to do that. So to Simon and David (or anybody else hacking those servers), if you want to get Atom protocol working in your server, the easiest way might be for you to bring in some code from Roller. I specifically designed our Atom protocol implementation to allow for sharing and to be free of Roller dependencies.

For example, here's how you'd do it for Pebble:
  • Bring the classes from the package org.roller.presentation.atomapi into Pebble (except for RollerAtomHandler, you won't need that one).
  • You'll also need to bring in the ROME and JDOM jars if you're not already using them.
  • Implement the interface AtomHandler with calls to the Pebble backend, call it PebbleAtomHandler or something similar.
  • Change one line of code in the AtomServlet method createAtomRequestHandler() to create your new PebbleAtomHandler instead of the Roller one.
And feel free to pepper me with questions along the way. I'd be happy to help and happy to make changes to make this sharing easier. I'm also considering the idea of an Atom Server Kit package in my Blogapps project (on second thought, ROME might be a better home).

When you're done, head over to the #atom channel on irc.freenode.net so we can do some interop testing with MatisseBlogger and other Atom protocol clients.

Atom protocol, OpenSearch and Microformats

Joe Gregorio: APP, OpenSearch and Microformats. Get used to seeing them; those small pieces loosely joined are the future of web services.
Joe's talking about the new Lucene Web Services API, which is based on Atom protocol (APP), OpenSearch and Microformats. It's very cool to see the APP already applied outside of the realm of blogs.

Atom protocol draft 7


I was planning on submitting Chapter 8 of RSS and Atom in Action to Manning today, but Atom protocol draft 7 has appeared. The changes look good and the only really significant one for me is the move from list templates, which allowed indexing into a collection, to next/previous paging as we had in draft 4. I'm going to revise my implementation and Chapter 8, and turn the chapter in on Wednesday. Once that's done, I'll release Blogapps v0.1.

Atom Protocol draft 05

There's a new draft of the Atom Protocol available and I've already started working on updating my client (the BlogClient example from my upcoming book RSS and Atom in Action) and server (in the Roller sandbox) implementations. Surprisingly, the new spec doesn't look all that different from the previous one, so perhaps a weekend of work will do the trick.


The talk went well

My second JavaOne was a great experience, but it was a little stressful because up until last night I couldn't find any of my co-speakers. I spent most of Wednesday preparing to give the whole talk by myself, but luckily for me (and the attendees), Pat and Kevin showed up just in time. Unfortunately, Pat showed up with some very bad news for us at Sun: he's leaving to work at Google.

In the end, I think the talk went pretty well. Kevin did most of Pat and my slides on syndication because we had split the talk 50-50 when we couldn't locate Pat on Wednesday night (and assumed he was still in Paris). He did a good job with the material and added in some interesting points from his experience at Rojo.com where they parse millions of feeds per hour with the Java-based Apache Commons (sandbox) FeedParser.

We were a little disappointed with the turnout. I'd be surprised if the 700+ seat Yerba Buena theater was more than 30% full. The fact that we were in a lunchtime timeslot on the last day of the show certainly didn't help. Anyhow, I'm relieved that it's over and ready for a nice long week off.


How Atom Publishing Protocol works

Unlike the other RSS and Atom books that are hitting the shelves these days, RSS and Atom in Action is going to cover the Atom Publishing Protocol. So, I'm following the protocol development very closely. This blog entry is a summary of Atom as it stands today (based on Draft 04 released May 10, 2005). It's a follow-up to my earlier post Atom in a nutshell, which explained Atom API 0.9.

Atom Publishing Protocol (a work in progress) is a new web services protocol for interacting with a blog, wiki or other type of content management system by simply sending XML over HTTP, no SOAP or XML-RPC required. Your code interacts with collections of web resources by using HTTP verbs GET, POST, PUT and DELETE as they are meant to be used. Here's how it works.

Discovering your workspaces and collections

To start out, you do an authenticated GET on a server's Atom URL to get a description of the services available to you. You get back an Atom Services XML document that lists the workspaces and, within each, the collections available. A workspace could be a blog, a wiki namespace or content collection that you have access to via your username/password.

Each workspace can contain two types of collections: entries and resources. Eventually, the spec will probably allow for (at least) five types of collections: entries, categories, templates, users, and generic resources.

Here is an example of a services document XML for a blog user with access to two blogs "My Blog" and "Marketing Team Blog":

<?xml version="1.0" encoding='utf-8'?>
<service xmlns="http://purl.org/atom/app#">
     <workspace title="My Blog" > 
        <collection contents="entries" title="Blog Entries" 
            href="http://localhost:8080/roller/atom/myblog/entries" />
        <collection contents="generic" title="File Uploads" 
            href="http://localhost:8080/roller/atom/myblog/resources" />
     </workspace>
     <workspace title="Marketing Team Blog">
         <collection contents="entries" title="Blog Entries" 
             href="http://localhost:8080/roller/atom/marketingblog/entries" />
         <collection contents="generic" title="File Uploads" 
             href="http://localhost:8080/roller/atom/marketingblog/resources" />
     </workspace>
</service>
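
A client's first job is to turn that document into a list of collection URIs. Here is a minimal Python sketch that does so with the standard library. It uses a trimmed copy of the example document and skips the authenticated GET; the function name is my own invention.

```python
import xml.etree.ElementTree as ET

# Namespace used by the services document in this draft of the protocol.
APP_NS = {"app": "http://purl.org/atom/app#"}

# Trimmed copy of the example services document, one workspace only.
SERVICE_XML = """
<service xmlns="http://purl.org/atom/app#">
  <workspace title="My Blog">
    <collection contents="entries" title="Blog Entries"
        href="http://localhost:8080/roller/atom/myblog/entries" />
    <collection contents="generic" title="File Uploads"
        href="http://localhost:8080/roller/atom/myblog/resources" />
  </workspace>
</service>
""".strip()

def list_collections(service_xml):
    """Map each workspace title to a list of (contents, href) pairs."""
    root = ET.fromstring(service_xml)
    result = {}
    for workspace in root.findall("app:workspace", APP_NS):
        result[workspace.get("title")] = [
            (c.get("contents"), c.get("href"))
            for c in workspace.findall("app:collection", APP_NS)
        ]
    return result

for title, collections in list_collections(SERVICE_XML).items():
    print(title)
    for contents, href in collections:
        print("  ", contents, href)
```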

Working with collections

So, a workspace is a blog and a blog contains collections of things like entries, uploaded file resources, categories, etc. All of these collections are handled the same way and respond the same way to the HTTP verbs. All support paging, so the server can return a collection in easy to digest chunks. All support query by date, so you can filter collections by start and end date.

To get a collection, do a GET on its URI (that's the collection's 'href' as listed in the services document).

The services document specifies a URI for each collection. If you do a GET on a collection URI, you'll get back an Atom Collection XML document that lists the first X members in the collection. If there are more than X members, then the document will include a next URI which you can use to get the next batch of members. You can also specify an HTTP Range header to restrict a collection to only those members between specific start and end dates.

Here is an example of the collection document XML you might receive by doing a GET on the Blog Entries collection from My Blog above.

<?xml version="1.0" encoding='utf-8'?>
<collection xmlns="http://purl.org/atom/app#"
   next="http://localhost:8080/roller/atom/myblog/entry/77088a1" >
   <member title="The Connells: Fun and Games"
         href="http://localhost:8080/roller/atom/myblog/entry/7700a0" 
         updated="2005-04-16T23:07:08-0400" />
   <member title="The Connells: Boylan Heights"
         href="http://localhost:8080/roller/atom/myblog/entry/7700c9" 
         updated="2005-04-15T23:06:09-0400" />
   <member title="The Connells: Gladiator"
         href="http://localhost:8080/roller/atom/myblog/entry/7700c8" 
         updated="2005-04-14T14:05:31-0400" />
</collection>
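
Paging through a collection is just a loop that follows the next URI until a page arrives without one. Here is a minimal Python sketch of that loop. The two canned pages and their example.com URIs are made up, and the fetch argument stands in for the authenticated HTTP GET a real client would do.

```python
import xml.etree.ElementTree as ET

APP = "{http://purl.org/atom/app#}"

def iter_members(first_uri, fetch):
    """Yield (title, href) for every member, following 'next' links.

    fetch is any callable that takes a URI and returns collection XML.
    """
    uri = first_uri
    while uri:
        root = ET.fromstring(fetch(uri))
        for member in root.findall(APP + "member"):
            yield member.get("title"), member.get("href")
        uri = root.get("next")  # absent on the last page, which ends the loop

# Two canned pages stand in for a server (hypothetical URIs).
PAGES = {
    "http://example.com/atom/entries": (
        '<collection xmlns="http://purl.org/atom/app#" '
        'next="http://example.com/atom/entries?page=2">'
        '<member title="First" href="http://example.com/atom/entry/1"/>'
        '<member title="Second" href="http://example.com/atom/entry/2"/>'
        '</collection>'
    ),
    "http://example.com/atom/entries?page=2": (
        '<collection xmlns="http://purl.org/atom/app#">'
        '<member title="Third" href="http://example.com/atom/entry/3"/>'
        '</collection>'
    ),
}

titles = [t for t, _ in iter_members("http://example.com/atom/entries", PAGES.__getitem__)]
print(titles)  # ['First', 'Second', 'Third']
```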

Each member in a collection document has a title and a URI. To get a member of a collection, you do a GET on the member's URI. To delete it you use DELETE. To update it you use PUT. Here's an example of the entry XML you might receive by doing a GET on the first member of the Blog Entries collection example above:

<entry xmlns="http://purl.org/atom/ns#">
  <title>The Connells: Fun and Games</title>
  <id>7700a0</id>
  <updated>2005-06-01T19:07:45Z</updated>
  <content type="html">
     Let me tear down into your heart
     Let me take a seat and stay awhile
     Let me have a half of your whole  
     Let me keep it for myself awhile  
  </content>
</entry>

To add a member to a collection, you simply POST the member to the collection's URI. If you're POSTing a new entry, send the XML for the entry (example entry XML shown above). If you're POSTing a file upload, then send the file.
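
The mapping from operations to HTTP verbs can be sketched in a few lines of Python. The functions below only build requests; nothing is sent and authentication is left out. The function names, the example.com URIs and the application/atom+xml content type are my assumptions, not something the draft dictates.

```python
from urllib.request import Request

ATOM_TYPE = "application/atom+xml"  # assumed media type for entry documents

def fetch_request(member_uri):
    """GET a single member of a collection."""
    return Request(member_uri, method="GET")

def create_request(collection_uri, entry_xml):
    """POST a new entry to the collection's URI."""
    return Request(collection_uri, data=entry_xml.encode("utf-8"),
                   method="POST", headers={"Content-Type": ATOM_TYPE})

def update_request(member_uri, entry_xml):
    """PUT a changed entry back to the member's own URI."""
    return Request(member_uri, data=entry_xml.encode("utf-8"),
                   method="PUT", headers={"Content-Type": ATOM_TYPE})

def delete_request(member_uri):
    """DELETE the member at its URI."""
    return Request(member_uri, method="DELETE")

# Hypothetical entry and URIs, for illustration only.
ENTRY = '<entry xmlns="http://purl.org/atom/ns#"><title>Hello</title></entry>'
requests = [
    create_request("http://example.com/atom/entries", ENTRY),
    fetch_request("http://example.com/atom/entry/1"),
    update_request("http://example.com/atom/entry/1", ENTRY),
    delete_request("http://example.com/atom/entry/1"),
]
for r in requests:
    print(r.get_method(), r.full_url)
```

Note that only the create goes to the collection URI; every other operation goes to the member's own URI.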

That's it. Pretty simple, huh?

What about authentication?

The Atom spec requires that a server support either HTTP digest authentication, which can be difficult for some bloggers to implement (depending on their ISP and web server), or CGI authentication, which can be a lot easier to implement (I believe WSSE qualifies as CGI authentication, and that's what my Atom implementation uses).
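
For the curious, here is a minimal Python sketch of how a client can build a WSSE header. It follows the commonly described UsernameToken scheme, where the password digest is the Base64-encoded SHA-1 of nonce + created + password. Treat that as an assumption about the convention rather than a quote from the Atom spec or from my implementation; the username and password are made up.

```python
import base64
import hashlib
import os
from datetime import datetime, timezone

def wsse_header(username, password, nonce=None, created=None):
    """Return the value for an X-WSSE request header (UsernameToken style)."""
    if nonce is None:
        nonce = os.urandom(16)  # fresh random bytes for every request
    if created is None:
        created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Digest = Base64(SHA1(nonce + created + password)), per the assumed scheme.
    digest = hashlib.sha1(
        nonce + created.encode("utf-8") + password.encode("utf-8")
    ).digest()
    return 'UsernameToken Username="{}", PasswordDigest="{}", Nonce="{}", Created="{}"'.format(
        username,
        base64.b64encode(digest).decode("ascii"),
        base64.b64encode(nonce).decode("ascii"),
        created,
    )

# Fixed nonce and timestamp make the output repeatable for this example.
header = wsse_header("dave", "secret",
                     nonce=b"0123456789abcdef",
                     created="2005-06-01T19:07:45Z")
print("X-WSSE:", header)
```

The password itself never goes over the wire, but the server has to know the clear-text password (or an equivalent secret) to recompute the digest.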

What about devices with limited HTTP support?

Some devices have crippled HTTP client capabilities. For example, some Java J2ME powered cell phones can't do an HTTP PUT or DELETE. If you can't do a PUT or a DELETE, you can use POST instead with a SOAP wrapper that specifies a Web-Method of PUT or DELETE. Atom servers are required to support SOAP POSTs and to return results in SOAP envelopes.

What about draft vs. published states for entries?

Still undecided. Some folks suggest using different collections for entries in different states (draft, approved, published, etc.). But it's more likely that a new element will be introduced in the Atom format to specify the state of an entry.

Summary

That's my summary of Atom protocol as it stands today. I think it's a great improvement over existing blog APIs in terms of features and design. And it's not very difficult to implement. I know because I'm almost done with my server and client implementations. I hope to release them shortly for review. The release will probably be a standalone release of Roller (the Atom server) and my BlogClient UI (the Atom client).

If you see some opportunities for improvement in the protocol, please join the Atom Protocol mailing list and help out. Last call for spec changes is now slated for October. And for the Atom experts out there: what did I get wrong? Leave a comment.


Welcome to May

Time to test the power of positive thinking: this is the month that Atom Protocol settles down and becomes stable enough for me to finish up the book.
