Blogging Roller

Dave Johnson on open web technologies, social software and software development

Javablogs 1.2 using Informa.

Charles Miller posted an interesting write-up on the recent Javablogs 1.2 improvements. I found it especially interesting that Javablogs now uses the Informa RSS parser, which takes the strict XML approach to RSS feed parsing: if the XML is invalid, it ignores the feed. They are considering using Mark Pilgrim's ultra-liberal Universal Feed Parser, which is much more forgiving, but it is written in Python and may need to be run in a separate process. Wouldn't it be nice if Informa provided ultra-liberal parsing capabilities? Hmmm... I wonder... how hard would it be to port Pilgrim's parser to Jython?
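To make the "strict" approach concrete, here is a minimal sketch using only the JDK's built-in XML parser. It is not Informa's code or API; the class name `StrictFeedCheck` and the method `parseStrict` are made up for illustration. The idea is simply that any XML parse error causes the whole feed to be skipped.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Informa or Javablogs.
public class StrictFeedCheck {

    // Returns the parsed document, or empty if the XML parser rejects the feed.
    public static Optional<Document> parseStrict(String feedXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            byte[] bytes = feedXml.getBytes(StandardCharsets.UTF_8);
            return Optional.of(builder.parse(new ByteArrayInputStream(bytes)));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Strict policy: a feed that fails to parse is ignored entirely.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String good = "<rss version=\"2.0\"><channel><title>ok</title></channel></rss>";
        // An unescaped ampersand is enough to make the XML unparseable.
        String bad = "<rss version=\"2.0\"><channel><title>Tom & Jerry</title></channel></rss>";

        System.out.println("good feed accepted: " + parseStrict(good).isPresent());
        System.out.println("bad feed accepted:  " + parseStrict(bad).isPresent());
    }
}
```

A liberal parser would instead try to recover something usable from the second feed, which is exactly what a standard XML parser will not do.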

Dave Johnson in Java • 🕒 02:35PM Apr 14, 2004
Tags: Java
Comments:

Many of the arguments in favour of liberal parsing don't actually apply to Javablogs: if your RSS feed doesn't appear on Javablogs because it's broken, that should be incentive for you to fix the feed. (That said, a lot of Roller feeds seem to be spitting out invalid UTF-8.) I've found a couple of bugs in Informa where perfectly valid feeds don't get read because they have a namespace declaration. Luckily they're easy to patch. Open Source is cool that way.

Posted by Charles Miller on April 14, 2004 at 10:13 PM EDT #

And Niko (Informa's project manager) is pretty good at getting patches integrated too. The problem with 'liberal' parsing is that often invalid feeds are actually invalid XML. Informa saves a load of time by using standard XML parsers, which assume they are dealing with valid XML. I don't know of a liberal XML parser, so I expect integrating a liberal parser into Informa would be a lot of work. If Mark Pilgrim's feed parser could be easily ported, however, extending Informa to support user-defined parsers isn't much work at all; then you get all the benefits of Informa's nice feed API.

Posted by Sam Newman on April 15, 2004 at 07:33 AM EDT #

I did some research yesterday, but as a complete J/Python novice I couldn't figure out how to run a Python script in Jython. My one previous attempt to run Python (the wxAtom client) failed completely.

Posted by Lance on April 17, 2004 at 02:15 PM EDT #
