Dave Johnson on open web technologies, social software and Java
Here is an XSL transform for converting a flat OPML file (like those produced by PlanetPlanet sites) to a Roller Planet config file (with all subscriptions in one group): opml2planet.xsl
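For readers who would rather see the idea in Java than in XSL, here is a rough sketch of the first half of that conversion: pulling the subscriptions out of a flat OPML file. It is not the opml2planet.xsl transform itself. The class name OpmlSubscriptions and the example.org URLs are made up for illustration, and writing the result out as a Roller Planet config file is left out because the config format is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Planet Roller.
class OpmlSubscriptions {

    // Returns one "title | feedUrl" string per <outline> that has an xmlUrl attribute.
    static List<String> extract(String opml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(opml)));
            NodeList outlines = doc.getElementsByTagName("outline");
            List<String> subs = new ArrayList<>();
            for (int i = 0; i < outlines.getLength(); i++) {
                Element outline = (Element) outlines.item(i);
                String feedUrl = outline.getAttribute("xmlUrl");
                if (feedUrl.isEmpty()) {
                    continue; // an outline without xmlUrl is not a feed subscription
                }
                // OPML files use "title", "text", or both for the display name.
                String title = outline.getAttribute("title");
                if (title.isEmpty()) {
                    title = outline.getAttribute("text");
                }
                subs.add(title + " | " + feedUrl);
            }
            return subs;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse OPML", e);
        }
    }

    public static void main(String[] args) {
        String opml = "<opml version=\"1.1\"><body>"
                + "<outline text=\"Blog A\" xmlUrl=\"http://example.org/a/rss.xml\"/>"
                + "<outline text=\"Blog B\" xmlUrl=\"http://example.org/b/atom.xml\"/>"
                + "</body></opml>";
        for (String sub : extract(opml)) {
            System.out.println(sub);
        }
    }
}
```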
Planet Roller is a community aggregator, a tool for creating a website that combines related but separately hosted blogs together into one blog with its own newsfeed. Planet Roller will eventually be part of Roller, but for the upcoming Roller 1.1 release it's in the Roller "sandbox" and will only be available in custom builds. There's also a standalone version of Planet Roller, which I'll describe below.
Here's some status. I spent most of the week creating the infrastructure needed for configuring and running Planet Roller inside of Roller. That means storing the subscription and group configuration in a database, rather than an XML file. And, it means doing aggregation via a database query rather than spinning through a bunch of hashtables. Once I'm done, we'll have a custom build of Roller that puts every Roller blog on the system into the aggregator and allows us to add separately hosted blogs into the mix.
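To make "spinning through a bunch of hashtables" concrete, here is a minimal sketch of in-memory aggregation: take the entries of every subscribed feed, merge them, and keep the newest few. The Aggregator and Entry names are hypothetical and this is not the actual Planet Roller code; it only illustrates the kind of work a single ORDER BY query can take over once the entries live in a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not the Planet Roller implementation.
class Aggregator {

    static final class Entry {
        final String blog;
        final String title;
        final long publishedMillis;

        Entry(String blog, String title, long publishedMillis) {
            this.blog = blog;
            this.title = title;
            this.publishedMillis = publishedMillis;
        }
    }

    // Merges the entries of all feeds and returns the newest maxEntries, newest first.
    static List<Entry> aggregate(Map<String, List<Entry>> entriesByFeed, int maxEntries) {
        List<Entry> all = new ArrayList<>();
        for (List<Entry> feedEntries : entriesByFeed.values()) {
            all.addAll(feedEntries);
        }
        all.sort(Comparator.comparingLong((Entry e) -> e.publishedMillis).reversed());
        return new ArrayList<>(all.subList(0, Math.min(maxEntries, all.size())));
    }

    public static void main(String[] args) {
        Map<String, List<Entry>> feeds = new HashMap<>();
        feeds.put("Blog A", Arrays.asList(new Entry("Blog A", "First post", 1000L)));
        feeds.put("Blog B", Arrays.asList(new Entry("Blog B", "Later post", 2000L)));
        for (Entry e : aggregate(feeds, 10)) {
            System.out.println(e.blog + ": " + e.title);
        }
    }
}
```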
Want to try Planet Roller? I've been testing a standalone command-line version of Planet Roller, which I call Planet Tool, by running a site called Triangle Bloggers, which combines a bunch of local blogs in the Raleigh-Durham area. So, one way to try Planet Roller is to visit that site and subscribe to the feed. Triangle Bloggers has been a good testing experience because I've been forced to deal with a wide variety of Atom and RSS feeds. Planet Tool can handle Atom and just about any form of RSS, as long as it has item level publication dates (i.e. must be RSS 0.93 or later).
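If you are not sure whether a feed carries item-level dates, a quick check along these lines can tell you. This is only a sketch using the JDK's XML parser rather than ROME, which is what Planet Tool actually relies on. It assumes an RSS feed whose items are dated with either pubDate or dc:date, and it does not cover Atom. The FeedDateCheck name is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; Planet Tool itself parses feeds with ROME.
class FeedDateCheck {

    // Counts <item> elements that have neither a <pubDate> nor a <dc:date> child.
    static int itemsMissingDates(String rssXml) {
        try {
            // The default factory is not namespace-aware, so "dc:date" is matched as a plain tag name.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rssXml)));
            NodeList items = doc.getElementsByTagName("item");
            int missing = 0;
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                boolean dated = item.getElementsByTagName("pubDate").getLength() > 0
                        || item.getElementsByTagName("dc:date").getLength() > 0;
                if (!dated) {
                    missing++;
                }
            }
            return missing;
        } catch (Exception e) {
            throw new IllegalArgumentException("Feed is not parseable as XML", e);
        }
    }

    public static void main(String[] args) {
        String rss = "<rss version=\"2.0\"><channel><title>Example</title>"
                + "<item><title>Dated</title><pubDate>Sat, 05 Mar 2005 10:00:00 GMT</pubDate></item>"
                + "<item><title>Undated</title></item>"
                + "</channel></rss>";
        System.out.println("Items without a date: " + itemsMissingDates(rss));
    }
}
```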
If you want to try running Planet Tool and creating your own aggregated blog, you can get the tool here: planet-roller-1.1-dev.tar (source is included). If you have Java installed, all you need to do is download it, un-tar it (with tar or WinZip), open a command window, and run either planet-tool.sh or planet-tool.bat. It reads an XML config file and then generates the HTML and XML files needed for an aggregated blog. To keep your aggregated blog up to date, you'll need to run Planet Tool on a schedule, so run it as a cron job or as a Windows Scheduled Task.
For more information on the config file and on page templates see this blog entry:
Rome + Texen = Planet Roller
For more information on how Planet Tool works:
Planet Roller Internals
We the committers and friends of the open source Roller Weblogger project propose that the project become part of the Apache Software Foundation. The rest of this document explains the rationale behind this proposal, how Roller meets the Apache project scope, initial source, resources required, and initial committer criteria. [Read More]
Big IDEA: from Zane State College's IDEA Center: Wow, I'm impressed with Roller. A brief rundown of some of the plusses:
Maybe WordPress MultiUser is really the way to go. I used the regular WP for a class blog previously and liked having the multi-author blog. But I also really like Roller now; maybe the developers will add multi-author blogs soon.
- configurable editor interfaces: plain, WYSIWYG (java, IE-only and Mozilla-only)
- timed availability of comments (enable for n days)
- enable/disable comments (per-post or blog-wide)
- nice blogroll import from OPML
- bookmark import
- create static pages: the link is created for you and added to the main navigation for your blog
- per-user themes
- spell check
- new user registration
- rss: site-wide, per-blog, and per-category
I've been digging into Part II of the book and making great progress. So far, I've written three of the "blog app" chapters. Recall that these are short chapters, each centered on an interesting blog application written in Java or C#. I did a blog app chapter about the Planet Tool aggregator, one about searching and monitoring blogs with newsfeed search engines, and one about a Cross Poster in C#. The Cross Poster is like Ben Hammersley's one, except mine handles both RSS and Atom feeds and can post to any MetaWeblog API based blog server (Ben's only does Movable Type).
Now I'm working on a pair of blog apps called MailBlogger (blogging via email) and BlogMailer (which sends you a daily digest of your favorite blogs via email). This stuff is really easy on the Java-side thanks to JavaMail and ROME. On the C#/.NET side: it's .NOT so easy.
Where is the .NET analog to JavaMail? As far as I can tell, the .NET class libraries support only SMTP and the free alternatives are pretty weak, especially when you compare them to JavaMail. I ended up using Pawel Lesnikowski's open source Mail Namespace for C#, which is what .NET blogging package Das Blog uses for POP3 support.
And where is the .NET equivalent to ROME? ROME is an active and growing open source product that can parse all newsfeed formats. On the .NET side there are two separate free parser projects, one for RSS and one for Atom, and both appear to be dead. I hope that is not the case because I need those libraries. For the Java blog apps, I can rely on ROME. For the .NET blog apps, I may have to use my own System.XML based parser.
By the way, I had to make some patches to RSS.Net (adding in W3CDateTime) to get it to parse dates properly.
I recently got the news that the book, or some portion thereof, will be released in the Manning Early Access Program (MEAP). That means we're gearing up for production right now. Getting the promotional stuff ready (for example), deciding which funky Manning dude goes on a cover, and putting the early chapters through production even though the book is not quite done. So, when will the book be done? I'm not sure. We have to wait for Atom Protocol to wrap up, but you'll be able to get the early chapters via MEAP in the next month or two.
roller AND ("blog server" OR "blog software")
I'm trying to compare the quality of results from the three services, but so far I can't see too much difference. Ok, on to the topic of the post.
Yesterday, PubSub scooped both Technorati and Feedster with a couple of mentions of Roller. First, in a post from D'Arcy Norman about "Massively Multi User Weblogging." Then in a follow up from James Farmer, who gives his point-of-view on each of the options. Farmer's follow-up is filled with lots of good info, until he gets to Roller where he says:
It looks to me like Roller is going to become part of Sun's enterprise software offerings (here and here) so there's not a heap of point following that up at the moment
Seriously flawed thinking there, so I responded:
Whatever do you mean by "It looks to me like Roller is going to become part of Sun's enterprise software offerings (here and here) so there's not a heap of point following that up at the moment"? If anything, that is a reason to follow up, not a reason against. Anyhow... Roller is a great choice, Sun has over a thousand blogs running on it. JRoller.com has > 7000. There are numerous other large scale sites. Roller is open source software licensed under an Apache license and that ain't gonna change.
Plus, Roller is Java/J2EE -- so it will run anywhere from standalone to Apache/Tomcat to big honkin' app servers like Weblogic, Websphere, and the Sun Java App Server (minor tweaks necessary for some of those app servers, but it will run).
If you're considering setting up a site for hundreds or thousands of blogs, make sure Roller is on the evaluation list.
We took JetBrains up on their offer of free IntelliJ IDEA licenses to open source projects. Today the licenses arrived and I mailed them out to the Roller committers. I'm happy to have another IDE to work with because neither Eclipse nor Netbeans completely satisfies me on Solaris x86. Swing-based Netbeans and IntelliJ run perfectly on Solaris x86, but unfortunately Eclipse does not.
Pat Chanezon: The details are in my post to the ROME dev mailing list: it seems like mod_security, as configured at the TextDrive ISP, refuses HTTP GET with a user-agent containing the string 'java'! How should java people interpret that? I guess as a compliment: if mod_security forbids Java in the user-agent, it means that even spammers and rogue spiders ended up dumping Perl for their nasty HTTP business to use Java instead: Java everywhere ;-)
I burned several hours trying to figure out why Roller Planet (and specifically the Triangle Blogs Aggregator) wouldn't work against Textdrive hosted blogs, only to learn that the reason is an anti-Java filter. So my interpretation is %@#(*&!!! Textdrive is not our friend. They don't support Java and they appear to be actively filtering out Java clients. Perhaps I'm wrong and mod_security is the culprit?
PS. By changing my user-agent string to "Roller Planet 1.1-dev" (I considered the user-agent string "Textdrive s****" but that's just rude), I was able to get beyond the filter.
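For anyone hitting the same filter from plain Java code, the workaround looks roughly like this. It is a sketch built on the JDK's HttpURLConnection, not the actual Roller Planet fetching code, and the FeedFetcher name and example.org URL are made up. The JDK's default User-Agent starts with "Java/", so setting the header explicitly replaces the string the filter was matching on.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not the Roller Planet fetcher.
class FeedFetcher {

    static final String USER_AGENT = "Roller Planet 1.1-dev";

    // Prepares (but does not yet open) a connection that sends a custom User-Agent header.
    static HttpURLConnection open(String feedUrl) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(feedUrl).toURL().openConnection();
            conn.setRequestProperty("User-Agent", USER_AGENT);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = open("http://example.org/feed.xml");
        // No network traffic happens until the response is actually read.
        System.out.println("Will send User-Agent: " + conn.getRequestProperty("User-Agent"));
    }
}
```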
I can't seem to find documentation for Technorati's keyword search syntax anywhere. The help page and FAQ don't say a thing about complex queries and the text field says "Keyword or URL" which indicates to me that, perhaps, only single keywords are supported. I've had some success with NOT, but any more complex queries fail.
Hey lazyweb, where can I find documentation for the Technorati search language?
Tim Bray: the blogosphere, Long Tail and all, is not about the millions of voices, it’s about the millions of ears; it is, more than any other single thing, an improvement in our ability to listen, to find out what’s going on.
I like that.
I put together an experimental Triangle blogs aggregator using the new Planet Tool aggregator. Let me know if you'd like to be added or subtracted from the aggregation. I stole Anton's Blog Together theme. Maybe I can convince him to steal my aggregator.
There is an RSS feed and OPML subscription list for the whole site, and one for each of the groups (Chapel Hill bloggers, RTP bloggers, and TriJUG bloggers).
Doing some research on Technorati and other newsfeed search engines, I happened on Bill's blog and this memory of a midnight walk through the Triangle:
Bill Clinton: So that’s the reason I know these places so well. I remember one night and this is many years ago, I went to see him, but he wasn’t home. There were no buses and I didn’t bring my wallet so I tried to walk from Chapel Hill to Raleigh. Big mistake. After a few miles, I couldn’t go on anymore. I tried to hitchhike, but I found no buyers. I gave up. It was extremely dark. People living in New York don’t know how dark rural areas are. The crickets were chirping, I remember them very well. My old friends the crickets. Since I was in the middle of nowhere I decided to go into the shrub and sleep. I did. It's surprising how easy I fell asleep. I mean. In the middle of nowhere. I had no fear whatsoever. Youth.
Paul Humphreys: Jonathan wants Sun employees to blog. The title on the blogs.sun.com page says 'Welcome to Blogs.sun.com! This space is accessible to any Sun employee to write about anything.' It really is true. I do not get reminders or requests from anyone that I must write wonderful things about Sun, increase my technical content and not to write about random things that no one has any interest in.
People I meet have a hard time believing that "write about anything" bit, but it's true. I'd like to see every company, every organization in fact, adopt something like Tim's Policy on Public Discourse.