Blogging Roller
Dave Johnson on open web technologies, social software and Java
I attended the Triangle Bloggers Conference 2005 on Saturday morning in Chapel Hill. The meeting was held in a classroom large enough to accommodate the approximately 150 people in attendance, with power at every seat and wireless internet. The agenda was divided into three portions, but the conference was really one long, seamless, and very interesting conversation between audience members and the speakers. The themes were using blogs to build community, building a larger readership for your blog, and using blogs in grassroots journalism. Here are a couple of the things I wrote down (these are not 100% accurate quotes):
I also got a chance to talk to folks about corporate blogs at SAS and IBM (both have some internal Roller sites) and student blogs at UNC. I also spent some time talking to Roch Smith, the man behind the Greensboro 101 community aggregator. All in all, it was a great experience. I learned a lot about blogging and I feel a little more connected to my hometown and the Triangle in general. Thanks to Anton Zuiker, Paul Jones, and everybody else who helped put it together. For more information, check here and here.
Tags: Blogging
Planet Roller is born.
Planet Roller is currently a command-line tool that reads a configuration file of newsfeed subscription data, then generates an aggregated weblog with an RSS feed and an OPML listing of all subscriptions. It's essentially a Java version of Planet Planet. I've got it set up to run every 30 minutes. Yes, I'm aware that the RSS gets a warning on validation. No, I haven't added newsfeed autodiscovery yet. Yes, I stole David Edmondson's Planet Sun theme. No, I haven't done any testing on the OPML. Enough questions already! I need to get back to work.
I'll be adding a couple more details to this post as the night progresses.
OK, I'm back. Did I mention that Planet Roller is a community aggregator? "A Community Aggregator is a portal-like web application that displays weblog posts from a group of closely related but separately hosted weblogs and provides synthetic newsfeeds so that readers may subscribe to the group as a whole."
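To make that definition concrete, here is a rough Java sketch of the core idea: pool the entries from several feeds, sort them newest first, and keep only the most recent few. The Entry class and the aggregate method are hypothetical names made up for this illustration; they are not Planet Roller's actual classes.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class AggregationSketch {

    /** Hypothetical entry type: just enough fields to sort and display. */
    static final class Entry {
        final String blog;
        final String title;
        final Instant published;

        Entry(String blog, String title, Instant published) {
            this.blog = blog;
            this.title = title;
            this.published = published;
        }
    }

    /** Merges entries from every feed, newest first, keeping at most maxEntries. */
    static List<Entry> aggregate(List<List<Entry>> feeds, int maxEntries) {
        List<Entry> merged = new ArrayList<>();
        for (List<Entry> feed : feeds) {
            merged.addAll(feed);
        }
        merged.sort(Comparator.comparing((Entry e) -> e.published).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(maxEntries, merged.size())));
    }

    public static void main(String[] args) {
        List<Entry> dave = Arrays.asList(
                new Entry("dave", "Planet Roller is born", Instant.parse("2005-02-14T10:00:00Z")),
                new Entry("dave", "Bloggers conference", Instant.parse("2005-02-12T10:00:00Z")));
        List<Entry> matt = Arrays.asList(
                new Entry("matt", "Some other post", Instant.parse("2005-02-13T10:00:00Z")));
        for (Entry e : aggregate(Arrays.asList(dave, matt), 2)) {
            System.out.println(e.published + " " + e.blog + ": " + e.title);
        }
    }
}
```

The cap plays the same role as the max-page-entries and max-feed-entries settings in the configuration file described next.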
Configuring Planet Roller
Currently, Planet Roller is just a simple command-line tool that is designed to run as a scheduled task. It reads a list of newsfeed subscriptions from an XML file, as shown below. Eventually, there will also be a UI for Planet Roller so that you don't have to shell into a server and edit an XML file to add and delete subscriptions.
<planet-config>
<main-page>control.vm</main-page>
<admin-name>Dave Johnson</admin-name>
<admin-email>dave.johnson@rollerweblogger.org</admin-email>
<site-url>http://rollerweblogger.org/planet</site-url>
<output-dir>/nfs/ank/home1/r/roller/public_html/planet</output-dir>
<template-dir>/nfs/ank/home1/r/roller/planet-roller/templates</template-dir>
<cache-dir>/nfs/ank/home1/r/roller/planet-roller/cache</cache-dir>
<subscription id="dave">
<feed-url>http://rollerweblogger.org/rss/roller</feed-url>
<site-url>http://rollerweblogger.org/page/roller</site-url>
</subscription>
<subscription id="lance">
<feed-url>http://www.brainopolis.com/roller/rss/lance</feed-url>
<site-url>http://www.brainopolis.com/roller/page/lance</site-url>
</subscription>
<subscription id="matt">
<feed-url>http://raibledesigns.com/rss/rd</feed-url>
<site-url>http://raibledesigns.com/page/rd</site-url>
</subscription>
<subscription id="anil">
<feed-url>http://www.busybuddha.org/blog/rss/anil</feed-url>
<site-url>http://www.busybuddha.org/blog/page/anil</site-url>
</subscription>
<subscription id="henri">
<feed-url>http://blog.generationjava.com/roller/rss/bayard</feed-url>
<site-url>http://blog.generationjava.com/roller/page/bayard</site-url>
</subscription>
<subscription id="pat">
<feed-url>http://blogs.sun.com/roller/rss/pat</feed-url>
<site-url>http://blogs.sun.com/roller/page/pat</site-url>
</subscription>
<group handle="roller">
<description>Other folks who are blogging Roller</description>
<max-page-entries>30</max-page-entries>
<max-feed-entries>30</max-feed-entries>
<subscription-ref refid="dave" />
<subscription-ref refid="lance" />
<subscription-ref refid="pat" />
<subscription-ref refid="matt" />
<subscription-ref refid="anil" />
<subscription-ref refid="henri" />
</group>
<group handle="trijug">
<description>Triangle Java User Group Bloggers</description>
<max-page-entries>40</max-page-entries>
<max-feed-entries>40</max-feed-entries>
<subscription-ref refid="dave" />
</group>
</planet-config>
The configuration file contains three types of information: 1) configuration information for the planet site itself, 2) newsfeed subscriptions, and 3) groups. Groups allow a single Planet Roller site to host different aggregations. In the above configuration file, I've defined two groups, "Planet Roller" and "Planet TriJUG" (handles roller and trijug). Note that one subscription can appear in more than one group.
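To show how the subscription-ref elements tie groups back to subscriptions, here is a minimal sketch that reads a config of this shape with the JDK's built-in DOM parser and resolves each refid to a feed URL. It assumes each subscription-ref is an empty element nested inside its group. The class name, method name, and trimmed-down SAMPLE document are made up for illustration; this is not how Planet Roller itself loads its configuration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class PlanetConfigSketch {

    /** A trimmed-down config in the same shape as the one above (illustrative only). */
    static final String SAMPLE =
            "<planet-config>"
          + "<subscription id='dave'><feed-url>http://rollerweblogger.org/rss/roller</feed-url></subscription>"
          + "<subscription id='matt'><feed-url>http://raibledesigns.com/rss/rd</feed-url></subscription>"
          + "<group handle='roller'><subscription-ref refid='dave'/><subscription-ref refid='matt'/></group>"
          + "<group handle='trijug'><subscription-ref refid='dave'/></group>"
          + "</planet-config>";

    /** Returns group handle -> feed URLs, resolving each subscription-ref by its refid. */
    static Map<String, List<String>> feedsByGroup(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse config", e);
        }

        // Index every subscription's feed URL by its id attribute.
        Map<String, String> feedUrlById = new LinkedHashMap<>();
        NodeList subs = doc.getElementsByTagName("subscription");
        for (int i = 0; i < subs.getLength(); i++) {
            Element sub = (Element) subs.item(i);
            String feedUrl = sub.getElementsByTagName("feed-url").item(0).getTextContent().trim();
            feedUrlById.put(sub.getAttribute("id"), feedUrl);
        }

        // Walk the groups and look up each reference in the index.
        Map<String, List<String>> result = new LinkedHashMap<>();
        NodeList groups = doc.getElementsByTagName("group");
        for (int i = 0; i < groups.getLength(); i++) {
            Element group = (Element) groups.item(i);
            List<String> feeds = new ArrayList<>();
            NodeList refs = group.getElementsByTagName("subscription-ref");
            for (int j = 0; j < refs.getLength(); j++) {
                String refid = ((Element) refs.item(j)).getAttribute("refid");
                String feedUrl = feedUrlById.get(refid);
                if (feedUrl == null) {
                    throw new IllegalArgumentException("Unknown refid: " + refid);
                }
                feeds.add(feedUrl);
            }
            result.put(group.getAttribute("handle"), feeds);
        }
        return result;
    }

    public static void main(String[] args) {
        feedsByGroup(SAMPLE).forEach((handle, feeds) -> System.out.println(handle + " -> " + feeds));
    }
}
```

Because groups only hold references, the same feed (dave in the sample) can belong to both groups without being declared twice.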
Customizing Planet Roller File Generation
The command-line version of Planet Roller uses the Texen feature of Velocity to generate whatever files you want in your Planet Roller site. I included templates for HTML, RSS, and OPML, but you can tweak these and/or add whatever you want.
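You tell Planet Roller which templates to use by specifying a Texen control template in the main-page element of the configuration file (control.vm in the example above). Here is a control template: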
#set ($groupHandles = $planet.groupHandles)
#foreach ($groupHandle in $groupHandles)
#set ($outputFile = $strings.concat([$groupHandle, ".html"]))
$generator.parse("html.vm", $outputFile, "groupHandle", $groupHandle)
#set ($outputFile = $strings.concat([$groupHandle, ".rss"]))
$generator.parse("rss.vm", $outputFile, "groupHandle", $groupHandle)
#set ($outputFile = $strings.concat([$groupHandle, ".opml"]))
$generator.parse("opml.vm", $outputFile, "groupHandle", $groupHandle)
#end
The control template loops through the groups defined in the config file and for each, generates an HTML file using the html.vm template, an RSS file using the rss.vm template, and an OPML file using the opml.vm template. You can provide your own control template, or just hack the one that comes with Planet Roller.
Based on the above configuration data and control template, when Planet Roller runs, you'll end up with six files: roller.html, roller.rss, roller.opml, trijug.html, trijug.rss, and trijug.opml.
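If Velocity syntax is unfamiliar, the control template's loop amounts to pairing every group handle with every template. A rough Java equivalent, which only computes the plan and does not generate any files, might look like this (OutputFilesSketch and plan are hypothetical names for this illustration):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OutputFilesSketch {

    /** Maps each output file name to the template that would generate it. */
    static Map<String, String> plan(List<String> groupHandles) {
        Map<String, String> plan = new LinkedHashMap<>();
        for (String handle : groupHandles) {
            // Same order as the control template: HTML, then RSS, then OPML.
            for (String ext : Arrays.asList("html", "rss", "opml")) {
                plan.put(handle + "." + ext, ext + ".vm");
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        plan(Arrays.asList("roller", "trijug"))
                .forEach((file, template) -> System.out.println(file + " <- " + template));
    }
}
```

Adding a new output format would then be a matter of adding one more template and one more parse call per group in the control template.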
Let's look at the RSS template, so you can get a feel for how the templates work.
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
#set($group = $planet.getGroup($groupHandle))
<link>$planet.configuration.url/${group.handle}.html</link>
<description>$utilities.textToHTML($group.description,true)</description>
<lastBuildDate>$utilities.formatRfc822Date($date)</lastBuildDate>
<generator>Roller Planet 1.1-dev</generator>
#set($entries = $planet.getAggregation($group, 30))
#foreach( $entry in $entries )
<item>
<description>$utilities.textToHTML($entry.content,true)</description>
<category>$utilities.textToHTML($entry.category,true)</category>
<link>$entry.permalink</link>
<pubDate>$utilities.formatRfc822Date($entry.published)</pubDate>
#if($entry.author)<dc:creator>$utilities.textToHTML($entry.author,true)</dc:creator>#end
</item>
#end
</channel>
</rss>
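The template relies on $utilities.formatRfc822Date to produce the date format that RSS expects. The implementation of that helper isn't shown here, so the following is only an approximation of the kind of output it needs to produce, using the JDK's built-in RFC 1123 formatter (the four-digit-year revision of the RFC 822 date format) from the modern java.time API. Rfc822DateSketch is a made-up name.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

class Rfc822DateSketch {

    /** Formats a timestamp in the style RSS uses for pubDate and lastBuildDate. */
    static String format(ZonedDateTime when) {
        return DateTimeFormatter.RFC_1123_DATE_TIME.format(when);
    }

    public static void main(String[] args) {
        ZonedDateTime published = ZonedDateTime.of(2005, 2, 14, 10, 6, 0, 0, ZoneOffset.UTC);
        System.out.println(format(published)); // Mon, 14 Feb 2005 10:06:00 GMT
    }
}
```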
And here is the OPML template:
#set($group = $planet.getGroup($groupHandle))
<opml version="1.1">
<head>
<dateCreated>$utilities.formatRfc822Date($date)</dateCreated>
<dateModified>$utilities.formatRfc822Date($date)</dateModified>
<ownerName>$planet.config.adminName</ownerName>
<ownerEmail>$planet.config.adminEmail</ownerEmail>
</head>
<body>
#foreach($sub in $group.subscriptions)
<outline htmlUrl="$utilities.textToHTML($sub.siteUrl)" xmlUrl="$utilities.textToHTML($sub.feedUrl)" text="$utilities.textToHTML($sub.title)" />
#end
</body>
</opml>
Within a template, you have access to the configuration through the $planet object, plus there are a couple of other objects that you'll find helpful in generating files. Here are the objects that are available in a template:
Running Planet Roller
You can run Planet Roller from a simple script, like the one below:
#!/bin/bash
_CP=.:./lib/planet-roller-1.1-dev.jar
_CP=${_CP}:./lib/rollerbeans.jar
_CP=${_CP}:./lib/commons-logging.jar
_CP=${_CP}:./lib/jaxen-full.jar
_CP=${_CP}:./lib/jdom.jar
_CP=${_CP}:./lib/dom4j-1.4.jar
_CP=${_CP}:./lib/rome-0.5.jar
_CP=${_CP}:./lib/rome-fetcher-0.5.jar
_CP=${_CP}:./lib/velocity-1.4.jar
_CP=${_CP}:./lib/velocity-dep-1.4.jar
java -classpath ${_CP} org.roller.tools.planet.PlanetTool $1
If you want Planet Roller to run on a schedule, schedule it. For example, on UNIX you can use cron. I use the following cron task to run Planet Roller on the 6th and 36th minute of every hour:
6,36 * * * * (cd ~roller/planet-roller; ./planet-roller.sh)
Planet Roller uses the Rome Fetcher library to retrieve, parse, and cache newsfeed data to disk. Fetcher uses HTTP conditional GET and ETags to ensure that feeds are only downloaded when they have actually been updated.
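For anyone curious what a conditional GET looks like on the wire, here is a minimal sketch using only the JDK's HttpURLConnection. It is not Rome Fetcher's code or API, just the underlying HTTP technique: send back the ETag and Last-Modified values saved from the previous fetch, and treat a 304 response as "nothing new". The class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

class ConditionalGetSketch {

    /** Builds the request headers that let the server skip the body if nothing changed. */
    static Map<String, String> conditionalHeaders(String cachedEtag, String cachedLastModified) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (cachedEtag != null) {
            headers.put("If-None-Match", cachedEtag);
        }
        if (cachedLastModified != null) {
            headers.put("If-Modified-Since", cachedLastModified);
        }
        return headers;
    }

    /** Returns false on 304 Not Modified and true for any other response code. */
    static boolean feedChanged(URL feedUrl, String cachedEtag, String cachedLastModified)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) feedUrl.openConnection();
        conditionalHeaders(cachedEtag, cachedLastModified).forEach(conn::setRequestProperty);
        try {
            return conn.getResponseCode() != HttpURLConnection.HTTP_NOT_MODIFIED;
        } finally {
            conn.disconnect();
        }
    }
}
```

A real fetcher would also read the ETag and Last-Modified headers from each full response and store them next to the cached feed, so the next run has something to send.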
That's enough for now. Tomorrow, I'll tell you about Planet Roller internals.
Tags: blogging