Dave Johnson on open web technologies, social software and software development
After a couple days of hacking with the Rome Fetcher and Velocity Texen, Planet Roller is born.
Planet Roller is currently a command-line tool that reads a configuration file of newsfeed subscription data, then generates an aggregated weblog with an RSS feed and an OPML listing of all subscriptions. It's essentially a Java version of Planet Planet. I've got it set up to run every 30 minutes. Yes, I'm aware that the RSS gets a warning on validation. No, I haven't added newsfeed autodiscovery yet. Yes, I stole David Edmondson's Planet Sun theme. No, I haven't done any testing on the OPML. Enough questions already! I need to get back to work.
I'll be adding a couple more details to this post as the night progresses.
OK, I'm back. Did I mention that Planet Roller is a community aggregator? "A Community Aggregator is a portal-like web application that displays weblog posts from a group of closely related but separately hosted weblogs and provides synthetic newsfeeds so that readers may subscribe to the group as a whole."
Configuring Planet Roller
Currently, Planet Roller is just a simple command-line tool that is designed to run as a scheduled task. It reads a list of newsfeed subscriptions from an XML file, as shown below. Eventually, there will also be a UI for Planet Roller so that you don't have to shell into a server and edit an XML file to add and delete subscriptions.
<planet-config>
    <main-page>control.vm</main-page>
    <admin-name>Dave Johnson</admin-name>
    <admin-email>dave.johnson@rollerweblogger.org</admin-email>
    <site-url>http://rollerweblogger.org/planet</site-url>
    <output-dir>/nfs/ank/home1/r/roller/public_html/planet</output-dir>
    <template-dir>/nfs/ank/home1/r/roller/planet-roller/templates</template-dir>
    <cache-dir>/nfs/ank/home1/r/roller/planet-roller/cache</cache-dir>

    <subscription id="dave">
        <feed-url>http://rollerweblogger.org/rss/roller</feed-url>
        <site-url>http://rollerweblogger.org/page/roller</site-url>
    </subscription>
    <subscription id="lance">
        <feed-url>http://www.brainopolis.com/roller/rss/lance</feed-url>
        <site-url>http://www.brainopolis.com/roller/page/lance</site-url>
    </subscription>
    <subscription id="matt">
        <feed-url>http://raibledesigns.com/rss/rd</feed-url>
        <site-url>http://raibledesigns.com/page/rd</site-url>
    </subscription>
    <subscription id="anil">
        <feed-url>http://www.busybuddha.org/blog/rss/anil</feed-url>
        <site-url>http://www.busybuddha.org/blog/page/anil</site-url>
    </subscription>
    <subscription id="henri">
        <feed-url>http://blog.generationjava.com/roller/rss/bayard</feed-url>
        <site-url>http://blog.generationjava.com/roller/page/bayard</site-url>
    </subscription>
    <subscription id="pat">
        <feed-url>http://blogs.sun.com/roller/rss/pat</feed-url>
        <site-url>http://blogs.sun.com/roller/page/pat</site-url>
    </subscription>

    <group handle="roller">
        <description>Other folks who are blogging Roller</description>
        <max-page-entries>30</max-page-entries>
        <max-feed-entries>30</max-feed-entries>
        <subscription-ref refid="dave" />
        <subscription-ref refid="lance" />
        <subscription-ref refid="pat" />
        <subscription-ref refid="matt" />
        <subscription-ref refid="anil" />
        <subscription-ref refid="henri" />
    </group>
    <group handle="trijug">
        <description>Triangle Java User Group Bloggers</description>
        <max-page-entries>40</max-page-entries>
        <max-feed-entries>40</max-feed-entries>
        <subscription-ref refid="dave" />
    </group>
</planet-config>
The configuration file contains three types of information: 1) configuration information for the planet site itself, 2) newsfeed subscriptions, and 3) groups. Groups allow a single Planet Roller site to host different aggregations. In the above configuration file, I've defined two groups: "Planet Roller" (handle roller) and "Planet TriJUG" (handle trijug). Note that one subscription can appear in more than one group.
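To make the subscription/group relationship concrete, here is a minimal sketch in Python (not Planet Roller's own Java code) that reads a trimmed-down copy of the configuration and resolves each group's subscription-ref entries to feed URLs. The function name load_groups and the shortened config are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the config above: two subscriptions, two groups.
CONFIG = """
<planet-config>
  <subscription id="dave">
    <feed-url>http://rollerweblogger.org/rss/roller</feed-url>
    <site-url>http://rollerweblogger.org/page/roller</site-url>
  </subscription>
  <subscription id="lance">
    <feed-url>http://www.brainopolis.com/roller/rss/lance</feed-url>
    <site-url>http://www.brainopolis.com/roller/page/lance</site-url>
  </subscription>
  <group handle="roller">
    <subscription-ref refid="dave" />
    <subscription-ref refid="lance" />
  </group>
  <group handle="trijug">
    <subscription-ref refid="dave" />
  </group>
</planet-config>
"""

def load_groups(xml_text):
    """Map each group handle to the feed URLs of its subscriptions."""
    root = ET.fromstring(xml_text.strip())
    # Index subscriptions by id so that groups can look them up by refid.
    feeds = {s.get("id"): s.findtext("feed-url")
             for s in root.findall("subscription")}
    return {
        g.get("handle"): [feeds[r.get("refid")]
                          for r in g.findall("subscription-ref")]
        for g in root.findall("group")
    }

groups = load_groups(CONFIG)
for handle, urls in groups.items():
    print(handle, urls)
```

Because groups only hold references, the "dave" subscription is defined once and shows up in both groups.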
Customizing Planet Roller File Generation
The command-line version of Planet Roller uses the Texen feature of Velocity to generate whatever files you want in your Planet Roller site. I included templates for HTML, RSS, and OPML, but you can tweak these and/or add whatever you want.
You tell Planet Roller which templates to use by specifying a Texen control template in the <main-page> element of the config file. Specify the templates directory in the <template-dir> element. The control template does not generate anything itself. Instead, it drives the file generation process: it determines which files are generated and which template is used for each. Here is Planet Roller's current control template:
#set ($groupHandles = $planet.groupHandles)
#foreach ($groupHandle in $groupHandles)
    #set ($outputFile = $strings.concat([$groupHandle, ".html"]))
    $generator.parse("html.vm", $outputFile, "groupHandle", $groupHandle)
    #set ($outputFile = $strings.concat([$groupHandle, ".rss"]))
    $generator.parse("rss.vm", $outputFile, "groupHandle", $groupHandle)
    #set ($outputFile = $strings.concat([$groupHandle, ".opml"]))
    $generator.parse("opml.vm", $outputFile, "groupHandle", $groupHandle)
#end
The control template loops through the groups defined in the config file and for each, generates an HTML file using the html.vm template, an RSS file using the rss.vm template, and an OPML file using the opml.vm template. You can provide your own control template, or just hack the one that comes with Planet Roller.
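Based on the above configuration data and control template, when Planet Roller runs, you'll end up with six files, three for each group: roller.html, roller.rss, roller.opml, trijug.html, trijug.rss, and trijug.opml.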
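The naming logic of the control template's loop can be sketched in a few lines of Python. This only mirrors how the output file names are derived from the group handles; the function name output_files is hypothetical, and the actual rendering is done by Texen, not by this code.

```python
def output_files(group_handles, extensions=("html", "rss", "opml")):
    """Return one output file name per (group, template) pair,
    in the same order the control template generates them."""
    return [f"{handle}.{ext}"
            for handle in group_handles
            for ext in extensions]

# The two group handles defined in the configuration file above.
files = output_files(["roller", "trijug"])
print(files)
```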
Let's look at the RSS template, so you can get a feel for how the templates work.
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
    #set($group = $planet.getGroup($groupHandle))
    $planet.configuration.url/${group.handle}.html
    <description>$utilities.textToHTML($group.description,true)</description>
    <lastBuildDate>$utilities.formatRfc822Date($date)</lastBuildDate>
    <generator>Roller Planet 1.1-dev</generator>
    #set($entries = $planet.getAggregation($group, 30))
    #foreach( $entry in $entries )
    <item>
        <description>$utilities.textToHTML($entry.content,true)</description>
        <category>$utilities.textToHTML($entry.category,true)</category>
        $entry.permalink
        <pubDate>$utilities.formatRfc822Date($entry.published)</pubDate>
        #if($entry.author)<dc:creator>$utilities.textToHTML($entry.author,true)</dc:creator>#end
    </item>
    #end
</channel>
</rss>
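The template asks $planet.getAggregation($group, 30) for the entries to show. The internals of that call are not covered here, so the following Python sketch only illustrates the general idea of an aggregation under my own assumptions: merge the entries of every feed in the group, sort newest first, and keep at most the requested number. The aggregate function and the sample entries are hypothetical.

```python
from datetime import datetime

def aggregate(feeds, max_entries):
    """Merge entries from several feeds, newest first, keeping max_entries."""
    merged = [entry for entries in feeds for entry in entries]
    merged.sort(key=lambda entry: entry["published"], reverse=True)
    return merged[:max_entries]

# Hypothetical sample data: two feeds with a handful of entries.
dave = [
    {"author": "dave", "published": datetime(2005, 2, 13, 14, 31)},
    {"author": "dave", "published": datetime(2005, 2, 10, 9, 0)},
]
matt = [
    {"author": "matt", "published": datetime(2005, 2, 12, 8, 15)},
]

latest = aggregate([dave, matt], max_entries=2)
print([entry["author"] for entry in latest])
```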
And here is the OPML template:
#set($group = $planet.getGroup($groupHandle))
<opml version="1.1">
    <dateCreated>$utilities.formatRfc822Date($date)</dateCreated>
    <dateModified>$utilities.formatRfc822Date($date)</dateModified>
    <ownerName>$planet.config.adminName</ownerName>
    <ownerEmail>$planet.config.adminEmail</ownerEmail>
    #foreach($sub in $group.subscriptions)
    <outline htmlUrl="$utilities.textToHTML($sub.siteUrl)" xmlUrl="$utilities.textToHTML($sub.feedUrl)" text="$utilities.textToHTML($sub.title)" />
    #end
</opml>
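For comparison, here is a small Python sketch of what a generated subscription list might look like when built with an XML library instead of a text template. It assumes the usual OPML layout of a head element for metadata and a body element holding one outline per subscription; the function name build_opml, the dictionary keys, and the sample title are hypothetical.

```python
import xml.etree.ElementTree as ET

def build_opml(owner_name, subscriptions):
    """Build an OPML 1.1 document with one outline per subscription."""
    opml = ET.Element("opml", version="1.1")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "ownerName").text = owner_name
    body = ET.SubElement(opml, "body")
    for sub in subscriptions:
        # ElementTree escapes attribute values (e.g. "&") on output.
        ET.SubElement(body, "outline",
                      text=sub["title"],
                      htmlUrl=sub["site_url"],
                      xmlUrl=sub["feed_url"])
    return ET.tostring(opml, encoding="unicode")

opml_text = build_opml("Dave Johnson", [
    {"title": "Roller & Java",  # hypothetical title, chosen to show escaping
     "site_url": "http://rollerweblogger.org/page/roller",
     "feed_url": "http://rollerweblogger.org/rss/roller"},
])
print(opml_text)
```

Letting the library handle escaping plays the same role that $utilities.textToHTML plays in the template.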
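Within a template, you have access to the configuration through the $planet object, plus there are a couple of other objects that you'll find helpful in generating files. The templates above use $planet (configuration, groups, and aggregated entries), $groupHandle (the handle passed in by the control template), $utilities (text escaping and RFC 822 date formatting), and $date (the date written into the generated files).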
Running Planet Roller
You can run Planet Roller from a simple script, like the one below:
#!/bin/bash
_CP=.:./lib/planet-roller-1.1-dev.jar
_CP=${_CP}:./lib/rollerbeans.jar
_CP=${_CP}:./lib/commons-logging.jar
_CP=${_CP}:./lib/jaxen-full.jar
_CP=${_CP}:./lib/jdom.jar
_CP=${_CP}:./lib/dom4j-1.4.jar
_CP=${_CP}:./lib/rome-0.5.jar
_CP=${_CP}:./lib/rome-fetcher-0.5.jar
_CP=${_CP}:./lib/velocity-1.4.jar
_CP=${_CP}:./lib/velocity-dep-1.4.jar
java -classpath ${_CP} org.roller.tools.planet.PlanetTool $1
To run Planet Roller on a schedule, use your system's task scheduler; on UNIX, that means cron. I use the following cron task to run Planet Roller at the 6th and 36th minute of every hour:
6,36 * * * * (cd ~roller/planet-roller; ./planet-roller.sh)
Planet Roller uses the Rome Fetcher library to retrieve and parse newsfeed data and to cache it on disk. Fetcher uses HTTP conditional GET and ETags to ensure that feeds are only downloaded when they have actually changed.
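To show what conditional GET means in practice, here is a minimal Python sketch using only the standard library. It is not how Rome Fetcher is implemented; it just illustrates the protocol: send back the ETag and Last-Modified values saved from the previous download, and treat a 304 Not Modified response as "use the cached copy". The function names and the cache dictionary layout are assumptions.

```python
import urllib.error
import urllib.request

def conditional_request(url, etag=None, last_modified=None):
    """Build a GET request carrying the validators from the last download."""
    req = urllib.request.Request(url)
    if etag:
        req.add_header("If-None-Match", etag)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    return req

def fetch_if_changed(url, cache):
    """Refresh cache (a dict with 'etag', 'last_modified', 'body' keys).

    Returns True if the feed was downloaded again, False on 304.
    """
    req = conditional_request(url, cache.get("etag"), cache.get("last_modified"))
    try:
        with urllib.request.urlopen(req) as resp:
            cache.update(etag=resp.headers.get("ETag"),
                         last_modified=resp.headers.get("Last-Modified"),
                         body=resp.read())
            return True
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: the cached body is still current
            return False
        raise

# Building the request does not touch the network; the ETag is made up.
req = conditional_request("http://rollerweblogger.org/rss/roller",
                          etag='"abc123"')
print(req.header_items())
```

With a 30-minute schedule and feeds that change a few times a day at most, most runs should end in a cheap 304 rather than a full download.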
That's enough for now. Tomorrow, I'll tell you about Planet Roller internals.
Dave Johnson in Blogging
02:31PM Feb 13, 2005
Comments [2]
Tags:
blogging
I'm sorry if I missed this info, but is the source for Planet Roller available? Is it packaged with the main Roller distribution? And can it be run without Roller?
Posted by John Sawers on March 05, 2005 at 04:31 AM EST #
Posted by Dave Johnson on March 05, 2005 at 01:47 PM EST #