« Son Volt | Main | rssboard.org »

OpenOffice.org for open source docs?

Ted Husted: The OpenOffice suite provides an interesting opportunity for open source products. Since the suite is free, open source, and multiplatform, using this tool with our projects is little different than using Subversion or Ant.

Problem is, the format is not change-log friendly. By design, all changes made to an ASF product are logged to one of the mailing lists, where they become part of our "communal memory". When a change is made to an OpenOffice document and checked into the repository, it is logged as a change to a binary file. No one watching the project knows what changed unless they spend several minutes opening the document and reviewing the internal change log.

Even so, Roller is deliberating whether to use OpenOffice to maintain its user documentation. The vote is pending now. Since OpenOffice can save to multiple formats, my suggestion is that we also check in a companion HTML document, so that everyone can see what changes in real time. We'd continue to edit the ODF file, and just Save As to HTML before checking in both files. Film at 11.

Yes that's right: Ted Husted (of Struts fame) is blogging!

Plus, Ted has been participating on the Roller dev list and most recently raised some issues with the use of OpenOffice.org for our user and install guides. As you can see, Ted's biggest concern is that when docs are in a binary format, it is hard to monitor doc changes by watching the diffs in SVN commit notifications. Does your open source project use OpenOffice.org and, if so, how did you deal with this issue?
Comments:

I really hate to say this, but one could set up OOo to automatically save in Word .doc format. Then of course check in the file as is since it's a Word file. I'm not positive but I assume that that would solve the problem. Alternately, and this depends on how you manage your files, you could simply use OpenOffice.org's own versions feature to track changes, separately. Choose File > Versions, click Save New Version. On a separate topic, I saw you mentioned Son Volt. I've been listening to Trip Shakespeare again, the band's great-grandfather or something along those lines. Love it.

Posted by Solveig Haugland on January 27, 2006 at 03:32 PM EST #

Thanks Solveig. Saving in Word format would not help since Word is also a binary format. If OOo saved ODF as an XML text file, instead of XML files within a ZIP file, then we'd be able to use simple text diff tools (like the one built into our source code control system) to compare revisions.

Posted by Dave Johnson on January 27, 2006 at 08:52 PM EST #

ODF, under the hood, is a zipped archive, the meat of which is a few XML files: one for the content, one for the styles, and if you have embedded images or other binary media, those as well. A reasonably accessible (though non-normative, of course!) overview is available on Wikipedia [ http://en.wikipedia.org/wiki/OpenDocument_technical_specifications ]. So, the thought is one could unzip the ODF and stick that into the repository (say, with SVN hook scripts), although I don't see how this would address the issue of nice friendly diffs (if that's what Ted Husted is talking about). Another thought is that OO.org can also produce and consume DocBook, which is designed pretty much for this kind of purpose; round-tripping here might not be up to your requirements. I really like this approach, but I'm a DocBook nerd =) Another possibility: use Maven for your build process, and use xdoc or what they're calling APT (almost plain text) for the documentation.

Posted by Adam Constabaris on January 28, 2006 at 10:43 PM EST #
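
The "unzip the ODF and check that in" idea from the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something anyone in this thread actually ran: the function name explode_odf is hypothetical, and re-indenting the XML is an extra step added here so that a line-based diff has something to work with (ODF XML is typically written as one very long line). Re-indenting changes whitespace, so treat the output as a one-way, diff-only copy and keep the original file as the thing you edit.

```python
import zipfile
from pathlib import Path
from xml.dom import minidom


def explode_odf(odf_path, out_dir):
    """Unpack an ODF archive into out_dir, re-indenting its XML members.

    Hypothetical helper for illustration. Returns the member names written.
    """
    out = Path(out_dir)
    written = []
    with zipfile.ZipFile(odf_path) as archive:
        for info in archive.infolist():
            # Skip directory entries and anything that tries to escape out_dir.
            if info.is_dir() or ".." in Path(info.filename).parts:
                continue
            target = out / info.filename
            target.parent.mkdir(parents=True, exist_ok=True)
            data = archive.read(info)
            if info.filename.endswith(".xml"):
                # One element per line, so a text diff shows localized changes.
                data = minidom.parseString(data).toprettyxml(
                    indent=" ", encoding="utf-8"
                )
            target.write_bytes(data)
            written.append(info.filename)
    return written
```

A commit script could call explode_odf("userguide.odt", "userguide.odt.d") (both names made up) and check in the resulting directory next to the document, so that commit notifications carry a text diff of content.xml and styles.xml.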

Checking in an exploded ODT file is an interesting idea. The diffs might not be pretty, but they would be better than what we have now (none, with a binary file). DocBook is another interesting idea, and I experimented with saving from OOo to DocBook, but the DocBook output looks too simple to be useful (e.g. heading and paragraph markup and that's about it).

Posted by Dave Johnson on January 29, 2006 at 12:33 AM EST #

This is an automated task that can be done to get a real diff if you uncompress the doc. Look a bit more into the file format before deprecating it :-) Heck, I don't know what 8+13 = ?

Posted by scott hutinger on February 09, 2006 at 11:42 PM EST #
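
One way to read the "automated task" suggested above is a small script that pulls content.xml out of two revisions of a document and diffs them as text. The sketch below assumes exactly that; the function names content_lines and odf_diff are hypothetical, and it looks only at content.xml (the standard ODF member holding the document body), ignoring styles and embedded media.

```python
import difflib
import zipfile
from xml.dom import minidom


def content_lines(odf_path):
    """Return content.xml from an ODF file as a list of indented lines."""
    with zipfile.ZipFile(odf_path) as archive:
        raw = archive.read("content.xml")
    # Re-indent so each element lands on its own line before diffing.
    return minidom.parseString(raw).toprettyxml(indent=" ").splitlines()


def odf_diff(old_path, new_path):
    """Unified diff of the document body between two ODF revisions."""
    return list(
        difflib.unified_diff(
            content_lines(old_path),
            content_lines(new_path),
            fromfile=str(old_path),
            tofile=str(new_path),
            lineterm="",
        )
    )
```

Printing "\n".join(odf_diff(old, new)) from a commit hook would put a readable summary of the change into the notification mail, without anyone having to open the document.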


Welcome

This is just one entry in the weblog Blogging Roller. You may want to visit the main page of the weblog
