Resolution?

Anthony has found and fixed what looks to be the source of the FreeRoller crashes. I'm keeping my fingers crossed and I will keep you posted.

A break.

I've been pretty busy lately, with a software release at work coinciding with a software release at home and the ongoing FreeRoller difficulties. I need to burn some vacation time so I'm going to take a couple of days off and relax around the house. I'm not sure if this means more or less blogging. Probably more.

Tracking down the FreeRoller problem.

We've been having a helluva time tracking down the problem that is causing <a href="http://www.freeroller.net">FreeRoller</a> to crash two or three times each day. By "crash" I mean that FreeRoller gets slow, stops responding to hits on Roller's Velocity-driven pages, and eventually seems to lock up Tomcat. Once Tomcat is locked up, you will start getting "Document contains no data" errors in your browser.

Here are some of the things I've been working on to resolve this problem:

Ensure that resources are released

Roller uses Castor JDO and the Velocity DataSourceResourceLoader for all database access, so Roller has to trust those components to use and release database connections properly.  Every time Roller gets a database connection (a JDO Database object), it uses that connection within a try block and releases the connection within a finally block.
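The try/finally pattern described above looks roughly like the following. This is a minimal sketch and not Roller's actual code: `PooledConnection`, `IN_USE`, and `doWork` are hypothetical names standing in for a JDO Database object and the pool that hands it out.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ConnectionReleaseSketch {

    /** Counts connections currently checked out (hypothetical; a real pool tracks this itself). */
    static final AtomicInteger IN_USE = new AtomicInteger();

    /** Hypothetical stand-in for a pooled connection such as a JDO Database object. */
    static final class PooledConnection {
        private boolean closed = false;

        PooledConnection() {
            IN_USE.incrementAndGet();
        }

        void query(boolean fail) {
            if (fail) {
                throw new IllegalStateException("simulated database error");
            }
        }

        void close() {
            // Guard against double-close so the counter cannot go negative.
            if (!closed) {
                closed = true;
                IN_USE.decrementAndGet();
            }
        }
    }

    static void doWork(boolean fail) {
        PooledConnection conn = new PooledConnection();
        try {
            conn.query(fail);
        } finally {
            // Runs whether query() returned or threw, so the connection always goes back.
            conn.close();
        }
    }
}
```

Calling `doWork(true)` still throws, but `IN_USE` ends up back at zero either way. That only holds if every code path that acquires a connection is written like this, which is why one forgotten finally block is enough to leak connections.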

FreeRoller uses Roller on Tomcat with DBCP connection pooling <a href="http://jakarta.apache.org/tomcat/tomcat-4.0-doc/jndi-resources-howto.html">via JNDI</a> and the MySQL JDBC drivers. We have not been able to figure out how to configure DBCP to get debug information on the number of database connections in use. If you know how to do this, please drop me a line.

Ensure that exceptions are being handled properly

There were a number of places where exceptions were being blown out to Tomcat instead of being handled properly by using a response error code.  There was also one place where an exception was being thrown from within a catch block.  None of these should cause a crash, but I was guessing that perhaps one of them was tickling a Tomcat or Java VM bug.
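To illustrate the difference, here is a small sketch of a handler that turns a failure into an error code instead of letting the exception escape to the container. It is not Roller's code: `Response` is a hypothetical stand-in for the servlet response (its `sendError` mirrors the shape of `HttpServletResponse.sendError(int, String)`), and `Renderer` and `handle` are made-up names.

```java
final class ErrorCodeSketch {

    /** Hypothetical stand-in for the servlet response; it just records what it was told. */
    static final class Response {
        int status = 200;
        String message = "";

        void sendError(int code, String msg) {
            status = code;
            message = msg;
        }
    }

    /** Hypothetical page renderer that may fail. */
    interface Renderer {
        String render(String page) throws Exception;
    }

    static String handle(String page, Renderer renderer, Response response) {
        try {
            return renderer.render(page);
        } catch (Exception e) {
            // Report the failure as a status code rather than rethrowing to the container.
            // Nothing in this block can throw, so the handler itself cannot fail.
            response.sendError(500, "Could not render " + page);
            return "";
        }
    }
}
```

The catch block is deliberately boring: it records the error and returns. Doing real work there (more rendering, more database calls) is how you end up throwing from inside a catch block.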

Look for infinite loops and infinite recursion

I found and fixed an infinite loop in the error.jsp file which used itself as its own error handler.  That bug was very easy to find due to the big stack trace and stack overflow exception in the Tomcat logs.  

There could be other problems with infinite loops and recursion, but I have not found any evidence of these problems in the FreeRoller logs.  Weblog templates and weblog entries are treated as Velocity templates, so it is possible that a weblog author could introduce an infinite loop or infinite recursion.  Roller tries to protect against this in the Roller macros.includePage() and macros.showWeblogEntries() directives, but we can't offer protection against a truly malicious user.
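One common way to guard against runaway includes is a depth limit. The sketch below shows the idea only; it is not how Roller's macros are implemented. The `@include(name)` syntax is a toy directive invented for this example (it is not Velocity syntax), and `MAX_DEPTH`, `addPage`, and `render` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class IncludeGuardSketch {

    /** Assumed limit on nested includes; the right value depends on the site. */
    static final int MAX_DEPTH = 10;

    private static final String DIRECTIVE = "@include(";

    private final Map<String, String> pages = new HashMap<>();

    void addPage(String name, String body) {
        pages.put(name, body);
    }

    String render(String name) {
        return render(name, 0);
    }

    private String render(String name, int depth) {
        // Stop expanding once the nesting gets too deep, instead of overflowing the stack.
        if (depth >= MAX_DEPTH) {
            return "[include depth exceeded]";
        }
        String body = pages.get(name);
        if (body == null) {
            return "[missing page: " + name + "]";
        }
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int start = body.indexOf(DIRECTIVE, pos);
            int end = (start < 0) ? -1 : body.indexOf(')', start);
            if (start < 0 || end < 0) {
                // No further complete directive: copy the rest verbatim.
                out.append(body.substring(pos));
                break;
            }
            out.append(body, pos, start);
            String target = body.substring(start + DIRECTIVE.length(), end).trim();
            out.append(render(target, depth + 1));
            pos = end + 1;
        }
        return out.toString();
    }
}
```

A page that includes itself now renders ten copies of its own text followed by the marker, rather than recursing forever. Note that this only bounds nesting. It does nothing about a template that loops forever without including anything, or one that fans out into many includes at each level, which is part of why a determined user can still cause trouble.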

Test under heavy load

FreeRoller runs on Tomcat 4.0.6 and MySQL on Linux.  So, I set up Tomcat 4.0.6 and MySQL on my Red Hat 7.1 box and tried some stress tests.  I tried using JMeter, but it kept on running out of memory.  I eventually found Openload and set it up to hit http://snoopdave/roller/page/test1 (a pretty heavy page) and http://snoopdave/roller/rss/test1 with 5 clients on each.  I configured Roller with a session and page-cache timeout of 2 minutes so that the database code gets a workout.  I ran for several hours at heavy load with no ill effects and the number of JDO Database objects never exceeded 20.

Even with a new 0.9.6.3-pre1 build of Roller that addresses the issues above, FreeRoller still had a crash today.  I'm baffled.  If you have any suggestions for testing or debugging tools or approaches, please let me know.  Feel free to browse the code ;-)
