Who ate the threads?
I tried the kill -QUIT trick on FreeRoller this afternoon after it locked up. Looking at the stack traces for Tomcat's 75 threads, I found that most were either waiting for database connections or reading cache entries from disk.
Recently, I had changed the OSCache settings so that Roller caches to disk rather than to memory. I did that because Roller was running out of memory. Caching to disk fixed the out-of-memory problem, but at the expense of heavy disk access. So, I changed the OSCache settings back to cache to both memory and disk. OSCache seems to have some limitations in this area: apparently, if you configure OSCache to cache both in memory and to disk, you can limit the number of entries in the memory cache, but the disk cache is unbounded - which is not the best situation.
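For the curious, configuring OSCache programmatically with those knobs looks roughly like the sketch below. The property names are the ones I remember from the OSCache docs, and the capacity and path values are just examples, not FreeRoller's actual settings.

    import java.util.Properties;
    import com.opensymphony.oscache.general.GeneralCacheAdministrator;

    public class CacheSetup {
        public static GeneralCacheAdministrator createCache() {
            Properties props = new Properties();

            // Cache in memory, but cap the number of in-memory entries
            props.setProperty("cache.memory", "true");
            props.setProperty("cache.capacity", "200");   // example value

            // Also persist entries to disk; note there is no equivalent
            // cap on how large the disk cache can grow
            props.setProperty("cache.persistence.class",
                "com.opensymphony.oscache.plugins.diskpersistence.DiskPersistenceListener");
            props.setProperty("cache.path", "/tmp/roller-cache"); // example path

            return new GeneralCacheAdministrator(props);
        }
    }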
Also, I changed the DBCP connection pooling parameters, reducing maxIdle and maxWait so that threads time out quickly rather than hanging around waiting for connections and thus forcing the creation of new threads.
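Roller gets its pool from Tomcat's JNDI resource settings in server.xml, but the same DBCP pool can be built directly in Java, which makes the knobs easier to see. This is a rough sketch; the driver, URL, and numbers are placeholders, not the real FreeRoller values.

    import javax.sql.DataSource;
    import org.apache.commons.dbcp.BasicDataSource;

    public class PoolSetup {
        public static DataSource createPool() {
            BasicDataSource ds = new BasicDataSource();
            ds.setDriverClassName("org.postgresql.Driver");   // placeholder driver
            ds.setUrl("jdbc:postgresql://localhost/roller");  // placeholder URL
            ds.setUsername("roller");
            ds.setPassword("secret");

            ds.setMaxActive(20);   // hard cap on open connections
            ds.setMaxIdle(5);      // don't keep many idle connections around
            ds.setMaxWait(10000);  // give up after 10 seconds instead of blocking forever
            return ds;
        }
    }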
Those changes kept FreeRoller up for a much longer period of time, but did not fix the problem.
Kicking bricks barefooted.
Andy Oliver: I followed the instructions, step by step, substituting the postgresql script for the mysql and the driver name, etc. (Take for granted that I approximately know how to do these things since I do Java/database-related apps for a living) [...] Nope...It doesn't look like it worked...
Andy Oliver: competence is required to judge competency and that the incompetent often don't know that they are.
I'd like to point out that we have this thing, we call it a "<a href="http://sourceforge.net/mail/?group_id=47722">mailing list</a>," where we answer installation questions and otherwise help people who are trying to get started with Roller. Look into it and if you have any suggestions for making the Roller install process better, please share them with us.
FreeRoller eating threads.
FreeRoller keeps on running out of threads to process incoming requests. This could be a problem in Roller, but I suspect that we may have run up against this bug in the Tomcat 4.0.X HTTP connector:
BUG5735: HTTP connector running out of processors under heavy load
The bug is marked as fixed, but I think that means the fix is in the new Tomcat 4.1.X Coyote HTTP connector.
Next time this happens, I'll try Glen Nielson's advice (from the bug report): I would recommend that you dump the stack for all running threads when you experience this problem. This can help identify what is causing the problem. By reviewing the stack dump for each thread you can determine whether the problem is due to Tomcat or your application. On unix you do a kill -QUIT {tomcat java pid} to cause the thread stacks to be dumped to catalina.out. A Processor for Tomcat runs your application code, and delays in your code can cause additional processor threads to be created to handle new requests. Possible application or configuration problems which can delay requests: connection delays due to networked services such as a db, connection delays due to running out of pooled resources, thread synchronization deadlocks, and a cascading effect where many new processors get created due to excessively long JVM garbage collections (start java with -verbose:gc to detect this).
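kill -QUIT is the standard way to get a thread dump, but for what it's worth, JDK 1.5 and later JVMs can also produce the same kind of dump programmatically, which could let the app log its own threads when it notices trouble. A rough sketch, not something Roller actually does:

    import java.util.Map;

    public class ThreadDumper {
        // Print a stack trace for every live thread, similar to what
        // kill -QUIT writes to catalina.out (requires JDK 1.5 or later).
        public static void dumpAllThreads() {
            Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
            for (Map.Entry<Thread, StackTraceElement[]> entry : traces.entrySet()) {
                Thread t = entry.getKey();
                System.out.println("Thread: " + t.getName() + " (state: " + t.getState() + ")");
                for (StackTraceElement frame : entry.getValue()) {
                    System.out.println("    at " + frame);
                }
                System.out.println();
            }
        }
    }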