Overview
Displaying a blog page can take dozens of database queries, and database queries can be expensive. They take time, consume CPU cycles and typically use network bandwidth. Roller's built-in caching system addresses this problem by caching generated pages and feeds. By default, Roller caches pages and feeds in memory using a Least Recently Used (LRU) algorithm, and the default caches are sized for a system with about 100 blogs. If you are running a site with more blogs or a very high-traffic site, you should consider changing the caching configuration. First, let's discuss how the caches work.
Cache invalidation and expiration
When Roller generates a page, it puts a copy of that page in a cache. The next time that a request comes in for that page, Roller returns the page from the cache. When a blog changes, Roller invalidates the blog's cache entries, i.e. it throws that blog's pages out of the cache. And by default, when the cache is full and we need to add a new entry, we push out the least recently used entry to make room; that's the LRU algorithm I mentioned before.
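To make the LRU and invalidation behavior concrete, here is a minimal sketch in Java. It is not Roller's actual cache code: the class name LruPageCache, the "blogHandle/path" key format and the use of plain strings for pages are all assumptions made for illustration. It relies only on the standard LinkedHashMap access-order mode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an LRU page cache; NOT Roller's implementation.
class LruPageCache {
    private final int maxEntries;
    private final LinkedHashMap<String, String> entries;

    LruPageCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder=true keeps the least recently used entry at the head.
        this.entries = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // Called after each put: evict the LRU entry once we are over capacity.
                return size() > LruPageCache.this.maxEntries;
            }
        };
    }

    // Returns the cached page, or null on a cache miss.
    synchronized String get(String key) {
        return entries.get(key);
    }

    synchronized void put(String key, String page) {
        entries.put(key, page);
    }

    // Throws out every entry for one blog.
    // Assumption: keys look like "<blogHandle>/<path>".
    synchronized void invalidateBlog(String blogHandle) {
        entries.keySet().removeIf(key -> key.startsWith(blogHandle + "/"));
    }

    synchronized int size() {
        return entries.size();
    }
}
```

With a capacity of 2, putting a third page pushes out whichever of the first two was read least recently, and invalidateBlog("alice") drops only the keys that start with "alice/".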
Sometimes, a blog page includes things that change frequently, like a list of referrers, a server-side hit counter or data from some other source. We don't want to invalidate a blog's cache entries every time a hit is counted. That would defeat the purpose of the cache. So, by default Roller uses an expiring cache that automatically invalidates cache entries after a timeout period.
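The idea of an expiring cache can be sketched in a few lines of Java. Again, this is an illustration under assumptions and not Roller's own code: ExpiringPageCache is a hypothetical name, and the clock is passed in as a LongSupplier only so that the timeout is easy to exercise without waiting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch of a time-based expiring cache; NOT Roller's implementation.
class ExpiringPageCache {
    private static final class Entry {
        final String page;
        final long storedAtMillis;

        Entry(String page, long storedAtMillis) {
            this.page = page;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long timeoutMillis;
    private final LongSupplier clock; // supplies the current time in milliseconds

    ExpiringPageCache(long timeoutSeconds, LongSupplier clock) {
        this.timeoutMillis = timeoutSeconds * 1000L;
        this.clock = clock;
    }

    synchronized void put(String key, String page) {
        entries.put(key, new Entry(page, clock.getAsLong()));
    }

    // Returns the cached page, or null if it is missing or older than the timeout.
    synchronized String get(String key) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() - entry.storedAtMillis >= timeoutMillis) {
            entries.remove(key); // expired: drop it so the page gets regenerated
            return null;
        }
        return entry.page;
    }
}
```

In real use you would construct it with something like new ExpiringPageCache(3600, System::currentTimeMillis). A page with a hit counter would then be at most one hour stale, without any explicit invalidation on each hit.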
Cache configuration
To configure the Roller caches, you add properties to your roller-custom.properties properties override file. You can learn more about this override in Section 6 of the Roller 4.0 Installation Guide and you can find a complete list of the properties you can override in Section 11.
First, let's cover the default caching mechanism. If you're running a large and high-traffic site, you might want to consider using the non-expiring cache or setting the cache timeout very high (4, 6 or 12 hours). Here's how you tell all caches to use the non-expiring cache:
cache.defaultFactory=org.apache.roller.util.cache.LRUCacheFactoryImpl
However, if you do that, then blogs that use Roller's built-in hit counter or that display referrers will not be updated as often as your users would like. So, you might want to consider removing the #showReferrersList() macro from any themes in use on your site.
Configuring Roller's four page and feed caches
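You can configure caching differently for the different types of pages and feeds produced by Roller. There are four separately configurable caches. Here are their names and an explanation of each:
weblogpage: the weblog page cache (all the weblog content)
weblogfeed: the feed cache (XML feeds such as RSS and Atom)
sitewide: the site-wide cache (all content for the site-wide front-page weblog)
planet: the planet cache (planet feeds)
And for each one of these caches you can configure these properties:
enabled: true to turn the cache on, false to turn it off
size: the maximum number of entries the cache will hold
timeout: how long, in seconds, an entry stays in the cache before it expires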
Cache property names follow the pattern cache.<cache-name>.<property-name>. The best way to understand how this works is to look at the default cache configuration used by Roller:
# Weblog page cache (all the weblog content)
cache.weblogpage.enabled=true
cache.weblogpage.size=400
cache.weblogpage.timeout=3600

# Feed cache (xml feeds like rss, atom, etc)
cache.weblogfeed.enabled=true
cache.weblogfeed.size=200
cache.weblogfeed.timeout=3600

# Site-wide cache (all content for site-wide frontpage weblog)
cache.sitewide.enabled=true
cache.sitewide.size=50
cache.sitewide.timeout=1800

# Planet cache (planet feeds)
cache.planet.enabled=true
cache.planet.size=10
cache.planet.timeout=1800
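If it helps to see the naming pattern in code, here is a small sketch that reads the three settings for one cache out of a properties text using only java.util.Properties. The CacheSettings class and its methods are hypothetical helpers invented for this example; they are not part of Roller, and Roller's own configuration loading may well work differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper that reads cache.<cache-name>.<property-name> keys.
// NOT part of Roller; shown only to illustrate the naming pattern.
class CacheSettings {
    final boolean enabled;
    final int size;           // maximum number of cache entries
    final int timeoutSeconds; // entry lifetime in seconds

    CacheSettings(boolean enabled, int size, int timeoutSeconds) {
        this.enabled = enabled;
        this.size = size;
        this.timeoutSeconds = timeoutSeconds;
    }

    // Parses properties-file text, e.g. the contents of an override file.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Looks up the settings for one cache name such as "weblogpage".
    // A missing size or timeout raises NumberFormatException;
    // a missing enabled flag is treated as false.
    static CacheSettings forCache(Properties props, String cacheName) {
        String prefix = "cache." + cacheName + ".";
        boolean enabled = Boolean.parseBoolean(props.getProperty(prefix + "enabled"));
        int size = Integer.parseInt(props.getProperty(prefix + "size").trim());
        int timeout = Integer.parseInt(props.getProperty(prefix + "timeout").trim());
        return new CacheSettings(enabled, size, timeout);
    }
}
```

Calling CacheSettings.forCache(props, "weblogfeed") on the defaults above would pick up cache.weblogfeed.enabled, cache.weblogfeed.size and cache.weblogfeed.timeout and nothing else.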
The default cache configurations above are set up for a 100-weblog system. To some extent, this is guesswork. For example, we've decided to cache 4 pages and 2 feeds for each blog. That's how we arrived at cache.weblogpage.size=400 and cache.weblogfeed.size=200. And we've decided to cache weblog pages and feeds for one hour. That's how we arrived at cache.weblogpage.timeout=3600 and cache.weblogfeed.timeout=3600.
You might decide to do things a little differently on your Roller system. Copy the properties above to your roller-custom.properties file and set them to values you think are appropriate for the number of weblogs, average page size, traffic levels and JVM heap size of your Roller installation.
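As a starting point for picking sizes, here is a small sketch that applies the same rule of thumb (4 pages and 2 feeds per blog) to any number of blogs and prints the matching properties. The CacheSizing class is hypothetical, and scaling linearly is an assumption: it ignores page size, traffic and heap, so treat the output as a first guess to tune.

```java
// Hypothetical helper that scales the "4 pages and 2 feeds per blog" rule of thumb.
// Linear scaling is an assumption; check the results against your JVM heap size.
class CacheSizing {
    static final int PAGES_PER_BLOG = 4;
    static final int FEEDS_PER_BLOG = 2;

    // Builds roller-custom.properties lines for the page and feed caches.
    static String suggestProperties(int blogCount, int timeoutSeconds) {
        StringBuilder sb = new StringBuilder();
        sb.append("cache.weblogpage.size=").append(blogCount * PAGES_PER_BLOG).append('\n');
        sb.append("cache.weblogpage.timeout=").append(timeoutSeconds).append('\n');
        sb.append("cache.weblogfeed.size=").append(blogCount * FEEDS_PER_BLOG).append('\n');
        sb.append("cache.weblogfeed.timeout=").append(timeoutSeconds).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        // 100 blogs with a one-hour timeout reproduces the defaults discussed above.
        System.out.print(suggestProperties(100, 3600));
    }
}
```

For 100 blogs this gives the default sizes of 400 and 200; for 1000 blogs it would suggest 4000 and 2000, which you would then weigh against the memory you can spare.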
Conclusion
Roller's default cache configuration will work well without modification for a small to medium-size Roller installation, but for large high-traffic sites you should increase cache sizes and think carefully about timeouts. And if you're running Roller in a cluster, you might want to consider using a distributed caching system like memcached. I'll discuss that in my next HOWTO.
This work is licensed under a Creative Commons License.
Copyright 2002-2007, David M Johnson (dave.johnson at rollerweblogger.org)
This is a personal weblog, I do not speak for my employer.

Posted by German Eichberger on March 04, 2008 at 01:58 PM EST #
Posted by Anil Samuel on March 24, 2008 at 09:00 PM EDT #
Posted by Anil Samuel on March 25, 2008 at 12:46 AM EDT #
Anil, the cache info is not intended to be used as a hit counter, but if you wanted to display cache-info on a blog page and you can code in Java then you could write your own plugin model.
I'd like to see better blog statistics in Roller, but that has not been a priority for us because the big Roller sites are on the open internet where we can use services like Google Analytics.
- Dave
Posted by Dave Johnson on March 25, 2008 at 08:27 AM EDT #