More about FreeRoller stability and performance.
Cameron Purdy commented on FreeRoller performance recently:
Apparently, over half the load is on the RSS/XML side (not the "interactive web/HTML" side,) and it's enough to completely saturate their T1 connection. Wow! And it's running on an eMachine (a US$200 "Walmart computer") that was given up after it had served its useful life. So in a way, it's amazing that it runs at all. However, Anthony has plans to cluster it, and with a couple of commodity servers to run on (plus enough bandwidth?), it should be back flying again, hopefully before too long.
Recently, I've been helping Anthony Eden keep FreeRoller up and I've been watching the logs and the stats. Cameron is right: FreeRoller gets a hell of a lot of RSS traffic, and it is running on a pretty lightweight machine: a 450MHz Pentium II with 256MB RAM. That might be fine for an Apache/mod_perl app, but for big beefy Roller/Tomcat/Struts/Velocity/Castor it's just not enough.
Still, I'm doing what I can to improve performance. For example, this weekend I added caching to FreeRoller's RSS feeds. To do this I had to add: 1) handling for the IF-MODIFIED-SINCE header in the PageCache Servlet Filter and 2) a new filter mapping to map '/rss/*' to Roller's OSCache-based PageCache Servlet Filter. This should speed up the average RSS response time and reduce both memory usage and database access. It may have some effect on the FreeRoller RSS feeds, so if you notice a problem in a FreeRoller RSS feed let me know about it.
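Here is a rough sketch of the decision that IF-MODIFIED-SINCE handling comes down to. This is not the actual PageCache filter code: the class and method names (ConditionalGet, isNotModified) are made up for illustration, and the Servlet API is left out so the example runs on its own. In a real filter the header value would come from the request, and a "not modified" answer would turn into a 304 response with no body.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper: decides whether a cached feed can be answered with a 304. */
final class ConditionalGet {

    private ConditionalGet() {}

    /**
     * @param ifModifiedSince raw header value sent by the client, or null if absent
     * @param lastModified    when the cached content last changed
     * @return true if the client's copy is still current (reply 304, send no body)
     */
    static boolean isNotModified(String ifModifiedSince, Instant lastModified) {
        if (ifModifiedSince == null || ifModifiedSince.trim().isEmpty()) {
            return false; // unconditional request: send the full feed
        }
        try {
            Instant clientCopy = ZonedDateTime
                    .parse(ifModifiedSince.trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            // HTTP dates have one-second resolution, so compare whole seconds.
            return !lastModified.truncatedTo(ChronoUnit.SECONDS).isAfter(clientCopy);
        } catch (DateTimeParseException e) {
            return false; // unparseable header: fall back to a full response
        }
    }
}
```

A newsreader that polls every few minutes mostly gets the cheap 304 path, which is where the bandwidth and database savings would come from.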
BEA: huge adoption curve climbing very fast for Linux.
From Computer World's interview with BEA's CEO Alfred Chuang:
What Linux trends are you seeing with BEA software?
Huge adoption curve climbing very fast for BEA over the last six to nine months. A lot of focus in the financial services marketplace, where there's a lot of experimentation and initial deployment going on with Linux on Intel. And I think the motivation in that arena is simplification and cost reduction, so they are looking to buy significantly less expensive hardware.
What's the breakdown of platforms on which BEA software is running?
About 50% is on Sun, and about 23%, 24% is on Hewlett-Packard. Hewlett-Packard has both Intel and non-Intel platforms in there. And then it drops off pretty quick. IBM hardware, I think, is 5% or 7%. In some countries, we sell a lot of IBM's hardware.
What about the Linux operating system?
Linux is around the 15% to 20% range, which has climbed pretty quickly.
New JAXB and JSF releases from Sun.
Get 'em while they're hot. The Server Side reports that Sun has finally released <a href= "http://jcp.org/aboutJava/communityprocess/final/jsr031/index.html">JAXB 1.0</a>. The JAXB reference implementation <a href= "http://java.sun.com/webservices/docs/1.1/ReleaseNotes.html#redistribute">is redistributable</a>, but is it open source?
The Server Side also reports that Sun has released JavaServer Faces (JSF) Early Access 3, a tutorial, and a public draft of the JSF spec.
Seedlings poking through.
Frank Boosman: But there is another Silicon Valley rising from the ashes -- a new generation of companies starting and growing up, born out of bad times. These companies are the seedlings poking through the ashes of a devastating fire. When the fire is a memory, they'll be the tallest trees in the new forest.
New Tomcat and Ant releases.
As usual, Matt is on top of the Jakarta news. He notes that new Tomcat 5.0.1 alpha and Ant 1.5.2 are both available for download.
Java IDEs: Market Overview.
<a href= "http://www.computerworld.com/developmenttopics/development/java/story/0,10801,78943,00.html">Computer World</a> summarizes the Meta Group study by writing that Borland, IBM, and Oracle are the "clear leaders" while Sun and JetBrains are the "distant challengers."
Hibernate tests in place and remember me.
I developed Roller without unit testing, but I'm not going to proceed with the Hibernate implementation without having tests in place. Last night, I checked in tests for the Roller NewsfeedManager, the simplest of the Roller backend managers. The next step is to write or generate Hibernate mappings for the NewsfeedData object and code up a new NewsfeedManagerImpl.
While I was working on that, Matt implemented remember me, added email notification for comments, and upgraded his site to the latest Roller from CVS (not even I am ready to do that ;-)
Ready to Hibernate.
Finally, I'm ready to start working on the Hibernate implementation of the Roller Weblogger backend. I ended up doing a lot more refactoring than I had intended. I told you about how I moved code from the Castor implementation into abstract base classes that can be used by the Hibernate implementation. Now I'll describe the changes that I made in the Roller build and code-generation process.
In the original Roller build process, illustrated in Figure 4 of the article Building a J2EE Weblogger, I used abstract javax.ejb.EntityBean classes as the meta-data basis for generating code via the XDoclet EJBDoclet task. I was subverting EJBDoclet: using it to generate Data Objects, Struts Forms, and Castor mappings but not using any of its EJB output.
That worked pretty well, but eventually it became a problem. The generated Data Objects were just dumb data-holders and, over time, we realized that they needed to be smarter "business objects." The Data Objects were regenerated on every build and that made it difficult, if not impossible, to add new methods, new logic, and new collections.
The new Roller build/code-gen process
<img src="http://www.rollerweblogger.org/resources/roller/xdoclet-roller-sm.jpg" alt="Diagram of Roller build process">
The new Roller build process, or at least the code-generation part of it, is shown above. We now start with some hand-written "Plain Old Java Objects" or POJOs. We still have to subvert EJBDoclet because the XDoclet <strutsform> and <castormapping> subtasks can only exist inside EJBDoclet.
I had to use Matt Raible's patched version of <strutsform> (from his struts-resume example) because the one in XDoclet 1.2b2 works only if the source class extends javax.ejb.EntityBean and the new Roller POJOs don't do that. Matt and I consider this to be a bug in EJBDoclet, but I'm not sure the XDoclet guys agree. Maybe Roller should define its own <strutsform> that works on any POJO and that inserts validator tags (something Matt also added in his struts-resume example).
The Roller build/code-gen process is still not perfect, but it's "good enough" for me to begin my Hibernate implementation of the Roller Business Tier interfaces. I'll be blogging as I go so stay tuned.
Dot-Net vs. Java shootout at NCSU.
In addition to offering a week of high-tech training for $95, TechEngage has also arranged a <a href= "http://www.techengage.org/shootout.aspx">Dot-Net vs. Java shootout</a> on Wednesday, March 12, 6:30 PM, at the N.C. State University College of Management in Raleigh, N.C. Representatives from Microsoft and Borland will face off against representatives from IBM, JBoss, and Sun.
Moblogging Roller.
Matt Raible has successfully tested Russell Beattie's <a href= "http://www.russellbeattie.com/notebook/20030225.html#231612">world-famous</a> <a href= "http://www.manywhere.com/Moblogger.html">ManyWhere Moblogger</a> with Roller.
Arguing with Scott Meyers.
If you read the Java blogs, you've probably seen Bill Venners' excellent How to interview a Programmer article by now. If not, check it out. My favorite quote is below:
Scott Meyers: I hate anything that asks me to design on the spot. That's asking to demonstrate a skill rarely required on the job in a high-stress environment, where it is difficult for a candidate to accurately prove their abilities. I think it's fundamentally an unfair thing to request of a candidate.
This one is a close second:
Bruce Eckel: I ask candidates to create an object model of a chicken.
I mean, how can you argue with Scott Meyers and how can you object to a chicken?
Speaking of arguing with Scott Meyers... I worked at one of those phone company research labs back in the 90's and we had enough money to bring in Scott Meyers to teach us all about C++. Scott was explaining how you can overload operators and you can even overload the equals sign, when one student raised his hand. The student explained that there were some situations in telecommunications systems where A was equal to B, but B was not equal to A. Scott immediately objected, of course, but the student went off into some jargon-filled explanation of his particular problem domain. Scott let the student finish and then said, "if you were to overload the equals operator so that A was equal to B, but B was not equal to A, then I would want to kill you." That was the end of the argument.
Preparation for Hibernation.
To prepare for the move to <a href= "http://hibernate.bluemars.net">Hibernate</a>, I've been refactoring the current implementation of the Roller business tier. There is a lot of code in the Castor implementation of the <a href= "http://www.rollerweblogger.org/javadoc/org/roller/model/package-summary.html">persistence manager interfaces</a> that is generic enough to be used in the Hibernate implementation. So, I'm moving that code up into abstract base classes that may be used by both implementations.
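To make the refactoring idea concrete, here is a minimal sketch of the pattern. None of these names come from the Roller code base: FeedManager, FeedManagerBase, and the two subclasses are hypothetical, and plain in-memory lists stand in for the real Castor and Hibernate calls.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical manager interface, standing in for one of the persistence manager interfaces. */
interface FeedManager {
    void store(String feedUrl);
    List<String> listFeeds();
}

/** Persistence-neutral logic lives here so that it is written only once. */
abstract class FeedManagerBase implements FeedManager {

    @Override
    public final void store(String feedUrl) {
        // Validation is the same no matter which O/R framework does the saving.
        String url = feedUrl == null ? "" : feedUrl.trim();
        if (!url.startsWith("http://") && !url.startsWith("https://")) {
            throw new IllegalArgumentException("Not a feed URL: " + feedUrl);
        }
        doStore(url);
    }

    /** The only part each persistence framework has to supply. */
    protected abstract void doStore(String feedUrl);
}

/** Stand-in for the Castor-backed implementation (no real Castor calls here). */
class CastorFeedManager extends FeedManagerBase {
    private final List<String> rows = new ArrayList<>();

    @Override
    protected void doStore(String feedUrl) {
        rows.add(feedUrl); // a real version would go through Castor
    }

    @Override
    public List<String> listFeeds() {
        return new ArrayList<>(rows);
    }
}

/** Stand-in for the Hibernate-backed implementation (no real Hibernate calls here). */
class HibernateFeedManager extends FeedManagerBase {
    private final List<String> rows = new ArrayList<>();

    @Override
    protected void doStore(String feedUrl) {
        rows.add(feedUrl); // a real version would go through Hibernate
    }

    @Override
    public List<String> listFeeds() {
        return new ArrayList<>(rows);
    }
}
```

The base class owns everything that does not care about the persistence framework, so swapping one subclass for the other should not change behavior that callers can see.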
Tuesday at the RTP-WUG: Model Driven Architectures and Eclipse.
Looks like another great Eclipse-oriented talk at the Research Triangle Park WebSphere Users Group (RTP-WUG) meeting Tuesday. Sridhar Iyengar of IBM and Randy Miller of TogetherSoft will speak about Model Driven Architecture and the Eclipse Modeling Framework. Full details on the RTP-WUG web site.
FreeRoller performance.
As FreeRoller users and readers know, FreeRoller is having problems keeping up with the load of 500+ weblogs and the 101-list traffic. Sometimes the main page takes several minutes to load and often the system just collapses under the stress. According to Anthony Eden, who runs FreeRoller, the system is powered by one little eMachines box. I could use that as an excuse, but, regardless of the hardware, the poor performance of FreeRoller reflects poorly on the Roller software.
I considered using a profiler to figure out the root cause of the performance problem, but a profiler is difficult to set up for a Servlet app, there's a learning curve, and there are only so many Roller hours in my weekend. Plus, I had a gut feeling that the Roller main page, index.jsp, was causing the slow-downs.
Looking at the code behind index.jsp, I found that it uses several queries to build a list of weblogs, hit counts, and the last update time for each weblog. The queries are not complex, but because they are executed using an Object-Relational (O/R) persistence framework (Castor) each query results in the creation of one object per row returned. FreeRoller has over 500 weblogs, so each refresh of the index.jsp page results in the creation of thousands of objects. Even with page caching in place to reduce the frequency of refreshes, this has the potential to bog down the whole system.
I rewrote the code behind index.jsp to use only one query and I also added a limit to that query so that a Roller administrator can limit the number of weblogs displayed on the main page. Next, I did some load testing with JMeter. I set up a JMeter test plan to use 3 threads to hit index.jsp, a 20-second ramp up, and a 300 ms delay between requests. I did this testing on a 1.5GHz Athlon, 768MB RAM box running Windows XP, Java 1.4.1, and Tomcat 4.1.18. Here are the results:
Test               Avg.     Dev.   Throughput   Note
1) Old index.jsp   1051ms   8629   120/min      Heavy CPU usage
2) New index.jsp    248ms   1326   248/min      Limit 500 weblogs
3) New index.jsp     93ms    186   247/min      Limit 30 weblogs
4) TC index.jsp      90ms     72   337/min      Tomcat example page
The old index.jsp would peg the CPU at 100% for minutes at a time during cache refreshes. I'm not sure why, but I suspect that multiple requests were simultaneously refreshing the cache (I'm not sure how OSCache handles this). During test #1, Roller was so slow as to be almost unusable. Tests #2 and #3 show that the new index.jsp is a great improvement. With a limit of 500 weblogs, the CPU usage is very heavy during a cache refresh but doesn't get to 100%. With a limit of 30 weblogs, the cache refresh does not have a noticeable effect on the CPU. Test #4 hits the Tomcat examples page; it is included only as a baseline for comparison.
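If simultaneous refreshes really are the culprit, one common remedy is to let only the first request rebuild the page while the others wait for its result. The sketch below shows that idea using only java.util.concurrent. It says nothing about how OSCache actually behaves, and SingleFlightCache is a made-up name for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.function.Supplier;

/** Hypothetical page cache in which only one thread rebuilds a missing entry. */
final class SingleFlightCache<V> {

    private final ConcurrentMap<String, FutureTask<V>> entries = new ConcurrentHashMap<>();

    /** Returns the cached value, building it at most once even under concurrent calls. */
    V get(String key, Supplier<V> loader) {
        FutureTask<V> task = entries.get(key);
        if (task == null) {
            FutureTask<V> fresh = new FutureTask<V>(loader::get);
            task = entries.putIfAbsent(key, fresh);
            if (task == null) {
                // This thread won the race, so it does the expensive build.
                task = fresh;
                fresh.run();
            }
        }
        try {
            // Threads that lost the race block here until the build is done.
            return task.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for " + key, e);
        } catch (ExecutionException e) {
            entries.remove(key, task); // do not keep a failed build in the cache
            throw new IllegalStateException("Could not build " + key, e.getCause());
        }
    }

    /** Drops an entry so that the next request rebuilds it. */
    void invalidate(String key) {
        entries.remove(key);
    }
}
```

With something like this in front of an expensive page, a burst of requests arriving during a refresh costs one rebuild instead of one rebuild per request.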
I made the above changes in the Roller 0.9.6 code branch so that they can be applied to FreeRoller right away. I'm still not happy about Roller performance and memory usage. I'm sure there is plenty of room for improvement. My next experiment is implementing the Roller business tier with the Hibernate O/R framework. I'm curious to see how it compares to Castor.
Lance's Prevayler experiments.
Lance is experimenting with Prevayler by using it and JXPath to implement the Roller business tier. Cool stuff. I'm gearing up for a Hibernate implementation; more about that later.
My review of Newsmonster.
Lots of interesting features, but the one I find myself wanting most is an un-installer.
A terrible, buggy, monster.
I found this very interesting read on the history of AWT, Swing, and SWT from an undisclosed source via Roller user <a href= "http://blog.xesoft.com/page/jon.lipsky/20030221#a_good_read_about_history">Jon Lipsky's blog</a>. Here is a tasty excerpt:
Alan Williamson's mysterious "source close to IBM": At IBM we hated Swing from day one. Big, buggy, and looks [like] crap. Initially our tools such as VisualAge for Java were all written in Smalltalk ( which used native widgets ) so when we started to migrate these to a Java codebase we need a widget set. All of the IBM developers are the same crowd who used to work with Smalltalk, and we reluctantly under management orders built our WebSphere Studio tools using Swing. It was a terrible, buggy, monster. In our initial previews when it was demo'd against Microsoft Visual Studio products all our users hated it just because of how it looked, never mind what it let you do. Most shoppers don't like to get in car that looks and smells terrible, even if it does have a nice engine.
UPDATE: <a href= "http://blog.xesoft.com/page/jon.lipsky/20030221#a_good_read_no_truth">Jon Lipsky</a> was contacted by somebody at Sun who claims there are many and major inaccuracies in the above story.
Java LGPL clarification.
(via Lance) This quote from Free Software Foundation lawyer Eben Moglen seems to clarify the issue of using an LGPL jar in a non-GPL application (such as Roller). As long as the jar is LGPL, rather than GPL, you can include it in your application and then license your application however you choose. Here is the key quote:
Eben Moglen: If the author of the other code had chosen to release his JAR under the Lesser GPL, your contribution to the combined work could be released under any license of your choosing, but by releasing under GPL he or she chose to invoke the principle of "share and share alike."
Some conversations at the last RTP bloggers lunch left some doubt in my mind about LGPL, but the above quote clears it up nicely.
Castor links.
Roller uses the Castor persistence framework. FreeRoller can be very slow.
Are the two things related? I have no real empirical evidence (yet) to prove a link, but I did find an interesting link of a different sort in my referer logs this morning to a blog entry by (Roller user) Matthew Porter. Back in December 2002, Matthew chose Hibernate instead of Castor for the Java Lobby Community Port (JLCP) project. Here's why:
Persistent Framework Choice for JLCP: Castor was not chosen for two primary reasons. The first is the lack of development of Castor in the past year. In addition, one the tests we performed at DMI, Castor was significantly slower than other PFs- to the point where it was intolerable. The first reason and recent tests led me to believe that the situation regarding speed had not changed.
Also, an interesting link was posted in a comment on my Long Transactions post yesterday. This is a pretty interesting article:
O/R Mapping with Castor JDO in the Real World: Castor holds up to its promises in simple testing and trial runs. However, it has proven to fall short in some practical issues with our application of about twenty-five data classes and as many tables. Most of our problems come from the need to hold onto objects across transactions and perform complex updates.
I'm tempted to rewrite the Roller backend using Hibernate just for the hell of it, but I really should do some profiling of Roller to see where the problem lies, don't you think? I guess I could do a 30-day eval of OptimizeIT or JProbe, but I would be happy to hear your recommendations for free and/or open source profiling tools. Got any?
Long transactions.
From the Roller-dev list:
Lance: Castor is *supposed* to be caching objects for us so that we don't need to make repeated calls to the database. I suspect the way we are using Castor may not be optimal, though I really don't know enough to suggest improvements.
Webapps seem to cause problems for O/R frameworks. In an O/R framework, things seem to work properly only inside an O/R "transaction" or "session." In a webapp, you get an object from the database in one session, close that session, allow the user to modify the object in an HTML form, then start another session to update the object in the database. Castor calls this a "long transaction." I wonder how these long transactions affect Castor's ability to cache efficiently.
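To picture the load, edit, update cycle, here is a toy sketch that uses an in-memory map as the "database." It is not Castor's API: WeblogEntry and EntryStore are hypothetical names, and the version-number check is just one assumed way a framework could notice that a detached object has gone stale between the two sessions.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical data object; the version field is used to detect stale copies. */
final class WeblogEntry {
    final String id;
    String title;
    int version;

    WeblogEntry(String id, String title, int version) {
        this.id = id;
        this.title = title;
        this.version = version;
    }

    WeblogEntry copy() {
        return new WeblogEntry(id, title, version);
    }
}

/** Stand-in for the database plus the O/R layer in front of it. */
final class EntryStore {
    private final Map<String, WeblogEntry> rows = new HashMap<>();

    void insert(WeblogEntry entry) {
        rows.put(entry.id, entry.copy());
    }

    /** First session: read a row and hand back a detached copy for the HTML form. */
    WeblogEntry load(String id) {
        return rows.get(id).copy();
    }

    /** Second session: write the edited copy back unless the row changed in the meantime. */
    boolean update(WeblogEntry detached) {
        WeblogEntry current = rows.get(detached.id);
        if (current == null || current.version != detached.version) {
            return false; // stale copy: somebody else saved first
        }
        WeblogEntry saved = detached.copy();
        saved.version++;
        rows.put(saved.id, saved);
        return true;
    }
}
```

The point of the sketch is that the second session knows nothing about the object it is handed, so it has to go back to the stored row before writing, which is exactly the sort of extra work that could undercut caching.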