Wednesday, July 14, 2010

HSQLDB, H2 and Java Unit Testing

I ran into a situation in a production environment where heavily tested methods were failing with database errors, and figured there must be a scenario I had missed somewhere.  After scrubbing for clues, I finally found that a test for it already existed, and it flew right past the error I got from MySQL.  It turns out that our test database, HSQLDB running in-memory, was less strict (or more forgiving) when it came to enforcing not-null columns.  Oops.

HSQLDB, or HyperSQL Database Engine, has been around for a long time and is embedded as the database provider for OpenOffice.org 3.2.  It is freely available under a modified BSD license and is used by several not-so-shabby applications including Jira, Spring Framework and Liquibase.  As a developer, I have used HSQLDB for its in-memory or flat-file database formats when testing or prototyping.  However, I never expected to run into a situation where setting not-null on a column would not be honored.

I began looking for an answer in the Hibernate IRC channel, thinking perhaps the database configuration was off or that I had my syntax wrong on that column.  That is when Steve Ebersole, a Hibernate hero, explained to me that he too was experiencing random unexpected problems with HSQLDB and was looking into switching to H2.  In my mind, if Steve Ebersole from Hibernate says "these are the kinds of reasons we are moving to h2 for our testing db," I should probably follow suit.

H2, or H2 Database Engine, was started in 2004 by Thomas Mueller, the same guy who originally wrote Hypersonic SQL, the project HSQLDB grew out of, and is available for free under either a modified Mozilla Public License or the Eclipse Public License.  "H2" actually stands for Hypersonic 2, but it shares no code with HSQLDB.  It is written in pure Java, is very fast, and supports compatibility modes for several database engines (MySQL, Oracle, DB2, even HSQLDB).  The documentation on the website is clean and easy to navigate, and ... it enforces not-null columns.

Integrating H2 into OpenMRS was easy; all it really required was adding a jar and changing the base context-sensitive test class to reference it instead of HSQLDB.  To make it work the way HSQLDB had previously, I had to add a parameter to the JDBC URL: DB_CLOSE_DELAY=-1.  This keeps the in-memory database open, contents intact, after the last connection closes; without it, H2 throws the database away as soon as that connection goes away.  I also had to add an @After annotation to the base test class cleanup method so that H2 started with a clean dataset for each test.  Then came the testing.
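
For anyone wanting to try the same switch, here is a minimal sketch of what the connection settings might look like.  The class and method names are hypothetical, and the database name, the "sa" user and the empty password are assumptions on my part rather than what OpenMRS actually ships; the property keys, driver class and dialect class are the standard Hibernate and H2 ones.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not taken from the OpenMRS code base.
final class H2TestConfig {

    private H2TestConfig() {
    }

    // Builds an in-memory H2 JDBC URL.  DB_CLOSE_DELAY=-1 keeps the database
    // (and its contents) alive until the JVM exits, even after the last
    // connection has been closed.
    static String inMemoryUrl(String dbName) {
        return "jdbc:h2:mem:" + dbName + ";DB_CLOSE_DELAY=-1";
    }

    // Hibernate connection properties pointing at H2 instead of HSQLDB.
    // The user name and password are assumed values for a throwaway test database.
    static Properties hibernateProperties(String dbName) {
        Properties props = new Properties();
        props.setProperty("hibernate.connection.driver_class", "org.h2.Driver");
        props.setProperty("hibernate.connection.url", inMemoryUrl(dbName));
        props.setProperty("hibernate.connection.username", "sa");
        props.setProperty("hibernate.connection.password", "");
        props.setProperty("hibernate.dialect", "org.hibernate.dialect.H2Dialect");
        return props;
    }
}
```

Calling H2TestConfig.hibernateProperties("openmrs") gives a Properties object that can be merged into whatever configuration the test context already builds.  Appending ;MODE=MySQL to the URL would additionally switch on H2's MySQL compatibility mode, though I have not shown that here.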

Initially I found upwards of sixty errors and/or failures, and quickly got the list down to 10 errors and 12 failures after fixing a few Hibernate configuration file issues regarding column length; H2 fails on attempts to insert text exceeding column lengths.  I found that the majority of the remaining problems went away once the data in the individually loaded datasets for the failing tests was set up properly.  Some tests relied solely on data from their custom dataset and did not account for what already existed in the standard dataset, which the custom data either added to or overwrote.

At this point, I feel like I can actually rely on my tests to properly mimic my production environment.  This is a huge relief, and hopefully the HSQLDB group can get their database engine up to snuff soon so the supported apps will not run into these issues.  For me, H2 is the way to go.

Thursday, March 4, 2010

XML, XSLT and HL7 Tools

Over the past few days, I have been focusing on modifying some XSLT used by OpenMRS to transform InfoPath-rendered XML to HL7 messages.  OpenMRS uses HAPI, the HL7 open source Java API, to parse and process HL7 messages.  InfoPath generates XML based on schemata that the FormEntry module renders.  Being a complete novice to HL7, I had to gather a few tools to validate formats and examine the message contents.  Specifically, I needed reliable, open source tools for evaluating and processing XML, XSLT and HL7.

For evaluating HL7 messages, I first happened upon HL7Spy (not free, except for a trial period).  It is really good at not only evaluating HL7 messages, but explaining what each section of each field should be.  Apparently, HL7Spy is also good for manipulating files containing large numbers of messages.  I only need to look at one message, for now.  Looking for an open source solution, I tried to implement the HAPI HL7 Conformance Tools, but ran into issues getting the Message Validator to work in my environment.  I then found HL7 Inspector, which is perfectly fine for examining HL7 messages.  It leaves me wanting more information, but there are numerous HL7 documents floating around and the HAPI docs to help with that.
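
When none of these tools is at hand and there is only one message to look at, a few lines of Java can at least label each field with its position.  This is only a rough sketch: the class name is hypothetical, it assumes the default | field separator, and it does not break fields into components or repetitions the way HAPI or the tools above do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it is not part of HAPI or OpenMRS.
final class Hl7FieldLister {

    private Hl7FieldLister() {
    }

    // Returns one line per non-empty field, e.g. "PID-5: DOE^JOHN".
    static List<String> describe(String message) {
        List<String> out = new ArrayList<>();
        // HL7 v2 segments end with a carriage return; tolerate \n and \r\n as well.
        for (String segment : message.split("\\r\\n|\\r|\\n")) {
            if (segment.isEmpty()) {
                continue;
            }
            // A limit of -1 keeps trailing empty fields so the positions stay correct.
            String[] fields = segment.split("\\|", -1);
            String name = fields[0];
            // In MSH the separator character itself counts as MSH-1,
            // so the first value after the pipe is MSH-2.
            int offset = name.equals("MSH") ? 1 : 0;
            for (int i = 1; i < fields.length; i++) {
                if (!fields[i].isEmpty()) {
                    out.add(name + "-" + (i + offset) + ": " + fields[i]);
                }
            }
        }
        return out;
    }
}
```

Fed a made-up message such as "MSH|^~\\&|SENDER||RECEIVER||20100304||ORU^R01|1|P|2.5\rPID|||123||DOE^JOHN", the list includes "MSH-9: ORU^R01" and "PID-5: DOE^JOHN", which is a lot easier to read against the HL7 docs than the raw line.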

I had much better luck finding XML/XSLT tools.  I really liked using the free trial of Stylus Studio, but the free version of EditiX is perfect.  The paid version does some fancy footwork, like XSLT debugging and profiling, but for validation of both XML and XSLT, browsing the DOM and applying transforms, EditiX works.
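
Applying a transform outside of an editor is also possible with nothing more than the JDK's built-in javax.xml.transform API.  The sketch below is purely illustrative: the class name, the sample XML and the tiny stylesheet are all made up, and they are nowhere near the real FormEntry schema or the XSLT that OpenMRS uses.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; the sample data below is invented for illustration.
final class XsltRunner {

    // A toy stylesheet that emits one pipe-delimited, HL7-looking line as plain text.
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/form\">"
        + "<xsl:text>PID|||</xsl:text><xsl:value-of select=\"patient/id\"/>"
        + "<xsl:text>||</xsl:text><xsl:value-of select=\"patient/family\"/>"
        + "<xsl:text>^</xsl:text><xsl:value-of select=\"patient/given\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // A toy input document standing in for form XML.
    static final String SAMPLE_XML =
        "<form><patient><id>123</id><family>DOE</family><given>JOHN</given></patient></form>";

    private XsltRunner() {
    }

    // Compiles the stylesheet, applies it to the XML and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // A broken stylesheet or malformed input ends up here.
            throw new IllegalStateException("XSLT transform failed", e);
        }
    }
}
```

Running XsltRunner.transform(XsltRunner.SAMPLE_XML, XsltRunner.SAMPLE_XSLT) returns the text PID|||123||DOE^JOHN.  It is no substitute for a proper editor when a stylesheet misbehaves, but it is handy for checking a small change quickly.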

OpenMRS developers: if you start running into HL7 issues, definitely pick up one or more of these tools.  They will certainly help you understand the message better than counting the number of pipes on each line in a text editor.

Thursday, February 25, 2010

Becoming an OpenMRS Developer

I have become an OpenMRS developer.  


The process began when I first heard about OpenMRS, a "community-developed, open-source, enterprise electronic medical record system platform," while perusing job postings in Indianapolis.  The Regenstrief Institute site was driven by Plone, so naturally, I was intrigued.  Several phone calls and one trip to this illustrious city later, I was given the opportunity to come on as a Systems Engineer, dedicated to OpenMRS development.


What I did not know at the time was how extensive OpenMRS's installation base is (there is a map of it, courtesy of RI's Michael Downey).  My role at Regenstrief Institute also incorporates support for the AMRS / AMPATH installation in Eldoret, Kenya.  In just a few very short weeks, I have enjoyed conversations with several developers and implementers from around the world.  This is exactly what I have been waiting for.


Open source development requires a developer to change perspectives.  One has to be open to new ideas, from other cultures as well as other paradigms.  A developer has to stay informed and be ready to contribute advice, especially if another developer begins work down a similar path.  The grammar and language used in documentation, whether within code or on a wiki, have to be internationally understandable and accessible.  Finally, if a developer really wants to affect the direction of development, participation in the online community is mandatory.  This includes frequent discussions and meetings, and contributing effective, well-formed and useful code on a regular basis.


To be fair, my experience in the client-driven software development realm allowed me some opportunities for exciting innovation.  I was involved in implementing an early WSGI framework on top of Zope (and Plone) to deliver highly targeted content in unique ways.  We brought a static site with literally millions of daily hits into a multi-server WordPress installation with a custom Ajax rating plugin, and saw it bought up by NBC Sports.  I also got involved with voice technology used in distribution centers and long term care, and got to travel quite a bit.


I appreciate everything I have learned over the last several years, working for both open source and for-profit efforts.  That said, I am very excited to be on board at Regenstrief Institute, and cannot wait to meet the people I work with daily in Kenya.  For now, I believe I am in the right place at the right time.