Database Reliability- Part 2 (Nov 28)

When writing new code for our estimating and accounting software, we have a rule of thumb. When it first works, it’s 1/3 done. When it works for everything the programmer can think of to break it, it’s 2/3 done.  After it survives testing by people actually using it, it (eventually) reaches 100%.

We built the new database for Goldenseal Pro more than a year ago, but it was bare bones, and really just the first 1/3 of the work. It was sufficient for initial interface testing, and to make sure the basic approach was valid. The past couple weeks we added the rest of what a database needs, so it’s now at the 2/3 mark. The final 1/3 will be polishing, tweaking and testing, done gradually over the next few months. The final test will be using it to handle our TurtleSoft business accounting for a while.

Our primary goal for the new database code is to make it extremely reliable. We suspect that there are still a few subtle bugs remaining in the NeoAccess database that runs Goldenseal 4.9x, and we want to avoid that for Goldenseal Pro.

One specific way to increase reliability is to simplify the basic structure of the database.

NeoAccess used something called a binary tree (or b-tree) to store the location of each record within the file. B-trees are optimized to access records via binary search, which is the most efficient way to find anything. It only takes log-2 time to find any record in a sorted list. That means a binary search in a thousand records only takes 10 steps, a million records only takes 20 steps, and a billion takes 30. Binary search is amazingly fast when you are indexing, say, all humans on Earth.

The problem is, b-trees are complicated. The branches need to be balanced and trimmed, and there are many ways to go wrong. If there are still bugs left in our modified version of NeoAccess, they are probably buried in the tree management code. It’s just too complex, so we never touched it.

For Goldenseal Pro we use simple lists, rather than b-trees. They are slower to search, but on our scale it’s only a matter of microseconds.  B-trees are gross overkill for small businesses with mere hundreds or thousands of records. Simple linear searches are easy to debug, and much less fragile. Even when a b-tree is appropriate, it’s far better to use a well-tested library, rather than “roll your own” code as in NeoAccess.

For large numbers of records, we still use a tree structure. But it is a very “broad” tree that rarely grows more than two levels deep. It’s more like a shrub or low-maintenance ground cover. Theoretically, the new database won’t be as efficient as a true binary tree. However, we much prefer increased simplicity and reliability, over a tiny performance gain.

It’s possible to do binary searches without a b-tree, and binary search is even built into the C++ standard library that we use. However, binary searches are harder to debug, so we will only add them in places where linear search is visibly slow. That’s part of the tweaking process.

************************

Another part of making code reliable is to write cautious code. That means we frequently “sanity check”, and give an error message if anything is amiss. It makes our development process much easier, since most bugs give a message that takes us right to the problem. If any bugs survive into the final version, users will get a warning, rather than a hidden problem that bites them later.

The main reason that Goldenseal 4.9x rarely crashes, is because it sanity checks first. There are 6,700 places in our code where we confirm that something really exists, and 1,600 places where we check for a reasonable value. There are more thousands of sanity checks built into the code logic, but we can’t count those easily.

Along with our usual cautious coding, Goldenseal Pro includes several layers of sanity checking to make sure the database stays healthy. I’ll talk more about that in a future post.

Dennis Kolva
Programming Director
TurtleSoft.com

 

Author: Dennis Kolva

Programming Director for Turtle Creek Software. Design & planning of accounting and estimating software.