Database Reliability- Part 3 (Dec 5)

The most important task for any database is to remember where each record is, within its file. To accomplish that, our accounting and estimating software uses a binary tree (b-tree) to store record locations. It has a different tree for each class of records. Every tree has a root (starting point) and nodes (branches). Nodes at the end are “leaf nodes”, which store the record locations.

A true binary tree has 2 branches at each node, which makes a tall tree (11 levels for 1,000 records, or 21 levels for a million). The NeoAccess database we’ve used for Goldenseal versions 1.0 to 4.9x came with a default of 8 branches per node. We upped it to 32. For unknown reasons NeoAccess also had two levels at each node, so it took 8 levels for 1,000 records, or 12 levels for a million. Still a rather complex tree.

Back in the 1980s and 90s, RAM and hard drives were cramped, expensive, and relatively slow. Because of that, it was best to keep nodes small. To save space, NeoAccess also didn’t create nodes until there were records to fill them. As a result, it scattered nodes randomly throughout the file (a bit more than 1 node for every 32 records). To save space, NeoAccess also stored very little data in each node. They only contained 4 or 8 bytes for each branch: just a file position, plus a record ID in leaf nodes.

We started to use Goldenseal to run TurtleSoft in 2000, just before releasing version 1.0. Unfortunately, a few NeoAccess nodes in our company file are corrupted, and have been for more than 10 years. When we try to look at the affected records, the node points to a very wrong file location. The OS is not happy with that, so it crashes.

We’ve spent dozens of hours stepping through code and looking at our raw file data, trying to find the errors. The problem is, our company file contains over 9,000 NeoAccess nodes, and they all look identical. If we could find the lost or damaged nodes in a raw file editor, we might be able to fix them. Unfortunately, it’s the proverbial needle in haystack.

Over the years, we also have looked at 30 or 40 corrupted Goldenseal files sent in by users.  When we scanned them with a raw file editor, about half contained data from other programs entirely. Our code has no access to anything beyond our file, so we didn’t put it there. Those files probably were damaged when the OS or hard drive firmware got confused, and wrote someone else’s data on top of ours. A few bad files were caused by TurtleSoft bugs, but we haven’t seen any of those since 2006. The rest were probably caused by small NeoAccess errors: possibly due to unknown bugs in their code, possibly from “bit rot” screwing up file addresses. Change one bit in the middle of a tree, and the whole thing stops working.

For Goldenseal Pro, our goal is to make the database less fragile, and more repairable. Your hardware (and the OS) will never be 100% perfect, but we can at least make it possible to diagnose and repair any database errors.

As a first step, Goldenseal Pro creates all its basic tree structures right at the beginning. Doing so has a cost. New, empty Goldenseal Pro files start out at 6+ megabytes (compared to 106K for original Goldenseal). A file that big would have been absurdly huge in the 1980s or 90s, but now it’s just a few millionths of a hard drive. The new setup puts the most important database structures in the same places in every file, and they won’t ever move. A uniform file structure will be much easier to diagnose and repair.

Goldenseal Pro also uses bigger nodes: with 128 to 4,096 branches, depending on how many items you are likely to have there. Almost all trees will only be one level deep. If they go two levels deep, they’ll hold a few million records before needing a third. The result is less clutter, and easier debugging.

The new nodes include text for their name and contents, so they are easily visible in the raw file. They also include a short text tag for each branch, and ‘safety tags’ at the beginning and end. All these ‘file navigation markers’ help us now during development and debugging. They also will help anyone looking at damaged files, in the future.

For extra security, all the important database structures are duplicated somewhere else in the file. At least in theory, we can rebuild anything that suffers damage.  Most likely we’ll wait for actual damaged files before writing the repair tools, since it’s hard to predict in advance, how data may go wrong.

All of this extra security adds more than 100 bytes per record, so Goldenseal Pro files will be significantly larger. At $60 to $300 per terabyte, we figure you probably won’t mind spending a penny or two for a few extra megabytes, to get more reliable data.

You still should have a good backup system, using multiple locations and methods. There are plenty of ways for data to die, and we can’t prevent all of them. However, we can at least make our files stronger and more survivable. As a bonus, it makes our programming and debugging easier.

Dennis Kolva
Programming Director

Author: Dennis Kolva

Programming Director for Turtle Creek Software. Design & planning of accounting and estimating software.