Memory & Speed (May 10)

When our staff first started work on Goldenseal accounting software, computers had 1 to 4 megabytes of RAM. Memory was scarce, and programmers had to work hard to conserve it. That’s probably why the NeoAccess database library used such a skinny tree to locate records on disk. It only consumed 9 or 10 bytes per record.

Back then, processors ran at 16 to 32 megahertz. Speed was another big concern. We originally chose the NeoAccess database because it was fast, thanks to record caching. A cache means that each record sticks around in RAM for a while after it is opened. Reading data from RAM is many times faster than reading it from the hard drive, so a cache saves time for anything used more than once. Of course, there is only so much room in RAM, so a cache must be managed carefully. Our programmers thought hard about when to write records to disk, and when to remove items from the cache.
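As a rough illustration of the idea, here is a minimal Python sketch of a record cache. It is not the NeoAccess or Goldenseal code. The names (RecordCache, fake_disk_read) are made up for the example, and it assumes a simple least-recently-used rule for deciding what to remove. It also ignores the question of when to write changed records back to disk.

```python
from collections import OrderedDict


class RecordCache:
    """Tiny read-through cache with least-recently-used eviction (illustrative only)."""

    def __init__(self, load_record, capacity):
        self.load_record = load_record  # hypothetical function: record ID -> data from disk
        self.capacity = capacity        # maximum number of records kept in RAM
        self.records = OrderedDict()    # record ID -> data, least recently used first
        self.disk_reads = 0             # how many times we had to go to disk

    def get(self, record_id):
        if record_id in self.records:
            self.records.move_to_end(record_id)  # mark as recently used
            return self.records[record_id]
        self.disk_reads += 1
        data = self.load_record(record_id)
        self.records[record_id] = data
        if len(self.records) > self.capacity:
            self.records.popitem(last=False)     # evict the least recently used record
        return data


def fake_disk_read(record_id):
    # Stand-in for a slow read from the hard drive.
    return f"record {record_id}"


cache = RecordCache(fake_disk_read, capacity=2)
for record_id in [1, 2, 1, 3, 1]:
    cache.get(record_id)
print(cache.disk_reads)  # 3
```

Five requests cost only three disk reads, because record 1 was still in RAM the second and third time it was needed.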

These days, computers have gigabytes of RAM (a thousand times more than in the 90s). Processors run at a few gigahertz (about a hundred times faster). Programmers can spend less time worrying about memory and clock cycles. They can focus more on reliability and maintainability.

Our staff has been using the Sample Company File for testing since late 2016. It’s a small file. A few weeks ago we converted our own TurtleSoft file to the new format, in prep for using it daily. With tens of thousands of records instead of hundreds, it’s a chance to stress-test and fine-tune the database code.

NeoAccess had only one way to locate a record on disk: climb through the index tree, and find its ID and file mark (4 bytes each). It was fast, and saved memory. Unfortunately, if any bits were damaged along the way, the record was lost forever (or read from the wrong part of the file, producing garbage). In 2002 we added a File Manager to store record addresses in a second place. It definitely reduced database errors. However, it had its own flaws, and never quite lived up to its potential. More about that next week.
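To show why a second copy of each address helps, here is a small Python sketch. It is only a guess at the general idea, not how NeoAccess or our File Manager actually work. The function and variable names are hypothetical, plain dictionaries stand in for the index tree and the file, and it assumes each record on disk carries its own ID so a bad address can be detected.

```python
def find_record(record_id, index, backup_addresses, file_records):
    """Return record data, trying the main index first, then the backup table."""
    for source in (index, backup_addresses):
        file_mark = source.get(record_id)
        record = file_records.get(file_mark)
        # Trust an address only if the record stored there carries the ID we asked for.
        if record is not None and record["id"] == record_id:
            return record["data"]
    return None  # both copies of the address are missing or damaged


# Pretend file: file mark (byte offset) -> record stored at that position.
file_records = {
    100: {"id": 7, "data": "invoice"},
    260: {"id": 8, "data": "payment"},
}
index = {7: 100, 8: 999}             # the address for record 8 is damaged
backup_addresses = {7: 100, 8: 260}  # second copy of every address

print(find_record(8, index, backup_addresses, file_records))  # payment
print(find_record(8, index, {}, file_records))                # None
```

With only the damaged index, record 8 is unreachable. With the backup table, it is found again.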

As mentioned a couple of weeks ago, Goldenseal Pro is more ‘webby’, with extra links, and ways to recover damaged data. To make that happen, it now uses about 80 bytes for each record location, instead of 10. We may even add more. The extra links, padding and redundancy make Goldenseal Pro files about triple the size of the old ones. However, they are still a tiny fraction of a modern terabyte drive. For a few extra cents of drive space, it’s well worth the increase in reliability.
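Here is the arithmetic for the record locations alone, as a quick Python sketch. The 50,000 record count is an assumed round number standing in for "tens of thousands", and the byte figures are the approximate ones given above. It does not cover the rest of the file.

```python
RECORDS = 50_000           # assumed count, standing in for "tens of thousands"
OLD_BYTES_PER_RECORD = 10  # approximate NeoAccess figure
NEW_BYTES_PER_RECORD = 80  # approximate Goldenseal Pro figure
TERABYTE = 10**12          # bytes in a (decimal) terabyte

old_total = RECORDS * OLD_BYTES_PER_RECORD
new_total = RECORDS * NEW_BYTES_PER_RECORD

print(old_total)  # 500000
print(new_total)  # 4000000
print(f"{new_total / TERABYTE:.4%} of a terabyte drive")
```

Under those assumptions, the location data grows from about half a megabyte to about 4 megabytes, which is still a few millionths of a terabyte drive.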

For speed, Goldenseal Pro still uses a record cache. However, it needed a complete rewrite. The old NeoAccess code worked OK, but it was very hard to understand and debug. Their cache also relied on their indexing system, which we’ve replaced. Back in 2002 we added code to help with cache diagnostics. It has been running in parallel for 16 years, and now it takes over completely.

Managing the cache is tricky. Some records are used more than once, so the cache needs ‘reference counts’ to keep them open until everything is finished. Mess up the count, and the program will crash (if the record is removed prematurely) or leak memory (if never removed). Memory leaks don’t cause immediate problems, but eventually they use up all available RAM and the program crashes. Even worse, the crash comes at some random later time, which makes leaks hard to debug.
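A minimal Python sketch of reference counting follows. It is an illustration only, not the Goldenseal Pro cache. The class and method names are hypothetical, and for simplicity it drops a record the moment its count reaches zero, where a real cache would probably keep it around for a while.

```python
class CountedCache:
    """Keeps a record in memory while anything still holds a reference to it."""

    def __init__(self):
        self.records = {}     # record ID -> data
        self.ref_counts = {}  # record ID -> number of current users

    def acquire(self, record_id, load_record):
        if record_id not in self.records:
            self.records[record_id] = load_record(record_id)
            self.ref_counts[record_id] = 0
        self.ref_counts[record_id] += 1
        return self.records[record_id]

    def release(self, record_id):
        if self.ref_counts.get(record_id, 0) == 0:
            # Releasing more often than acquiring is the "removed prematurely" bug.
            raise RuntimeError(f"record {record_id} released too many times")
        self.ref_counts[record_id] -= 1
        if self.ref_counts[record_id] == 0:
            # Nobody is using the record any more, so it can leave memory.
            del self.records[record_id]
            del self.ref_counts[record_id]


def load(record_id):
    # Stand-in for reading the record from disk.
    return f"record {record_id}"


counted = CountedCache()
counted.acquire(42, load)     # first user
counted.acquire(42, load)     # second user of the same record
counted.release(42)
print(42 in counted.records)  # True
counted.release(42)
print(42 in counted.records)  # False
```

If one of the two users forgets to call release, the record stays in memory forever. That is the leak. If someone calls release one time too many, the sketch raises an error, where real code might instead crash on a record that is already gone.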

Working on database code is probably the hardest possible programming. It’s complicated. Fortunately, there is also plenty of smaller stuff to work on, when programmer brain cells are not cranked up to 11.

Dennis Kolva
Programming Director

