Goldenseal Pro Progress- Black Boxes (May 25)

Remodelers rarely build kitchen appliances from scratch. Store-bought models run better than anything you could ever cobble together on your own.

The same thing applies to software apps. It’s hard enough to write code that does construction estimating and accounting. There’s no sense reinventing wheels, so we use libraries written by other people for most of the basics. 

The good news is that there are plenty of software libraries out there. They’ll do just about anything. Most of them are free or cheap.

The bad news is that they are free or cheap. There is little incentive for their developers to polish things. It’s very rare to find a library that is fully reliable, well documented, and easy to use. Even if the source code is available, it’s often an ugly mess that someone whipped out on a caffeine high. Subtle bugs and security flaws probably lurk in there, but there’s no way to know for sure.

When you can’t view or fix the interior workings, it’s called a black box. Maybe it uses C++, or magnets, or squirrels.  There’s no way to know for sure. Whatever it is in there, about the best you can do is learn the quirks of the black box, and work around its flaws.

The NeoAccess database was definitely a black box. A dank basement with spiderwebs. For Goldenseal Pro, we considered switching to MySQL or some other database library. However, we soon discovered that most open-source software libraries are also black boxes, and just as ugly inside. Better the toxic danger zone you know, than the one you don’t.

Anyhow, last week our staff finally decided to completely rewrite the cache-management code. The first attempt, adapted from NeoAccess, was just too confusing. Chasing down its bugs was like whack-a-mole.

Basic design was not hard. Getting it to work on new files wasn’t too bad either. Then we went back and tested the conversion from older Goldenseal files to the new format. For a while it felt like this bathroom repair.  The conversion still uses NeoAccess to read the old file, and it was not happy with the changes.

It was time to put on the hazmat suits, climb in there, and update the NeoAccess cache system for a 64-bit world. We only need to run it one last time, but it’s an important one time. Our existing Goldenseal users will want to keep their old accounting data, just as much as we do.

Jiggling the cache system turned up several obscure bugs. It’s possible they weren’t hurting anything, but now that code actually does what it claims to do. After considerable effort, the conversion started working properly again. Earlier testing had produced a few mystery database errors, and most of them have now disappeared. Sometimes it’s easier to just rebuild stuff, rather than fix it. That is one problem with black boxes: sometimes the weird stuff you do to work around their flaws becomes its own problem, even after you replace the black box with something better.

We’re still finishing the File Manager code that kicks in when there are more than 32,000 records, but this latest spate of database work is nearing completion. Then we can get back to interface.

Most ‘black box’ libraries were someone’s one-time project, and then the author moved on to something else. Our software is exactly the opposite. We have been working on the same code base since 1995, and probably will still be working on it in 2035. So, the goal is to make Goldenseal Pro the opposite of a black box!

Dennis Kolva
Programming Director
TurtleSoft.com

Goldenseal Pro- File Manager (May 17)

There are many ways for a database to go bad. Losing records is a big problem, of course. Even worse is when one record writes over the top of another one. That corrupts at least one record. Sometimes the damage expands further.

A more subtle problem arises when records change, which happens often in accounting and estimating software. Usually records get bigger, which means they no longer fit into their old place. It’s easiest to move them to the end of the file, but then there’s a gap at their former location. Do that enough times, and you end up with a huge file that is mostly empty space.
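To make the empty-space problem concrete, here is a minimal sketch in Python. It is not Goldenseal code. The function name and the record sizes are made up for illustration, and it assumes the simplest possible policy: any record that grows is moved to the end of the file.

```python
def simulate_append_only(sizes, growth_events):
    """Return (file_size, live_bytes) after records grow and relocate.

    sizes: initial record sizes in bytes, stored back to back.
    growth_events: list of (record_index, new_size) changes.
    Assumed policy: a record that grows always moves to the end of the file.
    """
    sizes = list(sizes)
    offsets = []
    end = 0
    for size in sizes:
        offsets.append(end)
        end += size
    for index, new_size in growth_events:
        if new_size > sizes[index]:
            # The record no longer fits, so it moves to the end of the file.
            # Its old location becomes a gap that is never reused here.
            offsets[index] = end
            end += new_size
        sizes[index] = new_size
    return end, sum(sizes)


file_size, live_bytes = simulate_append_only(
    [100, 100, 100],
    [(0, 150), (1, 150), (0, 200)],
)
wasted = file_size - live_bytes
```

With just three changes, the file in this toy example holds 450 bytes of live data in 800 bytes of file. The rest is gaps.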

The NeoAccess database kept a ‘free list’ of empty spaces that was meant to prevent both those problems. New and changed records filled in the gaps, and kept the file compact. However, there were occasional mystery bugs, and we suspected the free list was at fault. If it ever forgot to remove a gap, then a record would be trashed when a second record was written to the same place. We tried to debug the free list code but it was too confusing.
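For readers who have not met a free list before, the general idea can be sketched as follows. This is a generic first-fit version written for illustration, not the NeoAccess implementation (which we never fully understood). The class and method names are hypothetical.

```python
class FreeList:
    """First-fit list of (offset, length) gaps in a file (illustrative only)."""

    def __init__(self):
        self.gaps = []  # kept sorted by offset

    def release(self, offset, length):
        """Note that a span of the file is now empty."""
        self.gaps.append((offset, length))
        self.gaps.sort()

    def allocate(self, length, end_of_file):
        """Return (offset, new_end_of_file) for a record of this length."""
        for i, (offset, gap_len) in enumerate(self.gaps):
            if gap_len >= length:
                # The gap MUST be removed here. If it stays on the list,
                # a later record can be written on top of this one.
                del self.gaps[i]
                if gap_len > length:
                    self.release(offset + length, gap_len - length)
                return offset, end_of_file
        # No gap is big enough, so the record goes at the end of the file.
        return end_of_file, end_of_file + length


free = FreeList()
free.release(200, 100)
first = free.allocate(60, end_of_file=1000)
second = free.allocate(60, end_of_file=1000)
```

The first record lands in the gap at offset 200. The second no longer fits in the 40-byte leftover, so it goes to the end of the file. Skip the `del` line and both records would land at offset 200.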

In 2002, we added the File Manager to help manage file contents. It kept its own list of gaps that duplicated the free list, and also stored the location of every record in the file.  That way there were two ways to make sure records were added safely. Goldenseal could find a gap, then double-check the neighborhood to make sure the space was really free.
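The 'double-check the neighborhood' step might look something like the sketch below. It assumes the record locations are available as a list of (offset, length) pairs sorted by offset. The function name and data layout are ours for illustration, not the actual File Manager format.

```python
import bisect


def span_is_free(records, offset, length):
    """Check that [offset, offset + length) overlaps no live record.

    records: list of (offset, length) pairs, sorted by offset (assumed layout).
    """
    i = bisect.bisect_left(records, (offset, 0))
    # The neighbor before must end at or before the start of the candidate.
    if i > 0:
        prev_offset, prev_len = records[i - 1]
        if prev_offset + prev_len > offset:
            return False
    # The neighbor after must start at or after the end of the candidate.
    if i < len(records) and records[i][0] < offset + length:
        return False
    return True


records = [(0, 100), (100, 50), (300, 100)]
fits_exactly = span_is_free(records, 150, 150)
one_byte_too_long = span_is_free(records, 150, 151)
inside_a_record = span_is_free(records, 120, 20)
```

Only the two nearest neighbors need checking, so the test stays cheap even in a large file.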

At first the File Manager was just a diagnostic tool, but it soon replaced the old free list. Mystery over-write bugs disappeared. The Manager added an extra 10 bytes per record, but increased reliability was well worth the cost.

Unfortunately, the File Manager had a subtle flaw. It was by far the biggest record in the database (in the TurtleSoft file, it’s 2.5 megabytes).  If users were low on RAM, it was the first thing that wouldn’t fit. Nearly always, that led to a crash or a freeze, but the database would still be fine. However, if there was exactly enough RAM to load the Manager, but not enough to save it, it would die midway through the file save, and corrupt the database. It didn’t happen often but it was awful when it did.

Goldenseal users haven’t reported this problem in the past few years, probably because computers now have so many gigabytes of RAM. However, it’s still a design flaw.

In Goldenseal Pro we fix it by using multiple, smaller File Managers. There’s one for each sector in the file, and each is big enough to track 32,000 records. Managers are accessed via a very broad, 2-level tree that is similar to the ones used to index records. It’s safer and a teeny bit faster than the old system, with less data to read from the drive.

The old File Manager kept growing larger as more records were added. Periodically, it had to be relocated in the file. That was when it was most likely to die, and kill the file with it.  In Goldenseal Pro, each File Manager is a fixed size. When it fills, we just add another one. That also makes it safer.
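The 'fixed size, add another when full' idea can be sketched roughly as below. Only the 32,000 figure comes from the actual design. The class names, the entry layout, and the flat list standing in for the top level of the tree are all assumptions made for illustration.

```python
RECORDS_PER_MANAGER = 32_000  # figure from the post; the rest is assumed


class SectorManager:
    """Fixed-capacity table of (record_id, offset, length) entries."""

    def __init__(self, capacity=RECORDS_PER_MANAGER):
        self.capacity = capacity
        self.entries = []

    def is_full(self):
        return len(self.entries) >= self.capacity


class ManagerDirectory:
    """Top level of a two-level layout: a list of fixed-size managers."""

    def __init__(self, capacity=RECORDS_PER_MANAGER):
        self.capacity = capacity
        self.managers = [SectorManager(capacity)]

    def add(self, record_id, offset, length):
        # A manager never grows or relocates. When the last one fills,
        # a fresh one is started instead.
        if self.managers[-1].is_full():
            self.managers.append(SectorManager(self.capacity))
        self.managers[-1].entries.append((record_id, offset, length))


directory = ManagerDirectory(capacity=3)  # tiny capacity, for demonstration
for record_id in range(7):
    directory.add(record_id, offset=record_id * 100, length=100)
manager_sizes = [len(m.entries) for m in directory.managers]
```

Seven records with a capacity of three end up in three managers. No single manager ever becomes a multi-megabyte record that has to be loaded and saved in one piece.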

Essentially, Goldenseal Pro stores the location of every record in two different places: one sorted by record ID, and one sorted by location in the file. There’s also a separate list of file gaps that helps keep the file compact. The system is more complicated than the previous File Manager, but not by much. The code is much more understandable, so we’ll be able to repair damage to either half of the record storage.
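Here is a rough sketch of why storing every location twice makes repairs possible. The dictionaries, function names, and sample data are invented for illustration. The real structures are trees stored on disk, not Python dicts.

```python
def build_indexes(records):
    """records: iterable of (record_id, offset). Returns (by_id, by_offset)."""
    records = list(records)
    by_id = {rid: off for rid, off in records}
    by_offset = {off: rid for rid, off in records}
    return by_id, by_offset


def find_mismatches(by_id, by_offset):
    """Return the record IDs whose two stored locations disagree."""
    return sorted(
        rid for rid, off in by_id.items() if by_offset.get(off) != rid
    )


def rebuild_by_id(by_offset):
    """Reconstruct the ID-sorted half from the location-sorted half."""
    return {rid: off for off, rid in by_offset.items()}


by_id, by_offset = build_indexes([(1, 0), (2, 100), (3, 250)])
by_id[2] = 999  # simulate damage to one half of the storage
damaged = find_mismatches(by_id, by_offset)
repaired = rebuild_by_id(by_offset)
```

The damaged entry shows up as a disagreement between the two halves, and the intact half is enough to rebuild the other. The same works in the opposite direction.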

Right now, the system can handle a billion records. If any users ever get close to that amount, we’ll add code to allow a 3-level tree. That will jump it into the trillions.
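The arithmetic behind those numbers is simple, if we assume that each level of the tree fans out to 32,000 entries. That fan-out is a guess based on the size of each File Manager. The actual figure may differ.

```python
FAN_OUT = 32_000  # assumed entries per level, borrowed from the manager size


def capacity(levels, fan_out=FAN_OUT):
    """Maximum records reachable through a tree with this many levels."""
    return fan_out ** levels


two_level = capacity(2)    # 1,024,000,000 (about a billion)
three_level = capacity(3)  # 32,768,000,000,000 (about 33 trillion)
```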

Dennis Kolva
Programming Director
TurtleSoft.com

Memory & Speed (May 10)

When our staff first started work on Goldenseal accounting software, computers had 1 to 4 megabytes of RAM. That was sparse. Programmers had to work hard to conserve memory. It’s probably why the NeoAccess database library used such a skinny tree to locate records on disk. It only consumed 9 or 10 bytes per record.

Back then, processors ran at 16 to 32 megahertz. Speed was another big concern. We originally chose the NeoAccess database because it was fast, thanks to record caching.  A cache means that each record sticks around in RAM for a while, after it is opened. Reading data in RAM is many times faster than the hard drive, so a cache saves time for anything used more than once. Of course, there is only so much room in RAM, so a cache must be managed carefully. Our programmers thought hard about when to write records to disk, and when to remove items from the cache.
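A record cache can be sketched in a few lines. The version below evicts the least recently used record when the cache is full. That policy is an assumption chosen for illustration, and it is not necessarily what NeoAccess or Goldenseal does. The class name and the `read_from_disk` callback are also hypothetical.

```python
from collections import OrderedDict


class RecordCache:
    """Keeps recently used records in RAM (least-recently-used eviction)."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk
        self.records = OrderedDict()  # oldest first, newest last
        self.disk_reads = 0

    def get(self, record_id):
        if record_id in self.records:
            # Cache hit: no disk access, just mark it as recently used.
            self.records.move_to_end(record_id)
            return self.records[record_id]
        # Cache miss: read from disk, then make room if needed.
        self.disk_reads += 1
        record = self.read_from_disk(record_id)
        self.records[record_id] = record
        if len(self.records) > self.capacity:
            self.records.popitem(last=False)  # drop the least recently used
        return record


cache = RecordCache(capacity=2, read_from_disk=lambda rid: f"record {rid}")
for rid in [1, 2, 1, 3, 1, 2]:
    cache.get(rid)
```

Six requests cost only four disk reads here, because record 1 stays in RAM. A real cache also has to write changed records back to disk before removing them, which this sketch leaves out.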

These days, computers have gigabytes of RAM (a thousand times more than in the 90s). Processors run at a few gigahertz (a hundred times faster). Programmers can spend less time worrying about memory and clock cycles. They can focus more on reliability and maintainability.

Our staff has been using the Sample Company File for testing since late 2016. It’s a small file. A few weeks ago we converted our own TurtleSoft file to the new format, in prep for using it daily. With tens of thousands of records instead of hundreds, it’s a chance to stress-test and fine-tune the database code.

NeoAccess had only one way to locate a record on disk: climb through the index tree, and find its ID and file mark (4 bytes each). It was fast, and saved memory. Unfortunately, if any bits were damaged along the way, the record was lost forever (or read from the wrong part of the file, producing garbage).  In 2002 we added a File Manager to store record addresses in a second place. It definitely reduced database errors. Unfortunately, it had its own flaws, and never quite lived up to its potential. More about that next week.

As mentioned a couple weeks ago, Goldenseal Pro is more ‘webby’, with extra links, and ways to recover damaged data. To make that happen, it now uses about 80 bytes for each record location, instead of 10. We may even add more. The extra links, padding and redundancy make Goldenseal Pro files about triple the size. However, they are still a tiny fraction of a modern terabyte drive. For a few extra cents of drive space, it’s well worth the increase in reliability.

For speed, Goldenseal Pro still uses a record cache. However, it needed a complete rewrite. The old NeoAccess code worked OK, but it was very hard to understand and debug. Their cache also relied on their indexing system, which we’ve replaced. Back in 2002 we added code to help with cache diagnostics. It has been running in parallel for 16 years, and now it takes over completely.

Managing the cache is tricky. Some records are used more than once, so the cache needs ‘reference counts’ to keep them open until everything is finished. Mess up the count, and the program will crash (if the record is removed prematurely) or leak memory (if never removed). Memory leaks don’t cause immediate problems, but eventually they use up all available RAM and crash. Even worse, that crash comes at some random later time, far from the code that caused the leak, which makes it hard to debug.
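Here is a minimal sketch of reference counting, to show both failure modes. It is illustrative only. The class, its method names, and the `loader` callback are assumptions, not the Goldenseal Pro cache.

```python
class RefCountedCache:
    """Keeps a record open until every user has released it."""

    def __init__(self):
        self.records = {}
        self.counts = {}

    def acquire(self, record_id, loader):
        """Open a record (loading it if needed) and count one more user."""
        if record_id not in self.records:
            self.records[record_id] = loader(record_id)
            self.counts[record_id] = 0
        self.counts[record_id] += 1
        return self.records[record_id]

    def release(self, record_id):
        """Count one less user, and remove the record when nobody is left."""
        if self.counts.get(record_id, 0) <= 0:
            # One release too many. In C or C++ this is the crash case.
            raise RuntimeError(f"record {record_id} released too many times")
        self.counts[record_id] -= 1
        if self.counts[record_id] == 0:
            del self.records[record_id]
            del self.counts[record_id]

    def leaked(self):
        """Records still open. At shutdown, anything listed here is a leak."""
        return sorted(self.counts)


def load(record_id):
    return {"id": record_id}


cache = RefCountedCache()
cache.acquire(7, load)
cache.acquire(7, load)      # a second user of the same record
cache.release(7)
still_open = 7 in cache.records
cache.release(7)
closed = 7 not in cache.records
cache.acquire(8, load)      # never released: this one leaks
leaks = cache.leaked()
```

Record 7 stays open after the first release because a second user still holds it. Record 8 is acquired and never released, so it sits in RAM indefinitely.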

Working on database code is probably the hardest possible programming. It’s complicated.  Fortunately, there also is plenty of smaller stuff to work on, when programmer brain cells are not cranked up to 11.

Dennis Kolva
Programming Director
TurtleSoft.com

Goldenseal Pro Progress- Back to Databasics (May 3)

Back in 1999, we started using real data to test beta versions of Goldenseal accounting software. There was an unpleasant surprise. Intermittently, the software gave mystery NeoAccess errors, then crashed with a corrupted database. The only recourse was to trash the file and start over. We reported the bugs to NeoAccess support, but by then they had stopped answering emails.

We considered switching to a different database library, but couldn’t find anything better. Out of desperation, our staff spent half a year rewriting the NeoAccess code to make it more reliable. We found and fixed at least two serious bugs. Most likely, the rewrites accidentally fixed a few more. After that, the code ran well enough for a version 1.0 release.

The database still had rare problems, so we added a bunch of diagnostic commands to better understand what was happening inside the file. With their help, we tracked down a few more subtle bugs, and squashed the last of them in 2004.

Now that we are testing Goldenseal Pro with our own real data, we’re seeing database bugs again. The software adds many thousands of records just fine, but sometimes it has failed after deleting or changing them. This time around, bugs are entirely expected, since it’s brand new code getting its first stress test. The schedule has 2 weeks allocated for database bug fixes, and so far that seems about right.

Fortunately, the new database code is much cleaner and more understandable. Bug fixes usually take hours instead of months.

Up until now, we have been able to step through the database code in the debugger to find and fix problems. However, as the amount of data increases, the bugs grow more obscure (and harder to duplicate). So, we just added some diagnostic commands again. Being able to see interior details helped catch the deletion bug. Right now Goldenseal Pro is working fine, although there are almost certainly more bugs lurking.

You might wonder why we don’t just write bug-free code to start with. And yeah, we agree, that would sure be nice. Having perfect code in the first run would save everyone a lot of time.

What prevents that is the same reason that construction projects have punch lists. Why first drafts have tyops and bad grammar. Why cars aren’t perfect, even after 130 years of engineering. Complex things are difficult, and creators can’t anticipate every possible problem. Frequently, errors need to occur at least once, before you even know they’re a problem. Not to mention, just plain old random omissions and mistakes.

Since we can’t write perfect code, we spend a lot of time testing and debugging. The software gets closer and closer to perfect.

Right now Goldenseal Pro is in that rough polishing stage. Polishing is a slow, grinding process that takes time before it produces something shiny.

Dennis Kolva
Programming Director
TurtleSoft.com