To handle data inside Goldenseal files, we licensed a library called NeoAccess back in the 90s. It was a popular object database. It also had serious bugs. AOL 3.0 used NeoAccess: deleting too many emails would trash everything. Netscape also used it, and the bugs helped kill them. Their internal emails about it were in search engines for a while. We read the sad saga while searching for fixes.
We considered using a different database, but there were no better alternatives. So we delayed Goldenseal for a year while we rewrote their code to be stable enough to actually use. Our staff found and fixed that deletion bug. Some of the other code was so cranky and confusing that we replaced it entirely.
One of the problem areas was NeoFreeList. It kept track of empty spaces in the database, but sometimes failed. The rewrite was DB_GapManager, which did the same thing more reliably.
A couple years later we added DB_FileManager: a list of all records, sorted by file position. It was a second way to locate records, so they often could be salvaged if their primary index was damaged. It also made extra sure that nothing would overlap, if the gap manager was wrong.
Databases are simple at heart: just a way to locate records inside a file. If you never change anything, there are a zillion ways to write one.
Active databases are more complicated. Deleting records creates empty spaces. It’s even worse when you make a record larger. It can’t stay in the same space, or it will overlap the next one. So you have to move it. It’s always safe to put new or moved records at the end, but then the file keeps growing. Eventually you get an enormous file that is mostly empty spaces.
Some apps fix the problem with a Compress File option. It goes through and moves everything closer to the front, eliminating the gaps. DB_GapManager is better. It keeps a list of empty spaces, so records can fill a gap instead of expanding things at the end.
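For the curious, here is a minimal sketch of the idea in Python. It is not the DB_GapManager source, just an illustration. The class name GapList, the first-fit search, and the byte offsets are all assumptions made for the example.

```python
class GapList:
    """Hypothetical free-space list: [offset, size] pairs kept sorted by offset."""

    def __init__(self, file_end=0):
        self.gaps = []            # known empty spaces inside the file
        self.file_end = file_end  # current end of the file, in bytes

    def allocate(self, size):
        """Return a file offset for a record of `size` bytes (first-fit)."""
        for i, (offset, gap_size) in enumerate(self.gaps):
            if gap_size >= size:
                if gap_size == size:
                    del self.gaps[i]  # gap is used up exactly
                else:
                    # Record takes the front of the gap; the rest stays free.
                    self.gaps[i] = [offset + size, gap_size - size]
                return offset
        # No gap is big enough, so grow the file at the end.
        offset = self.file_end
        self.file_end += size
        return offset
```

A record only lands at the end of the file when no tracked gap can hold it, so the file grows much more slowly than with append-only storage.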
A few years ago we moved DB_GapManager to TurtleSoft Pro, and adapted it to 64-bit. This year we added “sanity checking” code to make it more reliable. The checks sometimes give warnings now. None are serious. They seem to be caused by small gaps that stopped being tracked. Still, life is better when the code is perfect.
We also moved DB_FileManager to TurtleSoft Pro, but made it smarter. The old version was just one long list of all records in the whole file. It was huge, which sometimes caused memory problems. The new form uses a DB_SectorIndex that has a list of DB_SectorManagers. Each of those covers one chunk of disk space (16,000 records, about 2 megabytes). It’s a shallow tree structure that is more frugal with RAM. It also is more expandable: up to 125 million records and many gigabytes of file size. We even can expand beyond that by adding another layer.
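A rough sketch of that kind of shallow tree, in Python. Everything here is an assumption for illustration: the SectorIndex name, the tiny capacity of 4 (instead of about 16,000), and the choice to split a full sector in half. The real DB_SectorIndex may divide things up differently.

```python
import bisect

class SectorIndex:
    """Hypothetical two-level index: a list of sectors, each a short sorted
    list of (offset, size, record_id) tuples. Removal is left out."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.sectors = [[]]

    def _sector_for(self, offset):
        # Pick the last sector whose first record starts at or before `offset`.
        starts = [s[0][0] for s in self.sectors[1:]]
        return bisect.bisect_right(starts, offset)

    def add(self, offset, size, record_id):
        i = self._sector_for(offset)
        sector = self.sectors[i]
        bisect.insort(sector, (offset, size, record_id))
        if len(sector) > self.capacity:
            # Sector is too full: split it into two smaller ones.
            mid = len(sector) // 2
            self.sectors[i:i + 1] = [sector[:mid], sector[mid:]]

    def find(self, offset):
        """Return the record that starts at `offset`, or None."""
        sector = self.sectors[self._sector_for(offset)]
        j = bisect.bisect_left(sector, (offset,))
        if j < len(sector) and sector[j][0] == offset:
            return sector[j]
        return None
```

Each lookup touches the top-level list plus one small sector. With the numbers above, 125 million records at 16,000 per sector works out to roughly 7,800 sectors in the top-level list.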
DB_GapManager is complicated. Deleting a record next to a gap means expanding the gap instead of adding a new one. A deleted record between two gaps is even worse, since three have to be merged and one removed. It’s not easy to debug.
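The merging logic looks something like this sketch in Python. It is a simplified stand-in, not the actual DB_GapManager code. The function name free_space and the list of (offset, size) tuples are assumptions for the example.

```python
import bisect

def free_space(gaps, offset, size):
    """Add a freed byte range to `gaps`, a list of (offset, size) tuples
    sorted by offset, merging it with any gap that touches it."""
    i = bisect.bisect_left(gaps, (offset, size))
    # Gap just before: it ends exactly where the freed range starts.
    if i > 0 and gaps[i - 1][0] + gaps[i - 1][1] == offset:
        i -= 1
        offset, size = gaps[i][0], gaps[i][1] + size
        del gaps[i]
    # Gap just after: it starts exactly where the freed range ends.
    if i < len(gaps) and offset + size == gaps[i][0]:
        size += gaps[i][1]
        del gaps[i]
    gaps.insert(i, (offset, size))
    return gaps
```

When the deleted record sits between two gaps, both branches fire: two list entries disappear and one larger gap replaces them. For example, freeing bytes 10 to 20 in a list holding (0, 10) and (20, 10) leaves a single gap of (0, 30).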
To fix the Gap Manager errors, our staff added code in the sector managers to report exact gap sizes. They can look at the records before and after. Then we wondered if the gap manager was even needed. We bypassed it temporarily, and started to fetch gaps directly from the sector lists.
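Here is a small Python sketch of why that works: if records are already sorted by file position, the gaps fall out of a single pass. The function name gaps_between and the assumption that record data starts at offset 0 are made up for the example.

```python
def gaps_between(records, file_end):
    """Given (offset, size) records sorted by offset, return the empty
    spaces between them as (offset, size) tuples."""
    gaps = []
    cursor = 0  # end of the previous record (assumes data starts at 0)
    for offset, size in records:
        if offset < cursor:
            # Two records share disk space: the file is damaged.
            raise ValueError(f"record at {offset} overlaps the previous one")
        if offset > cursor:
            gaps.append((cursor, offset - cursor))
        cursor = offset + size
    if file_end > cursor:
        gaps.append((cursor, file_end - cursor))  # free space at the tail
    return gaps
```

There is no separate gap list to keep in sync, so a gap can never stop being tracked. The same pass also catches overlapping records.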
The result: much simpler code that works just as well. We are still tweaking the new version, and probably will tune it gradually over the next few years. It’s a balance between efficient disk-filling, speed and memory use. Testing that will require some big, mature files to play with.
Since the last post, we also started to rewrite layouts as text rather than raw binary data. There already are stream classes for just that purpose, so the work went quickly. The app now exports layouts as text to the database or resource files. Import will take a few more days. It’s another redesign that turned out to be a huge improvement.
Sometimes, programming feels like swimming through mud. And sometimes, it goes right. That’s when we remember why we do this.
Dennis Kolva
Programming Director
TurtleSoft.com