Bits & Bytes (June 17)

In movies, computer programmers often work at a screen full of binary digits. 01110101 01100010 10011011 01011101 01001111. For most people, it just looks like gibberish. For real-world programmers it’s gibberish, too. A blur of zeros and ones. We almost never work with raw bits. The smallest size that makes sense is a byte (8 bits), often treated as a text character. Most numbers are 2, 4 or 8 bytes in size.

Yep, raw binary isn’t very useful, but this week we fixed a small mystery problem that was all about the bits. It involved a “bit” of C++ history.

Our accounting software stores hundreds of checkbox values. A single bit is enough for a yes/no choice, but computers are not designed to address anything that small. One byte is the smallest unit they can read or write on its own.

These days, the standard way to store yes/no items is in a full byte. It wastes 7 bits, but who cares about that when there are billions of bytes in RAM, and trillions on the drive?

It wasn’t so easy in the early days of our accounting software. Those PCs only had a few megabytes of RAM, shared with the system and other apps. We had to be frugal. Hard drives only held 100 megs, so file size was also a concern. To save space, we often packed 8 true/false values into a byte.

Back then, the most common way to do that was with bitmasks. They use bitwise operators to work on individual bits. For example, a bitwise AND with 4 gets you the third bit. An AND with 128 gets you the 8th. Bitmasks are hard to read and easy to screw up: not much better than raw 01110101. They caused many bugs in the early days of Goldenseal estimating/accounting.
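To make the bitmask idea concrete, here is a minimal sketch of 8 yes/no values packed into one byte. The flag names are made up for illustration; they are not the actual Goldenseal fields.

```cpp
#include <cstdint>

// Hypothetical checkbox flags, one bit each (names are illustrative only).
constexpr std::uint8_t kTaxable  = 1;    // 1st bit: 00000001
constexpr std::uint8_t kPrinted  = 2;    // 2nd bit: 00000010
constexpr std::uint8_t kOnHold   = 4;    // 3rd bit: 00000100
constexpr std::uint8_t kArchived = 128;  // 8th bit: 10000000

// Test one bit: AND with the mask and check for a nonzero result.
constexpr bool isSet(std::uint8_t flags, std::uint8_t mask) {
    return (flags & mask) != 0;
}

// Turn one bit on: OR with the mask.
constexpr std::uint8_t setFlag(std::uint8_t flags, std::uint8_t mask) {
    return static_cast<std::uint8_t>(flags | mask);
}

// Turn one bit off: AND with the inverted mask.
constexpr std::uint8_t clearFlag(std::uint8_t flags, std::uint8_t mask) {
    return static_cast<std::uint8_t>(flags & ~mask);
}

// 0x75 is the 01110101 from the first paragraph: the 3rd bit is on, the 8th is off.
static_assert(isSet(0x75, kOnHold), "3rd bit should be set");
static_assert(!isSet(0x75, kArchived), "8th bit should be clear");
```

Nothing stops a caller from passing 3 or 200 as the mask, or from mixing up AND and OR. That is the kind of slip that makes bitmasks easy to get wrong.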

When updating to 64-bit, one of the first things we did was replace the bitmasks. C++ has a better way to handle them: bitfields. They also pack bits into a byte, but each value has a name. That makes the code more readable. Debugging is easier.
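For comparison, here is roughly what a bitfield looks like. This is only a sketch with invented field names, not our actual data structure. One caveat: the C++ standard leaves the exact placement of the bits up to each compiler.

```cpp
// Hypothetical checkbox flags as a bitfield (names are illustrative only).
// Each member takes 1 bit, and the compiler packs them together.
struct CheckboxFlags {
    bool taxable  : 1;
    bool printed  : 1;
    bool onHold   : 1;
    bool archived : 1;
};

// Build a set of flags with only "on hold" checked.
inline CheckboxFlags makeOnHoldFlags() {
    CheckboxFlags flags{};   // value-initialization clears every bit
    flags.onHold = true;     // reads like a normal member, no mask needed
    return flags;
}
```

Code that uses it reads like `if (flags.onHold)`, with no masks or AND/OR operators in sight.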

The obscure error happened because in the mid-90s, some bold TurtleSoft programmer made a bitmask on a 2-byte number. It used mask numbers all the way up to 32768. When we switched it to a bitfield 5 years ago, the byte order was wrong: one of those big/little-endian problems that I talked about a while back.
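Here is a small illustration of why byte order matters for a 2-byte value. It is a general sketch of the big/little-endian issue, not the actual code that had the bug. The same two bytes give a different number depending on which byte is treated as the high one.

```cpp
#include <cstdint>

// Combine two bytes, treating the first one as the high byte (big-endian).
constexpr std::uint16_t fromBigEndian(std::uint8_t first, std::uint8_t second) {
    return static_cast<std::uint16_t>((first << 8) | second);
}

// Combine the same two bytes, treating the first one as the low byte (little-endian).
constexpr std::uint16_t fromLittleEndian(std::uint8_t first, std::uint8_t second) {
    return static_cast<std::uint16_t>((second << 8) | first);
}

// The bytes 0x80 0x00 are the top mask (32768) in one order,
// but they turn into the 8th-bit mask (128) in the other.
static_assert(fromBigEndian(0x80, 0x00) == 32768, "high byte first");
static_assert(fromLittleEndian(0x80, 0x00) == 128, "low byte first");
```

So a flag stored with mask 32768 can come back looking like a completely different flag if the two bytes are read in the opposite order.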

The bug didn’t show up until we tested with Customer records in our TurtleSoft file. They have a few custom fields, which were disabled improperly because some bits were wrong. While fixing that, we also revised the way we store specs for all data fields. It’s now cleaner and more future-proof.

In case any serious programmers read this blog, I should clarify that it sometimes makes sense to work with raw bits. For example, if you want to multiply by 64, just shift bits to the left by 6 places. It can be faster than regular math, although modern compilers often make that swap on their own. There’s a whole realm of high-efficiency programming that squeezes out extra speed with tricks like that. Games use them a lot.
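The shift trick works because 64 is 2 to the 6th power, so moving every bit 6 places to the left multiplies by 64. A minimal sketch, with a made-up function name:

```cpp
#include <cstdint>

// Multiply by 64 with a left shift (64 == 2^6).
// Unsigned math is used so that overflow simply wraps around.
constexpr std::uint32_t timesSixtyFour(std::uint32_t value) {
    return value << 6;
}

static_assert(timesSixtyFour(5) == 5 * 64, "shift by 6 matches multiply by 64");
```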

TurtleSoft builds stodgy old accounting software, with no need to shave any microseconds. Our top priority is accurate, reliable code that is easy to maintain. Not everyone has that luxury.

Dennis Kolva
Programming Director
TurtleSoft.com

Author: Dennis Kolva

Programming Director for Turtle Creek Software. Design & planning of accounting and estimating software.