CMoney (Jan 27)

This past week our staff rewrote CMoney, the class that stores money values for Goldenseal accounting software. Translating a text string into a CMoney used to take 100 lines of code. Now it’s 2 lines.
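For the curious, here is roughly what a two-line conversion can look like. This is a sketch, not the actual CMoney source. The function name is made up, and it assumes money is stored as a 64-bit count of millipennies (100,000 per dollar), as described further down. It also assumes the text uses a period for the decimal point.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>

// Hypothetical helper, not the real CMoney code. Parses text such as
// "19.99" and returns the amount in millipennies (100,000 per dollar).
// Passing through a double is only safe for modest amounts. std::stod
// throws std::invalid_argument if the text contains no number.
std::int64_t textToMillipennies(const std::string& text) {
    const double dollars = std::stod(text);
    assert(std::isfinite(dollars));  // reject "inf" and "nan"
    return std::llround(dollars * 100000.0);
}
```

The rounding step matters: 19.99 is not exact as a binary decimal, so std::llround snaps it to the nearest whole millipenny.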

When we started work on Goldenseal, personal computers were still shifting from 16-bit to 32-bit processors. Those (plus 8-bit characters) were the main data sizes, along with “floating point” numbers for decimals.

We first considered using floating point for money values, but floats have precision problems. Big numbers can get errors in the penny digits. They also were extremely slow back then.
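A quick way to see the penny problem, assuming a float is the usual 32-bit IEEE-754 type. The function name is made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Returns true if adding one cent to `dollars` changes the stored value.
// Assumes `float` is a 32-bit IEEE-754 number (24 significant bits).
bool pennySurvives(float dollars) {
    assert(std::isfinite(dollars));
    // volatile forces the sum to be rounded to float before comparing
    volatile float sum = dollars + 0.01f;
    return sum != dollars;
}
```

Under that assumption, pennySurvives(100.0f) is true, but pennySurvives(1000000.0f) is false. At a million dollars, a 32-bit float can no longer see a single penny.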

32-bit integers can handle up to 4 billion, so they seemed possible at first. However, with pennies and negatives the cap is only $20 million. Unit costs and tax rates need fractional pennies, which shrinks capacity even further. 32 bits is just too small for practical money storage.
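The arithmetic is easy to check. This sketch uses a made-up function name, and assumes a signed 32-bit integer that counts some fixed fraction of a dollar.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest whole-dollar amount a signed 32-bit integer can hold when it
// counts units of 1/unitsPerDollar of a dollar (100 means pennies).
std::int32_t maxDollars32(std::int32_t unitsPerDollar) {
    assert(unitsPerDollar > 0);
    return std::numeric_limits<std::int32_t>::max() / unitsPerDollar;
}
```

maxDollars32(100) gives 21,474,836, which is the $20 million cap for pennies. With hundredths of a penny, maxDollars32(10000) drops to 214,748.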

The next step up is 64-bit integers. They were exotic beasts at the time. Not part of the C++ standard. 64-bit required special libraries that faked it by using two 32-bit numbers. The math was hard to use, unreliable, and slow. Too flaky for accounting software.

Our staff finally built CMoney with two numbers: 32 bits for dollars, and 16 bits for pennies. A total of 48 bits, or 6 bytes in size. It gets up to $2 billion, and down to 1/100 of a cent. That was a reasonable size range for the US and Europe.
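Here is a rough sketch of what math looks like with a two-part value. The struct and field names are guesses for illustration, not the real CMoney layout. It assumes the 16-bit part counts hundredths of a cent (0 to 9999), and it only handles non-negative amounts.

```cpp
#include <cassert>
#include <cstdint>

// Assumed two-part layout: whole dollars, plus a fraction that counts
// units of 1/100 of a cent (so 10,000 units make one dollar).
struct OldMoney {
    std::int32_t dollars;
    std::uint16_t fraction;  // 0..9999
};

// Adds two non-negative amounts. The fractions are added separately,
// and any overflow past 9999 is carried into the dollars.
OldMoney addOld(OldMoney a, OldMoney b) {
    assert(a.dollars >= 0 && b.dollars >= 0);
    assert(a.fraction < 10000 && b.fraction < 10000);
    const std::uint32_t frac = std::uint32_t(a.fraction) + b.fraction;
    OldMoney result;
    result.dollars = a.dollars + b.dollars + std::int32_t(frac / 10000);
    result.fraction = std::uint16_t(frac % 10000);
    return result;
}
```

Negatives, subtraction, and multiplication each need their own carry or borrow logic, which is where the complexity piles up.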

However, a few Goldenseal buyers were in countries with very small currency values. CMoney was too small for big overseas projects in places with a currency unit worth a penny or less. The solution was to add an option to shift money decimals by 1, 2 or 3 places. It’s complicated code, but it met their needs.

On the opposite end, CMoney can handle 1/100 of a penny. It also stores percentages down to .0001%. Sadly, a few counties in Indiana have local income tax rates that need 5 decimal places (.00005%). Our payroll withholding can’t go there, so the software is wrong by a few cents per year.

Fast forward 30 years, and 64-bit numbers have become totally mainstream. Every OS added them years ago to get past the 4GB barriers for disk space and RAM. Now 64-bit is everywhere in the C++ standard, and fully supported.

For the past decade, we’ve considered switching CMoney to a single 64-bit value. This past week, our staff finally made the change. Math sure is easier without separate dollars and pennies!

The change gives CMoney more precision, and more capacity. It now goes down to 1/1,000 of a penny, or .00001%. One decimal smaller. Good enough for Indiana, or really small unit costs. Its maximum money value is $90 trillion. That’s 40x bigger than what Goldenseal can manage now, even with decimal shifting. If your country has inflation so bad that a house costs more than tens of trillions, you have bigger problems than just your accounting. If necessary, we can bring back decimal shifting, but it probably won’t ever be needed.
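The $90 trillion figure comes straight from the size of a signed 64-bit integer. This sketch uses made-up names, and assumes the value is a plain count of millipennies (100,000 per dollar).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumed unit: 1 millipenny = 1/1,000 of a penny = 1/100,000 of a dollar.
constexpr std::int64_t kMillipenniesPerDollar = 100000;

// Largest whole-dollar amount that fits in a signed 64-bit count.
std::int64_t maxWholeDollars() {
    return std::numeric_limits<std::int64_t>::max() / kMillipenniesPerDollar;
}

// Adding two amounts is plain integer addition, with no carry step.
std::int64_t addMoney(std::int64_t a, std::int64_t b) {
    // Guard against overflow at the extreme ends of the range.
    assert(b <= 0 || a <= std::numeric_limits<std::int64_t>::max() - b);
    assert(b >= 0 || a >= std::numeric_limits<std::int64_t>::min() - b);
    return a + b;
}
```

maxWholeDollars() returns 92,233,720,368,547, a bit over $92 trillion.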

In fact, we probably won’t use the whole $90 trillion. Better to leave some space at the top, so there is less risk of math overflows. We may even add an option in Prefs to set the cap much lower. Whatever limit you want. That will prevent huge surprises when your cat sits on the keyboard. It will only affect data entry, not data storage.
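A data-entry cap could be as simple as the check below. It is only a sketch of an option that does not exist yet. The function name is invented, and it assumes both numbers are in millipennies.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if a typed-in amount is within the user's chosen limit.
// Both values are in millipennies. The limit applies to the magnitude,
// so very large negative entries are rejected too.
bool withinEntryLimit(std::int64_t entered, std::int64_t limit) {
    assert(limit >= 0);
    return entered <= limit && entered >= -limit;
}
```

Stored values never pass through this check, so existing data stays readable whatever the limit is.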

Changing CMoney to 64 bits worth of millipennies was easy. It only took a few lines of code to switch to the new size. Rewriting all the math was a bit harder, but it was finished in a day. Most was obvious. CMoney source code is 1000 lines smaller now.

Unfortunately, this is accounting software. Almost every type of record includes some sort of money value. The jump from 6 bytes to 8 bytes changes the file size of almost everything. It took several days to revise code in 500+ files so they handle the increase. Everything now has an old size and a new one, since we still need to read existing Goldenseal files.

Even worse, many stored lists (arrays) include a money value. To handle the size change, we had to add an old-money version of each, plus some code to convert it to the new format. Those are finished, though we may have missed a few.
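The conversion itself is short. This sketch assumes the old format held whole dollars plus a fraction in hundredths of a cent, and that the fraction always adds in the positive direction. Those details, and all the names, are guesses for illustration rather than the real file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sizes of one money value on disk, before and after the change.
constexpr std::size_t kOldMoneyBytes = 6;
constexpr std::size_t kNewMoneyBytes = 8;

// Assumed old layout: 4 bytes of dollars + 2 bytes of fraction,
// where the fraction counts units of 1/100 of a cent (0..9999).
struct OldMoney {
    std::int32_t dollars;
    std::uint16_t fraction;
};

// Converts an old two-part value to a 64-bit count of millipennies.
// 1 dollar = 100,000 millipennies, and 1/100 of a cent = 10 millipennies.
std::int64_t toMillipennies(OldMoney old) {
    assert(old.fraction < 10000);
    return std::int64_t(old.dollars) * 100000 + std::int64_t(old.fraction) * 10;
}
```

For example, an old value of 12 dollars and 3456 fraction units ($12.3456) becomes 1,234,560 millipennies. The hard part is not this function. It is finding every record and array that has to call it.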

Right now we are testing the changes by converting our Goldenseal test file to the new format. It has some of every class, and any incorrect record lengths give error messages. Our staff has probably spent hundreds or thousands of hours over the years, diagnosing small file-length problems. We will be doing more of it for a while. Most likely a few more days.

TurtleSoft code can change like this because everything inside the database has a version number. Old data and new data live together peacefully inside a company file. More about that in the next post.

Dennis Kolva
Programming Director
TurtleSoft.com

Strings #2 (Jan 20)

The CTextString update is almost complete. Everything runs properly with the new text strings (so far). Some of the changed code is still untested, but that will shake out along with our regular app testing. We added breakpoints to possible problem areas, so the app will jump into the debugger automatically the first time we use them.

Besides making the class more reliable, the update was a chance to remove obsolete code. No sense in rewriting stuff that is no longer needed.

One thing that went is Pascal strings. They were the standard way to handle text in 68K Macs. 32-bit Mac OS X still used them for a few things. Pascal is totally gone from 64-bit OS versions. Now it’s totally gone from our code too. A bit less clutter.
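For anyone who never met one: a Pascal string puts its length in the first byte, then the characters, with no zero at the end. Here is a small sketch of converting one to a std::string. The function name is made up, and this is not code from our app.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A classic Pascal string: byte 0 is the length (0..255), followed by
// that many characters, with no terminating zero.
std::string fromPascal(const unsigned char* pstr) {
    assert(pstr != nullptr);
    const std::size_t length = pstr[0];
    return std::string(reinterpret_cast<const char*>(pstr + 1), length);
}
```

The one-byte length is why Pascal strings top out at 255 characters.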

We also decided to toss the complex code that converts text to money. Instead, we will rewrite CMoney to make its data storage simpler, better and faster. It’s a change that has been on our to-do list for years, and this is the perfect time to do it. More about CMoney in the next post.

We also cleared out some Microsoft-specific code that handled their many text formats. It’s left over from when we tried using MFC to build a Windows version (and ultimately failed). Their string classes drove us nuts. There was BSTR, CString, CStringA, CStringT, CStringW, LPSTR, LPCSTR, LPTSTR, LPCTSTR, LPWSTR and LPCWSTR. I probably missed a few. Use the wrong one or convert improperly, and the app crashed.

Clearing out that obsolete string cruft was a reminder that TurtleSoft dodged a bullet by procrastinating on the text upgrade.

Since 1844, people have used many formats to move text over wires. The first digital one was a 5-bit system that replaced Morse code in the late 1800s. It had 32 possible values: enough for an all-cap alphabet, spaces, and a few control characters. You may have seen that style of telegram in old movies: HELP STOP NO LOWERCASE LETTERS OR PUNCTUATION EXCLAMATION MARK

ASCII was developed in the 1960s for teletypes and computers. It uses 7 bits, enough for 128 characters. That’s enough for upper and lower case alphabets, numerals, punctuation similar to a typewriter, plus control characters (tab, carriage return, line feed, etc). For a few decades, ASCII was the primary way to display and store text. It’s still very common.

Apple extended ASCII to 8 bits in the Macintosh. The extra 128 characters included vowels with accents, tildes, upper/lowercase Greek letters, smart quotes, and a fancier set of punctuation marks. Option-shift-K is the Apple logo, #240 in Apple extended ASCII.

Even 8-bit ASCII was not big enough to include Russian, Arabic, Sanskrit, Thai and other alphabets. As they grew more international, Microsoft and Apple switched to “wide” 16-bit characters for a while. Those can handle about 65,000 different characters. Good enough to cover almost all languages and alphabets.

Switching between standard and wide ASCII text was hard. Wide strings are the reason for half those weird text types listed above. Having two different character sizes is why crashes happened. It also made ordinary text twice as big. Fortunately, TurtleSoft won’t need to deal with wide characters. They were a dead end. We jumped right over them.

Problem was, Chinese and a few neighbors use a different symbol for every word. Already that’s 80,000+ symbols. Ethnographers wanted support for niche alphabets. Historians wanted support for dead languages. Texters wanted emojis. Other folks wanted their own special character sets. 16 bits was not enough room for everyone.

That’s why Unicode happened. It’s expandable to as many bits and bytes as it needs. Unicode supports every possible language. It also includes geometric shapes, dingbats, musical notes, math symbols, emojis, and more. If someone discovers hieroglyphs on Mars, Unicode will write them.

Qt text fields can display full Unicode. They accept its data in a format called UTF-8, which is built from 8-bit bytes, the same size as ASCII characters. Also the same size as text inside a CTextString. Plain ASCII text looks identical in UTF-8, and other characters take two to four bytes each. Our code doesn’t need to know anything about Unicode. We just store a string of data, and Qt will convert it into plain text, 故 or 🍇 or whatever.
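To show why an ordinary byte string is enough, here is a sketch that counts characters in UTF-8 text held in a std::string. The function name is made up, and it assumes the bytes are valid UTF-8. Handing the bytes to Qt (for example with QString::fromUtf8) is left out so the sketch stands alone.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 byte string by skipping
// continuation bytes (those with the bit pattern 10xxxxxx).
// Assumes the input is valid UTF-8.
std::size_t countCodePoints(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

The text "a故🍇" is 8 bytes long (1 + 3 + 4), but only 3 characters. The string class just stores the bytes and never needs to know the difference.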

The first release of TurtleSoft Pro probably won’t display Unicode. There are plenty of other things to worry about first. However, we can easily slip it in later.

Dennis Kolva
Programming Director
TurtleSoft.com

Strings (Jan 13)

Last week, our staff finished a grueling 10-day stretch of work on payroll tax tables. In a good year it only takes 3 days.

Federal tables changed for all 50 states + DC. For decades the US got by with just 2 step tables, but now there are 8 of them. Last year we added one set of new tables, but we missed some and added them this year. A record number of states also changed their withholding formulas, including a few major rewrites.

We returned to work on the Layouts window, but soon ran into distractions.

Some menu commands in the current Goldenseal don’t yet have a place in TurtleSoft Pro. We needed one to help with testing. Adding a button to the top bar made the most sense, but it’s getting crowded up there. So we tried moving things around to give it more space. Qt is not very good at that kind of setup, but we finally produced something that is a bit better.

Then there were mystery crashes in CTextString: the class we use to store text. It’s not the first time it has done that. In the past we rewrote offending code to avoid the errors, but a better cure is to rewrite CTextString.

When we started Goldenseal, there was no standard C++ library. Everyone wrote their own string class to handle text. We adapted ours from the one in the PowerPlant framework. It is very 1990s programming. More like C than C++. Text is stored in chunks of raw memory, with raw pointers to access it. That’s just how things were done back in those days.

Raw C is a bit like using a circular saw with all the safety guards removed. Just a sharp spinning blade on a motor. You can do all the same stuff with it, and it works much better for some tasks. But, yeah, in the long run you’ll keep more fingers if you use the safer version.

The modern C++ standard library has std::string. TurtleSoft Pro already uses it for a few things. It’s safer, more versatile, and more reliable than CTextString. Not surprising, because thousands of programmers work on the C++ libraries, while TurtleSoft has only had a few. Compared to the C++ gurus, we also are amateurs. Our strength is construction accounting and estimating knowledge, not the dark arts of C++ memory calls.

So, rather than debug the mystery errors, we updated CTextString to use a std::string instead of raw C memory. We can’t swap over entirely, because CTextString does many things that are specific to our app. So it owns a std::string that does the heavy lifting, then keeps on doing what it used to.
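In outline, the idea looks something like this. It is a bare-bones sketch with invented class and method names, not the real CTextString interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch: the class keeps its own app-specific interface,
// but the bytes live in a std::string instead of a hand-managed block.
class TextStringSketch {
public:
    TextStringSketch() = default;
    explicit TextStringSketch(const std::string& text) : mText(text) {}

    std::size_t Length() const { return mText.size(); }

    void Append(const TextStringSketch& other) { mText += other.mText; }

    // A checked index replaces raw pointer arithmetic.
    char CharAt(std::size_t index) const {
        assert(index < mText.size());
        return mText[index];
    }

    const std::string& AsStdString() const { return mText; }

private:
    std::string mText;  // allocates, grows, and frees memory on its own
};
```

Callers keep using the same methods as before. Only the storage underneath changes, which is why the rest of the app mostly does not notice.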

This is one of those big changes that risks breaking some things, but ends up better (probably) in the long run. Almost like, say, jacking an old house and putting in a new foundation. Text strings are that basic.

With the change, there are places where we can remove big chunks of complicated code. For example, Goldenseal converts text to a decimal number via a loop through each character. It ignores non-digits, deals with negatives, and checks for decimal points (period or comma depending on location). There’s special code for extra decimal points. The C++ standard library has a function to convert text to decimals, so the new version is just one line.
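One standard library function that does this job is std::stod. Whether it matches what our old loop did in every corner case is a separate question, so treat this as a sketch with a made-up wrapper name.

```cpp
#include <cassert>
#include <string>

// One-line conversion using the standard library. std::stod skips
// leading whitespace, handles a sign, and stops at the first character
// it cannot parse. It throws std::invalid_argument if no number is
// found, and reads the decimal point according to the current C locale.
double textToDecimal(const std::string& text) {
    assert(!text.empty());
    return std::stod(text);
}
```

One difference from the old loop: std::stod stops at the first stray character instead of skipping over it. In the default locale it also expects a period, so comma decimals need the locale set first.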

Our staff accomplished some major successes last month, but this change is deeper and riskier. Right now it compiles OK and doesn’t crash. We still need to test every command and make sure the output hasn’t changed. Wish us luck!

Dennis Kolva
Programming Director
TurtleSoft.com