# Archive for May 26th, 2017

### Arduino vs. Significant Figures: Useful 64-bit Fixed Point

Posted by Ed in Amateur Radio, Software on 2017-05-26

Devoting eight bytes to every fixed point number may be excessive, but having nine significant figures apiece for the integer and fraction parts pushes the frequency calculations well beyond the limits of the DDS hardware, without involving any floating point library routines. This chunk of code performs a few more calculations using the format laid out earlier and explores a few idioms that may come in handy later.

Rounding the numbers to a specific number of decimal places gets rid of the repeating-digit problem that turns 0.10 into 0.099999:

uint64_t RoundFixedPt(union ll_u TheNumber,unsigned Decimals) { union ll_u Rnd; Rnd.fx_64 = (One.fx_64 / 2) / (pow(10LL,Decimals)); TheNumber.fx_64 = TheNumber.fx_64 + Rnd.fx_64; return TheNumber.fx_64; }

That pretty well trashes the digits beyond the rounded place, so you shouldn’t display any more of them:

void PrintFixedPtRounded(char *pBuffer,union ll_u FixedPt,unsigned Decimals) { char *pDecPt; FixedPt.fx_64 = RoundFixedPt(FixedPt,Decimals); PrintIntegerLL(pBuffer,FixedPt); // do the integer part pBuffer += strlen(pBuffer); // aim pointer beyond integer pDecPt = pBuffer; // save the point location *pBuffer++ = '.'; // drop in the decimal point, tick pointer PrintFractionLL(pBuffer,FixedPt); if (Decimals == 0) *pDecPt = 0; // 0 places means discard the decimal point else *(pDecPt + Decimals + 1) = 0; // truncate string to leave . and Decimals chars }

Which definitely makes the numbers look prettier:

Tenth.fx_64 = One.fx_64 / 10; // Likewise, 0.1 PrintFixedPt(Buffer,Tenth); printf("\n0.1: %s\n",Buffer); PrintFixedPtRounded(Buffer,Tenth,9); // show rounded value printf("0.1 to 9 dec: %s\n",Buffer); TestFreq.fx_64 = RoundFixedPt(Tenth,3); // show full string after rounding PrintFixedPt(Buffer,TestFreq); printf("0.1 to 3 dec: %s (full string)\n",Buffer); PrintFixedPtRounded(Buffer,Tenth,3); // show truncated string with rounded value printf("0.1 to 3 dec: %s (truncated string)\n",Buffer); 0.1: 0.099999999 0.1 to 9 dec: 0.100000000 0.1 to 3 dec: 0.100499999 (full string) 0.1 to 3 dec: 0.100 (truncated string) CtPerHz.fx_64 = -1; // Set up 2^32 - 1, which is close enough CtPerHz.fx_64 /= 125 * MEGA; // divide by nominal oscillator PrintFixedPt(Buffer,CtPerHz); printf("\nCt/Hz = %s\n",Buffer); printf("Rounding: \n"); for (int d = 9; d >= 0; d--) { PrintFixedPtRounded(Buffer,CtPerHz,d); printf(" %d: %s\n",d,Buffer); } Ct/Hz = 34.359738367 Rounding: 9: 34.359738368 8: 34.35973837 7: 34.3597384 6: 34.359738 5: 34.35974 4: 34.3597 3: 34.360 2: 34.36 1: 34.4 0: 34

Multiplying two scaled 64-bit fixed-point numbers should produce a 128-bit result. For all the values we (well, I) care about, the product will fit into a 64-bit result, because the integer parts will always multiply out to less than 2^{32} and we don’t care about more than 32 bits of fraction. This function multiplies two fixed point numbers of the form `a.b × c.d`

by adding up the partial products thusly: `ac + bd + ad + bc`

. The product of the integers `ac`

won’t overflow 32 bits, the cross products `ad`

and `bc`

will always be slightly less than their integer factors, and the fractional product `bd`

will always be less than 1.0.

*Soooo*, just multiply ’em out as 64-bit integers, shift the products around to align the appropriate parts, and add up the pieces:

uint64_t MultiplyFixedPt(union ll_u Mcand, union ll_u Mplier) { union ll_u Result; Result.fx_64 = ((uint64_t)Mcand.fx_32.high * (uint64_t)Mplier.fx_32.high) << 32; // integer parts (clear fract) Result.fx_64 += ((uint64_t)Mcand.fx_32.low * (uint64_t)Mplier.fx_32.low) >> 32; // fraction parts (always < 1) Result.fx_64 += (uint64_t)Mcand.fx_32.high * (uint64_t)Mplier.fx_32.low; // cross products Result.fx_64 += (uint64_t)Mcand.fx_32.low * (uint64_t)Mplier.fx_32.high; return Result.fx_64; }

This may be a useful way to set magic numbers with a few decimal places, although it does require keeping the decimal point in mind:

TestFreq.fx_64 = (599999LL * One.fx_64) / 10; // set 59999.9 kHz differently PrintFixedPt(Buffer,TestFreq); printf("\nTest frequency: %s\n",Buffer); PrintFixedPtRounded(Buffer,TestFreq,1); printf(" round: %s\n",Buffer); Test frequency: 59999.899999999 round: 59999.9

Contrary to what I thought, computing the `CtPerHz`

coefficient doesn’t require pre-dividing both 2^{32} and the oscillator by 2, thus preventing the former from overflowing a 32 bit integer. All you do is knock the numerator down by one little itty bitty count you’ll never notice:

CtPerHz.fx_64 = -1; // Set up 2^32 - 1, which is close enough CtPerHz.fx_64 /= 125 * MEGA; // divide by nominal oscillator PrintFixedPt(Buffer,CtPerHz); printf("\nCt/Hz = %s\n",Buffer); Ct/Hz = 34.359738367

That’s also the largest possible fixed-point number, because unsigned:

TempFX.fx_64 = -1; PrintFixedPt(Buffer,TempFX); printf("Max fixed point: %s\n",Buffer); Max fixed point: 4294967295.999999999

With nine.nine significant figures in the mix, tweaking the 125 MHz oscillator to within 2 Hz will work:

Oscillator tune: CtPerHz Oscillator: 125000000.00 -10 -> 34.359741116 -9 -> 34.359741116 -8 -> 34.359740566 -7 -> 34.359740566 -6 -> 34.359740017 -5 -> 34.359740017 -4 -> 34.359739467 -3 -> 34.359739467 -2 -> 34.359738917 -1 -> 34.359738917 +0 -> 34.359738367 +1 -> 34.359738367 +2 -> 34.359737818 +3 -> 34.359737818 +4 -> 34.359737268 +5 -> 34.359737268 +6 -> 34.359736718 +7 -> 34.359736718 +8 -> 34.359736168 +9 -> 34.359736168 +10 -> 34.359735619

So, all in all, this looks good. The vast number of strings in the test program bulk it up beyond reason, but in actual practice I think the code will be smaller than the equivalent floating point version, with more significant figures. Speed isn’t an issue either way, because the delays waiting for the crystal tester to settle down at each frequency step should be larger than any possible computation.

The results were all verified with my trusty HP 50g and HP-15C calculators, both of which wipe the floor with any other way of handling mixed binary / hex / decimal arithmetic. If you do bit-wise calculations, even on an irregular basis, get yourself a SwissMicro DM16L; you can thank me later.

The Arduino source code as a GitHub Gist:

## Blowback