Arduino vs. Significant Figures: Useful 64-bit Fixed Point

Devoting eight bytes to every fixed point number may be excessive, but having nine significant figures apiece for the integer and fraction parts pushes the frequency calculations well beyond the limits of the DDS hardware, without involving any floating point library routines. This chunk of code performs a few more calculations using the format laid out earlier and explores a few idioms that may come in handy later.

Rounding the numbers to a specific number of decimal places gets rid of the repeating-digit problem that turns 0.10 into 0.099999:

uint64_t RoundFixedPt(union ll_u TheNumber,unsigned Decimals) {
union ll_u Rnd;

  Rnd.fx_64 = (One.fx_64 / 2) / (pow(10LL,Decimals));
  TheNumber.fx_64 = TheNumber.fx_64 + Rnd.fx_64;
  return TheNumber.fx_64;

That pretty well trashes the digits beyond the rounded place, so you shouldn’t display any more of them:

void PrintFixedPtRounded(char *pBuffer,union ll_u FixedPt,unsigned Decimals) {
char *pDecPt;

  FixedPt.fx_64 = RoundFixedPt(FixedPt,Decimals);

  PrintIntegerLL(pBuffer,FixedPt);  // do the integer part

  pBuffer += strlen(pBuffer);       // aim pointer beyond integer

  pDecPt = pBuffer;                 // save the point location
  *pBuffer++ = '.';                 // drop in the decimal point, tick pointer


  if (Decimals == 0)
    *pDecPt = 0;                    // 0 places means discard the decimal point
    *(pDecPt + Decimals + 1) = 0;   // truncate string to leave . and Decimals chars

Which definitely makes the numbers look prettier:

  Tenth.fx_64 = One.fx_64 / 10;             // Likewise, 0.1
  printf("\n0.1: %s\n",Buffer);
  PrintFixedPtRounded(Buffer,Tenth,9);                    // show rounded value
  printf("0.1 to 9 dec: %s\n",Buffer);

  TestFreq.fx_64 = RoundFixedPt(Tenth,3);                 // show full string after rounding
  printf("0.1 to 3 dec: %s (full string)\n",Buffer);

  PrintFixedPtRounded(Buffer,Tenth,3);                    // show truncated string with rounded value
  printf("0.1 to 3 dec: %s (truncated string)\n",Buffer);

0.1: 0.099999999
0.1 to 9 dec: 0.100000000
0.1 to 3 dec: 0.100499999 (full string)
0.1 to 3 dec: 0.100 (truncated string)

  CtPerHz.fx_64 = -1;                       // Set up 2^32 - 1, which is close enough
  CtPerHz.fx_64 /= 125 * MEGA;              // divide by nominal oscillator
  printf("\nCt/Hz = %s\n",Buffer);

  printf("Rounding: \n");
  for (int d = 9; d >= 0; d--) {
    printf("     %d: %s\n",d,Buffer);

Ct/Hz = 34.359738367
     9: 34.359738368
     8: 34.35973837
     7: 34.3597384
     6: 34.359738
     5: 34.35974
     4: 34.3597
     3: 34.360
     2: 34.36
     1: 34.4
     0: 34

Multiplying two scaled 64-bit fixed-point numbers should produce a 128-bit result. For all the values we (well, I) care about, the product will fit into a 64-bit result, because the integer parts will always multiply out to less than 232 and we don’t care about more than 32 bits of fraction. This function multiplies two fixed point numbers of the form a.b × c.d by adding up the partial products thusly: ac + bd + ad + bc. The product of the integers ac won’t overflow 32 bits, the cross products ad and bc will always be slightly less than their integer factors, and the fractional product bd will always be less than 1.0.

Soooo, just multiply ’em out as 64-bit integers, shift the products around to align the appropriate parts, and add up the pieces:

uint64_t MultiplyFixedPt(union ll_u Mcand, union ll_u Mplier) {
union ll_u Result;

  Result.fx_64  = ((uint64_t)Mcand.fx_32.high * (uint64_t)Mplier.fx_32.high) << 32; // integer parts (clear fract) 
  Result.fx_64 += ((uint64_t)Mcand.fx_32.low * (uint64_t)Mplier.fx_32.low) >> 32;   // fraction parts (always < 1)
  Result.fx_64 += (uint64_t)Mcand.fx_32.high * (uint64_t)Mplier.fx_32.low;          // cross products
  Result.fx_64 += (uint64_t)Mcand.fx_32.low * (uint64_t)Mplier.fx_32.high;

  return Result.fx_64;

This may be a useful way to set magic numbers with a few decimal places, although it does require keeping the decimal point in mind:

  TestFreq.fx_64 = (599999LL * One.fx_64) / 10;           // set 59999.9 kHz differently
  printf("\nTest frequency: %s\n",Buffer);
  printf("         round: %s\n",Buffer);

Test frequency: 59999.899999999
         round: 59999.9

Contrary to what I thought, computing the CtPerHz coefficient doesn’t require pre-dividing both 232 and the oscillator by 2, thus preventing the former from overflowing a 32 bit integer. All you do is knock the numerator down by one little itty bitty count you’ll never notice:

  CtPerHz.fx_64 = -1;                       // Set up 2^32 - 1, which is close enough
  CtPerHz.fx_64 /= 125 * MEGA;              // divide by nominal oscillator
  printf("\nCt/Hz = %s\n",Buffer);

Ct/Hz = 34.359738367

That’s also the largest possible fixed-point number, because unsigned:

  TempFX.fx_64 = -1;
  printf("Max fixed point: %s\n",Buffer);

Max fixed point: 4294967295.999999999

With nine.nine significant figures in the mix, tweaking the 125 MHz oscillator to within 2 Hz will work:

Oscillator tune: CtPerHz
 Oscillator: 125000000.00
 -10 -> 34.359741116
  -9 -> 34.359741116
  -8 -> 34.359740566
  -7 -> 34.359740566
  -6 -> 34.359740017
  -5 -> 34.359740017
  -4 -> 34.359739467
  -3 -> 34.359739467
  -2 -> 34.359738917
  -1 -> 34.359738917
  +0 -> 34.359738367
  +1 -> 34.359738367
  +2 -> 34.359737818
  +3 -> 34.359737818
  +4 -> 34.359737268
  +5 -> 34.359737268
  +6 -> 34.359736718
  +7 -> 34.359736718
  +8 -> 34.359736168
  +9 -> 34.359736168
 +10 -> 34.359735619

So, all in all, this looks good. The vast number of strings in the test program bulk it up beyond reason, but in actual practice I think the code will be smaller than the equivalent floating point version, with more significant figures. Speed isn’t an issue either way, because the delays waiting for the crystal tester to settle down at each frequency step should be larger than any possible computation.

The results were all verified with my trusty HP 50g and HP-15C calculators, both of which wipe the floor with any other way of handling mixed binary / hex / decimal arithmetic. If you do bit-wise calculations, even on an irregular basis, get yourself a SwissMicro DM16L; you can thank me later.

The Arduino source code as a GitHub Gist:


, ,

  1. #1 by madbodger on 2017-05-26 - 09:39

    I’m a big fan of creative integer math. It’s fast, relatively small, and predictable. And, as in your case, there are often opportunities for creative optimization. I’m really enjoying reading how you go about it.

    • #2 by Ed on 2017-05-26 - 19:14

      Thanks for the good words!

      You’ll be amused to know I can zero-beat the DDS to the GPS-locked oscillator, then waving my hand near the (insulated!) DDS board shifts the frequency by just under 1 Hz. Of course, that’s only 4 parts per billion, but now I can see it!

  2. #3 by solaandjin on 2017-05-26 - 15:33

    My first impulse would have been to Google for “arduino fixed point library”:

    Is there some hidden reason why this would not be appropriate? Not to rain on your parade, I know how much fun DIY can be.

    • #4 by Ed on 2017-05-26 - 18:47

      I really only needed a multiplication that lets me go from 60-ish kHz to the DDS register value with enough significant digits to not wreck 0.1 Hz frequency steps, so dragging in a whole library seems like entirely too much of a good thing. That whole mess of code boils down to some structs and a couple of functions, so it’s not really all that much effort.

      Plus, I’m just finishing a Circuit Cellar column on significant figures vs. DDS computations, wherein I show some code and discuss the consequences, so it’s best if I understand what I’m doing. [grin]

      Update — If I understand the notation correctly, the library tops out at Q16.16 with 16 bits apiece for the integer and fraction parts: definitely not big enough. I need about nine decimal digits = 32 bits for each part, so (I think) what I’m doing is Q32.32.

  3. #5 by scruss2 on 2017-05-26 - 17:27

    Before you run off and write too much code for handling this data type, I think you’ve just recreated the “Rational” data type used in EXIF camera metadata. ISTR it being a struct of two 32-bit integers as numerator and denominator. There may be libraries out there that will have tested code for you to use.

    • #6 by Ed on 2017-05-26 - 18:57

      I must hang out a shingle: Wheels reinvented while you wait watch!

      The intersection of Arduino RAM and EXIF image data surely defines a vanishingly small area, one not revealed by a casual search with the obvious keywords …

      In any event, there’s not a whole lot more code before this mess gets off the ground. I’ve been squirting frequency steps into the DDS with gleeful abandon; the SPI hardware requires just a few dozen lines of code. Looking good!

      • #7 by scruss2 on 2017-05-27 - 11:39

        Fair enough. Not finding much code either (well, lots of the same rational → float: exactly what you don’t want). Also, EXIF’s “SRATIONAL” type wastes a bit by having both numerator and denominator signed.

        • #8 by scruss2 on 2017-05-27 - 13:24

          … though I did just find out that avr-gcc has supported Q32.32 fixed point (and many others) for a while. Any attempt to include stdfix.h and use the K suffix in the Arduino IDE gives you this, though: “fixed-point types not supported in C++”, dangit.

          • #9 by Ed on 2017-05-28 - 13:10

            Makes a few dozen lines of OwnCode™ look downright attractive …

  1. AD9850 DDS Module: Hardware Assisted SPI and Fixed-point Frequency Stepping | The Smell of Molten Projects in the Morning
  2. AD9850 DDS Module: OLED Display | The Smell of Molten Projects in the Morning
  3. 128×64 OLED Display: I²C Timings | The Smell of Molten Projects in the Morning

Spam comments vanish. Comment moderation may cause a delay.

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s