Ed Nisley's Blog: Shop notes, electronics, firmware, machinery, 3D printing, laser cuttery, and curiosities. Contents: 100% human thinking, 0% AI slop.
Inserting a few simple floating point operations between the SPI transfers provides a quick-n-dirty look at the timings:
Math timing – double ops
The corresponding code runs in the ADC end-of-conversion handler:
void adc0_isr(void) {
  digitalWriteFast(ANALOG_PIN,HIGH);
  AnalogSample = adc->readSingle();   // fetch just-finished sample
  SPI.beginTransaction(SPISettings(8000000, MSBFIRST, SPI_MODE0));
  digitalWriteFast(DDS_FQUD_PIN, LOW);
  SPI.transfer(DDSBuffer.Phase);      // interleave with FM calculations
  FlipPin(GLITCH_PIN);
  TestFreq += DDSStepFreq;
  FlipPin(GLITCH_PIN);
  SPI.transfer(DDSBuffer.Bits31_24);
  TestFreq -= DDSStepFreq;
  SPI.transfer(DDSBuffer.Bits23_16);
  TestFreq *= DDSStepFreq;
  SPI.transfer(DDSBuffer.Bits15_8);
  FlipPin(GLITCH_PIN);
  TestFreq /= DDSStepFreq;
  FlipPin(GLITCH_PIN);
  SPI.transfer(DDSBuffer.Bits7_0);
  SPI.endTransaction();               // do not raise FQ_UD until next timer tick!
  digitalWriteFast(ANALOG_PIN,LOW);
}
The FlipPin() function twiddling the output bit takes a surprising amount of time, as shown by the first two gaps in the blocks of SPI clocks (D4). Some cursor fiddling on a zoomed scale says 300 ns = 50-ish cycles for each call. In round numbers, actual code doing useful work will take longer than that.
Double precision floating add / subtract / multiply seem to take about 600 ns. That’s entirely survivable if you don’t get carried away.
Double precision division, on the other paw, eats up 3 μs = 3000 ns, so it’s not something you want to casually plunk into an interrupt handler required to finish before the next audio sample arrives in 20 μs.
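As a rough sanity check, those scope readings can be tallied against the 20 μs sample period. This is only a sketch: the 180 MHz core clock is an assumption about the Teensy 3.6 configuration, and the operation counts in the example workload are invented for illustration.

```python
# Rough ISR time budget built from the scope measurements quoted above.
# Assumption: the Teensy 3.6 core runs at 180 MHz.
CPU_HZ = 180e6
SAMPLE_PERIOD_NS = 20_000      # 20 us between audio samples

COST_NS = {
    "flip_pin": 300,           # one FlipPin() call
    "add_sub_mul": 600,        # double add / subtract / multiply
    "divide": 3000,            # double divide
}

def cycles(ns):
    """Convert a duration in nanoseconds to CPU clock cycles."""
    return ns * 1e-9 * CPU_HZ

def budget_used(op_counts):
    """Fraction of the sample period consumed by the given operation counts."""
    total_ns = sum(COST_NS[op] * n for op, n in op_counts.items())
    return total_ns / SAMPLE_PERIOD_NS

# Hypothetical workload: three cheap operations and one divide per sample.
example = {"add_sub_mul": 3, "divide": 1}
print(f"FlipPin() = {cycles(COST_NS['flip_pin']):.0f} cycles")
print(f"Example math uses {budget_used(example):.0%} of the period")
```

Even one divide per sample eats a noticeable slice of the period before the SPI transfers get their share.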
Overall, the CPU utilization seems way too high for comfort, mostly due to the SPI transfers, even without any computation. I must study the SPI-by-DMA examples to see if it’s a win.
For lack of anything smarter, I put a 1 kΩ resistor from RF Out to Ground to get some DC current going, then used a 470 nF cap and 47 Ω resistor as an AC load:
K1003 Channel Element – bias lashup
Which oscillated around a mid-scale DC bias, but looked ugly:
K1003 Channel Element – 13.4 MHz output – 1k bias
Perusing some receiver schematics suggested a heavier DC load, so I swapped in a 470 Ω resistor:
The general idea is to frequency modulate the sine wave coming from a DDS, thereby generating a signal suitable for upconverting in amateur repeaters now tied to unobtainable crystals. The crystals run from 4-ish to 20-ish MHz, with frequency multiplication from 3 to 36 producing RF outputs from 30-ish MHz through 900-ish MHz; more details as I work through the choices.
The demo code runs on a bare Teensy 3.6 as a dipstick test for the overall timing and functionality:
FM DDS – Teensy 3.6 SPI demo
The fugliest thing you’ve seen in a while, eh?
An overview of the results:
Analog 4 kHz @ 40 kHz – SPI demo overview
The pulses in D1 (orange digital) mark timer ticks at a 40 kHz pace, grossly oversampling the 4 kHz audio bandwidth in the hope of trivializing the antialiasing filters. The timer tick raises the DDS latch pin (D6, top trace) to change the DDS frequency, fires off another ADC conversion, and (for now) copies the previous ADC value to the DAC output:
The purple analog trace is the input sine wave at 4 kHz. The yellow analog stairstep comes from the DAC, with no hint of a reconstruction filter knocking off the sharp edges.
The X1 cursor (bold vertical dots) marks the start of the ADC read. I hope triggering it from the timer tick eliminates most of the jitter.
The Y1 cursor (upper dotted line, intersecting X1 just left of the purple curve) shows the ADC sample apparently happens just slightly after the conversion. The analog scales may be slightly off, so I wouldn’t leap to any conclusions.
The pulses in D2 mark the ADC end-of-conversion interrupts:
void adc0_isr(void) {
  digitalWriteFast(ANALOG_PIN,HIGH);
  AnalogSample = adc->readSingle();   // fetch just-finished sample
  SPI.beginTransaction(SPISettings(8000000, MSBFIRST, SPI_MODE0));
  digitalWriteFast(DDS_FQUD_PIN, LOW);
  SPI.transfer(DDSBuffer.Phase);      // interleave with FM calculations
  SPI.transfer(DDSBuffer.Bits31_24);
  SPI.transfer(DDSBuffer.Bits23_16);
  SPI.transfer(DDSBuffer.Bits15_8);
  SPI.transfer(DDSBuffer.Bits7_0);
  SPI.endTransaction();               // do not raise FQ_UD until next timer tick!
  digitalWriteFast(ANALOG_PIN,LOW);
}
The real FM code will multiply the ADC reading by the amplitude-to-frequency-deviation factor, add it to the nominal “crystal” frequency, convert the sum to the DDS delta-phase register value, then send it to the DDS through the SPI port. For now, I just send five constant bytes to get an idea of the minimum timing with the SPI clock ticking along at 8 MHz.
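The arithmetic in that chain is easier to see written out. What follows is a minimal Python sketch of the calculation, not the firmware: the DDS clock, nominal frequency, ADC resolution, and peak deviation are all assumed values chosen for illustration, and `delta_phase()` and `split_bytes()` are hypothetical helper names.

```python
# Sketch of the FM calculation described above, in Python for clarity.
# Every constant here is an illustrative assumption, not a firmware value.
DDS_CLOCK_HZ = 180e6        # assumed DDS reference clock
NOMINAL_HZ = 13.4e6         # assumed "crystal" frequency
ADC_BITS = 12               # assumed ADC resolution
DEVIATION_HZ = 5000.0       # assumed peak deviation at full-scale audio

ADC_MID = 1 << (ADC_BITS - 1)          # mid-scale reading = zero audio
HZ_PER_COUNT = DEVIATION_HZ / ADC_MID  # amplitude-to-deviation factor

def delta_phase(adc_reading):
    """Map an ADC reading to a 32-bit DDS delta-phase word."""
    freq = NOMINAL_HZ + (adc_reading - ADC_MID) * HZ_PER_COUNT
    return round(freq * 2**32 / DDS_CLOCK_HZ) & 0xFFFFFFFF

def split_bytes(word):
    """Split the word into the four byte fields named in the ISR."""
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]

word = delta_phase(ADC_MID)            # no audio: nominal frequency
print(hex(word), [hex(b) for b in split_bytes(word)])
```

The one division hides inside the constant `2**32 / DDS_CLOCK_HZ`, which can be computed once outside the interrupt handler, leaving a multiply and an add per sample.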
The tidy blurs in D4 show the SPI clock, with the corresponding data in D5.
D6 (top trace) shows the DDS FQ_UD (pronounced “frequency update”) signal dropping just before the SPI data transfer begins. Basically, FQ_UD is the DDS Latch Clock: low during the delta-phase value transfer, with the low-to-high transition latching all 40 control + data bits into the DDS to trigger the new frequency.
A closer look at the sample and transfer:
Analog 4 kHz @ 40 kHz – SPI demo detail
For reference, the digital players from bottom to top:
D0 – unused here, shows pulses marking main loop
D1 – 40 kHz timer ticks = ADC start conversion
D2 – ADC end of conversion, “FM calculation”, send DDS data
D3 – unused here, shows error conditions
D4 – SPI clock = rising edge active
D5 – SPI MOSI data to DDS = MSB first
D6 – SPI CS = FQ_UD = DDS latch
Remember, the yellow analog stairstepped trace is just a comfort signal showing the ADC actually samples the intended input.
Dropping the sampling to 20 kHz would likely work just as well and double the time available for calculations. At least now I can measure what’s going on.
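To put numbers on that trade-off, here is a small sketch comparing the sample period with the bare minimum SPI time for 40 bits at 8 MHz. It ignores the gaps between bytes visible on the scope, so the real transfer takes longer than this lower bound.

```python
# Time budget per sample: SPI transfer time versus sample period.
SPI_CLOCK_HZ = 8_000_000   # SPI clock from the SPISettings call
DDS_BITS = 40              # 40 control + data bits per DDS update

def sample_period_us(sample_rate_hz):
    """Time between samples in microseconds."""
    return 1e6 / sample_rate_hz

def spi_time_us(bits=DDS_BITS, clock_hz=SPI_CLOCK_HZ):
    """Lower bound on transfer time: ignores the gaps between bytes."""
    return bits / clock_hz * 1e6

for rate in (40_000, 20_000):
    period = sample_period_us(rate)
    left = period - spi_time_us()
    print(f"{rate // 1000} kHz: period {period:.0f} us, at most {left:.0f} us left")
```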
All in all, it looks feasible.
And, yes, the scope is a shiny new Siglent SDS2304X with the MSO logic-analyzer option. It has some grievous UX warts & omissions suggesting an architectural botch job, but it’s mostly Good Enough for what I need. More later.
The IEEE-754 spec says a double floating-point variable carries about 15.9 decimal digits, which agrees with the 9 integer + 7 fraction digits. The lowlight (gray bar) in the first figure shows the slight stumble where adding 1e-7 changes the sum, but not quite enough to affect the displayed fraction.
In round numbers, an increment of 1e-5 would work just fine:
You’d use the “smallest of all” epsilon in a multiplied increment, perhaps to tick a value based on a knob or some such. Fine-tuning a VHF frequency with millihertz steps probably doesn’t make much practical sense.
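The same stumble shows up on any machine with IEEE-754 doubles. Here is a short Python sketch, using 146.52 MHz purely as an example of a nine-digit frequency in hertz; `math.ulp()` needs Python 3.9 or later.

```python
import math

# Spacing between adjacent doubles near a VHF frequency expressed in hertz.
# 146.52 MHz is only an example value with nine integer digits.
freq_hz = 146_520_000.0
ulp = math.ulp(freq_hz)

def survives(increment):
    """True if adding the increment changes the stored value at all."""
    return freq_hz + increment != freq_hz

print(f"spacing near {freq_hz:.0f} Hz: {ulp:.3e} Hz")
for inc in (1e-5, 1e-7, 1e-9):
    stored = (freq_hz + inc) - freq_hz   # what actually got added
    print(f"{inc:.0e}: changes value = {survives(inc)}, stored step = {stored:.3e}")
```

An increment near the spacing between doubles still changes the sum, but the amount actually added gets rounded to a whole number of those spacings, which is where the displayed digits go astray.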
The DDS frequency increment works out to 41.9095 mHz, slightly larger than with the Arduino, because it’s for a cheap eBay DDS module with an AD9851 running a 180 MHz (6 × 30 MHz) clock.
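That increment is simply the DDS clock divided by the size of the 32-bit phase accumulator. A quick sketch of the arithmetic follows; the 10 MHz target is only an example and `nearest_output()` is a hypothetical helper name.

```python
# Frequency resolution of a 32-bit DDS phase accumulator.
DDS_CLOCK_HZ = 180e6           # 6 x 30 MHz, as on the AD9851 module
ACCUMULATOR_BITS = 32

step_hz = DDS_CLOCK_HZ / 2**ACCUMULATOR_BITS

def nearest_output(target_hz):
    """Return (tuning word, actual output frequency) closest to the target."""
    word = round(target_hz / step_hz)
    return word, word * step_hz

word, actual = nearest_output(10e6)    # 10 MHz is only an example target
print(f"step = {step_hz * 1e3:.4f} mHz")
print(f"word = {word}, error = {(actual - 10e6) * 1e3:+.2f} mHz")
```

Any requested frequency lands within half a step of what the DDS can actually produce.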
However, (nearly) all the remaining glitches seem to occur while writing a single row of pixels, which trashes the rest of the display and resolves on the next track update. That suggests slowing the timing during the initial hardware setup did change the results.
Another look at the Luma code showed I missed the Chip Enable (a.k.a. Chip Select in the SH1106 doc) change in serial.py:
def _write_bytes(self, data):
    gpio = self._gpio
    if self._CE:
        time.sleep(1.0e-3)
        gpio.output(self._CE, gpio.LOW)  # Active low
        time.sleep(1.0e-3)
    for byte in data:
        for _ in range(8):
            gpio.output(self._SDA, byte & 0x80)
            gpio.output(self._SCLK, gpio.HIGH)
            byte <<= 1
            gpio.output(self._SCLK, gpio.LOW)
    if self._CE:
        time.sleep(1.0e-3)
        gpio.output(self._CE, gpio.HIGH)
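To see the pin sequence that routine produces without any hardware attached, here is a standalone sketch of the same shifting logic driving a fake GPIO object that just records each write. `FakeGPIO`, `write_bytes()`, and the string pin names are all invented for illustration and are not part of Luma.

```python
import time

class FakeGPIO:
    """Stand-in for a GPIO library: records every pin write in order."""
    LOW, HIGH = 0, 1

    def __init__(self):
        self.events = []

    def output(self, pin, value):
        self.events.append((pin, 1 if value else 0))

def write_bytes(gpio, data, sda, sclk, ce=None, ce_delay=0.0):
    """Shift bytes out MSB first, with an optional pause around chip enable."""
    if ce is not None:
        time.sleep(ce_delay)
        gpio.output(ce, gpio.LOW)           # active low
        time.sleep(ce_delay)
    for byte in data:
        for _ in range(8):
            gpio.output(sda, byte & 0x80)   # present the current top bit
            gpio.output(sclk, gpio.HIGH)    # rising clock edge
            byte <<= 1
            gpio.output(sclk, gpio.LOW)
    if ce is not None:
        time.sleep(ce_delay)
        gpio.output(ce, gpio.HIGH)

gpio = FakeGPIO()
write_bytes(gpio, [0xA5], sda="SDA", sclk="SCLK", ce="CE")
bits = [value for pin, value in gpio.events if pin == "SDA"]
print(bits)
```

Passing a nonzero `ce_delay` stretches the chip enable setup and hold times, which is the effect the added sleeps are after.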
What remains unclear (to me, anyway) is how the code in Luma's bitbang class interacts with the hardware-based SPI code in Python’s underlying spidev library. I think what I just changed shouldn’t make any difference, because the code should be using the hardware driver, but the failure rate is now low enough I can’t be sure for another few weeks (and maybe not even then).
All this boils down to the Pi’s SPI hardware interface, which changes the CS output with setup / hold times measured in a few “core clock cycles”, which is way too fast for the SH1106. It seems there’s no control over CS timing, other than by changing the kernel’s bcm2708 driver code, which ain’t happening.
The Python library includes a no_cs option, with the caveat it will “disable use of the chip select (although the driver may still own the CS pin)”.
Running vcgencmd measure_clock core (usage and some commands) returns frequency(1)=250000000, which says a “core clock cycle” amounts to a whopping 4 ns.
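The conversion from that output line to a cycle time is one division. A tiny sketch, with `cycle_ns()` as a hypothetical helper that assumes the `frequency(N)=Hz` format shown above:

```python
def cycle_ns(vcgencmd_line):
    """Convert a 'frequency(N)=Hz' line into a clock period in nanoseconds."""
    hertz = int(vcgencmd_line.strip().split("=", 1)[1])
    return 1e9 / hertz

period = cycle_ns("frequency(1)=250000000")
print(f"{period:.1f} ns per core clock cycle")
```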
Forcibly insisting on using Luma’s bitbang routine may be the only way to make this work, but I don’t yet know how to do that.
Obviously, I should code up a testcase to hammer the OLED and peer at the results on the oscilloscope: one careful observation outweighs a thousand opinions.
The sturdy metal enclosure ought to be good for something, I thought, so I rescued it from the trash.
One of the ten button-head screws galled in place and resisted a few days of penetrating oil, so I drilled it out:
Drilled-out button screw head
The PCB has no ICs! It simply routes all the LED and button pins through the pillar into the sewing machine controller:
Brother BAS-311 Control Head – interior
The ribbon cable alternates the usual flat strip with sections of split conductors:
Segmented ribbon cable
The split segments let it roll up into the pillar, with enough flexibility to allow rotating the head. I’ve seen segmented twisted-pair ribbon cable, but never just flat conductors.
Maybe the control head can become Art in its next life?