Software details:

The software running on the buoy is written in C (Arduino framework). I might eventually clean it up and publish it. The backend was initially the Helium Google Forms integration. To support transmitting the raw measurements (hundreds of bytes instead of 24 bytes), I switched to point-to-point LoRa. This uses a second esp32 that receives the frames, encodes them as JSON, and publishes them to MQTT. Some Python running on a Raspberry Pi then takes the readings off MQTT, calculates some derived values, posts them to a Google Form, and uploads processed versions to Google Cloud Storage.

In the Buoy:

The buoy now contains 3 MCUs that communicate over two i2c buses. The main application logic now runs on an avr64db32. The LoRa board runs the network code only. Finally, an ATtiny1624 acts as a watchdog and cycles power for the rest of the system if it hangs.

avr64db32

The main application logic now runs on an avr64db32. This wakes up at 16 Hz to collect samples continually. It wakes the ESP every ~16 seconds to transmit data. This chip draws ~1-2 mA at 4 MHz. As much of the logic as possible runs here. The full firmware image for this chip is 64 KB, making it the easiest and safest to update. At SF7 CR4_5 an update takes 35 seconds to transmit and another 10 s or so to apply and verify.

A PIT interrupt is configured to call its ISR at 16 Hz. A full sample is 16 seconds, giving 256 PIT interrupts per sample. When a new reading starts, the full 32-bit value is stored. Each of the following samples is stored as a 16-bit difference from sample0. After the 256th reading, the 512+4 bytes of samples are further compressed to a 32-bit offset, an 8-bit shift value, and 256 8-bit samples. This is copied into a transmit buffer so the 512-byte sample buffer is available for the next cycle. The ISR also increments a (volatile) timebase called ticks.
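The second compression step could look roughly like the sketch below. This is not the buoy's actual code: the struct layout, the names (packed_block_t, pack_block, unpack_sample), and the choice of the block minimum as the offset are all assumptions made for illustration.

```c
#include <stdint.h>

#define N_SAMPLES 256

/* Hypothetical layout of one compressed block: 4 + 1 + 256 bytes. */
typedef struct {
    uint32_t offset;             /* assumed: smallest raw reading in the block */
    uint8_t  shift;              /* right-shift applied so values fit in 8 bits */
    uint8_t  samples[N_SAMPLES]; /* (reading - offset) >> shift */
} packed_block_t;

/* base: first full 32-bit reading; delta[i]: reading i minus base. */
void pack_block(uint32_t base, const int16_t delta[N_SAMPLES], packed_block_t *out)
{
    int16_t lo = delta[0], hi = delta[0];
    for (int i = 1; i < N_SAMPLES; i++) {
        if (delta[i] < lo) lo = delta[i];
        if (delta[i] > hi) hi = delta[i];
    }

    /* Smallest shift that squeezes the full range into 0..255. */
    uint32_t range = (uint32_t)((int32_t)hi - (int32_t)lo);
    uint8_t shift = 0;
    while ((range >> shift) > 255u) shift++;

    /* Unsigned wraparound makes this equal to base + lo for negative lo. */
    out->offset = base + (uint32_t)(int32_t)lo;
    out->shift = shift;
    for (int i = 0; i < N_SAMPLES; i++) {
        uint32_t rel = (uint32_t)((int32_t)delta[i] - (int32_t)lo);
        out->samples[i] = (uint8_t)(rel >> shift);
    }
}

/* Approximate reconstruction of sample i (the low `shift` bits are lost). */
uint32_t unpack_sample(const packed_block_t *p, int i)
{
    return p->offset + ((uint32_t)p->samples[i] << p->shift);
}
```

With this layout the receiver only needs the offset and shift to rebuild each reading to within 2^shift counts, which is the trade-off that lets 516 bytes shrink to 261.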
The application data that passes between processors is organized in a register file. This is a union of a uint8_t array with a set of structs that store both the configuration and the network messages:
- A structure acfg for the avr configuration. Things like the GPS timeouts and BMP390 settings.
- A structure ecfg for the esp32 configuration. Things like the network transmit power.
- A structure ecmd for dispatching commands from the avr to the esp32
- A structure acmd for dispatching commands from the esp32 to the avr
- A structure raw that contains the detailed measurement information and statistics
- A structure wan that contains the old 24-byte LoRaWAN summary messages. In the event the buoy floats away these should still be received.
The avr64 implements an i2c slave on the secondary i2c bus exposing that register file. This was a change from the initial plan, as the esp32 arduino i2c slave driver proved horribly unstable.
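A register file of this shape might be sketched as follows. Only the six structure names (acfg, ecfg, ecmd, acmd, raw, wan) come from the description above; every field inside them, the type names, and the regfile_read/regfile_write helpers are placeholders invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All fields below are illustrative placeholders, not the real layout. */
typedef struct { uint16_t gps_timeout_s; uint8_t bmp390_osr; } acfg_t;
typedef struct { int8_t tx_power_dbm; } ecfg_t;
typedef struct { uint8_t opcode; uint8_t status; } cmd_t;
typedef struct { uint32_t offset; uint8_t shift; uint8_t samples[256]; } raw_t;
typedef struct { uint8_t bytes[24]; } wan_t;

typedef struct {
    acfg_t acfg;  /* avr configuration */
    ecfg_t ecfg;  /* esp32 configuration */
    cmd_t  ecmd;  /* commands from the avr to the esp32 */
    cmd_t  acmd;  /* commands from the esp32 to the avr */
    raw_t  raw;   /* detailed measurements */
    wan_t  wan;   /* 24-byte LoRaWAN summary */
} regs_t;

/* The same memory viewed as named structs or as a flat byte array. */
typedef union {
    regs_t  r;
    uint8_t bytes[sizeof(regs_t)];
} regfile_t;

/* What an i2c slave handler would do after receiving a register address:
 * bounds-check, then copy bytes out of (or into) the flat view. */
int regfile_read(const regfile_t *rf, uint16_t addr, uint8_t *dst, uint16_t len)
{
    if ((size_t)addr + len > sizeof(rf->bytes)) return -1;
    memcpy(dst, &rf->bytes[addr], len);
    return 0;
}

int regfile_write(regfile_t *rf, uint16_t addr, const uint8_t *src, uint16_t len)
{
    if ((size_t)addr + len > sizeof(rf->bytes)) return -1;
    memcpy(&rf->bytes[addr], src, len);
    return 0;
}
```

The appeal of the union is that the application code uses named fields while the bus handler stays a generic byte copy; offsetof(regs_t, ecmd) gives the register address the other processor would use.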
The main loop interleaves preparing network messages, dispatching them to the esp, and collecting GPS measurements. It keeps a local non-volatile copy of the ticks at the start of the loop (loop_ticks) to avoid expensive synchronization.
A flag is set by the PIT ISR when a sample is ready. The loop fills out the rest of the network frame and, when ecmd is idle, dispatches it to the esp32 by populating ecmd and driving the avr2esp pin low.
The loop checks the age of the most recent GPS reading. If it is time, the GPS is turned on.
If data is available from the GPS, it is processed. After a lock, the measurements are stored and the times are updated.
After the ESP returns the result of a transmission request, statistics are kept on success/failure and power levels are adjusted. The watchdog is fed only after the ESP says it transmitted a packet. Additionally, several error conditions will cause watchdog feeding to be skipped:
- lack of response from the esp
- the pressure sensor returning all zeros
- the esp32 not receiving an ack for 2 hours
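The feeding rule above can be condensed into a single predicate. This is a sketch under assumptions: the health_t struct, its field names, and should_feed_watchdog are hypothetical, and the 2-hour limit is expressed in 16 Hz ticks.

```c
#include <stdbool.h>
#include <stdint.h>

#define TICKS_PER_SECOND  16u
#define ACK_TIMEOUT_TICKS (2u * 60u * 60u * TICKS_PER_SECOND) /* 2 hours */

/* Hypothetical snapshot of the state the main loop already tracks. */
typedef struct {
    bool     esp_responded;     /* esp answered the last ecmd */
    bool     tx_ok;             /* esp reported a transmitted packet */
    bool     pressure_all_zero; /* pressure sensor returned all zeros */
    uint32_t now_ticks;         /* current loop_ticks */
    uint32_t last_ack_ticks;    /* ticks when the esp32 last received an ack */
} health_t;

/* Returns true only when every condition for feeding the watchdog holds. */
bool should_feed_watchdog(const health_t *h)
{
    if (!h->esp_responded) return false;
    if (!h->tx_ok) return false;
    if (h->pressure_all_zero) return false;
    /* Unsigned subtraction stays correct across tick counter wraparound. */
    if ((uint32_t)(h->now_ticks - h->last_ack_ticks) > ACK_TIMEOUT_TICKS)
        return false;
    return true;
}
```

Keeping all the skip conditions in one place makes it harder to add a new failure mode and forget to tie it to the watchdog.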

Heltec Wireless Stick Lite (esp32)

The main purpose of this component is to run arduino-lmic. It also implements all of the cryptography and the firmware update routines for the avr64 and itself. It is a power hog, running 12-20 mA "idle". The SPI driver (required for LoRa) prevents ESP32 power management from dynamically adjusting the clock. While this also supports OTA firmware updates, they are (currently) 6x larger and more error-prone, and it is possible to perform an update to software that breaks further updates.

The main loop enters light-sleep, with both an RTC-GPIO wakeup and a 30s backup timer. On a GPIO wakeup it loads the ecmd register and does whatever the opcode says it should:
- transmit raw data directly
- transmit summary over LoRaWAN
- update the esp32's clock
Raw frames are sent encrypted, mostly as a convenient way to not process random garbage received over the air. The scheme will be described below at some point.
After each transmission, the esp waits for a response which will either be an ACK or a command for it. Commands are mostly firmware updates or settings changes.
The backup timer exists to wake the ESP so it can send an "I'm alive" message, which allows rescue from bad AVR firmware.

The Watchdog

Power to the rest of the system (the ESP, the AVR, and the sensors) is controlled by the watchdog. It also controls the reset lines, but I decided to configure it to power-cycle instead. The watchdog firmware is somewhat more flexible and allows configurable (long) intervals. The system currently uses a 90-second interval to avoid triggering a watchdog reset during firmware updates (the interval is increased at the beginning of an esp update). The watchdog also exposes some memory that survives power cycles to store some configuration like LoRaWAN frame counters, reboot counters, and sequence number counters. Finally, it controls the ina3221 power monitor and implements an accumulator to allow the main processor to only read/clear the power usage information once per sample. Communication is via an i2c slave on the watchdog (on the bus mastered by the avr).

The old single-mcu logic was:

Interleaved with the below, track average power use and copy bytes from the GPS to the GPS library
Power the GPS on using vext control
Start a timer to collect a pressure (altitude) measurement every 59ms until 256 samples have been collected (15.104s).
Find the min and max altitude.
Calculate the average period. While an FFT would be more correct, the results looked wrong in manual testing, so that version is deferred until v1 of the hardware, which will support over-the-air firmware updates.
Wait for the GPS to acquire a signal (usually done by now)
Read the GPS, battery, temperature, and do some power calculations
Pack a 24-byte LoRaWAN frame
Schedule the next measurement based on the current battery level
Sleep
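The min/max and average-period steps in the list above could be sketched as below. The text does not say how the non-FFT period estimate was computed, so the upward mean-crossing method used here is purely an assumption, as are the names wave_stats_t and wave_stats. Only the 59 ms sample interval comes from the description.

```c
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_INTERVAL_MS 59u

typedef struct {
    int32_t  height_mm; /* max altitude minus min altitude */
    uint32_t period_ms; /* 0 when fewer than two upward crossings were seen */
} wave_stats_t;

/* Assumed method: count upward crossings of the mean altitude and divide
 * the time between the first and last crossing by the number of cycles. */
wave_stats_t wave_stats(const int32_t *alt_mm, size_t n)
{
    wave_stats_t out = {0, 0};
    if (n == 0) return out;

    int32_t lo = alt_mm[0], hi = alt_mm[0];
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (alt_mm[i] < lo) lo = alt_mm[i];
        if (alt_mm[i] > hi) hi = alt_mm[i];
        sum += alt_mm[i];
    }
    out.height_mm = hi - lo;
    int32_t mean = (int32_t)(sum / (int64_t)n);

    size_t first = 0, last = 0, crossings = 0;
    for (size_t i = 1; i < n; i++) {
        if (alt_mm[i - 1] < mean && alt_mm[i] >= mean) {
            if (crossings == 0) first = i;
            last = i;
            crossings++;
        }
    }
    if (crossings >= 2)
        out.period_ms = (uint32_t)((last - first) * SAMPLE_INTERVAL_MS
                                   / (crossings - 1));
    return out;
}
```

A crossing count is cheap enough for a small MCU, but it is sensitive to noise near the mean, which may be one reason an FFT would be "more correct".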

Frame format 0b000:

LoRaWAN packet size is highly constrained. While a JSON representation of this might run hundreds of bytes, we can easily byte-pack all the data into a 24-byte frame (the Helium data size unit).

bytes	signal	format
0	3-bit type code | 5 bits of sequence	allows the parser to determine message format. This is format 0b000
1	7 bits of sequence | high bit of timestamp
2-3	low 16-bits of timestamp	time in seconds since midnight
4-6	latitude	24-bits (lat+90)*10000*9.32
7-9	longitude	24-bits (lon+180)*10000*4.66
10	battery current	mA + 127
11	solar current	mA + 127
12	temperature	degrees C * 4
13-14	wave height	mm
15-16	gps horizontal accuracy	cm
17-18	wave period	ms
19-21	pressure	24-bit
22	battery	8-bit, (mv-3280)>>2 with 0 and 255 hardcoded for out of range
23	sat count	byte, could steal 4 bits here
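A few of the fields in the table can be packed as in the sketch below. The scale factors and the battery formula come from the table; the bit order within bytes 0-1 (type code in the top bits), big-endian byte order for multi-byte fields, rounding to nearest, and all function names are assumptions for illustration.

```c
#include <stdint.h>

/* Assumed big-endian 24-bit store. */
static void put24(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 16);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)v;
}

/* 180 * 10000 * 9.32 and 360 * 10000 * 4.66 both equal 16,776,000,
 * just under 2^24, so each coordinate fills its 24-bit field. */
uint32_t encode_lat(double lat)
{
    return (uint32_t)((lat + 90.0) * 10000.0 * 9.32 + 0.5);
}

uint32_t encode_lon(double lon)
{
    return (uint32_t)((lon + 180.0) * 10000.0 * 4.66 + 0.5);
}

/* (mv - 3280) >> 2, with 0 and 255 used for out-of-range values. */
uint8_t encode_battery(uint16_t mv)
{
    if (mv <= 3280) return 0;
    uint16_t v = (uint16_t)((mv - 3280) >> 2);
    return v >= 255 ? 255 : (uint8_t)v;
}

/* Bytes 0-3: 3-bit type, 12-bit sequence, 17-bit seconds since midnight. */
void pack_header(uint8_t *f, uint8_t type, uint16_t seq, uint32_t secs)
{
    f[0] = (uint8_t)(((type & 0x07) << 5) | ((seq >> 7) & 0x1F));
    f[1] = (uint8_t)(((seq & 0x7F) << 1) | ((secs >> 16) & 0x01));
    f[2] = (uint8_t)(secs >> 8);
    f[3] = (uint8_t)secs;
}

/* Bytes 4-9: latitude then longitude. */
void pack_position(uint8_t *f, double lat, double lon)
{
    put24(&f[4], encode_lat(lat));
    put24(&f[7], encode_lon(lon));
}
```

Seconds since midnight tops out at 86,399, which needs 17 bits; that is why one timestamp bit spills into byte 1 next to the sequence number.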

Frame format 0b001:

This frame format encodes only the GPS delta from the previous 0b000 frame.

bytes	signal	format
0	3-bit type code | 5 bits of sequence	allows the parser to determine message format. This is format 0b001
1	7 bits of sequence | high bit of timestamp
2-3	low 16-bits of timestamp	time in seconds since midnight
4	latitude delta	8-bits (lat+90)*10000*9.32
5	longitude delta	8-bits (lon+180)*10000*4.66
10	battery current	mA + 127
11	solar current	mA + 127
12	temperature	degrees C * 4
13-14	wave height	mm
15-16	gps horizontal accuracy	cm
17-18	wave period	ms
19-21	pressure	24-bit
22	battery	8-bit, (mv-3280)>>2 with 0 and 255 hardcoded for out of range
23	sat count	byte, could steal 4 bits here

Cryptography and raw packet formats

This section is mostly TBD. At a high level, a message is fragmented into 255-byte frames. The first frame includes a header with a message type, a frame sequence number, and an (ECB) encrypted SHA256 of the full message. A KDF is run off the decrypted hash to produce a per-message key. That key is used for GCM encryption of the remaining frames. A 64-bit tag is used per frame to cheaply detect accidental corruption. If that passes, the full SHA256 is checked once the final message is assembled. Messages where the SHA256 doesn't pass are discarded. Frame sequence counters prevent replays. This was mostly implemented to ensure the integrity of firmware updates, but it was easy enough to adapt for the rest of the system. An AVR firmware update is 262 messages, so there is retransmission logic, as otherwise most updates would fail.

Notes on power management (old version):

I built and deployed the buoy with only a rough sense of the power consumption in various modes as well as the performance of the solar cell under real world conditions. In particular the INA3221 was a late addition and prior to that I was reasonably blind to the power use. To ensure it would work reasonably well across a wide range of possible scenarios (weather), the system was initially designed to adapt its power use depending on how charged its battery is.

The current draw with the GPS module on is around 50 mA. The sleep draw is < 2 mA. Further optimization is possible: the INA3221 is about 400 uA and the MCU is also in the microamp range. This means the average power use can be varied by changing the duty cycle (time between measurements). Wave measurements take about 30 seconds, which is similar to the GPS warm-up time (a better GPS module would warm-start faster). So figure the system is in its high-power state for about 30-40s.

With a 2-minute sample interval, the system is on for about 30% of the time for a blended average draw of 15 mA. If the battery is below 4.1V and above 3.8V, the system collects a sample every 5 minutes, which should reduce the average draw to 5-7 mA. At 3.7-3.8V it collects a sample every 10 minutes, and below 3.7V every 15 minutes.
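The schedule and the duty-cycle arithmetic can be written down as a short sketch. The voltage thresholds and intervals are the ones given above; how the exact boundary values (4.1V, 3.8V, 3.7V) are treated, and the function names sample_interval_s and average_draw_ua, are assumptions.

```c
#include <stdint.h>

/* Seconds between measurements for a given battery voltage in mV.
 * Assumed: a voltage exactly on a threshold takes the faster schedule. */
uint16_t sample_interval_s(uint16_t battery_mv)
{
    if (battery_mv >= 4100) return 2 * 60;
    if (battery_mv >= 3800) return 5 * 60;
    if (battery_mv >= 3700) return 10 * 60;
    return 15 * 60;
}

/* Blended average current in uA: high-power draw for on_time_s out of
 * every interval_s, sleep draw for the remainder. */
uint32_t average_draw_ua(uint16_t interval_s, uint16_t on_time_s,
                         uint32_t on_ua, uint32_t sleep_ua)
{
    if (interval_s == 0) return on_ua;
    if (on_time_s > interval_s) on_time_s = interval_s;
    uint64_t total = (uint64_t)on_ua * on_time_s
                   + (uint64_t)sleep_ua * (uint32_t)(interval_s - on_time_s);
    return (uint32_t)(total / interval_s);
}
```

For example, taking 35s of high-power time at 50 mA and the 2 mA upper bound for sleep, average_draw_ua(120, 35, 50000, 2000) works out to 16,000 uA, in line with the roughly 15 mA figure quoted for the 2-minute schedule.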

Further experimentation will be done to compare the power use of a reading that omits GPS.

Measuring average current with the INA3221:

The INA3221 should be configured to spend most of its time measuring current instead of voltage. I use 588us voltage, 8244us current, and 64-sample averaging to produce a reading every ~1074ms, which is an easy enough time period to interleave. The readings are accumulated during each measurement, tracking the total time. Use the measurement done flag to ensure each reading is captured exactly once. Wait for a cycle to finish (with the GPS off) before sleeping to ensure the reading on awakening reflects the sleep use. Note this would all be much simpler on an INA233, which has an integrated accumulator.
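A time-weighted accumulator along these lines is sketched below. It deliberately leaves out the I2C register access: the caller is assumed to have already read the measurement done flag and the current value from the chip. The names current_acc_t, acc_poll, acc_average_ua, and acc_clear are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int64_t  sum_ua_ms; /* sum of (current in uA) * (duration in ms) */
    uint32_t total_ms;  /* total time covered by the accumulated readings */
} current_acc_t;

/* Call once per poll. `ready` is the measurement done flag as read from the
 * chip; a reading is accumulated only when it is set, so each conversion
 * is counted exactly once. Returns true if the reading was consumed. */
bool acc_poll(current_acc_t *a, bool ready, int32_t current_ua,
              uint32_t period_ms)
{
    if (!ready) return false;
    a->sum_ua_ms += (int64_t)current_ua * period_ms;
    a->total_ms += period_ms;
    return true;
}

/* Average current over everything accumulated since the last clear. */
int32_t acc_average_ua(const current_acc_t *a)
{
    if (a->total_ms == 0) return 0;
    return (int32_t)(a->sum_ua_ms / (int64_t)a->total_ms);
}

void acc_clear(current_acc_t *a)
{
    a->sum_ua_ms = 0;
    a->total_ms = 0;
}
```

Signed values are used because the battery channel can report current in either direction (charging or discharging).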

This library supports the INA3221.

Helium with the HelTec CubeCell:

More to come on Helium. But you can get started at this console.

Comments on making heltec lorawan library work with Helium:

set userChannelMasks[6] to {0xFF00, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000} for Helium's channels
make sure you set the region to US915
the heltec and helium byte ordering are the same for devEui, appEui, and appKey

Backend:

The current backend is a quick and dirty placeholder that, like all temporary solutions, may outlive its intended lifetime. Helium provides a Google Forms integration and runs user-supplied JavaScript to decode the payload into JSON and post it to the Google Form. Forms dumps data to a sheet which generates the graphs.