Machine‑to‑Machine UK, Friday, 1st October 2010
One of the advantages of working for a startup is that I get to work on interesting projects with interesting companies helping them solve difficult problems.
While my day job is normally working on product development for stormmq, customer-driven projects are always welcome.
I’ve recently finished one such project, writing an embedded AMQP C client, where I got to spend 3 weeks programming in C, something I’ve not done in over 15 years.
Our customer wanted to collect telemetry data from an electric vehicle. A custom-built embedded device collected data from the vehicle’s CAN bus, a number of discrete sensors and a GPS module, and transmitted the data using a GPRS modem. During the proof of concept stage, the embedded devices connected directly to a backend server and transmitted their data in plain text.
The commercial sensitivity of the collected data meant that plain text transmission would not be acceptable in production. The close coupling of the telemetry unit to the backend server also caused data to be lost whenever the backend was not available or the telemetry unit could not establish a GPRS connection.
The solution to two of these problems was to adopt stormmq’s service. Using AMQP over SSL decoupled the telemetry units from the backend and provided the necessary confidentiality. The solution to the remaining problem (loss of data when without a GPRS signal) was to have the telemetry units buffer collected data to flash cards.
At stormmq we have a number of clients: Java, Ruby, C#, PHP and Python. However, we did not have a C client that worked over SSL. To allow our customer to use the stormmq service we had to port a C AMQP client to the telemetry unit and get it working with an SSL stack on the unit.
At the start of the project I did not have access to one of the telemetry units. This was not a barrier as I was able to do some preliminary work on Linux, getting a C AMQP client to work with SSL.
Some searching for a suitable SSL implementation unearthed two candidates: CyaSSL and MatrixSSL. I chose CyaSSL because its license has an exception that allows it to be embedded within other FOSS projects, even if they are released under something other than the GNU license.
RabbitMQ had an experimental C client library available that seemed like it would more than meet the needs of our simple embedded client. Getting CyaSSL and rabbitmq‐c running on my Linux VM was straightforward. One worry, though, was the amount of memory being used. CyaSSL alone allocated about 49KB on the heap for an SSL connection.
At this stage I finally got to meet the hardware developer, Tristan Dargay of TAD Electronics, based in Worcester. He runs a small lab building all sorts of microelectronics kit for clients in industries as diverse as security, telecoms and vehicle manufacturing.
Tristan gave me a prototype of the telemetry unit and helped me set up a development environment where I could add my code to what he had already developed for the telemetry unit. The unit used a PIC32MX processor from Microchip. Microchip provide a development environment, MPLAB. MPLAB comes with a set of C libraries and uses a gcc cross compiler to generate code for the PIC32MX. Tristan also provided a loader for the telemetry unit and a USB to RS485 adapter so that I could flash the telemetry unit from my MBP.
It’s been over 20 years since I did any work on an embedded system. Back then, I would have used an in‐circuit emulator from Ashling, HP or Intel and most likely would have done all the programming in assembler. This time all the code was written in C, mostly on a 32 bit Linux VM. I only used the MPLAB IDE, which I ran in a Vista VM, to develop code that talked directly with the telemetry unit’s hardware. I was able to run both VMs at the same time using Parallels on my 17″ MacBook Pro. The Mac and telemetry unit both fit nicely into a Crumpler bag. It’s actually been 25 years since I used the HP development system, and it needed a clean room for its 5MB hard disk unit and controller. The Intel system, where everything was integrated into a single box, required two people to lift it.
I’m digressing from my visit to TAD Electronics. Over 2 days we compiled the software, uploaded it to the device and tested it, over and over again! The first task was to get the CyaSSL code working on the telemetry unit. Without a socket library we could not use any of the SSL functionality. We could, however, test the ctaocrypt library. All but one test passed. Ctaocrypt has a FASTMATH build option that uses a large integer arithmetic optimization (based on the TomsFastMath library). Turning off the FASTMATH build option got the failing test to pass.
To get the SSL code running we needed a Berkeley socket abstraction running on the telemetry unit. All the socket abstraction required was implementations of send and recv. The GPRS modem was configured to automatically open a TCP/IP connection to the stormmq server, so no connect function was required. The send function just had to send characters to the GPRS modem’s UART chip and the receive function had to interact with a simple interrupt handler that accepted characters from the modem’s UART. Once we got the send and receive functions working we were able to establish an SSL connection to the stormmq server. Woohoo!
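A minimal sketch of what such a pair of functions might look like, assuming the interrupt handler feeds a single-producer, single-consumer ring buffer. This is not the code from the project: the names `uart_rx_isr_push`, `modem_recv`, `modem_send` and `uart_write_byte` are hypothetical, and the hardware access is replaced by a stub so the sketch compiles on a host machine.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 256u /* power of two so indices can be masked */

/* Shared between the interrupt handler (producer) and modem_recv (consumer). */
static volatile uint8_t rx_buf[RX_BUF_SIZE];
static volatile uint16_t rx_head; /* written only by the interrupt handler */
static volatile uint16_t rx_tail; /* written only by modem_recv */

/* Hypothetical hook: a real interrupt handler would call this with the byte
 * it just read from the UART. Returns 0 if the buffer was full. */
int uart_rx_isr_push(uint8_t byte)
{
    uint16_t next = (uint16_t)((rx_head + 1u) & (RX_BUF_SIZE - 1u));
    if (next == rx_tail) {
        return 0; /* full: drop the byte rather than overwrite unread data */
    }
    rx_buf[rx_head] = byte;
    rx_head = next;
    return 1;
}

/* Non-blocking recv-style function: copies up to len buffered bytes into
 * dst and returns how many were copied (possibly 0). */
int modem_recv(void *dst, size_t len)
{
    uint8_t *out = dst;
    size_t n = 0;
    while (n < len && rx_tail != rx_head) {
        out[n++] = rx_buf[rx_tail];
        rx_tail = (uint16_t)((rx_tail + 1u) & (RX_BUF_SIZE - 1u));
    }
    return (int)n;
}

/* Host stand-in for writing one byte to the UART transmit register. */
static size_t tx_bytes_written;
static void uart_write_byte(uint8_t byte)
{
    (void)byte;
    tx_bytes_written++;
}

/* send-style function: pushes every byte to the UART, returns bytes sent. */
int modem_send(const void *src, size_t len)
{
    const uint8_t *in = src;
    size_t i;
    for (i = 0; i < len; i++) {
        uart_write_byte(in[i]);
    }
    return (int)len;
}
```

Because each index has exactly one writer, the receive path needs no locking in this sketch; an SSL library's I/O callbacks would then be pointed at functions of this shape.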
The following week when I got back to the office I still had a lot of work to do to reduce the memory footprint of CyaSSL on the telemetry device. The PIC32 processor had 256KB flash for program code and 64KB RAM for data. I reduced the CyaSSL RAM requirements by limiting the SSL send and receive buffers to 3KB and by removing support for zlib compression and session caching. I reduced the code size by building CyaSSL to only support TLS 1.1, TLS 1.2 and the AES cipher suites. I changed internal CyaSSL memory allocation functions to allocate from a static buffer rather than the heap. Using static memory (apart from relieving an old bias I have against using malloc in an embedded system) meant that the MPLAB tool could keep track of how much memory I was using.
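The idea of allocating from a static buffer can be sketched as a simple bump allocator. This is an illustration under stated assumptions, not the changes actually made to CyaSSL: `pool_alloc`, `pool_reset`, `pool_bytes_used`, the 4KB pool size and the 8-byte alignment are all hypothetical, and the sketch never frees individual blocks.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4096u /* hypothetical size; fixed at build time */
#define POOL_ALIGN 8u   /* must be a power of two */

/* The pool is an ordinary static object, so it is counted in the data
 * size reported at link time. The union forces 8-byte alignment. */
static union {
    uint8_t bytes[POOL_SIZE];
    uint64_t force_align;
} pool;
static size_t pool_used;

/* Returns an aligned block of at least size bytes, or NULL if the pool
 * cannot satisfy the request. */
void *pool_alloc(size_t size)
{
    size_t rounded;
    void *block;

    if (size == 0 || size > POOL_SIZE) {
        return NULL;
    }
    /* Round up so the next block also starts on an aligned offset. */
    rounded = (size + (POOL_ALIGN - 1u)) & ~(size_t)(POOL_ALIGN - 1u);
    if (rounded > POOL_SIZE - pool_used) {
        return NULL;
    }
    block = &pool.bytes[pool_used];
    pool_used += rounded;
    return block;
}

/* Releases everything at once, for example when a connection is torn down. */
void pool_reset(void)
{
    pool_used = 0;
}

size_t pool_bytes_used(void)
{
    return pool_used;
}
```

A bump allocator like this only suits allocations that share a lifetime; code that allocates and frees repeatedly in arbitrary order needs something closer to a real malloc.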
I was not able to remove all the dynamic memory allocations. The large math code uses dynamic allocation for big integers. I spent some time experimenting with how the code allocated memory but abandoned the work because I did not have the time, I was in danger of writing my own version of malloc, and I did not have sufficient test coverage of the large integer arithmetic code to risk breaking it. (TomsFastMath does not use the heap. It allocates memory for large integers on the stack. I did get the FASTMATH build option working, but only at the cost of using a large stack, so large that there was not enough memory left to do any data collection.)
At this stage I had an SSL library running on the wee device but I did not have enough free memory for a port of rabbitmq‐c. I decided to write what we now call amqp‐lite. Amqp‐lite supports a tiny subset of the AMQP protocol suitable for an embedded client that just transmits data.
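To give a feel for how little a transmit-only client needs at the wire level, here is a sketch of writing a single frame. It assumes the AMQP 0-9-1 general frame layout (type octet, 16-bit channel, 32-bit payload size, payload, then the frame-end octet 0xCE); the protocol version is an assumption, and `amqp_write_frame` is a hypothetical name rather than amqp‐lite’s actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    AMQP_FRAME_METHOD = 1,
    AMQP_FRAME_HEADER = 2,
    AMQP_FRAME_BODY = 3,
    AMQP_FRAME_END = 0xCE,
    AMQP_FRAME_OVERHEAD = 8 /* 7-byte frame header + 1 frame-end octet */
};

/* Writes one frame into out and returns its total length, or 0 if the
 * output buffer is too small. Multi-byte fields are big-endian. */
size_t amqp_write_frame(uint8_t *out, size_t out_cap, uint8_t type,
                        uint16_t channel, const void *payload,
                        uint32_t payload_len)
{
    if (out_cap < AMQP_FRAME_OVERHEAD ||
        payload_len > out_cap - AMQP_FRAME_OVERHEAD) {
        return 0;
    }
    out[0] = type;
    out[1] = (uint8_t)(channel >> 8);
    out[2] = (uint8_t)(channel & 0xFFu);
    out[3] = (uint8_t)(payload_len >> 24);
    out[4] = (uint8_t)(payload_len >> 16);
    out[5] = (uint8_t)(payload_len >> 8);
    out[6] = (uint8_t)(payload_len & 0xFFu);
    if (payload_len > 0) {
        memcpy(out + 7, payload, payload_len);
    }
    out[7 + payload_len] = AMQP_FRAME_END;
    return (size_t)payload_len + AMQP_FRAME_OVERHEAD;
}
```

Writing into a caller-supplied buffer keeps the encoder free of dynamic allocation, which matters on a device with 64KB of RAM.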
Once I got amqp‐lite working over SSL we had to test various failure scenarios. How would the software handle a broken connection whenever the telemetry unit ended up in a tunnel? Testing connection breaks in my office proved to be very challenging. At first I just unscrewed the antenna from the telemetry unit. I expected the device to stop transmitting data but it continued to operate. The GPRS signal at my office must be much better than I realised. To get the unit to drop a connection I had to first disconnect the antenna and then cover the entire unit with a tinfoil hat!
Testing unearthed one serious problem. The GPRS module would automatically reestablish a TCP/IP connection whenever it got a signal. However, every time it reopened a TCP/IP connection we had to reestablish the SSL connection. In the first pass at the send and recv implementation I only checked the state of the DCD line on the GPRS module’s UART when writing data. This was not reliable because a connection loss could go unnoticed while the main loop was busy elsewhere.
A new version of the interrupt handler was required that monitored the state of the DCD signal from the UART. Anytime there was an interrupt from the UART I sampled the state of the DCD line and updated a variable that reflected the state of the connection (disconnected; connecting; connected; …). Back in the main loop I checked the connection state before every IO operation. However, the code never seemed to see a state change. Two days were spent head scratching before I suspected the gcc optimizer. All the code to do GPRS IO, including the interrupt handler, was in one source file and, even though all variables used by the interrupt handler were volatile, the main loop did not see them changing value reliably. Moving the interrupt handler into its own file and declaring all the IO state variables extern volatile in the main loop suddenly fixed the problem. I’ve done a lot of work fixing race conditions in multi-threaded C++ systems but I was still caught out by the gcc optimizer turned up to 11.
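The arrangement described above, a connection state updated from the interrupt handler and checked by the main loop, can be sketched roughly as follows. The names (`conn_state`, `uart_isr_sample_dcd`, `conn_mark_ssl_ready`, `conn_ready_for_io`) are hypothetical, the DCD level is passed in as a parameter instead of being read from hardware, and the sketch does not attempt to reproduce the optimizer problem.

```c
#include <stdbool.h>

typedef enum {
    CONN_DISCONNECTED, /* no carrier: TCP/IP connection is down */
    CONN_CONNECTING,   /* carrier present, SSL handshake still needed */
    CONN_CONNECTED     /* SSL session established, safe to do IO */
} conn_state_t;

/* Shared with the main loop, which would declare these extern volatile. */
volatile conn_state_t conn_state = CONN_DISCONNECTED;
volatile unsigned conn_drops; /* number of carrier losses seen so far */

/* Called from the UART interrupt handler with the sampled DCD level. */
void uart_isr_sample_dcd(bool dcd_asserted)
{
    if (dcd_asserted) {
        if (conn_state == CONN_DISCONNECTED) {
            conn_state = CONN_CONNECTING;
        }
    } else {
        if (conn_state != CONN_DISCONNECTED) {
            conn_drops++;
        }
        conn_state = CONN_DISCONNECTED;
    }
}

/* Called by the main loop after a successful SSL handshake. On real
 * hardware this check-then-set would need interrupts masked so the
 * handler cannot run in between; that is omitted in this host sketch. */
void conn_mark_ssl_ready(void)
{
    if (conn_state == CONN_CONNECTING) {
        conn_state = CONN_CONNECTED;
    }
}

/* The main loop calls this before every IO operation. */
bool conn_ready_for_io(void)
{
    return conn_state == CONN_CONNECTED;
}
```

Keeping a separate connecting state captures the point made earlier: a fresh TCP/IP connection is not usable until the SSL session has been reestablished on top of it.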
The final problem was that Tristan needed to add some extra data collection code and did not have enough free flash memory for the code. Thankfully the fix was easy. I had added a lot of tracing to the amqp‐lite code and all the printf format strings added up. Removing the tracing code freed up enough memory for the new data collection functionality.
When I was first asked if putting an AMQP client on a small embedded device was possible, I was confident it could be done. And thanks to a slightly customized version of CyaSSL we managed to conserve just enough memory for the TAD Electronics application code. Further savings would have been possible but would have been expensive in either time or license fees. It’s been a long time since I did any embedded systems development and the entire project was the most fun I’ve had programming in years. Finally, I’d like to thank Tristan Dargay for his help, without which the project would not have been possible.