Tips and tricks for deploying TinyML
A typical TinyML deployment has many software and hardware requirements, and there are best practices that developers should be aware of to help simplify this complicated process.
TinyML is a generic approach for shrinking AI models and applications to run on smaller devices, including microcontrollers, cheap CPUs and low-cost AI chipsets.
While most AI development tools focus on building bigger and more capable models, deploying TinyML models requires developers to think about doing more with less. TinyML applications are often designed to run on battery-powered devices with a power budget of milliwatts, a few hundred kilobytes of RAM and slow clock speeds. Teams need to do more upfront planning to meet these stringent requirements. TinyML app developers need to consider hardware, software and data management, and how these pieces will fit together during prototyping and scaling up.
ABI Research predicts the number of TinyML devices will grow from 15.2 million shipments in 2020 to a total of 2.5 billion by 2030. This promises many opportunities for developers who have learned how to deploy TinyML applications.
Sang Won Lee, CEO of embedded AI platform Qeexo, said, "Most of the work is similar to building a typical ML model, but there are two extra steps with TinyML: converting the model to C code and compiling for the target hardware." This is because TinyML deployments are geared toward small microcontrollers, which are not designed to run heavy Python code.
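One common way to carry out the "convert to C" step is to embed the serialized model as a byte array in a C source file, which is then compiled into the firmware. The following is a minimal sketch of that idea, not Qeexo's method or any specific toolchain's output. The function name `bytes_to_c_array`, the array name `g_model` and the placeholder bytes are all assumptions made for illustration.

```python
def bytes_to_c_array(data: bytes, name: str = "g_model", per_line: int = 12) -> str:
    """Render raw model bytes as C source: a byte array plus its length."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        # Each byte becomes a two-digit hex literal such as 0x0a.
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )


if __name__ == "__main__":
    # Placeholder bytes stand in for a real serialized model file.
    placeholder_model = bytes([1, 2, 3, 255])
    print(bytes_to_c_array(placeholder_model))
```

In practice, the bytes would be read from the exported model file, and the generated source would be added to the firmware project and compiled for the target board.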
It is also essential to plan how TinyML applications might deliver varying results in different environments. Lee said TinyML applications generally work with sensor data that's heavily dependent on the surrounding environment. When the environment changes, the sensor data changes as well. As a result, teams need to plan to reoptimize models in different environments.
What is involved in getting started with TinyML?
AI developers may want to brush up on C/C++ and embedded systems programming to understand the basics of deploying TinyML software on constrained hardware.
"Some familiarity with general principles of machine learning, embedded systems programming, microcontrollers and working with hardware microcontroller boards is needed," said Qasim Iqbal, chief software architect at autonomous submarine developer Terradepth.
Several pieces of equipment are needed to get started. First, a development board is required; good options include the Arduino Nano 33 BLE Sense, the SparkFun Edge and the STMicroelectronics STM32 Discovery Kit. Second, a laptop or desktop computer with a USB port is needed for interfacing. Third, it's fun to experiment by equipping hardware with a microphone, accelerometer or camera. Finally, Keras software packages and Jupyter Notebooks might be needed for training a model on a separate computer before that model is moved to a microcontroller for execution and inference.
Iqbal also recommends learning preprocessing tools that transform raw input data so it can be fed to a TensorFlow Lite Interpreter. Then, a post-processing module can take the model's inferences, interpret them and make decisions. Once this is completed, an output handling stage can be implemented to respond to predictions using device hardware and software capabilities.
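The four stages described above (preprocessing, inference, post-processing and output handling) can be sketched as plain functions. This is an illustration under stated assumptions rather than Iqbal's implementation: `stub_model` is a hypothetical stand-in for a real interpreter call, and the labels, input range and threshold are invented for the example.

```python
from typing import Callable, List, Sequence


def preprocess(raw: Sequence[float], lo: float, hi: float) -> List[int]:
    """Clip raw readings to [lo, hi] and scale them to int8 values (-128..127)."""
    out = []
    for x in raw:
        clipped = min(max(x, lo), hi)
        scaled = (clipped - lo) / (hi - lo)  # 0.0 .. 1.0
        out.append(int(round(scaled * 255)) - 128)
    return out


def postprocess(scores: Sequence[float], labels: Sequence[str], threshold: float) -> str:
    """Pick the top-scoring label, or 'unknown' if its score is below the threshold."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best] if scores[best] >= threshold else "unknown"


def handle_output(label: str) -> str:
    """Stand-in for a device action such as lighting an LED or sending a message."""
    return f"action:{label}"


def stub_model(features: Sequence[int]) -> List[float]:
    """Hypothetical model: scores 'shake' higher when the mean amplitude is large."""
    energy = sum(abs(f) for f in features) / (128 * len(features))
    return [1.0 - energy, energy]


def run_pipeline(
    raw: Sequence[float],
    model: Callable[[Sequence[int]], Sequence[float]],
    labels: Sequence[str],
    lo: float = -1.0,
    hi: float = 1.0,
    threshold: float = 0.6,
) -> str:
    features = preprocess(raw, lo, hi)   # raw sensor data -> model input
    scores = model(features)             # inference (stubbed here)
    label = postprocess(scores, labels, threshold)
    return handle_output(label)


if __name__ == "__main__":
    print(run_pipeline([1.0, -1.0, 1.0, -1.0], stub_model, ["still", "shake"]))
```

Keeping the stages separate makes it easier to swap the stub for a real interpreter later without touching the sensor or output code.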
Before getting too serious, a few demo projects can help developers understand the implications of various TinyML constraints. In addition to limitations on RAM and clock speed, developers may also want to explore the limits of the stripped-down Linux distributions that run on their target platforms. These distributions often provide only limited support for the OS features and system libraries that developers would expect on larger Linux-based systems.
"Judicious decisions regarding the right device hardware, software support, machine learning model architecture and general software considerations are important," Iqbal said.
It's helpful to investigate whether a microcontroller will support the intended app or if larger devices, such as Nvidia's Jetson series of devices, might work better.
Combining hardware and software
Developers learning about TinyML software might consider investigating the community behind each TinyML tool before getting too attached to any particular one.
"Quite often, you won't be able to find answers to your questions in the official documentation," said Jakub Lukaszewicz, head of AI for construction technology platform AI Clearing. Lukaszewicz often found himself resorting to browsing the internet, Stack Overflow or specialized forums to find answers. If the ecosystem around the platform is sufficiently big and active, it's easier to find people who have similar problems and learn how they address them.
It's also helpful to investigate the available hardware before diving in too deeply.
"The sad news is that in the post-pandemic reality, delivery times can be long and you may be left with a limited choice of what is currently available on the shelf," Lukaszewicz said.
After getting the board, the next step is choosing the ML framework to work with. Lukaszewicz said TensorFlow Lite is currently the most popular framework, but PyTorch Mobile is gaining traction. Finally, you want to find tutorials or dummy projects using the ML framework and board of your choosing to see how the pieces fit together.
Watch out for changes in the frameworks and hardware that may create issues. Lukaszewicz has often struggled with outdated documentation and things not working as they should.
"It is often the case that the platform was tested against a given version of a framework, such as TensorFlow Lite, but struggled with the newest one," he said.
In such cases, he recommends downgrading to the latest supported version of the framework and rerunning your model.
Another problem is dealing with unsupported operations or insufficient memory to fit the model. Ideally, developers should take an off-the-shelf model and run it on a microcontroller without too much hassle. "Unfortunately, this is often not the case with TinyML," Lukaszewicz said.
He recommends first trying out models that have been proven to work on the board of your choice. He often discovered that a state-of-the-art model uses some mathematical operations that are not yet supported on certain devices. In such a scenario, you would have to change network architecture, replace those operations with supported ones and retrain the model, hoping all this would not sacrifice its quality. Reading forums and tutorials is a great way to see what works and what doesn't work on a given platform.
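A quick pre-flight check can surface both problems, unsupported operations and insufficient memory, before any time is spent flashing a board. The sketch below assumes you already have the list of operations the model uses and the set your target runtime supports. The function name `check_model`, the operation names and every size and budget shown are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FitReport:
    unsupported_ops: List[str]
    fits_flash: bool
    fits_ram: bool

    @property
    def ok(self) -> bool:
        """True only if every op is supported and both memory budgets are met."""
        return not self.unsupported_ops and self.fits_flash and self.fits_ram


def check_model(
    model_ops: Iterable[str],
    supported_ops: Iterable[str],
    model_bytes: int,
    working_ram_bytes: int,
    flash_budget: int,
    ram_budget: int,
) -> FitReport:
    """Compare a model's ops and memory needs against a target's limits."""
    unsupported = sorted(set(model_ops) - set(supported_ops))
    return FitReport(
        unsupported_ops=unsupported,
        fits_flash=model_bytes <= flash_budget,
        fits_ram=working_ram_bytes <= ram_budget,
    )


if __name__ == "__main__":
    # All values below are illustrative, not measurements of a real board.
    report = check_model(
        model_ops=["CONV_2D", "SOFTMAX", "GELU"],
        supported_ops={"CONV_2D", "SOFTMAX", "FULLY_CONNECTED"},
        model_bytes=200_000,
        working_ram_bytes=90_000,
        flash_budget=256_000,
        ram_budget=64_000,
    )
    print(report, "ok:", report.ok)
```

Any entry in `unsupported_ops` points at a layer that would need to be replaced and the model retrained, as described above.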
Choosing the right hardware
Lukaszewicz said there are several choices when it comes to selecting the appropriate hardware and configurations. Popular choices include ecosystems like Raspberry Pi, ESP32, STM32 and Arduino. He has found the following questions helpful in defining requirements:
- What form factor do you want? Do you need the smallest board available, or could you go with a bigger one?
- How about power consumption? Will your solution rely solely on batteries, or will you have an external power supply?
- Do you want quiet and low-maintenance passive cooling or is active cooling OK?
- How much RAM do you need?
- What built-in sensors should be on the board?
- What are the required ports and interfaces?
- What is the acceptable price range?
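Answers to the questions above can be written down as explicit requirements and used to shortlist candidate boards. The following sketch shows one way to do that. The board names and specifications are invented placeholders, not data about real products, and the `Board`, `Requirements` and `shortlist` names are assumptions made for this example. Only a few of the questions are modeled.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Board:
    name: str
    ram_kb: int
    price_usd: float
    battery_friendly: bool
    sensors: FrozenSet[str]


@dataclass(frozen=True)
class Requirements:
    min_ram_kb: int
    max_price_usd: float
    needs_battery: bool
    sensors: FrozenSet[str]


def shortlist(boards: Iterable[Board], req: Requirements) -> List[str]:
    """Return the names of boards that satisfy every stated requirement."""
    return [
        b.name
        for b in boards
        if b.ram_kb >= req.min_ram_kb
        and b.price_usd <= req.max_price_usd
        and (b.battery_friendly or not req.needs_battery)
        and req.sensors <= b.sensors  # every required sensor is built in
    ]


if __name__ == "__main__":
    # Hypothetical boards with made-up specs, for illustration only.
    candidates = [
        Board("board-a", 256, 30.0, True, frozenset({"microphone", "accelerometer"})),
        Board("board-b", 512, 60.0, False, frozenset({"camera"})),
        Board("board-c", 128, 15.0, True, frozenset({"microphone"})),
    ]
    req = Requirements(200, 40.0, True, frozenset({"microphone"}))
    print(shortlist(candidates, req))
```

Form factor, cooling and port requirements could be added as further fields in the same way.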
Deploying AI to the edge
Developers need to consider all viable approaches when deploying TinyML as a robust and scalable application rather than as a proof of concept. Building a TinyML application to scale starts with drafting a detailed description of the application and its requirements. This will help guide the selection of sensors, hardware and software.
"Typically, a business will start with a use case that's driving them toward TinyML and from there begin to identify a solution that meets their needs," said Jason Shepherd, VP of ecosystem at Zededa and a Linux Foundation Edge board member.
Given the constrained nature of the devices involved, there is an extremely tight coupling between the software and the capabilities of the underlying hardware. This requires deep knowledge of embedded software development for compute optimization and power management. Shepherd said that organizations often build TinyML applications directly instead of buying the infrastructure, particularly in the early stages.
This is a great way to learn how all the pieces fit together, but many teams discover it is more complicated than they thought, particularly when they have to sort out the details of deploying AI to the edge at scale, supporting it through its entire lifecycle and integrating it with additional use cases in the future. It's worth investigating new tools from vendors like Latent AI and Edge Impulse that simplify the development of AI models optimized for the silicon they are deployed on.
Companies that decide to build these apps in-house need a mix of embedded hardware and software developers who understand the inherent tradeoffs when working with highly constrained hardware. Shepherd said key specialties should include the following:
- understanding model training and optimization;
- developing efficient software architecture and code;
- optimizing power management;
- dealing with constrained radio and networking technologies; and
- implementing security without the resources available on more capable hardware.
To succeed in the long run, enterprises must consider the privacy and safety implications of deploying TinyML applications in the field. Although TinyML applications show promise, they could also create new problems, and provoke pushback, if companies are not careful.
"The success of edge AI overall and TinyML will require our concerted collaboration and alignment to move the industry forward while protecting us from potential misuse along the way," Shepherd said.