How to put the “elephant” of machine learning into the “refrigerator” of the MCU?

To many people, artificial intelligence (AI) is still something out of science fiction, far removed from everyday life. But that is changing: over the next 5-10 years, AI will penetrate every aspect of our lives at a speed beyond our imagination. Why? Let’s take a closer look.

The Basic Paradigm of AIoT

The main reason AI feels distant is that, until recently, working with AI was a luxury. The machine learning (ML) that AI relies on demands enormous computing power for both training and inference. To meet that demand, cloud computing, which concentrates computing power for data processing, became the classic way to implement machine learning.

In the era of the Internet of Things, however, this model is being challenged: centralized cloud computing consumes large amounts of bandwidth and storage, continuous real-time data transmission drains power, the round trip between terminal and cloud introduces long latency, and both the transmission of data and its centralized storage in the cloud carry high security risks. These drawbacks have made it clear that pure cloud computing is not a panacea.

Edge computing has therefore been attracting more and more attention as a complement to classic cloud computing. Under edge computing, most computing tasks are processed directly on the edge device, and only when necessary is some preprocessed data sent to the cloud for “finishing”. This improves both the responsiveness and the level of intelligence on the edge side, while reducing the burden on network transmission channels and cloud data centers. Such a hybrid computing model can thus go a long way toward resolving the pain points of traditional cloud computing.

This change in computing architecture has also reshaped the machine learning model, shifting it from a computing-centric mode to a data-centric one. In the former, both training and inference are completed in the cloud data center; in the latter, the model is trained in the cloud while inference runs on the edge device. This split forms the basic paradigm for implementing the Artificial Intelligence of Things (AIoT).

Extending the boundaries of machine learning to MCUs

Edge computing has clearly expanded the boundaries of machine learning, moving it from the data-center room out to a more diverse intelligent network edge. For IoT applications, though, this still is not enough. Inference on edge devices still requires relatively strong computing power, usually a complex heterogeneous microprocessor with a dedicated ML co-processor for acceleration, and such a configuration already counts as “high-end”. That alone shuts many applications that are sensitive to power consumption, cost, and real-time performance out of machine learning.

If machine learning wants to keep expanding its territory, then, one of the main directions is to enable microcontrollers (MCUs), with their simpler resources and more limited computing power, to run machine learning. Research data from IC Insights shows that global MCU shipments were 28.1 billion units in 2018 and will grow to 38.2 billion by 2023, with a global installed base in the hundreds of billions. Whoever can bring machine learning to devices at that scale will find the opportunity, and the market, almost limitless.

But as with any dream, reality is often harsher. Deploying machine learning to run on an MCU is like stuffing an elephant into a refrigerator, and here the answer is definitely not a brain teaser: it requires careful work along two technical dimensions.

Slimming down machine learning models

The first dimension is to “slim down” the “elephant”, the ML model itself: that is, to develop the techniques needed to produce “miniaturized” machine learning inference models that can be deployed and run on a microcontroller. Such a slimmed-down model needs to satisfy conditions including:

- Terminal power consumption while running the model at the mW level or even lower;
- Memory footprint generally under a few hundred kB;
- Inference time at the ms level, generally completing within 1 s.

TinyML technology emerged to meet exactly this goal. As the name suggests, it is a set of techniques for making ML models “smaller”. Like the basic AIoT machine-learning paradigm described above, TinyML still collects data and trains models in the cloud; the difference lies in how the model is optimized and deployed after training. To fit within the MCU’s limited computing resources, TinyML must “deeply compress” the model through a series of steps: distillation, quantization, encoding, and compilation. Only then can it be deployed on an edge terminal.


Figure 1: Schematic diagram of deploying TinyML in an embedded device (Image source: Network)

Among them, some key technologies include:

Distillation: modifying the trained model to create a more compact representation, using techniques such as pruning and knowledge distillation.
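
To make pruning concrete, here is a plain-Python sketch of magnitude-based pruning, one of the techniques just mentioned: the smallest-magnitude weights are zeroed out, leaving a sparse model that compresses far better. The function name and the 50% sparsity default are illustrative choices, not taken from any particular framework:

```python
def prune_weights(weights, sparsity=0.5):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # how many weights to drop
    # Indices of the weights we keep: everything except the k smallest by |w|
    keep = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[k:])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]

w = [0.9, -0.01, 0.5, 0.02, -0.7, 0.03]
print(prune_weights(w, 0.5))  # → [0.9, 0.0, 0.5, 0.0, -0.7, 0.0]
```

Real toolchains (for example, the TensorFlow Model Optimization Toolkit) prune per-layer and gradually during retraining so accuracy can recover, but the core idea is this simple thresholding.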

Quantization: after distillation, approximating the model’s 32-bit floating-point values with a data type of fewer bits (typically 8-bit integers), which reduces model size and memory consumption and speeds up inference, all within an acceptable loss of accuracy.
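
A minimal sketch of the idea, assuming the common affine scheme: each float is mapped onto an 8-bit integer through a scale and a zero point, and mapped back (approximately) on dequantization. Function names here are illustrative; production toolchains such as TensorFlow Lite apply this per-tensor or per-channel:

```python
def quantize(values, num_bits=8):
    """Affine-quantize a list of floats to unsigned num_bits integers."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0   # guard against all-equal input
    zero_point = round(qmin - lo / scale)       # integer that represents 0.0
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(qi - zero_point) * scale for qi in q]

q, s, z = quantize([-1.0, 0.0, 1.0])
print(q)  # → [0, 128, 255]
```

Each dequantized value differs from the original by at most one quantization step (the scale), which is the "acceptable accuracy loss" mentioned above.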

Coding: storing the model’s data with a more efficient encoding (such as Huffman coding) to further reduce its size.
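
Huffman coding helps here because quantized weights are highly repetitive: frequent values can be assigned shorter bit patterns. A stdlib-only sketch of building such a code table (illustrative, not code from any specific TinyML tool):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a prefix-free code table mapping each symbol to a bit string."""
    freq = Counter(data)
    if len(freq) == 1:                       # degenerate one-symbol case
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tiebreaker, {symbol: partial code})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)      # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        count += 1
        heapq.heappush(heap, (f1 + f2, count, merged))
    return heap[0][2]

codes = huffman_codes("aaaabbc")
print(codes["a"])  # the most frequent symbol gets the shortest code
```

For "aaaabbc", the symbol `a` gets a 1-bit code while `b` and `c` get 2 bits, so the 56-bit ASCII string shrinks to 10 bits plus the table.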

Compilation: the model compressed in the steps above is finally converted into C or C++ code usable by most MCUs and executed on the device through a lightweight interpreter such as TensorFlow Lite or TensorFlow Lite Micro.
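
On the deployment side, a common pattern (used in TensorFlow Lite Micro examples, usually via the `xxd -i` command) is to embed the final model file as a C byte array that the firmware links directly into flash. A hypothetical Python helper that emits such an array might look like this; the `g_model` name mirrors the TFLite Micro convention but is otherwise arbitrary:

```python
def model_to_c_array(blob, name="g_model"):
    """Render a model binary as C source defining a byte array and its length."""
    lines = [f"const unsigned char {name}[] = {{"]
    for i in range(0, len(blob), 12):        # 12 bytes per line, xxd-style
        chunk = ", ".join(f"0x{b:02x}" for b in blob[i:i + 12])
        lines.append("  " + chunk + ",")
    lines.append("};")
    lines.append(f"const unsigned int {name}_len = {len(blob)};")
    return "\n".join(lines)

print(model_to_c_array(bytes([1, 2, 255])))
```

The on-device interpreter then reads the model straight from this array, so no file system is needed on the MCU.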

Over the past two years, TinyML has clearly been heating up, and manufacturers are increasing their investment in the field. Silent Intelligence forecasts that within the next 5 years TinyML will generate more than 70 billion US dollars in economic value, growing at a compound annual rate of more than 27.3%.

Creating a new species: the machine learning MCU

Besides working on the “elephant” (that is, the ML model), putting the “elephant in the refrigerator” requires effort along a second dimension: transforming the “refrigerator” itself, that is, optimizing the familiar MCU so that it fits the needs of running ML.

For example, to meet the need for complex machine learning functions in IoT edge devices, Maxim Integrated has introduced the MAX78000, a dedicated low-power ML microcontroller. With a built-in Arm Cortex-M4F processor (100 MHz), a 32-bit RISC-V co-processor (60 MHz), and a convolutional neural network accelerator supporting up to 64 network layers, the device can perform AI inference in battery-powered applications while consuming only microjoules of energy. Compared with a traditional software solution, this hardware-accelerated approach cuts the energy consumption of complex AI inference to 1% of the former while running inference up to 100 times faster.

New species of ML-capable MCUs like this are expected to become an important branch in the product roadmaps of MCU manufacturers.

Figure 2: The MAX78000, a low-power ML microcontroller from Maxim Integrated (Credit: Maxim)

Summary of this article

In summary, compared with embedded computing architectures based on microprocessors or x86, MCUs offer low power consumption, low cost, short development cycles, fast time to market, good real-time performance, and enormous market volume. Combining them with the power of machine learning opens up undoubtedly huge room for imagination.

In promoting this “combination”, if developers can be given a “new species” of MCU that supports machine learning, along with a complete tool chain that makes optimizing and deploying ML models easier, then putting the “elephant” of machine learning into the “refrigerator” of the MCU will become an easy task.

More importantly, this trend has only just emerged, and you have every opportunity to be an early bird and fly freely in this new field.

Author: Yoyokuo