AI on Microcontrollers

FOSDEM 2018

Deep learning deployment today is normally limited to GPU clusters because of its computational requirements. For AI to be truly ubiquitous, its cost and energy efficiency need to improve. Building on recent developments in algorithms and MCUs, we introduce a deep-learning inference framework that runs TensorFlow models on MCU-powered devices (with Mbed). Compared with GPUs and mobile CPUs, MCU-based devices are far more cost- and power-efficient. We believe this will open a new paradigm for AI and edge computing.

uTensor is a machine learning framework designed for IoT and embedded systems. Based on Mbed and TensorFlow, it makes it possible to run deep-learning models on an RTOS with a memory requirement of less than 256kB. Its binary size, ~50kB, is a desirable trait for most cost-sensitive systems.

Compared with a typical ML environment today, running on GPUs and APUs, the resource reduction uTensor offers is significant. This is achieved with techniques such as quantization, garbage collection, minimal code dependencies and network-architecture design.
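As an illustration of the first technique, the sketch below shows a generic 8-bit affine quantization of a few floating-point weights. It is a minimal example under stated assumptions, not uTensor's actual kernel code: the function names `quantize` and `dequantize` and the sample weights are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using a scale and an offset (assumed scheme)."""
    lo, hi = min(values), max(values)
    levels = (1 << num_bits) - 1              # 255 representable steps for 8 bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo


def dequantize(q, scale, lo):
    """Recover approximate floats from the quantized integers."""
    return [lo + scale * x for x in q]


# Hypothetical weights, chosen only for illustration.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, lo = quantize(weights)
restored = dequantize(q, scale, lo)

# The round-trip error is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now occupies one byte instead of four, at the cost of a rounding error of at most half a quantization step.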

The current framework consists of:

Operators: quantized kernels where most of the computations take place
Tensors: data structure and memory abstraction
Context: graph definition and resource management
Code generator: a TensorFlow-to-uTensor exporter (WIP)
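To make the division of labour between these components concrete, here is a toy sketch of how a context might own tensors, run operators, and free inputs once no remaining operator needs them. All class, method and tensor names are hypothetical and chosen for illustration. This is not uTensor's API, and reference counting is only one possible way to implement the resource management described above.

```python
class Tensor:
    """Holds data plus a count of operators that still need to read it."""

    def __init__(self, data, ref_count):
        self.data = data
        self.ref_count = ref_count


class Context:
    """Toy resource manager: stores tensors and releases them when unused."""

    def __init__(self):
        self.tensors = {}

    def add(self, name, data, ref_count):
        self.tensors[name] = Tensor(data, ref_count)

    def run(self, op, input_names, output_name, output_refs):
        inputs = [self.tensors[n].data for n in input_names]
        self.add(output_name, op(*inputs), output_refs)
        # Release every input that no later operator will read.
        for n in input_names:
            tensor = self.tensors[n]
            tensor.ref_count -= 1
            if tensor.ref_count == 0:
                del self.tensors[n]


# Two toy operators working on integer lists.
def add_op(a, b):
    return [x + y for x, y in zip(a, b)]


def relu_op(a):
    return [max(0, x) for x in a]


ctx = Context()
ctx.add("x", [1, -4, 3], ref_count=1)
ctx.add("b", [-2, 1, 1], ref_count=1)
ctx.run(add_op, ["x", "b"], "sum", output_refs=1)
ctx.run(relu_op, ["sum"], "out", output_refs=1)
remaining = sorted(ctx.tensors)
```

After both operators have run, only the final output tensor is still held by the context. This kind of early release is what keeps peak memory low on a small device.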

The framework serves as an environment for testing the latest ideas in edge computing, and we hope it will eventually empower developers to create next-generation smart devices. Currently, the framework can run a 3-layer MLP on the MNIST dataset with an accuracy of 97.1%. More operators, tools and optimizations are in the works.
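A rough back-of-the-envelope calculation shows why quantization matters for a model of this kind. The abstract does not state the layer sizes, so the hidden widths below (128 and 64) are assumptions picked purely for illustration. Only the 784 inputs and 10 outputs follow from MNIST itself. The sketch also simplifies by storing every parameter, biases included, in one byte when quantized.

```python
# Layer widths: 784 MNIST pixels in, 10 digit classes out.
# The hidden sizes 128 and 64 are hypothetical, not taken from the talk.
layer_sizes = [784, 128, 64, 10]

# Each fully connected layer has (inputs * outputs) weights plus one bias per output.
total_params = sum(
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
)

bytes_fp32 = total_params * 4   # 32-bit floating-point storage
bytes_int8 = total_params * 1   # 8-bit quantized storage (simplified)

budget = 256 * 1024             # the 256kB figure quoted for uTensor

print(f"parameters: {total_params}")
print(f"float32: {bytes_fp32 / 1024:.1f} KiB, fits budget: {bytes_fp32 <= budget}")
print(f"8-bit:   {bytes_int8 / 1024:.1f} KiB, fits budget: {bytes_int8 <= budget}")
```

Under these assumed sizes the float32 parameters alone exceed 256kB, while the 8-bit version fits with room to spare for activations.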

This talk aims to share the architecture, design decisions and backstories from the development. We hope the information will help those who wish to contribute to or build similar projects in the future.

Speakers: Neil Tan