Energy-efficient Deployment of Deep Learning Applications on Cortex-M based Microcontrollers using Deep Compression

Conference: MBMV 2023 – Methoden und Beschreibungssprachen zur Modellierung und Verifikation von Schaltungen und Systemen - 26. Workshop
March 23–24, 2023, in Freiburg

Proceedings: ITG-Fb. 309: MBMV 2023

Pages: 12
Language: English
Type: PDF

Authors:
Deutel, Mark; Teich, Juergen (Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany)
Woller, Philipp; Mutschler, Christopher (Fraunhofer IIS, Fraunhofer Institute for Integrated Circuits IIS, Nuremberg, Germany)

Abstract:
Large Deep Neural Networks (DNNs) are the backbone of today’s artificial intelligence due to their ability to make accurate predictions when trained on large data sets. With advancing technologies such as the Internet of Things, interpreting the large amounts of data generated by sensors is becoming an increasingly important task. However, in many applications not only the predictive performance but also the energy consumption of deep learning models is of great interest. This paper investigates the efficient deployment of deep learning models on resource-constrained microcontroller architectures via network compression. We present a methodology for systematically exploring different DNN pruning, quantization, and deployment strategies for different microcontroller architectures, with a focus on low-power ARM Cortex-M-based systems. The exploration allows us to analyze trade-offs between key metrics such as accuracy, memory consumption, execution time, and power consumption. We discuss experimental results on three different DNN architectures and show that we can compress them to less than 10% of their original parameter count before their prediction quality degrades. This also allows us to deploy and evaluate them on Cortex-M-based microcontrollers.
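To make the two compression steps named in the abstract concrete, the following is a minimal sketch of magnitude-based pruning followed by 8-bit quantization, assuming PyTorch as the training framework. The example network, the 90% sparsity target, and the use of dynamic quantization are illustrative assumptions for this sketch, not the authors' actual toolchain.

```python
# Sketch: compress a model to <10% of its original parameter count via
# global magnitude pruning, then quantize weights to 8 bit.
# Assumptions: PyTorch; a toy model standing in for the paper's three DNNs.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy example network (the paper's actual architectures are not shown here).
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 26 * 26, 10),
)

# Step 1: global magnitude pruning. Removing 90% of the weights leaves
# fewer than 10% of the original parameters, the threshold the abstract
# reports before prediction quality degrades.
to_prune = [(m, "weight") for m in model.modules()
            if isinstance(m, (nn.Conv2d, nn.Linear))]
prune.global_unstructured(to_prune,
                          pruning_method=prune.L1Unstructured,
                          amount=0.9)
for module, name in to_prune:
    prune.remove(module, name)  # bake the pruning mask into the weights

# Step 2: 8-bit quantization (dynamic quantization here, for brevity)
# to shrink weights from 32-bit floats to int8 before deployment.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)

# Report the remaining (nonzero) parameter fraction after pruning.
nonzero = sum(int((p != 0).sum()) for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"remaining parameters: {nonzero}/{total} ({100 * nonzero / total:.1f}%)")
```

On the device side, a model compressed this way would typically be executed through a microcontroller inference runtime such as TensorFlow Lite for Microcontrollers or CMSIS-NN; which deployment path the paper evaluates is not stated in this abstract.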