ARM-CO-UP: ARM COoperative Utilization of Processors
| Authors | |
|---|---|
| Publication date | 09-2024 |
| Journal | ACM Transactions on Design Automation of Electronic Systems |
| Article number | 86 |
| Volume | 29 |
| Issue number | 5 |
| Number of pages | 30 |
| Organisations |
|
| Abstract |
HMPSoCs combine different processors on a single chip. They enable powerful embedded devices, which increasingly perform ML inference tasks at the edge. State-of-the-art HMPSoCs can perform on-chip embedded inference using different processors, such as CPUs, GPUs, and NPUs. HMPSoCs can potentially overcome the limitation of low single-processor CNN inference performance and efficiency by cooperative use of multiple processors. However, standard inference frameworks for edge devices typically utilize only a single processor. We present the ARM-CO-UP framework built on the ARM-CL library. The ARM-CO-UP framework supports two modes of operation: Pipeline and Switch. It optimizes inference throughput using pipelined execution of network partitions for consecutive input frames in the Pipeline mode. It improves inference latency through layer-switched inference for a single input frame in the Switch mode. Furthermore, it supports layer-wise CPU/GPU DVFS in both modes for improving power efficiency and energy consumption. ARM-CO-UP is a comprehensive framework for multi-processor CNN inference that automates CNN partitioning and mapping, pipeline synchronization, processor type switching, layer-wise DVFS, and closed-source NPU integration.
|
| Document type | Article |
| Language | English |
| Published at | https://doi.org/10.1145/3656472 |
| Downloads |
ARM-CO-UP
(Final published version)
|
| Permalink to this page | |
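
The difference between the two modes named in the abstract can be illustrated with a small cost model. The sketch below is not the ARM-CO-UP API and does not use ARM-CL. The per-layer latencies, the `SWITCH_MS` handover cost, and all function names are hypothetical and chosen only for illustration. It assumes that Pipeline-mode throughput is bounded by the slowest stage (ignoring transfer time between stages) and that Switch-mode latency is the sum of per-layer times plus a fixed cost for each change of processor.

```python
from itertools import product

# Hypothetical per-layer latencies in milliseconds (illustrative, not measured).
LAYER_MS = {
    "cpu": [4.0, 6.0, 5.0, 3.0],
    "gpu": [2.0, 3.0, 6.0, 4.0],
}
SWITCH_MS = 0.5  # assumed cost of handing a tensor to the other processor


def pipeline_throughput(split, first="gpu", second="cpu"):
    """Frames per second when layers [0, split) run on `first` and the
    remaining layers run on `second`, each stage working on a different frame."""
    stage1 = sum(LAYER_MS[first][:split])
    stage2 = sum(LAYER_MS[second][split:])
    bottleneck = max(stage1, stage2)  # the slowest stage limits throughput
    return 1000.0 / bottleneck


def switch_latency(assignment):
    """Latency in milliseconds for one frame when layer i runs on assignment[i]."""
    total = 0.0
    for i, proc in enumerate(assignment):
        if i > 0 and assignment[i - 1] != proc:
            total += SWITCH_MS  # pay once per change of processor
        total += LAYER_MS[proc][i]
    return total


def best_switch_assignment():
    """Exhaustively search layer-to-processor assignments for the lowest latency."""
    n_layers = len(LAYER_MS["cpu"])
    return min(product(LAYER_MS, repeat=n_layers), key=switch_latency)


if __name__ == "__main__":
    for split in range(1, 4):
        print(f"split after layer {split}: {pipeline_throughput(split):.1f} fps")
    best = best_switch_assignment()
    print("best layer assignment:", best, f"{switch_latency(best):.1f} ms")
```

Exhaustive search is only practical for a handful of layers and processors. It is used here to keep the example short, not because the paper's mapping method works this way.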
