Model and system robustness in distributed CNN inference at the edge

X. Guo; Q. Jiang; A.D. Pimentel; T. Stefanov

doi:https://doi.org/10.1016/j.vlsi.2024.102299

Authors	X. Guo; Q. Jiang; A.D. Pimentel; T. Stefanov
Publication date	01-2025
Journal	Integration
Article number	102299
Volume | Issue number	100
Number of pages	9
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Prevalent large CNN models pose a significant challenge in terms of computing resources for resource-constrained devices at the Edge. Distributing the computations and coefficients over multiple edge devices collaboratively has been well studied, but these works generally do not consider the presence of device failures (e.g., due to temporary connectivity issues, overload, discharged battery of edge devices). Such unpredictable failures can compromise the reliability of edge devices, inhibiting the proper execution of distributed CNN inference. In this paper, we present a novel partitioning method, called RobustDiCE, for robust distribution and inference of CNN models over multiple edge devices. Our method can tolerate intermittent and permanent device failures in a distributed system at the Edge, offering a tunable trade-off between robustness (i.e., retaining model accuracy after failures) and resource utilization. We verify the system's robustness by validating the overall end-to-end latency under failures. We evaluate RobustDiCE using the ImageNet-1K dataset on several representative CNN models under various device failure scenarios and compare it with several state-of-the-art partitioning methods as well as an optimal robustness approach (i.e., full neuron replication). In addition, we demonstrate RobustDiCE's advantages in terms of memory usage and energy consumption per device, and system throughput for various system setups with different device counts.
Document type	Article
Language	English
Published at	https://doi.org/10.1016/j.vlsi.2024.102299 (Final published version)
UvA-DARE

Digital Academic Repository
