EASTER: Learning to Split Transformers at the Edge Robustly

Open Access
Authors
  • T. Stefanov
Publication date: November 2024
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume: 43, Issue: 11
Pages (from-to): 3626-3637
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Prevalent large transformer models present significant computational challenges for resource-constrained devices at the Edge. While distributing the workload of deep learning models across multiple edge devices has been studied extensively, existing works typically overlook the impact of edge-device failures. Unpredictable failures, due to, e.g., connectivity issues or discharged batteries, can compromise the reliability of inference serving at the Edge. In this article, we introduce a novel methodology, called EASTER, designed to learn distribution strategies for transformer models that are robust against device failures, considering the tradeoff between robustness (i.e., maintaining model functionality despite failures) and resource utilization (i.e., memory usage and computation). We evaluate EASTER with three representative transformers—ViT, GPT-2, and Vicuna—under device failures. Our results demonstrate EASTER's efficiency in memory usage and its potential for end-to-end latency improvement for inference across multiple edge devices, while preserving model accuracy as much as possible under device failures.
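To make the robustness notion in the abstract concrete, the sketch below is a minimal, hypothetical illustration (not EASTER's actual learning algorithm): transformer blocks are assigned to edge devices, critical blocks may be replicated, and a placement's robustness is scored as the fraction of single-device failures under which every block still survives on some device. All names (`placement`, `robustness`, the device labels) are invented for this example.

```python
def surviving_blocks(placement, failed):
    """Blocks still reachable when the devices in `failed` are down.

    `placement` maps block index -> list of devices holding a replica.
    """
    return {b for b, devs in placement.items()
            if any(d not in failed for d in devs)}

def is_functional(placement, n_blocks, failed):
    """Inference succeeds only if every block survives on some live device."""
    return surviving_blocks(placement, failed) == set(range(n_blocks))

def robustness(placement, n_blocks, devices):
    """Fraction of single-device failures the placement tolerates."""
    ok = sum(is_functional(placement, n_blocks, {d}) for d in devices)
    return ok / len(devices)

# Hypothetical 4-block model on 3 devices; blocks 0 and 3 are replicated,
# so only the failure of d0 is tolerated (d1 and d2 each hold a sole copy).
placement = {0: ["d0", "d1"], 1: ["d1"], 2: ["d2"], 3: ["d2", "d0"]}
print(robustness(placement, 4, ["d0", "d1", "d2"]))  # 1 of 3 failures tolerated
```

A learned strategy in this spirit would search over placements to trade such a robustness score against the extra memory that replication costs on each device.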
Document type Article
Language English
Published at https://doi.org/10.1109/TCAD.2024.3438995