SwiftSNNI: Optimized Scheduling for Secure Neural Network Inference (SNNI) on Multi-Core Systems

Open Access
Authors
Publication date 2026
Book title ICPE '26
Book subtitle Proceedings of the 17th ACM/SPEC International Conference on Performance Engineering : May 4-8, 2026, Florence, Italy
ISBN (electronic)
  • 9798400723254
Event 17th ACM/SPEC International Conference on Performance Engineering
Pages (from-to) 120-134
Number of pages 15
Publisher New York, NY: Association for Computing Machinery
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Secure Neural Network Inference (SNNI) enables privacy-preserving inference on encrypted data with strong cryptographic guarantees. However, practical deployments suffer from high preprocessing overhead, significant communication costs, and sequential execution. These limitations lead to low throughput, underutilized system resources, long queueing delays, and poor scalability. This work introduces SwiftSNNI, a unified, resource-aware scheduling framework for SNNI. It implements a hybrid offline–online strategy that orchestrates offline preprocessing (Tpre,i) and online inference (Ton,i) jobs to maximize parallelism. By formulating SNNI scheduling as a constrained optimization problem, SwiftSNNI overlaps Tpre,i phase execution of future requests with active Ton,j, jobs. SwiftSNNI also incorporates optional advance notices to enable proactive Tpre,i, which further reduces average input delay (D). Evaluations using five benchmark neural networks (M1, M2, HiNet, AlexNet, VGG-16) under diverse workloads and stochastic arrival rates confirm substantial performance gains. Compared to a parallelized sequential baseline (MS-SHARK), SwiftSNNI achieves up to 97% lower average input delay (D), a 81% reduction in makespan (≈ 5.4 × speedup), and delivers 5.6 × increase in throughput. Furthermore, SwiftSNNI reduces average waiting time (W) by over 99%, demonstrating robust starvation prevention for high-concurrency workloads. SwiftSNNI supports concurrent execution, scales to larger neural networks, and provides an efficient runtime for SNNI deployments. The SwiftSNNI implementation is available online.
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/3777884.3797005
Downloads
SwiftSNNI_ACM-vF2 (Accepted author manuscript)
SwiftSNNI (Final published version)
Permalink to this page
Back