CAPSlog: Scalable Memory-Centric Partitioning for Pipeline Parallelism
| Authors | |
|---|---|
| Publication date | 2024 |
| Host editors | |
| Book title | 2024 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing |
| Book subtitle | PDP 2024 : Dublin, Ireland, 20-22 March 2024 : proceedings |
| ISBN | |
| ISBN (electronic) | |
| Event | 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2024 |
| Pages (from-to) | 17-25 |
| Number of pages | 9 |
| Publisher | Piscataway, NJ: IEEE Computer Society |
| Organisations | |
| Abstract | Pipeline-parallel training has emerged as a popular method to train large Deep Neural Networks (DNNs), as it allows the use of the combined compute power and memory capacity of multiple Graphics Processing Units (GPUs). However, with the sustained increase in Deep Learning (DL) model sizes, pipeline parallelism provides only a partial solution to the memory bottleneck in large-scale DNN training. Careful partitioning of the DL model over the available GPUs based on memory usage is required to further alleviate the memory bottleneck and train larger DNNs. mCAP is such a memory-oriented partitioning approach for pipeline-parallel systems, but it does not scale to models with many layers and very large hardware setups, as it requires extensive profiling and fails to efficiently navigate the partitioning space to find the most memory-friendly partitioning. In this work, we propose CAPSlog, a scalable memory-centric partitioning approach that can recommend model partitionings for larger and more heterogeneous DL models and for larger hardware setups than existing approaches. CAPSlog introduces a new profiling method and a new, much more scalable algorithm for recommending memory-efficient partitionings. CAPSlog reduces the profiling time by 67% compared to existing approaches, searches the partitioning space for the optimal solution orders of magnitude faster, and can train significantly larger models. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1109/PDP62718.2024.00012 |
| Other links | https://www.proceedings.com/74377.html https://www.scopus.com/pages/publications/85191747579 |
| Downloads | CAPSlog_Scalable_Memory-Centric_Partitioning_for_Pipeline_Parallelism (Final published version) |
