A tool for bottleneck analysis and performance prediction for GPU-accelerated applications

Authors
Publication date 2016
Book title 2016 IEEE 30th International Parallel and Distributed Processing Symposium Workshops : IPDPSW 2016
Book subtitle proceedings : 23-27 May 2016, Chicago, Illinois
ISBN
  • 9781509036820
  • 9781509036837
ISBN (electronic)
  • 9781509021406
Event 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016
Pages (from-to) 641-652
Number of pages 12
Publisher Los Alamitos, California: IEEE Computer Society
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
  • Faculty of Science (FNWI)
Abstract

High-level tools for analyzing and predicting the performance of GPU-accelerated applications are scarce at best. Although performance modeling approaches for GPUs exist, their complexity makes them virtually impossible to use to quickly analyze the performance of real-life applications and obtain easy-to-use, readable feedback. This is why, although GPUs are significant performance boosters in many HPC domains, performance prediction is still based on extensive benchmarking, and performance bottleneck analysis remains a nonsystematic, experience-driven process. In this context, we propose a tool for bottleneck analysis and performance prediction for GPU-accelerated applications. Based on random forest modeling and using hardware performance counter data, our method can be used to quickly and accurately evaluate application performance on GPU-based systems for different problem characteristics and different hardware generations. We illustrate the benefits of our approach with three detailed use cases: a simple step-by-step example on a parallel reduction kernel, and two classical benchmarks (matrix multiplication and sequence alignment). Our results so far indicate that our statistical modeling is a quick, easy-to-use method for grasping the performance characteristics of applications running on GPUs. Our current work focuses on tackling some of its applicability limitations (more applications, more platforms) and improving its usability (full automation from input to user feedback).

Document type Conference contribution
Language English
Published at https://doi.org/10.1109/IPDPSW.2016.198
Other links https://www.scopus.com/pages/publications/84991661648