TY - JOUR
T1 - Predicting number of threads using balanced datasets for openMP regions
AU - Alcaraz, Jordi
AU - TehraniJamsaz, Ali
AU - Dutta, Akash
AU - Sikora, Anna
AU - Jannesari, Ali
AU - Sorribes, Joan
AU - Cesar, Eduardo
N1 - Funding Information:
This work has been supported by the Ministerio de Ciencia e Innovación MCIN AEI/10.13039/501100011033 under contract PID2020-113614RB-C21 and by the Generalitat de Catalunya GenCat-DIUiE (GRC) project 2017-SGR-313. We would like to thank the Research IT team (https://researchit.las.iastate.edu/) of Iowa State University for their continuous support in providing access to HPC clusters for conducting the experiments of this research project.
Publisher Copyright:
© 2022, The Author(s).
PY - 2022/4/30
Y1 - 2022/4/30
N2 - Incorporating machine learning into automatic performance analysis and tuning tools is a promising path to tackle the increasing heterogeneity of current HPC applications. However, this introduces the need for generating balanced datasets of parallel applications’ executions and for dealing with natural imbalances for optimizing performance parameters. This work proposes a holistic approach that integrates a methodology for building balanced datasets of OpenMP code-region patterns and a way to use such datasets for tuning performance parameters. The methodology uses hardware performance counters to characterize the execution of a given region and correlation analysis to determine whether it covers a unique part of the pattern input space. Nevertheless, a balanced dataset of region patterns may become naturally imbalanced when used for training a model for tuning any specific performance parameter. For this reason, we have explored several methods for dealing with naturally imbalanced datasets to find the appropriate way of using them for tuning purposes. Experimentation shows that the proposed methodology can be used to build balanced datasets and that such datasets, plus a combination of Random Forest and binary classification, can be used to train a model able to accurately tune the number of threads of OpenMP parallel regions.
AB - Incorporating machine learning into automatic performance analysis and tuning tools is a promising path to tackle the increasing heterogeneity of current HPC applications. However, this introduces the need for generating balanced datasets of parallel applications’ executions and for dealing with natural imbalances for optimizing performance parameters. This work proposes a holistic approach that integrates a methodology for building balanced datasets of OpenMP code-region patterns and a way to use such datasets for tuning performance parameters. The methodology uses hardware performance counters to characterize the execution of a given region and correlation analysis to determine whether it covers a unique part of the pattern input space. Nevertheless, a balanced dataset of region patterns may become naturally imbalanced when used for training a model for tuning any specific performance parameter. For this reason, we have explored several methods for dealing with naturally imbalanced datasets to find the appropriate way of using them for tuning purposes. Experimentation shows that the proposed methodology can be used to build balanced datasets and that such datasets, plus a combination of Random Forest and binary classification, can be used to train a model able to accurately tune the number of threads of OpenMP parallel regions.
KW - Hardware performance counters
KW - Machine learning
KW - OpenMP
KW - Parallel applications
KW - Performance tuning
UR - http://www.scopus.com/inward/record.url?scp=85129323699&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/ba3f7123-5272-3baa-a874-f7b50d84d8a8/
U2 - 10.1007/s00607-022-01081-6
DO - 10.1007/s00607-022-01081-6
M3 - Article
AN - SCOPUS:85129323699
JO - Computing (Vienna/New York)
JF - Computing (Vienna/New York)
SN - 0010-485X
ER -