A performance model for openMP memory bound applications in multisocket systems

César Allande, Josep Jorba, Anna Sikora, Eduardo César

Research output: Chapter in book, Chapter, Research, peer-reviewed

3 Citations (Scopus)

Abstract

The performance of OpenMP applications executed on multisocket multicore processors can be limited by the memory interface. In a multisocket environment, each multicore processor can suffer performance degradation in memory-bound parallel regions when its cores share the same Last Level Cache (LLC). We propose a characterization of the performance of parallel regions to estimate cache misses and execution time. This model is used to select the number of threads and the affinity distribution for each parallel region. The model is applied to the SP and MG benchmarks from the NAS Parallel Benchmark Suite using different workloads on two different multicore, multisocket systems. The results show that the estimation preserves the behavior observed in measured executions for the affinity configurations evaluated. Estimated execution time is used to select a set of configurations that minimizes the impact of memory contention, achieving significant improvements compared with a default configuration using all threads.
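The selection step described in the abstract can be illustrated with a minimal sketch. It assumes the model has already produced an estimated execution time for each candidate configuration of a parallel region. The function name, the affinity labels ("compact", "scatter") and all numbers are hypothetical and do not come from the paper. The estimation model itself is not reproduced here.

```python
from typing import Dict, Tuple

# A configuration is (number of threads, affinity label).
# The affinity labels are illustrative assumptions, not the paper's terms.
Config = Tuple[int, str]


def select_configuration(estimates: Dict[Config, float]) -> Config:
    """Return the configuration with the lowest estimated execution time.

    Ties are broken in favor of fewer threads (an assumption of this sketch).
    """
    if not estimates:
        raise ValueError("no candidate configurations were provided")
    return min(estimates, key=lambda cfg: (estimates[cfg], cfg[0]))


def select_per_region(
    region_estimates: Dict[str, Dict[Config, float]]
) -> Dict[str, Config]:
    """Apply the selection independently to each parallel region."""
    return {
        region: select_configuration(estimates)
        for region, estimates in region_estimates.items()
    }


if __name__ == "__main__":
    # Made-up estimated times in seconds for two hypothetical regions.
    example = {
        "region_a": {(16, "compact"): 2.0, (8, "scatter"): 1.5},
        "region_b": {(16, "compact"): 0.9, (8, "scatter"): 1.2},
    }
    print(select_per_region(example))
```

In this sketch a memory-bound region may end up with fewer threads than the default of all cores, which matches the abstract's point that using every thread is not always the fastest choice.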

Original language: English
Title of host publication: IEEE International conference on computational science 2014 (ICCS 2014)
Place of publication: Cairns (AU)
Pages: 2208-2218
Number of pages: 11
Volume: 29
Edition: 1
Publication status: Published - 1 May 2014

Publication series

Name: Procedia Computer Science
Publisher: Elsevier
ISSN (print): 1877-0509
