Abstract
The Ramer-Douglas-Peucker (RDP) algorithm applies a recursive split-and-merge strategy, which can generate fast, compact and precise data compression for time-critical systems. The use of GPU parallelism accelerates the execution of RDP, but the recursive behavior and the dynamic size of the generated sub-tasks require adapting the algorithm to use the GPU resources efficiently.
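For readers unfamiliar with RDP, the recursive split step can be sketched on the CPU as follows. This is a minimal illustration of the classical sequential algorithm on 2D points, not the GPU method this work proposes; the names `rdp` and `point_line_distance` and the tolerance parameter `epsilon` are assumptions chosen for the example.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate case: a and b coincide, fall back to point distance.
        return math.hypot(px - ax, py - ay)
    # |cross(b - a, a - p)| / |b - a|
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def rdp(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first-last.
    best_index, best_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], first, last)
        if d > best_dist:
            best_index, best_dist = i, d
    if best_dist <= epsilon:
        # Every interior point is within tolerance: keep only the endpoints.
        return [first, last]
    # Split at the farthest point and recurse on both halves.
    left = rdp(points[: best_index + 1], epsilon)
    right = rdp(points[best_index:], epsilon)
    # The split point ends `left` and starts `right`; keep it once.
    return left[:-1] + right
```

Each call produces two sub-problems whose sizes depend on the data, which is the dynamic, recursive behavior that makes a direct GPU mapping difficult.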
While previous research approaches propose the exploitation of task-based parallelism, our research advocates a general fine-grained solution, which avoids the dynamic and recursive execution of kernels. The segmentation of depth images, a typical application in autonomous driving, reaches speeds of almost 1000 frames per second for typical workloads using our massively parallel proposal on low-consumption, embedded GPUs. The GPU-accelerated solution is at least an order of magnitude faster than the execution of the same program on multiple CPU cores with similar energy consumption.
Original language | English |
---|---|
Number of pages | 14 |
ISBN (electronic) | 978-3-030-50371-0 |
DOIs | |
Publication status | Published - 2020 |