Proceedings of ESPM2 2016: 2nd International Workshop on Extreme Scale Programming Models and Middleware - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis
Effective utilization of the increasingly heterogeneous hardware in modern supercomputers is a significant challenge. Many applications have seen performance gains by using GPUs, but many implementations leave CPUs sitting idle.In this paper, we describe a runtime managed system for coordinating heterogeneous execution. This system manages data transfers to and from GPU devices and schedules work across the computational resources of the system. The programmer need only tag methods and parameters to enable heterogeneous execution.Using this system, we observe improvements in programmer productivity and application performance. For selected benchmarks, when using heterogeneous execution we observe speedups of up to 3.09x relative to using only the host cores or only the device.
Accelerator architectures, High performance computing, Parallel programming, Runtime
Robson, Michael P.; Buch, Ronak; and Kale, Laxmikant V., "Runtime Coordinated Heterogeneous Tasks in Charm++" (2017). Computer Science: Faculty Publications, Smith College, Northampton, MA.