Enabling Moldability in OpenMP

Type

Master's Thesis

Abstract

OpenMP has long been a ubiquitous technology in High-Performance Computing (HPC), making parallel programs simple to reason about and portable across many different systems. When an OpenMP runtime decides which threads should run tasks, it often uses a simple work-stealing scheduler, because work stealing distributes tasks evenly among cores. This is the method used by LLVM's OpenMP runtime. Today, however, HPC systems often consist of multiple sockets, each with many cores, and have non-uniform memory access (NUMA). This creates a complicated memory hierarchy that simple work-stealing schedulers do not account for. Another feature that simple work-stealing schedulers support poorly is nested parallelism, where each task runs multiple threads in parallel. It is not clear how many threads each task should be allocated, i.e. what the width of the task should be. If the width is too high, there will be over-subscription; if it is too low, there will be load imbalance. This can be solved by supporting moldable tasks, which are tasks whose width is decided by the scheduler. We extend LLVM's OpenMP runtime with support for moldable tasks scheduled using a locality-aware scheduler.
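To make the width problem concrete, the following is a minimal sketch of nested parallelism in standard OpenMP, where the programmer has to hard-code each task's width. It is not taken from the thesis: the names `OUTER_THREADS`, `TASK_WIDTH`, `row_sum` and `row_sums` are hypothetical, and the values chosen are arbitrary. Whether the inner regions actually receive extra threads depends on the runtime's nesting settings (for example the `OMP_MAX_ACTIVE_LEVELS` environment variable).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed sizes: with rigid (non-moldable) tasks the
 * programmer must guess these numbers ahead of time. */
enum { OUTER_THREADS = 4, TASK_WIDTH = 2 };

/* Sum one row inside a nested parallel region that requests
 * `width` threads. */
static long row_sum(const int *row, size_t n, int width)
{
    assert(width > 0);
    long sum = 0;
    #pragma omp parallel for num_threads(width) reduction(+:sum)
    for (size_t i = 0; i < n; i++)
        sum += row[i];
    return sum;
}

/* Create one task per row; each task opens its own inner parallel
 * region of TASK_WIDTH threads. out[r] receives the sum of row r. */
void row_sums(const int *m, size_t rows, size_t cols, long *out)
{
    #pragma omp parallel num_threads(OUTER_THREADS)
    #pragma omp single
    for (size_t r = 0; r < rows; r++) {
        #pragma omp task firstprivate(r) shared(out)
        out[r] = row_sum(m + r * cols, cols, TASK_WIDTH);
    }
    /* The implicit barrier that ends the single construct waits for
     * all tasks created above, so out[] is complete on return. */
}
```

If `TASK_WIDTH` is set too high relative to the number of concurrently running tasks, the program requests more threads than there are cores; if it is set too low, cores can sit idle while a few long tasks finish. A moldable task, as described in the abstract, would leave that choice to the scheduler instead of fixing it in the source.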

Subject/keywords

Adaptive Scheduling, OpenMP, NUMA-aware scheduling, Moldable
