An Incompressible Navier-Stokes Equations Solver on the GPU Using CUDA

Type

Master Thesis

Abstract

Graphics Processing Units (GPUs) have emerged as highly capable computational accelerators for scientific and engineering applications. Many reports claim orders of magnitude of speedup compared to traditional Central Processing Units (CPUs), and interest in GPU computation is high in the computational community. In this thesis, the capability of GPUs to accelerate the full computational chain of a 3D incompressible Navier-Stokes solver, including solvers and preconditioners for sparse linear systems as well as assembly routines for a finite volume discretization, has been evaluated. The CG, GMRES and BiCGStab iterative solvers have been implemented on the CUDA GPGPU platform and evaluated together with the Jacobi and Least Squares Polynomial preconditioners. A double precision Navier-Stokes solver has been implemented using CUDA, adopting a collocated Cartesian grid, the SIMPLEC pressure-velocity coupling scheme, and implicit time discretization. The CUDA GPU implementations of the iterative solvers, the preconditioners and the Navier-Stokes solver were validated and evaluated against serial and parallel CPU implementations. For the iterative solvers, speedups of between six and thirteen times were achieved against the MKL CPU library, and the implemented methods outperform existing open-source GPU implementations of equivalent methods. For the full Navier-Stokes solver, speedups of up to a factor of twelve were achieved compared to an equivalent commercial CPU code when equivalent iterative solvers were used. A speedup of a factor of two was achieved when a commercial Algebraic MultiGrid method was used to solve the pressure Poisson equation in the commercial CPU implementation. The bottleneck of the resulting implementation was found to be the solution of the pressure Poisson equation, which accounted for a significant part of the total execution time for large problems. The implemented assembly routines on the GPU were highly efficient: their combined execution time was negligible compared to the total execution time. The GPU has been assessed as a highly capable accelerator for the implemented methods. Speedups of about an order of magnitude have been achieved for algorithms that can be implemented efficiently on the GPU.

Keywords

Computer and Information Science
