Vectorizing FFT for faster AI Convolutions

dc.contributor.author: El-Hajj, Victor
dc.contributor.author: Forsberg, Anton
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik (sv)
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (en)
dc.contributor.examiner: Pericas, Miquel
dc.contributor.supervisor: Papadopoulou, Nikela
dc.date.accessioned: 2023-12-20T18:15:52Z
dc.date.available: 2023-12-20T18:15:52Z
dc.date.issued: 2023
dc.date.submitted: 2023
dc.description.abstract: The Fast Fourier Transform (FFT) is a widely used algorithm in signal processing, communications and image processing. In this thesis we implemented and investigated FFT convolutions for convolutional neural networks that leverage vector-length-agnostic programming with the ARM Scalable Vector Extension (SVE) and the RISC-V "V" vector extension. Our research aimed to address the limitations of traditional vectorisation techniques, which require non-portable fixed-length vector instructions. We analysed the performance of vector-length-agnostic instructions with different vector lengths and L2 cache sizes. Due to unforeseen issues with simulator programs, we were unable to run all benchmarks and investigate all vector lengths as originally planned. However, our results showed that code using either vector extension benefits from being portable: speedups increased with the simulated vector length. At best, there was a twofold speedup over the baseline using a short vector length of 512 bits, though vectorised implementations of the General Matrix Multiply (GeMM) and Winograd convolutions outperformed our FFT implementation by three to four times on the SVE architecture and three to eleven times on the RISC-V "V" architecture on a network with small kernel sizes unfavourable to FFT. In conclusion, while the tools for simulating these architectures may be immature, our investigation shows that the FFT convolution benefits from vector-length-agnostic programming.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/307458
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Computer science
dc.subject: engineering
dc.subject: project
dc.subject: thesis
dc.subject: HPC
dc.subject: FFT
dc.subject: CNN
dc.subject: vector length agnostic programming
dc.subject: RISC-V "V"
dc.subject: ARM SVE
dc.title: Vectorizing FFT for faster AI Convolutions
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: High-performance computer systems (MPHPC), MSc
local.programme: Computer science – algorithms, languages and logic (MPALG), MSc
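
Illustration of the FFT convolution named in the abstract. The following is a minimal scalar sketch in Python, not the thesis's SVE or RISC-V "V" implementation: it assumes a 1-D signal, a plain recursive radix-2 FFT, and zero-padding to a power of two. The function names (fft, fft_convolve, direct_convolve) are hypothetical and chosen only for this example. It shows the underlying idea: transform both inputs, multiply pointwise, and transform back.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT. len(x) must be a power of two.

    With inverse=True the result is NOT scaled by 1/n; the caller does that.
    """
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_convolve(signal, kernel):
    """Full linear convolution via the convolution theorem."""
    out_len = len(signal) + len(kernel) - 1
    n = 1
    while n < out_len:  # pad to a power of two so radix-2 applies
        n *= 2
    a = fft(list(signal) + [0] * (n - len(signal)))
    b = fft(list(kernel) + [0] * (n - len(kernel)))
    prod = [p * q for p, q in zip(a, b)]  # pointwise product in frequency domain
    res = fft(prod, inverse=True)
    return [v.real / n for v in res[:out_len]]


def direct_convolve(signal, kernel):
    """Reference O(N*K) convolution used to check the FFT result."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


if __name__ == "__main__":
    sig = [1, 2, 3, 4]
    ker = [1, 0, -1]
    print(fft_convolve(sig, ker))
    print(direct_convolve(sig, ker))
```

With a kernel of length 3, as here, the direct method needs only three multiplications per output element, whereas the FFT route pays for two forward transforms and one inverse transform of the padded length. This is consistent with the abstract's remark that small kernel sizes are unfavourable to FFT.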

Download

Original bundle

Name: CSE 23-110 VE AF.pdf
Size: 1013.2 KB
Format: Adobe Portable Document Format
