Engineering an Efficient Implementation of Pagh’s Algorithm for Sparse Matrix Multiplication
| dc.contributor.author | Ge, Shuhao | |
| dc.contributor.author | Kliemann, Paul | |
| dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
| dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
| dc.contributor.examiner | Damaschke, Peter | |
| dc.contributor.supervisor | Karppa, Matti | |
| dc.date.accessioned | 2025-11-05T12:29:15Z | |
| dc.date.issued | 2025 | |
| dc.date.submitted | ||
| dc.description.abstract | This thesis explores approximate matrix multiplication through the algorithm proposed by R. Pagh in Compressed Matrix Multiplication (2013) and its implementation developed by J. Andersson and M. Karppa. We identify memory bandwidth as the primary performance bottleneck and implement support for handling sparse inputs and outputs. We further introduce several optimizations to improve cache utilisation. While fitting the input data into the L3 caches of our CPU did not improve L3 hit rate, it did lead to the improvement in the L1 hit rate compared to the dense implementation. Our experimental evaluation demonstrates modest yet consistent runtime improvements, particularly on large-scale and sparse inputs, while retaining the mathematical guarantees of the original implementation. Despite these optimisations, our implementations remain outperformed by the stateof- the-art libraries on all tested real-world datasets. However, our modifications reduce the asymptotic runtime of the compression step and significantly decrease memory usage when processing sparse matrices. Based on our findings, we suggest potential directions for future research, including further optimization strategies. These should primarily focus on reducing memory traffic, as improvements in memory locality alone are unlikely to yield substantial gains. | |
| dc.identifier.coursecode | DATX05 | |
| dc.identifier.uri | http://hdl.handle.net/20.500.12380/310724 | |
| dc.language.iso | eng | |
| dc.relation.ispartofseries | CSE 25-56 | |
| dc.setspec.uppsok | Technology | |
| dc.subject | Matrix Multiplication, Sparse Matrix, High Performance Computing, Fast Fourier Transform, Hashing. | |
| dc.title | Engineering an Efficient Implementation of Pagh’s Algorithm for Sparse Matrix Multiplication | |
| dc.type.degree | Examensarbete för masterexamen | sv |
| dc.type.degree | Master's Thesis | en |
| dc.type.uppsok | H | |
| local.programme | Data science and AI (MPDSC), MSc |
