Self-supervised learning of musical representations using VICReg; a comprehensive study of the VICReg loss function for self-supervised representation learning in the music domain
dc.contributor.author | Hesse, Cody | |
dc.contributor.author | Löf, Sebastian | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för arkitektur och samhällsbyggnadsteknik (ACE) | sv |
dc.contributor.department | Chalmers University of Technology / Department of Architecture and Civil Engineering (ACE) | en |
dc.contributor.examiner | Ahrens, Jens | |
dc.contributor.supervisor | Lordelo, Carlos | |
dc.contributor.supervisor | Thomé, Carl | |
dc.date.accessioned | 2023-10-09T11:21:55Z | |
dc.date.available | 2023-10-09T11:21:55Z | |
dc.date.issued | 2023 | |
dc.date.submitted | 2023 | |
dc.description.abstract | Self-supervised learning has emerged as a promising method for learning informative representations suitable for many machine learning tasks. However, while self-supervised representation learning has been instrumental in various fields, it has only recently gained momentum in music information retrieval. This thesis investigates the potential of the VICReg loss function for self-supervised learning in the music domain by comparing its performance against the established CLMR model. Following the evaluations performed in CLMR, we train our VICReg model on the publicly available Free Music Archive and GTZAN datasets. We then evaluate the learned representations on the downstream task of music classification on the MagnaTagATune dataset by training a linear logistic classifier and a two-layer MLP classifier atop the representations generated by a frozen, pre-trained VICReg model. In our transfer learning experiments, VICReg achieves a ROC-AUC score of 89.15 and a PR-AUC score of 35.85, compared to 88.12 and 33.83, respectively, for CLMR, showing that VICReg is competitive with CLMR. With more robust training and further tuning, we believe that VICReg can outperform established loss functions for self-supervised representation learning in the music domain, and we advocate continued exploration in this direction. | |
dc.identifier.coursecode | ACEX30 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12380/307201 | |
dc.language.iso | eng | |
dc.setspec.uppsok | Technology | |
dc.subject | Self-supervised learning, Contrastive learning, Music Information Retrieval, Representation learning, VICReg, CLMR, SampleCNN | |
dc.title | Self-supervised learning of musical representations using VICReg; a comprehensive study of the VICReg loss function for self-supervised representation learning in the music domain | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master's Thesis | en |
dc.type.uppsok | H | |
local.programme | Sound and vibration (MPSOV), MSc |
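
The abstract centres on the VICReg loss but this record does not state its form. As a rough illustration only, the sketch below follows the general variance-invariance-covariance structure described in the original VICReg paper, not the thesis's own implementation. The function name `vicreg_loss`, the weights 25/25/1, the `eps` value, and the normalisation choices are all assumptions and are not taken from this record.

```python
import math


def vicreg_loss(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0, eps=1e-4):
    """Illustrative VICReg-style loss for two batches of embeddings.

    z_a, z_b: lists of n vectors (each of length d), the embeddings of two
    augmented views of the same n audio clips. Weights and eps are assumed
    defaults, not values reported in this record.
    """
    n, d = len(z_a), len(z_a[0])

    # Invariance: mean squared Euclidean distance between paired embeddings.
    inv = sum(
        (a - b) ** 2 for row_a, row_b in zip(z_a, z_b) for a, b in zip(row_a, row_b)
    ) / n

    def variance_and_covariance(z):
        # Centre each embedding dimension over the batch.
        means = [sum(row[j] for row in z) / n for j in range(d)]
        centred = [[row[j] - means[j] for j in range(d)] for row in z]
        # Unbiased d x d covariance matrix of the batch.
        cov = [
            [sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
            for i in range(d)
        ]
        # Variance: hinge that pushes each dimension's std up towards 1.
        var_term = sum(
            max(0.0, 1.0 - math.sqrt(cov[j][j] + eps)) for j in range(d)
        ) / d
        # Covariance: penalise squared off-diagonal entries (decorrelation).
        cov_term = sum(
            cov[i][j] ** 2 for i in range(d) for j in range(d) if i != j
        ) / d
        return var_term, cov_term

    var_a, cov_a = variance_and_covariance(z_a)
    var_b, cov_b = variance_and_covariance(z_b)
    return sim_w * inv + var_w * (var_a + var_b) + cov_w * (cov_a + cov_b)


# Collapsed embeddings (every clip mapped to the same vector) are penalised
# by the variance term; spread-out, decorrelated embeddings are not.
collapsed = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
spread = [[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
loss_collapsed = vicreg_loss(collapsed, collapsed)
loss_spread = vicreg_loss(spread, spread)
```

In this toy comparison the two views are identical, so the invariance term vanishes and only the variance and covariance regularisers differ between the two batches. A practical implementation would operate on tensors from an encoder such as SampleCNN rather than on Python lists.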