Self-supervised learning of musical representations using VICReg; a comprehensive study of the VICReg loss function for self-supervised representation learning in the music domain

Hesse, Cody; Löf, Sebastian

dc.contributor.author	Hesse, Cody
dc.contributor.author	Löf, Sebastian
dc.contributor.department	Chalmers tekniska högskola / Institutionen för arkitektur och samhällsbyggnadsteknik (ACE)	sv
dc.contributor.department	Chalmers University of Technology / Department of Architecture and Civil Engineering (ACE)	en
dc.contributor.examiner	Ahrens, Jens
dc.contributor.supervisor	Lordelo, Carlos
dc.contributor.supervisor	Thomé, Carl
dc.date.accessioned	2023-10-09T11:21:55Z
dc.date.available	2023-10-09T11:21:55Z
dc.date.issued	2023
dc.date.submitted	2023
dc.description.abstract	Self-supervised learning has emerged as a promising method for learning informative representations suitable for many machine learning tasks. However, while self-supervised representation learning has been instrumental in various fields, it has only recently gained momentum in music information retrieval. This thesis investigates the potential of the VICReg loss function for self-supervised learning in the music domain by comparing its performance against the established CLMR model. Following the evaluations performed in CLMR, we train our VICReg model on the publicly available Free Music Archive and GTZAN datasets. We then evaluate the learned representations on the downstream task of music classification on the MagnaTagATune dataset by training a linear logistic classifier and a two-layer MLP classifier atop the representations generated by a frozen, pre-trained VICReg model. In our transfer learning experiments, VICReg achieves a ROC-AUC score of 89.15 and a PR-AUC score of 35.85, compared to 88.12 and 33.83, respectively, for CLMR, showing that VICReg is competitive with CLMR. With more robust training and further tuning, we believe that VICReg can outperform established loss functions for self-supervised representation learning in the music domain, and we advocate continued exploration in this direction.
dc.identifier.coursecode	ACEX30
dc.identifier.uri	https://hdl.handle.net/20.500.12380/307201
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	Self-supervised learning, Contrastive learning, Music Information Retrieval, Representation learning, VICReg, CLMR, SampleCNN
dc.title	Self-supervised learning of musical representations using VICReg; a comprehensive study of the VICReg loss function for self-supervised representation learning in the music domain
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master's Thesis	en
dc.type.uppsok	H
local.programme	Sound and vibration (MPSOV), MSc

Original bundle

Name: ACEX30 - Cody Hesse and Sebastian Löf.pdf
Size: 2.31 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission

Collections

Master's Theses