Designing Loss Functions for Learning Sound Timbre Audio Representations in Variational Autoencoders

Korkmaz, Ipek

dc.contributor.author	Korkmaz, Ipek
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data och informationsteknik	sv
dc.contributor.department	Chalmers University of Technology / Department of Computer Science and Engineering	en
dc.contributor.examiner	Olsson, Simon
dc.contributor.supervisor	Tatar, Kivanc
dc.date.accessioned	2026-01-15T14:33:29Z
dc.date.issued	2025
dc.date.submitted
dc.description.abstract	This study investigates how audio-related loss functions, audio feature extraction methods, and the addition of a synthesis layer affect the reconstruction quality and latent space organization of variational autoencoders (VAEs). Three experiments were conducted to address these questions. The first suggests that different audio-related loss functions do not lead to significant differences in performance, aside from requiring different training durations. In the second, adding a synthesis layer does not substantially improve reconstruction quality, but it generally helps the model converge faster during training. Finally, in the third experiment, which focuses on feature extraction methods, Mel-Frequency Cepstral Coefficients (MFCC) performed slightly better in terms of reconstruction quality. These findings can potentially guide architectural choices for effective audio representation learning in VAE-based models.
dc.identifier.coursecode	DATX05
dc.identifier.uri	https://hdl.handle.net/20.500.12380/310888
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	timbre representation
dc.subject	audio feature extraction
dc.subject	generative models
dc.subject	variational autoencoder
dc.title	Designing Loss Functions for Learning Sound Timbre Audio Representations in Variational Autoencoders
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master's Thesis	en
dc.type.uppsok	H
local.programme	Data science and AI (MPDSC), MSc

Original bundle

Name: CSE 25-124 IK.pdf
Size: 6.55 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission

Collections

Master's Theses