Designing Loss Functions for Learning Sound Timbre Audio Representations in Variational Autoencoders
Type: Master's Thesis
Abstract
This study investigates the effect of audio-related loss functions, audio feature extraction methods, and the addition of a synthesis layer on the reconstruction quality and latent space organization of variational autoencoders (VAEs). Three experiments were conducted to address these questions. The first experiment suggests that the choice among audio-related loss functions does not lead to significant differences in performance, aside from requiring different training durations. The second experiment shows that, while adding a synthesis layer does not substantially improve reconstruction quality, it generally helps the model converge faster during training. Finally, in the third experiment, which focuses on feature extraction methods, Mel-Frequency Cepstral Coefficients (MFCCs) performed slightly better in terms of reconstruction quality. These findings can help guide architectural choices for effective audio representation learning in VAE-based models.
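To make the kind of setup discussed here concrete, the sketch below shows one common way to build an audio-related VAE training loss: a multi-resolution STFT (spectral) reconstruction term combined with the standard KL-divergence regularizer. This is a minimal illustration, not the thesis' exact configuration; the framework (PyTorch), the FFT sizes, and the beta weight are all assumptions made for the example.

import torch

def stft_mag(x, n_fft, hop):
    # Magnitude spectrogram of a batch of waveforms shaped (batch, samples).
    window = torch.hann_window(n_fft, device=x.device)
    spec = torch.stft(x, n_fft=n_fft, hop_length=hop,
                      window=window, return_complex=True)
    return spec.abs()

def multi_resolution_stft_loss(x_hat, x,
                               resolutions=((512, 128), (1024, 256), (2048, 512))):
    # Average L1 distance between magnitude spectrograms at several STFT
    # resolutions, so reconstruction errors at different time/frequency
    # scales all contribute. Resolutions are illustrative choices.
    loss = 0.0
    for n_fft, hop in resolutions:
        loss = loss + (stft_mag(x_hat, n_fft, hop)
                       - stft_mag(x, n_fft, hop)).abs().mean()
    return loss / len(resolutions)

def vae_loss(x_hat, x, mu, logvar, beta=1.0):
    # Spectral-domain reconstruction term plus the closed-form KL divergence
    # between the approximate posterior q(z|x) and a unit Gaussian prior.
    recon = multi_resolution_stft_loss(x_hat, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl

In a setup like this, swapping the spectral term for another audio-related loss (or for plain waveform L1/L2) changes only the reconstruction component, which is the kind of comparison the first experiment performs.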
Keywords
timbre representation, audio feature extraction, generative models, variational autoencoder
