Mitigating label noise in ECG data: A comparative analysis

Manouras, Manousos; Pediaditis, Dimitrios

dc.contributor.author	Manouras, Manousos
dc.contributor.author	Pediaditis, Dimitrios
dc.contributor.department	Chalmers tekniska högskola / Institutionen för elektroteknik	sv
dc.contributor.examiner	Zeng, Xuezhi
dc.contributor.supervisor	Stenhede, Elias
dc.date.accessioned	2025-06-17T13:12:20Z
dc.date.issued	2025
dc.date.submitted
dc.description.abstract	Label noise in electrocardiogram (ECG) datasets, where samples are incorrectly labelled, significantly degrades the performance of machine learning models, because the models fit to the incorrect labels. This type of noise can arise from several factors, such as human error, inter-expert variability, or obsolete automated annotation algorithms, leading to inconsistent labelling within a dataset. In this thesis, three noise mitigation methods are compared with a baseline model to evaluate both the impact of label noise and the effectiveness of these mitigation strategies on ECG datasets. The mitigation methods chosen are stochastic co-teaching, self-learning and DivideMix. Class-dependent label noise was synthetically introduced into two ECG datasets, PTB-XL and CODE15%, comprising symmetric and asymmetric noise types at rates of 20% and 40%. The best-performing method, as quantified by the AUROC score, was self-learning, with improvements of 4 to 8% over the baseline on CODE15% and 8 to 12% on PTB-XL. DivideMix showed reduced performance, presumably because it has been optimised for specific image datasets. Stochastic co-teaching achieved better results on the CODE15% dataset, likely due to the larger sample size of this dataset. Furthermore, an additional ECG dataset obtained from Akershus University Hospital was used to assess the generalisability of the best-performing method under unknown noise conditions. The results did not show an improvement over the baseline model, indicating a strong dependency between the characteristics of the dataset and the effectiveness of noise mitigation strategies.
dc.identifier.coursecode	EENX30
dc.identifier.uri	http://hdl.handle.net/20.500.12380/309502
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	Deep Learning
dc.subject	Label Noise
dc.subject	AI
dc.subject	ECG
dc.subject	Neural Networks
dc.subject	Time-series
dc.title	Mitigating label noise in ECG data: A comparative analysis
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master's Thesis	en
dc.type.uppsok	H
local.programme	Biomedical engineering (MPBME), MSc
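
The abstract mentions synthetically introducing symmetric and asymmetric label noise at rates of 20% and 40%. The sketch below illustrates one common way such noise is defined in the label-noise literature; it is not taken from the thesis and may differ from the procedure the authors used. It assumes single-label integer class labels, and the function names, the five-class toy labels and the `flip_to` mapping are all hypothetical.

```python
import random


def add_symmetric_noise(labels, num_classes, rate, seed=0):
    """With probability `rate`, flip a label to a different class chosen uniformly."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < rate:
            # Choose among the other classes so a flip always changes the label.
            noisy.append(rng.choice([c for c in range(num_classes) if c != y]))
        else:
            noisy.append(y)
    return noisy


def add_asymmetric_noise(labels, flip_to, rate, seed=0):
    """With probability `rate`, flip a label to a fixed, class-specific target."""
    rng = random.Random(seed)
    # Classes missing from `flip_to` are left unchanged.
    return [flip_to.get(y, y) if rng.random() < rate else y for y in labels]


def observed_noise_rate(clean, noisy):
    """Fraction of labels that differ between the clean and noisy lists."""
    return sum(a != b for a, b in zip(clean, noisy)) / len(clean)


if __name__ == "__main__":
    clean = [i % 5 for i in range(10_000)]    # toy labels, 5 hypothetical classes
    flip_to = {0: 1, 1: 2, 2: 3, 3: 4, 4: 0}  # hypothetical confusion pairs
    for rate in (0.2, 0.4):
        sym = add_symmetric_noise(clean, 5, rate)
        asym = add_asymmetric_noise(clean, flip_to, rate)
        print(rate, observed_noise_rate(clean, sym), observed_noise_rate(clean, asym))
```

In this sketch, symmetric noise spreads corrupted labels evenly over the other classes, while asymmetric noise sends each class to a single designated class, mimicking systematic confusion between similar diagnoses.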

Original bundle

Name: thesis2025_manouras_pediaditis_final.pdf
Size: 3.51 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission

Collections

Master's theses