Mitigating label noise in ECG data: A comparative analysis
dc.contributor.author | Manouras, Manousos | |
dc.contributor.author | Pediaditis, Dimitrios | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för elektroteknik | sv |
dc.contributor.examiner | Zeng, Xuezhi | |
dc.contributor.supervisor | Stenhede, Elias | |
dc.date.accessioned | 2025-06-17T13:12:20Z | |
dc.date.issued | 2025 | |
dc.date.submitted | ||
dc.description.abstract | Label noise in electrocardiogram (ECG) datasets, where samples are incorrectly labelled, significantly degrades the performance of machine learning models because the models fit to the incorrect labels. This type of noise can arise from several factors, such as human error, inter-expert variability, or obsolete automated annotation algorithms, leading to inconsistent labelling within a dataset. In this thesis, three noise mitigation methods are compared with a baseline model to evaluate both the impact of label noise and the effectiveness of these mitigation strategies on ECG datasets. The mitigation methods chosen are Stochastic Co-teaching, Self-learning and DivideMix. Class-dependent label noise, comprising symmetric and asymmetric noise types at rates of 20% and 40%, was synthetically introduced into two ECG datasets, PTB-XL and CODE15%. The best-performing method, as quantified by the AUROC score, was Self-learning, with improvements of 4 to 8% over the baseline on CODE15% and of 8 to 12% on PTB-XL. DivideMix demonstrated reduced performance, presumably because it has been optimised for specific image datasets. Stochastic Co-teaching achieved better results on the CODE15% dataset, likely due to the larger sample size of this dataset. Furthermore, an additional ECG dataset obtained from Akershus University Hospital was used to assess the generalisability of the best-performing method under unknown noise conditions. The results did not show an improvement over the baseline model, indicating a strong dependency between the characteristics of the dataset and the effectiveness of noise mitigation strategies. | |
dc.identifier.coursecode | EENX30 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12380/309502 | |
dc.language.iso | eng | |
dc.setspec.uppsok | Technology | |
dc.subject | Deep Learning | |
dc.subject | Label Noise | |
dc.subject | AI | |
dc.subject | ECG | |
dc.subject | Neural Networks | |
dc.subject | Time-series | |
dc.title | Mitigating label noise in ECG data: A comparative analysis | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master's Thesis | en |
dc.type.uppsok | H | |
local.programme | Biomedical engineering (MPBME), MSc |
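
The abstract mentions synthetically introducing symmetric and asymmetric class-dependent label noise at rates of 20% and 40%. This record does not describe the thesis's actual procedure, so the following is only a minimal sketch of one common way to do this with a label transition matrix. It assumes single-label integer classes (ECG datasets may be multi-label, which would need a per-label variant), and the function names and the `flip_to` mapping are hypothetical, chosen for illustration.

```python
import numpy as np


def symmetric_transition(num_classes, rate):
    """Row c gives P(noisy label | clean label c).

    Assumption: a label flips with probability `rate`, spread uniformly
    over all other classes.
    """
    T = np.full((num_classes, num_classes), rate / (num_classes - 1))
    np.fill_diagonal(T, 1.0 - rate)
    return T


def asymmetric_transition(num_classes, rate, flip_to):
    """Assumption: class `src` flips with probability `rate` to the single
    class `flip_to[src]`; classes missing from `flip_to` stay clean.
    """
    T = np.eye(num_classes)
    for src, dst in flip_to.items():
        if src != dst:
            T[src, src] = 1.0 - rate
            T[src, dst] = rate
    return T


def inject_noise(labels, T, seed=0):
    """Sample one noisy label per sample from the row of T for its clean label."""
    rng = np.random.default_rng(seed)
    return np.array([rng.choice(len(T), p=T[y]) for y in labels])


if __name__ == "__main__":
    # Illustrative data only: 10,000 random clean labels over 5 classes.
    clean = np.random.default_rng(1).integers(0, 5, size=10_000)
    noisy = inject_noise(clean, symmetric_transition(5, 0.2))
    print("empirical noise rate:", (noisy != clean).mean())
```

With a symmetric matrix the empirical fraction of changed labels should land close to the chosen rate. With an asymmetric matrix, only the classes listed in `flip_to` are corrupted, so the overall noise rate depends on how frequent those classes are.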