Mitigating label noise in ECG data: A comparative analysis

dc.contributor.authorManouras, Manousos
dc.contributor.authorPediaditis, Dimitrios
dc.contributor.departmentChalmers tekniska högskola / Institutionen för elektrotekniksv
dc.contributor.examinerZeng, Xuezhi
dc.contributor.supervisorStenhede, Elias
dc.date.accessioned2025-06-17T13:12:20Z
dc.date.issued2025
dc.date.submitted
dc.description.abstractLabel noise in electrocardiogram (ECG) datasets, where samples are incorrectly labelled, significantly hinders the performance of machine learning models by fitting to the incorrect labels. This type of noise can arise from several factors, such as human error, inter-expert variability, or obsolete automated annotation algorithms, leading to inconsistencies within dataset labelling. In this thesis work, three noise mitigation methods are compared with a baseline model to evaluate both the impact of label noise and the effectiveness of these mitigation strategies in ECG datasets. The mitigation methods chosen are Stochastic co-teaching, Self-learning and DivideMix. Class-dependent label noise was synthetically introduced into two ECG datasets, PTB-XL and CODE15%, comprising of symmetric and asymmetric noise types with rates of 20% and 40%. The best-performing method, as quantified by the AUROC score, was self-learning, with improvements from 4 to 8% over the baseline in CODE15% and from 8 to 12% in PTB-XL. DivideMix demonstrated reduced performance, presumably because it has been optimised for specific image datasets. Stochastic Co-teaching achieved better results on the CODE15% dataset, likely due to the larger sample size of this dataset. Furthermore, an additional ECG dataset obtained from Akershus University Hospital was used to assess the generalisability of the best-performing method under unknown noise conditions. The results did not show an improvement over the baseline model, indicating a strong dependency between the characteristics of the dataset and the effectiveness of noise mitigation strategies.
dc.identifier.coursecodeEENX30
dc.identifier.urihttp://hdl.handle.net/20.500.12380/309502
dc.language.isoeng
dc.setspec.uppsokTechnology
dc.subjectDeep Learning
dc.subjectLabel Noise
dc.subjectAI
dc.subjectECG
dc.subjectNeural Networks
dc.subjectTime-series
dc.titleMitigating label noise in ECG data: A comparative analysis
dc.type.degreeExamensarbete för masterexamensv
dc.type.degreeMaster's Thesisen
dc.type.uppsokH
local.programmeBiomedical engineering (MPBME), MSc

Ladda ner

Original bundle

Visar 1 - 1 av 1
Hämtar...
Bild (thumbnail)
Namn:
thesis2025_manouras_pediaditis_final.pdf
Storlek:
3.51 MB
Format:
Adobe Portable Document Format

License bundle

Visar 1 - 1 av 1
Hämtar...
Bild (thumbnail)
Namn:
license.txt
Storlek:
2.35 KB
Format:
Item-specific license agreed upon to submission
Beskrivning: