Using Machine Learning for Accelerating the Detection Time of Unreliable Failure Detectors
Type
Master's Thesis
Abstract
This study evaluates Machine Learning-based Failure Detectors for detecting faulty
nodes in distributed systems, focusing on metrics like Detection Time and Average
Mistake Rate. Random Forest and Support Vector Machine Failure Detectors
emerge as top performers. However, Machine Learning-based Failure Detectors
exhibit higher error rates, especially with longer detection times. To address this,
three ensemble methods are proposed, with one integrating top Machine Learning-based
Failure Detectors and analytical Failure Detectors for better balance. The
Precision Distribution module improves accuracy but struggles with completeness.
One analytical Failure Detector is used as a fallback to address completeness concerns,
although its effectiveness depends on the relationship between the analytical Failure
Detector and Precision Distribution module thresholds. Additionally, the Li-Marin
Long Short-Term Memory Failure Detector shows reduced detection time but with
increased computational overhead, posing challenges in fine-tuning overestimation
levels.
Subject/keywords
Failure Detector, Machine Learning, Distributed Systems, Ensemble Methods