Using Machine Learning for Accelerating the Detection Time of Unreliable Failure Detectors
Date
Authors
Type
Master's Thesis
Programme
Model builders
Abstract
This study evaluates Machine Learning-based Failure Detectors for detecting faulty
nodes in distributed systems, focusing on metrics like Detection Time and Average
Mistake Rate. Random Forest and Support Vector Machine Failure Detectors
emerge as top performers. However, Machine Learning-based Failure Detectors
exhibit higher error rates, especially with longer detection times. To address this,
three ensemble methods are proposed, with one integrating top Machine Learning-based
Failure Detectors and analytical Failure Detectors for better balance. The
Precision Distribution module improves accuracy but struggles with completeness.
One analytical Failure Detector is used as a fallback to address completeness concerns,
although its effectiveness depends on the relationship between the analytical Failure
Detector and Precision Distribution module thresholds. Additionally, the Li-Marin
Long Short-Term Memory Failure Detector shows reduced detection time but with
increased computational overhead, posing challenges in fine-tuning overestimation
levels.
Description
Keywords
Failure Detector, Machine Learning, Distributed Systems, Ensemble Methods
