A Fault-tolerant Distributed Library for Embedded Real-time Systems

Type

Master's thesis

Abstract

A distributed embedded control system (DECS) may provide functionality that is safety-critical and time-sensitive, so a malfunction could have devastating consequences. To meet these requirements, a system must fulfill real-time constraints and guarantee correct functionality even in the presence of faults. In this thesis we present a software library providing clock synchronization, real-time scheduling and fault-tolerant decision making. It is intended for use with DECS communicating via controller area network (CAN). To achieve fault-tolerant decision making, we propose an early-stopping fault-tolerance algorithm that tolerates up to t faults in a system of 2t + 1 nodes. We further propose an adaptation of this algorithm to real-world applications where there may be an interval of correct values instead of the single correct value assumed in the base solution. The result is a lightweight and efficient library. The clock synchronization requires one message and has a precision comparable to other known solutions, but is not fault-tolerant. The scheduler runs in O(n²) time and uses a non-preemptive rate-monotonic policy. It can handle up to 63 user-defined tasks, and has a worst-case task delay of 2.5 ms for the lowest-priority task in a system with 60 tasks, assuming a task execution time of 0. Its drawback is that it cannot handle mixed-criticality task sets. Our proposed algorithm utilizes the properties inherent in CAN to provide an efficient way to rectify faults in the value domain. Due to the early-stopping property of the algorithm, the bus utilization increases linearly with the number of faults. We conclude that while the library is practical and efficient, fault-tolerant clock synchronization and fault handling in the time domain are necessary improvements before the library can be used in production systems.
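
To illustrate why 2t + 1 nodes can be enough to mask t faults in the value domain, and how an interval of acceptable values might be handled, the following is a minimal sketch. It is not the algorithm proposed in the thesis: it uses plain median voting instead, ignores CAN and the early-stopping property entirely, and the names `decide` and `tolerance` are hypothetical. It relies on the fact that among 2t + 1 proposals with at most t faulty ones, the median always lies between the smallest and largest correct proposal.

```python
from statistics import median


def decide(values, t, tolerance=0.0):
    """Pick one value from 2t + 1 proposals, of which at most t are faulty.

    Hypothetical helper for illustration only (median voting, not the
    thesis's algorithm). Returns the chosen value and the number of
    proposals that lie within `tolerance` of it.
    """
    if len(values) != 2 * t + 1:
        raise ValueError("expected exactly 2t + 1 proposals")
    # With at most t faulty proposals, at least t + 1 are correct, so the
    # middle element of the sorted list cannot fall outside the range
    # spanned by the correct proposals.
    chosen = median(values)
    # Count the proposals that agree with the chosen value up to the
    # accepted interval width.
    agreeing = sum(1 for v in values if abs(v - chosen) <= tolerance)
    return chosen, agreeing


# Five nodes (t = 2): three plausible sensor readings and two faulty ones.
value, support = decide([20.1, 20.3, 99.0, 20.2, -5.0], t=2, tolerance=0.5)
print(value, support)
```

If the correct readings are known to differ by at most `tolerance`, at least t + 1 proposals will agree with the chosen value, which a caller could use as a sanity check on the decision.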

Subject/keywords

Byzantine fault tolerance, Real-time scheduling, CAN, Distributed systems, Embedded control systems
