Fault-tolerant scheduling of real-time parallel DAG tasks on multiprocessors

Publicerad

Författare

Typ

Examensarbete för masterexamen
Master's Thesis

Modellbyggare

Tidskriftstitel

ISSN

Volymtitel

Utgivare

Sammanfattning

The imperative of maximizing hardware utilization compels innovation and a strive towards finding more efficient solutions in real-time systems. The directed acyclic graph (DAG) model of parallel tasks is commonly used to represent the data dependencies of real-world applications. Providing techniques for tolerating and recovering from hardware and software faults for such a model is paramount to its usability in critical systems. Sectors like avionics and automotive require fault tolerance to ensure certification and guarantees of safe operation, which is why this combination is important. This thesis considers transient faults. Examining all possibilities of faults occurring in a DAG task is computationally expensive. Therefore, developing efficient methods for bounding the worst-case makespan under faulty conditions is a non-trivial problem and one which is examined in this thesis. A fault-aware schedulability test for a taskset can be derived from finding the number of processors required to meet each task’s deadline in the taskset. This thesis introduces six novel fault-aware schedulability tests that explicitly account for the runtime overhead of using fault recovery through node re-execution. Further, a workconserving scheduler is assumed and the federated scheduling technique is employed to address the problem of guaranteeing the schedulability of DAG tasksets. To bound the worst-case interference caused by re-executed nodes, the tests employ new analytical techniques and build upon existing fault-unaware scheduling techniques for efficient scheduling of DAG tasks. To evaluate the effectiveness of the proposed tests, a simulation framework was developed that is capable of generating random DAG tasks whose structure and computational load reflect that of real-world applications. Simulation results indicate that exploiting structural information of multiple long paths of a DAG task significantly enhances the power of the proposed tests in determining schedulability, regardless of the variation in the simulation parameters.

Beskrivning

Ämne/nyckelord

real-time scheduling, parallel tasks, fault tolerance, federated scheduling, work-conserving scheduling, computer science

Citation

Arkitekt (konstruktör)

Geografisk plats

Byggnad (typ)

Byggår

Modelltyp

Skala

Teknik / material

Index

item.page.endorsement

item.page.review

item.page.supplemented

item.page.referenced