Embedded Chaos: Assessing the feasibility of failure injection to evaluate an embedded software system
Master's thesis
Software engineering and technology (MPSOF), MSc
Producing a robust software system, one that continues to function in spite of unexpected faults, is difficult. Traditional software testing falls short in this regard, as unexpected conditions by definition cannot be tested for, no matter how creative the developer. A different method has instead taken root in the distributed systems world: Netflix’s ChaosMonkey is designed to randomly stress test a running system. The non-deterministic element in ChaosMonkey serves to test Netflix’s architecture against the unexpected faults common in a distributed system, such as network outages. However, robustness is an issue in other systems as well. In particular, Ericsson identified a need to improve the robustness of its Radio Base Station (RBS) system, a core platform in the telecommunications solutions the company provides, and believed that an approach similar to ChaosMonkey could be beneficial. This theory was tested by exposing the RBS to a number of provocations, each of which the system should be able to handle. A software tool was developed to produce and execute a random series of actions in an attempt to provoke a failure in an RBS. After more than two thousand such series had been executed on an RBS over the span of several days, a number of failures had been observed. These failures were believed to be caused by an intermittent error in a component of the RBS software. The study thereby demonstrated the potential of non-deterministic testing as a means of evaluating a software system. The project was carried out at Ericsson’s Lindholmen site in Gothenburg, Sweden.
Computer and Information Science, Information & Communication Technology