A Runtime Verification Framework For Kotlin Mobile Applications

Master’s thesis in Computer science and engineering

Denis Furian

Department of Computer Science and Engineering
Chalmers University of Technology
University of Gothenburg
Gothenburg, Sweden 2020

© Denis Furian, 2020.

Supervisor: Gerardo Schneider, Department of Computer Science and Engineering
Industrial Advisor: Boris Tiutin, Opera Software AB
External Advisors: Christian Colombo, University of Malta; Yliès Falcone, University Grenoble Alps
Examiner: Wolfgang Ahrendt, Department of Computer Science and Engineering

Master’s Thesis 2020
Department of Computer Science and Engineering
Chalmers University of Technology and University of Gothenburg
SE-412 96 Gothenburg
Telephone +46 31 772 1000

Typeset in LaTeX
Gothenburg, Sweden 2020

Abstract

The Kotlin programming language has recently been introduced to Android as the recommended language for development. We investigated whether we could use this language to improve the state of the art for Runtime Verification on mobile devices and focused on creating an API to monitor the execution of coroutines, one of the main Kotlin functionalities that are not featured in Java. This API should be employed by Android programmers to carry out concurrent tasks in a monitored environment and verify at runtime that the Kotlin guidelines and best practices for coroutines are being followed.
We identified a number of such guidelines and redefined them as properties to either monitor through Runtime Verification or enforce through Runtime Enforcement; we then tested them on an in-house Android app built using our API. In this report we present the API and the results of our tests concerning performance overhead and memory usage, as well as our ideas for future development.

Keywords: Runtime Verification, Runtime Enforcement, Android, Kotlin, Aspect-oriented Programming, Monitor, Coroutine, Concurrency, Structured Concurrency

Acknowledgements

I wish to express my gratitude towards my supervisor Gerardo Schneider, who has lent me his expertise and continued support throughout the project. I would like to thank Christian Colombo, from the University of Malta, and Yliès Falcone, from the University Grenoble Alps, for their invaluable support in directing my efforts in examining the state of the art for Runtime Verification and identifying properties. My thanks go also to Boris Tiutin and the rest of the Opera Software office in Gothenburg, without whom this project wouldn’t have been possible. Finally, I want to thank Wolfgang Ahrendt, my examiner, for introducing me to the field of Runtime Verification and giving me the opportunity to write this thesis.

Denis Furian, Gothenburg, February 2020

Contents

List of Figures
List of Tables
List of Listings
1 Introduction
1.1 Problem Statement
1.1.1 RV of Android
1.2 Goals and Challenges
2 Background and Related Work
2.1 Runtime Verification and its Application to Android
2.1.1 Expressing Properties
2.1.2 Monitoring the Execution
2.1.2.1 Application-centric RV
2.1.2.2 Device-centric RV
2.1.3 Existing Tools
2.1.4 Limits of the Current Situation
2.2 The Kotlin Programming Language
2.2.1 Non-Nullable Types
2.2.1.1 Interoperability with Java and Platform Types
2.2.2 Default Function Arguments
2.2.3 Absence of Checked Exceptions
2.2.4 Coroutines
2.2.4.1 Types of Coroutines
2.2.4.2 Termination of a Coroutine
2.2.4.3 Coroutines in Android
2.2.5 Problems and Limitations
3 Prototype for a Monitoring Tool
3.1 Target Properties
3.1.1 Property 1: DestroyedWithOwner
3.1.2 Properties of tasks that return a value
3.1.2.1 Property 2: NormalAsync
3.1.2.2 Property 3: ExceptionalAsync and Property 4: NeedHandler
3.1.3 Property 5: NoBlockUI and Property 6: UpdateUI
3.1.4 Property 7: ResumeIfNeeded
3.1.5 List of Monitors
3.2 First Approach: Hardcoding Monitors into the Application
3.2.1 Definition of tasks to execute on coroutines
3.2.1.1 Property 8: AlwaysOneJob
3.2.1.2 Property 9: SuccessWithJSON
3.2.2 Class implementation of monitors
3.2.3 Running the monitors in the ViewModel
3.2.3.1 Considerations
3.3 Second Approach: Using Annotations to Generate Compile-time Monitors
4 Implementation of an API for Monitoring Kotlin Coroutines
4.1 The Interface MonitoredComponent and its API
4.1.1 First Version of the API
4.1.1.1 Keeping Track of Tasks
4.1.1.2 Handling Exceptions
4.1.1.3 Remembering Failed Tasks
4.1.1.4 Referencing the Correct Task
4.1.2 Implementation of the Methods launch and async
4.1.3 The MonitoredApplication Class
4.1.4 The MonitoredActivity Class
4.1.5 The MonitoredViewModel Class
4.2 The Finalised API
4.2.1 The Methods launch and async
4.2.2 The New MonitoredComponent Interface
4.2.2.1 The New MonitoredActivity and MonitoredViewModel Classes
5 Experimental Validation
5.1 Testing the API on a Proof-of-concept App
5.1.1 Test results
5.2 Benchmarking the Application
5.2.1 Benchmarking Tool
5.2.2 Results
5.3 Memory Footprint
6 Conclusion
6.1 Considerations
6.2 Perspectives and Future Work
Bibliography

List of Figures

2.1 Example of a connected graph representing an execution.
2.2 Example of an aspect implemented using the AspectJ syntax: this syntax allows pointcuts and pieces of advice to be defined in a way that is similar to methods.
2.3 Example of an aspect implemented using AspectJ annotations: they can be used to mark methods of a standard Java class as pointcuts (with the @Pointcut annotation) or pieces of advice (e.g. by using the annotations @Before, @After and @Around).
2.4 The Kotlin compiler assigns Java objects with platform types unless the Java code uses specific annotations. The upper part of the figure contains Java code with definitions for the methods getNullValue, getAnnotatedNullValue, which is annotated with @Nullable, and getAnnotatedNotNullValue, which is annotated with @NotNull. The lower part of the figure showcases the return types assigned to each method by the Kotlin compiler.
2.5 The Java method receiveSomething is not notified about the Kotlin function getFoo throwing an IOException.
2.6 The @Throws annotation ensures that the Java method receiveSomething will not compile unless the call to getFoo handles the possible exception.
2.7 By using withContext it is possible to switch context for the currently running coroutine without spawning a new one. The figure displays the source code on the upper half and the output on the lower half. Compare the first part of the output (before the first empty line) with the second: in the latter there are three different coroutines (marked as SECOND COROUTINE#3, #4, #5) whereas in the former there is only one (marked as SECOND COROUTINE#2), indicating that no new coroutine was created.
3.1 Graph of the lifecycle of an Android Activity, taken from the official Android documentation [63].
3.2 Notification from the Android OS that the UI thread is blocked for the application aptly named “App blocking the UI thread”.
3.3 Suspend points marked on lines 55, 58, 61 and 63.
3.4 The image browsing app developed for testing hard-coded monitors.
3.5 Simplified representation of the app’s architecture: once the buttonTags UI element is tapped by the user, it triggers the search method which, in turn, launches the getImages method on a coroutine. This coroutine launches an asynchronous task to read bytes from the remote endpoint (in the readText method) and parses the result with the method parseFlickrImageJson to obtain one or more images. The images are used by the updateResultLiveData method to notify the activity, which receives the new data and displays it in the method loadNewData.
3.6 Representation of the inner workings of the Monitor class described in Figure 3.7. The method check reads an input trigger and starts a series of internal operations that may or may not cause a change in the monitor’s state. The method yields a state that is either an “OK state” or an “error state”.
3.7 UML representation of the State, Property and Monitor classes. The Trigger class is a type alias for String.
3.8 Flow chart detailing the logic for the functions doNext and afterResult of the MonitoredViewModel class.
3.9 Flow chart for the function maybeDoNext of the MonitoredViewModel class.
3.10 Updated version of the architecture in Figure 3.5 with added information on instrumentation: the coroutine builder methods launch and async, displayed in a different colour, are wrapped inside the maybeDoNext and afterResult methods (shown respectively in Figures 3.9 and 3.8). Throughout the execution of each method inside the viewmodel, the internal state of the Monitor instance is updated with the number of search operations currently ongoing (“active searches ++” signifying an increment by 1 and “active searches −−” a decrement by 1).
4.1 Representation of the logic followed by the updated launch method for saving a CoroutineExceptionHandler inside the context.
4.2 New behaviour of the launch and async methods with regard to the CoroutineContext: the methods beforeTask and beforeLaunch/Async progressively update the context, leaving the user free to define how it should be updated.
5.1 Simplified call diagram for the proof-of-concept app. Each rounded rectangle represents a class and each rectangle inside it represents one of its methods, with the orange rectangles being the ones employing coroutines and, therefore, the ones that we were interested in monitoring. The arrows represent the call flow starting from the activity. Note that inside the BrowsePicturesViewModel class there is a second flow that ends in error and updates the UI accordingly.

List of Tables

3.1 Summary of the properties that we wanted to verify and which of the monitors defined in section 3.1.5 would carry out the necessary controls.
3.2 Updated version of Table 3.1 containing properties 8 and 9.
5.1 Technical specifications of both devices employed in our benchmarks.
5.2 Benchmark results for the launch method listing the fastest, slowest and average times for the “standard” version and the instrumented one, with the overheads in the rightmost columns. All values are in nanoseconds (ns) save for the last column.
5.3 Benchmark results for the async method listing the fastest, slowest and average times for the “standard” version and the instrumented one, with the overheads in the rightmost columns. All values are in nanoseconds (ns) save for the last column.
5.4 Benchmark results for the async method, called a thousand times in parallel. The table lists the fastest, slowest and average times for the “standard” version and the instrumented one, with the overheads in the rightmost columns. All values are in nanoseconds (ns) save for the last column.
5.5 Benchmark results for the ViewModel simulation listing the average time for the “standard” version and the instrumented one, with the overheads in the rightmost column. All values are in nanoseconds (ns).
5.6 Memory reserved by the Application instance, not instrumented (second column) and then instrumented (third and fourth columns). All values are in bytes.

List of Listings

1 Difference in declaration for nullable and non-nullable types in Kotlin: the former are marked with a question mark (?).
2 The Kotlin compiler uses smart casts on nullable objects after they have been checked for null.
3 Short-circuit evaluation applied to nullable objects: when a nullable variable is asserted non-null in a skippable expression, the compiler tries to skip the assertion.
4 Platform types cannot be used explicitly and the user is not allowed to assign them to a variable.
5 Example of function with default arguments: calling the function foo without specifying any value for its arguments a, b and c will populate them with the indicated values (or evaluated expressions).
6 Implementations of the AlwaysOneJob and SuccessWithJSON properties expressed using the Property class.
7 Implementation of the search method: the code on lines 5-8 is only used to manually inform the viewmodel of a new search operation being initiated, with invocations of the methods readState and updateState to respectively read and change the internal state of the monitor.
8 “Ideal” implementation of the search method where the boilerplate code featured in Listing 7 is hidden.
9 Our version of the coroutine builder methods: the signature is thus the same as the standard one for CoroutineScope.launch [56] and CoroutineScope.async [57] with the sole addition of the MonitoredComponent as an argument.
10 Declarations for the internal data structures in the MonitoredComponent interface.
11 Formal definition of the methods with which the MonitoredComponent interface handled the initialisation, start and termination of tasks.
12 Implementation of our version of the launch method in order to save references to the launched tasks. As can be seen, our version of launch was actually a “wrapper” function calling the default launch (line 8) after performing some control operations. The call on line 12 was inside a finally block, meaning it would be executed whenever the task was cancelled.
The task might actually be cancelled before the call on line 15, which would result in the launchedTasks array not holding any references to the task: for this reason we used the key argument in the method onComplete rather than the task itself. The same applied to our version of the async method, which performed the same operations and then called the default version of the same method.
13 Implementation of our version of the launch method in order to ensure that one CoroutineExceptionHandler instance should always be in the context. The assignment at lines 6-7 ensured that the handler we would use be either the one already in the context or, should no such object exist in the context, the defaultHandler data structure defined inside the component. By employing the expression context[CoroutineExceptionHandler.Key] we queried the coroutine context for its current handler (receiving a null if no handler was defined). It should be noted that lines 11-15 technically add nothing to the function: ignoring CancellationExceptions is standard coroutine behaviour since they are just a notice of cancellation, and any Exception thrown inside a block of code is automatically rethrown (as stated in 2.2.3).
14 Our implementation of the launch method in order to both keep record of running tasks and ensure the presence of a CoroutineExceptionHandler instance in the coroutine context.
15 Our implementation of the async method in order to both keep record of running tasks and rethrow any exceptions that should be raised during the task’s execution.
16 Declaration of the WrongDispatcherException class.
17 The extension function that we implemented as a fake accessor for the Throwable class.
It returns true when called on an instance of CalledFromWrongThreadException, allowing us to distinguish objects of this type by matching their class name.
18 Complete implementation for the launch method.
19 Complete implementation for the async method.
20 Declaration and initialisation of the data structure we used to keep record of the recommended coroutine dispatchers.
21 Code of the MonitoredActivity, subclass of Activity that implements the MonitoredComponent interface.
22 Implementation of the extension function getMonitoredViewModel that provides any instance of MonitoredActivity with a matching viewmodel and initialises it automatically.
23 Code of the MonitoredViewModel, subclass of ViewModel that implements the MonitoredComponent interface.
24 Signatures of the new methods replacing the old API.
25 Implementation of the MonitoredScope object and its two extension functions as short-hands for invoking the method launch or async on the component’s coroutineScope field.
26 Snippets of the code referenced by the recommendedDispatchers data structure upon inspection.
27 An example of a coroutine that bypasses our monitoring.
28 An example of nested coroutine builders, each using a randomised dispatcher and lacking an exception handler.
29 Implementation of the test for simulating the non-monitored execution.
30 Implementation of the test for simulating the monitored execution.
1 Introduction

Throughout the last decade smartphones have become a huge part of everyday life, growing from portable telephones into complex computers with which users can not only communicate but also access their bank accounts, monitor their health, play video games, find their car in a parking area and more. All sorts of applications can be found in virtual stores for Android and iOS and, in turn, there is a growing need to guarantee that they pose no threat to the user’s smartphone, whether through a faulty implementation (e.g. an app consuming more battery than needed) or through malicious operations (e.g. an app leaking personal data).

Runtime Verification (RV) [1, 2, 3, 4, 5] is one of the possible methods of monitoring smartphone applications and reporting unwanted behaviour: it consists of a series of checks that verify at runtime whether the target program complies with a set of user-defined properties. In the case of devices using Google’s Android operating system, there have been several attempts to employ RV (e.g. [30, 6, 7]). The Android operating system is however constantly changing, with Android 10 [8] being the newest version at the time of writing, and monitoring tools have to stay up to date with the more recent releases.

With this project we set out to observe the current state of Runtime Verification on Android and determine how it could be pushed further. Aiming to improve the state of the art, we focused our efforts on the newly supported Kotlin language [9]: more precisely, we wished to investigate how the features that set it apart from Java could be monitored.

The project was developed in cooperation with Opera Software AB [19], a Norwegian software company primarily known for its desktop web browser Opera, its mobile web browser Opera Mini and its recent shift to mobile development with fintech applications for Android (e.g. [20, 21, 22]).
Opera Software, or just “Opera”, was founded in 1995 and to this day has its headquarters in Oslo, Norway, and offices in Poland (Wrocław), Sweden (Linköping, Göteborg, Stockholm) and China (Beijing). For most of its existence it has focused on browsers and promoted Web standards through participation in the W3C.

1.1 Problem Statement

Runtime Verification is one of several methods used for ensuring the correctness of programs. It involves monitoring a running program or system and observing whether it complies with one or more given properties during its execution; a report is then generated detailing whether any property has been violated.

Runtime Verification, or “RV”, can be split into two major tasks:
1. defining the properties that the system must comply with: they will be expressed with a statement that will be more or less complex depending on their nature;
2. generating the monitors that will verify these properties: this is often platform-dependent as it involves modifying a program or wrapping it in an observable layer.

When monitors are not part of the target system but, rather, a second program, a need arises to connect the monitors to the system in such a way that they can effectively observe the execution. The operation of detecting traces of events from the target program and relaying them to a monitoring system is known as instrumentation [4] and can be carried out on the target program’s binary code or directly on its source code.

1.1.1 RV of Android

Since the introduction of the Android operating system, there has been a growing interest in applying RV to it. This has spawned several research efforts throughout the past years (e.g. [13, 30, 7]). For most of Android’s lifetime, its main language for development has been Java; naturally, most attempts to introduce RV to Android have employed Java-specific tools (e.g. [12]) to instrument applications.
Since Google announced official support for Kotlin in 2017, the status of preferred development language for Android has shifted to Kotlin, a more recent programming language that can interoperate with Java while also introducing a set of unique features. There are a number of differences between Java and Kotlin; the most relevant for our research are:
• checked exceptions: unlike Java, Kotlin does not force the programmer to handle expected exceptional behaviour; when an API can be expected to throw an exception, the Kotlin compiler will raise no error or warning. Behind this choice is a push to favour special return types instead of exceptions [37], when possible;
• coroutines: introduced as a concept in the 1960s [16], they are lightweight tasks that can run concurrently inside threads; they do not exist in Java and Kotlin introduced them as an alternative to callbacks [17] in Android development, especially for dealing with lifecycle-aware structures.

We found that most of the recommended usage practices of Kotlin can be monitored by means of static analysis. Coroutines, on the other hand, are quite different. In Kotlin they are intended for use with structured concurrency [42], which means that entry and exit points must be made clear and that all tasks are either completed or cancelled before the end of the execution. This approach, coupled with Kotlin’s choice not to enforce exception handling, means that a whole coroutine scope will be terminated should any task crash [18].

Coroutines are designed to run either attached to a single thread or moving from one thread to another. They are implemented as stackless, which reduces the overhead caused by saving the current state between each thread jump, and can move between main and secondary threads depending on the task they need to carry out. We deemed these aspects of coroutines to be interesting to investigate in our project to further RV on Android.
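As a hedged illustration of the first difference listed in this section, the sketch below places a Kotlin function that may throw an IOException without declaring it next to a variant that reports the failure through a special return type. The names readConfig, readConfigSafely, describe and ReadResult are hypothetical and chosen only for this example; they are not part of the API presented in this thesis.

```kotlin
import java.io.IOException

// Hypothetical result type: the failure is encoded in the return value.
sealed class ReadResult {
    data class Success(val text: String) : ReadResult()
    data class Failure(val reason: String) : ReadResult()
}

// Compiles without any "throws" declaration: Kotlin has no checked exceptions,
// so callers get no compile-time hint that this call may fail.
fun readConfig(path: String): String {
    if (path.isBlank()) throw IOException("empty path")
    return "contents of $path"
}

// Same operation, but the failure case is now visible in the signature.
fun readConfigSafely(path: String): ReadResult =
    try {
        ReadResult.Success(readConfig(path))
    } catch (e: IOException) {
        ReadResult.Failure(e.message ?: "unknown I/O error")
    }

// The compiler checks that every subclass of the sealed type is handled here.
fun describe(result: ReadResult): String = when (result) {
    is ReadResult.Success -> "ok: ${result.text}"
    is ReadResult.Failure -> "failed: ${result.reason}"
}

fun main() {
    println(describe(readConfigSafely("app.json")))
    println(describe(readConfigSafely("")))
}
```

With the sealed return type, forgetting to handle the failure case becomes a compile-time error in the when expression, which is the kind of guarantee that checked exceptions provide in Java.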
1.2 Goals and Challenges

In this thesis project we aimed to define an RV framework for Android applications with three main tasks:
1. expressing a range of system properties as wide as possible;
2. making the monitoring system translate said properties into meaningful Android code without loss of information;
3. defining the feedback to be returned by the monitoring system, in the form of either a corrective action or a report produced post-execution.

With this framework we aimed to address the research questions expressed below.

What are the meaningful properties of Kotlin coroutines that we can verify by employing RV?
We decided to focus our efforts on coroutines, which are well-suited for Runtime Verification and Enforcement. We could inspect official guidelines, design patterns and recommended practices to identify examples of good app behaviour and transform these examples into properties to observe or enforce.

Can we use the Kotlin language to monitor properties of the more specific Kotlin features?
We set out to use the Kotlin language to either adapt an existing tool for RV of Android or create our own. This process would ideally allow us to maximise our ability to monitor Kotlin properties.

How can we push further the current situation for RV of Android applications by using Kotlin?
We would evaluate our framework to determine what possible use cases it could suit best and how it could advance the state of the art for RV of Android. Our evaluation would include testing the way that properties are observed as well as measuring additional impacts of our solution on the target app’s execution; for example, performance overheads caused by the monitoring effort.

Our research questions would be addressed by using the Design Research methodology [14]: we would formulate hypotheses on the problem at hand and, subsequently, design a solution for each hypothesis and implement it into an artefact.
This, once tested, would either validate or invalidate the starting hypothesis. We would follow this procedure iteratively for each of our research questions.

This project can be split into two major phases: a theoretical one for observing the state of RV on the Android platform and a more practical one for contributing to the current situation. The first phase, which covered the first half of the project, consisted of assessing the situation on multiple fronts:
• investigating the state of Android RV gave us an overview of what had already been tried and what had not yet been accomplished;
• examining the unique features of the Kotlin language, as well as its similarities and differences with Java, helped us single out what could be interesting for our research.

At the conclusion of the first phase we could determine that the situation of application-centric RV was quite advanced not only for Java Android applications but also for Kotlin ones. Android applications run on Dalvik bytecode, which is converted from JVM bytecode: since Kotlin on Android compiles to JVM bytecode, any monitor that works on a Java application will therefore also work on a Kotlin one.

Our investigation presented us with a choice: either delve deeper into RV with little concern for the development language, or shift our main focus to Kotlin.
We opted for the latter and chose coroutines as our main interest, with the following motivations:
• multithreading is an important topic in Android development: multiple apps and services are always running in either the foreground or the background and often multiple threads are used within the same application; several APIs have been developed to improve the control a programmer has over a multithreaded environment and coroutines contribute in their own unique way;
• we found several limitations with the current debugging tools for coroutines, especially on Android where the debug agent is not supported due to a lack of libraries;
• while it is possible to implement coroutines in Java [51], they are so far only a language feature in Kotlin and, as such, are expected to be expanded in the future and receive continued support by JetBrains.

The goal of our project therefore turned from “improving the state of the art of Android RV” to the more specific “expanding Android RV to coroutines for both monitoring a concurrent environment and strengthening the available debugging tools”, with our attention turning from the final user of an Android application to the developer.

After establishing the final scope of our project, we moved on to the second phase, which consisted of actually implementing our solution.

2 Background and Related Work

For this project we focused our efforts on improving the state of the art for Runtime Verification on the Android mobile platform. There have been several research efforts on RV of Android, with variations in their approaches to both defining properties and monitoring them. From Android’s launch in 2008 to early 2019, applications could almost exclusively be written in Java [24] and therefore most implementations of monitoring tools for Android direct their efforts towards translating properties into Java code.
This chapter covers both RV and the Kotlin language, listing some of their features and their relevance to Android: section 2.1 describes the main research topics involved with RV for Android and gives an overview of the current situation and how it can be advanced further; section 2.2 lists the main features of Kotlin and the differences between this language and Java, focusing towards the end on Kotlin coroutines.

2.1 Runtime Verification and its Application to Android

The nature and the goal of a program can determine what properties should be considered for monitoring. Some programs handle personal data and it is therefore critical to ensure this data is not leaked; some other programs carry out money transactions and they need to be robust enough to handle failures with minimal impact. Properties can also involve the platform on which a program is run and the impact it has on the platform’s resources.

Properties can be expressed at different levels of abstraction as well as combine platform-specific issues with general concerns such as security: for example, limiting data access of a company app running on a BYOD (“Bring Your Own Device”, i.e. a personally owned device used in place of an officially provided one) smartphone of an employee [10].

While properties can be as simple as “user’s location is never tracked”, the more interesting cases are the ones where properties are defined by a context, given as either the current state of the execution (“user’s location is never tracked while the device is charging”) or a trace, i.e. a sequence of actions that matches the current execution (“user’s location is only tracked after the user has taken a picture”).

[Figure 2.1 shows two states, “OK” and “Bad”, with edges labelled “[condition] event”.]
Figure 2.1: Example of a connected graph representing an execution.

2.1.1 Expressing Properties

The more complex properties are, the more they require an expressive language to represent them.
While some simple “if A, then B” properties can most likely be written in a straightforward way, others (such as the examples given at the end of the previous section) require a more articulate formula. Previous works on this topic have used both logic-based approaches (mostly based on variations of linear temporal logic [25]) and finite-state automata [30] to represent stateful properties and traces.

Automata are stateful control mechanisms designed to read an input, evaluate it and advance their state depending on the outcome of the evaluation. A finite-state automaton can only be in one of a limited number of states; the automaton starts in what is called the “initial” state and determines the next one by means of a transition function: this is a function that considers the combination of current state and latest input to decide what state to advance to.

When used for properties, automata can be represented as connected graphs with variables to keep track of the current evaluation of the property:
• each node represents a number of execution states grouped in the same equivalence class: usually there are one or more “bad” states which are associated with the breach of a property;
• each edge contains a condition and an event, where the condition is a statement that is checked when the event is detected.
The initial state is usually a “good” state and, with the violation of a property, the automaton enters one of the “bad” states.

An example of such a graph can be seen in Figure 2.1. The example is intentionally kept simple for ease of understanding, with only one kind of event, one condition and no variables. The graph can be read as:
• execution begins in the “OK” state;
• whenever an event is triggered but the condition is not met, the “OK” state is maintained;
• if an event is triggered and the condition is met, the state shifts to “Bad”;
• once in the “Bad” state, any event will maintain the “Bad” state regardless of the condition being met or not.

2.1.2 Monitoring the Execution

Instrumentation is the phase that follows the definition of properties. As mentioned in section 1.1, it consists of connecting the target application to a monitoring system that can detect traces of events.

The paradigm of Aspect-Oriented Programming (AOP) is focused on separating “cross-cutting concerns”, which are anything that influences multiple functionalities in a program, e.g. error reporting or security. Monitors fit quite well in this category, so AOP is naturally seen as an ideal way of implementing them [11]: it allows monitoring code to be woven into several places of a program regardless of their functionality.

As mentioned earlier in this document, the most common way of implementing monitors for Android applications is by employing tools that translate properties to Java monitoring code. The Java implementation of the AOP paradigm is called AspectJ [12] and employs a dedicated compiler to “read” directives and “inject”, or “weave”, them into the target program. By writing monitoring code through an aspect-oriented approach, AspectJ can weave monitors into applications that are already compiled, so that potentially any application in the Google Play Store can be instrumented.

AspectJ works by reading one or more “aspects”: they are implementation units composed of pieces of “advice”. Their structure can be explained as follows:
• a join point is a well-specified point in the execution flow of the program, e.g. a method call, an instantiation or a value being returned;
• a pointcut identifies one or more join points through a filter on several search parameters such as package, class, method, arguments or annotations, to name a few;
• an advice is a block of code that can be executed before, after or around the join points matched by a pointcut, as well as when such a join point returns a value or throws an exception.
An aspect includes one or more pieces of advice as well as inter-type declarations of objects that can store data or observe the behaviour of specific instances.

Figures 2.2 and 2.3 showcase different ways of defining an aspect: though AspectJ features two types of syntax for writing advice code, the general idea is that a piece of advice executes a given block of code at a given time before or after an event takes place, if that event has been identified by means of a pointcut. The examples show different definitions for the same advice code:
1. two pointcuts called callSetInt and callGetInt identify the join points where the methods setInt(..) and getInt(..) of the class foo.bar.Baz are called;
2. a piece of advice is executed before each of the pointcuts:
  • one is called before the callSetInt pointcut is executed and increases an internal counter;
  • the other is called before the callGetInt pointcut is executed and decreases the counter.

Figure 2.2: Example of an aspect implemented using the AspectJ syntax: this syntax allows pointcuts and pieces of advice to be defined in a way that is similar to methods.

After instrumenting an application with the above example, the advice code will run just before the above methods are called on an instance of the foo.bar.Baz class.
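As a rough illustration of the annotation-based style, the counter aspect described above could be sketched in Kotlin as follows. This is only a sketch under stated assumptions, not the code shown in the figures: the class name CounterAspect and the counter property are hypothetical, the AspectJ runtime library is assumed to be on the classpath, and the advice only fires once the class has been woven into the target application by the AspectJ tooling.

```kotlin
import org.aspectj.lang.annotation.Aspect
import org.aspectj.lang.annotation.Before
import org.aspectj.lang.annotation.Pointcut

// Hypothetical aspect: the class name and the counter are illustrative only.
@Aspect
class CounterAspect {
    // Internal counter updated by the two pieces of advice below.
    var counter = 0
        private set

    // Pointcut matching every call to foo.bar.Baz.setInt(..), whatever its arguments.
    @Pointcut("call(* foo.bar.Baz.setInt(..))")
    fun callSetInt() {}

    // Pointcut matching every call to foo.bar.Baz.getInt(..).
    @Pointcut("call(* foo.bar.Baz.getInt(..))")
    fun callGetInt() {}

    // Advice executed before each join point matched by callSetInt.
    @Before("callSetInt()")
    fun beforeSetInt() {
        counter++
    }

    // Advice executed before each join point matched by callGetInt.
    @Before("callGetInt()")
    fun beforeGetInt() {
        counter--
    }
}
```

Without weaving, the class behaves like any other Kotlin class: the annotations are inert and the advice functions only run when called explicitly.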
2.1.2.1 Application-centric RV

A widely used approach in RV has been the monitoring of an application of choice: the user chooses a set of properties to verify for one app, then expresses those properties by means of some RV tool that translates them to code, which is used to instrument the app. Most Android tools employ a combination of logic and automata to define properties and a technology based on AspectJ for weaving monitors into the target application.

The AspectJ compiler instruments the compiled .class files by weaving aspect code into them [23]. This way, it is possible to recompile any Android app to add monitoring code:
1. the APK (Android Package) file of an app is processed through a Dalvik decompiler such as, for example, dex2jar [15], yielding a JAR (Java Archive) containing the bytecode files of the application;
2. the JAR is recompiled by ajc (the AspectJ compiler) and the monitors are added as pieces of advice;
3. the instrumented JAR is recompiled into an APK that is installed over the previous version.
This process proves to be a powerful tool for weaving monitors into the bytecode of any Android application; so long as a verifiable property can be somehow converted to a Java algorithm, it can be woven into any app.

While AspectJ is designed with Java in mind, it can also be made to work with Android applications written (partially or entirely) in Kotlin. Kotlin is a cross-platform language (as is explained in section 2.2) and, when employed for Android development, it compiles to JVM bytecode. AspectJ works by reading compiled Java classes, so it can read compiled Kotlin classes in the same fashion.

With a certain degree of knowledge of how the Kotlin compiler translates Kotlin code to JVM bytecode it is possible to instrument a Kotlin app with AspectJ: the programmer only needs to declare pointcuts using whatever Java code the Kotlin compiler generates.

Figure 2.3: Example of an aspect implemented using AspectJ annotations: they can be used to mark methods of a standard Java class as pointcuts (with the @Pointcut annotation) or pieces of advice (e.g. by using the annotations @Before, @After and @Around).

In the field of app-centric RV there is a push to both:
• define properties in a way that is as flexible as possible to translate them to code;
• improve the reach of AspectJ monitors to cover not just the app but its libraries as well.

2.1.2.2 Device-centric RV

One of the limitations of the application-centric approach described above is that property monitoring (or enforcement, where applicable) is restricted to the application level: the device, as a whole, is left largely unaffected. This means that properties involving the device as a whole cannot be verified: if, for example, a parent wants to limit their child’s time spent on the Internet, they will need to act directly on the child’s device rather than on single applications.

There are generally two approaches to running runtime verification on a system with multiple applications running on it:
• having a single, central monitor watch over the applications, receiving constant updates about what is happening in each of them (which has been carried out on Android in the past [27]);
• having several monitors, one for each relevant application, communicate with one another to coordinate the verification process (as seen in DMaC [28]).

Both solutions come with advantages and drawbacks [26]. The first approach, known as “orchestration”, allows a centralised structure to keep track of all events happening in the device, but it also means that the central monitor handles a lot of traffic, with a high risk of bottlenecks; the second approach, despite being more distributed and not requiring one monitor to do the work of many, comes at the cost of needing intercommunication between the several monitors (and possibly shared memory).
2.1.3 Existing Tools

Throughout the existence of Android, several tools for RV have been developed, both for device-centric verification (as mentioned previously) and for the application-oriented approach.

RV-Droid [6] is one such tool, and one of the first to be developed. It enabled Runtime Verification (as well as Enforcement) by allowing the device owner to instrument a target application among the ones installed: the user would choose the target app and select, from a predefined range, the properties to verify. RV-Droid would then translate those properties into monitors, either locally or by means of a remote server, and finally weave the monitors into the target application through the process described in section 2.1.2.1, but by means of a modified version of AspectJ.

This kind of verification had the advantage of being easily set up and relatively unintrusive: the user would only have to download an app from Google’s Play Store, without the need for rooting their device or performing other, heavier modifications. Unfortunately it also came with the limitation of not being able to instrument code from the libraries used by the target app: libraries provided by the Android runtime, as well as the ones downloaded dynamically as the app is executed, elude the static weaving process.

Another, more recent Android tool for RV is ADRENALIN-RV [7], which addresses these issues by:
• instrumenting the core libraries provided by the Android OS during the device boot operations;
• constantly weaving monitors into dynamically-generated library code by means of a remote server exchanging data with the tool installed onto the device.
This approach allows extensive coverage when verifying a single application but it also comes at the cost of requiring heavy customisation of the target device: the device needs to be injected with monitors during its boot phase and then needs to be constantly connected to the remote server for the continuous instrumentation process.

2.1.4 Limits of the Current Situation

While ADRENALIN-RV may have pushed the effectiveness of monitors, the field of RV is always striving to achieve a more powerful language to define properties, which is the “other” topic of interest of this field.

Furthermore, there is a growing interest in monitoring the intercommunication between processes [29], allowing the verification of properties that either span multiple applications or involve one app exploiting another app’s resources and permissions to perform actions otherwise not allowed.

2.2 The Kotlin Programming Language

Kotlin is a programming language developed by JetBrains in 2011. Since October 2017 it has been officially supported for Android development alongside Java. As of 2019 it has become Google’s preferred language for Android development.

One of Kotlin’s main features is interoperability with Java: this allows any Kotlin program to execute Java code and employ Java libraries. Being a cross-platform language, Kotlin can compile to JVM, JavaScript or LLVM; on the Android platform it targets the JVM.

For the purposes of this project, we considered compilation from Kotlin to JVM as our main focus. There are, however, some peculiarities coming from the cross-platform nature of the language; they will be addressed later as they could impact mobile development.

    val a: Foo = null  // yields a compilation error
    val b: Foo? = null // correct

Listing 1: Difference in declaration for nullable and non-nullable types in Kotlin: the former are marked with a question mark (?).
Kotlin has several differences from Java, some of which are described in the sections below, together with how Kotlin’s exclusive features are translated to JVM code by the compiler. To observe this we compiled Kotlin classes and then decompiled them with the Procyon [41] decompiler. This process gave us a translation of our Kotlin code to Java, which we used for observing how the Kotlin compiler converts the unique features of the former language to the latter.

2.2.1 Non-Nullable Types

One of the main features setting Kotlin apart from Java is the distinction between nullable and non-nullable types. Every Kotlin type is by default not nullable and can only be treated as nullable by appending the ? symbol, as shown in Listing 1. The purpose behind this is to avoid null references [31] as much as possible.

The Kotlin compiler can also recognise cases where a nullable object is not null: in these cases it will relax its checks, as shown in Listing 2.

    val foo: Bar? = someValue()
    if (foo != null) {
        /* foo is not null */
    }

    val foo: Bar? = someValue()
    if (foo == null) return
    // after this point,
    // foo is not null

Listing 2: The Kotlin compiler uses smart casts on nullable objects after they have been checked for null.

There is still the possibility of triggering a NullPointerException: this happens when a nullable value that holds null is forcefully read by using the !! operator. This way the programmer can still use nullable types but has a better way of avoiding null references.

It’s worth noting that some non-nullable types are already present in Java: they are primitive types and not classes, therefore they are never instantiated. When translating Kotlin to JVM code, the compiler assigns variables a primitive type if:
• they are explicitly typed as non-nullable:
  – Kotlin’s non-nullable Int and Boolean types compile to Java’s int and boolean primitive types;
  – by contrast, Kotlin’s nullable Int? and Boolean? types compile to Java’s Integer and Boolean classes;
• they are declared as nullable (e.g. Int?) but are initialised to a non-null value and are never assigned a null value throughout their lifecycle;
• their inferred Kotlin type is non-nullable: declaring a variable val a = kotlin.random.Random.nextInt() causes a to be compiled to a final int.

Compiled boolean expressions make use of short-circuit evaluation, which means that an expression is not evaluated if the result is inconsequential (e.g. in the expression true || (foo && bar), evaluation of foo && bar is skipped); this feature is used by the compiler to avoid assertions when possible (see Listing 3 for an example).

    // Kotlin code:
    fun foo(a: Boolean, b: Boolean?) {
        var c = a && b!!
    }

    // Compiled Java code:
    public final void foo(final boolean a, @Nullable final Boolean b) {
        boolean b2 = false;
        Label_0023: {
            if (a) {
                if (b == null) {
                    Intrinsics.throwNpe();
                }
                if (b) {
                    b2 = true;
                    break Label_0023;
                }
            }
            b2 = false;
        }
        final boolean c = b2;
    }

Listing 3: Short-circuit evaluation applied to nullable objects: when a nullable variable is asserted non-null in a skippable expression, the compiler tries to skip the assertion.

It is possible to check whether some lateinit var foo field has been initialised by reading the isInitialized property of its reference: ::foo.isInitialized will return false if the field has no value. In the JVM this translates to checking whether the backing field is null.

2.2.1.1 Interoperability with Java and Platform Types

By design Kotlin can be used together with Java: interoperability between the two is a selling point of the language.

When using Java classes in Kotlin code, all null checks are relaxed; instead of forcing the programmer to check for null references, Kotlin marks any Java objects as “platform types” [50], which may or may not be null. The reason for this is ensuring that “safety guarantees for (Java objects in Kotlin) are the same as in Java”.

Kotlin does not allow platform types to be used explicitly (see Listing 4), so the programmer is forced to treat them as either nullable or non-nullable. It is possible to coerce Kotlin into treating a Java object as either nullable or not by using the @Nullable and @NotNull annotations (from the library org.jetbrains.annotations) in the Java source code, as shown in Figure 2.4.

    val valueFromJava = javaObject.getFooValue() // typed as Foo!
    val foo: Foo? = valueFromJava // allowed
    val bar: Foo = valueFromJava  // allowed but, when translated
                                  // to bytecode, adds an assertion
                                  // that valueFromJava is not null
    val baz: Foo!                 // compile error: Unexpected token

Listing 4: Platform types cannot be used explicitly: the user is not allowed to declare a variable with such a type.

2.2.2 Default Function Arguments

It is possible to assign a default value to one or more arguments in a Kotlin function, as seen in Listing 5.

    fun foo(a: Int = 1, b: Int = a + 2, c: Int? = b) = c ?: b

Listing 5: Example of function with default arguments: calling the function foo without specifying any value for its arguments a, b and c will populate them with the indicated values (or evaluated expressions).

If a function with default parameters is always called with explicit values for all of its arguments (e.g. foo(0, 3, 2)), then the compiler translates it to a “normal” Java method; otherwise it splits it into:
• the “normal” method without default arguments;
• an additional “default” method (e.g. foo$default) with the original arguments, plus an additional one that indicates which arguments had an explicit value. This method initialises all arguments that were not provided by the caller and then calls the “normal” method using these values.

2.2.3 Absence of Checked Exceptions

Exception handling is treated differently in Kotlin than it is in Java: the most important difference is that Kotlin does not use Java’s checked exceptions. While a Java method may be forced to enclose one or more lines of code in a try-catch block (or add the throws keyword to a method’s signature), Kotlin imposes no such requirement.

The use of checked exceptions is a divisive topic as there are arguments both in favour of it [34, 35] and against it [32, 33]. The absence of this feature in Kotlin means that no method is forced to handle exceptions and therefore any exception not caught right away will automatically propagate to the calling method.

The documentation provided by JetBrains, however, discourages programmers from catching and throwing exceptions [37] and recommends instead the use of special return types. These types are “special” in that they wrap their values with additional information in case something has gone wrong. Typical special return types are Maybe and Either:
• the Maybe (or Option) type contains either a value or “Nothing”: before an instance is used in an expression, the programmer must check that it contains an actual value;
• the Either type is a more complex version of that: instead of possibly containing a value, it contains one of two values called “left” and “right”. This type is usually employed to carry error information: a “right” value is mnemonically associated with the expected result of an expression, while a “left” value is usually an error trace or other information describing a failure.

The Kotlin documentation suggests using special return types defined in the external library Arrow: Option [39] and Either [38] as mentioned above, plus a Try that behaves more similarly to a traditional try-catch [40].
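To make the idea of a special return type more concrete, the following is a minimal sketch of a hand-rolled Either written with the Kotlin standard library only. It is not Arrow's implementation: the type and the functions parsePort and describe are hypothetical names introduced purely for illustration.

```kotlin
// Hypothetical, hand-rolled Either type: a minimal sketch, not Arrow's implementation.
sealed class Either<out L, out R> {
    data class Left<out L>(val value: L) : Either<L, Nothing>()
    data class Right<out R>(val value: R) : Either<Nothing, R>()
}

// Returns a description of the failure on the "left" and the parsed number on the "right".
fun parsePort(text: String): Either<String, Int> {
    val port = text.toIntOrNull() ?: return Either.Left("not a number: $text")
    return if (port in 1..65535) Either.Right(port) else Either.Left("out of range: $port")
}

// The caller has to inspect which side it received before using the value.
fun describe(result: Either<String, Int>): String = when (result) {
    is Either.Left -> "error (${result.value})"
    is Either.Right -> "port ${result.value}"
}

fun main() {
    println(describe(parsePort("8080"))) // port 8080
    println(describe(parsePort("http"))) // error (not a number: http)
}
```

Because Either is a sealed class, the when expression in describe must cover both cases, which is what forces the caller to deal with the failure explicitly instead of relying on an exception.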
The rationale behind this recommendation is most likely to discourage the use of exceptions as special return values.

Despite JetBrains’ encouragement to use special return types instead of exceptions, Kotlin does not come with any such types by default, forcing the user to either write their own custom ones or import external libraries. The Android framework, due to its Java origin, makes use of exceptions and therefore Kotlin applications for Android are forced to either adopt exceptions or implement wrappers that simulate the aforementioned types.

The compiler raises an error whenever a null check is skipped and the use of lateinit can already guarantee the use of a value without having to initialise it to null. Of course, this alternative is not always possible since lateinit may not be used on primitive types such as Int, Boolean and others.

Figure 2.4: The Kotlin compiler assigns platform types to Java objects unless the Java code uses specific annotations. The upper part of the figure contains Java code with definitions for the methods getNullValue, getAnnotatedNullValue, which is annotated with @Nullable, and getAnnotatedNotNullValue, which is annotated with @NotNull. The lower part of the figure showcases the return types assigned to each method by the Kotlin compiler.

Kotlin’s interoperability with Java gives Java the ability to make calls to Kotlin code: this means that a Java method may invoke a Kotlin function that throws an unchecked exception. The compiler raises no errors in such cases (see Figure 2.5), which can prove a problem on the Java side. As a workaround for this, the Kotlin function throwing the exception can use the annotation @Throws to specify which exception types it might throw; in this case, the compiler will raise an error when an exception is not handled on the Java side (see Figure 2.6).

The opposite statement also holds true, as Kotlin code can invoke methods of Java classes.
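The @Throws workaround can be sketched on the Kotlin side as follows. The function below is a hypothetical stand-in for the getFoo function mentioned in the figures (its parameter and body are assumptions made for illustration); the annotation only changes the signature seen by Java callers, which will then have to catch or declare IOException.

```kotlin
import java.io.IOException

// Hypothetical function mirroring the getFoo example: without @Throws, a Java
// caller would get no compile-time hint that an IOException may escape.
@Throws(IOException::class)
fun getFoo(path: String): String {
    if (path.isEmpty()) throw IOException("empty path")
    return "foo from $path"
}

fun main() {
    // Kotlin callers are never forced to catch, with or without the annotation.
    val outcome = try {
        getFoo("")
    } catch (e: IOException) {
        "failed: ${e.message}"
    }
    println(outcome) // failed: empty path
}
```

For Kotlin callers nothing changes: the try-catch in main is optional and the code would compile without it.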
There is, however, no way to translate throws declarations from a Java method to its Kotlin caller, which means that a piece of Kotlin code invoking a Java method that may throw a certain exception is not required to catch that exception.

2.2.4 Coroutines

Coroutines are an alternative to threads for implementing asynchronous programming: they focus on concurrency rather than parallelism, are generally lightweight and require none of the typical structures for mutual exclusion. In Kotlin they are intended for use with structured concurrency [42], which means entry and exit points must be made clear and all tasks are either completed or cancelled before the end of the execution.

Figure 2.5: The Java method receiveSomething is not notified about the Kotlin function getFoo throwing an IOException.

A single thread can run several coroutines: they follow a pattern of suspend/resume where they can be suspended at any time, their state is saved and then restored whenever they resume. A coroutine can also suspend on one thread and resume on another after transferring its state.

There are four predefined types of CoroutineDispatchers that are used to determine where a coroutine is going to run:
• the Default one is meant for “heavy” computations (e.g. sorting or parsing data) and executes outside of the main thread;
• the Main one runs coroutines on the thread where the application is running: in Android it is also known as “UI thread”;
• the IO one is ideal for blocking operations such as network or database access;
• the Unconfined one can execute in any thread (or thread pool) and can switch from one thread to another at every resume. Its use is discouraged due to the lack of control over where a given coroutine will end up running.
An additional option is available for dispatching coroutines on user-defined thread pools, using functions like newSingleThreadContext; threads are expensive resources, so this option should be used sparingly.
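The effect of the different dispatchers can be observed with a small experiment such as the one below. This is a sketch that assumes the kotlinx.coroutines library is available; the helper names threadName and dispatcherDemo are hypothetical, and Dispatchers.Main is left out because it needs a UI event loop such as the one provided by Android.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Returns the name of the thread the calling code is currently running on.
fun threadName(): String = Thread.currentThread().name

// Runs one coroutine per dispatcher and records the thread each one ended up on.
fun dispatcherDemo(): Map<String, String> = runBlocking {
    val default = async(Dispatchers.Default) { threadName() }
    val io = async(Dispatchers.IO) { threadName() }
    val unconfined = async(Dispatchers.Unconfined) { threadName() }

    // withContext switches dispatcher for the current coroutine instead of starting a new one.
    val switched = withContext(Dispatchers.IO) { threadName() }

    mapOf(
        "caller" to threadName(),
        "Default" to default.await(),
        "IO" to io.await(),
        "Unconfined" to unconfined.await(),
        "withContext(IO)" to switched
    )
}

fun main() {
    dispatcherDemo().forEach { (label, thread) -> println("$label -> $thread") }
}
```

The exact thread names depend on the runtime and on whether debug mode is enabled, so no output is listed here; the point of interest is that the Default and IO coroutines report worker threads that differ from the caller's thread.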
Despite its name, the Default dispatcher is not necessarily the one a new coroutine runs on: whenever a new coroutine is started, it inherits the context (and with it the dispatcher) of the scope it is launched in, and Default only applies when that context specifies no dispatcher. In other words, a coroutine started from a scope bound to the Main thread will keep using the Main thread unless otherwise specified. The function withContext(..) can be used to change the dispatcher of a coroutine as shown in Figure 2.7.

Figure 2.6: The @Throws annotation ensures that the Java method receiveSomething will not compile unless the call to getFoo handles the possible exception.

The lightweight nature of coroutines and their ability to be moved almost effortlessly between threads can be a concern when it comes to debugging: some developer blogs have, for example, detailed some problems with breakpoints being ignored [47] when a coroutine jumps between threads. The coroutine library does provide some tools to assist the debugging process:
• by enabling “debug mode” [43] any call to Thread.currentThread().name will yield the name for both the active thread and the current coroutine;
• the debug agent is a tool that keeps a record of all running coroutines; it does however require an API that is not provided by the Android runtime [44], so it does not work in that context.

2.2.4.1 Types of Coroutines

A coroutine can be started in multiple ways, each of which comes with very different use cases.
• A standard job will simply execute whatever instructions are inside it.
• An async coroutine is expected to eventually yield some value: this is delivered as a Deferred value as soon as the coroutine is created. The caller can then await the end of the computation. If the coroutine is cancelled before it can return a value, however, it will throw an exception. Because of Kotlin’s lack of checked exceptions, the compiler will issue no warnings when await is used.
• A produce coroutine will return not a value but, rather, a stream of values to be delivered to a ReceiveChannel. This channel can be listened to in order to retrieve messages.
• An actor coroutine is similar to produce, with the main difference being that it consumes values as opposed to sending them: it exposes a SendChannel to which messages can be delivered and processes the messages it receives.

Figure 2.7: By using withContext it is possible to switch context for the currently running coroutine without spawning a new one. The figure displays the source code on the upper half and the output on the lower half. Compare the first part of the output (before the first empty line) with the second: in the latter there are three different coroutines (marked as SECOND COROUTINE#3, #4, #5) whereas in the former there is only one (marked as SECOND COROUTINE#2), indicating that no new coroutine was created.

In the specific case of channels, Kotlin provides a select expression that makes it possible to:
• listen to multiple ReceiveChannels and process the first message received from one of them;
• keep multiple SendChannels open and deliver a message to the first one available (i.e. the first channel that is neither closed nor busy processing another message).

The current implementation of channels and the related builder functions, actor and produce, are still experimental and subject to change. Furthermore, some functionalities are still not completely working at the moment.

An entity similar to coroutines is the flow: flows are asynchronous streams of elements, which means they deliver one value at a time as opposed to returning a single value at the end (e.g. async) or sending values through a channel like produce. Flows are implemented as “cold” streams: their job is executed only when the Flow’s collect method is invoked, which sets them apart from coroutines.
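The difference between an async coroutine and a cold flow can be sketched as follows, assuming the kotlinx.coroutines library is available. The function names numbers and flowDemo are hypothetical, and the log list only exists to record the order in which things happen.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

// Hypothetical flow: its body runs only when a collector subscribes.
fun numbers(log: MutableList<String>): Flow<Int> = flow {
    log += "flow started"
    for (i in 1..3) emit(i)
}

// Returns the order in which the async coroutine and the flow did their work.
fun flowDemo(): List<String> = runBlocking {
    val log = mutableListOf<String>()

    // async is scheduled immediately and hands back a Deferred;
    // await suspends the caller until the value is ready.
    val deferred = async { log += "async started"; 42 }
    log += "async returned ${deferred.await()}"

    val cold = numbers(log)             // nothing is executed yet
    log += "flow created"
    cold.collect { log += "got $it" }   // the flow body runs now

    log
}

fun main() {
    flowDemo().forEach(::println)
}
```

In the recorded order, "flow created" appears before "flow started": creating the flow does not run its body, whereas the async coroutine does its work without waiting for anyone to ask for the values one by one.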
By employing a buffer (which is, as of now, still an experimental feature) it is possible for a flow to keep emitting values even when the caller consumes them at a slow rate.

For the purposes of this project, we decided to forego experimental features such as channels and buffered flows. Experimental libraries can be subjected to changes in their implementation: it would therefore be risky to base our research on their current state. It would however be a good idea to keep an eye on their situation and possible updates.

2.2.4.2 Termination of a Coroutine

At any time the execution of a coroutine can be cancelled by invoking the CoroutineScope’s cancel method: this causes a CancellationException to be sent to that coroutine. This kind of exception is not treated as a “real” exception as much as a prompt to terminate. Any coroutine receiving this exception will quickly execute the following steps:
1. execute any code it might find inside a finally block;
2. recursively cancel all of its children (by forwarding the same CancellationException to them);
3. terminate.

Depending on the type of coroutine some additional events will occur:
• an async coroutine will pass the CancellationException instead of its expected return value; as mentioned earlier this behaviour will cause the corresponding await function to receive the exception;
• coroutines bound to channels (i.e. produce and actor) will close their channel: this causes any attempt to communicate to that channel to yield a null value.

Coroutines will also terminate if any other kind of exception is thrown. Since coroutines follow the paradigm of structured concurrency, cancellation in this case is propagated both downstream and upstream: this means that both the children and the parent of the failed coroutine will terminate. There is one way to prevent the cancellation from propagating upstream: the failing coroutine must be spawned by a SupervisorJob.
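Both behaviours can be observed in a small experiment like the one below, which assumes the kotlinx.coroutines library is available. It is a sketch rather than a definitive recipe: the function name terminationDemo is hypothetical, the first part cancels a single Job rather than a whole scope, and the second part uses supervisorScope, a builder that relies on a supervisor job internally, instead of creating a SupervisorJob by hand.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.supervisorScope

// Records, in order, what happens on cancellation and on failure under a supervisor.
fun terminationDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()

    // 1. Cancellation: the child receives a CancellationException
    //    and still runs its finally block before terminating.
    val job = launch {
        try {
            delay(10_000)
        } catch (e: CancellationException) {
            events += "cancelled"
            throw e // rethrow so that the coroutine really terminates
        } finally {
            events += "cleanup"
        }
    }
    delay(100)
    job.cancel()
    job.join()

    // 2. Failure under a supervisor: the sibling and the parent keep running.
    val handler = CoroutineExceptionHandler { _, e -> events += "handled: ${e.message}" }
    supervisorScope {
        launch(handler) { throw IllegalStateException("boom") }
        launch { delay(100); events += "sibling finished" }
    }
    events += "parent still alive"

    events
}

fun main() {
    terminationDemo().forEach(::println)
}
```

Replacing supervisorScope with coroutineScope in the second part would let the failure of the first child cancel its sibling and propagate to the parent, which is the default upstream propagation described above.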
In this case, the parent will not be affected by the failure and will simply receive the exception that caused it.

It is worth noting that catching an exception in a try-catch block will not prevent termination. It will, however, allow for quick handling such as, for example, logging. Any code inside a finally block will be executed.

2.2.4.3 Coroutines in Android

Coroutines are well-ingrained in the MVVM (Model-View-ViewModel) architecture used by Android [54] and are specifically meant to be used for long-running tasks or CPU-intensive computations launched by the ViewModel [45]. The Ktx library provides the ViewModel class with its own viewModelScope that is automatically destroyed whenever the ViewModel itself is cleared: this ensures that all coroutines running in the viewModelScope will be safely terminated.

The Ktx library provides coroutine support for other life-cycle components, too [46]. The LifecycleScope is a state-aware scope that calls coroutines when the enclosing life-cycle object is created, started or resumed; specifically it suspends a coroutine until the object reaches the intended state.

2.2.5 Problems and Limitations

In the previous section we detailed some features that, when combined, we perceive as problems of the Kotlin language:
• the absence of checked exceptions, combined with the way coroutines handle exceptions, can potentially result in some exceptions such as CancellationException not being handled;
• the recommendation to forgo exceptions as special return types, combined with CancellationExceptions being used as special return types, sets the behaviour of coroutines apart from the rest of the language.

There are some more issues with how Kotlin handles coroutines: these are a side effect of the compilation to native code. While this is not a restriction of Kotlin/JVM, it does impact how the same block of code can be translated to JVM or, for example, iOS.
The main problem comes with threading rules defined in LLVM: data can be either mutable or shared and, if shared, it must be frozen: it can never be “unfrozen” and is therefore completely immutable [48]. As a direct consequence, all coroutines in Kotlin/Native code must be defined in the same thread if they are to communicate with each other. JetBrains is aware of this issue but has given no estimate of when it will be fixed [49]. A possible consequence of this problem is that multi-platform apps, created in Kotlin for both Android and iOS and compiled respectively to JVM and LLVM, will either have runtime crashes or not work with coroutines at all. This is obviously an edge case but it is worth mentioning.

3 Prototype for a Monitoring Tool

In this chapter we go through the properties that we deemed interesting to monitor on Kotlin coroutines: section 3.1 contains the description of each property as well as a table summarising all of them. The following sections detail our first approaches in translating these properties to a monitoring system in Kotlin. Section 3.2 details our first approach using monitors hardcoded to the architectural components of an app, as well as our definition of several app-specific properties. Section 3.3 describes our second approach using annotations.

3.1 Target Properties

We had already considered some properties in the planning phase, and we integrated them with other behaviours that we wanted to verify (or enforce, in some cases). The result was a list of four scenarios that will be described in this section. For convenience we decided to give each property a number and a name: these are also used as titles for the sections describing each property.
3.1.1 Property 1: DestroyedWithOwner

Each activity in an Android application has its own lifecycle and can be destroyed and recreated multiple times: specifically, an Activity is destroyed when the OS is low on memory and needs to save space, when the screen is rotated (unless a specific configuration setting is employed [62]) or, quite simply, whenever its finish method is invoked. Whenever the same activity is recreated, a new object is instantiated and initialised. For this reason, any object that can “survive” an activity should either keep no references to it, or be destroyed together with it. This applies not just to Activity but also to LifecycleOwner and its subclasses as well as other outliers such as, for instance, ViewModel. We use the term “lifecycle component” to refer to any Kotlin (or Java) class fitting this description. In short, lifecycle components should not be leaked.

Coroutines execute a given block of code which may or may not contain references to a lifecycle component. It would be wasteful if a background thread were to download a large amount of data to display an image on an activity that was destroyed minutes before, with an obviously negative impact on performance (for the currently running app, the device battery and possibly the user’s mobile data contract). It would be even worse if a new instance of this task were to be launched every time the related activity was created.

Figure 3.1: Graph of the lifecycle of an Android Activity, taken from the official Android documentation [63]. The graph connects the callbacks onCreate(), onStart(), onResume(), onPause(), onStop(), onDestroy() and onRestart().
Because of this, we wanted to ensure that all coroutines started in the scope of a given lifecycle component would be terminated together with that component. This behaviour is compliant with and, in fact, recommended by, the definition of structured concurrency used by Kotlin [42].

The destruction of a lifecycle component is not something that can be inferred after an examination of the code: it can be triggered by factors unrelated to the target application, as mentioned earlier and detailed in Figure 3.1. Static analysis can verify whether a task contains references to e.g. instances of the Activity or ViewModel class but, in order to verify that the task is terminated together with its related component, we need to employ Runtime Verification.

3.1.2 Properties of tasks that return a value

As mentioned in section 2.2.3, Kotlin discourages the use of exceptions as return types and recommends instead that the user define their own (or use predefined ones such as the Option type defined in the Λrrow library [39]). Unfortunately, with coroutines this is not always possible since exceptions are closely tied to their intercommunication: the CancellationException class is used as a notification of termination. For this we identified three possible use cases:
1. a task is executed without any failures and returns a given value (which will be Unit for launched operations);
2. a task is cancelled at some point during its execution and therefore cannot return any value;
3. a task fails at some point during its execution and throws a given exception.

Scenarios 1 and 3 can easily be associated with “ideal” and “exceptional” behaviour, respectively; the second case is, however, unclear: can this be counted as an “exceptional” case?
Different types of coroutines treat this scenario differently: those started with the launch command simply ignore any CancellationExceptions, while those started with the async command will rethrow such an exception when the method await tries to retrieve their return value.

This behaviour can be best monitored while the app is running, which makes properties of tasks returning a value a good case for the employment of RV. We decided to examine the two clearer scenarios, which are 1 and 3, in order to express them as properties that we could monitor with our tool.

3.1.2.1 Property 2: NormalAsync

If a given block of code is executed without any failures, it will yield a certain return value depending on the type of coroutine on which it was running:
• launched tasks will yield Unit;
• async tasks will yield a Deferred<T> value, i.e. a “future” result that eventually evaluates to a value of type T.

3.1.2.2 Property 3: ExceptionalAsync and Property 4: NeedHandler

If a given block of code fails during its execution, it usually throws a Throwable subclass detailing the kind of failure it triggered and, possibly, the line of code that caused it. As mentioned earlier in this document, any exception being thrown inside an async coroutine (except for CancellationExceptions) will “taint” the current CoroutineContext and mark it for termination. Meanwhile, launch coroutines will rethrow the exception to their parent task all the way until the top level, at which point the CoroutineExceptionHandler inside the context is given the task to “handle” the failure [60].

We decided therefore to monitor that, in both cases, any exception is either rethrown or wrapped inside another Throwable object that is, in turn, thrown. This behaviour was called ExceptionalAsync for async, in opposition to the previous one, and NeedHandler for launch.
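The different treatment of failures by the two builders can be observed in a small experiment. The following is a minimal sketch and not part of the tool described in this thesis: it assumes that the kotlinx.coroutines library is available, and the function name observeFailures and the exception messages are invented for illustration. Both coroutines are started as children of a SupervisorJob so that the failure of one does not cancel the other.

```kotlin
import java.util.concurrent.atomic.AtomicReference
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.async
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: returns the message seen by the exception handler
// and the message seen by the caller of await, in this order.
fun observeFailures(): Pair<String?, String?> = runBlocking {
    val seenByHandler = AtomicReference<String?>(null)
    val handler = CoroutineExceptionHandler { _, e -> seenByHandler.set(e.message) }
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default + handler)

    // launch: the uncaught exception reaches the handler stored in the context.
    scope.launch { throw IllegalStateException("launch failed") }.join()

    // async: the exception is kept inside the Deferred and rethrown by await.
    val deferred = scope.async<Int> { throw IllegalArgumentException("async failed") }
    val seenByAwait = try {
        deferred.await()
        null
    } catch (e: IllegalArgumentException) {
        e.message
    }

    seenByHandler.get() to seenByAwait
}

fun main() {
    println(observeFailures())
}
```

In this sketch the handler is only consulted for the launched task, while the failure of the async task surfaces at the await call site: these are the two points that NeedHandler and ExceptionalAsync are concerned with.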
3.1.3 Property 5: NoBlockUI and Property 6: UpdateUI

On a general note, Android apps are advised to keep the main execution thread as lightweight as possible and avoid blocking it [59], because blocking would both cause a poor user experience and risk triggering an OS alert as seen in Figure 3.2. There are some scenarios that are, however, downright forbidden and they can be summarised as “using the wrong thread” for certain operations:
• when a network I/O operation is executed on the UI thread, the Android runtime throws a NetworkOnMainThreadException;
• likewise, when a background thread tries to access UI elements, the Android runtime throws a CalledFromWrongThreadException.

Figure 3.2: Notification from the Android OS that the UI thread is blocked for the application aptly named “App blocking the UI thread”.

This behaviour is clearly considered harmful and we decided that it was worth monitoring in coroutines, since one of their biggest features is exactly the ability to “jump” from one thread to another, making it difficult to establish through static analysis where a task may be running at a given time. Using the scenario described above, we identified the properties:
• “NoBlockUI”, according to which an I/O operation is always executed in a background thread;
• “UpdateUI” which states the opposite, i.e. that an update of the UI needs to be carried out in the main thread.

3.1.4 Property 7: ResumeIfNeeded

Under the hood coroutines are a sequence of callbacks that are suspended at one or more points in their execution. These suspend points can be traced back to any invocation of a suspend function inside a coroutine code block.

The code inside a coroutine is executed until a suspend point is reached: here the coroutine returns a special value to “warn” its dispatcher that its execution is not finished. Later, the dispatcher checks whether the coroutine is suspended, complete
or cancelled and, in the first case, it resumes the coroutine; the execution will start from right after the last suspend point.

Figure 3.3: Suspend points marked on lines 55, 58, 61 and 63.

In cases where a computation takes a large amount of time to complete, however, there might not be a chance for the dispatcher to check for cancellation [61]. A good practice is generally to check the flag isActive, which returns false whenever the current coroutine is not supposed to execute anymore; there is also an ensureActive method that throws a CancellationException unless the isActive flag is true. These checks are executed at runtime and, therefore, we felt it would be appropriate to ensure at runtime that a task is only completed if necessary.

3.1.5 List of Monitors

From the scenarios described above we could isolate four events that corresponded to the same number of Kotlin instructions:
• component.destruction() refers to the destruction of some “component”: our assumption is that the component is lifecycle-aware and, as such, its method onDestroy is called as the component is on its way to termination;
• launch(context, task) corresponds to an invocation of the launch method with the arguments:
  – context, a representation of a CoroutineContext, i.e. a map that may contain a CoroutineDispatcher and a CoroutineExceptionHandler, among other things;
  – task, a representation of a block of Kotlin code;
• async(context, task) represents an invocation of the async method with the same argument constraints as described for launch;
• await(deferredTask) represents the invocation of the await method with an argument deferredTask of generic type Deferred<T>, where T can be any Kotlin type; since only the async method returns values of type Deferred<T>, this event implies that a task had previously been started by means of async.
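As an illustration, the four events and the properties associated with them in Table 3.1 could be encoded with a plain enumeration and a lookup map. This is only a sketch of one possible representation: the names MonitoredEvent, propertiesByEvent and eventsChecking are hypothetical and do not belong to the tool.

```kotlin
// Hypothetical names: one constant per event identified in section 3.1.5.
enum class MonitoredEvent { COMPONENT_DESTRUCTION, LAUNCH, ASYNC, AWAIT }

// Properties checked when each event is detected, following Table 3.1.
val propertiesByEvent: Map<MonitoredEvent, Set<String>> = mapOf(
    MonitoredEvent.COMPONENT_DESTRUCTION to setOf("DestroyedWithOwner"),
    MonitoredEvent.LAUNCH to setOf("NeedHandler", "NoBlockUI", "UpdateUI", "ResumeIfNeeded"),
    MonitoredEvent.ASYNC to setOf("NoBlockUI", "UpdateUI", "ResumeIfNeeded"),
    MonitoredEvent.AWAIT to setOf("NormalAsync", "ExceptionalAsync")
)

// Reverse lookup: which events trigger a check of the given property.
fun eventsChecking(property: String): Set<MonitoredEvent> =
    propertiesByEvent.filterValues { property in it }.keys

fun main() {
    println(eventsChecking("NoBlockUI"))
    println(eventsChecking("DestroyedWithOwner"))
}
```

The reverse lookup makes it easy to see that properties not tied to a specific builder, such as NoBlockUI, are associated with both the launch and the async events.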
Our monitors were therefore defined as follows:
• one monitor would detect the launch(context, task) event and verify the NeedHandler property;
• one monitor would detect the async(context, task) event;
• one monitor would detect the await(deferredTask) event and verify the properties NormalAsync and ExceptionalAsync;
• every instance of a lifecycle-aware class (e.g. ViewModel) would contain a monitor to detect that object’s this.destruction() event and verify the DestroyedWithOwner property.

Properties that were not tied to a specific type of coroutine, such as NoBlockUI, UpdateUI and ResumeIfNeeded, would be verified by monitors for both builder functions, i.e. launch(context, task) and async(context, task).

Each of our monitors should be able to intercept actions and decide, based on a detected event, whether an action was compliant with the “correct” behaviour or not. An action disrupting our desired behaviour could be reported or even cancelled, depending on the approach we would follow. A summary of the properties and monitors defined so far can be found in Table 3.1.

Table 3.1: Summary of the properties that we wanted to verify and which of the monitors defined in section 3.1.5 would carry out the necessary controls.

Property             component.destruction()   launch(context, task)   async(context, task)   await(deferredTask)
DestroyedWithOwner   Yes                       No                      No                     No
NormalAsync          No                        No                      No                     Yes
ExceptionalAsync     No                        No                      No                     Yes
NeedHandler          No                        Yes                     No                     No
NoBlockUI            No                        Yes                     Yes                    No
UpdateUI             No                        Yes                     Yes                    No
ResumeIfNeeded       No                        Yes                     Yes                    No

3.2 First Approach: Hardcoding Monitors into the Application

As a first way of implementing monitors we opted for a rough, small-scale approach: we created a rudimentary app with both network activity and background computations.

Our app asks the user to submit a tag and searches for images containing that tag
on the image hosting website Flickr [52]. This operation uses the Flickr “search” feed [53] to receive the metadata for up to twenty user-submitted images in JSON format. As soon as the reading operation is completed, the app parses the response and extracts the URL for each image, which is then loaded and displayed on screen as shown in Figure 3.4.

Figure 3.4: The image browsing app developed for testing hard-coded monitors.

3.2.1 Definition of tasks to execute on coroutines

The operations of reading from Flickr and parsing the JSON response are carried out with coroutines. We tried to implement a hard-coded monitor for the previously mentioned properties. All monitors were intended to activate an error state upon detecting a violation of any property.

We also identified two more properties specific to this application, which are presented below.

3.2.1.1 Property 8: AlwaysOneJob

We wanted the app to only carry out one search operation at a time. Since the operation is started whenever the user taps a button on the screen, an undefined number of tasks might be launched in a matter of seconds and all images retrieved through the Flickr feed would be competing for access to the UI.

Figure 3.5: Simplified representation of the app’s architecture: once the buttonTags UI element is tapped by the user, it triggers the search method which, in turn, launches the getImages method on a coroutine. This coroutine launches an asynchronous task to read bytes from the remote endpoint (in the readText method) and parses the result with the method parseFlickrImageJson to obtain one or more images. The images are used by the updateResultLiveData method to notify the activity, which receives the new data and displays it in the method loadNewData.
We decided to avoid this scenario by limiting the number of active search operations to one: if the user were to tap the button while a search was ongoing, nothing would happen. This property should be enforced at the moment of starting a search operation, so we chose to monitor this property when detecting a launch(context, task) event: as can be seen in Figure 3.5, the webservice is indeed invoked inside the method launch.

3.2.1.2 Property 9: SuccessWithJSON

The Flickr feed is supposed to return a list of up to 20 images, expressed in JSON format. Unforeseen problems with the endpoint (such as errors on the remote server) might, however, happen, and this would result in the application receiving an invalid JSON response. This was added to the possible “exceptional” scenarios but, since it didn’t entail an exception being thrown, we felt it wouldn’t fit the ExceptionalAsync property: the parsing operation was executed outside of the async method invocation (see Figure 3.5).

We therefore reasoned that the async method should, instead, return a string containing a valid JSON. For this reason we had the property monitored at each occurrence of the event await(deferredTask), the same event that triggers the monitoring of ExceptionalAsync.

Table 3.2 shows a summary of the properties to verify in this first approach, as well as monitors carrying out the necessary controls for each property.

Table 3.2: Updated version of Table 3.1 containing properties 8 and 9.

Property             component.destruction()   launch(context, task)   async(context, task)   await(deferredTask)
DestroyedWithOwner   Yes                       No                      No                     No
NormalAsync          No                        No                      No                     Yes
ExceptionalAsync     No                        No                      No                     Yes
NeedHandler          No                        Yes                     No                     No
NoBlockUI            No                        Yes                     Yes                    No
UpdateUI             No                        Yes                     Yes                    No
ResumeIfNeeded       No                        Yes                     Yes                    No
AlwaysOneJob         No                        Yes                     No                     No
SuccessWithJSON      No                        No                      No                     Yes
3.2.2 Class implementation of monitors

To translate our properties from paper to code, we used a notation similar to the automaton approach described in section 2.1.1. We defined the following classes and data structures:
• a trigger is a String that we send to the monitor to notify an event taking place;
• a condition is a function that reads the current state and outputs a Boolean: true when the condition is met and false otherwise;
• an action is a function that reads the current state and outputs a new state with updated values.

The state itself is defined as a map of entries where the keys are in String format; some notable keys are state, to distinguish between “starting”, “OK” and “error” states, result, to carry the result of the last computation, and error, to carry the error raised in the last unsuccessful operation. The state is updated by the monitor’s action function but can also be changed directly by modifying the entries for the aforementioned keys, or adding new ones.

Finally, the monitor itself is a class containing the state and a list of properties, defined as triplets of a trigger, a condition and an action. The class diagram is shown in Figure 3.7 and the overall architecture can be seen in Figure 3.6.

Figure 3.6: Representation of the inner workings of the Monitor class described in Figure 3.7. The method check reads an input trigger and starts a series of internal operations that may or may not cause a change in the monitor’s state: it looks for the properties containing that trigger and, for each of them, reads the state and evaluates the condition; the action is executed to update the state only if the condition is met. The method yields a state that is either an “OK state” or an “error state”.

With the monitor being defined, we needed to find a way to insert it into our app.
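The trigger, condition and action triplet described in this section can be sketched in a few lines of Kotlin. The class and method names follow Figure 3.7 and the key names follow the text, but the bodies below are assumptions made for illustration and are much simpler than the actual classes; the key "activeSearches" and the function demoMonitor are hypothetical.

```kotlin
typealias Trigger = String

// Simplified stand-in for the State class: a map with String keys.
class State {
    private val entries = mutableMapOf<String, Any?>(KEY_STATE to STARTING)

    val isOkState: Boolean get() = entries[KEY_STATE] == OK
    val isErrorState: Boolean get() = entries[KEY_STATE] == ERROR

    operator fun get(key: String): Any? = entries[key]

    fun update(vararg pairs: Pair<String, Any?>) {
        entries.putAll(pairs)
    }

    fun toOkState() {
        entries[KEY_STATE] = OK
        entries.remove(KEY_ERROR)
    }

    fun toErrorState(message: String) {
        entries[KEY_STATE] = ERROR
        entries[KEY_ERROR] = message
    }

    companion object {
        const val KEY_STATE = "state"
        const val KEY_ERROR = "error"
        // The literal values below are assumptions made for this sketch.
        const val STARTING = "starting"
        const val OK = "OK"
        const val ERROR = "error"
    }
}

// A property is a triplet of trigger, condition and action.
class Property(
    val trigger: Trigger,
    val condition: (State) -> Boolean,
    val action: (State) -> State
)

class Monitor {
    var state = State()
        private set
    private val properties = mutableListOf<Property>()

    fun observe(vararg newProperties: Property): Boolean = properties.addAll(newProperties)

    // For each property with a matching trigger, run the action if the condition holds.
    fun check(trigger: Trigger): State {
        for (property in properties) {
            if (property.trigger == trigger && property.condition(state)) {
                state = property.action(state)
            }
        }
        return state
    }
}

// Hypothetical usage: at most one search may be active when "search" is triggered.
fun demoMonitor(): List<Boolean> {
    val monitor = Monitor()
    monitor.observe(
        Property(
            trigger = "search",
            condition = { s -> ((s["activeSearches"] as? Int) ?: 0) > 0 },
            action = { s -> s.apply { toErrorState("One search job must be running at most") } }
        )
    )
    val errorBefore = monitor.check("search").isErrorState
    monitor.state.update("activeSearches" to 1)
    val errorAfter = monitor.check("search").isErrorState
    return listOf(errorBefore, errorAfter)
}

fun main() {
    println(demoMonitor())
}
```

In this sketch the condition describes a violation, so the action moves the state to “error”; a property whose condition never holds leaves the state untouched.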
3.2.3 Running the monitors in the ViewModel

By design Android recommends using the Model-View-ViewModel (or MVVM) architecture [54], where the ViewModel class takes care of the logic, so we decided to focus our efforts on instrumenting it. We defined an abstract MonitoredViewModel class containing:
• a centralised Monitor observing a list of app-independent properties;
• an API for notifying the monitor before or after an event is triggered (exemplified in Figures 3.8 and 3.9):
  – doNext reads a trigger and feeds it to the monitor, updating the current state, and then executes a given block of code saving the result (or the error) in the state;
  – afterResult behaves in a similar way but executes the block of code before having the monitor check the trigger;
  – maybeDoNext is an enforcement-oriented version of doNext that only executes the block of code when the monitor outputs an “OK” state and reverts the state otherwise;
• an API for running coroutines in a controlled environment:
  – launch has the same effect as launching a coroutine normally, but it also saves the resulting job in the viewmodel for future reference;
  – await calls the await function for a given Deferred<T> value and then checks the monitor for properties defined for async tasks.

Monitor
  +observe(vararg Property) : Boolean
  +overwriteState(State) : Unit
  +updateState(Pair) : Unit
  +check(Trigger) : State
  +stateOf(String) : Any?
State
  -keys : Set
  -isErrorState : Boolean
  -isOkState : Boolean
  +update(vararg Pair) : Unit
  +overwriteState(State) : Unit
  +toErrorState(String) : Unit
  +toOkState() : Unit
  +get(String) : Any?
Property
  -trigger : Trigger
  -condition : (State) -> Boolean
  -action : (State) -> State

Figure 3.7: UML representation of the State, Property and Monitor classes. The Trigger class is a type alias for String.

We decided to use a single, centralised monitor in order to only have one block of business logic running at one time.
This Monitor instance would receive a Trigger from the API and, one by one, check all properties that trigger was relevant for. Any such properties would be examined sequentially: if the given Condition, applied to the internal State, were to yield true, then the monitor would update the State as specified in its own Action.

We set up the viewmodel to extend the MonitoredViewModel and defined the properties listed in Table 3.2, translating them to Property instances as shown in Listing 6.

The first property, AlwaysOneJob, was checked using the maybeDoNext function and we noticed that it successfully prevented the viewmodel from starting any search operations if another one was already processing. To test the SuccessWithJSON property we set the reading task to randomly return an invalid result and verified that the state of the monitor was correctly set to “error”.

We replaced all invocations of the launch and await coroutine functions with the ones defined in our class. This way we could easily verify the other, non-app-specific properties.

[Flow chart: both functions check the trigger and execute the block; they store in state[result] the value yielded by the block, null if the block was cancelled, or the exception otherwise, and finally return state[result].]

Figure 3.8: Flow chart detailing the logic for the functions doNext and afterResult of the MonitoredViewModel class.
[Flow chart: maybeDoNext saves the current state as state0 and checks the trigger; if the new state is still OK it executes the block as doNext does; otherwise it optionally overwrites the current state with state0 and returns null.]

Figure 3.9: Flow chart for the function maybeDoNext of the MonitoredViewModel class.

    private val alwaysOneJob = Property(
        trigger = TRIGGER_SEARCH,
        condition = { state ->
            state[KEY_ACTIVE_SEARCHES]?.let { it as Int > 0 } ?: false
        },
        action = { state ->
            state.apply {
                toErrorState("One search job must be running at most")
            }
        }
    )

    private val successWithJson = Property(
        trigger = TRIGGER_FINISHED_READING,
        condition = { state ->
            if (state.isOkState) false
            else try {
                JSONObject(state[State.KEY_RESULT] as String)
                false
            } catch (e: JSONException) {
                // The result is not a valid JSON object.
                true
            } catch (e: ClassCastException) {
                // The result is not a String.
                true
            }
        },
        action = { state ->
            state.apply {
                toErrorState("Successful read but invalid JSON")
            }
        }
    )

Listing 6: Implementations of the AlwaysOneJob and SuccessWithJSON properties expressed using the Property class.

Figure 3.10: Updated version of the architecture in Figure 3.5 with added information on instrumentation: the coroutine builder methods launch and async, displayed in a different colour, are wrapped inside the maybeDoNext and afterResult methods (shown respectively in Figures 3.9 and 3.8). Throughout the execution of each method inside the viewmodel, the internal state of the Monitor instance is updated with the number of search operations currently ongoing (“active searches ++” signifying an increment by 1 and “active searches −−” a decrement by 1).
3.2.3.1 Considerations

This implementation proved effective to a degree but came with its share of disadvantages. The first drawback was in the nature of the Monitor: it was a unique instance “containing” a list of properties, a big difference from our planned approach with several monitors defined for coroutine- and component-specific events. Another shortcoming was the reliance on being called by the doNext, maybeDoNext and afterResult methods, with the invocation procedure being fairly intrusive as shown in Listings 7 and 8, which feature the same method respectively with and without the boilerplate code.

We wanted to make a second attempt at a basic implementation, while trying to keep the boilerplate code to a minimum and, if possible, move it away from the viewmodel to a separate structure. This would help separate what concerned the execution of the app from what concerned its behaviour throughout the execution.

     1  fun search(tags: String) {
     2      val uri = createFlickrUri(tags)
     3      launch(coroutineExceptionHandler) {
     4          val runningJobs =
     5              readState(KEY_ACTIVE_SEARCHES) as Int? ?: 0
     6          maybeDoNext(TRIGGER_SEARCH, true) {
     7              updateState(KEY_ACTIVE_SEARCHES to runningJobs + 1)
     8              getImages(uri)
     9          }
    10      }
    11  }

Listing 7: Implementation of the search method: the code on lines 5-8 is only used to manually inform the viewmodel of a new search operation being initiated, with invocations of the methods readState and updateState to respectively read and change the internal state of the monitor.

    fun search(tags: String) {
        val uri = createFlickrUri(tags)
        launch(coroutineExceptionHandler) {
            getImages(uri)
        }
    }

Listing 8: “Ideal” implementation of the search method where the boilerplate code featured in Listing 7 is hidden.
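The enforcement logic behind maybeDoNext, as outlined in section 3.2.3 (save the state, check the trigger, run the block only if the state is still OK and restore the saved state otherwise), can be sketched as follows. This is a simplified, non-suspending illustration under stated assumptions: MiniMonitor, the violates callback, the key "activeSearches" and demoEnforcement are hypothetical names, and the actual method also stores the result or the exception in the state.

```kotlin
// Hypothetical, minimal monitor: the state is an immutable map that is replaced on update.
class MiniMonitor(
    private val violates: (trigger: String, state: Map<String, Any?>) -> Boolean
) {
    var state: Map<String, Any?> = mapOf("state" to "OK")
        private set

    fun update(vararg pairs: Pair<String, Any?>) {
        state = state + pairs
    }

    // Moves to the error state on a violation; returns true while the state is OK.
    fun check(trigger: String): Boolean {
        if (violates(trigger, state)) {
            state = state + ("state" to "error")
        }
        return state["state"] == "OK"
    }

    // Enforcement: run the block only if the check succeeds, otherwise revert and return null.
    fun <T> maybeDoNext(trigger: String, block: () -> T): T? {
        val snapshot = state
        return if (check(trigger)) {
            block()
        } else {
            state = snapshot
            null
        }
    }
}

fun demoEnforcement(): List<String?> {
    val monitor = MiniMonitor { trigger, state ->
        trigger == "search" && ((state["activeSearches"] as? Int) ?: 0) > 0
    }
    val first = monitor.maybeDoNext("search") {
        monitor.update("activeSearches" to 1)
        "started"
    }
    // A second request while the first search is still counted as active is suppressed.
    val second = monitor.maybeDoNext("search") { "started again" }
    return listOf(first, second)
}

fun main() {
    println(demoEnforcement())
}
```

Because the snapshot is restored after a rejected request, the monitor stays in the “OK” state and later requests can succeed once the counter is decremented.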
3.3 Second Approach: Using Annotations to Generate Compile-time Monitors

One of the most popular ways to instrument classes in Android is, as mentioned, the Aspect-Oriented paradigm. Kotlin, however, introduces a functional approach to writing Android code which can sometimes be used instead of AOP [55]. We decided to investigate this situation to determine in which cases a functional approach can help us better define and monitor a given property.

Kotlin annotations are, like in Java, used for attaching metadata to blocks of code: this is then processed by the compiler which, based on the annotation, can run additional checks (such as the Java @Override ensuring that a method is actually overriding something), relax some constraints (e.g. @SuppressWarnings hiding some warning messages) or generate some specific code, as is the case with the AspectJ compiler AJC.

With these premises, we looked into what we could actually employ annotations for:
• generating boilerplate instead of requiring the user to move through the cumbersome API of the hardcoded implementation seen earlier: this, however, was not something we strictly needed annotations to do, as the previously mentioned functional approach could abstract away most of the unnecessary code;
• employing AJC to weave monitors into our Kotlin code: despite sounding promising, this solution did not really employ the unique features of the language and used instead its interoperability with Java; this essentially meant that we would be weaving monitors into the compiled JVM classes as opposed to the actual code written by the user, so we opted not to use this solution, either.

We decided to forgo this approach and determined two other possible implementations for our tool:
(a) extending the CoroutineScope to include monitors;
(b) extending the lifecycle components (such as the MonitoredViewModel of the initial implementation) and carrying out the instrumentation inside of them.
Solution (a) seemed elegant, as it would allow us to execute any verification in a way that would be transparent to the end user. Unfortunately this solution came with the drawback of making it harder for us to spawn a coroutine with a block of code of type suspend MonitoredCoroutineScope.() -> T (where T can be any Kotlin type) using the existing infrastructure.

Solution (b) was, on the other hand, in contrast with the Kotlin guideline of defining every coroutine builder as an extension of CoroutineScope, because it would put a different class (be it LifecycleOwner or ViewModel) between the scope and the builder function.

Our final solution was therefore a blend of the two ideas: we implemented a MonitoredComponent interface, containing the monitoring API, and then overloaded the builder functions launch and async to include the component as an argument. It is described in the next chapter.

4 Implementation of an API for Monitoring Kotlin Coroutines

Our experiences with the approaches presented in chapter 3 made it clear to us what we wanted to achieve: an API that would take away the boilerplate and still monitor the points specified in 3.1.5, following the Kotlin coding guidelines of tying coroutine builders to the CoroutineScope class. Our API should not replace the existing one for coroutines, so making new functions (e.g. “monitoredAsync” or “launchWithMonitor”) was out of scope. Similarly, we did not want to add a number of extra steps that the programmer should take when defining monitors, since that was one of the drawbacks of the hardcoded model as discussed in section 3.2.3.1.

The result was a new interface called MonitoredComponent, which is examined in this chapter. Section 4.1 describes our implementation for the interface and all classes related to it; section 4.2 details how we refactored the implementation to improve its API and move it closer to our research goal.
4.1 The Interface MonitoredComponent and its API

We thought that it would be best to stay as close as possible to the existing tools: overloading [64] the existing methods launch and async was deemed the most appropriate course of action. As shown in Listing 9, we added the MonitoredComponent as an argument so as to use it “behind the scenes” to observe all the points described earlier in the processes of starting and terminating tasks on coroutines.

4.1.1 First Version of the API

Our first goal was to move as far away as possible from the hardcoded architecture. We thus decided to employ class inheritance to abstract away from the previous concept of a viewmodel handling the states of both the UI and the monitoring system. That design was confusing, since two different concepts of state (one referring to the Android app and one to its compliance with our properties) could easily be mistaken for each other. It also prevented us from working on the Activity class, which is a lifecycle-aware component in and of itself, with a different lifecycle than the ViewModel’s (see Figure 3.1 for reference).

    fun CoroutineScope.launch(
        component: MonitoredComponent,
        context: CoroutineContext = EmptyCoroutineContext,
        block: suspend CoroutineScope.() -> Unit
    ): Job

    fun CoroutineScope.async(
        component: MonitoredComponent,
        context: CoroutineContext = EmptyCoroutineContext,
        block: suspend CoroutineScope.() -> Unit
    ): Deferred

Listing 9: Our version of the coroutine builder methods: the signature is thus the same as the standard one for CoroutineScope.launch [56] and CoroutineScope.async [57] with the sole addition of the MonitoredComponent as an argument.

    val launchedTasks: ArrayList
    val recommendedDispatchers: HashMap
    val defaultHandler: CoroutineExceptionHandler
    val monitoredApplication: MonitoredApplication?

Listing 10: Declarations for the internal data structures in the MonitoredComponent interface.
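How an overload such as the one in Listing 9 can coexist with the standard builder and delegate to it can be sketched as follows. This is a minimal illustration and not our actual implementation: the interface is reduced to a single list, the element type Job and the class DemoComponent are assumptions made for the example, and kotlinx.coroutines is assumed to be on the classpath.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext

// Simplified stand-in for MonitoredComponent (assumption: tasks are tracked as Jobs).
interface MonitoredComponent {
    val launchedTasks: MutableList<Job>
}

// Overload of the standard builder: same name, plus a leading component argument.
fun CoroutineScope.launch(
    component: MonitoredComponent,
    context: CoroutineContext = EmptyCoroutineContext,
    block: suspend CoroutineScope.() -> Unit
): Job {
    // No component is passed here, so this call resolves to the standard
    // kotlinx.coroutines builder rather than recursing into this overload.
    val job = launch(context = context, block = block)
    // Record the task so that it can be observed later.
    component.launchedTasks.add(job)
    return job
}

// Hypothetical component used only for this example.
class DemoComponent : MonitoredComponent {
    override val launchedTasks: MutableList<Job> = mutableListOf()
}

fun main() {
    val component = DemoComponent()
    runBlocking {
        // The call site differs from the standard one only by the extra argument.
        val job = launch(component) { println("monitored work") }
        job.join()
    }
    println(component.launchedTasks.size)
}
```

Because the standard launch does not accept a MonitoredComponent as its first argument, the compiler selects the overload only when a component is supplied; existing calls to launch keep their original behaviour.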
Our MonitoredComponent interface was therefore meant to be implemented by both the Activity and ViewModel classes, as well as by any other class the developer sees fit. It needed to hold records of the tasks that had started, as well as their dispatchers and exception handlers. We defined the following data structures:

• launchedTasks, an ArrayList that kept track of all the tasks launched with the current MonitoredComponent as argument;
• recommendedDispatchers, a HashMap storing the “best” coroutine dispatcher to use with each task;
• defaultHandler, a CoroutineExceptionHandler that should be inserte