- Hardware Acceleration of Machine Learning (2023)
  Chen, Fangzhou; Sköld, William; Chalmers University of Technology / Department of Computer Science and Engineering; Petersen Moura Trancoso, Pedro
  The Transformer architecture has been widely used in various fields, as demonstrated by GPT-3, a large language model that shows impressive performance. However, achieving such performance requires high computational capabilities, so improving the computational power of current machine learning systems is of great importance. This thesis aims to optimize and accelerate fine-tuning of Transformer-based models while taking into account several evaluation criteria: training time, energy consumption, cost, and hardware utilization. Additionally, GPU training settings are compared with those of specialized AI accelerators such as TPUs. In our study, we introduce a high-performance kernel for the Adan optimizer and apply the LightSeq library to accelerate existing Transformer components. We also introduce mixed precision training into our workflow and compare all these optimization techniques step by step with baseline performance. In addition, our analysis includes distributed training with multiple GPUs, and a backpropagation time estimation algorithm is introduced. Next, Google's TPU accelerator is used to run our task, and its performance is compared with that of the similar GPU setup used in our study. Finally, the advantages and disadvantages of the different methods are systematically analyzed by training on V100, A100, A10, and T4 GPUs with different configurations. The workflows for GPUs and TPUs are also analyzed, illustrating the pros and cons of the different accelerators. Various weights for measuring optimization methods based on time, energy consumption, cost, and hardware utilization are proposed.
  Our analysis shows that optimal scores in all metrics can be achieved by implementing the optimized LightSeq model, applying kernel fusion for the Adan optimizer, and enabling mixed precision training. While training with TPUs offers certain advantages, such as large batch sizes when loading training data, the ease of use, reliability, and software stability of GPU training surpass those of TPU training.
- Factors affecting the migration of a large embedded system (2023)
  Johansson, Linnea; Xu, Wanting; Chalmers University of Technology / Department of Computer Science and Engineering; Horkoff, Jennifer; Hebig, Regina; Leitner, Philipp
  Migrating a small project can be hard. Migrating the code at a large company is even more challenging. In this thesis, the focus is on finding common elements in how a large automotive company migrates its embedded code. We interviewed ten people from various teams to find common topics that make it harder or easier to migrate. We also developed an architecture recovery tool called GRASS to look at metrics such as the fan-in and fan-out of the C source code. During the latter part of spring, we held a focus group with the people interviewed. By combining these data, we found that it is harder to migrate code if a team has dependencies on many other teams, or if there are hard dependencies on the supplier's code. GRASS was able to identify these aspects, but the interviewees thought that GRASS needed improvements in order to be useful to them in the migration process. Additionally, the hardware aspect of an embedded system can make migration harder if the hardware is limited in capacity, or if there are real-time requirements that make latencies in the system unacceptable. Lastly, we found that some teams had small parts that they were able to automate, and these automation scripts might be useful to others.
- Trust in Lightweight Virtual Machines: Integrating TPMs into Firecracker (2023)
  Parkegren, Alexandra; Veltman, Melker; Chalmers University of Technology / Department of Computer Science and Engineering; Ali-Eldin Hassan, Ahmed; Morel, Victor
  Due to the rise of service-based software products, cloud computing has seen significant growth in recent years. When software services use cloud providers to run their workloads, they place implicit trust in the cloud provider, without any explicit trust relationship. One way to achieve such explicit trust in a computer system is to use a hardware Trusted Platform Module (TPM), which is a coprocessor for secure cryptographic functionality. However, in the case of managed platform-as-a-service offerings, there is currently no provider exposing the trusted computing capabilities of a TPM. The main goal of this project is to enable system designers to improve trust by providing access to a TPM within a cloud-based environment. This was achieved by integrating a TPM device into the Firecracker hypervisor, originally developed by Amazon Web Services. In addition, multiple performance tests and an attack surface analysis were performed to evaluate the impact of the changes introduced. The results show a significant performance impact; however, it could be partially mitigated by using a resource pool. The analysis of the attack surface shows that there is no major change in the Firecracker hypervisor itself. However, the attack surface is extended by allowing cloud users to communicate with a TPM. Therefore, we discuss the impact and possible mitigations of the increased attack surface. Then we describe what it takes for a cloud service provider to offer trusted computing capabilities to its customers.
  Lastly, we conclude that the slight performance decrease along with the attack surface increase should be acceptable trade-offs in order to enable trusted computing in platform-as-a-service offerings.
- Measurement Error Simulation for Lidar (2023)
  Bayraktaroglu, Sena; Chalmers University of Technology / Department of Computer Science and Engineering; Assarsson, Ulf; Sintorn, Erik
  Autonomous vehicles require plenty of testing and validation, and simulation environments are a good choice for validation since they enable the testing of multiple scenarios. One of the most common sensors in the field is the Lidar sensor, which allows the retrieval of 3D information from the environment and is often used with other sensors and cameras. Measurement errors become important when using a Lidar sensor with other sensors. However, these errors are not correctly included or modeled in most simulation environments. This thesis work aims to explore how measurement errors can be included in the simulation environment. A model was developed to find the error's standard deviation as a function of the target material's reflectivity and distance, and it was combined with the Weierstrass fractal function to model the repetitive behavior of the error. Results show that the Weierstrass fractal function was appropriate for modeling the repetitive behavior. The model developed for the standard deviation of the error worked well at close distances, while further experimentation is needed for long ranges.
- Requirements Grounded MLOps - A Design Science Study (2023)
  Bastajic, Milos; Boman Karinen, Jonatan; Chalmers University of Technology / Department of Computer Science and Engineering; Heyn, Hans-Martin; Horkoff, Jennifer
  The use of machine learning (ML) has increased significantly in recent years; however, organizations still struggle with operationalizing ML. In this thesis, we explore the intersection between machine learning operations (MLOps) and requirements engineering (RE) by investigating the current best practices, challenges, and potential solutions associated with developing an MLOps process. The goal of this thesis was to create an artifact that would guide MLOps implementation from an RE perspective, resulting in a more systematic approach to managing ML models in production by identifying and documenting the goals and objectives. The study adopted a Design Science Research methodology, which comprised investigating three research questions while the design artifact was being created in parallel. The research questions examined the difficulties currently faced in creating an MLOps process, identified potential solutions to these difficulties, and assessed the effectiveness of these solutions. The study was conducted in three cycles, with each cycle answering all research questions but focusing mainly on one specific question, allowing for the initial creation and subsequent refinement of the artifact based on data collected during each cycle. By establishing a more thorough understanding of how the two domains interact and by offering practical guidance for implementing MLOps processes from an RE perspective, this study advances both the MLOps and RE fields. Quality feedback was collected on the artifact in the form of theoretical evaluations. However, the main shortcoming of the study is the lack of evaluation of the artifact's effectiveness under real-world conditions.
  Therefore, a recommendation for further research is to conduct case studies testing the artifact in real-world settings to evaluate its effectiveness and improve upon its limitations.
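The embedded-system migration entry above mentions fan-in and fan-out as metrics computed over C source code. As a rough illustration of what those metrics measure, here is a minimal sketch that assumes the dependencies between files have already been extracted into a mapping. It is not the GRASS tool; the function name `fan_metrics` and the file names are hypothetical.

```python
from collections import defaultdict


def fan_metrics(dependencies):
    """Return {module: (fan_in, fan_out)} for a dependency mapping.

    `dependencies` maps each module to the modules it depends on.
    Fan-out counts distinct modules a module depends on; fan-in counts
    distinct modules that depend on it. Self-references are ignored.
    """
    fan_out = {}
    fan_in = defaultdict(int)
    for module, targets in dependencies.items():
        distinct = set(targets) - {module}
        fan_out[module] = len(distinct)
        for target in distinct:
            fan_in[target] += 1
    modules = set(dependencies) | set(fan_in)
    return {m: (fan_in.get(m, 0), fan_out.get(m, 0)) for m in sorted(modules)}


# Hypothetical file names, for illustration only.
deps = {
    "brake_ctrl.c": ["can_bus.c", "supplier_hal.c"],
    "steering.c": ["can_bus.c", "supplier_hal.c"],
    "can_bus.c": ["supplier_hal.c"],
}
metrics = fan_metrics(deps)
for module, (fin, fout) in metrics.items():
    print(f"{module}: fan-in={fin}, fan-out={fout}")
```

In this toy input, the supplier file has the highest fan-in, which is the kind of hard dependency the abstract reports as making migration more difficult.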
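The Lidar entry above describes scaling a Weierstrass function by a standard deviation that depends on reflectivity and distance. The sketch below shows one way such a combination could look. The truncated sum is the standard Weierstrass form, but the `sigma` formula, all parameter values, and the function names are assumptions made for illustration; the abstract does not give the thesis's actual model.

```python
import math

A = 0.5      # amplitude decay, 0 < A < 1 (illustrative value)
B = 3        # frequency growth, odd integer (illustrative value)
TERMS = 20   # number of terms kept in the truncated sum


def weierstrass(x, a=A, b=B, terms=TERMS):
    """Truncated Weierstrass sum: sum over n of a**n * cos(b**n * pi * x)."""
    return sum((a ** n) * math.cos((b ** n) * math.pi * x) for n in range(terms))


def sigma(distance_m, reflectivity, base=0.005, k=0.0002):
    """Hypothetical standard deviation in metres.

    Assumed to grow with distance and shrink with reflectivity (0..1).
    """
    return base + k * distance_m / max(reflectivity, 1e-3)


def simulated_range(distance_m, reflectivity):
    """Return the true distance plus a deterministic, repeating error."""
    bound = 1.0 / (1.0 - A)  # upper bound on |weierstrass(x)|
    error = sigma(distance_m, reflectivity) * weierstrass(distance_m) / bound
    return distance_m + error


for d in (5.0, 50.3, 150.7):
    print(d, simulated_range(d, reflectivity=0.5))
```

Because `B` is an odd integer, every cosine term repeats when `x` increases by 2, so the error pattern repeats with distance; dividing by `bound` keeps the error magnitude at or below `sigma`.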