Chalmers Open Digital Repository
Welcome to Chalmers' open digital archive!
Here you will find:
- Student theses published at the university, both bachelor's theses and degree projects at bachelor's and master's level
- Digital special collections, such as Chalmers modellkammare
- Selected project reports
Research publications, reports, and dissertations can be found at research.chalmers.se
Recently added
Reasoning about Mutability in Graded Modal Type Theory
(2025) Marozas, Julian
Pure functional programming enables easier maintainability, parallelism, and reasoning about programs. However, mutable state has historically been at odds with the functional paradigm. Linear types provide a way to safely integrate mutable state into functional programming languages. This thesis explores the intersection of functional programming and mutable state, focusing on the challenges and innovations surrounding mutable arrays in languages with linear or uniqueness types. We present a partial formalization of a graded lambda calculus with array primitives, using graded modal types in an attempt to show that efficient mutable operations are safe. We attempted to prove bisimilarity between the copying and the mutable operational semantics, but the proof remains incomplete.
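As a rough illustration of the property under study (a minimal Python sketch under our own assumptions, not the thesis's graded formalization): the two operational semantics for array writes, one copying and one mutating, should agree on all observable results.

```python
# Minimal sketch (not the thesis's formalization): two semantics for
# array writes -- one that copies, one that mutates in place -- and a
# check that they produce the same observable result.

def run_copying(program, array):
    """Each write allocates a fresh array (pure semantics)."""
    arr = list(array)
    for (i, v) in program:
        new_arr = list(arr)      # copy on every write
        new_arr[i] = v
        arr = new_arr
    return arr

def run_mutable(program, array):
    """Writes update the array in place (efficient semantics).
    Safe only if the array is used uniquely/linearly."""
    arr = list(array)            # take ownership of a unique copy
    for (i, v) in program:
        arr[i] = v               # destructive update
    return arr

# Bisimilarity, at this toy level: both semantics yield the same
# final array for every program of writes.
program = [(0, 10), (2, 30), (0, 99)]
assert run_copying(program, [0, 0, 0]) == run_mutable(program, [0, 0, 0])
```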
Physical Space in Reconfigurable Interacting Systems
(2025) De Ridder, Tom
In this thesis, we provide the expressive LaC process calculus and accompanying semantics for modeling multi-agent systems (MAS). The LaC-calculus is an extension of R-CHECK. It allows MAS to be modeled inside a discrete physical-space environment that influences the behavior of the systems within it. In addition, it provides three different communication methods (broadcast, multicast, and unicast) and supports the movement of systems inside the physical space. The calculus consists of three levels: the environment level, the system level, and the process level. Systems can self-organize tasks, communication, and movement based on their configurations. The environment also influences the behavior of the systems, by blocking movement that cannot be made and by providing local communication methods.
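A toy sketch of these ingredients (all names, the grid layout, and the message format are hypothetical, not taken from the LaC-calculus): agents in a discrete space whose environment blocks impossible moves and restricts communication to local neighbours.

```python
# Illustrative sketch only: agents in a discrete grid environment that
# blocks invalid moves and delivers broadcasts only within local range.

WALLS = {(1, 1)}            # cells that block movement
GRID = 3                    # a 3x3 world

class Agent:
    def __init__(self, name, pos):
        self.name, self.pos, self.inbox = name, pos, []

    def move(self, dx, dy):
        x, y = self.pos[0] + dx, self.pos[1] + dy
        # The environment blocks moves that cannot be made.
        if 0 <= x < GRID and 0 <= y < GRID and (x, y) not in WALLS:
            self.pos = (x, y)

def broadcast(sender, agents, msg, radius=1):
    """Local broadcast: delivered only to agents within `radius`."""
    for a in agents:
        dist = abs(a.pos[0] - sender.pos[0]) + abs(a.pos[1] - sender.pos[1])
        if a is not sender and dist <= radius:
            a.inbox.append((sender.name, msg))

agents = [Agent("a1", (0, 0)), Agent("a2", (0, 1)), Agent("a3", (2, 2))]
agents[0].move(1, 1)                    # blocked by a wall: stays at (0, 0)
broadcast(agents[0], agents, "task?")   # reaches only the nearby a2
print([a.inbox for a in agents])        # [[], [('a1', 'task?')], []]
```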
Design and Implementation of an AMBA CHI-Compliant Snoop Cache Coherence Controller
(2025) Gao, Weihan; Cui, Yuxuan
In a multi-processor system, efficient cache coherence mechanisms are important for ensuring that data in every cache remains up-to-date across different cores. The AMBA Coherent Hub Interface (CHI) is a high-performance, scalable protocol designed by ARM to address the challenges of modern system-on-chip (SoC) architectures. This thesis presents the design and implementation of a snoop cache coherence controller using the AMBA CHI protocol. The controller is designed not only to ensure data consistency among the processors but also to reduce network traffic through its built-in snoop filter. We designed and implemented the cache coherence controller in a hardware description language (HDL), and we used the multi-processor simulator MultiCacheSim and the SPLASH-3 benchmark suite to model and test two kinds of snoop filters, a counting stream register and a cache-like snoop filter, evaluating their message filter rates, which represent performance in snoop-traffic reduction. The results demonstrate that the snoop-based CHI-compliant coherence controller can effectively maintain cache coherence in a multi-processor system based on the CHI architecture, and that the cache-like snoop filter can reduce network traffic. Comparing the two filters, the cache-like snoop filter performs better than the stream-register-based filter in most cases; however, both have their advantages, with each performing better under certain circumstances.
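A minimal sketch of what a cache-like snoop filter does (illustrative Python with invented capacities and addresses, not the thesis's HDL design): track which lines may be cached remotely and drop snoops for everything else; the filter rate is the fraction of snoops dropped.

```python
# Toy cache-like snoop filter: a small LRU-ordered directory of lines
# known to be cached remotely. Snoops for untracked lines are filtered.

class CacheLikeSnoopFilter:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.lines = []          # tracked line addresses, LRU first

    def insert(self, line):
        """Record that a remote cache fetched this line."""
        if line in self.lines:
            self.lines.remove(line)
        elif len(self.lines) >= self.capacity:
            # Real designs back-invalidate the evicted line in the
            # remote cache so the filter stays conservative.
            self.lines.pop(0)
        self.lines.append(line)

    def must_snoop(self, line):
        """Forward the snoop only if the line may be cached remotely."""
        return line in self.lines

filt = CacheLikeSnoopFilter()
for line in [0x100, 0x140]:
    filt.insert(line)

snoops = [0x100, 0x180, 0x1c0, 0x140]
forwarded = [a for a in snoops if filt.must_snoop(a)]
filter_rate = 1 - len(forwarded) / len(snoops)
print(f"filtered {filter_rate:.0%} of snoop traffic")   # 50%
```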
Improving the Quality of Experience in Real-Time Communication Systems through Data-Driven Bandwidth Estimation with Deep Reinforcement Learning
(2025) Xu, Wen
Real-Time Communication (RTC) systems have become increasingly popular, with accurate bandwidth estimation being a critical factor in ensuring Quality of Experience (QoE) for end users. Traditional probe-based and model-based methods for bandwidth estimation have limitations, such as introducing additional overhead or relying on assumptions that may not hold in dynamic network conditions. Data-driven approaches, particularly those using machine learning techniques, have shown promise but may require substantial amounts of labeled data and struggle to adapt to changing network conditions. In this thesis, we propose an offline deep reinforcement learning (DRL) approach for bandwidth estimation in RTC applications. Our method leverages historical network data to train an agent that learns an optimal bandwidth estimation policy without the need for explicit probing or labeled data. This approach is expected to offer improved adaptability to dynamic network conditions, reduced overhead, and enhanced accuracy compared to traditional and data-driven methods. We evaluate the performance of our proposed method across various network scenarios. The results reveal valuable insights and highlight the potential of offline DRL for achieving reliable bandwidth estimation in RTC applications. To support reproducibility, we have made our source code publicly available.
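A toy sketch of the offline idea (states, rewards, and bitrates are invented for illustration, and tabular Q-learning stands in for the deep agent): a bandwidth-selection policy is learned purely from logged transitions, with no live probing of the network.

```python
# Offline RL sketch: learn a bitrate policy from a fixed log of
# (state, action, reward, next_state) transitions, never touching
# the live network.

import random

ACTIONS = [0.5, 1.0, 2.0, 4.0]        # candidate bitrates, Mbps
STATES = ["low_loss", "high_loss"]

# Hypothetical logged dataset; reward is a QoE proxy from the trace.
dataset = [
    ("low_loss", 3, 1.0, "low_loss"),    # high rate works on a clean path
    ("low_loss", 1, 0.3, "low_loss"),    # conservative rate, lower QoE
    ("high_loss", 3, -1.0, "high_loss"), # staying high prolongs congestion
    ("high_loss", 0, 0.5, "low_loss"),   # backing off recovers QoE
] * 50

Q = {(s, a): 0.0 for s in STATES for a in range(len(ACTIONS))}
alpha, gamma = 0.05, 0.9

# Offline Q-learning: repeatedly sweep the fixed dataset.
for _ in range(200):
    for (s, a, r, s2) in random.sample(dataset, len(dataset)):
        best_next = max(Q[(s2, a2)] for a2 in range(len(ACTIONS)))
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

policy = {s: ACTIONS[max(range(len(ACTIONS)), key=lambda a: Q[(s, a)])]
          for s in STATES}
print(policy)   # backs off under loss, ramps up when the path is clean
```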
Enabling Energy Efficient Training for AI Algorithms by Controlling Resource Allocation
(2025) Blade, Emelie; Kontola, Samuel
Training deep learning models typically involves large-scale computations that require significant energy resources, making the process both costly and environmentally unsustainable. One reason for this is the default strategy of using high clock frequencies throughout deep neural network training. However, the layers of a deep network have varying computational and memory-access patterns, leading to potential mismatches and bottlenecks. The purpose of this thesis was to address this challenge by exploring resource allocation strategies that can reduce energy consumption at a fine-grained level when training CNNs on GPUs. The research focuses on predicting the computational and memory demands of different network layers and creating execution strategies that reduce energy consumption by reducing idle times of compute and memory units. These resource allocation strategies are based both on analysis of arithmetic intensity and on exhaustive searches, allocating resources by adjusting the compute and memory clock-frequency combination for each layer. This thesis demonstrates that resource allocation strategies can potentially reduce energy consumption during deep learning training, analysed for two models, ResNet50 and VGG16, on two GPUs, the NVIDIA RTX A4000 and the NVIDIA RTX 2000 Mobile. For full training runs using our execution strategies, there were no significant improvements in energy efficiency that did not also increase execution time; with a slight increase in execution time, one strategy achieved moderate energy savings. Results improved when focusing on the forward-propagation phase: the same strategy yielded execution times comparable to the default, in some cases even better, with moderate energy savings. If users are willing to sacrifice some performance, another execution strategy achieves a significant reduction in energy consumption with only a slight increase in execution time.
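A schematic sketch of the per-layer selection idea (layer profiles, the ridge point, and all clock values below are invented, not measurements from the thesis): classify each layer by its arithmetic intensity and assign a core/memory clock pair accordingly.

```python
# Per-layer clock selection sketch: compute-bound layers keep the core
# clock high; memory-bound layers lower it, saving energy while DRAM
# remains the bottleneck.

RIDGE = 10.0   # assumed roofline ridge point, in FLOPs per byte

# Hypothetical per-layer profile: (FLOPs, bytes moved) per forward pass.
LAYERS = {
    "conv1": (230e6, 12e6),   # convolution: compute-heavy
    "fc":    (4e6,   8e6),    # fully connected layer: memory-bound
}

def pick_clocks(flops, bytes_moved):
    """Return an assumed (core_MHz, mem_MHz) pair from arithmetic intensity."""
    intensity = flops / bytes_moved
    if intensity >= RIDGE:
        return 1560, 7000     # compute-bound: full core clock
    return 1100, 7000         # memory-bound: reduced core clock

for name, (flops, nbytes) in LAYERS.items():
    core, mem = pick_clocks(flops, nbytes)
    # On NVIDIA GPUs a pair like this can be applied via application
    # clocks, e.g. `nvidia-smi -ac <mem>,<core>`, between layer launches.
    print(f"{name}: intensity = {flops / nbytes:.1f} FLOPs/B "
          f"-> core {core} MHz, mem {mem} MHz")
```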