- Improving the Accuracy of FFT-based GPGPU Ocean Surface Simulations (2022). In this paper, we explore how the current state of the art in real-time ocean simulations can be improved in terms of simulation accuracy while preserving performance. Current methods, both in academia and in industry, simulate an ocean model in frequency space on the GPU, convert the model to the spatial domain on an approximately frame-by-frame basis using the Fourier transform, and then read back the resulting heightfield to the CPU as input to the application’s physics engine. We propose a fully GPU-based simulation framework that eliminates these GPU readbacks, and with them the latency-induced simulation errors present in current solutions, while preserving both ocean interactivity and performance. Along with this report, we also present a prototype of our framework as an Unreal Engine project. Comparing our proposed framework with the current state of the art, we find: (1) a significant correction in simulation accuracy of boats and their wakes; (2) near-equivalent GPU performance and improved CPU performance; (3) the need to rewrite certain physics behaviors for the GPU that are commonly available as built-in functionality in modern CPU-based physics engines; and (4) an arguably more complicated implementation. We conclude that the errors are significant enough to consider in related work and that the proposed approach is worth investigating further in future work. The prototype code is available at: https://github.com/NeonSky/master-thesis
- Automated Penetration Tester in a Telecommunication Network (2022). In the modern world of networks, there is a plethora of vulnerabilities present in every possible part of software and hardware. Companies can never claim that their product or service is secure, since this is impossible to prove. Malicious actors can therefore exploit a system to gain information or capital and to disrupt the service. This poses a threat to organizations and users, since confidential information could be compromised. To prevent vulnerabilities in systems, penetration testing is used: ethical hackers look for exploits that can later be patched to secure the system. Penetration testing is a manual task that uses automated tools to speed up repetitive work, freeing the tester to focus on the parts that demand creativity or human intuition. There is a vast number of tools that contribute to improving testing. Many of these tools are designed to work against one host at a time, and only against hosts directly connected to the tool host. There are relevant studies on automating penetration testing; one example is AI agents that have successfully learned vulnerabilities and exploited them. There is also relevant research on enabling agents to spread to multiple nodes and perform actions controlled by a master, mimicking distributed attack patterns closer to human behavior. This paper aims to develop an automated penetration tester with the ability to perform tests on nodes indirectly, enabling widespread testing on multiple machines. The goal is to increase testing and improve usability. To test this, we have developed a proof of concept, a modular tool named Hinser, capable of performing attacks on targets from an intermediate host that relays executions sent from the tool host. This includes: gathering information about a target; scanning a target internally and externally with known tools to analyze vulnerabilities; exploiting the target; returning successful results; and creating regression tests for future testing. Hinser was successful at these tasks and could perform indirect testing against the targets.
- Statistics Monitor Design for Data Flow and Performance Analysis of an AMBA-Based SoC System (2022). Today’s advanced systems-on-chip (SoCs) contain multiple intellectual properties (IPs) and billions of transistors, all packed into an ultra-small form factor. All of it needs to perform flawlessly, meeting demanding power and performance goals on tight schedules. Hence, the complexity of SoCs is sharply increasing. However, system performance does not scale linearly with gate count. Understanding the internal, dynamic behavior of the system and utilizing resources constructively is therefore critical in SoC design. In this thesis, we present a statistics monitor capable of monitoring the data flow and performance metrics of AMBA-based SoC systems. The study considers performance parameters such as system-level throughput, latency, and bus efficiency, and the statistics monitor outputs these statistics. The data obtained from the monitor unit provide insight into the SoC design by assisting in the detection of performance bottlenecks in the system.
- Low latency video analytics system with multi-exit neural networks (2022). Computer vision-based control systems have become increasingly powerful and promising in tackling real-world problems. This can be attributed to the use of deep learning methods in these systems, with state-of-the-art performance sometimes surpassing humans in tasks that require subjective decision making. This has resulted in increased interest in these systems from Swedish industry, including Volvo. One example is the Volvo GPSS system, where semantic segmentation is used to make real-time decisions based on pixel-level classification of a monitored area. However, such systems frequently face a trade-off between latency and accuracy. This is primarily due to the increasing number of layers used in deep neural network models for vision systems, which consume the same resources regardless of input complexity. In this thesis, we develop an approach that employs an input-adaptive multi-exit strategy to exploit the latency benefits of dynamic processing based on input complexity. The proposed approach aims to reduce average inference time: simple input samples take an early exit, and only complex samples require the computation offered by all the model layers. The open-source CityScapes dataset and the Volvo dataset were used in a number of multi-exit semantic segmentation experiments, with the HRNet architecture chosen as the backbone. The thesis studies three novel exit strategies, based on reinforcement learning, auxiliary models, and the fast Fourier transform. Of the methods examined, the reinforcement learning-based exit strategy displayed the best performance, with accuracy on par with unbranched HRNet and a significant decrease in latency and computation.
- Traffic isolation techniques for Networks-on-Chip (2022). As the number of cores available on modern multiprocessor systems-on-chip increases, the traditional bus interconnect fails to provide enough scalability to handle the increased network load. To address this shortcoming, an interconnection network called a network-on-chip (NoC) can be used to provide better performance and scalability with the number of cores, supporting simultaneous transmission of multiple messages from different cores. However, this type of network has some security vulnerabilities. An attacker performing a denial-of-service attack can overload the network, potentially preventing critical applications from communicating properly. Attackers can also potentially deduce the contents of network traffic from fluctuations in response latencies, an approach known as a timing side-channel attack. By isolating traffic flows, the potential impact of these problems can be reduced. This thesis presents a network-on-chip featuring three techniques that give the user tools to isolate traffic flows: (1) source throttling, (2) fixed virtual channel allocation per traffic flow, and (3) fixed timeslots for the switch allocator. Source throttling can be used to limit the traffic injection rate of problematic nodes. By statically allocating virtual channels to high-priority flows, packets belonging to these flows can be given contention-free access to resources of the NoC. Finally, schedulable switch allocator timeslots prevent malicious nodes from using timing information to find out when and what a node is transmitting. The effectiveness of the different techniques in protecting against attacks is evaluated through simulation. The results show that source throttling can protect against denial-of-service attacks with few aggressor nodes but cannot protect against timing side-channel attacks. Fixed allocation of virtual channels effectively protects against denial-of-service attacks, even with many aggressor nodes, but does not protect against timing side-channel attacks. Separate switch allocator timeslots are not effective on their own, but combining fixed virtual channel allocation with separate switch allocator timeslots is shown to make protection against timing side-channel attacks possible.
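
The source throttling idea in the last abstract above, limiting the traffic injection rate of a node, can be illustrated with a token bucket. This is only a minimal sketch under stated assumptions, not the implementation from the thesis: the class `SourceThrottle`, its `rate` and `burst` parameters, and the helper `injected_flits` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SourceThrottle:
    """Hypothetical per-node injection limiter modeled as a token bucket."""

    rate: float          # tokens (flits) earned per cycle; assumed parameter
    burst: int           # maximum number of tokens that can accumulate; assumed
    tokens: float = 0.0  # current token balance

    def tick(self, wants_to_inject: bool) -> bool:
        """Advance one cycle and report whether a flit may be injected."""
        # Earn tokens for this cycle, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + self.rate)
        # Spend one token per injected flit; otherwise the node must wait.
        if wants_to_inject and self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def injected_flits(throttle: SourceThrottle, cycles: int) -> int:
    """Count flits admitted when the node tries to inject on every cycle."""
    return sum(throttle.tick(True) for _ in range(cycles))
```

With `rate=0.25`, a node that tries to inject on every cycle is admitted only once every four cycles, however aggressive it is. This bounds the load one node can place on the network, which is consistent with the abstract's finding that throttling helps against denial-of-service attacks but does nothing to hide timing information.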