Balancing Performance and Memory
| dc.contributor.author | LIU, YI | |
| dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
| dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
| dc.contributor.examiner | Gulisano, Vincenzo | |
| dc.contributor.supervisor | Gulisano, Vincenzo | |
| dc.date.accessioned | 2026-01-16T09:21:08Z | |
| dc.date.issued | 2025 | |
| dc.date.submitted | | |
| dc.description.abstract | Stream processing has become a cornerstone of real-time data analytics, particularly in edge-to-cloud computing environments where timely insights are crucial. Stream Aggregates, as fundamental stateful operators, maintain windowed state to compute summaries over continuous data streams. However, in resource-constrained edge deployments, managing memory efficiently while maintaining high performance remains a significant challenge. Recent work has shown that compressing infrequently accessed window instances can reduce memory usage, but existing approaches rely on fixed, manually configured thresholds: parameters that are difficult to tune without prior knowledge of data characteristics such as reporting frequency and access patterns. Moreover, these methods often overlook the impact of different compression libraries on system behavior. To address these limitations, this thesis presents a study on adaptive memory compression for stream aggregates. We first evaluate, within a stream processing context, the performance trade-offs of three widely used compression libraries: Snappy, Zstandard (Zstd), and JZlib. Our results show that each library offers a distinct balance between compression efficiency and processing overhead: Snappy provides superior throughput and latency at the cost of lower memory savings, while Zstd achieves better compression at the expense of higher CPU cost. Building on these findings, we propose a dynamic, self-tuning mechanism that automatically adjusts the compression threshold D based on runtime feedback. Instead of requiring analysts to specify a fixed D, our approach allows them to define a target range for a performance metric, such as the non-compressed/compressed (n/c) ratio, and the system adapts D online using simple adjustment rules (a minimal sketch of one such rule follows this record). This enables robust and predictable compression behavior under varying workloads, without requiring expert knowledge. We implement and evaluate our approach using the Liebre stream processing engine and the Linear Road benchmark. Experimental results demonstrate that our adaptive mechanism effectively stabilizes the n/c ratio within user-defined bounds, with negligible performance overhead compared to an optimally tuned fixed-D configuration. The dynamic strategy proves resilient to workload fluctuations and configuration resets, making it suitable for real-world, unpredictable environments. | |
| dc.identifier.coursecode | DATX05 | |
| dc.identifier.uri | http://hdl.handle.net/20.500.12380/310908 | |
| dc.language.iso | eng | |
| dc.setspec.uppsok | Technology | |
| dc.subject | stream processing | |
| dc.subject | aggregate | |
| dc.subject | memory compression | |
| dc.title | Balancing Performance and Memory | |
| dc.type.degree | Examensarbete för masterexamen | sv |
| dc.type.degree | Master's Thesis | en |
| dc.type.uppsok | H | |
| local.programme | Computer science – algorithms, languages and logic (MPALG), MSc |
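The adjustment rule described in the abstract can be pictured with a small, self-contained sketch. This is not the thesis implementation: the class and method names, the fixed step size, and the assumption that a smaller threshold D causes more window instances to be compressed (and therefore lowers the n/c ratio) are all illustrative assumptions.

```java
/**
 * Hypothetical sketch of an adaptive compression-threshold controller in the
 * spirit of the abstract above. All names, the step size, and the direction
 * of adjustment are assumptions made for illustration only.
 */
public final class AdaptiveCompressionController {

    private final double targetRatioLow;   // lower bound of the user-defined n/c range
    private final double targetRatioHigh;  // upper bound of the user-defined n/c range
    private final long step;               // assumed fixed adjustment step for D
    private long d;                         // current compression threshold D

    public AdaptiveCompressionController(double targetRatioLow, double targetRatioHigh,
                                         long initialD, long step) {
        this.targetRatioLow = targetRatioLow;
        this.targetRatioHigh = targetRatioHigh;
        this.d = initialD;
        this.step = step;
    }

    /**
     * Called periodically with the observed counts of non-compressed and
     * compressed window instances. Assumes a smaller D compresses window
     * instances sooner and therefore lowers the n/c ratio.
     */
    public long adjust(long nonCompressed, long compressed) {
        if (compressed == 0) {
            return d; // no feedback yet; keep the current threshold
        }
        double ratio = (double) nonCompressed / compressed;
        if (ratio > targetRatioHigh) {
            d = Math.max(1L, d - step); // too many uncompressed windows: compress sooner
        } else if (ratio < targetRatioLow) {
            d = d + step;               // over-compressing: relax the threshold
        }
        return d;                        // within the target range: leave D unchanged
    }
}
```

A controller of this shape would be invoked once per metric-reporting interval with the current counts of non-compressed and compressed window instances, and it leaves D untouched whenever the observed ratio already lies inside the user-defined range.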
