Finding Needles in the Haystack - A CEP Approach to Detect Recurring Grid Issues
| dc.contributor.author | Larsson, Erik | |
| dc.contributor.author | Ngo, Josef | |
| dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
| dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
| dc.contributor.examiner | Massimiliano Gulisano, Vincenzo | |
| dc.contributor.supervisor | Massimiliano Gulisano, Vincenzo | |
| dc.date.accessioned | 2025-10-28T14:26:11Z | |
| dc.date.issued | 2025 | |
| dc.date.submitted | | |
| dc.description.abstract | The digitalization of electricity grids through the Advanced Metering Infrastructure (AMI) has led to unprecedented volumes of data and automated event generation from smart meters. While this enhanced monitoring capability provides valuable insights into grid conditions, the high frequency of generated events creates challenges for utility companies, which must distinguish between routine fluctuations and genuine operational issues. This thesis investigates the application of Complex Event Processing (CEP) to improve anomaly detection in smart meter data by correlating meter-generated events (e.g., overvoltage or physical tampering) with time-series measurement data (e.g., electricity consumption or voltage). Using real-world datasets from Göteborg Energi’s AMI system, three CEP-based queries were implemented in Apache Flink: a terminal cover dismounted query that detects consumption anomalies following physical tampering events, an overvoltage query that identifies sustained voltage problems rather than momentary spikes, and a duplicate timestamp query that reveals systematic data collection issues. The terminal cover dismounted query achieved filtering ratios between 0.46% and 1.00%, reducing 1,413 events to 6-13 actionable anomalies with 46-50% precision. The overvoltage query demonstrated filtering ratios from 0.006% to 1.44%, effectively reducing 98,547 events to manageable numbers for analyst review. The duplicate timestamp query discovered a previously unknown systematic issue affecting nearly 269,000 meters. The queries were evaluated using throughput, CPU usage, memory usage, and latency. Performance evaluation demonstrates strong scalability characteristics, with a processing throughput of approximately 400,000 tuples per second, significantly exceeding the estimated production data flow of 4,000 tuples per second. The system maintained consistent CPU usage and conservative memory requirements (approximately 10 GB peak), supporting practical deployment with resources available at utility companies. Latency evaluation for the terminal cover dismounted query and the duplicate timestamp query showed a median between 21 and 50 seconds, while the overvoltage query had a higher median between 230 and 270 seconds, with outliers above 2,600 seconds. These findings show that CEP can enhance anomaly detection in AMI systems, enabling automated correlation of events and measurements while reducing false positives and analyst workload. The approach shows promise for grid monitoring applications and provides a foundation for more sophisticated anomaly detection systems using CEP. Future research can investigate ways to further minimize latency, enabling operators to detect and respond to anomalies more promptly. | |
| dc.identifier.uri | http://hdl.handle.net/20.500.12380/310682 | |
| dc.setspec.uppsok | Technology | |
| dc.subject | Complex Event Processing | |
| dc.subject | Stream Processing | |
| dc.subject | Smart Grid | |
| dc.subject | Advanced Metering Infrastructure | |
| dc.subject | Anomaly Detection | |
| dc.subject | Apache Flink | |
| dc.title | Finding Needles in the Haystack - A CEP Approach to Detect Recurring Grid Issues | |
| dc.type.degree | Examensarbete för masterexamen | sv |
| dc.type.degree | Master's Thesis | en |
| dc.type.uppsok | H | |
| local.programme | Computer systems and networks (MPCSN), MSc | |
