Industrial Video Anomaly Detection Using a Weakly Supervised Predictive Autoencoder
dc.contributor.author | Dittmer, Petter | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för elektroteknik | sv |
dc.contributor.examiner | Durisi, Giuseppe | |
dc.contributor.supervisor | Pettersson, Victor | |
dc.contributor.supervisor | Berntsson, Björn | |
dc.date.accessioned | 2025-06-18T08:42:49Z | |
dc.date.issued | 2025 | |
dc.date.submitted | ||
dc.description.abstract | In this thesis, a predictive autoencoder model and video data pipeline are formulated for detecting anomalies in production flow in industrial environments. Common modifications to deep neural networks, such as spatial attention blocks, dropout and skip connections, are investigated to assess whether they affect the overall anomaly detection performance for the intended industrial scenario. The project was carried out in cooperation with the company EyeAtProduction AB in Borås, Sweden. The model is designed for flexibility, robustness and short training times in new environments, rather than state-of-the-art performance. It uses a pretrained version of the image recognition network ResNet-18 to encode sequences of four video frames. The encoded frames are merged with a 1 × 1 convolution operation and then decoded via transposed convolutions, resulting in a prediction of the frame that follows the sequence. Because the network is trained only on footage of normal production, it becomes proficient at predicting normal movement and spatial features but struggles to reconstruct anomalous sequences and objects. Anomalies can therefore be detected based on the degree of error between the prediction and the true next frame. The models show promise in both controlled environments and real-world cases, but even with heavy data augmentation they remain sensitive to lighting changes and camera vibrations, making them prone to false positives. Further research is needed to reduce this problem; possible solutions include collecting larger and more diverse training sets, and making the threshold adapt to long-term shifts in the prediction scores during inference. | |
dc.identifier.coursecode | EENX30 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12380/309516 | |
dc.language.iso | eng | |
dc.relation.ispartofseries | 00000 | |
dc.setspec.uppsok | Technology | |
dc.subject | Anomaly detection | |
dc.subject | autoencoder | |
dc.subject | convolutional network | |
dc.subject | ResNet | |
dc.subject | production processes | |
dc.title | Industrial Video Anomaly Detection Using a Weakly Supervised Predictive Autoencoder | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master's Thesis | en |
dc.type.uppsok | H | |
local.programme | Engineering mathematics and computational science (MPENM), MSc |
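
The abstract detects anomalies from the error between the predicted and the true next frame, and suggests a threshold that follows long-term shifts in the prediction scores. The sketch below illustrates that idea only; it is not the thesis implementation. The names `prediction_error` and `AdaptiveThreshold`, the choice of mean squared error, the exponential moving average, and the parameters `alpha`, `k` and `min_std` are all assumptions made for illustration.

```python
from statistics import fmean


def prediction_error(predicted, actual):
    """Mean squared error between two frames given as flat lists of floats.

    Assumption: the thesis may use a different error measure.
    """
    if len(predicted) != len(actual):
        raise ValueError("frames must have the same number of pixels")
    return fmean((p - a) ** 2 for p, a in zip(predicted, actual))


class AdaptiveThreshold:
    """Hypothetical threshold that tracks slow drift in the score baseline.

    The baseline mean and variance are exponential moving averages, and a
    score is flagged when it exceeds mean + k * std. Flagged scores do not
    update the baseline, so an anomaly cannot raise its own threshold.
    """

    def __init__(self, alpha=0.1, k=3.0, min_std=1e-3):
        self.alpha = alpha      # smoothing factor of the moving averages
        self.k = k              # number of standard deviations allowed
        self.min_std = min_std  # floor that avoids a zero-width margin
        self.mean = None
        self.var = 0.0

    def threshold(self):
        return self.mean + self.k * max(self.var ** 0.5, self.min_std)

    def update(self, score):
        """Return True if `score` is flagged as anomalous."""
        if self.mean is None:   # the first score initialises the baseline
            self.mean = score
            return False
        is_anomaly = score > self.threshold()
        if not is_anomaly:
            delta = score - self.mean
            self.mean += self.alpha * delta
            self.var = (1 - self.alpha) * (self.var + self.alpha * delta ** 2)
        return is_anomaly


if __name__ == "__main__":
    detector = AdaptiveThreshold()
    # Slowly drifting "normal" scores, e.g. from a gradual lighting change.
    drift = [0.010 + 0.0002 * i for i in range(50)]
    print(any(detector.update(s) for s in drift))  # drift is not flagged
    print(detector.update(0.05))                   # sudden spike is flagged
```

A fixed threshold tuned to the first scores would eventually flag the drifting baseline as anomalous, whereas the moving baseline absorbs it. The trade-off is that a very slowly developing true anomaly could be absorbed in the same way, so `alpha` would need tuning per site.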