Security Log Analysis with Explainable Machine Learning
Master's thesis
Physical access control systems restrict access in order to prevent attacks in the physical space. These systems usually produce access logs that record the accesses made by users. The logs can, however, grow large and difficult to interpret, which makes security assessment impractical for administrators; as a consequence, the logs are rarely inspected. Detecting anomalies by manual inspection, which is the current practice, is therefore often not a feasible way to prevent attacks. Anomaly detection using machine learning can instead help administrators detect attacks and proactively prevent them from happening again. In this thesis, we first analyze users from a dataset of physical access logs and cluster them into groups with similar behavior based on their access patterns. Next, we train two LSTM autoencoder models for each cluster in order to detect anomalies in access sequences of two different lengths. Finally, we evaluate the models with the help of a security expert from industry by reviewing explanations produced using SHAP values. The results show that our method reduced the number of log events that need to be manually inspected by 95.6% in the given dataset. The results also show that the explanations provided by SHAP values helped in understanding what caused an anomaly. In conclusion, our proposed method is advantageous compared to manual inspection because it greatly reduces the amount of work required to detect anomalies, and the SHAP values help security administrators work in a more proactive manner.
security, physical access control, anomaly detection, machine learning, deep learning, LSTM autoencoder, explainability, SHAP
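The abstract does not say how autoencoder output is turned into a list of anomalies. One common approach, shown here purely as an assumption and not as the thesis's method, is to flag sequences whose reconstruction error exceeds a threshold derived from errors on normal training data. The sketch below illustrates that idea and how a reduction in manual inspection could be computed. The function names, the `k` parameter, and all numbers are hypothetical and are not taken from the thesis.

```python
from statistics import mean, stdev


def error_threshold(train_errors, k=3.0):
    """Assumed rule: threshold = mean + k * sample std of training errors."""
    return mean(train_errors) + k * stdev(train_errors)


def flag_anomalies(errors, threshold):
    """Return indices of sequences whose reconstruction error exceeds the threshold."""
    return [i for i, e in enumerate(errors) if e > threshold]


def inspection_reduction(n_total, n_flagged):
    """Fraction of sequences that no longer need manual review."""
    return 1.0 - n_flagged / n_total


# Hypothetical reconstruction errors; these are NOT data from the thesis.
train_errors = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.11]
new_errors = [0.10, 0.95, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.11, 0.10]

threshold = error_threshold(train_errors)
flagged = flag_anomalies(new_errors, threshold)
reduction = inspection_reduction(len(new_errors), len(flagged))
print(flagged, round(reduction, 2))
```

In a setup like the one described in the abstract, such a threshold would presumably be fitted separately for each user cluster and each sequence length, and only the flagged sequences would be passed on for explanation and expert review.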