Reconstructing Private Data from Trained Models

Type

Master's Thesis

Abstract

This thesis investigates whether Model Inversion (MI) attacks can be effectively adapted to tabular data, a domain where the privacy risks are underexplored compared to the image modality. To address this question, we propose a novel adaptation of the Pseudo-Label Guided Model Inversion (PLG-MI) attack to tabular data that uses a Conditional Tabular Generative Adversarial Network (CTGAN). To support this contribution, we propose new evaluation metrics, most notably class-level column shape scores, which measure the similarity between reconstructed and original private data. These metrics offer a practical means of evaluating the privacy risks posed by inversion attacks in the tabular setting. As an initial step, we reproduced the PLG-MI attack for images and verified that it remains robust even against target models trained on imbalanced private datasets. Then, by applying the adapted tabular attack to a deep neural network diagnosis classifier trained on the MIMIC-IV clinical dataset, we demonstrate that MI attacks generalize to the tabular domain: with default hyperparameters and minimal tuning, our method recovers sensitive features with high accuracy. We also find that transformations between structured and unstructured data, as well as the common use of tree-based models in the tabular domain, can prevent adversarial gradient access, thereby limiting the applicability of white-box model inversion attacks to specific scenarios. Overall, our results confirm that model inversion attacks pose a real privacy threat in the tabular domain while also clarifying the technical boundaries that define when such attacks are viable. Our work will be made available as part of the LeakPro repository: github.com/aidotse/leakpro
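
The abstract does not define the class-level column shape scores, so the following is only a minimal sketch of one plausible reading. It assumes the common convention for column shape scores (one minus the Kolmogorov-Smirnov statistic for numeric columns, one minus the total variation distance for categorical columns), averaged over columns and computed separately per class label. All function names, parameters, and the toy records are hypothetical and are not taken from the thesis or from LeakPro.

```python
from collections import Counter, defaultdict


def ks_complement(real, synth):
    """1 minus the two-sample Kolmogorov-Smirnov statistic (numeric columns)."""
    real, synth = sorted(real), sorted(synth)
    points = sorted(set(real) | set(synth))
    i = j = 0
    max_gap = 0.0
    for x in points:
        # Advance each pointer so i/len and j/len are the empirical CDFs at x.
        while i < len(real) and real[i] <= x:
            i += 1
        while j < len(synth) and synth[j] <= x:
            j += 1
        max_gap = max(max_gap, abs(i / len(real) - j / len(synth)))
    return 1.0 - max_gap


def tv_complement(real, synth):
    """1 minus the total variation distance of category frequencies."""
    p, q = Counter(real), Counter(synth)
    cats = set(p) | set(q)
    tvd = 0.5 * sum(abs(p[c] / len(real) - q[c] / len(synth)) for c in cats)
    return 1.0 - tvd


def class_level_column_shapes(private_rows, recon_rows, label_key, numeric_cols):
    """Mean per-column similarity, computed separately for each class label.

    Assumption: rows are dicts sharing the same keys; columns listed in
    numeric_cols use the KS complement, all others the TV complement.
    """
    def by_class(rows):
        groups = defaultdict(list)
        for row in rows:
            groups[row[label_key]].append(row)
        return groups

    private, recon = by_class(private_rows), by_class(recon_rows)
    scores = {}
    # Only classes present in both sets can be compared.
    for label in private.keys() & recon.keys():
        cols = [c for c in private[label][0] if c != label_key]
        per_col = []
        for c in cols:
            a = [r[c] for r in private[label]]
            b = [r[c] for r in recon[label]]
            metric = ks_complement if c in numeric_cols else tv_complement
            per_col.append(metric(a, b))
        scores[label] = sum(per_col) / len(per_col)
    return scores


# Toy, made-up records purely for illustration.
private = [
    {"y": 0, "age": 30, "sex": "F"},
    {"y": 0, "age": 40, "sex": "M"},
    {"y": 1, "age": 70, "sex": "F"},
    {"y": 1, "age": 80, "sex": "F"},
]
scores = class_level_column_shapes(private, private, "y", {"age"})
```

Under this reading, a score near 1 for a class means the reconstructed marginal distributions closely match the private ones for that class, which is why a per-class breakdown is more informative for an inversion attack than a single dataset-wide average.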

Subject/keywords

Model Inversion Attacks, Reconstruction Attacks, Adversarial Machine Learning.
