Reconstructing Private Data from Trained Models
Published
Author
Type
Degree project for master's degree
Master's Thesis
Model builders
Journal title
ISSN
Volume title
Publisher
Abstract
This thesis investigates whether Model Inversion (MI) attacks can be effectively
adapted to tabular data, a domain where the risks are underexplored compared to the
image modality. To address this question, we propose a novel adaptation of the
Pseudo-Label Guided Model Inversion (PLG-MI) attack for tabular data by utilizing
a Conditional Tabular Generative Adversarial Network (CTGAN). In support
of this contribution, we propose new evaluation metrics, most notably class-level
column shape scores, which measure the similarity between reconstructed
and original private data. These metrics offer a practical means to evaluate the
privacy risks posed by inversion attacks in the tabular setting. As an initial step,
we reproduce the PLG-MI attack for images and verify that the attack remains
robust even on target models trained on imbalanced private datasets. Then, by
applying the adapted tabular attack to a deep neural network diagnosis classifier
trained on the MIMIC-IV clinical dataset, we demonstrate that sensitive features
can be recovered with high accuracy, even with default hyperparameters and
minimal tuning. This shows that MI attacks can generalize to the tabular
domain. We also find that the use of
transformations between structured and unstructured data, as well as the common
use of tree-based models in the tabular domain, can prevent adversarial gradient
access, thereby limiting the applicability of white-box model inversion attacks to
specific scenarios. Overall, our results confirm that model inversion attacks pose a
real privacy threat in the tabular domain while also clarifying the technical boundaries
that define when such attacks are viable. Our work will be made available as
a part of the LeakPro repository: github.com/aidotse/leakpro
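
The class-level column shape scores mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the metric, so this sketch assumes that a column shape score is the complement of the two-sample Kolmogorov-Smirnov statistic for numerical columns and the complement of the total variation distance for categorical columns, averaged over columns and computed separately for each class label. The function names, column names, and toy records are hypothetical and are not taken from the thesis or from LeakPro.

```python
from bisect import bisect_right
from collections import Counter


def ks_complement(real, synth):
    """1 - two-sample Kolmogorov-Smirnov statistic (1.0 = identical ECDFs)."""
    real, synth = sorted(real), sorted(synth)
    # The gap between the two empirical CDFs is maximal at an observed value.
    stat = max(
        abs(bisect_right(real, x) / len(real) - bisect_right(synth, x) / len(synth))
        for x in set(real) | set(synth)
    )
    return 1.0 - stat


def tv_complement(real, synth):
    """1 - total variation distance between category frequencies."""
    p, q = Counter(real), Counter(synth)
    tv = 0.5 * sum(abs(p[c] / len(real) - q[c] / len(synth)) for c in set(p) | set(q))
    return 1.0 - tv


def class_level_shape_scores(real_rows, synth_rows, label_key, categorical):
    """Average per-column shape score, computed separately for each class."""
    scores = {}
    for label in {r[label_key] for r in real_rows}:
        real = [r for r in real_rows if r[label_key] == label]
        synth = [r for r in synth_rows if r[label_key] == label]
        if not synth:
            scores[label] = None  # no reconstructed samples for this class
            continue
        per_column = []
        for col in real[0]:
            if col == label_key:
                continue
            metric = tv_complement if col in categorical else ks_complement
            per_column.append(metric([r[col] for r in real], [r[col] for r in synth]))
        scores[label] = sum(per_column) / len(per_column)
    return scores


# Toy records (hypothetical): class "A" is reconstructed exactly, class "B" is not.
private = [
    {"age": 70, "sex": "F", "diagnosis": "A"},
    {"age": 80, "sex": "M", "diagnosis": "A"},
    {"age": 30, "sex": "F", "diagnosis": "B"},
    {"age": 40, "sex": "F", "diagnosis": "B"},
]
reconstructed = [
    {"age": 70, "sex": "F", "diagnosis": "A"},
    {"age": 80, "sex": "M", "diagnosis": "A"},
    {"age": 60, "sex": "M", "diagnosis": "B"},
    {"age": 70, "sex": "M", "diagnosis": "B"},
]
scores = class_level_shape_scores(private, reconstructed, "diagnosis", {"sex"})
```

Under these assumptions, a score near 1 for a class means the reconstructed samples for that class match the marginal distributions of the private data closely, while a score near 0 means they do not. Scoring per class rather than over the whole table keeps a well-reconstructed majority class from masking a poorly reconstructed minority class.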
Description
Subject/keywords
Model Inversion Attacks, Reconstruction Attacks, Adversarial Machine Learning.