Is This Data Point in Your Training Set? Similarity-based Inference Attacks: Performance Evaluation of Range Membership Attacks to Audit Privacy Risks in Machine Learning Models
| dc.contributor.author | Nawrin Nova, Sifat | |
| dc.contributor.author | Saha, Shubham | |
| dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
| dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
| dc.contributor.examiner | Rhouma, Rhouma | |
| dc.contributor.supervisor | Duvignau, Romaric | |
| dc.date.accessioned | 2025-12-03T14:59:43Z | |
| dc.date.issued | 2025 | |
| dc.date.submitted | ||
| dc.description.abstract | Membership Inference Attacks (MIAs) pose a serious threat to the privacy of machine learning (ML) models by determining whether a specific data point was used during model training. A recent and powerful variant, the Range Membership Inference Attack (RaMIA), assesses privacy risks over a range of semantically similar data points. However, its practical application is limited by high query overhead, as it requires querying the target model for every sample in the range. This thesis proposes and evaluates a novel approach that overcomes this limitation by combining range queries with group-testing principles, reducing the number of queries sent to the target model without sacrificing attack performance and thereby making the attack stealthier. Instead of testing every sample, the method first groups similar data points based on their extracted features and then queries only a small number of strategically chosen representatives. All experiments are conducted on the CIFAR-10 dataset, and performance is compared against the standard RaMIA baseline. The results demonstrate that RaMIA with group testing reduces the number of queries by 84% in a setting with 50 augmentations. This work shows that even minor enhancements in query design and decoding strategy can yield substantial gains in auditing efficiency. Moreover, we provide practical recommendations for tuning key hyperparameters and integrate our attack into the LeakPro framework for reproducibility and broader adoption in privacy auditing of ML models. | |
| dc.identifier.coursecode | DATX05 | |
| dc.identifier.uri | http://hdl.handle.net/20.500.12380/310799 | |
| dc.language.iso | eng | |
| dc.setspec.uppsok | Technology | |
| dc.subject | Machine Learning | |
| dc.subject | Membership Inference Attacks | |
| dc.subject | Range Membership Inference Attacks | |
| dc.subject | Group Testing | |
| dc.subject | Privacy Auditing | |
| dc.subject | Decoding | |
| dc.title | Is This Data Point in Your Training Set? Similarity-based Inference Attacks: Performance Evaluation of Range Membership Attacks to Audit Privacy Risks in Machine Learning Models | |
| dc.type.degree | Examensarbete för masterexamen | sv |
| dc.type.degree | Master's Thesis | en |
| dc.type.uppsok | H | |
| local.programme | Computer systems and networks (MPCSN), MSc |
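The abstract's query-reduction idea (group similar samples by features, then query only a few representatives instead of every sample in the range) can be sketched as follows. This is a minimal illustrative sketch, not the thesis implementation: the coarse feature-bucketing stands in for whatever clustering the thesis uses, and all function names (`group_by_features`, `select_representatives`) are hypothetical.

```python
import random

def group_by_features(features, num_groups):
    """Assign each sample index to a bucket derived from a coarsened
    feature vector. A crude stand-in for feature-based clustering."""
    buckets = {}
    for idx, vec in enumerate(features):
        key = hash(tuple(round(v, 1) for v in vec)) % num_groups
        buckets.setdefault(key, []).append(idx)
    return list(buckets.values())

def select_representatives(groups, per_group=1, seed=0):
    """Pick a few members per group; only these representatives would be
    sent as queries to the target model."""
    rng = random.Random(seed)
    reps = []
    for members in groups:
        reps.extend(rng.sample(members, min(per_group, len(members))))
    return reps

# Demo: 50 augmented samples with 4-dimensional features, as in the
# 50-augmentation setting mentioned in the abstract.
rng = random.Random(42)
features = [[rng.random() for _ in range(4)] for _ in range(50)]
groups = group_by_features(features, num_groups=8)
reps = select_representatives(groups)
# Queries drop from len(features) = 50 to len(reps) <= 8.
```

With one representative per group, the number of model queries is bounded by the number of groups rather than the range size, which is the source of the query savings the abstract reports.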
