Add noise to remove noise: Local differential privacy for feature selection

Mina Alishahi*, Vahideh Moghtadaiee, Hojjat Navidan

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Feature selection has become increasingly important for data analysis. It selects the most informative features describing the data, filtering out the noise, complexity, and over-fitting caused by less relevant features. Accordingly, feature selection improves predictors' accuracy, enables them to be trained faster and more cost-effectively, and provides a better understanding of the underlying data. While plenty of practical solutions have been proposed in the literature to identify the most discriminating features describing a dataset, an understanding of feature selection over privacy-sensitive data in the absence of a trusted party is still missing. The design of such a framework is especially important in modern society, where each individual with Internet access can simultaneously play the role of data provider and data-analysis beneficiary. In this study, we propose a novel feature selection framework based on Local Differential Privacy (LDP), named LDP-FS, which estimates the importance of features over securely protected data while protecting the confidentiality of each individual's data before it leaves the user's device. The performance of LDP-FS in terms of scoring and ordering the features is assessed by investigating the impact of dataset properties, privacy mechanisms, privacy levels, and feature selection techniques on this framework. The accuracy of classifiers trained on the subset of features selected by LDP-FS is also presented. Our experimental results demonstrate the effectiveness and efficiency of the proposed framework.
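
As a rough illustration of the local model described in the abstract, the sketch below perturbs each user's (feature value, class label) pair on the user's side and lets an untrusted aggregator score the feature from the noisy reports alone. It is not the LDP-FS implementation: the choice of k-ary randomized response as the privacy mechanism, mutual information as the feature score, and all function names are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def grr_perturb(value, k, epsilon, rng):
    """User side: k-ary randomized response on a value in {0, ..., k-1}."""
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return value
    # Otherwise report one of the other k-1 values uniformly at random.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1


def grr_estimate(reports, k, epsilon):
    """Aggregator side: unbiased frequency estimates from noisy reports."""
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)   # probability of reporting the true value
    q = 1.0 / (e + k - 1)  # probability of reporting any one other value
    counts = Counter(reports)
    return [(counts.get(v, 0) / n - q) / (p - q) for v in range(k)]


def mutual_information(joint, n_x, n_y):
    """Score a feature from an estimated joint distribution over (x, y).

    `joint` is a flat list indexed by x * n_y + y. Negative estimates are
    clipped to zero and the result is renormalized (a simple heuristic).
    """
    clipped = [max(f, 0.0) for f in joint]
    total = sum(clipped)
    pj = [f / total for f in clipped]
    px = [sum(pj[x * n_y + y] for y in range(n_y)) for x in range(n_x)]
    py = [sum(pj[x * n_y + y] for x in range(n_x)) for y in range(n_y)]
    mi = 0.0
    for x in range(n_x):
        for y in range(n_y):
            pxy = pj[x * n_y + y]
            if pxy > 0.0:
                mi += pxy * math.log(pxy / (px[x] * py[y]))
    return mi


def private_feature_score(xs, ys, n_x, n_y, epsilon, rng):
    """Perturb each (x, y) pair locally, then score the feature centrally."""
    k = n_x * n_y
    reports = [grr_perturb(x * n_y + y, k, epsilon, rng) for x, y in zip(xs, ys)]
    return mutual_information(grr_estimate(reports, k, epsilon), n_x, n_y)


if __name__ == "__main__":
    rng = random.Random(0)
    labels = [rng.randrange(2) for _ in range(20000)]
    informative = list(labels)                      # feature equal to the label
    irrelevant = [rng.randrange(2) for _ in labels]  # feature independent of it
    print("informative:", private_feature_score(informative, labels, 2, 2, 2.0, rng))
    print("irrelevant: ", private_feature_score(irrelevant, labels, 2, 2, 2.0, rng))
```

Ranking features by such a score gives an ordering that can be compared against the ordering obtained on unperturbed data. In this sketch each user spends the full privacy budget on a single feature; how a budget would be divided across many features is left open here.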

Original language: English
Article number: 102934
Number of pages: 22
Journal: Computers and Security
Volume: 123
Publication status: Published - Dec 2022

Keywords

  • Feature ranking
  • Feature selection
  • Local differential privacy
  • Machine learning
  • Privacy preserving
