
Algorithms in Privacy & Security for Data Analytics and Machine Learning

dc.contributor.advisor: Samavi, Reza
dc.contributor.author: Liang, Yuting
dc.contributor.department: Computing and Software
dc.date.accessioned: 2020-04-28T05:21:57Z
dc.date.available: 2020-04-28T05:21:57Z
dc.date.issued: 2020
dc.description.abstract: Applications employing very large datasets are increasingly common in this age of Big Data. While these applications provide great benefits in various domains, their usage can be hampered by real-world privacy and security risks. In this work we propose algorithms that aim to provide privacy and security protection in different aspects of these applications. First, we address the problem of data privacy. When the datasets used contain personal information, they must be properly anonymized to protect the privacy of the subjects to whom the records pertain. A popular privacy-preservation technique is the k-anonymity model, which guarantees that any record in the dataset is indistinguishable from at least k-1 other records in terms of quasi-identifiers (i.e., the subset of attributes that can be used to deduce the identity of an individual). Achieving k-anonymity while considering the competing goal of data utility can be a challenge, especially for datasets containing large numbers of records. We formulate k-anonymization as an optimization problem with the objective of maximizing data utility, and propose two practical algorithms for solving this problem. Second, we address the problem of application security, specifically for predictive models using Deep Learning, where adversaries can use minimally perturbed inputs (a.k.a. adversarial examples) to cause a neural network to produce incorrect outputs. We propose an approach that protects against adversarial examples in image-classification networks. The approach relies on two mechanisms: 1) a mechanism that increases robustness at the expense of accuracy; and 2) a mechanism that improves accuracy. We show that an approach combining the two mechanisms can provide protection against adversarial examples while retaining accuracy. We provide experimental results to demonstrate the effectiveness of our algorithms for both problems.
dc.description.degree: Master of Science (MSc)
dc.description.degreetype: Thesis
dc.description.layabstract: Applications employing very large datasets are increasingly common in this age of Big Data. While these applications provide great benefits in various domains, their usage can be hampered by real-world privacy and security risks. In this work we propose algorithms that aim to provide privacy and security protection in different aspects of these applications. We address the problem of data privacy: when the datasets used contain personal information, they must be properly anonymized to protect the privacy of the subjects to whom the records pertain. We propose two practical, utility-centric anonymization algorithms. We also address the problem of application security, specifically for Deep Learning applications where adversaries can use minimally perturbed inputs to cause a neural network to produce incorrect outputs, and propose an approach that protects against these attacks. We provide experimental results to demonstrate the effectiveness of our algorithms for both problems.
dc.identifier.uri: http://hdl.handle.net/11375/25409
dc.language.iso: en
dc.subject: Privacy
dc.subject: Security
dc.subject: Anonymization
dc.subject: Machine Learning
dc.subject: Algorithms
dc.title: Algorithms in Privacy & Security for Data Analytics and Machine Learning
dc.type: Thesis
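
The abstract above defines k-anonymity as the requirement that every record share its quasi-identifier values with at least k-1 other records. The following is a minimal Python sketch of that check, not taken from the thesis; the column names and toy records are illustrative assumptions.

from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    # Count how often each combination of quasi-identifier values occurs.
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    # k-anonymity holds if every combination appears in at least k records.
    return all(c >= k for c in counts.values())

# Toy generalized records; age ranges and truncated ZIP codes act as quasi-identifiers.
records = [
    {"age": "30-39", "zip": "905**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "905**", "diagnosis": "cold"},
    {"age": "40-49", "zip": "416**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "416**", "diagnosis": "asthma"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))  # True: each group has 2 records
print(is_k_anonymous(records, ["age", "zip"], k=3))  # False: no group has 3 records

The anonymization algorithms in the thesis go further than this check: they choose how to generalize the quasi-identifier values so that k-anonymity holds while data utility is maximized.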

Files

Original bundle
Name: Liang_Yuting_2020Apr_MSc.pdf
Size: 2.72 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon at submission