PREVENTING DATA POISONING ATTACKS IN FEDERATED MACHINE LEARNING BY AN ENCRYPTED VERIFICATION KEY

Mahdee, Jodayree

Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/29304

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	He, Wenbo	-
dc.contributor.advisor	Janicki, Ryszard	-
dc.contributor.author	Mahdee, Jodayree	-
dc.date.accessioned	2023-12-21T19:34:40Z	-
dc.date.available	2023-12-21T19:34:40Z	-
dc.date.issued	2024-06	-
dc.identifier.uri	http://hdl.handle.net/11375/29304	-
dc.description	Federated learning has gained attention recently for its ability to protect data privacy and distribute computing loads [1]. It overcomes the limitations of traditional machine learning algorithms by allowing computers to train on remote data inputs and build models while keeping participant privacy intact. Traditional machine learning offered a solution by enabling computers to learn patterns and make decisions from data without explicit programming. It opened up new possibilities for automating tasks, recognizing patterns, and making predictions. With the exponential growth of data and advances in computational power, machine learning has become a powerful tool in various domains, driving innovations in fields such as image recognition, natural language processing, autonomous vehicles, and personalized recommendations. In traditional machine learning, data is usually transferred to a central server, raising concerns about privacy and security. Centralizing data exposes sensitive information, making it vulnerable to breaches or unauthorized access. Centralized machine learning assumes that all data is available at a central location, which is not always practical or feasible. Some data may be distributed across different locations, owned by different entities, or subject to legal or privacy restrictions. Training a global model in traditional machine learning involves frequent communication between the central server and participating devices. This communication overhead can be substantial, particularly when dealing with large-scale datasets or resource-constrained devices.	en_US
dc.description.abstract	Recent studies have uncovered security issues with most federated learning models. One common false assumption in the federated learning model is that participants are not attackers and would not use polluted data. This vulnerability enables attackers to train their models using polluted data and then send the polluted updates to the training server for aggregation, potentially poisoning the overall model. In such a setting, it is challenging for an edge server to thoroughly inspect the data used for model training and supervise any edge device. This study evaluates the vulnerabilities present in federated learning and explores various types of attacks that can occur. This paper presents a robust prevention scheme to address these vulnerabilities. The proposed prevention scheme enables federated learning servers to actively monitor participants in real time and identify infected individuals by introducing an encrypted verification scheme. The paper outlines the protocol design of this prevention scheme and presents experimental results that demonstrate its effectiveness.	en_US
dc.language.iso	en_US	en_US
dc.subject	Federated Learning	en_US
dc.subject	Data Poisoning Attacks	en_US
dc.subject	Security Vulnerabilities	en_US
dc.subject	Model Aggregation	en_US
dc.subject	Edge Server Supervision	en_US
dc.subject	Attack Evaluation	en_US
dc.subject	Prevention Scheme	en_US
dc.subject	Real-time Monitoring	en_US
dc.subject	Encrypted Verification	en_US
dc.subject	Protocol Design	en_US
dc.subject	Experimental Results	en_US
dc.subject	Participant Identification	en_US
dc.subject	Robustness in FL	en_US
dc.subject	Machine Learning Security	en_US
dc.subject	Model Integrity	en_US
dc.title	PREVENTING DATA POISONING ATTACKS IN FEDERATED MACHINE LEARNING BY AN ENCRYPTED VERIFICATION KEY	en_US
dc.type	Thesis	en_US
dc.contributor.department	Computer Science	en_US
dc.description.degreetype	Thesis	en_US
dc.description.degree	Doctor of Philosophy (PhD)	en_US
dc.description.layabstract	Federated learning models face significant security challenges and can be vulnerable to attacks. For instance, federated learning models assume participants are not attackers and will not manipulate the data. However, in reality, attackers can compromise the data of remote participants by inserting fake data or altering existing data, which can result in polluted training results being sent to the server. For example, if the sample data is an animal image, attackers can modify it to contaminate the training data. This paper introduces a robust preventive approach to counter data pollution attacks in real time. It incorporates an encrypted verification scheme into the federated learning model, preventing poisoning attacks without the need for specific attack detection programming. The main contribution of this paper is a mechanism for detection and prevention that allows the training server to supervise training in real time and stop data modifications in each client's storage before and between training rounds. With this scheme, the training server can identify modifications in real time and remove infected remote participants.	en_US
Appears in Collections:	Open Access Dissertations and Theses

Files in This Item:

File	Description	Size	Format
Jodayree_Mahdee_MJ_202311_Ph.D. Computer Science.pdf Open Access	A THESIS SUBMITTED TO THE DEPARTMENT OF COMPUTING AND SOFTWARE AND THE SCHOOL OF GRADUATE STUDIES OF MCMASTER UNIVERSITY IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY	1.38 MB	Adobe PDF
