Robust Approaches for Learning with Noisy Labels

Lu, Yangdi

Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/27915

Title:	Robust Approaches for Learning with Noisy Labels
Authors:	Lu, Yangdi
Advisor:	He, Wenbo
Department:	Computing and Software
Keywords:	Machine Learning;Weakly Supervised Learning;Classification
Publication Date:	2022
Abstract:	Deep neural networks (DNNs) have achieved remarkable success in data-intensive applications, but this success relies heavily on massive, carefully labeled datasets. In practice, obtaining large-scale datasets with correct labels is often expensive, time-consuming, and sometimes even impossible. Common approaches to constructing datasets involve error-prone processes, such as automatic labeling or crowdsourcing, which inherently introduce noisy labels. It has been observed that noisy labels severely degrade the generalization performance of classifiers, especially overparameterized (deep) neural networks. Therefore, studying noisy labels and developing techniques for training accurate classifiers in the presence of noisy labels is of great practical significance. In this thesis, we conduct a thorough study of learning with noisy labels (LNL) and provide a comprehensive error decomposition to reveal its core issue. We then point out that this core issue is that the empirical risk minimizer is unreliable, i.e., DNNs are prone to overfitting noisy labels during training. To reduce the learning errors, we propose five methods: 1) Co-matching: a framework consisting of two networks that prevents the model from memorizing noisy labels; 2) SELC: a simple method that progressively corrects noisy labels and refines the model; 3) NAL: a regularization method that automatically distinguishes mislabeled samples and prevents the model from memorizing them; 4) EM-enhanced loss: a family of robust loss functions that not only mitigates the influence of noisy labels but also avoids the underfitting problem; 5) MixNN: a framework that trains the model with new synthetic samples to mitigate the impact of noisy labels.
Our experimental results demonstrate that the proposed approaches achieve performance comparable to or better than state-of-the-art approaches on benchmark datasets with simulated label noise and on large-scale datasets with real-world label noise.
URI:	http://hdl.handle.net/11375/27915
Appears in Collections:	Open Access Dissertations and Theses

Files in This Item:

File	Description	Size	Format
Yangdi_Lu_202209_Doctor-of-Philosophy.pdf.pdf Open Access	Yangdi Lu's PhD thesis	48.35 MB	Adobe PDF
