Date Approved
6-2-2023
Embargo Period
6-6-2023
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Electrical & Computer Engineering
College
Henry M. Rowan College of Engineering
Funder
U.S. Department of Education
Advisor
Robi Polikar, Ph.D.
Committee Member 1
Ravi Ramachandran, Ph.D.
Committee Member 2
Gregory Ditzler, Ph.D.
Committee Member 3
Shen-Shyang Ho, Ph.D.
Committee Member 4
Ghulam Rasool, Ph.D.
Subject(s)
Machine learning
Disciplines
Electrical and Computer Engineering | Engineering
Abstract
Machine learning is an ever-growing and increasingly pervasive presence in everyday life; we entrust these models, and systems built on these models, with some of our most sensitive information and security applications. However, for all of the trust that we place in these models, it is essential to recognize that such models are simply reflections of the data and labels on which they are trained. To wit, if the data and labels are suspect, then so too must be the models that we rely on—yet, as larger and more comprehensive datasets become standard in contemporary machine learning, it becomes increasingly difficult to obtain reliable, trustworthy label information. While recent work has begun to investigate mitigating the effect of noisy labels, to date this critical field has been disjointed and disconnected, despite the common goal. In this work, we propose a new model of label noise, which we call "labeler-dependent noise (LDN)." LDN extends and generalizes the canonical instance-dependent noise model to multiple labelers, and unifies every preceding modeling strategy under a single umbrella. Furthermore, studying the LDN model leads us to propose a more general, modular framework for noise-robust learning called "labeler-aware learning (LAL)." Our comprehensive suite of experiments demonstrates that, unlike previous methods that are unable to remain robust under the general LDN model, LAL retains its full learning capabilities under extreme, and even adversarial, conditions of label noise. We believe that LDN and LAL should mark a paradigm shift in how we learn from labeled data, so that we may both discover new insights about machine learning and develop more robust, trustworthy models on which to build our daily lives.
Recommended Citation
Dawson, Glenn, "A GENERAL MODEL FOR NOISY LABELS IN MACHINE LEARNING" (2023). Theses and Dissertations. 3123.
https://rdw.rowan.edu/etd/3123