Date Approved

6-2-2023

Embargo Period

6-6-2023

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Electrical & Computer Engineering

College

Henry M. Rowan College of Engineering

Advisor

Robi Polikar, Ph.D.

Committee Member 1

Ravi Ramachandran, Ph.D.

Committee Member 2

Gregory Ditzler, Ph.D.

Committee Member 3

Shen-Shyang Ho, Ph.D.

Committee Member 4

Ghulam Rasool, Ph.D.

Subject(s)

Machine learning

Disciplines

Electrical and Computer Engineering | Engineering

Abstract

Machine learning models are an ever-growing and increasingly pervasive presence in everyday life; we entrust these models, and the systems built on them, with some of our most sensitive information and security applications. However, for all of the trust that we place in these models, it is essential to recognize that such models are simply reflections of the data and labels on which they are trained. To wit, if the data and labels are suspect, then so too are the models that we rely on. Yet, as larger and more comprehensive datasets become standard in contemporary machine learning, it becomes increasingly difficult to obtain reliable, trustworthy label information. While recent work has begun to investigate mitigating the effects of noisy labels, to date this critical field has been disjointed and disconnected, despite the common goal. In this work, we propose a new model of label noise, which we call "labeler-dependent noise (LDN)." LDN extends and generalizes the canonical instance-dependent noise model to multiple labelers, and unifies every preceding modeling strategy under a single umbrella. Furthermore, studying the LDN model leads us to propose a more general, modular framework for noise-robust learning called "labeler-aware learning (LAL)." Our comprehensive suite of experiments demonstrates that, unlike previous methods that are unable to remain robust under the general LDN model, LAL retains its full learning capabilities under extreme, and even adversarial, conditions of label noise. We believe that LDN and LAL should mark a paradigm shift in how we learn from labeled data, so that we may both discover new insights about machine learning and develop more robust, trustworthy models on which to build our daily lives.
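To make the modeling distinction concrete, the sketch below illustrates what a labeler-dependent noise process might look like under the framing described in the abstract: each labeler applies their own instance-dependent corruption function to the clean labels, so the canonical single-labeler instance-dependent model is recovered as the special case of one labeler. The function names, noise-rate scaling, and toy data here are illustrative assumptions, not the dissertation's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)


def instance_dependent_flip(x, y, base_rate, n_classes):
    """Flip label y with a probability that depends on the instance x.

    Illustrative only: the flip probability is scaled by the feature norm,
    standing in for any instance-dependent noise function.
    """
    p_flip = min(1.0, base_rate * (1.0 + np.linalg.norm(x) / 10.0))
    if rng.random() < p_flip:
        # Choose a different class uniformly at random.
        return rng.choice([c for c in range(n_classes) if c != y])
    return y


def labeler_dependent_labels(X, y, labeler_rates, n_classes=3):
    """Simulate labeler-dependent noise (LDN): each labeler has their own
    instance-dependent corruption process applied to the clean labels.

    Returns an array of shape (n_labelers, n_samples) of noisy labels.
    With a single labeler this reduces to canonical instance-dependent noise.
    """
    noisy = np.empty((len(labeler_rates), len(y)), dtype=int)
    for k, rate in enumerate(labeler_rates):
        noisy[k] = [instance_dependent_flip(x, yi, rate, n_classes)
                    for x, yi in zip(X, y)]
    return noisy


# Toy data: 100 samples, 5 features, 3 classes; three labelers of varying
# quality, the last one heavily (adversarially) noisy.
X = rng.normal(size=(100, 5))
y = rng.integers(0, 3, size=100)
noisy_labels = labeler_dependent_labels(X, y, labeler_rates=[0.05, 0.2, 0.8])
print((noisy_labels != y).mean(axis=1))  # empirical noise rate per labeler
```

In this view, a labeler-aware learner would receive both the noisy labels and the identity of the labeler who produced each one, rather than a single collapsed label vector; the specifics of how LAL exploits that information are developed in the dissertation itself.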
