Date Approved
10-21-2015
Embargo Period
3-3-2020
Document Type
Thesis
Degree Name
M.S. Electrical and Computer Engineering
Department
Electrical and Computer Engineering
College
Henry M. Rowan College of Engineering
Advisor
Polikar, Robi
Subject(s)
Machine learning
Disciplines
Electrical and Computer Engineering
Abstract
An increasing number of real-world applications involve streaming data drawn from drifting, nonstationary distributions. These applications demand new algorithms that can learn from and adapt to such changes in the underlying distribution, a phenomenon known as concept drift. Proper characterization of such data with existing approaches typically requires a substantial number of labeled instances, which may be difficult, expensive, or even impractical to obtain. This thesis introduces compacted object sample extraction (COMPOSE), a computational geometry-based framework for learning from nonstationary streaming data in which labels are unavailable (or presented very sporadically) after initialization. The feasibility and performance of the algorithm are evaluated on several synthetic and real-world data sets, which present a variety of initially labeled streaming environments. On carefully designed synthetic data sets, we also compare the performance of COMPOSE against the optimal Bayes classifier, as well as against the arbitrary subpopulation tracker algorithm, which addresses a similar environment referred to as extreme verification latency. Furthermore, using the real-world National Oceanic and Atmospheric Administration weather data set, we demonstrate that COMPOSE is competitive even with a well-established, fully supervised nonstationary learning algorithm that receives labeled data in every batch.
Recommended Citation
Dyer, Karl, "COMPOSE: Compacted object sample extraction a framework for semi-supervised learning in nonstationary environments" (2015). Theses and Dissertations. 553.
https://rdw.rowan.edu/etd/553