Date Approved

4-3-2003

Embargo Period

5-2-2016

Document Type

Thesis

Degree Name

M.S. in Electrical Engineering

Department

Electrical & Computer Engineering

College

Henry M. Rowan College of Engineering

Advisor

Polikar, Robi

Subject(s)

Pattern recognition systems; Machine learning

Disciplines

Electrical and Computer Engineering

Abstract

Pattern recognition problems span a broad range of applications, each with its own tolerance for classification error. The varying levels of risk associated with these applications indicate the need for a versatile algorithm that can measure its own reliability. In this work, the supervised incremental learning algorithm Learn++ [1, 2], which exploits the synergistic power of an ensemble of classifiers, is further developed so that it can assess its own confidence. Estimation of the classifier's true generalization performance and estimation of the confidence in the classification of individual data instances are investigated separately. Several confidence estimation techniques are explored, including majority voting, variance-based confidence estimation, and the weighted exponential method.
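As a rough illustration of how an ensemble's votes can be turned into a per-instance confidence, the sketch below computes two estimates from the same set of weighted votes. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the `classes` argument, and in particular the softmax-style form used for the exponential estimate are assumptions, since the abstract does not give the formulas. The variance-based estimate is not shown for the same reason.

```python
import math


def vote_totals(predictions, weights, classes):
    """Sum the voting weight that each class receives from the ensemble."""
    totals = {label: 0.0 for label in classes}
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return totals


def majority_vote_confidence(predictions, weights, classes):
    """Confidence = share of the total voting weight won by the top class."""
    totals = vote_totals(predictions, weights, classes)
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())


def exponential_confidence(predictions, weights, classes):
    """Assumed softmax-style estimate over normalized vote totals.

    This form is an assumption made for illustration only.
    """
    totals = vote_totals(predictions, weights, classes)
    norm = sum(totals.values())
    scores = {label: math.exp(total / norm) for label, total in totals.items()}
    winner = max(scores, key=scores.get)
    return winner, scores[winner] / sum(scores.values())


# Hypothetical example: three classifiers vote on one instance.
votes = ["a", "a", "b"]
voting_weights = [2.0, 1.0, 1.0]
label_mv, conf_mv = majority_vote_confidence(votes, voting_weights, ["a", "b"])
label_exp, conf_exp = exponential_confidence(votes, voting_weights, ["a", "b"])
```

Both estimates rise as more voting weight agrees on one class, which is the behavior the abstract describes for correctly classified instances as further training data arrive.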

Experiments that incorporated confidence estimation techniques into Learn++ produced promising results, with weighted exponential confidence estimation providing the best performance. The objective of these experiments was to evaluate the algorithm's ability to assess the confidence in its own decisions. Beyond the ability of Learn++ to assess its confidence in both its generalization performance and its decisions on individual data instances, several desirable traits were observed. Confidence estimates for correctly classified test instances increase as additional subsets of training data become available. Furthermore, decreasing confidence levels are observed for misclassified test instances, which indicates that the algorithm can potentially detect its own misclassifications, effectively warning the user that the classifier outputs for those instances should be treated with caution. Finally, the confidences of correctly classified instances were generally high or very high after multiple incremental learning sessions.

Learn++ is, in essence, a procedure that sequentially generates and combines an ensemble of classifiers to achieve incremental learning and estimation of confidence levels. The algorithm was found to be independent of the specific classifier architecture used in the ensemble, and it has been shown to work with a variety of supervised classifiers, including multilayer perceptron (MLP), radial basis function (RBF), and probabilistic neural networks (PNN). In addition to this versatility with respect to the base classifier, it can be applied to a wide variety of problems. The applications explored in this work include volatile organic compound identification, nondestructive evaluation of piping and tubing in nuclear plants, handwritten character recognition, glass identification, iris plant recognition, and identification of structures within the ionosphere.
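The independence from classifier architecture can be pictured with a combiner that only ever sees each member's predicted label. The following is a hedged sketch, not the Learn++ algorithm itself: the `VotingEnsemble` class, its method names, and the AdaBoost-style voting weight `log(1 / beta)` with `beta = error / (1 - error)` are assumptions made for illustration, and the step that selects training subsets for each new classifier is omitted.

```python
import math
from typing import Callable, Hashable, List, Sequence, Tuple

# Any callable that maps a feature vector to a label can join the ensemble,
# so an MLP, RBF network, or PNN wrapper would all fit this interface.
Classifier = Callable[[Sequence[float]], Hashable]


class VotingEnsemble:
    """Hypothetical weighted-majority-vote combiner (illustrative only)."""

    def __init__(self) -> None:
        self.members: List[Tuple[Classifier, float]] = []

    def add(self, classifier: Classifier, error: float) -> None:
        """Add a classifier with a voting weight derived from its error."""
        if not 0.0 < error < 0.5:
            raise ValueError("error must lie strictly between 0 and 0.5")
        beta = error / (1.0 - error)  # assumed AdaBoost-style normalization
        self.members.append((classifier, math.log(1.0 / beta)))

    def predict(self, x: Sequence[float]) -> Hashable:
        """Return the label that collects the largest total voting weight."""
        totals = {}
        for classifier, weight in self.members:
            label = classifier(x)
            totals[label] = totals.get(label, 0.0) + weight
        return max(totals, key=totals.get)


# Hypothetical stand-ins for trained classifiers, added one after another.
ensemble = VotingEnsemble()
ensemble.add(lambda x: "pos" if x[0] > 0 else "neg", error=0.1)
ensemble.add(lambda x: "pos", error=0.4)
ensemble.add(lambda x: "pos" if x[1] > 0 else "neg", error=0.3)

prediction_a = ensemble.predict([-1.0, 1.0])
prediction_b = ensemble.predict([1.0, -1.0])
```

Because the combiner depends only on labels and weights, new members can be appended in later learning sessions without retraining or inspecting the earlier ones.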
