Date Approved

6-4-2024

Embargo Period

6-4-2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.)

Department

Electrical and Computer Engineering

College

Henry M. Rowan College of Engineering

Advisor

Shreekanth Mandayam, Ph.D.

Committee Member 1

Nidhal Bouaynaya, Ph.D.

Committee Member 2

Mira Lalovic-Hand, Ph.D.

Committee Member 3

Ravi Ramachandran, Ph.D.

Committee Member 4

John Schmalzel, Ph.D.

Keywords

image processing; image sensing; machine learning; neural networks

Subject(s)

Artificial Intelligence; Approximation Theory

Disciplines

Artificial Intelligence and Robotics | Computer Sciences | Electrical and Computer Engineering

Abstract

Artificial Intelligence (AI) has exploded into mainstream consciousness with commercial investments exceeding $90 billion in the last year alone. Inasmuch as consumer-facing applications such as ChatGPT offer astounding access to algorithms that were hitherto restricted to academic research labs, the public focus on AI has created an avalanche of misinformation. The nexus of investor-driven hype, “surprising” inaccuracies in the answers provided by AI models – now anthropomorphically labeled as “hallucinations”, and impending legislation by well-meaning and concerned governments has resulted in a crisis of confidence in the science of AI. The primary driver for AI’s recent growth is the convergence of ubiquitous cloud computing technology and pivotal developments in artificial neural network (ANN) architectures and machine learning topologies, leading to the emergence and dominance of large language models. These models have cast aside the need for extensive data pre-processing and incorporate feature extraction into network training. In some cases, such as in the use of generative AI models for natural language processing, the results are astonishing, but also, in many ways, concerning. An overreliance on the network to perform multiple tasks, well beyond data interpolation, leads to output anomalies and errors. Also, increasing the size of datasets and the number of network layers to ensure better performance will soon hit a brick wall of the time and expense incurred to train the network. But of greatest concern, despite the race to advance the technology, is that we are unable to adequately describe the operation of AI models mathematically, and, therefore, we do not completely understand how such systems develop the relationships within the training data to produce results.
It is in this context that the research work presented in this dissertation comprehensively investigates the origins of ANNs based on approximation theory, in order to refocus efforts on data pre-processing, and away from the exclusive attention on the development of more and more complex architectures. In a sense, the science and mathematics of AI algorithms must provide a path for the robust, reliable, and rapid deployment of intelligent algorithms without an overreliance on technological prowess – inexpensive bandwidth and computing power. An algorithm for shape recognition in image data is developed based on geometric transformations, which forms the input data to any neural network architecture of choice. Results demonstrate that such intelligent pre-processing provides significantly more control to the network designer than brute-force methods of exponentially increasing network layers.
