Date Approved

8-23-2016

Embargo Period

8-24-2016

Document Type

Thesis

Degree Name

M.S. Electrical and Computer Engineering

Department

Electrical and Computer Engineering

College

Henry M. Rowan College of Engineering

Advisor

Ramachandran, Ravi

Committee Member 1

Thayasivam, Umashanger

Committee Member 2

Schmalzel, John

Keywords

affine transform, feature enhancement, GMM classifier, speaker recognition, speech coding distortion

Subject(s)

Automatic speech recognition; Speech processing systems

Disciplines

Electrical and Computer Engineering

Abstract

For wireless remote access security, forensics, border control and surveillance applications, there is an emerging need for biometric speaker recognition systems to be robust to speech coding distortion. This thesis examines the robustness issue for three coders, namely, the ITU-T 6.3 kilobits per second (kbps) G.723.1, the ITU-T 8 kbps G.729 and the 12.2 kbps 3GPP GSM-AMR coder. Both speaker identification (SI) and speaker verification (SV) systems are considered and use a Gaussian mixture model (GMM) classifier. The systems are trained on clean speech and tested on the decoded speech. To mitigate the performance loss due to mismatched training and testing conditions, four robust features, two enhancement approaches and feature (SI) and score (SV) based fusion strategies are implemented.

The first enhancement method, a novel proposal, is feature compensation based on an affine transform that maps the features from the test scenario to the training scenario. The second is the McCree signal enhancement approach, which is based on spectral envelope information. A detailed two-way analysis of variance (ANOVA), supplemented with a multiple comparison test, is performed to show the statistical significance of applying these enhancement methods.
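To illustrate the idea of affine feature compensation, the following is a minimal sketch and not the procedure used in the thesis. It assumes paired feature frames from decoded (test) and clean (train) speech and estimates the matrix and offset by ordinary least squares; the abstract does not state how the transform is actually estimated. The function names `fit_affine` and `apply_affine`, the feature dimension, and the synthetic distortion are all hypothetical.

```python
import numpy as np

def fit_affine(test_feats, train_feats):
    """Least-squares estimate of A, b so that train ≈ test @ A.T + b.

    Assumes row-aligned (paired) frames; this pairing is an assumption
    made for illustration only.
    """
    n = test_feats.shape[0]
    # Append a column of ones so the offset b is estimated jointly with A.
    X = np.hstack([test_feats, np.ones((n, 1))])
    W, *_ = np.linalg.lstsq(X, train_feats, rcond=None)  # shape (d + 1, d)
    A = W[:-1].T
    b = W[-1]
    return A, b

def apply_affine(feats, A, b):
    """Map features from the test condition toward the training condition."""
    return feats @ A.T + b

# Synthetic stand-in data: "clean" features and an affine-distorted copy.
rng = np.random.default_rng(0)
clean = rng.normal(size=(500, 4))
A_true = 0.8 * np.eye(4) + 0.05          # hypothetical distortion matrix
b_true = np.array([0.1, -0.2, 0.3, 0.0])  # hypothetical distortion offset
coded = clean @ A_true.T + b_true

A, b = fit_affine(coded, clean)
restored = apply_affine(coded, A, b)
```

Because the synthetic distortion here is exactly affine and invertible, the fitted transform recovers the clean features up to numerical precision. Real coder distortion is not exactly affine, so in practice the mapping would only reduce, not eliminate, the train/test mismatch.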
