Date Approved


Embargo Period


Document Type


Degree Name

M.S. Computer Science


Computer Science


College of Science & Mathematics


Vahid Heydari, Ph.D.

Committee Member 1

Shen-Shyang Ho, Ph.D.

Committee Member 2

Silvija Kokalj-Filipovic, Ph.D.


malware detection, machine learning, model agnostic language


Malware (computer software)


Computer Sciences


The adoption of the internet as a global platform has led to a significant rise in cyber-attacks involving many forms of malware, including Trojans, worms, spyware, ransomware, botnet malware, and rootkits. Tackling these threats requires the ability to understand and detect them. Existing malware detection methods include signature-based, behavior-based, and machine learning approaches, of which machine learning methods have proven to be the most efficient. This thesis proposes a hybrid system that combines signature-based and dynamic behavior-based detection techniques with an added machine learning layer that supports model explainability. The system provides not only predictions for a malware detection task but also their interpretation and explanation. The machine learning layer can be Logistic Regression, Random Forest, Naive Bayes, Decision Tree, or Support Vector Machine. These five algorithms are compared through empirical performance evaluation on publicly available datasets and manually acquired samples (both benign and malicious). DALEX (moDel Agnostic Language for Exploration and eXplanation) is integrated into the proposed hybrid approach to support the interpretation and understanding of its predictions, with the aim of improving the trust of cyber security stakeholders in complex machine learning predictive models.