Date Approved
11-3-2014
Embargo Period
3-3-2020
Document Type
Thesis
Degree Name
M.S. Computer Science
Department
Computer Science
College
College of Science & Mathematics
Advisor
Hnatyshin, Vasil
Subject(s)
Data mining--Mathematics; Bioinformatics
Disciplines
Computer Sciences
Abstract
Metabolomics is the science of comprehensively evaluating changes in the metabolome, the set of metabolites present within an organism, cell, or tissue, with the goal of elucidating the underlying biological mechanisms of a living system. There is an opinion in the field that its future development is contingent upon two factors: the advancement of analytical instrumentation, and the development of data mining methodologies for extracting meaningful and interpretable experimental results. There are many different types of data mining methodologies, and selecting a particular technique for one's data is an intricate undertaking. The selection and application of a technique need to take into account issues such as justifiability, reproducibility, and traceability. The Random Forests methodology stands out among data mining techniques, since it can be used for classification, feature extraction, and analysis. The Random Forests algorithm has many customizable parameters that affect the outcome of a particular run, and identifying the best values for these parameters is a task in itself. My work focuses on the study of the Random Forests algorithm, and on determining its optimal configuration parameters, for sample classification in the field of metabolomics.
Recommended Citation
White, Curtis, "Using Random Forest in the field of metabolomics" (2014). Theses and Dissertations. 444.
https://rdw.rowan.edu/etd/444