BACKGROUND: Some of the current software tools for comparative metagenomics allow ecologists to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to identify the protein family abundances that best discriminate between age groups in the human gut microbiome.
RESULTS: We have developed Fizzy, a new Python command line tool for microbial ecologists that is compatible with the widely adopted BIOM format and implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets.
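As an illustration of what an information-theoretic subset selection method computes, the following minimal sketch ranks features by their estimated mutual information with the phenotype label and keeps the top k. It is not Fizzy's interface or implementation: the function names, the toy presence/absence table, the OTU names, and the labels are all hypothetical, and the scoring rule (ranking each feature independently) is only one simple member of this family of methods.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Estimate I(X; Y) in bits from two equal-length sequences of discrete values."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[a] * count_y[b]))
    return mi


def rank_features(table, labels, names, k):
    """Return the names of the k features with the highest mutual information.

    table  -- list of samples, each a list of discretized abundances
    labels -- phenotype label of each sample
    names  -- feature (e.g. OTU) name of each column
    """
    scores = []
    for j, name in enumerate(names):
        column = [row[j] for row in table]
        scores.append((mutual_information(column, labels), name))
    scores.sort(key=lambda s: -s[0])  # highest score first
    return [name for _, name in scores[:k]]


# Hypothetical toy data: 6 samples x 3 OTUs, abundances binned to 0/1.
names = ["OTU_A", "OTU_B", "OTU_C"]
table = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
    [0, 1, 1],
]
labels = ["infant", "infant", "infant", "adult", "adult", "adult"]

selected = rank_features(table, labels, names, k=2)
print(selected)
```

In this toy table OTU_A separates the two groups perfectly, OTU_B is weakly informative, and OTU_C is independent of the label, so the sketch selects OTU_A and OTU_B. Real abundance tables hold counts rather than binary values and would need to be discretized before such an estimate is meaningful.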
CONCLUSIONS: We have made the software implementation of Fizzy available to the public under the GNU General Public License (GPL). The standalone implementation can be found at http://github.com/EESI/Fizzy.
Ditzler, G., Morrison, J.C., Lan, Y. et al. Fizzy: feature subset selection for metagenomics. BMC Bioinformatics 16, 358 (2015). https://doi.org/10.1186/s12859-015-0793-8
This work is licensed under a Creative Commons Attribution 4.0 International License.