Date Approved
4-21-2025
Embargo Period
4-21-2025
Document Type
Thesis
Degree Name
Master of Science (M.S.) Bioinformatics
Department
Bioinformatics
College
College of Science & Mathematics
Advisor
Yong Chen, Ph.D.
Committee Member 1
Benjamin Carone, Ph.D.
Committee Member 2
Alison Krufka, Ph.D.
Keywords
Bioinformatics; Negative Binomial; Next Generation Sequencing; RNA Sequencing; Single Cell
Disciplines
Bioinformatics | Life Sciences
Abstract
Single-cell RNA sequencing (scRNA-seq) is a powerful high-throughput sequencing technology that quantifies the transcriptome at single-cell resolution. Differential expression (DE) analysis is a key scRNA-seq analysis task that identifies genes with statistically significant expression changes in response to biological stimuli. Existing DE methods implicitly attempt to determine whether two sets of negative binomially distributed read counts differ significantly, but they lack exact testing strategies for doing so. This work introduces a novel theoretical distribution, the Difference of Two Negative Binomial Distributions (DOTNB), and implements it within DEGage, an R package for DE analysis. Benchmarking DEGage against DESeq2, DESingle, edgeR, Monocle3, and scDD showed that DEGage offered greater sensitivity and greater robustness against scRNA-seq-specific technical effects. In subsequent validation studies, DEGage identified regulators of long-term memory consolidation in engram neurons and canonical prostate cancer markers in a large-scale dataset of heterogeneous prostate cancer tissue. Given their success in these validation studies, DOTNB and DEGage can be applied to new scRNA-seq projects and to other forms of negative binomially distributed count data.
Recommended Citation
Petrany, Alicia, "Modeling Differential Expression In scRNA-seq Data With A Difference Of Two Negative Binomial Distributions" (2025). Theses and Dissertations. 3346.
https://rdw.rowan.edu/etd/3346