Omics-Based Computational Methods for Clinical Discovery
Abstract
This thesis presents two independent computational projects that apply distinct analytical
frameworks to high-dimensional biological data, each addressing a clinically significant
question through a different molecular lens.
The first project applies attention-based deep learning to baseline metabolomic data from
the METADAP cohort, a longitudinal clinical study of patients with Major Depressive
Disorder initiating antidepressant therapy, with the objective of predicting the onset of
Metabolic Syndrome at six months. TabNet, a deep learning architecture specifically
designed for tabular data, achieved an area under the receiver operating characteristic curve of 0.84,
substantially outperforming all classical comparator models. Its built-in sequential
attention mechanism identified a panel of ten biologically plausible metabolites spanning
amino acid, lipid and glucose metabolism pathways as the primary predictive features,
and the quality of this feature selection was validated by showing that the identified
metabolites improved the performance of independent classifiers. These findings
demonstrate the potential of interpretable deep learning for early cardiometabolic risk
stratification in high-risk psychiatric populations.
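As a reading aid for the reported metric: the area under the receiver operating characteristic curve can be interpreted as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it that way by pairwise comparison. The function name, labels and scores are hypothetical toy values chosen for illustration; they are not METADAP data and this is not the thesis's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = developed the outcome, 0 = did not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small clinical cohort; a rank-based implementation gives the same value more efficiently.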
The second project applies weighted gene co-expression network analysis to RNA
sequencing data from two Acute Myeloid Leukemia cell lines, one carrying wild-type
NPM1 and one carrying the type A NPM1 mutation, under untreated and
EAPB04303-treated conditions. A signed co-expression network of 31 modules was
constructed and interrogated through four research questions. The analysis established
that NPM1 mutation status alone produces a coherent set of baseline transcriptional
differences between the two cell lines, that these differences shape their divergent
responses to treatment, and that the drug selectively disrupts protein synthesis and quality
control programs in NPM1-mutated cells. Most significantly, an unbiased genome-wide
analysis independently recovered the established mechanism of action of EAPB04303 as
a compound that blocks the cell division machinery, providing computational validation
of experimental findings. These results demonstrate that co-expression network analysis
can simultaneously characterize mutation-driven transcriptional phenotypes, map
treatment-induced rewiring and recover drug mechanisms of action from transcriptomic
data alone.
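To make the term "signed co-expression network" concrete: in the usual signed formulation, the adjacency between two genes is ((1 + r) / 2) raised to a soft-thresholding power β, where r is the Pearson correlation of their expression profiles, so that strongly anti-correlated genes receive an adjacency near zero rather than near one. The sketch below illustrates this on three made-up genes. The gene names, expression values and β = 12 are assumptions for illustration only; the abstract does not state the power used, and this is not the thesis's pipeline.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def signed_adjacency(expr, beta=12):
    """Signed adjacency ((1 + r) / 2) ** beta for every ordered gene pair.

    expr maps a gene name to its expression values across samples.
    beta is the soft-thresholding power (assumed value, not from the thesis).
    """
    genes = list(expr)
    return {
        (g, h): ((1.0 + pearson(expr[g], expr[h])) / 2.0) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical expression profiles across four samples.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with geneA
}
toy_adj = signed_adjacency(toy_expr)
```

In this toy case the adjacency of geneA and geneB is approximately 1 and that of geneA and geneC is approximately 0, which is the property that lets a signed network separate co-activated genes from oppositely regulated ones before modules are detected.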
Together, the two projects illustrate that carefully matched computational approaches can
extract structured, interpretable and biologically meaningful signal from high-
dimensional omic data and contribute a methodological framework applicable to future
studies in precision medicine and targeted cancer therapy.