Omics-Based Computational Methods for Clinical Discovery

Abstract

This thesis presents two independent computational projects that apply distinct analytical frameworks to high-dimensional biological data, each addressing a clinically significant question through a different molecular lens. The first project applies attention-based deep learning to baseline metabolomic data from the METADAP cohort, a longitudinal clinical study of patients with Major Depressive Disorder initiating antidepressant therapy, with the objective of predicting the onset of Metabolic Syndrome at six months. TabNet, a deep learning architecture specifically designed for tabular data, achieved an area under the receiver operating characteristic curve (AUROC) of 0.84, substantially outperforming all classical comparator models. Its built-in sequential attention mechanism identified a panel of ten biologically plausible metabolites spanning amino acid, lipid and glucose metabolism pathways as the primary predictive features, and the quality of this feature selection was validated by showing that the identified metabolites improved the performance of independent classifiers. These findings demonstrate the potential of interpretable deep learning for early cardiometabolic risk stratification in high-risk psychiatric populations. The second project applies weighted gene co-expression network analysis to RNA sequencing data from two Acute Myeloid Leukemia cell lines, one carrying a wildtype NPM1 gene and one carrying the type A NPM1 mutation, under untreated and EAPB04303-treated conditions. A signed co-expression network of 31 modules was constructed and interrogated through four research questions. The analysis established that NPM1 mutation status alone produces a coherent set of baseline transcriptional differences between the two cell lines, that these differences shape their divergent responses to treatment, and that the drug selectively disrupts protein synthesis and quality control programs in NPM1-mutated cells.
Most significantly, an unbiased genome-wide analysis independently recovered the established mechanism of action of EAPB04303 as a compound that blocks the cell division machinery, providing computational validation of experimental findings. These results demonstrate that co-expression network analysis can simultaneously characterize mutation-driven transcriptional phenotypes, map treatment-induced rewiring and recover drug mechanisms of action from transcriptomic data alone. Together, the two projects illustrate that carefully matched computational approaches can extract structured, interpretable and biologically meaningful signal from high-dimensional omic data and contribute a methodological framework applicable to future studies in precision medicine and targeted cancer therapy.
