Identifying interactions in omics data for clinical biomarker discovery using symbolic regression
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
Identifying interactions in omics data for clinical biomarker discovery using symbolic regression. / Christensen, Niels Johan; Demharter, Samuel; Machado, Meera; Pedersen, Lykke; Salvatore, Marco; Stentoft-Hansen, Valdemar; Iglesias, Miquel Triana.
In: Bioinformatics, Vol. 38, No. 15, 405, 2022, p. 3749-3758.
RIS
TY - JOUR
T1 - Identifying interactions in omics data for clinical biomarker discovery using symbolic regression
AU - Christensen, Niels Johan
AU - Demharter, Samuel
AU - Machado, Meera
AU - Pedersen, Lykke
AU - Salvatore, Marco
AU - Stentoft-Hansen, Valdemar
AU - Iglesias, Miquel Triana
PY - 2022
Y1 - 2022
N2 - Motivation: The identification of predictive biomarker signatures from omics and multi-omics data for clinical applications is an active area of research. Recent developments in assay technologies and machine learning (ML) methods have led to significant improvements in predictive performance. However, most high-performing ML methods suffer from complex architectures and lack interpretability. Results: We present the application of a novel symbolic-regression-based algorithm, the QLattice, on a selection of clinical omics datasets. This approach generates parsimonious high-performing models that can both predict disease outcomes and reveal putative disease mechanisms, demonstrating the importance of selecting maximally relevant and minimally redundant features in omics-based machine-learning applications. The simplicity and high-predictive power of these biomarker signatures make them attractive tools for high-stakes applications in areas such as primary care, clinical decision-making and patient stratification.
AB - Motivation: The identification of predictive biomarker signatures from omics and multi-omics data for clinical applications is an active area of research. Recent developments in assay technologies and machine learning (ML) methods have led to significant improvements in predictive performance. However, most high-performing ML methods suffer from complex architectures and lack interpretability. Results: We present the application of a novel symbolic-regression-based algorithm, the QLattice, on a selection of clinical omics datasets. This approach generates parsimonious high-performing models that can both predict disease outcomes and reveal putative disease mechanisms, demonstrating the importance of selecting maximally relevant and minimally redundant features in omics-based machine-learning applications. The simplicity and high-predictive power of these biomarker signatures make them attractive tools for high-stakes applications in areas such as primary care, clinical decision-making and patient stratification.
KW - DIFFERENTIAL EXPRESSION
KW - INFERENCE
KW - MODEL
U2 - 10.1093/bioinformatics/btac405
DO - 10.1093/bioinformatics/btac405
M3 - Journal article
C2 - 35731214
VL - 38
SP - 3749
EP - 3758
JO - Bioinformatics (Online)
JF - Bioinformatics (Online)
SN - 1367-4811
IS - 15
M1 - 405
ER -
ID: 314353858