Using deep learning to evaluate peaks in chromatographic data

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Using deep learning to evaluate peaks in chromatographic data. / Risum, Anne Bech; Bro, Rasmus.

In: Talanta, Vol. 204, 2019, p. 255-260.

Harvard

Risum, AB & Bro, R 2019, 'Using deep learning to evaluate peaks in chromatographic data', Talanta, vol. 204, pp. 255-260. https://doi.org/10.1016/j.talanta.2019.05.053

APA

Risum, A. B., & Bro, R. (2019). Using deep learning to evaluate peaks in chromatographic data. Talanta, 204, 255-260. https://doi.org/10.1016/j.talanta.2019.05.053

Vancouver

Risum AB, Bro R. Using deep learning to evaluate peaks in chromatographic data. Talanta. 2019;204:255-260. https://doi.org/10.1016/j.talanta.2019.05.053

Author

Risum, Anne Bech ; Bro, Rasmus. / Using deep learning to evaluate peaks in chromatographic data. In: Talanta. 2019 ; Vol. 204. pp. 255-260.

Bibtex

@article{48a282c9cbb4434aad90b6dc01ee037a,
title = "Using deep learning to evaluate peaks in chromatographic data",
abstract = "Analysis of untargeted gas-chromatographic data is time consuming. With the earlier introduction of the PARAFAC2 (PARAllel FACtor analysis 2) based PARADISe (PARAFAC2 based Deconvolution and Identification System) approach in 2017, this task was made considerably more time-efficient. However, there are still a number of manual steps in the analysis which require data analytical expertise. One of these is the need to define whether or not each PARAFAC2 resolved component represents a peak suitable for integration. As the peaks may change in both shape and location on the elution time-axis, this presents a problem which cannot be readily solved by applying a linear classifier, such as PLS-DA (Partial Least Squares regression for Discriminant Analysis). As part of our ongoing efforts to further automate analysis of Gas Chromatography with Mass Spectrometry (GC-MS), we therefore explore a convolutional neural network classifier, capable of handling these shifts and variations in shape. The theory of convolutional neural networks and application on vector samples is briefly explained, and the performance is tested against a PLS-DA classifier, a shallow artificial neural network and a locally weighted regression model. The models are built on a training set with PARAFAC2 resolved components from eight different aroma related GC-MS runs with a total of over 70,000 elution profile samples, and validated using another, independent, GC-MS dataset. Based on Receiver Operating Characteristic curves (ROC) and manual analysis of the misclassified cases, it is shown that the convolutional network consistently outperforms the competing models, yielding an Area Under the Curve (AUC) value of 0.95 for peak classification. Examples are given illustrating that this new approach provides convincing means to automatically assess and evaluate modelled elution profiles of chromatographic data and thereby remove this laborious manual step.",
author = "Risum, {Anne Bech} and Bro, Rasmus",
year = "2019",
doi = "10.1016/j.talanta.2019.05.053",
language = "English",
volume = "204",
pages = "255--260",
journal = "Talanta",
issn = "0039-9140",
publisher = "Elsevier"
}

RIS

TY - JOUR

T1 - Using deep learning to evaluate peaks in chromatographic data

AU - Risum, Anne Bech

AU - Bro, Rasmus

PY - 2019

Y1 - 2019

AB - Analysis of untargeted gas-chromatographic data is time consuming. With the earlier introduction of the PARAFAC2 (PARAllel FACtor analysis 2) based PARADISe (PARAFAC2 based Deconvolution and Identification System) approach in 2017, this task was made considerably more time-efficient. However, there are still a number of manual steps in the analysis which require data analytical expertise. One of these is the need to define whether or not each PARAFAC2 resolved component represents a peak suitable for integration. As the peaks may change in both shape and location on the elution time-axis, this presents a problem which cannot be readily solved by applying a linear classifier, such as PLS-DA (Partial Least Squares regression for Discriminant Analysis). As part of our ongoing efforts to further automate analysis of Gas Chromatography with Mass Spectrometry (GC-MS), we therefore explore a convolutional neural network classifier, capable of handling these shifts and variations in shape. The theory of convolutional neural networks and application on vector samples is briefly explained, and the performance is tested against a PLS-DA classifier, a shallow artificial neural network and a locally weighted regression model. The models are built on a training set with PARAFAC2 resolved components from eight different aroma related GC-MS runs with a total of over 70,000 elution profile samples, and validated using another, independent, GC-MS dataset. Based on Receiver Operating Characteristic curves (ROC) and manual analysis of the misclassified cases, it is shown that the convolutional network consistently outperforms the competing models, yielding an Area Under the Curve (AUC) value of 0.95 for peak classification. Examples are given illustrating that this new approach provides convincing means to automatically assess and evaluate modelled elution profiles of chromatographic data and thereby remove this laborious manual step.

U2 - 10.1016/j.talanta.2019.05.053

DO - 10.1016/j.talanta.2019.05.053

M3 - Journal article

C2 - 31357290

VL - 204

SP - 255

EP - 260

JO - Talanta

JF - Talanta

SN - 0039-9140

ER -