OnTheFly2.0: a text-mining web application for automated biomedical entity recognition, document annotation, network and functional enrichment analysis

Research output: Contribution to journal › Journal article › Research › peer-review

  • Fotis A Baltoumas
  • Sofia Zafeiropoulou
  • Evangelos Karatzas
  • Savvas Paragkamian
  • Foteini Thanati
  • Ioannis Iliopoulos
  • Aristides G Eliopoulos
  • Reinhard Schneider
  • Lars Juhl Jensen
  • Evangelos Pafilis
  • Georgios A Pavlopoulos

Extracting and processing information from documents is important because many experimental results and findings are stored in local files, so automated extraction and analysis of biomedical terms from such files is needed. In this article, we present OnTheFly2.0, a web application for extracting biomedical entities from individual files such as plain texts, office documents, PDF files or images. OnTheFly2.0 can generate informative summaries in popup windows containing knowledge related to the identified terms along with links to various databases. It uses the EXTRACT tagging service to perform named entity recognition (NER) for genes/proteins, chemical compounds, organisms, tissues, environments, diseases, phenotypes and Gene Ontology terms. Multiple files can be analyzed, and identified terms such as proteins or genes can be explored through functional enrichment analysis or associated with diseases and PubMed entries. Finally, protein-protein and protein-chemical networks can be generated using the STRING and STITCH services. To demonstrate its capacity for knowledge discovery, we interrogated published meta-analyses of clinical biomarkers of severe COVID-19 and uncovered inflammatory and senescence pathways that impact disease pathogenesis. OnTheFly2.0 currently supports 197 species and is available at http://bib.fleming.gr:3838/OnTheFly/ and http://onthefly.pavlopouloslab.info.

Original language: English
Article number: lqab090
Journal: NAR Genomics and Bioinformatics
Volume: 3
Issue number: 4
Number of pages: 10
Publication status: Published - 2021

Bibliographical note

© The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.

ID: 282189822