popsicleR: A R Package for Pre-processing and Quality Control Analysis of Single Cell RNA-seq Data

Research output: Contribution to journal › Journal article › Research › peer-review



  • Francesco Grandi
  • Jimmy Caroli
  • Oriana Romano
  • Matteo Marchionni
  • Mattia Forcato
  • Silvio Bicciato

The advent of single-cell sequencing is providing unprecedented opportunities to disentangle tissue complexity and investigate cell identities and functions. However, the analysis of single-cell data is a challenging, multi-step process that requires both advanced computational skills and biological insight. In single-cell RNA-seq (scRNA-seq) data, technical artifacts, noise, and biological biases make it necessary to first identify and, where appropriate, remove unreliable signals from low-quality cells and unwanted sources of variation that might affect the efficacy of downstream analysis steps. Pre-processing and quality control (QC) of scRNA-seq data is a laborious process that consists of manually combining different computational strategies to quantify QC metrics and define optimal sets of pre-processing parameters. Here we present popsicleR, an R package that interactively guides both skilled and unskilled command-line users through the pre-processing and QC analysis of scRNA-seq data. The package integrates, into several main wrapper functions, methods derived from widely used pipelines for the estimation of QC metrics, filtering of low-quality cells, data normalization, removal of technical and biological biases, and cell clustering and annotation. popsicleR starts from either the output files of the Cell Ranger pipeline from 10X Genomics or a feature-barcode matrix of raw counts generated by any scRNA-seq technology. Open-source code, installation instructions, and a case study tutorial are freely available at https://github.com/bicciatolab/popsicleR.
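To make the QC step described above concrete, the following is a minimal, language-agnostic sketch written in plain Python. It is not popsicleR code and does not use the package's API. It assumes a toy feature-barcode matrix (genes as rows, cells as columns), the common convention that mitochondrial gene names start with "MT-", and hypothetical thresholds (`min_genes`, `max_pct_mito`) chosen only for illustration. It computes three widely used per-cell metrics (total counts, detected genes, percentage of mitochondrial counts) and then filters low-quality cells.

```python
# Illustrative sketch only: NOT popsicleR code or its API.
# Toy feature-barcode matrix of raw counts: rows are genes, columns are cells.
genes = ["MT-CO1", "MT-ND1", "ACTB", "GAPDH", "CD3E"]
cells = ["cell_A", "cell_B", "cell_C"]
counts = [
    [2, 40, 0],   # MT-CO1
    [1, 30, 0],   # MT-ND1
    [50, 10, 1],  # ACTB
    [30, 15, 0],  # GAPDH
    [17, 5, 0],   # CD3E
]


def qc_metrics(counts, genes, cells, mito_prefix="MT-"):
    """Compute per-cell total counts, detected genes, and percent mitochondrial counts."""
    metrics = {}
    for j, cell in enumerate(cells):
        column = [row[j] for row in counts]
        total = sum(column)
        detected = sum(1 for c in column if c > 0)
        mito = sum(c for g, c in zip(genes, column) if g.startswith(mito_prefix))
        # Guard against empty barcodes to avoid division by zero.
        pct_mito = 100.0 * mito / total if total > 0 else 0.0
        metrics[cell] = {"n_counts": total, "n_genes": detected, "pct_mito": pct_mito}
    return metrics


def filter_cells(metrics, min_genes=3, max_pct_mito=20.0):
    """Keep cells with enough detected genes and a low mitochondrial fraction.

    The default thresholds are hypothetical; real values depend on the dataset.
    """
    return [
        cell
        for cell, m in metrics.items()
        if m["n_genes"] >= min_genes and m["pct_mito"] <= max_pct_mito
    ]


metrics = qc_metrics(counts, genes, cells)
kept = filter_cells(metrics)
for cell in cells:
    print(cell, metrics[cell])
print("cells kept:", kept)
```

In this toy input, one cell has a high mitochondrial fraction and another has almost no detected genes, the two typical signatures of low-quality cells that a QC step aims to flag. In practice, thresholds are usually chosen after inspecting the distributions of these metrics across all cells rather than being fixed in advance.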

Original language: English
Article number: 167560
Journal: Journal of Molecular Biology
Issue number: 11
Number of pages: 11
Publication status: Published - 2022

Bibliographical note

Funding Information:
This work was supported by funds from Fondazione AIRC under the 5 per Mille 2019 program (ID. 22759) to S.B., and from the PRIN 2017 Project 2017HWTP2K of the Italian Ministry of Education, University and Research, the FAR 2019 (E54I19002000001), and GR-2016-02362451 of the Italian Ministry of Health to M.F. F.G. is a recipient of a Doctoral Fellowship Progetti di formazione alla ricerca (Bando 2018) from Regione Emilia Romagna. O.R. has been supported by Fondazione Umberto Veronesi (Post-Doctoral Fellowship 2020). We thank Martina Dori and Andrea Grilli for their support in coding the graphical routines of popsicleR.

    Research areas

  • bioinformatics, data analysis, R language, single cell RNA-sequencing, software tools

ID: 306590962