RPCA-based techniques for pattern extraction, hotspot identification and signal correction using data from a dense network of low-cost NO2 sensors in London

Research output: Contribution to journal, Journal article, Research, peer-review

Documents

  • Fulltext

    Final published version, 10.9 MB, PDF document

High-density low-cost air quality sensor networks are a promising technology to monitor air quality at high temporal and spatial resolution. However, the collected data are high-dimensional and it is not always clear how best to leverage this information, particularly given the lower data quality coming from the sensors. Here we report on the use of robust Principal Component Analysis (RPCA) applied to nitrogen dioxide (NO2) data obtained from a recently deployed dense network of 225 air pollution monitoring nodes based on low-cost sensors in the Borough of Camden in London. RPCA addresses the brittleness of singular value decomposition to outliers by decomposing the data into low-rank and sparse contributions, with the latter containing the outliers. The modal decomposition enabled by RPCA identifies major periodic patterns including spatial and temporal bias, dominant spatial variance, and north-south bias. The five most descriptive components capture 98 % of the data's variance, achieving a compression by a factor of 1500. We present a new technique that uses the sparse part of the data to identify hotspots. The data indicate that at the locations of the top 15 % most susceptible nodes in the network, the model identifies 23 % more hotspots than in all other locations combined. Moreover, the median hotspot event at these at-risk locations exceeds the mean NO2 concentration by 33 μg/m³. We show the potential of RPCA for signal correction; it corrects random errors, yielding a reference signal with R² > 0.8. Moreover, RPCA successfully reconstructs missing data from a sensor with R² = 0.72 from the rest of the sensor network, an improvement over PCA of around 50 %, allowing air quality estimations even if a sensor is temporarily out of use.
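The low-rank plus sparse decomposition described in the abstract can be illustrated with a minimal sketch. It assumes the common principal component pursuit formulation of RPCA, solved with an augmented Lagrangian iteration; the abstract does not state which solver the authors used. The function names (`shrink`, `svd_threshold`, `rpca`), the synthetic nodes-by-time matrix, and the hotspot threshold of 5.0 are all hypothetical choices made for illustration, not values or methods taken from the paper.

```python
import numpy as np

def shrink(X, tau):
    """Element-wise soft-thresholding: pull every entry towards zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Soft-threshold the singular values of X (promotes low rank)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca(M, lam=None, mu=None, tol=1e-6, max_iter=2000):
    """Split M into low-rank L and sparse S such that M is close to L + S.

    Assumed formulation: minimise ||L||_* + lam * ||S||_1 subject to L + S = M.
    """
    M = np.asarray(M, dtype=float)
    n1, n2 = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))          # commonly used default weight
    if mu is None:
        mu = n1 * n2 / (4.0 * np.abs(M).sum())    # commonly used penalty heuristic
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                          # Lagrange multipliers
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S

# Synthetic stand-in for a sensor matrix (rows: nodes, columns: time steps).
rng = np.random.default_rng(0)
n_nodes, n_times, rank = 60, 200, 2
L0 = rng.normal(size=(n_nodes, rank)) @ rng.normal(size=(rank, n_times))

# Inject sparse positive spikes into about 3 % of the entries.
spike_mask = rng.random((n_nodes, n_times)) < 0.03
M = L0 + 10.0 * spike_mask

L, S = rpca(M)

# Hypothetical hotspot rule: flag entries whose sparse part exceeds a threshold,
# then count flagged events per node.
hotspot_threshold = 5.0
flagged = S > hotspot_threshold
hotspots_per_node = flagged.sum(axis=1)
```

In this sketch the rows of `L` play the role of the smooth, shared network signal, while `S` isolates short-lived excursions at individual nodes. Ranking nodes by `hotspots_per_node` is one simple way to compare locations; the paper's own hotspot criterion may differ.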

Original language: English
Article number: 171522
Journal: Science of the Total Environment
Volume: 925
Number of pages: 11
ISSN: 0048-9697
Publication status: Published - 2024

Bibliographical note

Funding Information:
The authors would like to thank Airscape for providing the data. MvR acknowledges support from the Natural Environment Research Council (NERC) air quality Future Urban Ventilation Network (NE/V002082/1). MvR would like to thank Prof. Ben Barratt for valuable feedback on an early version of this paper.

Publisher Copyright:
© 2024 The Authors

    Research areas

  • Air Quality Monitoring, Hotspot identification and signal correction, Low-Cost Sensor Networks, Robust Principal Component Analysis (RPCA), Spatial and Temporal Patterns in Air Pollution
