
Deep Ensemble Machine Learning Framework for the Estimation of PM2.5 Concentrations

Environmental health perspectives, 2022-03, Vol.130 (3), p.37004-37004 [Peer Reviewed Journal]

ISSN: 0091-6765 ;EISSN: 1552-9924 ;DOI: 10.1289/EHP9752 ;PMID: 35254864


  • Title:
    Deep Ensemble Machine Learning Framework for the Estimation of PM2.5 Concentrations
  • Author: Yu, Wenhua ; Li, Shanshan ; Ye, Tingting ; Xu, Rongbin ; Song, Jiangning ; Guo, Yuming
  • Is Part Of: Environmental health perspectives, 2022-03, Vol.130 (3), p.37004-37004
  • Description:
    BACKGROUND: Accurate estimation of historical PM2.5 (particulate matter with an aerodynamic diameter of less than 2.5 μm) is essential for environmental health risk assessment.
    OBJECTIVES: The aim of this study was to develop a multiple-level stacked ensemble machine learning framework to improve the estimation of daily ground-level PM2.5 concentrations.
    METHODS: A deep ensemble machine learning framework (DEML) was developed to estimate daily PM2.5 concentrations. The framework has a three-stage structure. At the first stage, four base models [gradient boosting machine (GBM), support vector machine (SVM), random forest (RF), and eXtreme gradient boosting (XGBoost)] were used to generate a new data set of PM2.5 concentrations for training the next-stage learners. At the second stage, three meta-models [RF, XGBoost, and generalized linear model (GLM)] were used to estimate PM2.5 concentrations from a combination of the original data set and the predictions of the first-stage models. At the third stage, a nonnegative least squares (NNLS) algorithm was used to obtain the optimal weights for combining the second-stage estimates. As a case study, DEML was applied to data from 133 monitoring stations in Italy to predict daily PM2.5 at each 1 km × 1 km grid cell across Italy from 2015 to 2019. Model performance was evaluated by 10-fold cross-validation (CV) and compared with five benchmark algorithms [GBM, SVM, RF, XGBoost, and Super Learner (SL)].
    RESULTS: The PM2.5 prediction performance of DEML [coefficient of determination (R²) = 0.87 and root mean square error (RMSE) = 5.38 μg/m³] was superior to that of every benchmark model (R² of 0.51, 0.76, 0.83, 0.70, and 0.83 for GBM, SVM, RF, XGBoost, and SL, respectively). DEML displayed reliable performance in capturing the spatiotemporal variations of PM2.5 in Italy.
    DISCUSSION: The proposed DEML framework achieved a higher accuracy in PM2.5 estimation than the benchmark models and could be used as a tool for more accurate environmental exposure assessment. https://doi.org/10.1289/EHP9752.
  • Publisher: Environmental Health Perspectives
  • Language: English
  • Identifier: ISSN: 0091-6765
    EISSN: 1552-9924
    DOI: 10.1289/EHP9752
    PMID: 35254864
  • Source: PubMed Central (Open access)
    US Government Documents
    DOAJ Directory of Open Access Journals
    AUTh Library subscriptions: ProQuest Central
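
The third stage described in the abstract, combining the meta-model estimates with nonnegative least squares weights, can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the three prediction columns are hypothetical stand-ins for out-of-fold RF, XGBoost, and GLM estimates, and the helper name `r2_rmse` is invented for illustration. The scores are computed in-sample for brevity, whereas the study reports 10-fold cross-validated values.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

# Synthetic stand-in for observed daily PM2.5 in ug/m3 (NOT the study's data).
n = 500
y = rng.gamma(shape=4.0, scale=4.0, size=n)

# Hypothetical second-stage predictions: one column per meta-model,
# simulated here as the truth plus noise of different magnitude.
noise_sd = [3.0, 4.0, 6.0]
Z = np.column_stack([y + rng.normal(0.0, sd, size=n) for sd in noise_sd])

# Stage 3: find w >= 0 minimising ||Z w - y||_2.
weights, _residual_norm = nnls(Z, y)
ensemble = Z @ weights


def r2_rmse(observed, predicted):
    """Return (R^2, RMSE) for a vector of predictions."""
    resid = observed - predicted
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    ss_tot = float(np.sum((observed - observed.mean()) ** 2))
    r2 = 1.0 - float(np.sum(resid ** 2)) / ss_tot
    return r2, rmse


print("weights:", np.round(weights, 3))
print("ensemble R2, RMSE:", r2_rmse(y, ensemble))
for j in range(Z.shape[1]):
    print(f"model {j} R2, RMSE:", r2_rmse(y, Z[:, j]))
```

Because using any single model alone corresponds to a feasible weight vector, the fitted combination cannot have a larger in-sample squared error than the best individual column. Held-out performance has to be checked separately, for example with cross-validation as in the study.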
