Abstract:
To optimize and evaluate the forecast accuracy for four pollutants (NO2, O3, PM2.5 and PM10) in Shanghai, a new spatial-temporal coupled ensemble forecasting method (ET-BPNN, short for ensemble tree-back propagation neural network) was established based on four machine learning algorithms. First, the forecasts are optimized using random forest, extremely randomized trees and gradient boosting decision tree, with input features selected from multi-scale forecast data from four numerical air quality forecasting models (CMAQ, CAMx, NAQPMS and WRF-Chem), meteorological data from a mesoscale weather model (WRF, including 2 m temperature, 2 m humidity, 10 m wind speed, 10 m wind direction, atmospheric pressure and hourly accumulated precipitation) and observations. The best of these three algorithms is chosen by comparing the root mean square error (RMSE). Second, the selected result is further optimized with a BP neural network. The results show that: (1) Compared with the traditional ensemble mean algorithm, the RMSE between the ET-BPNN simulation and the observed hourly concentrations of NO2, O3, PM2.5 and PM10 is reduced by 51.9%, 60.1%, 63.0% and 60%, respectively. (2) ET-BPNN improves significantly on the three machine learning algorithms, with the RMSE of NO2, O3, PM2.5 and PM10 reduced by 42.7%, 20.1%, 19.7% and 9.7%, respectively. (3) ET-BPNN effectively optimizes PM2.5 forecasts in autumn and winter, when concentrations are higher, and reduces the forecast bias at different stations. (4) ET-BPNN also improves forecasts of O3_8h and PM2.5 pollution episodes, simulating peak values more accurately than the traditional ensemble average algorithm.
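
The RMSE-based selection step and the percentage reductions quoted above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`rmse`, `select_best`, `rmse_reduction_pct`) and all concentration values are hypothetical, chosen only to show how a candidate with the lowest RMSE might be picked and how a relative RMSE reduction against an ensemble-mean baseline could be computed.

```python
import math


def rmse(forecast, observed):
    """Root mean square error between two equal-length sequences."""
    if len(forecast) != len(observed) or len(observed) == 0:
        raise ValueError("forecast and observed must be non-empty and equal in length")
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def select_best(candidates, observed):
    """Return (name, rmse) of the candidate forecast with the lowest RMSE."""
    scores = {name: rmse(values, observed) for name, values in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


def rmse_reduction_pct(baseline_rmse, new_rmse):
    """Relative RMSE reduction of a new forecast against a baseline, in percent."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Hypothetical hourly concentrations; these are NOT data from the study.
observed = [40.0, 55.0, 70.0, 65.0]
candidates = {
    "random_forest": [42.0, 50.0, 66.0, 69.0],
    "extra_trees": [41.0, 54.0, 68.0, 66.0],
    "gbdt": [45.0, 60.0, 75.0, 60.0],
}
ensemble_mean = [50.0, 45.0, 60.0, 75.0]  # stand-in for the traditional baseline

best_name, best_rmse = select_best(candidates, observed)
baseline_rmse = rmse(ensemble_mean, observed)
reduction = rmse_reduction_pct(baseline_rmse, best_rmse)
print(f"best={best_name} rmse={best_rmse:.2f} reduction={reduction:.1f}%")
```

In the method described, the selected tree-based output would then be passed to a BP neural network for the second optimization stage; that stage is omitted here because the abstract gives no details of the network configuration.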