Machine learning model for air pollution prediction in Skopje, North Macedonia
Date Issued
2020-07
Author(s)
Andonovic, Viktor
Todorov, Zdravko
Abstract
Poor air quality, especially a high concentration of particulate matter, which has a significant
negative effect on human health and the environment, is a global problem in urban areas. Early
air pollution prediction is therefore urgently needed in Skopje, North Macedonia, where the
concentration of particulate matter rises sharply, especially during the winter months.
The objective of this paper is to develop a machine learning model for predicting air pollution
in Skopje. The method consists of processing data collected from different measurement
locations in Skopje, generating numerous weather and pollution features, and choosing the
optimal hyperparameters for the model. The pollutant data were provided by the measurement
stations located near the Faculty of Electrical Engineering and Information Technologies
building. The measurements are gathered by three sensor nodes that collect data for the
following parameters: particulate matter with a diameter of 10 micrometres or less (PM10),
particulate matter with a diameter of 2.5 micrometres or less (PM2.5), CO, and NO2. The nodes
send these data to a server for online monitoring or offline analysis.
The pollution data were combined with weather information on temperature, humidity, wind
speed, and wind direction to train the prediction model. The results show that the weather
information is correlated with air pollution, which makes it possible to train a model that
predicts air pollution from the weather data and the historical pollution data.
The experimental evaluation showed that the best-performing model, XGBoost, achieves a Mean
Absolute Error for PM10 of 6.8, 9.7, and 12.4 for nodes 3, 2, and 1, respectively, and for
PM2.5 of 6.36, 8.81, and 8 for nodes 3, 2, and 1, respectively.
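
As a rough illustration of two ideas the abstract mentions, the sketch below shows how historical pollution readings and weather data might be combined into feature rows, and how a Mean Absolute Error is computed. It is not the authors' implementation: the function names, the number of lags, and all sample values are hypothetical, and the paper's actual feature set and XGBoost configuration are not reproduced here.

```python
def mean_absolute_error(actual, predicted):
    """Return the average of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def make_lagged_rows(pm, weather, n_lags=2):
    """Pair each target pm[t] with the previous n_lags readings and the weather at t.

    pm      : list of particulate matter readings, one per time step
    weather : list of (temperature, humidity, wind_speed, wind_direction) tuples
    n_lags  : hypothetical choice; the paper does not state this value here
    """
    if len(pm) != len(weather):
        raise ValueError("pm and weather must cover the same time steps")
    rows = []
    for t in range(n_lags, len(pm)):
        # Historical pollution (lags) followed by the current weather readings.
        features = list(pm[t - n_lags:t]) + list(weather[t])
        rows.append((features, pm[t]))
    return rows


# Made-up values for illustration only, not measurements from the paper.
pm10 = [40.0, 42.0, 45.0, 50.0]
weather = [(2.0, 80.0, 1.5, 90.0)] * 4
rows = make_lagged_rows(pm10, weather, n_lags=2)
error = mean_absolute_error([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

Each row pairs a feature list with the value to be predicted, which is the shape a regression model such as XGBoost would typically be trained on. The error is reported in the same units as the measurements.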
