Development of statistical models for optimizing the performance of the electrostatic filter in a waste to energy plant
Master's thesis
Process data for an electrostatic precipitator (ESP) from Uddevella Energi AB is measured with one-hour resolution. There are 23 predictors: ash concentrations, steam production, voltages, currents, flue gas properties (volumetric flow rate, temperature, pressure, oxygen, and water content), and exhaust gas compositions (HCl, CO, NOX, CO2, and SO2). The data is preprocessed by removing outliers using the standard deviation method and the Mahalanobis distance, resulting in 3 different scenarios (s1, s2, and s3). To guard against overfitting, the data in each scenario is split into training and test sets in 7 cases with different training/test percentages: 50-50, 55-45, 60-40, 65-35, 70-30, 75-25, and 80-20. The main predictive models are linear regression and support vector machines (SVM). Each is also combined with principal component analysis (PCA) and partial least squares (PLS) for dimensionality reduction. Thus, there are 6 models in total: linear regression, principal component regression (PCR), partial least squares regression (PLSR), SVM, SVM with PCA, and SVM with PLS. The investigation shows that scenario 2, in which outliers are removed by the standard deviation method, gives the best performance in most cases. In terms of prediction trend, the linear regression, PCR, and PLSR models predict poorly at very low and very high efficiencies. With all 23 predictors, SVM with PLS gives the best prediction trend among the 6 models, and case 65-35 provides the best performance, with an RMSE of 0.0035, R2 of 0.86, MARE of 0.26%, and MaxARE of 1.45%. Feature selection is performed to improve the models. The best combination of predictors to remove is CO2, SO2, H2O, HCl, CO, O2wet, Pin, and NOX, leaving 15 predictors for the models. With this reduced set of 15 predictors, the unusual trend that SVM and SVM with PCA show when all predictors are used is reduced or even disappears, and all models improve.
With 15 predictors, the SVM with PCA model gives the best performance in all splitting cases, and case 50-50 provides the best indicator values: the lowest RMSE (0.0029), the highest R2 (0.9161), the lowest MARE (0.19%), and a MaxARE of 1.94%. Thus, the SVM with PCA model with 15 predictors, using scenario 2 and case 50-50, is recommended for ESP efficiency prediction.
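The four performance indicators quoted above can be illustrated with a minimal sketch. The abstract does not expand the acronyms, so the code assumes that MARE stands for the mean absolute relative error and MaxARE for the maximum absolute relative error, both expressed in percent of the measured value; RMSE and R2 follow their usual definitions. The function names and the sample efficiency values are hypothetical and are not taken from the thesis.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def absolute_relative_errors(y_true, y_pred):
    """Per-sample |measured - predicted| / |measured| (measured must be nonzero)."""
    return [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]


def mare_percent(y_true, y_pred):
    """ASSUMED meaning of MARE: mean absolute relative error, in percent."""
    errors = absolute_relative_errors(y_true, y_pred)
    return 100.0 * sum(errors) / len(errors)


def max_are_percent(y_true, y_pred):
    """ASSUMED meaning of MaxARE: maximum absolute relative error, in percent."""
    return 100.0 * max(absolute_relative_errors(y_true, y_pred))


if __name__ == "__main__":
    # Hypothetical ESP efficiencies (as fractions), for illustration only.
    measured = [0.990, 0.980, 0.970, 0.960]
    predicted = [0.988, 0.983, 0.969, 0.962]
    print("RMSE  :", rmse(measured, predicted))
    print("R2    :", r2(measured, predicted))
    print("MARE  :", mare_percent(measured, predicted), "%")
    print("MaxARE:", max_are_percent(measured, predicted), "%")
```

Under these assumed definitions, MARE summarizes the typical relative deviation over the test set, while MaxARE reports the single worst prediction. This is why a model can have the lowest MARE without having the lowest MaxARE, as in the 50-50 case above.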
Electrostatic precipitator (ESP), Linear Regression, Support Vector Machines (SVM), Principal Component Analysis (PCA), Partial Least Squares (PLS)