A machine learning approach for predicting bacteria content in drinking water

dc.contributor.author: Eric, Jonsson
dc.contributor.department: Chalmers tekniska högskola / Institutionen för matematiska vetenskaper (sv)
dc.contributor.examiner: Axelson-Fisk, Marina
dc.contributor.supervisor: Dannélls, Dana
dc.contributor.supervisor: Cahn, Jacob
dc.date.accessioned: 2023-06-12T09:17:01Z
dc.date.available: 2023-06-12T09:17:01Z
dc.date.issued: 2023
dc.date.submitted: 2023
dc.description.abstract: The current method for determining whether drinking water is bacterially contaminated is slow: it can take up to eight days to obtain results. During this time, a significant proportion of the population may have contracted diseases from contaminated water. To mitigate this, this thesis aimed to assess whether machine learning is a promising method for forecasting bacteria levels and how such a model could be designed. The project was performed in association with a case company, Nocoli, a spin-out of Chalmers Ventures that wanted the potential implementation examined. A literature review covering eight case studies of earlier machine learning applications in the field and three semi-structured interviews with industry-specific stakeholders were conducted. This methodology was chosen because both an overview of the current industry situation and an assessment of machine learning applicability were required. Moreover, using theory extracted from the literature on machine learning algorithms for different objectives, the case studies were evaluated to find patterns that could meet the case company's demands. It was found that machine learning is promising and desired in the industry to improve current operations. The Random Forest algorithm was recommended for the initial stage because of its trade-off between accuracy and interpretability. Data on bacterial content and other factors, including weather, were intended as the data source. The recommendation included a 3:1:1 split between training, validation, and test sets as well as the use of a recursive feature selection algorithm. Additionally, a combination of error measures was recommended, including Mean Squared Error with an out-of-bag supplement to reduce overfitting.
Furthermore, although no data could be obtained to evaluate the recommended model, it was concluded that machine learning could have a positive impact on today’s approach and contribute to improved water management and safety by enabling reliable forecasts.
dc.identifier.coursecode: MVEX03
dc.identifier.uri: http://hdl.handle.net/20.500.12380/306164
dc.language.iso: eng
dc.setspec.uppsok: PhysicsChemistryMaths
dc.subject: machine learning, forecasting, drinking water quality, contaminated water, drinking water treatment, Escherichia coli prediction, HPC method, Random Forest
dc.title: A machine learning approach for predicting bacteria content in drinking water
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Data science and AI (MPDSC), MSc
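Note on the abstract: the recommended evaluation setup (a 3:1:1 split, Mean Squared Error, and an out-of-bag check) can be illustrated with a minimal sketch. This is not code from the thesis, which reports that no data was available to evaluate the model. The function names `split_3_1_1`, `mean_squared_error`, and `bootstrap_with_oob` are hypothetical, the input rows are placeholders, and the random shuffle is an assumption (for time-ordered measurements a chronological split may be more appropriate).

```python
import random


def split_3_1_1(rows, seed=0):
    """Shuffle rows and split them 3:1:1 into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # assumption: samples may be shuffled
    n = len(shuffled)
    n_train = (3 * n) // 5
    n_val = n // 5
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_with_oob(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag) index lists.

    In a Random Forest, each tree is fitted on its in-bag sample, and the
    out-of-bag indices act as held-out data for that tree.
    """
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return in_bag, out_of_bag


if __name__ == "__main__":
    rows = list(range(10))  # placeholder sample identifiers, not real data
    train, val, test = split_3_1_1(rows)
    print(len(train), len(val), len(test))
    in_bag, oob = bootstrap_with_oob(len(train), random.Random(1))
    print(len(in_bag), len(oob))
```

The out-of-bag indices give each tree a set of samples it never saw during fitting, so an error such as MSE computed on them can supplement the validation-set error without consuming extra data.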
Download
Original bundle
Name: Master_Thesis_Eric_Jonsson_2023.pdf
Size: 1.05 MB
Format: Adobe Portable Document Format