Efficient Prediction of Water Quality Index (WQI) Using Machine Learning Algorithms
Authors: Md. Mehedi Hassan, Md. Mahedi Hassan, Laboni Akter, Md. Mushfiqur Rahman, Sadika Zaman, Khan Md. Hasib, Nusrat Jahan, Raisun Nasa Smrity, Jerin Farhana, M. Raihan, Swarnali Mollick
Abstract

Water quality has a direct influence on both human health and the environment. Water is used for a variety of purposes, including drinking, agriculture, and industry. The water quality index (WQI) is a critical indicator for proper water management. The purpose of this work was to use machine learning techniques, namely RF, NN, MLR, SVM, and BTM, to classify a water quality dataset collected at various locations across India. Water quality is determined by features such as dissolved oxygen (DO), total coliform (TC), biological oxygen demand (BOD), nitrate, pH, and electrical conductivity (EC). These features are handled in five steps: data pre-processing using min-max normalization, missing data management using RF, feature correlation, machine learning classification, and model feature importance. The highest Accuracy, Kappa, Accuracy Lower, and Accuracy Upper values obtained in this research are 99.83, 99.17, 99.07, and 99.99, respectively. The findings show that nitrate, pH, conductivity, DO, TC, and BOD are the key features that contribute to the orderly classification of water quality, with Variable Importance values of 74.78, 36.805, 81.494, 105.770, 105.166, and 130.173, respectively.
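Two of the quantities named above have simple closed forms, and a small sketch may make them concrete. The Python below is an illustration only, not the authors' implementation: the function names `min_max_normalize` and `cohen_kappa`, the sample pH readings, and the class labels are all hypothetical. It shows min-max normalization, which rescales a feature to the range [0, 1], and Cohen's kappa, which measures agreement between predicted and true classes after correcting for chance.

```python
from collections import Counter
from typing import Sequence


def min_max_normalize(values: Sequence[float]) -> list[float]:
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; mapping it to 0.0 is a convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def cohen_kappa(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(y_true)
    # Fraction of samples where prediction and truth agree (plain accuracy).
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal class frequencies.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Only one class present on both sides; kappa is undefined here,
        # and returning 1.0 is an assumption made for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical pH readings and class labels, for illustration only.
ph_scaled = min_max_normalize([6.5, 7.0, 8.5])
kappa = cohen_kappa(
    ["good", "good", "poor", "poor"],
    ["good", "good", "poor", "good"],
)
print(ph_scaled, kappa)
```

In this toy case the classifier is right on three of four samples (accuracy 0.75), but because half of that agreement would be expected by chance, kappa is lower than accuracy. This is why the abstract reports Kappa alongside Accuracy.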


Keywords: River water; water quality prediction; WQI; NN




Paper Information
Journal: Human-Centric Intelligent Systems
ISSN (Online): 2667-1336
Subject area: Computer Science
Publication date: 2021-01-12