Introduced by Devansh Jalota, Areeb Khan, Kunyi Ni, Anna Shang, Zhuxin Zhang
Biogas production in China is highly inefficient because production processes are poorly controlled. This team set out to solve that problem by building a data science pipeline that analyzes input biowaste data from two biogas facilities in China and informs plant operators and managers about their production potential and revenue expectations. Their easy-to-use graphical user interface not only shows users which waste inputs most influence output but also enables prospective biogas investors and plant managers to learn about the production potential of biogas in different parts of China. All plant operators have to do is enter the amount of each type of waste they will feed into the plant over a specified time period and observe their expected revenues, predicted to almost 90% accuracy!
The project was a long, arduous journey for the team, spanning data cleaning and preprocessing, the implementation of machine learning algorithms for both regression and classification, and finally the integration of these elements into a unified user interface. Their main difficulties concerned model accuracy, which they improved through novel feature engineering and hyperparameter tuning techniques. Overall, the project was the culmination of a semester-long effort by a highly interdisciplinary team. Their tool has immense potential in China's rapidly growing biogas market, and facilities can use it to gain instant information on how to improve their production processes.