Source: Case Western Reserve University submitted to NRP
PREDICTIVE TOOLS FOR DEGRADATION OF CHEMICALS OF EMERGING CONCERN IN LIVESTOCK MANURE BY ANAEROBIC DIGESTION AND ADVANCED OXIDATION PROCESSES
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
ACTIVE
Funding Source
Reporting Frequency
Annual
Accession No.
1022123
Grant No.
2020-67019-31019
Cumulative Award Amt.
$499,998.00
Proposal No.
2019-06496
Multistate No.
(N/A)
Project Start Date
May 1, 2020
Project End Date
Apr 30, 2025
Grant Year
2020
Program Code
[A1411]- Foundational Program: Agricultural Water Science
Recipient Organization
Case Western Reserve University
10900 Euclid Avenue
CLEVELAND,OH 44106
Performing Department
Department of Civil Engineerin
Non Technical Summary
Chemicals of Emerging Concern (CEC), such as veterinary antibiotics and hormones, have been used extensively in the US and globally, and their usage will likely increase substantially in coming decades. A large portion of CEC is excreted unchanged and enters the environment through manure land applications. Given the importance of veterinary antibiotics and hormones to animal husbandry, it is imperative to be able to predict the treatment efficiencies of these contaminants in important waste management processes for reuse of these nontraditional water resources. Currently, no predictive tools are available to estimate how well existing control technologies can remove CEC in manure. In this project, we propose to examine the degradation of 22+ veterinary antibiotics and hormones (from eight groups) in anaerobic digestion (AD), one of the best manure treatment processes, in both batch and continuous stirred-tank reactors. Moreover, given the insufficient removal of some CEC in AD, we propose to significantly enhance CEC removal by three advanced oxidation processes (AOPs) based on hydroxyl and/or sulfate radicals and permanganate/bisulfite (a powerful oxidant that can destroy CEC in milliseconds) in either pre- or post-AD treatment. Using machine learning and other advanced modeling approaches, we will develop predictive tools that relate the observed reactivity in both AD and AOPs to CEC structures, allowing estimation of the reactivity of a wide range of CEC in these processes. The results of this project will yield more realistic estimates of the discharge of CEC from livestock manure into the environment. Such information will not only enable us to develop sustainable management plans to ensure the environmental and health safety of the vast volume of manure, but also allow us to accurately assess the risk of applying different CEC-containing manures to farmland and of using the liquid fraction of AD effluents for irrigation.
Greener agricultural products can also be designed by knowing the removal behavior of the existing products. Ultimately, results of this project can be used by the USDA to improve U.S. agriculture and food systems.
Animal Health Component
0%
Research Effort Categories
Basic
90%
Applied
0%
Developmental
10%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
133                    0210                              0001                      100%
Knowledge Area
133 - Pollution Prevention and Mitigation;

Subject Of Investigation
0210 - Water resources;

Field Of Science
0001 - Administration;
Goals / Objectives
The ultimate goal of this proposal is to provide predictive tools to allow assessment of CEC removal during AD of manure and treatment by AOPs. To achieve this goal, we have four specific objectives:
Objective 1. Develop analytical methods for representative CEC in both liquid and solid matrices. 22 veterinary antibiotics and hormones belonging to eight groups have been identified to represent the most relevant CEC with sufficient structural diversity.
Objective 2. Examine the removal efficiencies of the 22 CEC in AD, with and without pre-treatment by AOPs. Batch reactors will first be employed to test the treatability, kinetics, and major products of all the selected CEC. Then, continuous stirred-tank reactors (CSTRs) will be employed to monitor the treatment efficiencies of mixed CEC under field conditions.
Objective 3. Examine the removal efficiencies of the 22 CEC by AOPs in AD effluents and when associated with AD biosolids. Each selected CEC will first be tested for its reactivity toward at least one of three AOPs in DI water. Then, reaction kinetics and products of the reactive CEC will be monitored in AD effluent. Experiments will also test the ability of the AOPs to destroy selected CEC pre-adsorbed to AD biosolids.
Objective 4. Combine the experimental data with any usable data that can be mined from the literature to develop predictive tools for the removal of a wide range of veterinary antibiotics and hormones by AD and AOPs, based on pp-LFERs for the sorption of CEC by biosolids and DNN-MF for the degradation of all structurally-related CEC.
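Objective 4 names pp-LFERs as the basis for modeling sorption to biosolids. As a minimal sketch, assuming the commonly used Abraham-type form log K = c + eE + sS + aA + bB + vV (the source does not state which pp-LFER form will be used), the relationship can be evaluated as follows. The class names, the function name, and all numeric values are hypothetical placeholders, not fitted results from this project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoluteDescriptors:
    """Abraham-type solute descriptors for one compound."""
    E: float  # excess molar refraction
    S: float  # dipolarity/polarizability
    A: float  # hydrogen-bond acidity
    B: float  # hydrogen-bond basicity
    V: float  # McGowan characteristic volume


@dataclass(frozen=True)
class SystemCoefficients:
    """Regression coefficients for one sorbent-water system."""
    c: float
    e: float
    s: float
    a: float
    b: float
    v: float


def pp_lfer_log_k(solute: SoluteDescriptors, system: SystemCoefficients) -> float:
    """Return log K = c + eE + sS + aA + bB + vV."""
    return (
        system.c
        + system.e * solute.E
        + system.s * solute.S
        + system.a * solute.A
        + system.b * solute.B
        + system.v * solute.V
    )


if __name__ == "__main__":
    # Made-up descriptor and coefficient values, for illustration only.
    example_solute = SoluteDescriptors(E=1.0, S=1.5, A=0.5, B=1.2, V=2.0)
    example_system = SystemCoefficients(c=0.1, e=0.5, s=-0.4, a=-0.2, b=-3.0, v=3.5)
    print(round(pp_lfer_log_k(example_solute, example_system), 3))
```

In practice the system coefficients would be obtained by multiple linear regression of measured log K values against the descriptors of the calibration compounds; that fitting step is omitted here.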
Project Methods
Although a number of studies have reported the removal of CEC in AD, only a small number of CEC were examined in these studies and there is a lack of understanding of how different CEC structures affect their removal in the AD process. Given the large variety of antibiotics and hormones applied to agricultural fields with the vast volume of manure, it is time-consuming, labor-intensive and costly to examine all CEC in AD; however, without such information, it is challenging to evaluate the risk of directly applying AD effluents to agricultural fields and to decide what management practices should be in place to ensure the environmental and health safety of the effluents. Therefore, it is imperative to develop predictive models that can be used to estimate the treatment efficiencies of various CEC in AD based on their structural and/or physicochemical properties. Currently, no such predictive model is available. The proposed research will address this challenge by establishing predictive models that can estimate 1) the biosorption and biodegradation of a variety of CEC in AD, an important, promising manure treatment process, and 2) the CEC degradation either pre- or post-AD in three powerful AOPs. To the PDs' knowledge, this will be the first research to do this.
To develop the predictive models, experiments have been designed to examine the degradation kinetics and products of 22 representative CEC in both AD and AOPs. Then, two different approaches will be used to develop the predictive models: poly-parameter linear free energy relationships (pp-LFERs) for adsorption by biosolids in AD and machine learning for degradation in both AD and AOPs. pp-LFERs have been widely used in modeling adsorption, but have not been applied to biosolids; machine learning is an emerging, powerful tool for discovering hidden trends in complex, large data sets and will likely yield accurate prediction results. 
In addition to the experimental results, the literature will be extensively reviewed for any available data on the degradation of other CEC in either AD or AOPs. This is because experimental efforts are always limited compared with the vast number of CEC in the environment. Also, accurate modeling requires a large amount of data, and the quality of the models is directly affected by the amount of data involved in the modeling process. Once the models have been developed, they will be properly validated and tested against new datasets that have not been used in the model development process.
Efforts: The PDs will work closely with the postdoc and PhD student on analytical method development, setting up experiments for both AD and AOPs, and developing models for both adsorption and degradation. Results of this project will be disseminated to the scientific community through publications in peer-reviewed journals and presentations to inter-disciplinary audiences at national and international conferences (e.g., ACS, Gordon Research Conferences, AEESP, SETAC annual meetings). In addition, a one-day workshop will be organized in year 3, to which farmers, livestock industry personnel and other relevant stakeholders will be invited. Lectures and presentations will be delivered by the research team and guest speakers as the need arises. A webinar of the workshop will also be offered to ensure as much participation as possible.
Evaluation: The following timetable will be closely followed to ensure that we are taking the right steps toward completing the proposed objectives. Completing each sub-objective will be a milestone; completing all the objectives will be used as a "measure" for the successful completion of the project. 
The types of data obtained will include: (1) analytical methods for the analysis of all the involved CEC and their degradation products; (2) degradation kinetics and products of the CEC in AD in both batch and CSTR reactors, either with or without AOPs as the pre-treatment; and (3) the developed predictive models for both adsorption and degradation.
Research Objectives (Year 1, Year 2, Year 3; quarters 1-4 in each year):
Objective 1
  1.1 CEC selection ***
  1.2 Analytical methods ***
Objective 2
  2.1 AD batch reactions ********
  2.2 AD CSTR reactions **
Objective 3
  3.1 AOPs in DI water ********
  3.2 Oxidation products ****
  3.3 AOPs in AD effluent ****
  3.4 AOPs of biosolids ****
Objective 4
  4.1 pp-LFERs ****
  4.2 Predictive Models ******
  4.3 Cross correlations ****
Reports and publications *****
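The batch and CSTR degradation kinetics listed among the data types above can be illustrated with a small sketch. It assumes pseudo-first-order degradation and an ideal, steady-state CSTR, neither of which the source specifies; the function names and example numbers are hypothetical.

```python
import math
from typing import Sequence


def fit_first_order_rate(times: Sequence[float], concentrations: Sequence[float]) -> float:
    """Estimate k from batch data by least-squares regression of ln(C) on t.

    Assumes C(t) = C0 * exp(-k * t); the returned k has units of 1/time.
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two (time, concentration) pairs")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive to take logarithms")
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return -sxy / sxx  # slope of ln(C) vs t is -k


def batch_fraction_remaining(k: float, t: float) -> float:
    """Fraction C/C0 left in a batch reactor after time t."""
    return math.exp(-k * t)


def cstr_fraction_remaining(k: float, hrt: float) -> float:
    """Steady-state C/C0 for an ideal CSTR with hydraulic retention time hrt."""
    return 1.0 / (1.0 + k * hrt)


if __name__ == "__main__":
    # Synthetic batch data generated with k = 0.1 per day (illustration only).
    days = [0.0, 5.0, 10.0, 20.0]
    conc = [100.0 * batch_fraction_remaining(0.1, d) for d in days]
    k_est = fit_first_order_rate(days, conc)
    print(f"k = {k_est:.3f} 1/d")
    print(f"CSTR removal at 20 d HRT = {1 - cstr_fraction_remaining(k_est, 20.0):.1%}")
```

For the same rate constant and residence time, the ideal CSTR removes less than the batch reactor, which is one reason batch results cannot be transferred to continuous operation without a reactor model.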

Progress 05/01/24 to 04/30/25

Outputs
Target Audience: PI Zhang's work targets a diverse audience within the scientific community, including researchers in environmental science and machine learning, as evidenced by her numerous peer-reviewed publications on polymer biodegradation and adsorption in soil-water systems. Her active participation in major conferences, such as the ACS National Conference and SETAC Annual Meetings, aims to engage academics and industry professionals interested in topics like abiotic humification and advanced machine learning models. Collaborations with industry leaders like BASF and P&G further extend her reach to corporate stakeholders seeking practical applications of her research. Additionally, her involvement in high-profile symposiums and international conferences attracts experts and decision-makers focused on advancing environmental applications. The upcoming publication of PhD student Yushu Cheng's dissertation on "Machine Learning Modeling for Predicting Anaerobic Biodegradation of Organic Contaminants in Different Environments" at Case Western Reserve University will also contribute to reaching academic peers and specialists in the field by the end of 2025.
Changes/Problems: Over the past five years, we have fully embraced emerging machine learning tools, finding them extremely powerful and relying on them more than initially planned. This shift led us to focus on data curation and machine learning modeling, using experimental approaches only when necessary to build comprehensive datasets. Despite these methodological changes, the project objectives remain unchanged, with a move away from traditional QSARs towards advanced machine learning modeling.
What opportunities for training and professional development has the project provided? We have involved three PhD students and one postdoc (50% effort) in the project, and they have received training toward their professional development. 
PI Zhang also incorporated the project findings and the relevant background knowledge into two of her courses: Intro to Environmental Engineering (for juniors and seniors) and Environmental Engineering Chemistry (for graduate students and seniors). Two low-income high school students also worked on the project over the summer of 2024, on contaminant adsorption experiments.
How have the results been disseminated to communities of interest? PI Zhang has been invited to give 11 talks related to this project in the past year to a broad range of audiences, ranging from national and international conferences such as ACS and AEESP national meetings, to industry (BASF), to reputable universities in the US and China, and to broad audiences including the CWRU Emeriti Academy. Her students and Zhang also presented the findings 9 times at national and international conferences in the past year.
Invited talks:
  • Zhang, H.,* 2025 "A data-centric approach to machine learning for environmental applications", Department of Energy, Environmental and Chemical Engineering, Washington University - St. Louis, April 25, 2025
  • Zhang, H.,* 2025 "Machine learning modeling for environmental applications", University of Minnesota - Twin City, April 14, 2025
  • Zhang, H.,* 2025 "Microplastics and forever chemicals: An Environmental perspective", The CWRU Emeriti Academy, April 8, 2025
  • Zhang, H.,* 2025 "Polymer Biodegradation in Aquatic Environments: A Machine Learning Model Informed by Meta-Analysis of Structure-Biodegradation Relationships", BASF, March 13, 2025 (virtual).
  • Zhang, H.,* 2024 "Common pitfalls and best practices when using machine learning in environmental research", Symposium "Opportunities and Potential Limitations of Applying Artificial Intelligence and Machine Learning to Soil Science". The Soil Science Society of America ASA-CSSA-SSSA Annual Meeting, Nov 10-13, 2024. San Antonio, TX.
  • Zhang, H.,* 2024 "Aerobic biodegradation of polymers in aquatic environments: High-throughput methods and machine learning models", Symposium on "Environmental Fate of Polymers", The 45th SETAC North America Annual Meeting, Oct 20 - 24, 2024. Fort Worth, Texas.
  • Zhang, H.,* 2024 "Synergizing domain knowledge, experimental data, and active learning for modeling environmental processes", the SETAC Asia-Pacific 14th Biennial Meeting, September 21-25, 2024. Tianjin, China.
  • Zhang, H.,* 2024 "Advanced machine learning for predicting harmful algal blooms and streamlining water quality analysis without equipment", Symposium on "Data Science in Environmental Research for a Sustainable Future", ACS Fall National Conferences 2024, August 18 - 22. Denver, CO.
  • Zhang, H.,* 2024 "Modeling environmentally relevant chemical reactions with machine learning", Biennial Conference on Chemical Education 2024, July 28 - Aug 1, 2024. Lexington, Kentucky.
  • Zhang, H.,* 2024 "Modeling Adsorption of Organic Chemicals Using Machine Learning: From Single Solute to Solute Mixtures", ISPT 2024 International Congress on Separation and Purification Technology. July 7-11, 2024. Zhengzhou, China
  • Zhang, H.,* 2024 "Modeling environmentally relevant chemical reactions with machine learning", Training school on "Cross-border transfer and development of sustainable resource recovery strategies towards zero waste", June 9 - June 14, 2024. National University of Ireland - Galway, Galway, Ireland (virtual).
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? One PhD student, Chengrui Lin, has finished a paper, entitled "Polymer Biodegradation in Aquatic Environments: A Machine Learning Model Informed by Meta-Analysis of Structure-Biodegradation Relationships" (Environ. Sci. Technol. 2025, 59, 1253-1263), on building machine learning models for aerobic biodegradation kinetics data of a number of polymers in aquatic environments. This work extends the original proposal, as we saw the need to model polymers in addition to the small molecules initially proposed. While polymers are produced on a large scale, with 400.3 million tons of solid polymers (plastics) and 36.3 million tons of liquid polymers produced annually, biodegradable polymers constitute less than 2% of total production, indicating a need for research to expand their availability.
While working on the biodegradation of a number of organic compounds by either sediment/soil or digestion sludge, we recognized the importance of sorption in affecting chemical biodegradation. A second PhD student, Jiachun Sun, therefore reviewed the literature for all reported soil sorption data of various organic compounds. He collected a data set of 20,945 data points covering the sorption of 419 diverse organic compounds onto 1037 different soils. He subsequently developed machine learning models to predict the soil sorption of a much larger number of organic compounds onto global soil. This work, together with the work on the predictive modeling of anaerobic biodegradation of various organic compounds in sediment/soil and sludge, is a major step forward in modeling and predicting the fate and transformation of hundreds to thousands of organic compounds in both natural/agricultural environments and wastewater treatment sludges. This work was recently published in the Journal of Hazardous Materials (JHM), entitled "Predicting sorption of diverse organic compounds in soil-water systems: Meta-analysis, machine learning modeling, and global soil mapping" (JHM, 488 (2025) 137480).
Building on her first paper on modeling the anaerobic biodegradation of hundreds of organic compounds in soil and sludge, the third PhD student, Yushu Cheng, recognized the limited dataset in her modeling effort, despite her time-consuming literature review and lab experimental efforts. To significantly improve the efficiency of collecting all potentially available data from the literature, so that we can build the largest ever dataset for the anaerobic biodegradation of all available organic compounds in different environments, Yushu decided to take advantage of the latest large language models (LLMs), such as ChatGPT, and developed an automated information extraction pipeline, named Literature to Dataset (L2D). Using anaerobic biodegradation kinetics of organic contaminants under different conditions as the target, this structured multi-step prompting pipeline dynamically incorporates domain constraints and integrates an LLM-based self-validation process. L2D achieves 93% accuracy in extracting 8 features for anaerobic biodegradation conditions, surpassing existing frameworks without requiring additional training. These results establish L2D as a scalable, low-code alternative for generating structured datasets in complex environmental domains. This work, entitled "L2D: A Versatile Literature-to-Dataset Pipeline for Complex, User-Specific Data Extraction in Environmental Research", is in review with Water Research. Yushu is also expected to graduate in the Fall of 2025, with the dissertation entitled "Machine Learning Modeling for Predicting Anaerobic Biodegradation of Organic Contaminants in Different Environments".
After graduation in May 2024, Dr. Yidan Gao continued to work on the oxidation of phenolic compounds by manganese oxides under alkaline conditions. One objective was to understand how to facilitate post-treatment of AD effluents using AOPs, and the other objective was to examine the formation of organic matter during the process, which will help us examine the carbon balance of organic compounds during the process. We submitted this paper, entitled "MnO2 and O2 facilitated organic matter formation via oxidation and catalysis in alkaline air conditions", to Environ. Sci. Technol. for review in late 2024. After receiving the initial reviewers' comments (major revision), we recently submitted a revised version for review.
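The extract-then-self-validate idea described for L2D above can be sketched roughly as follows. This is not the authors' pipeline: the field names, prompts, constraint checks, and the `ask_llm` callable are all hypothetical stand-ins, and any real LLM client would have to be supplied by the reader.

```python
import json
from typing import Callable, Dict, List, Optional

# Hypothetical target fields; the source reports 8 features but does not list them.
FIELDS = ("compound", "temperature_c", "half_life_d")

Record = Dict[str, object]


def passes_constraints(record: Record) -> bool:
    """Apply simple, assumed domain constraints to an extracted record."""
    temp = record.get("temperature_c")
    half_life = record.get("half_life_d")
    if not isinstance(temp, (int, float)) or not isinstance(half_life, (int, float)):
        return False
    return 0 <= temp <= 100 and half_life > 0


def extract_record(
    passage: str, ask_llm: Callable[[str], str], max_rounds: int = 2
) -> Optional[Record]:
    """Ask for a JSON record, check it, then ask the model to verify it."""
    extract_prompt = (
        "Extract these fields as a JSON object: " + ", ".join(FIELDS)
        + ". Temperature is in degrees C and half-life in days.\n\nPassage:\n" + passage
    )
    for _ in range(max_rounds):
        raw = ask_llm(extract_prompt)
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed answer, try again
        if not isinstance(record, dict) or set(record) != set(FIELDS):
            continue  # wrong shape or missing/extra fields
        if not passes_constraints(record):
            continue  # violates the assumed domain constraints
        check_prompt = (
            "Check whether this record is supported by the passage. "
            "Answer YES or NO.\n\nRecord:\n" + json.dumps(record)
            + "\n\nPassage:\n" + passage
        )
        if ask_llm(check_prompt).strip().upper().startswith("YES"):
            return record
    return None  # nothing passed validation within max_rounds


def field_accuracy(predicted: List[Optional[Record]], reference: List[Record]) -> float:
    """Fraction of reference (record, field) pairs reproduced exactly."""
    total = 0
    correct = 0
    for pred, ref in zip(predicted, reference):
        for field, value in ref.items():
            total += 1
            if pred is not None and pred.get(field) == value:
                correct += 1
    return correct / total if total else 0.0
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM service, and `field_accuracy` shows one simple way a per-feature accuracy figure could be computed against a hand-labeled reference set.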

Publications

  • Type: Peer Reviewed Journal Articles Status: Published Year Published: 2025 Citation: Lin, Chengrui; Kuan Huang; and Zhang, H.*; 2025 Predictive models for polymer biodegradation kinetics, Environ. Sci. Technol. 59, 2, 1253-1263
  • Type: Peer Reviewed Journal Articles Status: Published Year Published: 2025 Citation: Jiachun Sun, Kai Zhang, and Zhang, H.*; 2025 Predicting Adsorption of Diverse Organic Compounds in Soil-Water Systems: Meta-analysis, Predictive Models, and Global Soil Mapping, J. Haz. Mat., 488, 137480
  • Type: Conference Papers and Presentations Status: Published Year Published: 2025 Citation: 1. Zhang, H.*; 2025 Integrating Feature Engineering and Domain Expertise for Machine Learning in Environmental Applications, Symposium on Sensors and data analytics for environmental applications, ACS National Conference 2025, March 23-27, 2025
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: 2. Gao, Yidan and Zhang, H.*; 2024 Early-stage abiotic humification: Understanding the roles of MnO2 and oxygen in catechol oxidation under alkaline air conditions, The 45th SETAC North America Annual Meeting, Oct 20 - 24, 2024. Fort Worth, Texas
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: 4. Zheng, Zihan; Kai Zhang; Peiling Yu; and Zhang, H.*; 2024 Bi-solute adsorption modeling with active learning: A new strategy toward estimating multi-solute adsorption capacities on GACs, Symposium on Data Science in Environmental Research for a Sustainable Future, ACS Fall National Conferences 2024, August 18 - 22. Denver, CO
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: 5. Jiachun Sun, Kai Zhang, and Zhang, H.*; 2024 Predicting Adsorption of Diverse Organic Compounds in Soil-Water Systems: Meta-analysis, Predictive Models, and Global Soil Mapping, Symposium on Data Science in Environmental Research for a Sustainable Future, ACS Fall National Conferences 2024, August 18 - 22. Denver, CO
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: 6. Cheng, Yushu, Kai Zhang, Kuan Huang and Zhang, H.*; 2024 Meta-Analysis and Machine Learning Models for Predicting Anaerobic Biodegradation Rates of Organic Contaminants in Sediments and Sludge, Gonter Award Symposium (Invited), ACS Fall National Conferences 2024, August 18 - 22. Denver, CO
  • Type: Conference Papers and Presentations Status: Accepted Year Published: 2025 Citation: 7. Cheng, Yushu, and Zhang, H.*; 2025 Automated data collection from literature - A flexible, model-free approach using conversational large language models and prompt engineering, 2025 AEESP Research and Education Conference, Durham, NC, May 20-22, 2025
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: 8. Cheng, Yushu, Kai Zhang, Kuan Huang and Zhang, H.*; 2024 Meta-Analysis and Machine Learning Models for Predicting Anaerobic Biodegradation Rates of Organic Contaminants in Sediments and Sludge, 2024 CAPEES Virtual Student Poster Competition, July 19, 2024
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: 9. Jiachun Sun, Kai Zhang, and Zhang, H.*; 2024 Predicting Adsorption of Diverse Organic Compounds in Soil-Water Systems: Meta-analysis, Predictive Models, and Global Soil Mapping, 2024 CAPEES Virtual Student Poster Competition, July 19, 2024
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: 3. Lin, Chengrui; and Zhang, H.*; 2024 Predictive models for polymer biodegradation kinetics, Symposium on Data Science in Environmental Research for a Sustainable Future, ACS Fall National Conferences 2024, August 18 - 22. Denver, CO


Progress 05/01/20 to 04/30/25

Outputs
Target Audience: PI Zhang has been invited to give 53 talks related to this project in the past five years to a broad range of audiences. Her PhD students and postdocs involved in the project also delivered 16 conference presentations. Here is a list of all institutions categorized into groups, along with the number of institutions in each group:
Academic Institutions and Departments (23)
  • Case Western Reserve University
  • Cleveland State University
  • Colorado School of Mines
  • Dalian Institute of Technology
  • East China Normal University
  • Fudan University
  • Huanqiao University
  • King Abdullah University of Science and Technology (KAUST)
  • Nanjing University
  • Nankai University
  • National University of Ireland - Galway
  • Northumbria University, UK
  • Peking University, China
  • Research Center for Eco-Environmental Sciences (RCEES), Chinese Academy of Science
  • Shanghai JiaoTong University
  • Southern Illinois University Edwardsville
  • University of California Berkeley
  • University of California Riverside
  • University of Cincinnati
  • University of Colorado Boulder
  • University of Minnesota - Twin City
  • University of Science & Technology of China
  • Washington University - St. Louis
Professional Societies and Conferences (19)
  • 10th National Conference of the Society of Toxicology
  • 2023 Mesilla Chemistry Workshop on Aqueous Solution/Oxide Interfaces
  • 8th Annual CWRU/Tohoku Data Science Symposium
  • ACS National Meetings
  • AEESP
  • Biennial Conference on Chemical Education
  • Biennial Conference on Chemical Education 2024
  • CAPEES virtual poster session
  • Fifth International Symposium for Persistent, Bioaccumulating and Toxic Substances
  • International Congress on Separation and Purification Technology Congress 2024
  • International Water Association Resource Recovery 2023
  • SETAC Asia-Pacific Biennial Meeting
  • SETAC North America Annual Meeting
  • SPT International Lectureship on Separation and Purification Technology
  • The 2nd Greater Bay Area Symposium on Separation and Purification Technology
  • The 20th Beijing Conference and Exhibition on Instrumental Analysis
  • The 7th National Conference on Ecotoxicology
  • The Soil Science Society of America ASA-CSSA-SSSA Annual Meeting
  • The Society for Analytical Chemistry of Pittsburgh (SACP) and The Spectroscopy Society of Pittsburgh (SSP)
Industry and Corporations (3)
  • BASF
  • Lubrizol
  • P&G
Community and Public Forums (4)
  • FESE Youth Forum on AI and Environmental Science
  • Journal of Eco-Environment & Health invited talk series
  • The CWRU Emeriti Academy
  • Unitarian Universalist Congregation of Cleveland, Forums that Matter
Readers for the following journals:
  • ACS ES&T Water
  • Chem. Eng. J.
  • Environ. Sci. Technol.
  • J. Haz. Mat.
  • Water Res.
Changes/Problems: Due to pandemic-related restrictions, the project team shifted focus from experimental work to developing machine learning-based predictive models for Advanced Oxidation Processes (AOPs) and anaerobic biodegradation kinetics. This change allowed for significant progress in modeling, which was prioritized in the first three years. The team found the emerging machine learning tools to be extremely powerful, leading to a strategic decision to rely more heavily on these methods than initially planned. 
As a result, they concentrated on data curation and modeling, using experimental approaches only when necessary to build comprehensive datasets. Despite these changes, the project's objectives remain unchanged, with a transition from traditional QSARs to advanced machine learning modeling. The recruitment of six PhD students has further bolstered the team's capacity to meet and exceed original project goals, particularly in the areas of contaminant degradation, polymer biodegradation, and adsorption modeling.
What opportunities for training and professional development has the project provided? The project has actively engaged six PhD students, including three women, and one postdoctoral researcher (at 50% effort), fostering their professional development across various domains. These individuals have been trained in literature review, research design, machine learning modeling, and specialized areas such as Advanced Oxidation Processes (AOPs), biodegradation, and sorption. Additionally, they have honed their skills in data analysis, manuscript writing, and presenting research findings. Each participant has authored at least one peer-reviewed journal article as first author and presented their work at national conferences. Furthermore, PI Zhang has integrated the project's findings and relevant background knowledge into her curriculum, enhancing two courses: "Introduction to Environmental Engineering," aimed at juniors and seniors (30-40 students annually), and "Environmental Engineering Chemistry," targeted at graduate students and seniors (5-12 students annually). In addition, the project provided opportunities for one female undergraduate and two low-income high school students to participate in contaminant adsorption experiments during the summers of 2023 and 2024. They acquired essential laboratory skills for conducting adsorption experiments, mastered techniques for contaminant analysis, and practiced using data analysis and visualization tools. 
Additionally, they successfully designed their own posters and presented their findings in a poster session at the conclusion of their summer involvement, showcasing their contributions and learning experiences.
How have the results been disseminated to communities of interest?
Conference presentations (16) by Zhang's PhD students working on the project:
  • Zhong et al.; 261st ACS National Meeting, Division of Environmental Chemistry, April 5 - 30, 2021
  • Zhong et al.; 261st ACS National Meeting, Division of Environmental Chemistry, April 5 - 30, 2021
  • Huang and Zhang; 2022 AEESP Research and Education Conference, St Louis, MO. June 28-30, 2022
  • Zhong et al.; 2022 AEESP Research and Education Conference, St Louis, MO. June 28-30, 2022
  • Cheng et al.; 8th Annual CWRU/Tohoku Data Science Symposium, August 8-9, 2022. Cleveland OH
  • Sun et al.; 8th Annual CWRU/Tohoku Data Science Symposium, August 8-9, 2022. Cleveland OH
  • Huang and Zhang; Lubrizol - CWRU poster session, CWRU, Cleveland, OH. March 1, 2023
  • Gao and Zhang; The 45th SETAC North America Annual Meeting, Oct 20 - 24, 2024. Fort Worth, Texas
  • Cheng et al.; 2024 CAPEES Virtual Student Poster Competition, July 19, 2024
  • Sun et al.; 2024 CAPEES Virtual Student Poster Competition, July 19, 2024
  • Cheng et al.; Gonter Award Symposium (Invited), ACS Fall National Conferences 2024, August 18 - 22. Denver, CO
  • Sun et al.; ACS Fall National Conferences 2024, August 18 - 22. Denver, CO
  • Zheng et al.; ACS Fall National Conferences 2024, August 18 - 22. Denver, CO
  • Lin et al.; ACS Fall National Conferences 2024, August 18 - 22. Denver, CO
  • Zhang, H.; ACS National Conference 2025, March 23-27, 2025
  • Yushu Cheng and Zhang, H.; 2025 AEESP Conferences, May 20-22, 2025. Durham, NC
Three PhD dissertations:
  • Zhong, Shifa; "Permanganate reaction kinetics and mechanisms and machine learning application in oxidative water treatment", April 2021. Case Western Reserve University
  • Gao, Yidan; "Mechanism-based or machine learning-based kinetic modeling for abiotic chemical transformation with Fe(II) or Mn oxides", May 2024. Case Western Reserve University
  • Cheng, Yushu; "Machine Learning Modeling for Predicting Biodegradation of Organic Contaminants in Different Environments", December 2025. Case Western Reserve University
A list of 53 invited presentations delivered by PI Zhang:
  • Department of Environmental Science & Engineering, Fudan University, China, September 17, 2021 (virtual)
  • School of Environment, Nanjing University, China, September 22, 2021 (virtual)
  • The 7th National Conference on Ecotoxicology, Guilin, China, September 28, 2021 (virtual)
  • School of Civil Engineering, Huanqiao University, China, October 6, 2021 (virtual)
  • East China Normal University, China, October 19, 2021 (virtual)
  • Department of Computer and Data Science, Case Western Reserve University, October 21, 2021
  • Department of Civil and Environmental Engineering, University of Colorado Boulder, October 22, 2021 (virtual)
  • Research Center for Eco-Environmental Sciences (RCEES), Chinese Academy of Science, China, November 17, 2021 (virtual)
  • Nankai University, China, December 2, 2021
  • Department of Civil and Environmental Engineering, University of California Berkeley, February 11, 2022 (virtual)
  • SPT International Lectureship on Separation and Purification Technology, February 16, 2022 (virtual)
  • Symposium on "Advanced (Nano)Materials, Membranes, and Manufacturing for Water Treatment and Reuse", ACS Spring National Meeting, March 20-24, 2022, San Diego, CA
  • Peking University, Beijing, China, May 8, 2022 (virtual)
  • Journal Eco-Environment & Health invited talk series, Nanjing, China, May 11, 2022 (virtual)
  • Department of Chemistry, Southern Illinois University Edwardsville, IL, June 27, 2022
  • 8th Annual CWRU/Tohoku Data Science Symposium, Cleveland, OH, August 8-9, 2022
  • Journal of Eco-Environment & Health invited talk series, Nanjing, China, August 11, 2022 (virtual)
  • Department of Ecology, Evolution, and Environmental Science, Cleveland State University, Cleveland, OH, September 9, 2022
  • 5th webinar (on-line) entitled "Machine learning in advanced sustainable materials", Northumbria University, UK, October 18, 2022 (virtual)
  • King Abdullah University of Science and Technology (KAUST), Saudi Arabia, October 21-26, 2022
  • Dalian Institute of Technology, China, November 10, 2022 (virtual)
  • Unitarian Universalist Congregation of Cleveland, Forums that Matter, Cleveland, OH, November 6, 2022
  • ISPT 10th International Congress on Separation and Purification Technology, December 11-14, 2022 (virtual)
  • 2023 Mesilla Chemistry Workshop on Aqueous Solution/Oxide Interfaces, Mesilla, New Mexico, February 5-7, 2023
  • 10th National Conference of the Society of Toxicology, China, April 8-11, 2023 (virtual)
  • FESE Youth Forum on AI and Environmental Science, May 12, 2023 (virtual)
  • The 2nd Greater Bay Area Symposium on Separation and Purification Technology, Hong Kong, China, May 19-22, 2023 (virtual)
  • School of the Environment, Nanjing University, Nanjing, China, May 29, 2023
  • Department of Environmental Science, University of Science & Technology of China, HeFei, China, May 30, 2023
  • School of Ecology and Environment, East China Normal University, Shanghai, China, June 16, 2023
  • School of Environmental Science and Engineering, Shanghai JiaoTong University, China, June 17, 2023
  • Research Center for Eco-Environmental Sciences, University of Chinese Academy of Science, China, June 20, 2023 (virtual)
  • Department of Environmental Sciences, University of Chinese Academy of Science, China, June 20, 2023 (virtual)
  • Department of Chemical and Environmental Engineering, University of California Riverside, Riverside, CA, August 10, 2023
  • Journal of Eco-Environment & Health invited talk series, August 17, 2023 (virtual)
  • Symposium on "Machine Learning, Artificial Intelligence and Big Data in Geochemistry", Division of Geochemistry, 2023 Fall ACS National Meeting, San Francisco, CA, August 13-17, 2023
  • The 20th Beijing Conference and Exhibition on Instrumental Analysis 2023, September 6-8, 2023 (virtual)
  • Department of Civil and Environmental Engineering, Colorado School of Mines, September 15, 2023 (virtual)
  • Department of Chemical and Environmental Engineering, University of Cincinnati, Cincinnati, OH, September 22, 2023
  • International Water Association Resource Recovery 2023, Shenzhen, China, November 1-3, 2023
  • Symposium on Aquatic Redox Chemistry, ACS Spring 2024 National Conferences, New Orleans, March 17-21, 2024
  • The Continuing Education Committee, The Society for Analytical Chemistry of Pittsburgh (SACP) and The Spectroscopy Society of Pittsburgh (SSP), Pittsburgh, PA, April 27, 2024
  • Training school on "Cross-border transfer and development of sustainable resource recovery strategies towards zero waste", National University of Ireland - Galway, Galway, Ireland, June 9-14, 2024 (virtual)
  • ISPT 2024 International Congress on Separation and Purification Technology, Zhengzhou, China, July 7-11, 2024
  • Biennial Conference on Chemical Education 2024, Lexington, Kentucky, July 28-August 1, 2024
  • Symposium on "Data Science in Environmental Research for a Sustainable Future", ACS Fall National Conferences 2024, Denver, CO, August 18-22, 2024
  • SETAC Asia-Pacific 14th Biennial Meeting, Tianjin, China, September 21-25, 2024
  • Symposium on "Environmental Fate of Polymers", The 45th SETAC North America Annual Meeting, Fort Worth, Texas, October 20-24, 2024
  • Symposium "Opportunities and Potential Limitations of Applying Artificial Intelligence and Machine Learning to Soil Science", The Soil Science Society of America ASA-CSSA-SSSA Annual Meeting, San Antonio, TX, November 10-13, 2024
  • BASF, March 13, 2025 (virtual)
  • The CWRU Emeriti Academy, April 8, 2025
  • University of Minnesota - Twin City, April 14, 2025
  • Department of Energy, Environmental and Chemical Engineering, Washington University - St. 
Louis April 25, 2025 What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals?
QSAR Model Development: We developed QSAR models for the logkHO· values of 1,089 organic compounds toward the hydroxyl radical (HO·), a crucial reactive oxygen species in many advanced oxidation processes (AOPs), in the aqueous phase using two machine learning algorithms. An ensemble model combining Deep Neural Networks (DNN) with XGBoost demonstrated superior performance. This model enables users to predict second-order rate constants of numerous organic compounds toward the hydroxyl radical. Additionally, this study provides essential mechanistic insights into ML-assisted environmental tasks, enhancing the trustworthiness of ML-based models, improving them for specific applications, and leveraging the implicit knowledge they contain.
Anaerobic Biodegradation Modeling: We compiled the most comprehensive dataset to date for anaerobic biodegradation modeling in sediments and sludge, consisting of 978 records compared to fewer than 100 in other studies. Extensive data and post-model analysis revealed, for the first time, that including experimental settings (categorized into chemicals, inoculum, additives, and conditions) was critical for modeling anaerobic degradation. We developed a binary classification model with 81% accuracy and a regression model with an R² of 0.56 for sediments (80% and 0.31 for sludge). These models define applicability domains, providing reasonable predictions for the half-lives of 41% (sediments) and 67% (sludge) of over 4,000 persistent, bioaccumulative, and toxic chemicals. The models facilitate biodegradation predictions for diverse chemicals under varying conditions, supporting applications in anaerobic digestion treatment design and optimization, environmental risk assessment, site remediation, and chemical persistence evaluation in sludge systems.
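As a rough illustration of two ideas in the paragraphs above, blending the predictions of two learners and scoring a model by accuracy or R², the following Python sketch may help. The report does not state how the DNN and XGBoost outputs were combined, so the weighted average used here is an assumption, and all function names are hypothetical.

```python
from statistics import mean


def ensemble_average(pred_a, pred_b, weight_a=0.5):
    """Blend two models' log k predictions with a fixed weight (assumed scheme)."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mu = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mu) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


def accuracy(labels, predicted_labels):
    """Fraction of binary class labels predicted correctly."""
    hits = sum(1 for y, p in zip(labels, predicted_labels) if y == p)
    return hits / len(labels)
```

For example, `ensemble_average([9.0, 10.0], [10.0, 9.0])` returns `[9.5, 9.5]`, and `accuracy([1, 0, 1, 1], [1, 0, 0, 1])` returns `0.75`. These toy numbers are not project data.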
Biodegradable Polymer Development: Although biodegradable polymers represent a promising solution for reducing environmental impact, they account for less than 2% of total polymer production. To expand their availability, we curated an extensive aerobic biodegradation dataset of 74 polymers and 1,779 data points from published literature and 28 original experiments. A meta-analysis evaluated the effects of experimental conditions, polymer structure, and their combined impact on biodegradation. We developed a machine learning model to predict polymer biodegradation in aquatic environments, achieving an R² test score of 0.66 using Morgan fingerprints, detailed experimental conditions, and thermal decomposition temperature (Td) as input descriptors. This research marks the first attempt at developing regression models for polymer ultimate biodegradation. We launched the model on a free, user-friendly website, offering significant potential for designing environmentally friendly polymers and aiding material scientists, environmental researchers, and sustainability-focused industries. This model can also help estimate the treatability of agricultural waste that contains polymer-based products such as controlled-release fertilizers, water-retention and soil-conditioning agents, mulching films, seed coatings, greenhouse films, and packaging and storage materials.
Soil-Water Sorption Modeling: We compiled a comprehensive soil-water sorption dataset encompassing 20,945 data points for 419 organic compounds with various functional groups and 1,037 different soils. Machine learning models were developed to cover the entire spectrum of speciation for cationic, neutral, and anionic species. Utilizing soil properties from the Harmonized World Soil Database, the models predicted sorption of diverse organic compounds based on global soil properties under simulated environmental scenarios. To encourage widespread adoption among users with minimal coding experience, we created a free online platform to host the models: https://envmodel-cwru.streamlit.app/. Global mapping is crucial for managing soil health and mitigating environmental risks by analyzing the influence of soil properties. Identifying key pollutants based on soil characteristics enables prioritized monitoring and remediation, while mapping regions at higher risk allows targeted interventions, aiding pollution prevention, resource allocation, and policy development. The models can also help agricultural practitioners evaluate the fate and transport of organic contaminants in soil, supporting risk assessment and site remediation.
Oxygen's Role in Organic Matter Formation: This study presents new insights into the complex role of oxygen in the oxidative transformation of phenolic chemicals under varying pH conditions. It reveals, for the first time, oxygen's dual function as both an oxidant and a catalyst under alkaline conditions with MnO2 during early-stage organic matter (OM) formation from catechol. This novel finding challenges conventional understanding, highlighting oxygen's regenerative capabilities and catalytic influence in polyphenol transformation. The study demonstrates that while MnO2 and circumneutral conditions facilitate polymerization, oxygen and high alkalinity promote ring cleavage and hydroxylation, significantly impacting OM formation. This advancement has profound implications for artificial OM production, biomass recycling, carbon cycle management, and climate change mitigation. Additionally, the enhanced effects can be harnessed for practical applications such as soil remediation and enhancement, where OM formation enriches soil quality, boosts nutrient content, improves water retention, and supports microbial activity.
L2D Development: L2D (Literature to Data) represents a new approach in applying large language models (LLMs) to environmental research. This structured multi-step prompting pipeline incorporates domain constraints and self-validation, enhancing the selectivity of information extraction. L2D focuses on relevant experimental details, reducing manual data compilation efforts and offering a low-code, domain-flexible solution accessible to researchers without extensive coding skills. By introducing this adaptable framework, L2D facilitates automated literature mining and encourages the environmental research community to explore generative AI applications. Ultimately, L2D lowers the barrier for automated extraction, supporting a shift toward scalable, AI-assisted synthesis in environmental research.
AOPs and Adsorption Modeling: In addition to completed papers, we conducted an in-depth review (book chapter) on controlling disinfection by-product formation in AOPs. We are finalizing a machine learning model predicting the adsorption of organic compound mixtures onto granular activated carbon, a common pre- and post-treatment process that couples with anaerobic biodegradation and/or AOPs for wastewater treatment. By modeling realistic contaminant mixtures, this model provides accurate estimates of contaminant removal capacities, enabling precise design of adsorption systems. It also facilitates the estimation of proper lifetimes for adsorbents, ensuring timely regeneration and optimal treatment system performance.
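The L2D description above (multi-step prompting, domain constraints, self-validation) can be pictured with a minimal Python sketch. This is not the L2D implementation: the prompts, the record fields, the allowed units, and the `stub_llm` stand-in for a real language model call are all hypothetical and chosen only to show the control flow.

```python
import json


def extract_records(passage, ask_llm, allowed_units=("d", "h")):
    """Three-step sketch: extract, apply domain constraints, self-validate."""
    # Step 1: ask the model for structured records (hypothetical prompt).
    raw = ask_llm("Extract half-life records as a JSON list.\n" + passage)
    try:
        records = json.loads(raw)
    except json.JSONDecodeError:
        return []

    kept = []
    for rec in records:
        # Step 2: domain constraints (assumed): known unit, positive number.
        if not isinstance(rec, dict) or rec.get("unit") not in allowed_units:
            continue
        value = rec.get("half_life")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if value <= 0:
            continue
        # Step 3: self-validation, asking the model to confirm its own output.
        verdict = ask_llm(
            "Answer yes or no: does the passage support this record?\n"
            + json.dumps(rec) + "\n" + passage
        )
        if verdict.strip().lower().startswith("yes"):
            kept.append(rec)
    return kept


def stub_llm(prompt):
    """Hypothetical stand-in for an LLM call so the sketch runs offline."""
    if prompt.startswith("Extract"):
        return json.dumps([
            {"compound": "compound A", "half_life": 12.0, "unit": "d"},
            {"compound": "compound B", "half_life": -1, "unit": "d"},
            {"compound": "compound C", "half_life": 3, "unit": "weeks"},
        ])
    return "yes"


validated = extract_records("placeholder passage text", stub_llm)
```

With the stub above, only the first record survives: the second fails the positivity check and the third uses a unit outside the allowed set.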

Publications

  • Type: Peer Reviewed Journal Articles Status: Published Year Published: 2024 Citation: Cheng, Yushu, Kai Zhang, Kuan Huang and Zhang, H.*; 2024 Meta-Analysis and Machine Learning Models for Predicting Anaerobic Biodegradation Rates of Organic Contaminants in Sediments and Sludge, Environ. Sci. Technol., 58, 29, 12976-12988
  • Type: Other Journal Articles Status: Under Review Year Published: 2025 Citation: Cheng, Yushu and Zhang, H.; 2025 Automated data collection from literature - A flexible, model-Free approach using conversational large language models and prompt engineering, Water Res., In review


Progress 05/01/23 to 04/30/24

Outputs
Target Audience: PI Zhang has been invited to present her research related to this project at 18 national and international institutions and conferences:
  • Zhang, H.,* 2024 "Modeling environmentally relevant chemical reactions with machine learning", the Continuing Education Committee, The Society for Analytical Chemistry of Pittsburgh (SACP) and The Spectroscopy Society of Pittsburgh (SSP), April 27, 2024, Pittsburgh, PA.
  • Zhang, H.,* 2024 "Revisiting the Synergy Between Kinetic Modeling and Reaction Mechanisms in Environmental Oxidation Reactions", Symposium on Aquatic Redox Chemistry, ACS spring 2024 National conferences. New Orleans, March 17-21, 2024
  • Zhang, H.,* 2023 "Modeling environmentally relevant chemical reactions with machine learning", International Water Association Resource Recovery 2023. Shenzhen, China. November 1-3, 2023
  • Zhang, H.,* 2023 "Lake Erie in bloom, and what to do about it", CASFER Convergence Research Seminar Series, October 6, 2023 (virtual)
  • Zhang, H.,* 2023 "Modeling environmentally relevant chemical reactions with machine learning", Department of Chemical and Environmental Engineering, University of Cincinnati, Cincinnati, OH. September 22, 2023
  • Zhang, H.,* 2023 "Modeling environmentally relevant chemical reactions with machine learning", Department of Civil and Environmental Engineering, Colorado School of Mines. September 15, 2023 (virtual)
  • Zhang, H.,* 2023 "Building mechanistically sound machine learning models for environmental applications", The 20th Beijing Conference and Exhibition on Instrumental Analysis (BCEIA) 2023, September 6-8, 2023 (virtual).
  • Zhang, H.,* Kuan Huang, Haiping Ai, and Kai Zhang 2023 "Machine learning modeling of environmentally relevant chemical reactions for organic compounds", Symposium on "Machine Learning, Artificial Intelligence and Big Data in Geochemistry", Division of Geochemistry, 2023 Fall ACS National Meeting, August 13-17, 2023. San Francisco, CA.
  • Zhang, H.,* 2023 "How to build mechanistically sound machine learning models?", Journal of Eco-Environment & Health invited talk series. August 17, 2023 (virtual)
  • Zhang, H.,* 2023 "Modeling environmentally relevant chemical reactions with machine learning", Department of Chemical and Environmental Engineering, University of California Riverside, Riverside, CA. August 10, 2023
  • Zhang, H.,* 2023 "Modeling Transformation Kinetics of POPs during Water/wastewater Treatment using Machine Learning", Department of Environmental Sciences, University of Chinese Academy of Science, China. June 20, 2023 (virtual)
  • Zhang, H.,* 2023 "Modeling POPs Exposure in the Environment using Machine Learning", Research Center for Eco-Environmental Sciences, University of Chinese Academy of Science, China. June 20, 2023 (virtual)
  • Zhang, H.,* 2023 "Latest machine learning modeling of environmentally relevant reactions", School of Environmental Science and Engineering, Shanghai JiaoTong University, China. June 17, 2023
  • Zhang, H.,* 2023 "Latest machine learning modeling of environmentally relevant reactions", School of Ecology and Environment, East China Normal University, Shanghai, China. June 16, 2023
  • Zhang, H.,* 2023 "Latest machine learning modeling of environmentally relevant reactions", Department of Environmental Science, University of Science & Technology of China, HeFei, China. May 30, 2023
  • Zhang, H.,* 2023 "Latest machine learning modeling of environmentally relevant reactions", School of the Environment, Nanjing University, Nanjing, China. May 29, 2023
  • Zhang, H.,* 2023 "Machine learning modeling of environmentally relevant reactions", The 2nd Greater Bay Area Symposium on Separation and Purification Technology (GBA-SPT 2023), Hong Kong, China. May 19-22, 2023 (virtual)
  • Zhang, H.,* 2023 "Machine learning modeling in environmental science and engineering", FESE Youth Forum on AI and Environmental Science. May 12, 2023 (virtual)
Changes/Problems: As stated in last year's annual report, we have spent enough time to fully embrace the emerging machine learning tools. Now that we have more than five years of experience using these tools and have found them extremely powerful, we have relied on them more heavily than we originally planned in the proposal. As a result, we conducted more data curation and machine learning modeling, and used experimental approaches only when needed to build up comprehensive datasets. Despite these changes, the objectives of the project remained the same. We mainly shifted the approaches toward the emerging machine learning modeling and away from the traditional QSAR approach. We were also able to recruit five PhD students to work on the project last year, so we were able to catch up significantly with the proposed plans. With these students on board, we are confident that we can not only accomplish what we originally proposed, but also achieve much more in terms of machine learning modeling of contaminant degradation and adsorption, as either single solutes or solute mixtures.
What opportunities for training and professional development has the project provided? We have involved five PhD students (two female) in the project, which has given them training toward their professional development. PI Zhang also incorporated the project findings and the relevant background knowledge into two of her courses: Intro to Environmental Engineering (for juniors and seniors) and Environmental Engineering Chemistry (for graduate students and seniors). One female undergraduate and one low-income high school student also worked on the project over the summer of 2023, on contaminant adsorption experiments.
How have the results been disseminated to communities of interest? PI Zhang has been invited to give 18 talks related to this project in the past year to a broad range of audiences, ranging from national and international conferences such as ACS national meetings, to the NSF CASFER Engineering Center seminar series, to reputable universities in the US and China, and to journal webinars.
What do you plan to do during the next reporting period to accomplish the goals? All five PhD students will continue their current research as detailed above and submit their manuscripts for peer-reviewed journal publication, namely, comprehensive machine learning models for anaerobic biodegradation of organic contaminants in digestion sludge, biodegradation of polymers, soil sorption of hundreds of organic compounds, post-treatment of mixed CECs by GACs, and oxidation of CECs by AOPs.

Impacts
(N/A)

Publications

  • Type: Journal Articles Status: Awaiting Publication Year Published: 2024 Citation: Cheng, Yushu, Kai Zhang, Kuan Huang and Zhang, H.*; 2024 Meta-Analysis and Machine Learning Models for Predicting Anaerobic Biodegradation Rates of Organic Contaminants in Sediments and Sludge, Environ. Sci. Technol., Revised version with editor. (2024 C. Ellen Gonter Research Paper Award)
  • Type: Journal Articles Status: Other Year Published: 2024 Citation: Lin, Chengrui; Kuan Huang; and Zhang, H.*; 2024 Predictive models for polymer biodegradation kinetics, Environ. Sci. Technol. In preparation
  • Type: Journal Articles Status: Other Year Published: 2024 Citation: Zheng, Zihan; Kai Zhang; Peiling Yu; and Zhang, H.*; 2024 Bi-solute adsorption modeling with active learning: A new strategy toward estimating multi-solute adsorption capacities on GACs, Environ. Sci. Technol. In preparation.
  • Type: Journal Articles Status: Other Year Published: 2024 Citation: Gao, Yidan and Zhang, H.*; 2024 Early-stage abiotic humification: Understanding the roles of MnO2 and oxygen in catechol oxidation under alkaline air conditions, Environ. Sci. Technol., In preparation
  • Type: Journal Articles Status: Other Year Published: 2024 Citation: Jiachun Sun, Kai Zhang, and Zhang, H.*; 2024 Predicting Adsorption of Diverse Organic Compounds in Soil-Water Systems: Meta-analysis, Predictive Models, and Global Soil Mapping, Environ. Sci. Technol., In preparation
  • Type: Journal Articles Status: Under Review Year Published: 2024 Citation: Siddiquee, Mashuk, Dae-Wook Kong, Yushu Cheng, H. Zhang, Bridget Hegarty, Sara Mashhadi Nejad, Mohammad Mahmudul Hassan, Youngwoo Seo, 2024 Application of machine learning through multi-omics data to understand microbial dynamics in natural and engineering systems, ACS ES&T Water, In revision
  • Type: Theses/Dissertations Status: Published Year Published: 2024 Citation: Mechanism-based or machine learning-based kinetic modeling for abiotic chemical transformation with Fe(II) or Mn oxides, Yidan Gao, 2024, Case Western Reserve University.


Progress 05/01/22 to 04/30/23

Outputs
Target Audience: PI Zhang has been invited to present her research related to this project at 13 local, national and international institutions and conferences:
1. Zhang, H.,* 2023 "Machine learning modeling of environmental reactions of different sample sizes", 10th National Conference of the Society of Toxicology, China. April 8-11, 2023 (virtual)
2. Zhang, H.,* 2022 "Modeling environmentally relevant chemical separation and purification with machine learning", ISPT 10th International Congress on Separation and Purification Technology. December 11-14, 2022 (virtual)
3. Zhang, H.,* 2022 "Water...From Water Contamination to Lake Erie Blooms", Unitarian Universalist Congregation of Cleveland, Forums that Matter, Cleveland OH, November 6, 2022
4. Zhang, H.,* 2022 "Modeling environmental relevant chemical reactions with machine learning", Dalian Institute of Technology, China. November 10, 2022 (virtual)
5. Zhang, H.,* 2022 "Kinetic modeling of the transformation of contaminants in soil", King Abdullah University of Science and Technology (KAUST), Saudi Arabia. October 21-26, 2022
6. Zhang, H.,* 2022 "Machine Learning Modeling in Environmental Science and Engineering for Sustainable Development", 5th webinar (on-line) entitled "Machine learning in advanced sustainable materials", Northumbria University, UK. October 18, 2022 (virtual)
7. Zhang, H.,* 2022 "Machine Learning Modeling for the reactivity of contaminants in engineered and natural environments", Department of Ecology, Evolution, and Environmental Science, Cleveland State University, Cleveland OH. September 9, 2022.
8. Zhang, H.,* 2022 "How to model environmentally related chemical reactions with machine learning", Journal of Eco-Environment & Health invited talk series, Nanjing China. August 11, 2022 (virtual)
9. Zhang, H.,* 2022 "Machine Learning Modeling for the reactivity of contaminants in engineered and natural environments", 8th Annual CWRU/Tohoku Data Science Symposium, August 8-9, 2022. Cleveland OH
10. Zhang, H.,* 2022 "Machine Learning Modeling for the reactivity of contaminants in engineered and natural environments", Department of Chemistry, Southern Illinois University Edwardsville, IL. June 27, 2022.
11. Zhang, H.,* "Predictive models for redox reactions of organic compounds in natural and engineered environments", 2023 Mesilla Chemistry Workshop on Aqueous Solution/Oxide Interfaces, Mesilla, New Mexico, February 5-7, 2023
12. Zhang, H.,* "Machine Learning Modeling for the reactivity of contaminants in engineered and natural environments", 2020 Gordon Research Conference: Environmental Sciences: Water, Holderness School, Holderness, NH. June 19-24, 2022
13. Zhang, H.,* 2022 "Machine Learning: New Ideas and Tools in Environmental Science and Engineering", Journal Eco-Environment & Health invited talk series, Nanjing China. May 11, 2022 (virtual)
Changes/Problems: Since the pandemic, we have spent enough time to fully embrace the emerging machine learning tools. Now that we have more than four years of experience using these tools and have found them extremely powerful, we would like to rely on them more heavily than we originally planned in the proposal. As a result, we will conduct more data curation and machine learning modeling, and use experimental approaches only when needed to build up comprehensive datasets. Despite these changes, the objectives of the project remain the same. We mainly shifted the approaches toward the emerging machine learning modeling and away from the traditional QSAR approach.
What opportunities for training and professional development has the project provided? We have involved two female PhD students in the project, which has given them training toward their professional development. PI Zhang also incorporated the project findings and the relevant background knowledge into two of her courses: Intro to Environmental Engineering (for juniors and seniors) and Environmental Engineering Chemistry (for graduate students and seniors).
How have the results been disseminated to communities of interest? PI Zhang has been invited to give 13 talks related to this project in the past year to a broad range of audiences, ranging from people attending a local church in Cleveland, OH, to faculty and students at King Abdullah University of Science and Technology (KAUST) in Saudi Arabia, to virtual attendees at a workshop at Northumbria University in the UK, and to many virtual attendees in China and in the US.
What do you plan to do during the next reporting period to accomplish the goals? The first PhD student plans to finish the manuscript in the next couple of months so it can be submitted for peer review. She will continue to conduct experiments to quantify the biodegradation kinetics of all the selected CECs under different operating conditions. In this process, she will also need to first develop HPLC methods for compound detection. Once all the rate constants have been obtained, she will try to develop a predictive model that will allow us to predict the biodegradation kinetics of a large number of CECs under different conditions in AD. The second PhD student will continue to use the Group Contribution Method to estimate the rate constants for as many chemicals as possible. For those CECs whose rate constants cannot be estimated by this approach, she will measure their rate constants experimentally. The goal is to obtain a comprehensive dataset of chemical reactivity toward hydroxyl radicals so that we can build a powerful machine learning model for prediction purposes.

Impacts
What was accomplished under these goals? One Ph. D. student has finished manually collecting anaerobic biodegradation kinetics data for a variety of organic compounds from about 200 journal articles. The goal is to develop a predictive model to allow prediction of anaerobic degradation kinetics in AD of all possible organic contaminants, including the ones originally proposed in Objectives 1 - 3. Based on the dataset, the student has developed predictive models using both classical QSARs (quantitative structure-activity relationships) and emerging machine learning modeling approaches. She has also interpreted the models and defined their applicability domains. With all these findings, she has written a manuscript to be published as a peer-reviewed journal article. We have revised the manuscript a few times and are in the last stage of polishing it.
We have been invited to co-write a review article on "Application of machine learning through multi-omics data to understand microbial dynamics in natural and engineering systems" by a research group from the University of Toledo. We are in charge of writing about the microbial dynamics during biodegradation of all studied pollutants, including anaerobic biodegradation of CECs. The above PhD student and I have finished our section and the review article is to be submitted for peer review in early May 2023.
With help from co-PI Zhan and one of his students, the above student has learned to set up batch reactors for anaerobic digestion. After running the reactors for a few rounds, she is now able to set up these reactors for different compounds under a variety of conditions, including different temperatures, sludge loadings, the presence of electron acceptors, and different compound loadings.
Another first-year PhD student was recruited to work on the AOP part of the project. She has learned how to set up reactors for measuring the reaction rate constants of a few chemicals toward hydroxyl radicals. However, she found that it was very time-consuming and labor-intensive to do experiments for a large number of chemicals. She therefore learned the Group Contribution Approach and started to use it to estimate the rate constants for about 260 organic compounds.
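Group contribution estimates are often built by summing, over the reactive sites of a molecule, a base group rate constant scaled by factors for neighboring substituents. The report does not specify which formulation the student used, so the Python sketch below is only an assumed form of the idea, with hypothetical function names and made-up placeholder values rather than fitted parameters.

```python
from math import log10, prod


def site_rate_constant(base_k, neighbor_factors):
    """One reactive site: base group value scaled by each neighbor factor."""
    return base_k * prod(neighbor_factors)


def overall_rate_constant(sites):
    """Sum the contributions of all reactive sites in a molecule."""
    return sum(site_rate_constant(base_k, factors) for base_k, factors in sites)


# Hypothetical molecule with two reactive sites.
# All numbers are placeholders (nominal units M^-1 s^-1), not measured values.
example_sites = [
    (1.0e8, [0.5, 1.0]),  # site with two neighboring substituents
    (5.0e8, []),          # site with no neighbor correction
]
k_total = overall_rate_constant(example_sites)
log_k_total = log10(k_total)
```

Values estimated this way could then feed a dataset of log k values for model training, alongside experimentally measured rate constants.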

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2022 Citation: Zhang, H.,* 2022 Modeling environmentally relevant chemical separation and purification with machine learning, ISPT 10th International Congress on Separation and Purification Technology. December 11-14, 2022 (virtual)
  • Type: Conference Papers and Presentations Status: Published Year Published: 2022 Citation: Zhang, H.,* 2022 Machine Learning Modeling in Environmental Science and Engineering for Sustainable Development, 5th webinar (on-line) entitled Machine learning in advanced sustainable materials, Northumbria University, UK. October 18, 2022 (virtual)
  • Type: Conference Papers and Presentations Status: Published Year Published: 2022 Citation: Zhang, H.,* 2022 Machine Learning Modeling for the reactivity of contaminants in engineered and natural environments, 8th Annual CWRU/Tohoku Data Science Symposium, August 8-9, 2022. Cleveland OH
  • Type: Conference Papers and Presentations Status: Published Year Published: 2022 Citation: Zhang, H.,* Machine Learning Modeling for the reactivity of contaminants in engineered and natural environments, 2020 Gordon Research Conference: Environmental Sciences: Water, Holderness School, Holderness, NH. June 19-24, 2022
  • Type: Conference Papers and Presentations Status: Published Year Published: 2022 Citation: Cheng, Yushu, Kai Zhang, and Zhang, H.,* 2022 Machine learning models for predicting anaerobic biodegradation of organic contaminants, 8th Annual CWRU/Tohoku Data Science Symposium, August 8-9, 2022. Cleveland OH
  • Type: Journal Articles Status: Other Year Published: 2023 Citation: Cheng, Yushu, Kai Zhang, and Zhang, H.*; 2023 Comprehensive datasets and predictive models for anaerobic biodegradation of organic compounds, Environ. Sci. Technol., In preparation
  • Type: Journal Articles Status: Other Year Published: 2023 Citation: Siddiquee, Mashuk, Dae-Wook Kong, Yushu Cheng, H. Zhang, Bridget Hegarty, Sara Mashhadi Nejad, Mohammad Mahmudul Hassan, Youngwoo Seo, 2023 Application of machine learning through multi-omics data to understand microbial dynamics in natural and engineering systems, ACS ES&T Water, In preparation
  • Type: Conference Papers and Presentations Status: Published Year Published: 2023 Citation: Zhang, H.,* Predictive models for redox reactions of organic compounds in natural and engineered environments, 2023 Mesilla Chemistry Workshop on Aqueous Solution/Oxide Interfaces, Mesilla, New Mexico, February 5-7, 2023


Progress 05/01/21 to 04/30/22

Outputs
Target Audience: Upon the publication of the article entitled "Shedding light on "Black Box" machine learning models for predicting the reactivity of HO• radicals toward organic compounds" (Chem. Eng. J., 405, 126627) and a student's Ph. D. dissertation in 2021, the PI was invited to give 10 talks at international conferences and institutions on this research:
1. Zhang, H.,* 2021, School of Environment, Nankai University, China, December 2, 2021. (virtual presentation)
2. Zhang, H.,* 2021, Research Center for Eco-Environmental Sciences (RCEES), Chinese Academy of Science, China, November 17, 2021. (virtual presentation)
3. Zhang, H.,* 2021, Department of Civil and Environmental Engineering, University of Colorado Boulder, October 22, 2021. (virtual presentation)
4. Zhang, H.,* 2021, Department of Computer and Data Science, Case Western Reserve University, October 21, 2021.
5. Zhang, H.,* 2021, East China Normal University, China, October 19, 2021. (virtual presentation)
6. Zhang, H.,* 2021, School of Civil Engineering, Huanqiao University, China, October 6, 2021. (virtual presentation)
7. Zhang, H.,* 2021, The 7th National Conference on Ecotoxicology, Guilin, China, September 28, 2021. (virtual presentation)
8. Zhang, H.,* 2021, School of Environment, Nanjing University, China, September 22, 2021. (virtual presentation)
9. Zhang, H.,* 2021, Department of Environmental Science & Engineering, Fudan University, China, September 17, 2021. (virtual presentation)
10. Zhang, H.,* 2021, Fifth International Symposium for Persistent, Bioaccumulating and Toxic Substances (IJRC-PTS 2021), Beijing, China, July 26-28, 2021. (virtual presentation)

Changes/Problems: The PI and the team started the first year of the project in 2020 during the pandemic, so the majority of the work was switched to modeling AOPs and anaerobic biodegradation kinetics. In the second year, the project team chose not to interrupt the modeling work because they wanted to finish and publish it before returning to the other originally proposed objectives, so modeling remained the focus. Now that the modeling work is nearly complete, the team will start the experimental portion of the project in the third year.

What opportunities for training and professional development has the project provided? Two Ph. D. students (one of whom is female) have been involved in this project, one on AOPs and the other on AD.

How have the results been disseminated to communities of interest?
1. The PI has given ten invited talks at international conferences and institutions:
   1. Zhang, H.,* 2021, School of Environment, Nankai University, China, December 2, 2021. (virtual presentation)
   2. Zhang, H.,* 2021, Research Center for Eco-Environmental Sciences (RCEES), Chinese Academy of Science, China, November 17, 2021. (virtual presentation)
   3. Zhang, H.,* 2021, Department of Civil and Environmental Engineering, University of Colorado Boulder, October 22, 2021. (virtual presentation)
   4. Zhang, H.,* 2021, Department of Computer and Data Science, Case Western Reserve University, October 21, 2021.
   5. Zhang, H.,* 2021, East China Normal University, China, October 19, 2021. (virtual presentation)
   6. Zhang, H.,* 2021, School of Civil Engineering, Huanqiao University, China, October 6, 2021. (virtual presentation)
   7. Zhang, H.,* 2021, The 7th National Conference on Ecotoxicology, Guilin, China, September 28, 2021. (virtual presentation)
   8. Zhang, H.,* 2021, School of Environment, Nanjing University, China, September 22, 2021. (virtual presentation)
   9. Zhang, H.,* 2021, Department of Environmental Science & Engineering, Fudan University, China, September 17, 2021. (virtual presentation)
   10. Zhang, H.,* 2021, Fifth International Symposium for Persistent, Bioaccumulating and Toxic Substances (IJRC-PTS 2021), Beijing, China, July 26-28, 2021. (virtual presentation)
2. One Ph. D. student presented his findings at the 2021 ACS spring national meeting:
   Zhong, Shifa, Jiajie Hu, Xiong Yu and Zhang, H.; 2021 "Molecular image-convolutional neural network (CNN) assisted QSAR models for predicting contaminant reactivity toward OH radicals: Transfer learning, data augmentation and model interpretation", 261st ACS National Meeting & Exposition, Division of Environmental Chemistry (virtual meeting), April 5-30, 2021.
   Zhong, Shifa, Kai Zhang, Dong Wang and Zhang, H.; 2021 "Shedding light on "Black Box" machine learning models for predicting the reactivity of HO• radicals toward organic compounds", 261st ACS National Meeting & Exposition, Division of Environmental Chemistry (virtual meeting), April 5-30, 2021.

What do you plan to do during the next reporting period to accomplish the goals? The PI tried to recruit a postdoctoral associate in 2021 to work on the proposed AD tasks, but the recruitment did not work out. This year, the PI therefore plans to rely solely on two Ph. D. students who have been recruited to continue the project:
1. The current Ph. D. student will continue her work on the predictive modeling of anaerobic biodegradation. Once the model has been obtained and published as a peer-reviewed journal article, this student will combine experimental and modeling approaches to examine the anaerobic digestion of the proposed veterinary pharmaceuticals. Specifically, organic contaminants will first be selected to represent the chemical space as broadly as possible; the biodegradation kinetics of these chemicals under anaerobic conditions will then be measured experimentally; and the resulting kinetic data will be combined with the dataset previously collected from the literature to further develop the predictive models.
2. A new Ph. D. student has been recruited to start in the fall of 2022 to continue the work on AOPs. The machine learning models obtained so far are still limited by the size of the training dataset, so their applicability domain is not wide enough. To extend the models to a wide range of organic contaminants, including the ones proposed in the project, the student will conduct bench-top experiments on selected chemicals to measure their reactivity toward hydroxyl radicals. Toward this goal, advanced analytical methods will first be established in the lab so that the degradation kinetics of these chemicals can be monitored. The new data will then be combined with the original dataset collected from the literature to develop a much more robust, widely applicable model.

Impacts
What was accomplished under these goals? Continuing last year's modeling effort so that the results can be published, one Ph. D. student developed machine learning-based predictive models for the reactivity of 1000+ organic compounds toward the hydroxyl radical (one of the most reactive and widely occurring reactive species in AOPs) and four other common oxidants: sulfate radicals, HClO, ozone and ClO2. Model interpretation was subsequently conducted to verify that the models follow known oxidation mechanisms and can therefore be trusted. After a comprehensive literature review, we decided not to build a predictive model for permanganate because only a small number of organic chemicals have reported reactivity toward permanganate. Another Ph. D. student continued to manually collect anaerobic biodegradation kinetics data for a variety of organic compounds from about 200 journal articles. Now that the data collection is almost complete, the student has started to develop predictive models using both classical QSARs (quantitative structure-activity relationships) and emerging machine learning approaches. The goal is a predictive model for the anaerobic degradation kinetics in AD of all possible organic contaminants, including the ones originally proposed in Objectives 1 - 3.

Publications

  • Type: Journal Articles Status: Other Year Published: 2022 Citation: Cheng, Y.; K. Zhang; K. Huang; H. Zhang; 2022 "Machine learning modeling of anaerobic biodegradation of organic contaminants in the environment", In preparation.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2021 Citation: Zhang, H.,* 2021 Machine Learning: New Ideas and Tools in Environmental Science and Engineering, The 7th National Conference on Ecotoxicology, Guilin, China, September 28, 2021. (virtual presentation)
  • Type: Conference Papers and Presentations Status: Published Year Published: 2021 Citation: Zhang, H.,* 2021 Fifth International Symposium for Persistent, Bioaccumulating and Toxic Substances (IJRC-PTS 2021), Beijing, China, July 26-28, 2021. (virtual presentation)
  • Type: Conference Papers and Presentations Status: Published Year Published: 2021 Citation: Zhong, Shifa, Kai Zhang, Dong Wang and Zhang, H.; 2021 Shedding light on "Black Box" machine learning models for predicting the reactivity of HO• radicals toward organic compounds, 261st ACS National Meeting & Exposition, Division of Environmental Chemistry (virtual meeting), April 5-30, 2021.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2021 Citation: Zhong, Shifa, Jiajie Hu, Xiong Yu and Zhang, H.; 2021 Molecular image-convolutional neural network (CNN) assisted QSAR models for predicting contaminant reactivity toward OH radicals: Transfer learning, data augmentation and model interpretation, 261st ACS National Meeting & Exposition, Division of Environmental Chemistry (virtual meeting), April 5-30, 2021.


Progress 05/01/20 to 04/30/21

Outputs
Target Audience: Published a peer-reviewed journal article to reach environmental scientists and engineers:
Zhong, Shifa, Kai Zhang, Dong Wang and Zhang, H.*; 2021 "Shedding light on "Black Box" machine learning models for predicting the reactivity of HO• radicals toward organic compounds", Chem. Eng. J., 405, 126627.
Gave two invited talks to international and regional professionals:
Zhang, H.,* 2020 First DICP Artificial Intelligence Symposium, Dalian, China, Dec 7-8, 2020.
Zhang, H.,* "Oxidative water treatment using persulfates: Galvanic oxidation processes, bromide removal, and QSARs", Department of Chemical and Environmental Engineering, University of Cincinnati, Cincinnati, OH, September 25, 2020.
Student presentation at the 2021 ACS spring national meeting:
Zhong, Shifa, Kai Zhang, Dong Wang and Zhang, H.; 2021 "Shedding light on "Black Box" machine learning models for predicting the reactivity of HO• radicals toward organic compounds", 261st ACS National Meeting & Exposition, Division of Environmental Chemistry (virtual meeting), April 5-30, 2021.

Changes/Problems: The pandemic restricted our access to the lab during a large portion of the year, so we decided to focus on the modeling work. This turned out to work quite well, allowing us to develop comprehensive machine learning-based predictive models for both AOPs and anaerobic biodegradation.

What opportunities for training and professional development has the project provided? Two Ph. D. students (one of whom is female) have been involved in this project, one on AOPs and the other on AD.

How have the results been disseminated to communities of interest?
1. The PI has given two invited talks, one at an international symposium and one at a university seminar:
   Zhang, H.,* 2020 First DICP Artificial Intelligence Symposium, Dalian, China, Dec 7-8, 2020.
   Zhang, H.,* "Oxidative water treatment using persulfates: Galvanic oxidation processes, bromide removal, and QSARs", Department of Chemical and Environmental Engineering, University of Cincinnati, Cincinnati, OH, September 25, 2020.
2. One Ph. D. student presented his findings at the 2021 ACS spring national meeting:
   Zhong, Shifa, Kai Zhang, Dong Wang and Zhang, H.; 2021 "Shedding light on "Black Box" machine learning models for predicting the reactivity of HO• radicals toward organic compounds", 261st ACS National Meeting & Exposition, Division of Environmental Chemistry (virtual meeting), April 5-30, 2021.
3. One Ph. D. student (Shifa Zhong) defended his dissertation virtually on April 6, 2021, to a group of local and national professionals.

What do you plan to do during the next reporting period to accomplish the goals?
1. To recruit a postdoctoral associate to start the proposed work on AD.
2. To recruit another Ph. D. student to continue the work on AOPs.
3. To have the current Ph. D. student continue her work on the predictive modeling of anaerobic biodegradation.

Impacts
What was accomplished under these goals? Because of the pandemic, we had limited access to the lab space, so we focused on the modeling effort in Objective 4. Specifically, one Ph. D. student mined the literature for all available rate constants for the reactions of organic compounds with the hydroxyl radical (one of the most reactive and widely occurring reactive species in AOPs) and developed a machine learning-based predictive model. Model interpretation was subsequently conducted to verify that the model follows known oxidation mechanisms and can therefore be trusted. Another Ph. D. student has been collecting anaerobic degradation kinetics data for a variety of organic compounds. The goal is to use machine learning tools to develop a predictive model for the anaerobic degradation kinetics in AD of all possible organic contaminants, including the ones originally proposed in Objectives 1 - 3.

Publications

  • Type: Journal Articles Status: Published Year Published: 2020 Citation: Zhong, Shifa, Kai Zhang, Dong Wang and Zhang, H.*; 2021 Shedding light on "Black Box" machine learning models for predicting the reactivity of HO• radicals toward organic compounds, Chem. Eng. J., 405, 126627.
  • Type: Theses/Dissertations Status: Submitted Year Published: 2021 Citation: Zhong, Shifa 2021, "Permanganate reaction kinetics and mechanisms and machine learning application in oxidative water treatment", Ph. D. dissertation, Case Western Reserve University, Cleveland OH