Progress 06/01/23 to 05/31/24
Outputs Target Audience:The research results have been presented at academic conferences in several disciplines, including environmental engineering, agriculture, and industrial and systems engineering. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?Two graduate students were involved in the final period of this project. How have the results been disseminated to communities of interest?The results have been presented at several conferences (see the previous section), as well as in invited seminars at different institutions. Some of these invited seminars (primarily by PI Lan) are listed as follows. Invited speaker, "Uniform Optimality for Convex and Nonconvex Optimization", Department of Statistics and Data Science, University of Pennsylvania, December 7, 2023. Invited speaker, "Uniform Optimality for Convex and Nonconvex Optimization", Department of Industrial Engineering & Operations Research, University of California at Berkeley, November 28, 2023. Invited speaker, "Policy Optimization over General State and Action Spaces", Department of Mathematics & Statistics, University at Albany, April 6, 2023. What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
In the final period of this project, the PIs investigated the application of stochastic and robust optimization to controlled environment agriculture (CEA) systems. The major accomplishments have been published in Environmental Science & Technology and Nature Communications Engineering, among other leading journals in the field.
Publications
- Type:
Journal Articles
Status:
Submitted
Year Published:
2024
Citation:
Gerry Chen, Sunil Kumar Narayanan, Thomas Gautier Ottou, Benjamin Missaoui, Harsh Muriki, Cédric Pradalier, Yongsheng Chen, "Hyperspectral Neural Radiance Fields", arXiv:2403.14839
- Type:
Journal Articles
Status:
Published
Year Published:
2024
Citation:
Huang, Y., Afolabi, M.A., Lan, G., Liu, S., and Chen, Y. (2024) MXene-Coated Ion-Selective Electrode Sensors for Highly Stable and Selective Lithium Dynamics Monitoring, Environmental Science & Technology, https://doi.org/10.1021/acs.est.3c06235
- Type:
Journal Articles
Status:
Published
Year Published:
2024
Citation:
Wiley Helm*, Shifa Zhong*, Elliot Reid*, Thomas Igou*, Yongsheng Chen (2024), Development of gradient boosting-assisted machine learning data-driven model for free chlorine residual prediction, Frontiers of Environmental Science & Engineering, 18 (2), DOI: 10.1007/s11783-024-1777-6
- Type:
Journal Articles
Status:
Published
Year Published:
2023
Citation:
Moyosore A Afolabi, Dequan Xiao, Yongsheng Chen, "The Impact of Surface Chemistry and Synthesis Conditions on the Adsorption of Antibiotics onto MXene Membranes", Molecules, 29(1): 148
- Type:
Journal Articles
Status:
Published
Year Published:
2023
Citation:
Jiahao He, Yongsheng Chen, "Optimization of antibiotic analysis in leafy vegetables by simple solid-phase extraction and liquid chromatography triple quadrupole mass spectrometry", Environmental Advances, 15: 100463
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2023
Citation:
Chen, Y.S. Membrane-Based Minus Approach to Shift Water Treatment Paradigm, IWA-MTC 2023, St. Louis, July 23-26, 2023
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2024
Citation:
Chen, Y.S. Coupling Resource Recovery with Digital Agriculture and AI to Enhance Urban Sustainability and Resilience, Digital Agriculture Conference at UIUC Center for Digital Agriculture, March 5-7, 2024
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2024
Citation:
Chen, Y.S. Decentralized Municipal Wastewater Treatment to Recover Resources for Urban Food Production Using Controlled Environment Agriculture, Boston, MA, USA, March 11-13, 2024
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2024
Citation:
Chen, Y.S. Machine Learning Assisted Membrane Fabrication for Fit-for-Purpose Water Reuse and Resource Recovery, Chemical Separations: Gordon Research Conference, Grand Galvez, January 21-26, 2024
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2024
Citation:
Chen, Y.S. Transforming Urban Infrastructure: Innovative Methods for Drinking Water Treatment, Wastewater Resource Recovery, and Decentralized Precision Digital Agriculture to Boost Urban Sustainability and Resilience, Department of Civil and Environmental Engineering at UIUC, March 5-7, 2024
Progress 06/01/22 to 05/31/23
Outputs Target Audience:The research results have been presented to researchers and students in optimization, machine learning, and environmental engineering, primarily through the invited talks listed under the dissemination question below. Changes/Problems:We have asked for a one-year no-cost extension due to the delay in the collaboration with the pilot plant. The pilot plant was not completed on time due to the COVID situation of the past few years. What opportunities for training and professional development has the project provided?A few Ph.D. students, including Ji Gao, Caleb Ju, Georgios Kotsalis, and Abigael Wahlen, are involved in this project. Georgios Kotsalis graduated with a Ph.D. degree in August 2022. How have the results been disseminated to communities of interest?PI Lan has given the following invited talks related to this project. 1. Semi-plenary speaker, "Policy Mirror Descent for Reinforcement Learning", The Seventh International Conference on Continuous Optimization (ICCOPT) and the Modeling and Optimization: Theory and Applications (MOPTA) conference, Bethlehem, PA, July 28, 2022. 2. Invited speaker, "First-order Policy Optimization for Robust Markov Decision Process", Statistics and Optimization in Data Science Workshop, Daniels School of Business, Purdue University, May 26, 2023. 3. Invited speaker, "Policy Optimization over General State and Action Spaces", SIAM Optimization Conference, Seattle, WA, June 2, 2023. 4. Invited speaker, "Policy Optimization over General State and Action Spaces", Department of Mathematics & Statistics, University at Albany, April 6, 2023. 5. Invited speaker, "Dual dynamic programming for stochastic programs over an infinite horizon", 2023 International Workshop on Stochastic Optimization (virtual), Nanjing, China, April 27, 2023. 6. Invited speaker, "Policy Optimization over General State and Action Spaces", PanOptiC Workshop, University of Florida, March 9, 2023. 7. Invited speaker, "Policy Optimization over General State and Action Spaces", Institute of Mathematical Sciences, National University of Singapore, December 8, 2022. 8. Invited seminar, "Policy Mirror Descent for Reinforcement Learning", 1W-MINDS, October 27, 2022. 9. Invited panelist, Operations Research and Artificial Intelligence, 2022 annual INFORMS meeting, Indianapolis, IN, October 18, 2022. 10. Invited speaker, Nonconvex Stochastic Optimization Methods, 2022 INFORMS ICS Prize session, Indianapolis, IN, October 17, 2022. 11. Invited seminar, "Policy mirror descent for reinforcement learning", Statistical Inference and Convex Optimization workshop, Autrans, Vercors, France, June 14, 2022. What do you plan to do during the next reporting period to accomplish the goals?In the next period, the PI will continue the development of numerical methods for stochastic optimal control. Together with the collaborators, we will apply these techniques to controlled environment agriculture systems.
Impacts What was accomplished under these goals?
In this period, we made significant progress on Thrust 3 and Thrust 4.

In Thrust 3, we developed a model-free first-order method for the linear quadratic regulator with optimal sampling complexity. We consider the classic stochastic linear quadratic regulator (LQR) problem under an infinite-horizon average stage cost. By leveraging recent policy gradient methods from reinforcement learning, we obtain a first-order method that finds a stable feedback law whose objective function gap to the optimum is at most ε with high probability using O(1/ε) samples. Our proposed method seems to have the best dependence on ε within the model-free literature without the assumption that all policies generated by the algorithm are stable almost surely, and it matches the best-known rate from the model-based literature, up to logarithmic factors. The improved dependence on ε is achieved by showing that the accuracy scales with the variance rather than the standard deviation of the gradient estimation error. The developments that yield this improved sampling complexity fall in the category of actor-critic algorithms: the actor part involves a variational inequality formulation of the stochastic LQR problem, while in the critic part we utilize a conditional stochastic primal-dual method and show that the algorithm has the optimal rate of convergence when paired with a shrinking multi-epoch scheme.

Moreover, we developed a dual dynamic programming algorithm for solving stochastic programs over an infinite horizon. We show non-asymptotic convergence results when using an explorative strategy, and we then strengthen this result by reducing the dependence on the effective planning horizon from quadratic to linear. This improvement is achieved by combining the forward and backward phases of dual dynamic programming into a single iteration. We then apply our algorithms to a class of problems called hierarchical stationary stochastic programs, where the cost function is itself a stochastic multi-stage program. The hierarchical program can model problems with a hierarchy of decision-making, e.g., how long-term decisions influence day-to-day operations. We show that when the subproblems are solved inexactly via a dynamic stochastic approximation-type method, the resulting hierarchical dual dynamic programming method can find approximately optimal solutions in finite time. Preliminary numerical results show the practical benefits of using the explorative strategy for solving the Brazilian hydro-thermal planning problem and economic dispatch, as well as the potential to exploit parallel computing. Minimal illustrative sketches of both the model-free LQR idea and the dual dynamic programming idea are given below.
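To make the model-free flavor of the LQR result concrete, here is a minimal sketch of a two-point zeroth-order policy-gradient step on a small LQR instance. The system matrices, initial gain, smoothing radius, step size, and iteration counts are illustrative assumptions rather than values from the paper, and this simple perturbation estimator stands in for the more refined actor-critic scheme analyzed in our work.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # assumed dynamics (treated as unknown)
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), 0.1 * np.eye(1)

def avg_cost(K, T=200, noise=0.01):
    """Average stage cost of the feedback law u = -Kx along one noisy rollout."""
    x, total = np.array([1.0, 0.0]), 0.0
    for _ in range(T):
        u = -K @ x
        total += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + noise * rng.standard_normal(2)
    return total / T

K = np.array([[0.5, 0.5]])               # assumed initial stabilizing gain
r, lr = 0.05, 1e-3                       # smoothing radius and step size
for _ in range(300):
    U = rng.standard_normal(K.shape)
    U /= np.linalg.norm(U)
    # two-point zeroth-order estimate of the policy gradient
    g = (avg_cost(K + r * U) - avg_cost(K - r * U)) / (2.0 * r) * U
    K = K - lr * g
print("learned gain:", K, " avg cost:", avg_cost(K, noise=0.0))
```

The two-point estimator differences the rollout costs of two perturbed gains, which is what allows the method to operate without knowledge of the system matrices.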
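The next sketch illustrates the cutting-plane mechanism behind dual dynamic programming on a toy one-dimensional storage problem, with cut generation folded into the forward pass in the spirit of the combined forward/backward iteration described above. The storage model, the sampled inflows, and the finite-difference cut computation (used here in place of subproblem duals) are all illustrative assumptions, not the algorithm analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, price = 0.9, 2.0
cuts = [(0.0, 0.0)]                      # affine lower bounds (slope, intercept)

def V_hat(x):
    """Current cutting-plane lower approximation of the value function."""
    return max(a * x + b for a, b in cuts)

def stage(x, inflow):
    """One-stage subproblem at storage level x, solved by enumeration."""
    best_cost, best_next = np.inf, x
    for u in np.linspace(0.0, x + inflow, 41):       # candidate releases
        x_next = min(max(x + inflow - u, 0.0), 1.0)
        cost = price * (1.0 - u) ** 2 + gamma * V_hat(x_next)
        if cost < best_cost:
            best_cost, best_next = cost, x_next
    return best_cost, best_next

for _ in range(20):                      # combined forward/backward iterations
    x = 0.5
    for _ in range(5):
        inflow = rng.uniform(0.0, 0.3)   # sampled uncertainty
        eps = 1e-2                       # add a cut at the visited state
        c0, _ = stage(x, inflow)
        c1, _ = stage(min(x + eps, 1.0), inflow)
        slope = (c1 - c0) / eps
        cuts.append((slope, c0 - slope * x))
        _, x = stage(x, inflow)          # move forward along the trajectory
print("approximate value at x = 0.5:", V_hat(0.5))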
In Thrust 4, we developed reinforcement learning-based control for waste biorefining processes under uncertainty. Waste biorefining processes face significant challenges related to the variability of feedstocks: the supply and composition of multiple feedstocks can be uncertain, making it difficult to achieve economically feasible and sustainable waste valorization at large scale. We introduce a reinforcement learning-based framework that aims to control these uncertainties and improve the efficiency of the process. The framework is tested on an anaerobic digestion process and is found to perform better than traditional control strategies. In the short term, it achieves faster target tracking with increased precision and accuracy, while in the long term it shows adaptive and robust behavior even under additional seasonal supply variability, meeting downstream demand with high probability. This reinforcement learning-based framework offers a promising and scalable solution to address uncertainty issues in real-world biorefining processes. If implemented, it could contribute to sustainable waste management practices globally, making waste biorefining processes more economically viable and environmentally friendly. A toy sketch of the underlying control idea follows.
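As a toy illustration of the reinforcement-learning control idea, the sketch below trains a REINFORCE-style linear-Gaussian policy to drive a one-state surrogate digester model toward a target output under stochastic feed variability. The surrogate dynamics, reward, and hyperparameters are invented for illustration; the published framework uses a detailed anaerobic digestion model and a more sophisticated algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
target = 1.0                             # desired (normalized) digester output

def step(state, feed):
    """One step of an assumed first-order surrogate digester model."""
    noise = 0.05 * rng.standard_normal() # stochastic feedstock variability
    return 0.9 * state + 0.5 * feed + noise

theta = np.zeros(2)                      # linear-Gaussian policy parameters
sigma, lr = 0.1, 0.05                    # exploration noise and step size
for episode in range(2000):
    state, grads, rewards = 0.0, [], []
    for _ in range(20):
        mean = theta[0] * state + theta[1]
        feed = mean + sigma * rng.standard_normal()
        # score function d/d(theta) log pi(feed | state)
        grads.append((feed - mean) / sigma**2 * np.array([state, 1.0]))
        state = step(state, feed)
        rewards.append(-(state - target) ** 2)
    # vanilla REINFORCE: weight the episode's score by its return
    theta += lr * np.mean(rewards) * np.mean(grads, axis=0)
print("learned policy parameters:", theta)
```

This baseline-free update is unbiased but high-variance; it is meant only to show how a policy can adapt to feed variability from reward signals alone.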
Publications
- Type:
Journal Articles
Status:
Submitted
Year Published:
2022
Citation:
C. Ju, G. Kotsalis and G. Lan, A model-free first-order method for linear quadratic regulator with Õ(1/ε) sampling complexity, submitted to SIAM Journal on Control and Optimization, February 2023.
- Type:
Journal Articles
Status:
Submitted
Year Published:
2023
Citation:
J. Gao, A. Wahlen, C. Ju, Y. Chen, G. Lan, and Z. Tong, Reinforcement Learning-Based Control for Waste Biorefining Processes Under Uncertainty, submitted to Nature Communications Engineering, May 2023.
- Type:
Journal Articles
Status:
Other
Year Published:
2023
Citation:
C. Ju and G. Lan, Dual dynamic programming for stochastic programs over an infinite horizon, released on arXiv, March 2023.
Progress 06/01/21 to 05/31/22
Outputs Target Audience:The target audience of this project reached during this reporting period includes graduate and undergraduate students across different disciplines, including agriculture, environmental engineering, operations research, industrial engineering, and machine learning. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?This project has involved three Ph.D. students and one post-doctoral research fellow. They helped to conduct data analysis and experiments, build the data collection equipment, implement new models and algorithms, write research papers, and give presentations. How have the results been disseminated to communities of interest?The research results have been presented to the broader environmental engineering, optimization, and machine learning communities through numerous invited seminars. A few of them, given by PI Lan, are listed below. 1. Semi-plenary speaker, "Policy Mirror Descent for Reinforcement Learning", The Seventh International Conference on Continuous Optimization (ICCOPT) and the Modeling and Optimization: Theory and Applications (MOPTA) conference, Bethlehem, PA, July 28, 2022. 2. Tutorial speaker (1.5 hours), "Complexity of Stochastic Gradient Descent", INFORMS Applied Probability Society (APS) Tutorial Session, October 24, 2021. 3. Invited speaker (virtual), "Policy Mirror Descent for Reinforcement Learning", NESS-NextGen Data Science Day, November 6, 2021. 4. Invited seminar (virtual), "Fast convergence rates for online reinforcement learning", Intel Technical Seminar, January 7, 2022. 5. Invited seminar (virtual), "Stochastic Optimization Methods for Reinforcement Learning", University of North Carolina at Chapel Hill, November 21, 2021. 6. Invited seminar (virtual), "Stochastic Optimization for Reinforcement Learning", B-TU Cottbus-Senftenberg, October 20, 2021. What do you plan to do during the next reporting period to accomplish the goals?In the next period, we will further develop robust control and optimization methods, and apply the developed algorithms to the proposed CEA and related subjects.
Impacts What was accomplished under these goals?
To collect our data, we have constructed a prototype robot at a pilot site in Atlanta, GA, which can autonomously take photographs of roughly 50 lettuce plants from many viewpoints; a time-lapse video of the robot taking photos is available. Currently, we have data from two grow cycles, each containing roughly 50 plants. For each plant, we collect photos and the mass at harvest time. The first dataset contains only photos at harvest time, whereas the second dataset contains photos taken each day over the roughly four-week grow cycle. As a baseline approach for estimating plant mass, we use COLMAP to generate a dense 3D reconstruction of each lettuce plant, then voxelize it to estimate the plant volume and mass (a minimal sketch of this baseline is given below). For wastewater treatment, we have fabricated a novel nanofiltration membrane, using machine learning to recover nutrients and remove emerging contaminants; a number of papers were generated from this effort. As a methodological innovation, we developed a new and general class of algorithms, called policy mirror descent methods, for reinforcement learning. We show, for the first time in the literature, that policy gradient methods can achieve sample complexity guarantees competitive with, or even better than, those of other methods. We also developed novel algorithms for solving problems with function constraints. These methodologies will be further pursued in the application areas of controlled environment agriculture (CEA) and related fields.
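A minimal sketch of the volume-from-voxels baseline is shown below: occupied voxels of a reconstructed point cloud are counted and converted to a mass estimate. The voxel size, tissue density, and synthetic test cloud are illustrative assumptions; in practice the point cloud would be exported from the COLMAP reconstruction of a plant.

```python
import numpy as np

def estimate_mass(points: np.ndarray, voxel_m: float = 0.005,
                  density_kg_m3: float = 300.0) -> float:
    """Estimate plant mass from an (N, 3) point cloud given in metres."""
    # quantize points to a voxel grid and keep each occupied voxel once
    voxels = np.unique(np.floor(points / voxel_m).astype(np.int64), axis=0)
    volume_m3 = voxels.shape[0] * voxel_m ** 3
    return volume_m3 * density_kg_m3

# example with a synthetic spherical "plant" of radius 10 cm
pts = np.random.default_rng(0).uniform(-0.1, 0.1, size=(50_000, 3))
pts = pts[np.linalg.norm(pts, axis=1) < 0.1]
print(f"estimated mass: {estimate_mass(pts):.3f} kg")
```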
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
Abigail Cohen, Gerry Chen, Eli Berger, Sushmita Warrier, Guanghui Lan, Emily Grubert, Frank Dellaert, and Y.S. Chen (2021), Dynamically Controlled Environment Agriculture: Integrating Machine Learning and Mechanistic and Physiological Models for Sustainable Food Cultivation, ES&T Engineering, 2 (1): 3-19.
- Type:
Journal Articles
Status:
Published
Year Published:
2022
Citation:
Gao, Haiping; Zhong, Shifa; Zhang, Wenlong; Igou, Thomas; Berger, Eli; Reid, Elliot; Zhao, Yangying; Lambeth, Dylan; Gan, Lan; Afolabi, Moyosore; Tong, Zhaohui; Lan, Guanghui; Y. S. Chen (2022), "Revolutionizing Membrane Design Using Machine Learning-Bayesian Optimization", Environmental Science & Technology, 56 (4):2572-2578
- Type:
Journal Articles
Status:
Published
Year Published:
2022
Citation:
Yangying Zhao, Xin Tong, Juhee Kim, Tiezheng Tong, Ching-Hua Huang, and Y. S. Chen (2022), Capillary-Assisted Fabrication of Thin-Film Nanocomposite Membranes for Improved Solute-Solute Separation, Environmental Science & Technology, 56 (9):5849-5859
- Type:
Journal Articles
Status:
Published
Year Published:
2022
Citation:
Kim, Juhee, Xiaoyue Xin, Bethel T. Mamo, Gary L. Hawkins, Ke Li, Yongsheng Chen, Qingguo Huang, and Ching-Hua Huang. "Occurrence and Fate of Ultrashort-Chain and Other Per- and Polyfluoroalkyl Substances (PFAS) in Wastewater Treatment Plants." ACS ES&T Water (2022)
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
X Tong, S Liu, Y Zhao, Y.S. Chen, J Crittenden (2021), Influence of the Exclusion-Enrichment Effect on Ion Transport in Two-Dimensional Molybdenum Disulfide Membranes, ACS Applied Materials & Interfaces, 13:26904-26914
- Type:
Journal Articles
Status:
Accepted
Year Published:
2022
Citation:
G Lan (2022), "Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes", Mathematical Programming, online-first.
- Type:
Journal Articles
Status:
Submitted
Year Published:
2022
Citation:
D. Boob, Q. Deng and G. Lan (2022), "Level Constrained First Order Methods for Function Constrained Optimization", submitted to Mathematical Programming, May 2022
Progress 06/01/20 to 05/31/21
Outputs Target Audience:Local farmers and public policy makers interested in Controlled Environment Agriculture (CEA), as well as the scientific community working on data analysis, control, and modeling and simulation development. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?This project has funded PhD students whose research interests lie in line with the project goals. Dr. Lan and Dr. Huo have hired one student working on optimization and control. Dr. Chen has hired one student in the Environmental Engineering program who focuses on integrating machine learning into the mechanistic models of water treatment and hydroponic agriculture. How have the results been disseminated to communities of interest?The PIs have given quite a few invited virtual seminars during this reporting period. Some of them are listed below. "Stochastic Optimization Algorithms for Reinforcement Learning", Online Seminar on Mathematical Foundations of Data Science, March 5, 2021. "Advancing Stochastic Optimization for Reinforcement Learning", ISE Seminar, Lehigh University, March 9, 2021. "Algorithms for Multi-stage Stochastic Optimization", Google Virtual Seminar, April 14, 2021. "Advancing Stochastic Optimization for Reinforcement Learning", Daniel J. Epstein Seminar Series, University of Southern California, February 16, 2021. What do you plan to do during the next reporting period to accomplish the goals?The PIs will work according to the plan outlined in the proposal. In particular, we expect to see more products on the application of machine learning, data analysis, and control methods in CEAs.
Impacts What was accomplished under these goals?
Thrust 1: We have selected the Activated Sludge Model (ASM) and Anaerobic Digestion Model (ADM) as the principal models for the biological treatment units. Our team has written code representing each of these models in MATLAB and tested their validity against literature data. We have conducted an internal review of membrane physicochemical models and of how they interface with bioreactors in AnMBR and MBR systems. The next step is to integrate these models to represent the whole system and write them into functional code.

Thrust 2: We have written a critical review paper on plant growth modeling, which investigates current mechanistic and machine learning-based plant models and advocates for a hybrid approach. This review outlines plant models by physical compartment and describes the key sensor technologies that can feed data-driven models based on plant physiology. We next plan to implement the best practices found through this review to first derive base plant models from our current hydroponic growth experiments, and then develop online functionality that can update with a steady intake of data.

Thrust 3: We have developed novel optimization and control methods, including policy mirror descent methods for Markov decision processes, stochastic dual dynamic programming, and robust control methods based on affine policies. These results have been accepted by or submitted to top journals in the optimization and control areas, such as Mathematical Programming and the SIAM Journal on Control and Optimization. A minimal tabular sketch of the policy mirror descent update is given below.
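For illustration, the sketch below implements the basic policy mirror descent update in a tabular setting: with a Kullback-Leibler mirror map, each iteration is a softmax (multiplicative) update of the policy against its current Q-function. The randomly generated MDP, the constant step size, and the exact policy-evaluation step are illustrative assumptions rather than the setting analyzed in our papers.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, gamma, eta = 5, 3, 0.9, 1.0
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] is a next-state distribution
r = rng.uniform(size=(S, A))                 # random rewards

def q_values(pi):
    """Exact policy evaluation: solve (I - gamma * P_pi) V = r_pi."""
    P_pi = np.einsum('sap,sa->sp', P, pi)
    r_pi = np.sum(pi * r, axis=1)
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    return r + gamma * P @ V                 # Q[s, a]

pi = np.full((S, A), 1.0 / A)                # start from the uniform policy
for _ in range(100):
    Q = q_values(pi)
    # KL-prox step: pi_{k+1}(.|s) proportional to pi_k(.|s) * exp(eta * Q_k(s, .))
    logits = np.log(pi) + eta * Q
    pi = np.exp(logits - logits.max(axis=1, keepdims=True))
    pi /= pi.sum(axis=1, keepdims=True)
print("greedy actions of the learned policy:", pi.argmax(axis=1))
```

With the KL mirror map this update coincides with the natural policy gradient step, which is what underlies the linear convergence results in the cited paper.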
Publications
- Type:
Journal Articles
Status:
Accepted
Year Published:
2020
Citation:
G. Lan, Complexity of Stochastic Dual Dynamic Programming, accepted by Mathematical Programming, 2020.
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
G. Kotsalis, G. Lan, and A. Nemirovski, Convex optimization for finite horizon robust covariance control of linear stochastic systems, SIAM Journal on Control and Optimization, 59 (1): 296-319, 2021.
- Type:
Journal Articles
Status:
Submitted
Year Published:
2021
Citation:
G. Lan, Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes, submitted to Mathematical Programming.