Source: UNIVERSITY OF SOUTH CAROLINA submitted to
TOXIMAP: COMPUTATIONAL FRAMEWORK FOR PREDICTION OF GEOGRAPHICAL AND TEMPORAL INCIDENCE OF MYCOTOXINS IN US CROP FIELDS
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
1011846
Grant No.
2017-67017-26167
Project No.
SC.W-2016-10402
Proposal No.
2016-10402
Multistate No.
(N/A)
Program Code
A1331
Project Start Date
Feb 15, 2017
Project End Date
Jun 13, 2019
Grant Year
2017
Project Director
Terejanu, G.
Recipient Organization
UNIVERSITY OF SOUTH CAROLINA
(N/A)
COLUMBIA,SC 29208
Performing Department
Computer Science
Non Technical Summary
Aflatoxin is a carcinogenic toxin produced by Aspergillus-family fungi occurring in soil and decaying vegetation, which can contaminate corn along with other relevant US crops like peanuts, tree nuts, and cotton before harvest and/or during storage. Aflatoxin causes liver cancer in humans and in a variety of animal species, and it has been associated with childhood stunting, an important public health issue that increases vulnerability to infectious diseases and cognitive impairments. It is estimated that 5 billion people worldwide are at risk of aflatoxin exposure. Prediction and control of aflatoxin contamination is a fundamental challenge for the US grain industry, poultry producers, and makers of dairy products. The production of aflatoxin is highly dependent on environmental conditions, and it is projected that environmental perturbations due to climate change will result in a significant increase in aflatoxin contamination incidents, further aggravating its economic impact. The lack of a systematic approach to determine the distribution of aflatoxin occurrence before harvest adversely affects the grain industry.
The goal of this proposal is to develop a general predictive modeling framework for calculating aflatoxin occurrence in US crop fields before harvest, and to package this knowledge in a user-friendly predictive web/mobile interface for generating nationwide, real-time aflatoxin hazard maps. This has the potential to change certain behaviors in crop management to improve food safety (lower levels of aflatoxin), for example by providing valuable information on when and where to test for aflatoxin before harvest, reducing the amount of chemicals used to grow the crops, determining the best time to harvest, and indicating the cornfield areas with the highest contamination levels for grain segregation. The mathematical foundation and the proposed software tool are generally applicable to any type of mycotoxin and crop.
While hazard maps for aflatoxin and other mycotoxin occurrence are important to crop producers, they are also of interest to consumers, regulators, and insurers to enhance their decision-making process. Furthermore, the presence of mycotoxins in agricultural run-offs is a critical issue in environmental sustainability and ecological physiology of various land and aquatic ecosystems. Hence, the approach to be developed will be of relevance to a broader community beyond the agricultural community.
Animal Health Component
0%
Research Effort Categories
Basic
70%
Applied
(N/A)
Developmental
30%
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
723 | 1510 | 1102 | 100%
Knowledge Area
723 - Hazards to Human Health and Safety;

Subject Of Investigation
1510 - Corn;

Field Of Science
1102 - Mycology;
Goals / Objectives
The major goal of this proposal is to develop a general predictive modeling framework for calculating mycotoxin occurrence in US crop fields before harvest. This mathematical framework will be accompanied by a web/mobile interface to explore temporal and geographical mycotoxin hazard maps for the entire United States.
Objective #1. Build an inventory of favorable environmental conditions for aflatoxin production in corn using extensive field measurements. The PIs have already acquired a network of meteorological stations (soil moisture and temperature, atmospheric temperature, relative humidity, wind speed and direction) using a recent USDA/NRCS data collection grant. The sensors will be used to instrument a cornfield in partnership with a local farmer. In this project the PIs propose to leverage this existing infrastructure and supplement the data collected by the sensor network with remote sensing data at various resolutions and frequencies obtained using Google Earth Engine (imagery, geophysical, climate, and weather). In addition, weekly corn samples will be collected to measure aflatoxin concentration during various developmental stages of corn. This data will be used to develop advanced aflatoxin predictive models with quantified uncertainties.
Objective #2. Develop novel probabilistic data-driven models for general mycotoxin prediction with quantified uncertainties. A novel approach is proposed to obtain analytical expectations of mycotoxin concentrations as a function of environmental factors. These external variables (temperature, humidity, etc.) are determined at a location of interest using spatial interpolation, which is subject to interpolation errors. The proposed approach accounts for both model uncertainties and interpolation errors in generating predictions of mycotoxin concentrations. Separate Gaussian processes are trained using data collected in Objective #1 for both the physical models and the forcing models, and they are then stacked to obtain predictions of mycotoxin production. Analytical expressions will be derived for the first- and second-order moments of the proposed stacked Gaussian process. While only aflatoxin will be modeled in this project, the proposed nonparametric models can be generally applied across the mycotoxin spectrum and in other environmental science projects.
Objective #3. Web/mobile interface to explore temporal and geographical aflatoxin hazard maps for the entire Continental United States (CONUS). Predictions generated in Objective #2 will be made broadly available to the community via a dedicated web/mobile interface that will be built and hosted as part of this project. The PIs propose to leverage the data collected in Objective #1 and the computational framework derived in Objective #2 to provide both a real-time aflatoxin monitoring tool for CONUS and a forecasting tool to study the impact of climate change on aflatoxin production. The proposed software platform will be based on a plug-and-play framework, where new mycotoxin models, crops, geographical regions, and environmental data can be easily integrated.
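The first- and second-order moments mentioned in Objective #2 can be illustrated with the laws of total expectation and total variance. The following is only a sketch for a two-layer stack. The symbols are illustrative assumptions and are not taken from the project's publications: the first Gaussian process predicts an environmental input x at a location s with mean m_1(s) and variance v_1(s), and the second predicts the toxin concentration y from x with mean m_2(x) and variance v_2(x).

```latex
\begin{aligned}
x \mid s &\sim \mathcal{N}\big(m_1(s),\, v_1(s)\big), \qquad
y \mid x \sim \mathcal{N}\big(m_2(x),\, v_2(x)\big), \\
\mathbb{E}[y \mid s] &= \mathbb{E}_{x \mid s}\big[m_2(x)\big], \\
\operatorname{Var}[y \mid s] &= \mathbb{E}_{x \mid s}\big[v_2(x)\big]
  + \operatorname{Var}_{x \mid s}\big[m_2(x)\big].
\end{aligned}
```

In this reading, the first variance term carries the model uncertainty of the second layer, and the second term carries the interpolation error passed up from the first layer. Whether the expectations have closed forms depends on the kernel; for some kernels, such as the squared exponential, they are known to be available analytically.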
Project Methods
Model. A Stacked Gaussian Process will be developed to obtain analytical expectations of aflatoxin production in US corn fields. The proposed probabilistic model is obtained in two stages: first, the production of aflatoxin is modeled with quantified uncertainties under various temperature and water activity conditions; then, the predictive aflatoxin model is linked with environmental conditions obtained on a regular basis to generate probabilistic risk maps. Data collected as part of our current USDA/NRCS-CIG project, along with remote sensing data collected using Google Earth Engine, will be used to train and validate the proposed model.
Website. The predictions of the proposed probabilistic model will be compiled and visualized using a dedicated website (toximap.com).
Formal validation methods based on experimental data will be used to understand the science behind the aflatoxin prediction. To assess the impact of the model's predictions on the target audience, the website will be equipped with visitor tracking and feedback capabilities to evaluate the efficacy of the hazard maps in practice.
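The two-stage construction described under "Model" can be sketched numerically. The Python example below is not the StackedGP package and does not use its API. It is a minimal illustration under stated assumptions: the first stage is replaced by a Gaussian estimate of temperature at a location (mean and standard deviation from interpolation), the second stage by a hypothetical bell-shaped response with a constant model variance, and every number is invented for illustration. Uncertainty is propagated by Monte Carlo sampling and compared against the closed-form expectation that exists for this particular response shape.

```python
import math
import random

# Hypothetical second-stage model: mean toxin response to temperature.
# The bell shape and every parameter value are illustrative assumptions.
def stage2_mean(temp, peak=100.0, optimum=30.0, width=5.0):
    return peak * math.exp(-0.5 * ((temp - optimum) / width) ** 2)

STAGE2_VAR = 4.0  # assumed constant model variance of the second stage


def propagate(temp_mean, temp_std, n_samples=20000, seed=0):
    """Monte Carlo mean and variance of the stacked prediction.

    The uncertain first-stage output (interpolated temperature) is sampled
    and pushed through the second-stage mean. The total variance is the
    second-stage model variance plus the variance induced by the input.
    """
    rng = random.Random(seed)
    means = [stage2_mean(rng.gauss(temp_mean, temp_std)) for _ in range(n_samples)]
    expected = sum(means) / n_samples
    var_from_input = sum((m - expected) ** 2 for m in means) / (n_samples - 1)
    return expected, STAGE2_VAR + var_from_input


def analytic_mean(temp_mean, temp_std, peak=100.0, optimum=30.0, width=5.0):
    """Closed-form expectation of the bell-shaped response under a Gaussian input."""
    total = width ** 2 + temp_std ** 2
    scale = peak * width / math.sqrt(total)
    return scale * math.exp(-0.5 * (temp_mean - optimum) ** 2 / total)


if __name__ == "__main__":
    mc_mean, total_var = propagate(temp_mean=28.0, temp_std=2.0)
    print("Monte Carlo mean:", round(mc_mean, 2))
    print("Analytic mean:   ", round(analytic_mean(28.0, 2.0), 2))
    print("Total variance:  ", round(total_var, 2))
```

The analytic function is included only to show why closed-form moments are attractive: they avoid sampling at every map location and day, which matters when predictions are generated on a regular basis over a large grid.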

Progress 02/15/17 to 02/14/20

Outputs
Target Audience:
1) Undergraduate and graduate students at both the University of South Carolina and the University of North Carolina at Charlotte. Courses taught in 2018 that incorporate material used in this research: UofSC - Environmental pollution and Health (35 students), UofSC - Medical Mycology (81 students), UNCC - Visual Analytics (73 students). About 10% of students taught at UofSC and 15% at UNCC are African-American.
2) The StackedGP model and the public code (https://bitbucket.org/uqlab/stackedgp) have attracted more attention in the environmental and geoscience communities. PI Terejanu gave an invited talk about it at the SAMSI Workshop on Coupling Uncertain Geophysical Hazards on March 24-26, 2019 in Raleigh, NC.
Changes/Problems: The PI transferred from the University of South Carolina to the University of North Carolina at Charlotte on August 15th 2018. Time was lost between that date and the actual transfer of funds because of the late relinquishing of funds by the University of South Carolina, followed by the government shutdown. We will therefore request a no-cost extension once the funds are transferred to the University of North Carolina at Charlotte, so that the last tasks can be accomplished.
What opportunities for training and professional development has the project provided? The project has served to develop an excellent avenue for collaborations with various researchers from the fields of biology, public health, and engineering. It has allowed students and faculty to work directly with farmers and learn about the agricultural techniques used to increase corn productivity. Apart from the two graduate students who were directly connected with the project, the project has allowed the training of an undergraduate researcher.
How have the results been disseminated to communities of interest? Publishing research in impactful peer-reviewed journals and making software available under an open source license are the primary ways of disseminating our research. In addition, we regularly attend the Richland Soil and Water Conservation District meetings, which put us directly in contact with farmers, so that we can share our results and get their input in an informal setting. Other modes of dissemination include research presentations in different academic environments, including research meetings, seminars, and classroom teaching.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals?
Objective #1
1) We have instrumented a local cornfield with 8 stations that recorded soil moisture at two different depths, atmospheric relative humidity and temperature, as well as wind direction and speed. One of the stations was destroyed at harvest, but we do have the environmental data for the 2017 growing season from the remaining 7 stations.
2) We have established a method for aflatoxin extraction and quantification. We have optimized this method and validated it by investigating aflatoxin and other mycotoxins in corn and corn-based food products (publication under review in Science of the Total Environment). The method is used to obtain aflatoxin levels for the corn samples from the locations corresponding to the 7 stations. All this local data will be used to train and validate the models developed in Objective #2.
3) To supplement the data collected locally, we have developed software to extract and clean NOAA data to obtain daily temperatures collected from 2114 and 2324 weather stations across the US for 2009 and 2012, respectively. Elevation data for these weather stations has been determined using the Google Maps API.
4) In addition, we have collected and digitized four datasets from the literature on wet-lab experiments that capture the stochastic relationship between aflatoxin and environmental factors. Furthermore, we have also collected previously reported aflatoxin levels at different locations for different states (IA 1983, IN 1983, NC 1978). This data, like the data collected locally, will be used to train and validate the models developed in Objective #2.
5) In 2018 we instrumented the local cornfield again and collected measurements at 5 different locations. The data collected includes aflatoxin concentrations, soil moisture, atmospheric relative humidity and temperature, as well as wind direction and speed.
6) We have completed the standard operating procedures (SOP) for mycotoxin measurements using corn field samples; the procedures also work very well for processed corn-based food products. We have validated the accuracy of our mycotoxin readings using both HPLC/MS and ELISA-based approaches.
7) Given the limited field aflatoxin measurements, we have also downloaded the crop insurance data from the USDA Risk Management Agency. These data contain claims related to aflatoxin and mycotoxins that can be used to supplement the other data sources.
8) To supplement the field measurements, we have digitized a number of wet-lab data points where aflatoxin concentration was measured in controlled experiments with temperature varying between 16 and 40°F and water activity between 0.9 and 0.99. This dataset has been used to create the predictive StackedGP model in Objective #2.
9) Unfortunately, the number of field measurements of aflatoxin concentration that we have access to, both locally collected and from the literature, is rather limited. Also, the wet-lab datasets for aflatoxin production available in the literature are based on mathematical growth models of Aspergillus flavus grown on substrates that do not necessarily resemble corn kernels. As a result, we have started to develop corn-seed preparation protocols suitable for acoustic scanning to understand the growth of the fungus in corn kernels. We have previously developed both the methodology for obtaining quantities of interest from acoustic tomography and fungal growth models, but we need to adapt their parameters to the specifics of corn kernels. These models will be used to get more accurate predictions of aflatoxin concentration from wet-lab data, which is easier to collect, and we can use them to inform the global model.
Objective #2
1) We have developed a general probabilistic modeling framework (StackedGP), based on a network of independently trained Gaussian processes, to obtain approximate expectations of quantities of interest that require model composition, as is the case when predicting the geospatial distribution of aflatoxin concentration. The model uses an approximate scheme to propagate probabilities through StackedGP models to obtain estimates of quantities of interest with quantified uncertainties (publication under review in Environmental Modelling & Software). Eventually, this will allow us to ask questions such as: what is the probability that aflatoxin concentration is greater than 20 ppb at a certain location?
2) Based on the StackedGP model, we have developed a Python software package that allows any developer to create and customize StackedGP models. We have made the software available in a Git repository at https://bitbucket.org/uqlab/stackedgp. As demonstrated in the manuscript, the software package can be used in a variety of modeling domains, such as geoscience, chemical dispersion, and prediction of wildfires.
3) Finally, we have developed an initial customized StackedGP model to make daily predictions of aflatoxin at arbitrary locations in the US and accumulate the aflatoxin production over the growing season of corn. Currently, we are working on training the model using the data from Objective #1 and planning various validation experiments.
4) We have uncovered basic knowledge of fungal cell physiology that sheds light on why a fungal cell synthesizes aflatoxin. Addressing this question is key for our experimental design and for interpreting data for the development and validation of the models proposed in this research. We have shown that aflatoxin production is a mechanism through which fungal cells can alleviate the production of reactive oxygen species (ROS) (manuscript accepted in Toxins). This finding implies that all environmental factors that activate ROS production should be considered as risk factors for aflatoxin contamination.
5) We have updated the StackedGP model obtained in year one with a novel derivation to analytically obtain correlations between predictions at different geographical locations, which are used to update the predictions of StackedGP using field measurements.
6) We have developed a StackedGP for aflatoxin prediction in South Carolina using three stacked Gaussian processes to model temperature, water activity, and daily aflatoxin, using data collected from different literature studies and weather stations. The model has been used to make and update predictions using field measurements of aflatoxin concentration in South Carolina.
7) Based on the above aflatoxin StackedGP model, initial risk levels of aflatoxin incidence have been derived for both 2009 and 2012 across the US. From a validation perspective, we have initial results indicating that these predictions are correlated with aflatoxin insurance claims in the respective years. Significant additional work is required to improve model predictions by incorporating other data sources and to create a more extensive set of validation scenarios to gain confidence in the calculated aflatoxin risk levels.
8) To determine the parameters of the fungal growth models mentioned in Objective #1, we have started initial scans using the ultrasonic system, operating at 75-125 MHz with peak amplitude at 100 MHz. For a corn kernel we have identified four zones that have different material properties and where we can use sound wave speed measurements to correlate with various quantities of interest.
9) To quantify the uncertainty in fungal growth models, which may become high-dimensional, we have developed a new ensemble-based algorithm for approximate Bayesian inference. These fungal models will be used to further improve the aflatoxin StackedGP model.
Objective #3
Nothing to report for Objective #3 for this period. Once the award is transferred to UNCC we will continue with the remaining tasks.
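Two of the steps reported under Objective #2 lend themselves to a short worked example: updating a prediction at one location from a field measurement at a correlated location, and asking for the probability that the concentration exceeds 20 ppb. The Python sketch below is not the project's derivation. It assumes, purely for illustration, that the predictions at two locations A and B are jointly Gaussian with a known covariance, and all numbers are made up. A Gaussian is only a rough stand-in for a non-negative concentration.

```python
import math


def exceedance_probability(mean, std, threshold=20.0):
    """P(X > threshold) for X ~ N(mean, std^2), via the complementary error function."""
    z = (threshold - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


def update_with_measurement(mean_a, var_a, mean_b, var_b, cov_ab,
                            measured_a, noise_var=0.0):
    """Condition the prediction at B on a (possibly noisy) measurement at A.

    This is standard conditioning of a bivariate Gaussian; the joint-Gaussian
    assumption and the covariance value are hypothetical inputs.
    """
    gain = cov_ab / (var_a + noise_var)
    new_mean_b = mean_b + gain * (measured_a - mean_a)
    new_var_b = var_b - gain * cov_ab
    return new_mean_b, new_var_b


if __name__ == "__main__":
    # Illustrative prior predictions in ppb at two nearby locations.
    prior_mean_b, prior_var_b = 18.0, 36.0
    post_mean_b, post_var_b = update_with_measurement(
        mean_a=15.0, var_a=25.0,
        mean_b=prior_mean_b, var_b=prior_var_b,
        cov_ab=24.0, measured_a=25.0,
    )
    print("prior P(>20 ppb):    ",
          exceedance_probability(prior_mean_b, math.sqrt(prior_var_b)))
    print("posterior P(>20 ppb):",
          exceedance_probability(post_mean_b, math.sqrt(post_var_b)))
```

A measurement at A that comes in above its predicted mean pulls the prediction at B upward in proportion to the covariance between the two locations and reduces the variance at B, which is why correlations between locations are needed before field measurements can update a map.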

Publications

  • Type: Journal Articles Status: Published Year Published: 2018 Citation: Kareem Abdelfatah, Junshu Bao, and Gabriel Terejanu. Geospatial uncertainty modeling using Stacked Gaussian Processes. Environmental Modelling & Software, 109:293-305, 2018.
  • Type: Journal Articles Status: Under Review Year Published: 2019 Citation: Kareem Abdelfatah, Jonathan Senn, Noemi Glaeser, and Gabriel Terejanu. Prediction and Measurement Update of Fungal Toxin Geospatial Uncertainty using a Stacked Gaussian Process. Agricultural Systems, under review, 2019.
  • Type: Journal Articles Status: Published Year Published: 2018 Citation: Gabriel J. Kenne, Phani M. Gummadidala, Mayomi H. Omebeyinje, Ananda M. Mondal, Dominic K. Bett, Sandra McFadden, Sydney Bromfield, Nora Banaszek, Michelle Velez-Martinez, Chandrani Mitra, Isabelle Mikell, Saurabh Chatterjee, Josephine Wee, and Anindya Chanda. Activation of Aflatoxin Biosynthesis Alleviates Total ROS in Aspergillus parasiticus. Toxins, 10(2), 2018.
  • Type: Journal Articles Status: Published Year Published: 2019 Citation: Phani M. Gummadidala, Mayomi H. Omebeyinje, James A. Burch, Paramita Chakraborty, Prasanta K. Biswas, Koyeli Banerjee, Qian Wang, Rubaiya Jesmin, Chandrani Mitra, Peter D.R. Moeller, Geoffrey I. Scott, Anindya Chanda, Complementary feeding may pose a risk of simultaneous exposures to aflatoxin M1 and deoxynivalenol in Indian infants and toddlers: Lessons from a mini-survey of food samples obtained from Kolkata, India, Food and Chemical Toxicology, Volume 123, 2019, Pages 9-15
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: Chao Chen, Xiao Lin, and Gabriel Terejanu. An Approximate Bayesian Long Short-Term Memory Algorithm for Outlier Detection. In International Conference on Pattern Recognition (ICPR), Beijing, China, August 2018
  • Type: Conference Papers and Presentations Status: Published Year Published: 2019 Citation: Xiao Lin and Gabriel Terejanu. EnLLVM: Ensemble based Nonlinear Bayesian Filtering using Linear Latent Variable Models. In International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, UK, May 2019
  • Type: Conference Papers and Presentations Status: Published Year Published: 2019 Citation: Chao Chen, Yuan Huang, and Gabriel Terejanu. Approximate Bayesian Neural Network Trained with Ensemble Kalman Filter. The 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary 2019
  • Type: Theses/Dissertations Status: Published Year Published: 2018 Citation: Chao Chen. Uncertainty Estimation of Deep Neural Networks. PhD Dissertation, University of South Carolina, 2018
  • Type: Theses/Dissertations Status: Published Year Published: 2018 Citation: Xiao Lin. Inference Framework for Model Update and Development. PhD Dissertation, University of South Carolina, 2018
  • Type: Websites Status: Published Year Published: 2017 Citation: http://uncertaintyquantification.org/usda-nifa-toximap-computational-framework-for-prediction-of-geographical-and-temporal-incidence-of-mycotoxins-in-us-crop-fields/
  • Type: Websites Status: Published Year Published: 2017 Citation: https://bitbucket.org/uqlab/stackedgp


  • Type: Theses/Dissertations Status: Published Year Published: 2018 Citation: Xiao Lin. Inference Framework for Model Update and Development. PhD Dissertation, University of South Carolina, 2018
  • Type: Websites Status: Published Year Published: 2017 Citation: http://uncertaintyquantification.org/usda-nifa-toximap-computational-framework-for-prediction-of-geographical-and-temporal-incidence-of-mycotoxins-in-us-crop-fields/
  • Type: Websites Status: Published Year Published: 2017 Citation: https://bitbucket.org/uqlab/stackedgp


Progress 02/15/18 to 02/14/19

Outputs
Target Audience: 1) Undergraduate and graduate students at both the University of South Carolina and the University of North Carolina at Charlotte. Courses taught in 2018 that incorporate material used in this research: UofSC - Environmental pollution and Health (35 students), UofSC - Medical Mycology (81 students), UNCC - Visual Analytics (73 students). About 10% of students taught at UofSC and 15% at UNCC are African-American. 2) The StackedGP model and the public code (https://bitbucket.org/uqlab/stackedgp) are attracting more attention in the environmental and geoscience communities. PI Terejanu has been invited to give a talk about it at the SAMSI Workshop on Coupling Uncertain Geophysical Hazards, March 24-26, 2019, in Raleigh, NC.
Changes/Problems: The PI transferred from the University of South Carolina to the University of North Carolina at Charlotte on August 15th 2018. Time was lost between that date and the actual transfer of funds (hopefully February 15th 2019) because of the late relinquishing of funds by the University of South Carolina, followed by the government shutdown. We will therefore request a no-cost extension once the funds are transferred to the University of North Carolina at Charlotte, so that we can accomplish the last tasks.
What opportunities for training and professional development has the project provided? The project has served to develop an excellent avenue for collaborations with various researchers from the fields of biology, public health and engineering. It has allowed students and faculty to work directly with farmers and learn about the agricultural techniques used for increasing corn productivity. Apart from the two graduate students who were directly connected with the project, the project has allowed training of an undergraduate researcher.
How have the results been disseminated to communities of interest? Publishing research in impactful peer-reviewed journals and making software available under an open source license are the primary ways of disseminating our research. Alongside this, we regularly attend the Richland Soil and Water Conservation District Meetings, which put us in direct contact with farmers, so that we can share our results and get their input in an informal setting. Other modes of dissemination include research presentations in different academic environments, including research meetings, seminars, and classroom teaching.
What do you plan to do during the next reporting period to accomplish the goals?
Objective #1 (1) We will continue to collect new field measurements from the instrumented local cornfield in the 2019 growing season. This will give us three years of aflatoxin field data that we can use to further refine and validate our global model. (2) One interesting observation from the last field data collection is that aflatoxin measurements inside and outside the pivot irrigation system were below the critical level (20 ppb), whereas the one measurement at the boundary of the pivot was significantly higher. While this is just one measurement, we hope to replicate the experiment this year and see whether the pattern recurs. This matters because, if aflatoxin concentrations are higher on the perimeter of pivot irrigation systems, this type of geometrical information can be extracted from satellite imagery and used to further enhance model predictions. (3) To address the goal of the fungal growth model in Objective #2, we will prepare samples for acoustic microscope imaging. Specifically, corn seeds (either non-inoculated or inoculated with fungal spores) will be fixed using 3% glutaraldehyde, dehydrated in ethyl alcohol, and made into blocks using a resin mix. Ultramicrotome sections from these resin blocks will be imaged using the acoustic microscope, and various quantities of interest will be extracted to inform the fungal model, which will provide aflatoxin production on corn kernels.
Objective #2 (1) The current StackedGP model will be further calibrated and validated using all the available field measurements, either local or from the literature. Insurance claim data will also be used to measure how well aflatoxin predictions correlate with aflatoxin claims. (2) The current fungal model will be updated using the ultrasound measurements obtained in Objective #1 (3), and it will be used to derive new corn-specific aflatoxin production data that we will use to further update the global model.
Objective #3 Finally, the global model obtained in Objective #2 (1) will be used to generate weekly aflatoxin risk maps that will be updated on the website www.ToxiMap.com. The website interface will be interactive, so that the user can easily change the time of prediction from the present to the past. The interface will be built using JavaScript and the D3.js visualization framework.

Impacts
What was accomplished under these goals?
Objective #1 1) In 2018 we again instrumented the local cornfield and collected measurements at 5 different locations. The data collected include aflatoxin concentrations, soil moisture, atmospheric relative humidity and temperature, as well as wind direction and speed. 2) We have completed establishing the standard operating procedures (SOP) for mycotoxin measurements using corn field samples; these procedures also work very well for processed corn-based food products. We have validated the accuracy of our mycotoxin readings using both HPLC/MS and ELISA-based approaches. 3) Given the limited field aflatoxin measurements, we have also downloaded the crop insurance data from the USDA Risk Management Agency. These data contain claims related to aflatoxin and mycotoxins that can be used to supplement the other data sources. 4) To supplement the field measurements, we have digitized a number of wet-lab data points where aflatoxin concentration has been measured in control experiments with temperature varying between 16 and 40 °F and water activity between 0.9 and 0.99. This dataset has been used to create the predictive StackedGP model in Objective #2. 5) Unfortunately, the number of field measurements of aflatoxin concentration that we have access to, both locally collected and from the literature, is rather limited. Also, the datasets available in the literature for aflatoxin production obtained in the wet lab are based on mathematical growth models of Aspergillus flavus grown on substrates that do not necessarily resemble corn kernels. As a result, we have started to develop corn-seed preparation protocols suitable for acoustic scanning to understand the growth of the fungus in corn kernels. We have previously developed both the methodology for obtaining quantities of interest from acoustic tomography and fungal growth models, but we need to adapt their parameters to the specifics of corn kernels. These models will be used to get more accurate predictions of aflatoxin concentration from wet-lab data, which are easier to collect and which we can use to inform the global model.
Objective #2 1) We have updated the StackedGP model obtained in year one with a novel derivation to analytically obtain correlations between predictions at different geographical locations, which are used to update the predictions of StackedGP using field measurements. 2) We have developed a StackedGP for aflatoxin prediction in South Carolina using three stacked Gaussian processes to model temperature, water activity, and daily aflatoxin, using data collected from different literature studies and weather stations. The model has been used to make and update predictions using field measurements of aflatoxin concentration in South Carolina. 3) Based on the above aflatoxin StackedGP model, initial risk levels of aflatoxin incidence have been derived for both 2009 and 2012 across the US. From a validation perspective, we have initial results indicating that these predictions are correlated with aflatoxin insurance claims in the respective years. Significant additional work is required to improve model predictions by incorporating other data sources and to create a more extensive set of validation scenarios to gain confidence in the calculated aflatoxin risk levels. 4) To determine the parameters of the fungal growth models mentioned in Objective #1 (5), we have started initial scans using the ultrasonic system, with frequencies of 75-125 MHz and peak amplitude at 100 MHz. For a corn kernel, we have identified four zones that have different material properties and where we can use sound wave speed measurements to correlate with various quantities of interest. 5) To quantify the uncertainty in fungal growth models, which may become high-dimensional, we have developed a new ensemble-based algorithm for approximate Bayesian inference. These fungal models will be used to further improve the aflatoxin StackedGP model.
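The measurement update described under Objective #2 can be illustrated with standard Gaussian conditioning. The sketch below is not the StackedGP derivation itself. It assumes that the joint prediction at several locations is summarized by a mean vector and a covariance matrix (the covariance is where the correlations between locations enter) and that one field measurement with a known noise variance is available. The function name, arguments, and all numbers are hypothetical and chosen only for illustration.

```python
def measurement_update(mean, cov, obs_idx, obs_value, noise_var):
    """Condition a joint Gaussian prediction on one noisy field measurement.

    mean      : list of prior predicted values, one per location
    cov       : prior covariance matrix as a list of lists (n x n)
    obs_idx   : index of the location where the measurement was taken
    obs_value : measured value, on the same scale as the predictions
    noise_var : variance of the measurement noise (assumed known)
    """
    n = len(mean)
    # Covariance between every location and the measured location.
    cross = [cov[i][obs_idx] for i in range(n)]
    # Variance of the predicted measurement, including sensor noise.
    innovation_var = cov[obs_idx][obs_idx] + noise_var
    gain = [c / innovation_var for c in cross]
    residual = obs_value - mean[obs_idx]
    new_mean = [mean[i] + gain[i] * residual for i in range(n)]
    new_cov = [[cov[i][j] - gain[i] * cross[j] for j in range(n)]
               for i in range(n)]
    return new_mean, new_cov


# Hypothetical example: locations 0 and 1 are correlated, location 2 is not.
prior_mean = [10.0, 10.0, 10.0]
prior_cov = [[16.0, 12.0, 0.0],
             [12.0, 16.0, 0.0],
             [0.0, 0.0, 16.0]]
post_mean, post_cov = measurement_update(prior_mean, prior_cov, 0, 30.0, 4.0)
```

In this made-up example a high measurement at location 0 raises the prediction at the correlated location 1 and leaves the uncorrelated location 2 unchanged, which is the qualitative behavior the correlation structure is meant to provide.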

Publications

  • Type: Journal Articles Status: Published Year Published: 2018 Citation: Kareem Abdelfatah, Junshu Bao, and Gabriel Terejanu. Geospatial uncertainty modeling using Stacked Gaussian Processes. Environmental Modelling & Software, 109:293-305, 2018.
  • Type: Journal Articles Status: Under Review Year Published: 2019 Citation: Kareem Abdelfatah, Jonathan Senn, Noemi Glaeser, and Gabriel Terejanu. Prediction and Measurement Update of Fungal Toxin Geospatial Uncertainty using a Stacked Gaussian Process. Agricultural Systems, under review, 2019.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: Chao Chen, Xiao Lin, and Gabriel Terejanu. An Approximate Bayesian Long Short-Term Memory Algorithm for Outlier Detection. In International Conference on Pattern recognition (ICPR), Beijing, China, August 2018.
  • Type: Conference Papers and Presentations Status: Accepted Year Published: 2019 Citation: Xiao Lin and Gabriel Terejanu. EnLLVM: Ensemble based Nonlinear Bayesian Filtering using Linear Latent Variable Models. In International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, UK, May 2019.
  • Type: Conference Papers and Presentations Status: Under Review Year Published: 2019 Citation: Chao Chen, Yuan Huang, and Gabriel Terejanu. Approximate Bayesian Neural Network Trained with Ensemble Kalman Filter. The 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary 2019.
  • Type: Journal Articles Status: Accepted Year Published: 2018 Citation: Gummadidala PM, Omebeyinje MH, Burch JA, Chakraborty P, Biswas PK, Banerjee K, Wang Q, Jesmin R, Mitra C, Moeller PDR, Scott GI, Chanda A. Complementary feeding may pose a risk of simultaneous exposures to aflatoxin M1 and deoxynivalenol in Indian infants and toddlers: Lessons from a mini-survey of food samples obtained from Kolkata, India. Food Chem Toxicol. 2018 Oct 6. pii: S0278-6915(18)30729-4. doi: 10.1016/j.fct.2018.10.006. [Epub ahead of print] PubMed PMID: 30300722
  • Type: Theses/Dissertations Status: Published Year Published: 2018 Citation: Chao Chen. Uncertainty Estimation of Deep Neural Networks. PhD Dissertation, University of South Carolina, 2018
  • Type: Theses/Dissertations Status: Published Year Published: 2018 Citation: Xiao Lin. Inference Framework for Model Update and Development. PhD Dissertation, University of South Carolina, 2018


Progress 02/15/17 to 02/14/18

Outputs
Target Audience: At this point, through the release of our StackedGP software package accompanied by an archived report and a publication under review, we are targeting a broad spectrum of researchers in environmental science. While we have developed this model and software package to make predictions of aflatoxin, it can also be used in a variety of other modeling domains, as we have demonstrated in the manuscript, such as geoscience, chemical dispersion, and prediction of wildfires.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? The project has served to develop an excellent avenue for collaborations with various researchers from the fields of biology, public health and engineering. It has allowed students and faculty to work directly with farmers and learn about the agricultural techniques used for increasing corn productivity. Apart from three graduate students who were directly connected with the project, the project has allowed training of several undergraduate researchers.
How have the results been disseminated to communities of interest? Publishing research in impactful peer-reviewed journals and making software available under an open source license are the primary ways of disseminating our research. Alongside this, we regularly attend the Richland Soil and Water Conservation District Meetings, which put us in direct contact with farmers, so that we can share our results and get their input in an informal setting. Other modes of dissemination include research presentations in different academic environments, including research meetings, seminars, and classroom teaching.
What do you plan to do during the next reporting period to accomplish the goals?
Objective #1 1) We will instrument the local cornfield this year as well and collect environmental data. Alongside this, we will measure the aflatoxin levels to serve as validation data for our aflatoxin StackedGP model. 2) We will extract data from NOAA for 2018 with the goal of generating aflatoxin risk maps every week during the growing season. We are already in talks with people at Clemson Extension to provide us feedback on the risk maps that we will generate this year. 3) We will also collect data on aflatoxin insurance claims from USDA RMA with the goal of finding whether our aflatoxin predictions are correlated with aflatoxin insurance losses. This will serve as an additional validation for our model.
Objective #2 1) The main objective this year is to train and validate the aflatoxin StackedGP model that we have initially developed. All the data that we have collected in the previous year and plan to collect this year will be used towards this objective. 2) With the goal of improving the predictive capability, we are planning to conduct experiments on fungal infection in corn kernels that will help the development of a computational model for fungal growth and aflatoxin production in corn kernels. We believe that such a model can further constrain and improve the predictive capability of the aflatoxin StackedGP.
Objective #3 1) We will start developing the user interface to deliver some of the initial hazard maps that we will produce this year. We hope to engage the community through the Richland Soil and Water Conservation District Meetings and Clemson Extension with the goal of obtaining feedback on the delivery and visualization of our hazard maps.
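The planned check of whether aflatoxin predictions are correlated with insurance losses (Objective #1, item 3) could take the form of a simple correlation coefficient computed over regions. The sketch below is one possible way to do this and is not the analysis used in the project. It assumes that predicted risk levels and claim counts have already been aggregated to the same set of regions. The function name and the five data points are hypothetical and made up for illustration; they are not project results.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustration: predicted risk level and number of claims per region.
predicted_risk = [0.1, 0.4, 0.2, 0.8, 0.6]
claim_counts = [1, 5, 2, 9, 6]
r = pearson_correlation(predicted_risk, claim_counts)
```

Because claim counts are typically skewed, a rank-based measure may be a more robust choice in practice; the Pearson coefficient is used here only because it is the simplest to state.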

Impacts
What was accomplished under these goals?
Objective #1 1) We have instrumented a local cornfield with 8 stations that have recorded soil moisture at two different depths, atmospheric relative humidity and temperature, as well as wind direction and speed. One of the stations was destroyed at harvest, but we do have the environmental data for the 2017 growing season from 7 stations. 2) We have established a method for aflatoxin extraction and quantification. We have optimized this method and validated it by investigating aflatoxin and other mycotoxins in corn and corn-based food products (publication under review in Science of the Total Environment). The method is used to obtain aflatoxin levels for the corn samples from the locations corresponding to the 7 stations. All these local data will be used to train and validate the models developed in Objective #2. 3) To supplement the data collected locally, we have developed software to extract and clean data from NOAA to obtain daily temperature data collected from 2114 and 2324 weather stations across the US for 2009 and 2012, respectively. Elevation data for these weather stations have been determined using the Google Maps API. 4) In addition, we have collected and digitized four datasets from the literature of wet-lab experiments that capture the stochastic relation between aflatoxin and environmental factors. Furthermore, we have also collected previously reported aflatoxin levels at different locations for different states (IA 1983, IN 1983, NC 1978). These data, like the data collected locally, will be used to train and validate the models developed in Objective #2.
Objective #2 1) We have developed a general probabilistic modeling framework (StackedGP) based on a network of independently trained Gaussian processes, to obtain approximate expectations of quantities of interest that require model composition, as is the case with predicting the geospatial distribution of aflatoxin concentration. The model uses an approximate scheme to propagate probabilities through StackedGP models to obtain estimates of quantities of interest with quantified uncertainties (publication under review in Environmental Modelling & Software). Eventually, this will allow us to ask questions such as: what is the probability that aflatoxin concentration is greater than 20 ppb at a certain location? 2) Based on the StackedGP model, we have developed a Python software package that allows any developer to create and customize StackedGP models. We have made the software available in a Git repository at https://bitbucket.org/uqlab/stackedgp. The software package can be used in a variety of modeling domains, as we have demonstrated in the manuscript, such as geoscience, chemical dispersion, and prediction of wildfires. 3) Finally, we have developed an initial customized StackedGP model to make daily predictions of aflatoxin at arbitrary locations in the US and accumulate the aflatoxin production over the growing season of corn. Currently, we are working on training the model using the data in Objective #1 and planning various validation experiments. 4) We have unraveled basic knowledge of fungal cell physiology that sheds light on why a fungal cell synthesizes aflatoxin. Addressing this question is key for our experimental design and interpretation of data for development and validation of the models proposed in this research. We have shown that aflatoxin production is a mechanism through which fungal cells can alleviate the production of reactive oxygen species (ROS) (manuscript accepted in Toxins). This finding implies that all environmental factors that activate ROS production should be considered as risk factors for aflatoxin contamination.
Objective #3 Nothing to report for Objective #3 for this period.
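The question of how likely it is that aflatoxin concentration exceeds 20 ppb at a location can be made concrete once a predictive mean and standard deviation are available. The sketch below assumes, purely for illustration, that the predictive distribution at a location is approximated by a Gaussian; this is a simplification (concentrations cannot be negative) and is not necessarily the approximation used inside StackedGP. The function name and the example numbers are hypothetical.

```python
import math


def exceedance_probability(mean, std, threshold=20.0):
    """P(X > threshold) for X ~ Normal(mean, std**2).

    mean, std : predictive mean and standard deviation (e.g. in ppb)
    threshold : critical level, 20 ppb by default as in the text
    """
    if std <= 0:
        raise ValueError("std must be positive")
    z = (threshold - mean) / (std * math.sqrt(2.0))
    # The Gaussian survival function expressed via the complementary
    # error function from the standard library.
    return 0.5 * math.erfc(z)


# Hypothetical prediction: mean 25 ppb with a standard deviation of 5 ppb.
p = exceedance_probability(25.0, 5.0)
```

Evaluating such a probability at every grid location is one way a risk map could be colored by exceedance probability rather than by the predicted mean alone.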

Publications

  • Type: Journal Articles Status: Under Review Year Published: 2018 Citation: Kareem Abdelfatah, Junshu Bao, Gabriel Terejanu. Geospatial Uncertainty Modeling using Stacked Gaussian Processes. under review Environmental Modelling & Software, 2018 (https://arxiv.org/abs/1612.02897v2)
  • Type: Journal Articles Status: Published Year Published: 2018 Citation: Kenne, Gabriel J. and Gummadidala, Phani M. and Omebeyinje, Mayomi H. and Mondal, Ananda M. and Bett, Dominic K. and McFadden, Sandra and Bromfield, Sydney and Banaszek, Nora and Velez-Martinez, Michelle and Mitra, Chandrani and Mikell, Isabelle and Chatterjee, Saurabh and Wee, Josephine and Chanda, Anindya. Activation of Aflatoxin Biosynthesis Alleviates Total ROS in Aspergillus parasiticus. Toxins 10(2) 2018
  • Type: Journal Articles Status: Under Review Year Published: 2018 Citation: Gummadidala et al. A survey of occurrence of aflatoxins and deoxynivalenol in baby foods from local markets in Kolkata, India: implications to human health. under review in Science of the Total Environment 2018
  • Type: Websites Status: Published Year Published: 2017 Citation: http://uncertaintyquantification.org/usda-nifa-toximap-computational-framework-for-prediction-of-geographical-and-temporal-incidence-of-mycotoxins-in-us-crop-fields/
  • Type: Websites Status: Published Year Published: 2017 Citation: https://bitbucket.org/uqlab/stackedgp