Performing Department
(N/A)
Non Technical Summary
Farming is a long journey that starts with preparing the land and ends with harvesting crops. There are many steps along the way - planting seeds, fighting off pests and disease, watering, and ensuring plants get the right nutrients. Every farm is distinct, shaped by the farmer's decisions and by inherent factors such as soil quality and climate. However, farming today faces tighter budgets, limited resources, and changing weather patterns. This means farmers need to manage their land even more carefully. They need tools that can help them predict how much crop they'll get based on their farming choices. Right now, that kind of tool is missing. Our study aims to fill that gap. We're building a tool called DeepYield, using the latest Artificial Intelligence (AI) and Deep Learning (DL) techniques. It's designed to predict sugar beet yields, from small experiments to big farms. The tool will gather data in different formats from different sources, make sense of it all, and provide farmers with clear insights about which factors most influence their yields. Plus, it'll be tailored to each farm's unique conditions, rather than offering a one-size-fits-all solution. We're also partnering with Western Sugar Cooperative, a major player in the sugar beet industry across four states (NE, CO, WY, MT). Together, we'll test DeepYield on over 1,000 real-world sugar beet fields. And to make sure the tool reaches farmers, we're creating a user-friendly website at phrec-irrigation.com. By understanding sugar beet farming better and using resources wisely, we believe this project will help farmers improve their methods. While we're focusing on sugar beets now, we think this approach could work for other crops like corn, soybean, and cotton too.
Animal Health Component
50%
Research Effort Categories
Basic
50%
Applied
50%
Developmental
(N/A)
Goals / Objectives
The overall goal of this project is to develop a field-specific, artificial intelligence (AI) and deep learning (DL) based yield prediction tool named DeepYield that considers the holistic process of on-farm crop production by integrating agricultural data that vary greatly in format, sampling frequency, and scale. Detailed objectives are: 1) Data generation and collection for the development of DeepYield at both research plots and on-farm settings via partnerships with key stakeholders; 2) Development of the DeepYield model for accurate, explainable, evolving, and field-specific yield prediction; 3) Evaluation of DeepYield, packaging of DeepYield into a web-based graphical user interface, and dissemination to stakeholders.
Project Methods
This study will collect and utilize multi-modal datasets that vary in spatial resolution (from in-situ sensors to Unmanned Aerial Vehicles (UAVs) and satellites), sampling frequency (from automated sampling at intervals of minutes to weekly manual sampling), and data format (from numeric data to text data). The prediction variables (i.e., model inputs) include the following: 1) Time-series, point-sourced numerical data, such as volumetric soil water content at different depths, canopy cover percentages, meteorological data, aboveground dry biomass, crop height, sugar beet yield on DAP (days after planting) x, and sucrose content of sugar beet on DAP x; 2) Imagery temporal-spatial data, such as overhead sugar beet RGB and spectral images (robot and UAV) and satellite-based imagery (NDVI); 3) Management-related numerical data, such as plant density, length of crop season, seasonal nitrogen input, and seasonal irrigation input; 4) Management-related categorical data, such as variety, tillage, rotation, chemical applications, and soil texture. The prediction variable datasets are designed to complement each other. All data will be collected repeatedly for 3 years.
We propose to collect data in three different settings: 1. Research plots (small geographic scale, highest data resolution); 2. Intense-sampling commercial fields (larger geographic scale, medium data resolution); 3. Western Sugar Cooperative (WSC) commercial fields (largest geographic scale, lowest data resolution). Data collected at research plots and intense-sampling commercial fields will be used to develop and evaluate DeepYield, while data collected at WSC fields (over 1,000 fields with total acreage of ~110,000 acres) will be used to extensively evaluate DeepYield.
This setup allows DeepYield to be developed using detailed data while its robustness and performance are evaluated using incomplete field data, which is often the case in the real world.
After data is collected, the DeepYield AI model will be developed, transferred, and evaluated. The design of DeepYield aims to integrate all datasets proposed earlier while providing strong explainability compared with "black-box" AI/DL models. The development of DeepYield will involve the following steps: 1) multi-channel feature extraction, 2) within-channel attention mechanism, 3) cross-channel attention mechanism, and 4) fusion for prediction. Here, a channel refers to each type of input dataset. Strategies will also be used to improve model generalizability and robustness under limited sample sizes. Such strategies include: (1) using data augmentation techniques, (2) pre-training the feature extractors for images and time-series data on large public datasets, (3) adopting parameter-efficient designs for the feature extractors, such as EfficientNet, and (4) enforcing shared parameters for channels with highly correlated datasets.
The end goal of this study is to adopt DeepYield at the field level. However, due to the discrepancies in environmental conditions and management practices between research plots and on-farm fields, a model developed at one place may not fit the other. Furthermore, due to practical constraints, some in-season datasets may not be collected as frequently at on-farm fields as in research plots, or may be unavailable or missing in on-farm fields. These constraints prevent us from learning "one DeepYield model that fits all". To tackle these unique challenges, we propose a novel transfer learning (TL) approach that will adjust and tailor DeepYield to each unique on-farm field based on the datasets available from that field.
This is similar to the "recalibration" or "on-site calibration" processes of many mechanistic crop models, but based on data-driven learning. Lastly, two questions can be of great interest to the users (i.e., growers) of DeepYield: (1) Early prediction, which concerns how early in the growing season an accurate prediction of yield/sucrose can be made; this can help growers with early planning. (2) Continuous improvement of the prediction capability as data accumulate. We propose to use continual learning to address these two questions. The continual learning capability can be used to continuously update the prediction model for each field over time (this project will collect three years of data for each on-farm field). With this capability, we expect that the accuracy, generalizability, and robustness of the model can be continuously improved for field-specific prediction.
This project will collect in-season datasets and end-of-season yield/sucrose for 3 consecutive years from research plots and on-farm fields. We will first train DeepYield on the research plot data in year 1 and use the proposed TL to fine-tune the model using on-farm data in year 1. To evaluate early-prediction capability, we will use the proposed continual learning to progressively update the model with the data accumulated in each month, and compare each model with the model trained on all the data in the growing season. For all the models, we will use cross-validation-based prediction accuracy to tune hyper-parameters. We will apply the trained year 1 model to the data in years 2 and 3 (as blind test sets), which will allow us to evaluate the generalization performance of the models. Furthermore, since our data collection will be repeated for 3 years, this provides us an opportunity to continuously improve the models with accumulated data in each field. Specifically, we will use the proposed continual learning to update the aforementioned year 1 models by combining year 1 and year 2 data.
We will compare these models on year 3 data (as a blind test set).
DeepYield will be evaluated in four aspects: (1) Accuracy assessment: we will compute the mean absolute prediction error and the predictive correlation between the predicted and true yield/sucrose values. (2) Explainability assessment: DeepYield will output attention weights, which can be used to locate important time segments, subregions of each field, and in-season variables. The identified time segments or variables will help the team verify whether the findings make sense by comparing them against well-known mechanistic crop growth models and principles. (3) Generalizability assessment: we will train DeepYield on all but one of the on-farm fields and test its performance on the remaining field (treated as a "new" field that our model will be applied to in the future). We will rotate through all fields in this way and assess the generalization capability of the model. (4) Computational efficiency assessment: we will report the training and inference times of DeepYield under different computing environments (e.g., HPC vs. desktop computer; cloud-based vs. local computing resources). This assessment is important for knowing the computing resource requirements for deploying DeepYield in the future.
The project's outcomes will be communicated through both traditional and contemporary channels. These include field days, specialized grower meetings (such as the WSC Joint Research Committee annual meeting), regional and national conferences, dedicated websites, and multimedia content. Notably, we'll develop a web-based graphical user interface (GUI) on UNL's platform (https://phrec-irrigation.com). UNL and GaTech principal investigators will collaborate closely to design the DeepYield web pages, ensuring they're accessible to growers and stakeholders both during and after the growing season. We'll monitor and gather analytics on website usage, assessing engagement annually.
Additionally, we'll produce concise YouTube tutorials offering step-by-step guidance on the project's scope and website utilization. Peer-reviewed journal articles and extension publications will also be pursued throughout the project's duration.
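As an illustration of the accuracy and generalizability assessments described under Project Methods, the following is a minimal sketch of how mean absolute prediction error, predictive correlation, and the leave-one-field-out rotation could be computed. It is not the DeepYield implementation: a small ridge regression stands in for the model, the data are synthetic, and all function and variable names (`fit_ridge`, `leave_one_field_out`, `field_ids`, and so on) are hypothetical choices made for this example.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Mean absolute prediction error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def predictive_correlation(y_true, y_pred):
    """Pearson correlation between true and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def _add_intercept(X):
    """Append a column of ones so the linear model has an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_ridge(X, y, alpha=1.0):
    """Stand-in model (assumption): ridge regression, NOT the DeepYield network."""
    Xb = _add_intercept(X)
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def predict_ridge(weights, X):
    """Predict with the weights returned by fit_ridge."""
    return _add_intercept(X) @ weights


def leave_one_field_out(X, y, field_ids, alpha=1.0):
    """Train on all but one field, predict the held-out field, rotate through all fields."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    field_ids = np.asarray(field_ids)
    preds = np.empty_like(y, dtype=float)
    for field in np.unique(field_ids):
        held_out = field_ids == field
        weights = fit_ridge(X[~held_out], y[~held_out], alpha)
        preds[held_out] = predict_ridge(weights, X[held_out])
    return preds


# Synthetic demonstration data (made up for this sketch, not project data):
# 6 hypothetical fields with 5 samples each and 3 numeric features.
rng = np.random.default_rng(0)
demo_field_ids = np.repeat(np.arange(6), 5)
demo_X = rng.normal(size=(30, 3))
demo_y = demo_X @ np.array([1.5, -0.5, 2.0]) + 30.0 + rng.normal(scale=0.5, size=30)

demo_preds = leave_one_field_out(demo_X, demo_y, demo_field_ids)
print("MAE:", mean_absolute_error(demo_y, demo_preds))
print("Predictive correlation:", predictive_correlation(demo_y, demo_preds))
```

Grouping the split by field, rather than splitting samples at random, keeps every sample from the held-out field out of training, which matches the idea of treating that field as a "new" field.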