Source: COLORADO STATE UNIVERSITY submitted to NRP
DSFAS: ENABLING EFFECTIVE DECISION-MAKING IN DRYLAND FARMING FOR ARID AND SEMI-ARID REGIONS USING FIELD-SCALE SOIL MOISTURE CONTENT (SMC) MAPS
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
ACTIVE
Funding Source
Reporting Frequency
Annual
Accession No.
1033332
Grant No.
2024-67021-43840
Cumulative Award Amt.
$591,500.00
Proposal No.
2023-11688
Multistate No.
(N/A)
Project Start Date
Sep 15, 2024
Project End Date
Sep 14, 2027
Grant Year
2024
Program Code
[A1541]- Food and Agriculture Cyberinformatics and Tools
Recipient Organization
COLORADO STATE UNIVERSITY
(N/A)
FORT COLLINS,CO 80523
Performing Department
(N/A)
Non Technical Summary
The key goal of this project, codenamed QUENCH, is to estimate soil moisture content (SMC) at the root zone level for dryland farming. The resulting SMC maps will be produced at 30 m spatial resolution every 2-3 days. We stratify the target root zone based on soil depth increments of 5, 10, 20, 50, and 100 cm. These choices are informed, and constrained, by the availability of in situ observations at these depths and are sufficient to capture the fine-root productivity dynamics critical for plant growth. Growers can use these field-level SMC maps to inform their decision-making throughout the crop season. Our study region is informed by data collated by the USDA identifying winter wheat areas that are drought-prone; we further restrict our study area to those regions where winter wheat is the main crop. Over 83% of the area where winter wheat is located experiences some form of drought, including exceptional, extreme, severe, or moderate conditions. Our target area encompasses the states of Colorado, Kansas, Nebraska, Oklahoma, and Texas. The SMC-based decision-making that the effort will support includes (1) when to plant seeds, (2) what crops to grow, and (3) when to harvest early and grow some other crop. In several areas, such as the High Plains Aquifer where groundwater sources have been severely depleted, irrigated farms are expected to transition to dryland farming.
Animal Health Component
40%
Research Effort Categories
Basic
40%
Applied
40%
Developmental
20%
Classification

Knowledge Area (KA)   Subject of Investigation (SOI)   Field of Science (FOS)   Percent
205                   0210                             2050                     50%
205                   0430                             3030                     50%
Goals / Objectives
Arid and semiarid regions represent approximately 25% of CONUS (Contiguous United States) and a significant fraction of arable land around the world. Considering the recent expansion of drought-impacted areas and the increased shortage of groundwater, exclusively rainfed arable land is expected to increase with transitions from irrigated to dryland agriculture.
The key goal of the proposed effort, named QUENCH, is to estimate SMC by combining the complementary strengths of in-situ SMC sensors, remote sensing, scientific models, and novel deep learning techniques to produce SMC maps at the root zone level for dryland farms. These SMC maps will be produced at 30 m spatial resolution every 2-3 days.
While our methodology does not preclude applicability to the entire CONUS, our study region focuses on areas where USDA has identified winter wheat as the primary crop. Over 83% of the area where winter wheat is located experiences some form of drought, including exceptional, extreme, severe, or moderate conditions. Our target area encompasses the states of Colorado, Kansas, Nebraska, Oklahoma, and Texas. Finally, we do not consider fields when they are snow covered (typically from December through mid-March).
We identify four interconnected objectives to accomplish our main goal:
Objective 1: Design models that capture variability and interactions across soils, topographical and meteorological variables, and remote/in-situ sensing to estimate SMC.
Objective 2: Infuse domain knowledge in the model training process while extracting value from the limited number of in-situ observations.
Objective 3: Refine models at scale to cope with spatial, topographical, and meteorological variability while ensuring effective resource utilization.
Objective 4: Design an intuitive cyberinfrastructure for scientists, growers, and other users, including stakeholders, to explore dimensions associated with SMC and decision-making.
Project Methods
We will use deep neural networks (DNNs) to produce SMC maps. DNNs offer three key advantages. First, they are exceptionally well-suited to assimilating and extracting patterns from high-dimensional, multimodal data while being robust to outliers. This capability is aligned with the diverse data types we consider, including digital elevation models (DEMs), soil types, meteorological information, and hyper/multispectral data. Second, once trained, the DNNs produce inferences at high throughput without parameter calibration. Finally, underpinned by representation learning (automated extraction of higher-order features from the data) and the generation of hierarchical representations that capture highly nonlinear interactions within the input feature space, DNNs offer the potential to scale to large geographical extents.

Progress 09/15/24 to 09/14/25

Outputs
Target Audience: The primary target audiences during this reporting period include:
(1) Conferences: IEEE Conference on Artificial Intelligence (IEEE CAI-2025), ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL-2024, SIGSPATIAL-2025), IEEE eScience Conference (e-Science-2025), IEEE International Conference on Data Science and Advanced Analytics (DSAA-25), IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT-2024), AAAI Conference on Artificial Intelligence (AAAI-2025), IEEE International Conference on Big Data (BigData2024)
(2) Journals: Remote Sensing (MDPI), Hydrology (MDPI), Journal of Hydrology
(3) Interdisciplinary collaboration: This project is based on close collaboration across agricultural science, civil and environmental engineering, and computer science. Our PIs have held a project-wide meeting every month during the semester, in addition to ad hoc meetings to troubleshoot challenges. We have had more than 30 additional ad hoc meetings among different subsets of the PIs and their research group members. During the reporting period, the PI and her graduate students held two regular meetings and one to two group meetings each week.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided?
During the reporting period, we provided the following training and professional development opportunities.
1. Ph.D. students. Paahuni Khandelwal has been trained in AI-based modeling with multi-modal datasets to estimate current root zone soil moisture levels. She graduated in Spring 2025. Tanjim Faruk has been trained in vision modeling with in-situ observations combined with the scientific model. Rupasree Dey has been trained in multi-modal deep learning modeling with time-series datasets as part of this project. Abdul Matin has been trained in knowledge distillation and domain adaptation within the context of soil moisture estimation, particularly focusing on extending foundation models. Andrei Bachinin has been working on using spectral images to understand soil properties and estimate soil moisture content.
2. M.S. students. Cavin Alderfer and Samuel Swing were trained in the use of APSIM for simulating crop growth. Cavin finished his studies by completing two graduate certificates. A professional development plan was developed for and with Samuel Swing, and we are actively working towards accomplishing the goals laid out in his plan. Matthew Young and Freddy Larrieu were trained in developing a cyberinfrastructure for data visualization. Both students completed their M.S. degrees in Spring 2025. Everett Lewark has been trained in developing a framework for multi-resolution spatial datasets as part of the HERMES framework.
How have the results been disseminated to communities of interest?
A. Journal articles and papers published in proceedings of conferences
[1] Paahuni Khandelwal, Jeffrey D. Niemann, David J. Mulla, Shrideep Pallickara, Sangmi Lee Pallickara, SUBTERRA: Estimating Soil Moisture at Root Zone Depths Using Science-guided Learning, In Proceedings of the IEEE Conference on Artificial Intelligence (IEEE CAI), 2025.
[2] Andrei Bachinin, Rupasree Dey, Paahuni Khandelwal, Sam Leuthold, M. Francesca Cotrufo, Shrideep Pallickara, and Sangmi Lee Pallickara, Science-Informed Multitask Transformer for Soil Property Prediction from FTIR Spectroscopy, (To appear) Proceedings of the IEEE eScience Conference, 2025.
[3] Tanjim Faruk, Abdul Matin, Shrideep Pallickara, and Sangmi Lee Pallickara, Accounting for Spatial Variability with the Histogram of Oriented Gradients Based Masking Improves Performance of Masked Autoencoder over Hyperspectral Satellite Imagery, student poster and abstract, In Proceedings of the AAAI Conference on Artificial Intelligence, 2025.
[4] Hansen, Paige, Nathan Orwick, Kassidy Barram, Pierce Smith, Jay Breidt, Sangmi Lee Pallickara, and Shrideep Pallickara. "Archimedes: A Framework to Support Distributional Similarity Analysis over Arbitrary Spatiotemporal Scopes at Scale." In 2025 IEEE 25th International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 215-225. IEEE, 2025.
[5] Paahuni Khandelwal, Sangmi Lee Pallickara, Shrideep Pallickara, DeepSoil: A Science-guided Framework for Generating High Precision Soil Moisture Maps by Reconciling Measurement Profiles Across In-situ and Remote Sensing Data, In Proceedings of the ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2024. Finalist for the Best Paper Award.
[6] Federico Larrieu, Tyson O'Leary, Sangmi Lee Pallickara, and Shrideep Pallickara, MAGELLAN: Enabling Effective Search Over Voluminous, High-Dimensional Scientific Datasets, In Proceedings of the IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT), 2024.
[7] Everett Lewark, Matthew Young, Paahuni Khandelwal, Sangmi Lee Pallickara, and Shrideep Pallickara, Periscope: A Framework for Visualizations of Multiresolution Spatiotemporal Data at Scale, In Proceedings of the IEEE International Conference on Big Data (IEEE BigData), 2024.
[8] Kassidy Barram, Sangmi Lee Pallickara, Shrideep Pallickara, SCRYBE: Enabling Programmatic Interfaces for Explorations Over Voluminous Spatiotemporal Data Collections, In Proceedings of the IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT), 2024.
[9] Sahaar, A.S., and J.D. Niemann, 2024, "Estimating Rootzone Soil Moisture by Fusing Multiple Remote Sensing Products with Machine Learning," Remote Sensing, 16, 3699, doi: 10.3390/rs16193699.
[10] Bindner, J.R., H.E. Proulx, K. Wickham, J.D. Niemann, J. Scalia, T.R. Green, and P.J. Grazaitis, 2025, "Dependence of Soil Moisture and Strength on Topography and Vegetation Varies within a SMAP Grid Cell," Hydrology, 12(34), 1-24, doi: 10.3390/hydrology12020034.
[11] Fischer, S., J.D. Niemann, J. Scalia, M.D. Bullock, H.E. Proulx, B. Kim, T.R. Green, and P.J. Grazaitis, 2025, "Assessing the Influence of Model Inputs on Performance of the EMT+VS Soil Moisture Downscaling Model for a Large Foothills Region in Northern Colorado," Journal of Hydrology, 650, 132397, doi: 10.1016/j.jhydrol.2024.132397.
[12] Faruk, Tanjim Bin, Abdul Matin, Shrideep Pallickara, and Sangmi Lee Pallickara. "TerraMAE: Learning Spatial-Spectral Representations from Hyperspectral Earth Observation Data via Adaptive Masked Autoencoders." (To appear) In Proceedings of the ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2025.
B. Dissertations and Theses
[1] Paahuni Khandelwal, Large-scale Predictive Modeling for Spatiotemporally Evolving Phenomena. Doctoral Dissertation, Computer Science, Colorado State University, 2025
[2] Frederic Larrieu, Enabling Effective Search Over Voluminous, High-dimensional Scientific Datasets, Masters Thesis, Computer Science, Colorado State University, 2025
[3] Kati Patterson, Time series Analysis over Sparse, Non-stationary Datasets with Variational Mode Decomposition and Transfer Learning, Masters Thesis, Computer Science, Colorado State University, 2025
[4] Kassidy Barram, Enabling Programmatic Interfaces for Voluminous Spatiotemporal Data Collections, Masters Thesis, Computer Science, Colorado State University, 2025
[5] Matthew Young, Rapid & Interactive Explorations of Voluminous Spatial Temporal Datasets, Masters Thesis, Computer Science, Colorado State University, 2025
[6] Tanjim Faruk, Towards Generating A Pre-training Image Transformer Framework for Preserving Spatio-Spectral Properties in Hyperspectral Satellite Images, Masters Thesis, Computer Science, Colorado State University, 2025
C. Web-based Soil Digital Twin for Soil Moisture
This is an ongoing effort. However, to seek feedback, the current version with the soil property mapping capability is available at: https://soiltwin.org/soils-in-silico/
What do you plan to do during the next reporting period to accomplish the goals?
  • Develop vision AI-based models for root-zone soil moisture forecasting.
  • Generate a training data corpus from the process-based model.
  • Measure and estimate the uncertainty of the root-zone soil moisture models.
  • Extend our HERMES framework to ingest model outputs with an effective model inference workflow.

Impacts
What was accomplished under these goals?
(1) Develop models for root zone soil moisture estimation (Objectives 1, 2, and 3)
During the reporting period, our modeling effort focused on developing a deep learning-based model that estimates soil moisture within the root zone. Root zone soil moisture is a key driver of decision-making for winter wheat growers: it indicates the water actually available where winter wheat will be seeded. If the root zone soil moisture level is not suitable for winter wheat, growers may consider different crop rotation plans. We have developed SUBTERRA, a science-guided deep learning framework designed to estimate soil moisture (SM) within the top 20 cm of the root zone. The key contribution of this activity is to overcome limitations of traditional data-driven models by integrating scientific models, particularly Richards' equation, into machine learning models. We combined sparse, high-precision in-situ sensor data with ancillary datasets such as satellite observations (SMAP), meteorological data (GridMET), soil hydraulic properties (POLARIS), and land-cover information (NLCD) to construct a more robust and physically consistent predictive model. Our methodology includes a Graph Neural Network (GNN) capable of handling heterogeneous and temporally misaligned data. SUBTERRA models spatial and temporal relationships through a graph structure where nodes represent diverse environmental variables and sensor readings, and edges encode physical relationships such as water movement between soil layers. The framework incorporates 24-hour soil moisture dynamics generated by solving Richards' equation to enhance the model's learning capacity, especially for depths where sensor data are missing or sparse. In SUBTERRA, we discretize the soil profile into 1 cm increments and simulate hourly soil moisture changes based on boundary conditions derived from in-situ measurements, meteorological inputs, and soil hydraulic properties.
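For reference, the one-dimensional Richards' equation behind the simulation described above is commonly written in the mixed form below. This is a textbook statement under assumed conventions (vertical coordinate z positive upward, pressure head h, volumetric soil moisture θ, unsaturated hydraulic conductivity K(h)). The report does not state which form, closure relations, or time-stepping scheme SUBTERRA uses, so the finite-volume, backward-Euler discretization shown here is only one plausible sketch, with Δz = 1 cm matching the profile increments mentioned above.

```latex
% Mixed form of the 1-D Richards' equation (z positive upward)
\frac{\partial \theta}{\partial t}
  = \frac{\partial}{\partial z}\left[ K(h)\left( \frac{\partial h}{\partial z} + 1 \right) \right]

% Darcy flux across the interface between layers i and i+1
% (assumed finite-volume discretization, \Delta z = 1\,\text{cm})
q_{i+1/2} = -K_{i+1/2}\left( \frac{h_{i+1} - h_i}{\Delta z} + 1 \right)

% Backward-Euler update of layer i over one time step \Delta t
\frac{\theta_i^{n+1} - \theta_i^{n}}{\Delta t}
  = -\frac{q_{i+1/2}^{n+1} - q_{i-1/2}^{n+1}}{\Delta z}
```

Here K_{i+1/2} denotes an interface conductivity, for example an average of the two neighboring layers; the boundary fluxes at the top and bottom of the profile would come from the meteorological inputs and in-situ measurements.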
These simulations provide training targets and input features for the GNN model, allowing it to learn from physically realistic patterns even in the absence of direct measurements. Our model, SUBTERRA, integrates scientific models into machine learning through data augmentation as well as extended training objectives. Our approach incorporates knowledge-guided loss functions that penalize deviations from physical laws, thereby enforcing scientific consistency in the model's predictions. This hybrid learning strategy enables SUBTERRA to deliver high-accuracy soil moisture estimates even under conditions of severe data sparsity and heterogeneity.
(2) Design and develop an intuitive cyberinfrastructure for scientists, growers, and other stakeholders (Objective 4)
As part of this project, we have developed a cyberinfrastructure for visualization and analytics of soil moisture (https://soiltwin.org/soils-in-silico/). Our framework, HERMES, provides a digital twin of soil at the CONUS scale, targeting soil properties and moisture content estimates. HERMES is based on the fundamental assumption that no single data store is suited to all forms of soil-related information. Our design is based on polyglot persistence, distributing datasets across storage systems based on their fitness for specific data modalities. Structured data, such as tabular results from soil surveys or metadata, reside in relational databases (PostgreSQL), where schema and indexing are well understood. Semi-structured data, including observational records, boundaries, and vector shapes, are managed using document stores (MongoDB). Raster data, with its emphasis on spatial colocation and its voluminous characteristics, is stored in tiled form across distributed key-value stores (Cassandra and Redis), enabling both parallel access and scalable partitioning.
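The polyglot persistence idea above can be illustrated with a small routing sketch. It is not the HERMES implementation: the `Modality` enum, the `route` and `tile_key` functions, the one-store-per-modality mapping, the 0.25-degree tile size, and the `layer:row:col` key layout are all hypothetical choices made for illustration. Only the store names come from the text.

```python
import math
from enum import Enum


class Modality(Enum):
    STRUCTURED = "structured"            # tabular soil-survey results, metadata
    SEMI_STRUCTURED = "semi_structured"  # observational records, boundaries, vector shapes
    RASTER = "raster"                    # gridded data stored as tiles


# Store names follow the text; mapping each modality to exactly one store
# is a simplifying assumption (the text lists both Cassandra and Redis for rasters).
STORE_FOR_MODALITY = {
    Modality.STRUCTURED: "postgresql",
    Modality.SEMI_STRUCTURED: "mongodb",
    Modality.RASTER: "cassandra",
}


def route(modality: Modality) -> str:
    """Return the name of the backend assumed to hold this data modality."""
    return STORE_FOR_MODALITY[modality]


def tile_key(layer: str, lat: float, lon: float, tile_deg: float = 0.25) -> str:
    """Build a hypothetical key for the fixed-size tile containing (lat, lon).

    Rows count tiles northward from latitude -90 and columns count tiles
    eastward from longitude -180, so nearby cells share a key prefix per layer.
    """
    row = math.floor((lat + 90.0) / tile_deg)
    col = math.floor((lon + 180.0) / tile_deg)
    return f"{layer}:{row}:{col}"


if __name__ == "__main__":
    print(route(Modality.RASTER))               # cassandra
    print(tile_key("smc_5cm", 40.57, -105.08))  # smc_5cm:522:299
```

Keying tiles by a grid row and column keeps spatially adjacent cells under predictable keys, which is one simple way to obtain the key-range lookups and partitioning that a tiled key-value layout relies on.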
To support unified analysis across these heterogeneous data systems, we introduce a spatiotemporal scoping mechanism that imposes a unifying metadata abstraction. Every data element, regardless of format or storage backend, has a spatiotemporal scope associated with it; that is, it is tagged with a well-defined spatial footprint and temporal window. This spatiotemporal scope serves as a bridge across systems, allowing datasets to be located, aligned, and queried in a federated fashion. Temporal reconciliation is handled explicitly. We have designed a federated query evaluation model in which sub-queries are pushed down to their respective data stores and executed natively. Each store's own language and indexing strategy are leveraged directly: SQL in PostgreSQL, MQL over BSON documents in MongoDB, and spatially constrained key-range lookups in systems such as Cassandra or Redis. The architecture also accommodates specialized indexing structures tailored to the capabilities of each store. These include B+ trees for efficient range queries in relational systems and 2dsphere indexes in document stores, which support spherical geometry operations that account for the curvature of the earth and are essential for distance operators and computations. This native execution avoids premature data movement and exploits the optimization strategies of each backend. However, the integration of such results is a challenge.
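The spatiotemporal scoping and push-down evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the HERMES code: the `Scope` and `Record` classes, the `federated_query` function, the catalog contents, and all dataset names and bounding boxes are hypothetical, and plain Python lists stand in for the real backends, which would evaluate each sub-query natively.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Scope:
    """Spatial footprint (lon/lat bounding box) plus a temporal window."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    start: date
    end: date

    def overlaps(self, other: "Scope") -> bool:
        # Two scopes overlap only if they intersect in both space and time.
        spatial = (self.min_lon <= other.max_lon and other.min_lon <= self.max_lon
                   and self.min_lat <= other.max_lat and other.min_lat <= self.max_lat)
        temporal = self.start <= other.end and other.start <= self.end
        return spatial and temporal


@dataclass(frozen=True)
class Record:
    name: str
    scope: Scope


def federated_query(catalog, query):
    """Evaluate the scope filter separately per store and collect the hits.

    Each per-store list comprehension stands in for a sub-query that a real
    backend would run in its own language (SQL, MQL, key-range lookups).
    """
    hits = {}
    for store, records in catalog.items():
        matched = [r.name for r in records if r.scope.overlaps(query)]
        if matched:
            hits[store] = matched
    return hits


# Hypothetical catalog: one record per backend named in the text.
CATALOG = {
    "postgresql": [Record("soil_survey_larimer",
                          Scope(-106.2, 40.2, -104.9, 41.0, date(2020, 1, 1), date(2025, 12, 31)))],
    "mongodb": [Record("station_obs_kansas",
                       Scope(-102.0, 37.0, -94.6, 40.0, date(2024, 1, 1), date(2024, 12, 31)))],
    "cassandra": [Record("smc_tiles_2023",
                         Scope(-109.0, 37.0, -102.1, 41.0, date(2023, 1, 1), date(2023, 12, 31)))],
}

# Hypothetical query: a small area near Fort Collins in late September 2024.
QUERY = Scope(-105.2, 40.4, -105.0, 40.7, date(2024, 9, 15), date(2024, 9, 30))

if __name__ == "__main__":
    print(federated_query(CATALOG, QUERY))  # {'postgresql': ['soil_survey_larimer']}
```

In this example the Kansas record fails the spatial test and the 2023 tiles fail the temporal test, so only one store contributes results. Merging results from several stores into a single aligned answer is the integration step that the text identifies as a challenge, and the sketch does not attempt it.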

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: Barram, Kassidy, Sangmi Lee Pallickara, and Shrideep Pallickara. "Scrybe: Enabling Programmatic Interfaces for Explorations Over Voluminous Spatiotemporal Data Collections." In 2024 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT), pp. 248-257. IEEE, 2024.
  • Type: Peer Reviewed Journal Articles Status: Published Year Published: 2025 Citation: Fischer, Samantha C., Jeffrey D. Niemann, Joseph Scalia, Matthew D. Bullock, Holly E. Proulx, Boran Kim, Timothy R. Green, and Peter J. Grazaitis. "Assessing the influence of model inputs on performance of the EMT+VS soil moisture downscaling model for a large foothills region in Northern Colorado." Journal of Hydrology 650 (2025): 132397.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: Larrieu, Federico, Tyson O'Leary, Sangmi Lee Pallickara, and Shrideep Pallickara. "Magellan: Enabling Effective Search Over Voluminous, High-dimensional Scientific Datasets." In 2024 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT), pp. 218-227. IEEE, 2024.
  • Type: Conference Papers and Presentations Status: Accepted Year Published: 2025 Citation: Faruk, Tanjim Bin, Abdul Matin, Shrideep Pallickara, and Sangmi Lee Pallickara. "Accounting for Spatial Variability with the Histogram of Oriented Gradients Based Masking Improves Performance of Masked Autoencoder over Hyperspectral Satellite Imagery (Student Abstract)." In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 28, pp. 29365-29367. 2025.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2025 Citation: Khandelwal, Paahuni, Jeffrey D. Niemann, David J. Mulla, Shrideep Pallickara, and Sangmi Lee Pallickara. "Subterra: Estimating Soil Moisture at Root Zone Depths Using Science-Guided Learning." In 2025 IEEE Conference on Artificial Intelligence (CAI), pp. 328-335. IEEE, 2025.
  • Type: Peer Reviewed Journal Articles Status: Published Year Published: 2025 Citation: Bindner, Joseph R., Holly Proulx, Kevin Wickham, Jeffrey D. Niemann, Joseph Scalia IV, Timothy R. Green, and Peter J. Grazaitis. "Dependence of Soil Moisture and Strength on Topography and Vegetation Varies Within a SMAP Grid Cell." Hydrology 12, no. 2 (2025): 34.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: Lewark, Everett, Matthew Young, Paahuni Khandelwal, Sangmi Lee Pallickara, and Shrideep Pallickara. "Periscope: A Framework for Visualizations of Multiresolution Spatiotemporal Data at Scale." In 2024 IEEE International Conference on Big Data (BigData), pp. 1373-1380. IEEE, 2024.
  • Type: Peer Reviewed Journal Articles Status: Published Year Published: 2024 Citation: Sahaar, Shukran A., and Jeffrey D. Niemann. "Estimating Rootzone Soil Moisture by Fusing Multiple Remote Sensing Products with Machine Learning." Remote Sensing 16, no. 19 (2024): 3699.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2025 Citation: Hansen, Paige, Nathan Orwick, Kassidy Barram, Pierce Smith, Jay Breidt, Sangmi Lee Pallickara, and Shrideep Pallickara. "Archimedes: A Framework to Support Distributional Similarity Analysis over Arbitrary Spatiotemporal Scopes at Scale." In 2025 IEEE 25th International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 215-225. IEEE, 2025.
  • Type: Conference Papers and Presentations Status: Accepted Year Published: 2025 Citation: Andrei Bachinin, Rupasree Dey, Paahuni Khandelwal, Sam Leuthold, M. Francesca Cotrufo, Shrideep Pallickara, and Sangmi Lee Pallickara, Science-Informed Multitask Transformer for Soil Property Prediction from FTIR Spectroscopy, (To appear) Proceedings of the IEEE eScience Conference, 2025.
  • Type: Conference Papers and Presentations Status: Accepted Year Published: 2025 Citation: Tanjim Faruk, Abdul Matin, Shrideep Pallickara, Sangmi Lee Pallickara, TerraMAE: Learning Spatial-Spectral Representations from Hyperspectral Earth Observation Data via Adaptive Masked Autoencoders, (To appear) Proceedings of the 33rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL 2025), Minneapolis, MN, 2025.
  • Type: Theses/Dissertations Status: Published Year Published: 2025 Citation: Paahuni Khandelwal, Large-scale Predictive Modeling for Spatiotemporally Evolving Phenomena. Doctoral Dissertation, Computer Science, Colorado State University, 2025
  • Type: Theses/Dissertations Status: Published Year Published: 2025 Citation: Frederic Larrieu, Enabling Effective Search Over Voluminous, High-dimensional Scientific Datasets, Masters Thesis, Computer Science, Colorado State University, 2025
  • Type: Theses/Dissertations Status: Published Year Published: 2025 Citation: Kati Patterson, Time series Analysis over Sparse, Non-stationary Datasets with Variational Mode Decomposition and Transfer Learning, Masters Thesis, Computer Science, Colorado State University, 2025
  • Type: Theses/Dissertations Status: Published Year Published: 2025 Citation: Kassidy Barram, Enabling Programmatic Interfaces for Voluminous Spatiotemporal Data Collections, Masters Thesis, Computer Science, Colorado State University, 2025
  • Type: Theses/Dissertations Status: Published Year Published: 2025 Citation: Matt Young, Rapid & Interactive Explorations of Voluminous Spatial Temporal Datasets, Masters Thesis, Computer Science, Colorado State University, 2025
  • Type: Theses/Dissertations Status: Published Year Published: 2025 Citation: Tanjim Faruk, Towards Generating A Pre-training Image Transformer Framework for Preserving Spatio-Spectral Properties in Hyperspectral Satellite Images, Masters Thesis, Computer Science, Colorado State University, 2025
  • Type: Websites Status: Other Year Published: 2025 Citation: https://soiltwin.org/soils-in-silico/ Interactive Web-based visual analytics environment for soil moisture maps over CONUS