Source: UNIVERSITY OF FLORIDA submitted to NRP
DATA DRIVEN DISCOVERY IN ECOLOGY
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
ACTIVE
Funding Source
Reporting Frequency
Annual
Accession No.
1023956
Grant No.
(N/A)
Cumulative Award Amt.
(N/A)
Proposal No.
(N/A)
Multistate No.
(N/A)
Project Start Date
Aug 1, 2020
Project End Date
Jul 31, 2025
Grant Year
(N/A)
Program Code
[(N/A)]- (N/A)
Recipient Organization
UNIVERSITY OF FLORIDA
G022 MCCARTY HALL
GAINESVILLE, FL 32611
Performing Department
Wildlife Ecology and Conservation
Non Technical Summary
We will use data-driven discovery, where large amounts of data are used to generate understanding of patterns and processes, to provide an improved understanding of ecological systems and make predictions for how they will change in the future. This work includes the development and assessment of ecological models and cyberinfrastructure for the purpose of understanding how ecological systems change through time and making forecasts for their future states. It also includes creating new data on ecological systems by developing methods and pipelines for converting remote sensing data into ecological data and using this derived data to model ecological systems. This work is supported by the general development of computational tools to make working with scientific data faster and easier, and by increasing the number of scientists conducting data-driven discovery through developing and delivering training and broadening participation in this area of research.
Animal Health Component
0%
Research Effort Categories
Basic
100%
Applied
0%
Developmental
0%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
123                    0699                              1070                      25%
901                    7310                              1070                      50%
901                    7210                              1070                      25%
Goals / Objectives
Data-driven discovery, where large amounts of data are used to generate understanding of patterns and processes, is considered the fourth major scientific paradigm. While a number of disciplines have actively embraced this paradigm, ecology has been slow to do so. The major goal of this project is to leverage data-driven discovery to produce an improved understanding of ecological systems.
Objectives:
1. Development & assessment of ecological models: Use a diverse array of ecological data sources to help develop ecological models and evaluate the performance of existing models and theories.
2. Ecological forecasting: Develop ecological forecasts for biodiversity and plant phenology using large datasets that can be used for making actionable, societally relevant forecasts.
3. Cyberinfrastructure development: Build software that automates the acquisition, cleaning, processing, and merging of publicly available ecological datasets so that researchers can spend their time doing research, not struggling with data management.
4. Remote sensing: Develop methods and pipelines that yield precise individual-level data on millions of individual trees. Use this data to address ecological questions at unprecedented scales.
5. Training in data-driven discovery: Development and delivery of courses related to improving and broadening participation in the use of data-driven discovery in ecology more broadly.
Project Methods
Synthetic analysis of existing data: Most of our research involves the synthetic analysis of existing data using a diverse array of computational and statistical methods. Specifically, we analyze large (often publicly available) ecological datasets using R, Python, Julia, PostgreSQL, and SQLite.
High performance computing: We use the University of Florida's HiPerGator for data intensive analyses using both CPUs and GPUs.
Deep learning: We use convolutional neural networks and other forms of deep learning to both develop ecological models and convert remote sensing data into ecological information.
Software engineering: We use git, GitHub, continuous integration, automated testing, documentation development tools, and package management systems to produce high quality scientific software for data management and analysis.

Progress 10/01/20 to 09/30/21

Outputs
Target Audience: Academic scientists
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Two graduate students and one undergraduate received training in interdisciplinary collaboration and networking opportunities for professional development.
How have the results been disseminated to communities of interest? Publications, software releases, research related social media posts, new websites.
What do you plan to do during the next reporting period to accomplish the goals?
1. Development & assessment of ecological models: We plan to compare process based and empirical models for describing the dynamics of small mammals in Arizona.
2. Ecological forecasting: We plan to evaluate the importance of the biological context (presence or absence of competitors) for influencing the predictions and accuracy of ecological forecasts in small mammals. We also plan to start developing new forecasts for wading birds in the Everglades.
3. Cyberinfrastructure development: We plan to develop software for automatically processing and integrating data from the National Ecological Observatory Network to allow for the development of species predictions from remote sensing data.
4. Remote sensing: We plan to expand our remote sensing work to include new and improved methods and pipelines for predicting the species identity of individual birds and trees from remote sensing imagery.
5. Training in data-driven discovery: We plan to produce translations of some of the material in PI White's Data Carpentry for Biologists course to improve and broaden participation in the use of data-driven discovery in ecology more broadly.

Impacts
What was accomplished under these goals?
1. Development & assessment of ecological models: Development of new models for the population dynamics of small mammals in Arizona.
2. Ecological forecasting: Ongoing development and improvements to ecological forecasts for small mammals in Arizona, including a shift to high performance computing infrastructure for making forecasts and a new dynamic website for displaying the results of the forecasts.
3. Cyberinfrastructure development: Ongoing development of the Data Retriever software in Python, R, and Julia.
4. Remote sensing: Development of a new pipeline for conducting end-to-end processing of remote sensing imagery of birds in the Everglades. Development of new AI deep learning models to detect birds in remote sensing imagery.
5. Training in data-driven discovery: Improvements to the Data Carpentry for Biologists course, including development of a full set of video lectures that have been viewed thousands of times on YouTube.

Publications

  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Weinstein, B.G., S.J. Graves, S. Marconi, A. Singh, A. Zare, D. Stewart, S.A. Bohlman, E.P. White. 2021. A benchmark dataset for individual tree crown delineation in co-registered airborne RGB, LiDAR and hyperspectral imagery from the National Ecological Observation Network. PLOS Computational Biology 17:e1009180 https://doi.org/10.1371/journal.pcbi.1009180
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Simonis, J.L., E.P. White, S.K. Morgan Ernest. 2021. Evaluating probabilistic ecological forecasts. Ecology 102:e03431. https://doi.org/10.1002/ecy.3431
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Marconi, S., S.J. Graves, B.G. Weinstein, S. Bohlman, and E.P. White. 2021. Estimating individual level plant traits at scale. Ecological Applications 31:e02300 https://doi.org/10.1002/eap.2300
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Senyondo, H., D.J. McGlinn, P. Sharma, D.J. Harris, H. Ye, S.D. Taylor, J. Ooms, F. Rodríguez-Sánchez, K. Ram, A. Pandey, H. Bansal, M. Pohlman, and E.P. White. 2021. Rdataretriever: R Interface to the Data Retriever. Journal of Open Source Software 6:2800 https://doi.org/10.21105/joss.02800
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Weinstein, B.G., S. Marconi, S. Bohlman, A. Zare, A. Singh, S.J. Graves, E.P. White. 2021. A remote sensing derived data set of 100 million individual tree crowns for the National Ecological Observatory Network. eLife 10:62922 https://doi.org/10.7554/eLife.62922
  • Type: Journal Articles Status: Published Year Published: 2020 Citation: Weinstein, B.G., S. Marconi, M. Aubry-Kientz, G. Vincent, H. Senyondo, E.P. White. 2020. DeepForest: A Python package for RGB deep learning tree crown delineation. Methods in Ecology and Evolution 11:1743-1751. https://doi.org/10.1111/2041-210X.13472
  • Type: Journal Articles Status: Published Year Published: 2020 Citation: Taylor, S.D., J.R. Coyle, E.P. White, and A.H. Hurlbert. 2020. A simulation study of the use of temporal occupancy for identifying core and transient species. PLOS ONE. https://doi.org/10.1371/journal.pone.0241198
  • Type: Journal Articles Status: Published Year Published: 2020 Citation: Adler, P.B., E.P. White, M.H. Cortez. 2020. Matching the forecast horizon with the relevant ecological processes. Ecography 43:1729-1739. https://doi.org/10.1111/ecog.05271


Progress 08/01/20 to 09/30/20

Outputs
Target Audience: Academic scientists
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Nothing Reported
How have the results been disseminated to communities of interest? Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals?
1. Development & assessment of ecological models: We plan to compare general and site-specific models for species classification in trees and develop new classification models based on deep learning.
2. Ecological forecasting + Cyberinfrastructure development: We plan to update our automated forecasting system for desert rodents to use more high performance computing and have a dynamic front end.
4. Remote sensing: We plan to expand our remote sensing work to include methods and pipelines that yield precise individual-level data on birds.
5. Training in data-driven discovery: We plan to produce online video content for PI White's Data Carpentry for Biologists course to improve and broaden participation in the use of data-driven discovery in ecology more broadly.

Impacts
What was accomplished under these goals? Work was conducted on cyberinfrastructure development and remote sensing leading to ongoing forward progress for the project, but the reporting period is only ~1 month so it is hard to describe things as "accomplished".

Publications