DSFAS: Leveraging multi-scale, multi-purpose open big data and machine learning to improve forecasts of and decision support for emerging pest threats

Sponsoring Institution

National Institute of Food and Agriculture

Project Status

ACTIVE

Funding Source

AFRI COMPETITIVE GRANT

Reporting Frequency

Annual

Accession No.

1028209

Grant No.

2022-67021-36465

Cumulative Award Amt.

$649,977.00

Proposal No.

2021-11479

Multistate No.

(N/A)

Project Start Date

Jan 15, 2022

Project End Date

Jan 14, 2027

Grant Year

2022

Program Code

[A1541]- Food and Agriculture Cyberinformatics and Tools

Recipient Organization
NORTH CAROLINA STATE UNIV
(N/A)
RALEIGH,NC 27695

Performing Department
(N/A)

Non Technical Summary
The spread of invasive plant pests and pathogens (hereafter pests) is a well-known ecological and economic threat to agriculture, responsible for 10-40% of crop yield losses globally and an estimated $40 billion of production losses each year in the United States. The threat pests pose to food security is expected to increase due to climate change and the global nature of trade and travel. Although many pests are under regulatory control to prevent and mitigate outbreaks, control or eradication after pest establishment can be resource-intensive, and success requires rapid detection and effective implementation of appropriate strategies. Even preventative pesticide measures can be costly and environmentally damaging if over-applied and ineffective if under-applied or incorrectly timed, all with the potential for promoting pesticide resistance and loss of natural controls. Rapid responses and data-driven decision support tools are therefore essential for understanding and mitigating threats posed by damaging agricultural pests. However, sparse data typically limit the accuracy and iterative improvement of pest spread models. This research will couple advances in object detection using machine learning with widely available crowdsourced and satellite imagery to build an automated, repeatable process for expanding mapping efforts of susceptible crops (i.e., host species) essential to forecasting pest spread accurately and across multiple scales. The resulting maps of host species will improve pest spread models by addressing sparse data concerns and reducing delays in data availability, thereby enabling the continuous improvement of pest spread forecasts and shortening time to decision making. We will focus on several economically and culturally significant fruit and tree nut species threatened by emerging pests and climate change.
We will collaborate with USDA Animal and Plant Health Inspection Service (APHIS), USDA Agricultural Research Service (ARS), state departments of agriculture, and growers associations to: (1) identify key pest threats to fruit and tree nut crops, (2) iteratively develop and validate host species maps and model forecasts, (3) continue co-developing our user-friendly decision support tool, the PoPS Forecasting Platform, and (4) add an alert system that translates forecasts and simulations into actionable insights for crop protection. The iterative near-term forecasting system, coupled with data inputs enhanced using machine learning, will reduce costs for pest surveys and help growers identify when and where to intervene to protect their crops, thus reducing production losses and chemical pesticide inputs.

Animal Health Component

60%

Research Effort Categories

Basic

10%

Applied

60%

Developmental

30%

Classification

Knowledge Area (KA)	Subject of Investigation (SOI)	Field of Science (FOS)	Percent
211	1219	1070	30%
211	1199	1070	30%
211	1040	1070	10%
211	1112	1070	10%
211	1213	1070	10%
211	1139	1070	10%

Knowledge Area
211 - Insects, Mites, and Other Arthropods Affecting Plants;

Subject Of Investigation
1112 - Cherry; 1199 - Deciduous and small fruits, general/other; 1213 - Walnut; 1139 - Grapes, general/other; 1219 - Edible tree nuts, general/other; 1040 - Mango;

Field Of Science
1070 - Ecology;

Keywords

host species distribution

image classification

object recognition

plant pest and pathogen spread

Goals / Objectives
Rapid responses and data-driven decision support tools are essential for understanding and mitigating threats posed by agricultural pests and pathogens. However, sparse data typically limit the accuracy and iterative improvement of pest spread models. The major goal of this project is to improve agricultural pest management decision support tools for stakeholders by coupling machine learning enhanced host maps with iterative near-term forecasts of pest spread. Our iterative near-term forecasting system, coupled with data inputs enhanced using machine learning, will reduce costs for pest surveys and help growers identify when and where to intervene to protect their crops, thus reducing production losses and chemical pesticide inputs. To achieve this goal our objectives are:

1. Connect geotagged images with remote sensing technologies to create inventories of specialty crop hosts at scale. Creating geospatial inventories of host species at scale is challenged by data availability. Smaller-scale static maps can be created from time-intensive ground surveys or aerial imagery but are difficult to create and iteratively update over time at regional or country scales. Automating the process by leveraging and integrating remote sensing data portals, vetted crowdsourced observational data, and cloud computing can improve forecasting accuracy, reduce uncertainty, and importantly, decrease the time required to get predictions of pest spread. Specifically, we will use machine learning to identify host species and location from an image source using object detection and feed these new host location points to a machine learning algorithm to classify host area. To accomplish this objective, we will:
- Develop an object detection algorithm to identify host species from open imagery data to expand spatial extent and reduce latency in species observations
- Expand satellite-based image classification algorithm to scale host mapping
- Assess accuracy of image classification through the collection of ground truth information from stakeholder engagement

2. Develop and compare iterative, near-term forecasts of pest spread affecting stakeholder-identified fruit and nut trees with and without enhanced host maps. Engaging with stakeholders to fully understand their needs and what information is required to make informed decisions facilitates the development of systems that better meet those needs and provide management-relevant information. Here we will build on our current partnerships with USDA APHIS and state departments of agriculture, as well as build new partnerships with growers associations and local producers, to collaborate at all stakeholder levels. We also aim to speed up the data integration process by incorporating multi-scale data from our host mapping process, open citizen science data, and environmental drivers. We will quantify changes in forecast speed (i.e., time to delivery to decision makers) and accuracy due to improved data acquisition and integration of multi-scale data. To accomplish this objective, we will:
- Create forecasts of pest spread affecting stakeholder-identified crops
- Quantify change in accuracy and precision of forecasts with updated host maps

Project Methods
Forecasting the spread of pests requires an understanding of the complex interactions between a pest, its host species, and environmental conditions. Sparse data, especially reference data as climate conditions change, are a well-known challenge facing the improvement and robustness of pest models. Spatial maps of host locations are a critical input to forecasts and currently are time-consuming to create. Methods for mapping host locations must balance trade-offs between acquisition cost and effort, spatial extent, spatial and temporal resolution, and computational requirements. We will use machine learning to identify host species and location from an image source using object detection and feed these new host location points to an ML algorithm to classify host area.

For our first objective (i.e., create inventories of specialty crop hosts at scale), we will acquire geotagged imagery of host species from iNaturalist, GBIF, and the Early Detection and Distribution Mapping System (EDDMapS) using their respective application programming interfaces (APIs). Data from these sources will include the image file and associated metadata including but not limited to location (i.e., coordinates), date of image capture, and species name. We will evaluate image and species identification quality and remove duplicates using common metadata tags and outlier checks. Data meeting quality thresholds will be used for two related classification tasks.

The first algorithm will use the image, along with researcher-generated bounding boxes, and the species label to train an object detection algorithm to locate and classify a host species from an image.
To increase the amount of annotated data available, we will also leverage recently created open data relevant to agriculture and plant pests that have been curated and annotated in concert by volunteers and experts. We place specific focus on deep convolutional neural networks (CNNs) given their success and accuracies above 70% in similar image classification and object detection efforts. We will also leverage transfer learning, which is the use of a model trained for a similar task on a large dataset as a starting point for developing a second model, to help accelerate model training. Data will be pre-processed (e.g., resized to a uniform input size) and separated into training, validation, and test datasets. Image augmentations will be applied to training data to expand the variety of available images for training and increase the model's ability to generalize; these augmentations may include adjusting contrast or brightness, reducing image quality, and rotation. Model performance will be evaluated by examining classification (e.g. accuracy, precision, recall) and object detection (e.g. mean average precision) metrics, as well as visualizations to evaluate under- or overfitting. Model development will also focus on trying different network architectures and regularization techniques to minimize overfitting, and on evaluating the subsequent impacts on computational requirements and evaluation metrics. To increase the amount of data available for a particular host species, a second development phase will explore using the geotagged iNaturalist, GBIF, and EDDMapS data as automated data labels where they spatially coincide with Google Street View (GSV) and OpenStreetMap imagery data, in addition to potentially tagging GSV data manually.

Once developed, the classifier will be applied to unseen data from GSV to estimate host species class probabilities. For images meeting or exceeding a specified threshold (e.g.
>50% likely to be the species of interest), we will convert the object bounding box to a geographic coordinate and verify the classification, either with spatially coincident citizen science data or by randomly selecting a subset of images and validating them manually, in collaboration with stakeholders on the ground and through image interpretation of a stratified random sample of points. Positively identified outputs will then become additional point data for host occurrence maps and the raster presence/absence algorithm.

The second algorithm will automate, validate, and expand the classification process for specialty crops in commercial production and in developed areas using high-resolution satellite imagery and open data. Specifically, the host species presence data queried from geotagged imagery above will also serve as automated labeled points for pixels in satellite images (e.g. Landsat, Sentinel, Planet) to scale host presence maps to state and regional areas. We will use a classifier previously developed by our team that applies repeat remote sensing observations in a Bayesian framework to provide uncertainty estimates of species presence. These estimates are converted to binary presence/absence maps at 10-m or 30-m spatial resolution and then rescaled to percent area mean and standard deviation maps at a stakeholder-selected resolution. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve will indicate classification performance across thresholds.

In addition to checking agreement between predictions based on unlabeled GSV data and broad categories in coarser resolution products like the NLCD (i.e., cultivated crops, deciduous forest) and crop classes in CDL when available (e.g. grapes, peaches), we also propose ground-truthing a percentage of labeled images that are in locations near USDA APHIS or state department of agriculture field surveys.
Stakeholders at USDA APHIS and state departments of agriculture will validate the classification of the image either by visually assessing the image or visiting the location during a field work session. Coupling locations with current field surveys is necessary to reduce the cost of manual validation. We will use the validation findings to inform data and model improvements as part of our iterative process.

For our second objective (i.e., develop and compare forecasts of pest spread with and without machine learning enhanced host maps), we will use the PoPS (Pest or Pathogen Spread) Forecasting System developed by our team to build forecasts for pests of fruit and nut trees. PoPS is an open-source, dynamic, process-based, spatially-explicit model that forecasts the spread of pests based on current locations and environmental and biological conditions specific to the pest being modeled. We will use the host mapping products created in objective 1, with fully specified spatially explicit uncertainties, to propagate uncertainty from the host maps through the PoPS forecasts. Our data integration protocol will draw a unique value for each stochastic simulation of the model for every raster cell; e.g. if we are running 100,000 simulations, each simulation will draw a unique host map from the mean and standard deviation and use that map for the entire duration of the simulation, for a total of 100,000 unique host maps. Providing probabilistic forecasts with fully specified uncertainties allows decision makers to target areas of high uncertainty for surveys and target high-probability areas for management. In this way, future decisions should have less uncertainty due to the linkage between surveys, management, and iterative forecasts.

We will also compare improvements in model performance using previous host maps and our new host maps (objective 1).
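The per-simulation host map draw described above can be illustrated with a minimal sketch. It is not the PoPS implementation. It assumes that each raster cell is drawn independently from a normal distribution defined by that cell's mean and standard deviation, and that values are clipped to 0-100 because the maps hold percent host area. The function names, the toy rasters, and the seed are hypothetical.

```python
import random


def draw_host_map(mean_map, sd_map, rng):
    """Draw one host raster: each cell is sampled from Normal(mean, sd).

    Values are clipped to 0-100 because the maps hold percent host area.
    The normal distribution and the clipping are assumptions of this sketch.
    """
    return [
        [min(100.0, max(0.0, rng.gauss(mean, sd))) for mean, sd in zip(mean_row, sd_row)]
        for mean_row, sd_row in zip(mean_map, sd_map)
    ]


def draw_host_maps(mean_map, sd_map, n_simulations, seed=0):
    """Return one independently drawn host map per stochastic simulation."""
    rng = random.Random(seed)  # seeded so the draws are reproducible
    return [draw_host_map(mean_map, sd_map, rng) for _ in range(n_simulations)]


# Toy 2x2 rasters of percent host area (mean and standard deviation per cell).
mean_map = [[10.0, 50.0], [0.0, 90.0]]
sd_map = [[2.0, 10.0], [0.0, 15.0]]

# Five simulations here; the text above describes runs of 100,000.
host_maps = draw_host_maps(mean_map, sd_map, n_simulations=5)
for i, host_map in enumerate(host_maps):
    print(f"simulation {i}: {host_map}")
```

Each simulation would then use its own drawn map for its entire duration, so the spread of outcomes across simulations reflects the uncertainty in the host map as well as the stochasticity of the spread model.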
Model performance or skill will be assessed using multiple metrics as no single metric adequately captures the multiple dimensions of forecast accuracy. We will use root mean squared error (RMSE) to compare modeled population levels to observed population levels from field surveys. We will use accuracy, precision, recall/sensitivity, specificity, and odds ratio for overall model performance across the entire landscape.
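The metrics named above can be written out as a short sketch under stated assumptions: observed and forecast presence are compared cell by cell as binary values (1 = present, 0 = absent), and the odds ratio is taken as (TP x TN) / (FP x FN), as is common in forecast verification. The function names and the toy data are hypothetical and are not project results.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and modeled population levels."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def confusion_counts(observed, predicted):
    """Count TP, FP, FN, TN over cells with binary presence (1) / absence (0)."""
    tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    fp = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 1)
    fn = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 0)
    tn = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    return tp, fp, fn, tn


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def skill_metrics(tp, fp, fn, tn):
    """Landscape-level metrics derived from the confusion counts."""
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + fn + tn),
        "precision": safe_ratio(tp, tp + fp),
        "recall": safe_ratio(tp, tp + fn),  # also called sensitivity
        "specificity": safe_ratio(tn, tn + fp),
        "odds_ratio": safe_ratio(tp * tn, fp * fn),
    }


# Toy data: pest presence per cell (observed vs. forecast), made up for illustration.
observed_presence = [1, 1, 1, 0, 0, 0, 0, 1]
predicted_presence = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = skill_metrics(*confusion_counts(observed_presence, predicted_presence))
print(metrics)

# Toy population levels from a field survey vs. the model, also made up.
population_rmse = rmse([10, 20, 30, 40], [13, 17, 33, 37])
print(population_rmse)
```

Returning NaN for an undefined ratio, rather than raising an error, keeps a batch evaluation running when a landscape has, for example, no false positives or no false negatives.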

Progress 01/15/24 to 01/14/25

Outputs
Target Audience: We attended regular meetings with stakeholders at USDA APHIS and state departments of agriculture to: 1) inquire about how they would use forecasts and host maps in their workflow for managing pest outbreaks; and 2) identify growers and producers to include in future working group conversations. We provided science-based knowledge opportunities for 4 North Carolina State University students (3 undergraduate students and 1 graduate student) and 1 North Carolina School of Science & Mathematics student. Students gained practical experience (e.g. data collection, analysis, scientific writing, programming) and knowledge on topics related to ecological modeling and remote sensing.

Changes/Problems: Nothing Reported

What opportunities for training and professional development has the project provided? We have undertaken one-on-one mentorship or individual study with several students (4 North Carolina State University students and 1 North Carolina School of Science & Mathematics high school student). We have also provided instruction on developing machine learning models to several graduate students at North Carolina State University. These activities resulted in greater proficiency in programming, geospatial data analytics, spatiotemporal modeling, ecological forecasting, scientific writing, presentation, and data interoperability.

How have the results been disseminated to communities of interest? We have disseminated results to communities of interest at several professional meetings:
- Poster presentation at the International Association of Landscape Ecology-North America meeting in Oklahoma City, OK in April, 2024. Title: "Invasive Tree Detection with AI and Street-View Imagery"
- Poster presentation at the USDA NIFA Project Director annual meeting for AFRI programs A1521 (Engineering for Agricultural Production and Processing), A1551 (Engineering for Precision Agriculture), and A1541 (Data Science for Food & Agriculture Systems), held in conjunction with the International Conference on Precision Agriculture (ISPA) in Manhattan, KS in July, 2024. Title: "Invasive Tree Detection with AI and Street-View Imagery"
- Poster presentation at the USDA Spotted Lanternfly (SLF) Research and Technology Development meeting in Wooster, OH in October, 2024. Title: "Invasive Tree Detection with AI and Street-View Imagery"

What do you plan to do during the next reporting period to accomplish the goals?

Major Goal 1 - Objectives 1 & 2: We found that evaluation metrics for hosts identified in the image classification model varied widely per genus (Maples: Precision = 0.79, Recall = 0.71; Cherries: Precision = 0.69, Recall = 0.62; Tree-of-Heaven: Precision = 0.51, Recall = 0.54; Walnuts: Precision = 0.21, Recall = 0.41). To better understand what drives variation in model performance, we will test our image classification model on street-level images captured in different months of the year. This approach will show whether host classification improves when images are captured during specific periods (e.g., May - August; summer). We will focus on genera with distinct seasonal changes in appearance (e.g., Maples) that are common in our dataset. These findings will also inform future attempts to scale host mapping through satellite-based imagery.

Major Goal 1 - Objective 3: We have ongoing collaborations with the Spotted Lanternfly Strategic Planning Working Group (SLF SPWG). These stakeholders are interested in evaluating our approaches to detect non-crop and crop host species using street-level imagery. We will test our models in regions where Spotted Lanternfly is currently absent but is forecast to spread in the near future. We will disseminate our findings of newly identified hosts (e.g., Walnuts, Cherries, Tree-of-Heaven). These new host locations can then be evaluated by stakeholders for accuracy assessment.

Major Goal 2 - Objectives 2/3: We will develop species distribution models for each host species with our identified host species observations and existing host data (the enhanced host map) versus existing host species data only. We will then calibrate and evaluate the forecast skill of a Spotted Lanternfly pest spread forecast with and without the enhanced host maps. We will disseminate these findings at the 2025 Ecological Forecasting Initiative Meeting.

Impacts
What was accomplished under these goals?

Major Goal 1: Connect geotagged images with remote sensing technologies to create inventories of specialty crop hosts at scale.

Objective 1: Develop an object detection algorithm to identify host species from open imagery data to expand spatial extent and reduce latency in species observations.

Activities completed: We developed and evaluated a set of machine learning models to identify and classify host species using crowdsourced geotagged images (i.e., iNaturalist, EDDMapS) and street-level images (i.e., Google Street View). We first trained an object detection model termed 'You-Only-Look-Once' (YOLO) to identify potential host trees in street-view images. The object detection model works by dividing an image into multiple candidate regions and predicting whether an object (here, a tree) occurs in each region. To develop the object detection model, we relied on a newly available research-grade dataset, termed 'Auto Arborist'. The Auto Arborist dataset contains over two million street-level images of trees from 300 genus-level categories across 23 cities in the US and Canada. After identifying trees in street-level images, we then developed a separate image classification model termed 'EfficientNet' to classify the detected hosts to genus and species. To develop the image classification model, we again relied on the Auto Arborist dataset and integrated crowdsourced images of host tree species from iNaturalist and EDDMapS. Both of these models used convolutional neural networks (CNNs) and large, diverse sets of training images to identify 100 unique non-crop and crop host trees. After identifying and classifying our target trees, we additionally locate each tree using a triangulation algorithm and multiple images of the same tree. With these models, we will expand the spatial extent and reduce the latency of species observations.
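The report does not describe the triangulation algorithm in detail. As a minimal two-view sketch of the general idea, the code below intersects two sight lines, each defined by a camera position and a compass bearing toward the same tree. It assumes planar (projected) coordinates in meters and bearings measured clockwise from north; in practice a bearing would have to be derived from camera heading and the position of the bounding box in the image. All names and values are hypothetical.

```python
import math


def bearing_to_unit_vector(bearing_deg):
    """Convert a compass bearing (degrees clockwise from north) to (east, north)."""
    theta = math.radians(bearing_deg)
    return math.sin(theta), math.cos(theta)


def triangulate(cam1, bearing1_deg, cam2, bearing2_deg, eps=1e-9):
    """Intersect two sight lines and return the (x, y) of the tree.

    Returns None when the lines are parallel or when the intersection lies
    behind either camera, since neither case gives a usable position.
    """
    d1 = bearing_to_unit_vector(bearing1_deg)
    d2 = bearing_to_unit_vector(bearing2_deg)

    # 2D cross product of the two directions; zero means parallel sight lines.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < eps:
        return None

    dx, dy = cam2[0] - cam1[0], cam2[1] - cam1[1]
    t1 = (dx * d2[1] - dy * d2[0]) / denom  # distance along ray 1
    t2 = (dx * d1[1] - dy * d1[0]) / denom  # distance along ray 2
    if t1 <= 0 or t2 <= 0:
        return None

    return cam1[0] + t1 * d1[0], cam1[1] + t1 * d1[1]


# Two camera positions 10 m apart, both sighting the same tree.
tree_xy = triangulate((0.0, 0.0), 45.0, (10.0, 0.0), 315.0)
print(tree_xy)
```

With more than two images of the same tree, one option would be to intersect every pair of sight lines and average the resulting points, which dampens the effect of bearing error in any single image.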
Data collected: We downloaded crowdsourced and geotagged images of host species from iNaturalist.com and Eddmaps.org to assist in training our image classification model. These websites provide community-reviewed, research-grade images of a variety of crop and non-crop host species (e.g., Black Walnut, Juglans nigra; Mango, Mangifera indica; Tree-of-Heaven, Ailanthus altissima). In total, we downloaded N = 3,605,169 images across 100 genera.

Results: After training our object detection and image classification models, we evaluated their performance on an independent set of testing images not used during model training. The performance of machine learning models is typically evaluated using a ratio of true positive (TP) predictions, or the number of times the model correctly identifies an instance in an image, relative to false positive (FP) or false negative (FN) detections. We calculated class-wise precision, or the proportion of all identified positives that are correct, as precision = TP / (TP + FP). We calculated class-wise recall, or the ability to predict all true positives (i.e., True Positive Rate (TPR)), as recall = TP / (TP + FN). We found the object detection model had a high precision of 0.87 and a recall of 0.82 for identifying potential host trees in street-level images. Evaluation metrics varied by genus for trees classified by the image classification model (e.g., Maples: Precision = 0.79, Recall = 0.71; Tree-of-Heaven: Precision = 0.51, Recall = 0.54; Walnuts: Precision = 0.21, Recall = 0.41). These preliminary results will inform future attempts to scale host mapping through street-view and satellite-based imagery.

Objective 2: Expand satellite-based image classification algorithm to scale host mapping.

Data collected: The Auto Arborist dataset contains paired aerial and street-level images of trees from 300 genus-level categories. From this dataset, we obtained aerial images for the following target crop and non-crop hosts:
- 40,910 aerial images for Cherries (Prunus spp.)
- 4,886 aerial images for Walnut (Juglans spp.)
- 1,003 aerial images for Avocado (Persea spp.)
- 3,656 aerial images for Tree of Heaven (Ailanthus spp.)
- 19,724 aerial images for Apples (Malus spp.)

Objective 3: Assess accuracy of image classification through the collection of ground truth information from stakeholder engagement. Nothing to report.

Major Goal 2: Develop and compare iterative, near-term forecasts of pest spread affecting stakeholder-identified fruit and nut trees with and without enhanced host maps.

Objective 1: Engaging with stakeholders to fully understand their needs and what information is required to make informed decisions.

Activities completed: We attended regular meetings with USDA APHIS and state departments of agriculture to discuss how they would use forecasts and host maps in their workflow for managing pest outbreaks. We also identified growers and producers to include in future working group meetings.

We have ongoing collaborations with the Spotted Lanternfly Strategic Planning Working Group, representing interests from the Animal and Plant Health Inspection Service (APHIS), the National Plant Board (NPB), and the National Association of State Departments of Agriculture (NASDA). These stakeholders and collaborators indicated interest in expanding the geographic scope of our analysis in Major Goal 1 / Objective 1. We submitted a grant proposal through the USDA APHIS PPA 7721: Plant Pest and Disease Management and Disaster Prevention Program to detect and classify additional non-crop and crop host species (i.e., Maples; Grapes) with street-level imagery.

Objective 2: Create forecasts of pest spread affecting stakeholder-identified crops. Nothing to report.

Objective 3: Quantify change in accuracy and precision of forecasts with updated host maps. Nothing to report.

Publications

Progress 01/15/23 to 01/14/24

Outputs
Target Audience: Held monthly meetings with stakeholders at USDA APHIS and state departments of agriculture to: 1) inquire about how they would use forecasts and host maps in their workflow for managing pest outbreaks; and 2) identify growers and producers to include in future working group conversations. Provided science-based knowledge opportunities for 3 North Carolina State University students (3 undergraduate students and 1 graduate student) and 1 North Carolina School of Science & Mathematics student. Students gained practical experience (e.g. data collection, analysis, scientific writing, programming, GIS) and knowledge on topics related to computer vision, pest and plant pathogen modeling, and remote sensing.

Changes/Problems: Nothing Reported

What opportunities for training and professional development has the project provided? One-on-one mentorship or individual study with students (3 North Carolina State University students and 1 North Carolina School of Science & Mathematics high school student) resulting in greater proficiency in programming, geospatial data analytics, spatiotemporal modeling, ecological forecasting, scientific writing, presentation, and publication, crop pest and pathogen ecology, and data interoperability.

How have the results been disseminated to communities of interest? Nothing Reported

What do you plan to do during the next reporting period to accomplish the goals?

Major Goal 1 & Objective 1: We will use an iterative approach of labeling, training, and assessing our existing YOLO object detection model until it achieves high accuracy and precision. We will use this model to recognize other crop host species, such as Black Walnut, and will again use the same iterative approach for consistent and reliable detection. We will use a pretrained CNN and existing camera metadata associated with labeled street view images to pinpoint the geolocation of host trees in street view images. We will expand the spatial extent and reduce the latency of existing host observations (i.e., derived from global and regional databases such as GBIF and iNaturalist) by obtaining street view images outside existing host observation areas. Then, we will use our object detection and geolocation models to identify and map the location of host trees within these undersampled locations. We will present the performance of our Tree-of-Heaven object detection model at the IALE 2024 conference.

Major Goal 2 & Objective 1: Continue holding monthly meetings with USDA APHIS and state departments of agriculture to discuss how they would use forecasts and host maps in their workflow for managing pest outbreaks, and to identify growers and producers to include in future working group meetings.

Major Goal 2 & Objective 3: Build a species distribution model of each host species (Tree-of-Heaven, Black Walnut) with street view host species observations and existing host species data (the enhanced host map) vs. existing host species data only. Calibrate and evaluate the forecast skill of a Spotted Lanternfly pest spread forecast with and without enhanced host maps. We will disseminate findings at the 2024 Ecological Forecasting Initiative Meeting.

Impacts
What was accomplished under these goals? Major Goal 1: Connect geotagged images with remote sensing technologies to create inventories of specialty crop hosts at scale. Objective 1:Develop an object detection algorithm to identify host species from open imagery data to expand spatial extent and reduce latency in species observations. Activities completed:We developed an object detection model that identifies the host species Tree of Heaven (Ailanthus altissima) from RGB crowdsourced geotagged images (e.g., iNaturalist) and street-view images. We selected the You-Only-Look-Once (YOLO) object detection model, a type of convolutional neural network (CNN) that has shown to be both accurate and efficient when applied across large, diverse image datasets. We applied a modeling technique called transfer learning, which borrows information from previously-trained machine learning models, to further enhance detection of the host species. Data collected: Data collection efforts continued to focus on curating geotagged images of host species from diverse sources, including from Google Street View and iNaturalist. In addition, we accessed a newly-available research-grade dataset, termed Auto Arborist. The Auto Arborist dataset is a comprehensive dataset specifically designed for large-scale urban tree monitoring. It contains over two million paired aerial and street-level images of trees from 300 genus-level categories across 23 cities in the US and Canada. This dataset will be instrumental to further extend the capabilities of the object detection models for a variety of crop and non-crop host species (e.g., Black Walnut, Juglans nigra; Mango, Mangifera indica, Tree-of-Heaven, Ailanthus altissima) across diverse environments. Summary statistics:For Tree-of-Heaven, we created 8,694 labels from iNaturalist images, 3,451 labels from Google street view images, and 3,723 labels from Auto Arborist street view images. For Black walnut, we created 877 labels from iNaturalist images. 
For Mango, we created 848 labels from iNaturalist images. For a non-host lookalike, Staghorn Sumac (Rhus typhina), we created 479 labels from iNaturalist images. Results: Object detection models are typically evaluated with two primary metrics: precision (the proportion of correct detections relative to incorrect detections; TP/TP+FP) and recall (the proportion of correct detections relative to failed detections; TP/TP+FN). Our Tree-of-Heaven object detection model exhibited a moderate precision (76.3%) and recall (75.0%) values when evaluated at a confidence threshold of 0.25. Objective 2: Expand satellite-based image classification algorithm to scale host mapping. Data collected:The Auto Arborist dataset contains paired aerial and street-view images of trees from 300 genus-level categories. From this dataset, we obtained labels for the following target crop and non-crop hosts: 40,910 aerial images are labeled for Cherries (Prunus sp.) 4,886 aerial images are labeled for Walnut (Juglans sp.) 1,003 aerial images are labeled for Avocado (Persea sp.) 3,656 aerial images are labeled for Tree of Heaven 19,724 aerial images are labeled for Apples (Malussp.) Objective 3: Assess accuracy of image classification through the collection of ground truth information from stakeholder engagement. 
Nothing to report.
Major Goal 2: Develop and compare iterative, near-term forecasts of pest spread affecting stakeholder-identified fruit and nut trees with and without enhanced host maps.
Objective 1: Engaging with stakeholders to fully understand their needs and what information is required to make informed decisions facilitates the development of systems that better meet those needs and provide management-relevant information.
Activities completed: Held monthly meetings with USDA APHIS and state departments of agriculture to discuss how they would use forecasts and host maps in their workflows for managing pest outbreaks, and identified growers and producers to include in future working group meetings.
Objective 2: Create forecasts of pest spread affecting stakeholder-identified crops. Nothing to report.
Objective 3: Quantify change in accuracy and precision of forecasts with updated host maps. Nothing to report.
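The precision and recall figures reported for the Tree-of-Heaven model under Major Goal 1, Objective 1 depend on the confidence threshold used to keep or discard detections. The following is a minimal Python sketch of that calculation, not the project's evaluation code. The `Detection` class, the `precision_recall` function, and the toy data are hypothetical, and the sketch assumes each correct detection matches a distinct ground-truth object.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    confidence: float  # model confidence score in [0, 1]
    is_correct: bool   # True if the box matches a ground-truth object


def precision_recall(detections, n_ground_truth, threshold=0.25):
    """Return (precision, recall) for detections kept at `threshold`."""
    kept = [d for d in detections if d.confidence >= threshold]
    tp = sum(d.is_correct for d in kept)  # correct detections
    fp = len(kept) - tp                   # spurious detections
    fn = n_ground_truth - tp              # ground-truth objects missed
    precision = tp / (tp + fp) if kept else 0.0
    recall = tp / (tp + fn) if n_ground_truth else 0.0
    return precision, recall


# Hypothetical toy data, not results from the project.
toy = [
    Detection(0.90, True),
    Detection(0.60, True),
    Detection(0.40, False),
    Detection(0.30, True),
    Detection(0.10, False),  # dropped: below the 0.25 threshold
]
p, r = precision_recall(toy, n_ground_truth=4)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.75
```

Raising the threshold generally trades recall for precision, which is why the two values are reported together with the threshold at which they were measured.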

Publications

Progress 01/15/22 to 01/14/23

Outputs
Target Audience: Held monthly meetings with stakeholders at USDA APHIS and state departments of agriculture to: 1) ask how they would use forecasts and host maps in their workflows for managing pest outbreaks; and 2) identify growers and producers to include in future working group conversations. Conducted a semi-structured workshop with USDA APHIS and state departments of agriculture (CT, MA, RI) to present and gain feedback on the effectiveness of pest and pathogen spread forecasts for developing management strategies and for facilitating collaborative decision-making among stakeholders and policy makers. Provided science-based knowledge opportunities for 4 North Carolina State University students (3 undergraduate students and 1 graduate student) and 1 North Carolina School of Science & Mathematics student. Students gained practical experience (e.g., data collection, analysis, scientific writing, programming, GIS) and knowledge of topics related to computer vision, pest and plant pathogen modeling, and remote sensing.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? One-on-one mentorship or individual study with students (4 North Carolina State University students and 1 North Carolina School of Science & Mathematics high school student), resulting in greater proficiency in programming; geospatial data analytics; spatiotemporal modeling; ecological forecasting; scientific writing, presentation, and publication; crop pest and pathogen ecology; and data interoperability.
How have the results been disseminated to communities of interest? Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals?
Major Goal 1: Connect geotagged images with remote sensing technologies to create inventories of specialty crop hosts at scale.
Objective 1: Develop an object detection algorithm to identify host species from open imagery data to expand spatial extent and reduce latency in species observations.
Activities completed: We used a transfer learning approach to develop an object detection algorithm to classify host species and non-host lookalike species from RGB geotagged images. The model we build upon is a pretrained convolutional neural network (CNN) with a ResNet50 backbone that has been trained on the ImageNet and iNat2021 benchmarking datasets. This model can perform fine-grained classification of more than 10,000 species, including the aforementioned crop host and non-crop host species. However, the number of training labels per species is small (~100-300 labels per species). Therefore, we created an additional labeling dataset to further fine-tune the pretrained model's ability to identify our host species and non-host lookalike species from RGB images. The labeling dataset we created includes 2-4x more labels of our focal species than the iNat2021 benchmarking dataset.
Data collected: The labeling dataset consists of geotagged imagery acquired from iNaturalist and Google Street View with researcher-annotated labels indicating the bounds of crop host species (Black Walnut, Juglans nigra; Mango, Mangifera indica), non-crop lookalike host species (Tree-of-Heaven, Ailanthus altissima), and non-host lookalike species (Staghorn Sumac, Rhus typhina) within these images.
Summary statistics/Results: 2,497 RGB images (iNaturalist and Google Street View) were labeled for our focal hosts to further fine-tune the object detection algorithm. The number of additional labels included in our dataset is: Black Walnut=877; Mango=848; Tree-of-Heaven=759; Staghorn Sumac=479.
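As a quick arithmetic check on the summary statistics for Objective 1, the short Python sketch below tallies the per-species label counts and compares the total with the number of labeled images. The counts come from this report. The variable names are illustrative, and reading the difference between labels and images as some images carrying more than one label is an assumption.

```python
# Label counts per species as reported for this period.
labels = {
    "Black Walnut": 877,
    "Mango": 848,
    "Tree-of-Heaven": 759,
    "Staghorn Sumac": 479,
}
n_images = 2497  # labeled RGB images reported for this period

total_labels = sum(labels.values())
labels_per_image = total_labels / n_images

# Share of all labels contributed by each species.
shares = {species: n / total_labels for species, n in labels.items()}

print(f"total labels: {total_labels}")  # total labels: 2963
print(f"labels per image: {labels_per_image:.2f}")  # labels per image: 1.19
for species, share in shares.items():
    print(f"{species}: {share:.1%}")
```

The total of 2,963 labels exceeds the 2,497 labeled images, which would be consistent with some images containing more than one labeled tree.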
Objective 2: Expand satellite-based image classification algorithm to scale host mapping. Nothing to report.
Objective 3: Assess accuracy of image classification through the collection of ground truth information from stakeholder engagement. Nothing to report.
Major Goal 2: Develop and compare iterative, near-term forecasts of pest spread affecting stakeholder-identified fruit and nut trees with and without enhanced host maps.
Objective 1: Engaging with stakeholders to fully understand their needs and what information is required to make informed decisions facilitates the development of systems that better meet those needs and provide management-relevant information.
Activities completed: Held monthly meetings with USDA APHIS and state departments of agriculture to discuss how they would use forecasts and host maps in their workflows for managing pest outbreaks, and identified growers and producers to include in future working group meetings.
Objective 2: Create forecasts of pest spread affecting stakeholder-identified crops. Nothing to report.
Objective 3: Quantify change in accuracy and precision of forecasts with updated host maps. Nothing to report.

Publications