Source: UNIVERSITY OF VIRGINIA submitted to NRP
FACT: THREE-STATE DATA SCIENCE FOR THE PUBLIC GOOD COORDINATED INNOVATION NETWORK
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
1019687
Grant No.
2019-68017-29934
Cumulative Award Amt.
$999,975.00
Proposal No.
2018-09185
Multistate No.
(N/A)
Project Start Date
Apr 1, 2019
Project End Date
Mar 31, 2022
Grant Year
2019
Program Code
[A1541]- Food and Agriculture Cyberinformatics and Tools
Recipient Organization
UNIVERSITY OF VIRGINIA
(N/A)
CHARLOTTESVILLE, VA 22901
Performing Department
Biocomplexity Institute
Non Technical Summary
This work promotes the creation of a data-savvy and community-aware workforce and serves to bridge the gap in the application of data science to public good problems in rural America through the marriage of data science with agricultural, economic, social and behavioral sciences.
We envision a 10-week DSPG summer program for undergraduate and graduate students, faculty, and Extension professionals engaged or interested in community development, food, health, nutrition, youth development, forestry, agriculture, and natural resource management in rural communities. The students and faculty will form community-focused research teams and execute a collection of collaboratively constructed data science projects in partnership with local rural community stakeholders.
We will elicit community-based research problems, at the heart of our proposed data science experiential learning programs, by engaging the expertise and community knowledge of Extension professionals in our respective states. Extension educators, staff, and faculty will be eligible to act as project advisors who coordinate with community stakeholders to identify issues or concerns, note relevant data sources, and help guide the work of the research teams. During the course of the program, we will immerse our students and faculty in data science, including a set of workshops and training on statistical computing and visualization tools such as R, Python, and GIS; accessing and using local, state, and federal data resources such as Census products and open data portals; and learning about policy, ethics, and entrepreneurship.
Extension professionals and community stakeholders will have the opportunity to receive training on data science literacy, to access available tools and resources, and to understand the promise of data science to address rural problems.
The interdisciplinary research DSPG teams in each state will:
  • vertically integrate undergraduate and graduate students, post-docs, research faculty and staff, Extension professionals, and community leadership, and
  • horizontally integrate expertise across diverse fields of study related to rural communities, and research areas of interest to USDA.
The focus of the proposed DSPG programs is at the interface of data science and addressing the needs of rural communities. This includes leveraging the expertise of our Cooperative Extension Systems combined with a clear understanding of the diverse challenges that rural and smaller communities face, in order to promote economic prosperity, health, education, social justice, natural resource protection, and related topics. As observed in successful pilots at Virginia Tech, the research projects undertaken at each DSPG site will likely cover a wide range of topics determined by our stakeholders, and the research teams will apply similar theory, data science methods, and general approaches to providing data-driven insights to the issues. The programs will be tailored to the unique structures of Cooperative Extension within each state.
Animal Health Component
60%
Research Effort Categories
Basic
10%
Applied
60%
Developmental
30%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
608                    6099                              3030                      100%
Goals / Objectives
Our nation has entered a new era where communities of all types have access to data in greater quantity, detail, and variety than ever before to assess conditions, develop strategies and policies, and evaluate program impact. Rural communities, towns, cities, counties, and states face the challenge of integrating these data into their decision making and developing a workforce that can support their need to optimally utilize these data in their planning and operations. In rural communities this would include organizations such as local governments, farmer cooperatives, small businesses, chambers of commerce, school systems, and non-profits. In many rural communities these organizations are tightly woven into the fabric of prosperity and quality of life.
Local organizations of all sorts need individuals trained in analytics and familiar with the social and community applications of those analytics to assist with topic areas such as agriculture, food systems and nutrition, policing/justice, fire and emergency management, social welfare administration, health, education, transportation, and commerce. This will lead to advances in the applications of "all data" across problems in nutrition, food and agriculture, forestry and natural resources, physical, social and emotional health, positive youth development, and developing vital, resilient communities. We have adopted the term "all data revolution," because data sciences focus on data of all sizes, not just big data (Lazer et al. 2014).
To meet this challenge, we are implementing a bold and creative approach for facilitating the integration of data science into all aspects of rural development as outlined above, through collaboration with Cooperative Extension programming. To do this, we are creating a Coordinated Innovation Network that engages the Extension, research, and education missions of the Land-Grant Universities of Iowa, Oregon, and Virginia.
These universities will adapt the successful Virginia Tech Data Science for the Public Good (DSPG) program, with coordination by the University of Virginia's Social Decision and Analytics Division. (The researchers from Virginia Tech who developed and executed the successful DSPG program have recently moved to the University of Virginia, Biocomplexity Institute & Initiative.) The adaptation of the experiential learning DSPG program will bring a tight coupling between our Cooperative Extension System programs and our data science research and education activities to extend the benefits of data science capabilities into rural communities to enhance the "public good." At the same time, DSPG will provide outstanding public service experiential learning opportunities for students in rural contexts.
At the heart of this research is testing the implementation of the DSPG program in three states, each with different sets of Cooperative Extension professionals, with the goal of preparing a Blueprint for other states to mount similar programs, thus spreading data science across rural America in support of rural prosperity. The DSPG Blueprint will provide best practices and know-how for other public and land-grant institutions across the country to replicate our DSPG models. Through working together, we will identify and understand the opportunities and challenges of developing and deploying DSPG programs within a variety of Cooperative Extension structures, with unique compositions of Extension professionals, across diverse states, and within distinct institutional hierarchies. The network will have a specific focus on making data science accessible to rural communities via Extension.
Project Methods
The project has three phases: 1) Planning, Training, and Direct Engagement with Extension; 2) Implementing the DSPG Programs in Iowa, Oregon, and Virginia; and 3) Evaluation.
i. First Phase - Planning, Training, and Direct Engagement with Extension
i. a. Data science education program for Extension professionals
The first phase of our project will involve direct engagement and collaboration with Extension professionals to increase their awareness and familiarity with data science and its ability to benefit rural communities, and to communicate information about data science to their stakeholders. At every step in the design and delivery of this data science education and research program, we will consult with our Extension collaborators, including the members of our leadership team, Advisory Group, and program teams. We will provide Extension professionals with sufficient familiarity about data science such that they can be effective communicators with their stakeholders on the topic. An additional goal of this phase of the program will be to begin the process of identifying possible DSPG projects.
The components of this phase include:
Content development. Information will include: what is data science; the kinds of questions data science can answer; the different elements of data science (data types, analysis types, security and privacy, etc.); and a review of application areas (e.g., health and nutrition, food and agriculture, forestry, youth development, community vitality and resource development, transportation). We will also review existing online resources and include summaries and links for those that our Extension collaborators report as useful.
Delivery methods. We envision a mixture of delivery formats, modeled on those already in use by Extension professionals.
These include: in-person presentations, webinars, on-line resources and data visualizations, electronic material on USB sticks (internet access is not guaranteed in rural communities), and printed materials.
Delivery and dissemination. Where possible, in-person presentations to off-campus sites will be coordinated across our states and with existing regional meetings of Extension professionals. The electronic training materials from our institutions will be made available and disseminated on our DSPG integrated web resource.
i. b. Development of data science education programming for DSPG teams
In addition to the outreach to Cooperative Extension in the first phase of the project, our institutions will also work together to develop the data science training materials that will form the basis of the DSPG summer program formal training. The training of the undergraduate and graduate students will include both basic training topics and customized training focused on the needs of specific projects. Basic training will include Linux and the command line, R and RStudio for analysis and visualization, use of each institution's computing infrastructure, accessing data repositories, data topics (integrity, open data, security, privacy), application topics, ethics, reproducibility, project management, and entrepreneurship. Advanced topics for specific projects could include programming in Python, MySQL, GIS data, web-scraping, machine learning, advanced statistical analysis, and courses needed for specific projects.
The deployment of the training will be a combination of on-site and virtual training, taking advantage of the strengths each of our institutions brings to our network. All training materials will be made available on the overall project's integrated DSPG web resource.
ii. Second Phase - Implementing the DSPG Programs at ISU, OSU, and VT/VSU
The second phase of our project involves the actual deployment of the DSPG summer programs.
In addition to a series of training sessions on topics described above, the students and faculty will create research teams to address problems our Extension partners have identified through conversations with their stakeholders. The data science framework we established in 2017 (Keller, S., Lancaster, V., Shipp, S. (2017) Building capacity for data-driven governance: Creating a new foundation for democracy. Statistics and Public Policy, pp. 1-11) will be used to guide the DSPG projects. Our data science approach emphasizes an important role for data discovery and the role of local, state, and federal data. The framework also highlights some of the DSPG program training and the general research approach to soliciting problems and executing data science team research.
The data science framework starts with the research question (problem identification) and continues through the following steps: data discovery, inventory, screening, and acquisition; data quality assessment (data profiling); data preparation; data linkage; data exploration; assessment of the fitness-for-use; and statistical modeling and data analyses. The processes represented in the framework are critical for creating defensible and repeatable analyses. The data science framework is discussed in more detail below.
Data Discovery identifies data that naturally exist. These data sources may have been collected for reasons other than the problem at hand and require repurposing. Once discovered, the data are inventoried and then screened to determine which are useful to acquire for the intended research questions and issues.
Data Ingestion and Governance involves the quality assessment of the data through data profiling to evaluate representativeness, timeliness, accuracy, consistency, completeness, reliability, and relevance. The ingestion process needs to capture all metadata and provenance to inform the data preparation steps. The data may come from many sources.
The governance (e.g., access and privacy) that surrounds the data needs to be captured and adhered to during the data linkage steps. Data exploration examines the data to understand the spatial and temporal biases and coverage.
Fitness-for-Use Assessment and Statistical Modeling and Data Analyses are tightly coupled. Given a particular analysis, fitness-for-use of the associated data is a characterization of the information content in the data that can support the particular analysis.
In the second year, the three university teams will coordinate with their Extension colleagues to finalize the projects to be analyzed and will focus on the summer student training and research experiences. The teams will be organized and projects will be assigned. For 10 weeks, the DSPG teams will: (1) provide students an immersion into the analytics tools and capabilities needed for their analysis through the training programs described above; and (2) complete the team-based analyses of the specific community problems identified through discussions with our Extension partners and their stakeholders.
iii. Evaluation
The evaluation of the project will be based on the project's logic model, which identifies objectives, inputs, activities, outputs, and outcomes. We will employ an implementation science evaluation framework called RE-AIM, for Reach, Efficacy, Adoption, Implementation, and Maintenance (Glasgow, R. E., Vogt, T. M., & Boles, S. M., 1999. Evaluating the public health impact of health promotion interventions: the RE-AIM framework. American Journal of Public Health, 89(9), 1322-1327), to evaluate individual and institutional level program impacts, taking careful consideration of how differences in the Extension structure may have affected DSPG implementation.
UVA will oversee and coordinate the evaluation in collaboration with the university partners and with oversight of the overall project Advisory Group.
The evaluation stages for each outcome will include information gathering, review of that input by the Advisory Group, and both formative and summative evaluations. A final report will be created to provide a resource for how other states might learn from our programs and adapt the DSPG program to best fit their needs.
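As an illustration of the data profiling step in the data science framework described above, the following is a minimal sketch of a completeness check over tabular records. It is not the project's actual tooling: the function name `profile`, the set of `MISSING` markers, and the sample records (placeholder counties "A" to "C" with made-up values) are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical markers treated as "missing"; a real project would agree on its own list.
MISSING = {None, "", "NA", "N/A"}


def profile(records):
    """Return, per column, the share of non-missing values and simple value counts."""
    columns = sorted({key for row in records for key in row})
    total = len(records)
    report = {}
    for col in columns:
        values = [row.get(col) for row in records]
        present = [v for v in values if v not in MISSING]
        report[col] = {
            # Completeness: fraction of records with a usable value in this column.
            "completeness": len(present) / total if total else 0.0,
            # Number of distinct non-missing values.
            "distinct": len(set(present)),
            # Most frequent non-missing value, or None if the column is empty.
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report


# Made-up sample records, for illustration only.
records = [
    {"county": "A", "year": 2019, "median_income": 41000},
    {"county": "B", "year": 2019, "median_income": None},
    {"county": "C", "year": "", "median_income": 38500},
    {"county": "A", "year": 2020, "median_income": "NA"},
]

report = profile(records)
for column, stats in report.items():
    print(column, stats)
```

A report of this kind covers only the completeness dimension; the other dimensions named in the framework (representativeness, timeliness, accuracy, consistency, reliability, and relevance) would need checks specific to each data source.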

Progress 04/01/19 to 03/31/22

Outputs
Target Audience: There are four target audiences:
  • Local community stakeholders. We are working with local government and community stakeholders in Iowa, Oregon, and Virginia to address their questions using the Community Learning Data Driven Discovery (CLD3) process.
  • Cooperative Extension System (CES). CES professionals are our partners in working with communities and their local governments.
  • Data Science for the Public Good Interns and Fellows. Graduate fellows and undergraduate interns participating in the Data Science for the Public Good Young Scholars Programs this past summer received immersive data science training. They became part of the project teams that included academic mentors, CES professionals, and local community stakeholders.
  • Researchers from Public and Land-grant Universities. This project brings together researchers from Public and Land-Grant Universities from Virginia (University of Virginia, Virginia Tech, and Virginia State University), Iowa (Iowa State University), and Oregon (Oregon State University), each with its own approach to data science, community-based research, and CES management to work together with CES professionals and DSPG young scholars.
Changes/Problems: An important facet underlying this project was the effects of the COVID-19 pandemic that took hold in March 2020, producing unforeseen adjustments in the project activities. Universities had moved to virtual instruction and local communities were in crisis, increasing both the need for and the challenges of participating in the CLD3 process. The Three-State Coordination Network pivoted quickly to move the program to virtual engagements. This included CES and project team training for faculty and students conducted online within each state.
What opportunities for training and professional development has the project provided?
Two sets of training were developed: one for Cooperative Extension System (CES) professionals and a second for Data Science for the Public Good (DSPG) Young Scholars graduate fellows and undergraduate interns.
1. The CES training included a Community Catalyst Training Series. Each live (virtual) session drew 75 to 135 people from across the United States and territories. The sessions were captured and are available on a CES resource (https://datascienceforthepublicgood.org/economic-mobility/training). Additionally, self-paced training modules were created for each topic and became available in the Fall of 2021. The segments include:
  • The Process: Community Learning through Data Driven Discovery
  • The Community: Community Capitals Economic Mobility Framework
  • The Team: Team Science and Working Virtually as a Team
  • The Discovery: Data Discovery Process
  • The Culture: Building a Data Ready Culture
  • The Action Plan: Finding Insights in Data
2. DSPG young scholars training includes the following modules that were posted on our website by May 2021: Intro to Git Data Discovery Naming and Coding Standards Literature Reviews Intro to R Three Ways of R Data Visualization with ggplot2 Research Ethics Web Scraping Exploratory Data Analysis Census and ACS Data Profiling for Repurposing Data Regular Expressions Text Mining Data Restructuring Data Cleaning Probabilities Random Variables Distributions Estimation and CIs Linear Models Data Linkage Scale Creation Python in R QGIS Spatial in R Shiny ML Classification ML Regression Science Media Writing Constructing Visualizations Overleaf
How have the results been disseminated to communities of interest?
The results are disseminated through presentations, webinars, virtual training sessions, and news media. These materials are available through our websites summarized below.
For all communities of interest, the list of publications given earlier includes both methodological research associated with doing data science and the outcomes from the community-based research projects.
For community stakeholders and CES professionals: A Community Insights website (https://datascienceforthepublicgood.org/economic-mobility/community-insights) was created that captures the results from the CLD3 DSPG project teams and an Economic Mobility Data Infrastructure that emerged from the collection of activities.
For CES professionals: National Action Dialogue - We engaged the coordination innovation network with the national organization, eXtension, resulting in an invitation to participate in their National Action Dialogue - Extension Futures event (https://impact.extension.org/2020/08/national-action-dialogue-extension-futures-summary-report/), which is documented in their report. The "National Community Learning Network" is one of the five topics discussed through a series of breakout sessions with CES professionals from across the country.
For all communities of interest, with a special focus on DSPG interns and fellows, two symposiums were held to showcase the coordination innovation network and the CLD3 project teams.
  • Held national symposium on August 7, 2020 with keynote speaker Ken Prewitt, Carnegie Professor of Public Affairs, Special Advisor to the President, and former US Census Bureau Director. Virginia, Iowa, and Oregon DSPG teams presented their research through online breakout rooms.
  • Oregon held a regional symposium on August 21, 2020 with keynote speaker Charisse Madlock-Brown, Professor of Health Informatics and Information Management at the University of Tennessee Health Science Center. The Oregon DSPG projects and selected projects from Iowa and Virginia were presented.
  • Held national symposium on August 6, 2021 with keynote speaker Jeri Mulrow, Vice President and Director of Statistics and Evaluation Sciences at Westat.
The coordination innovation network also received media attention:
Virginia Media:
  • 1/21/20 Dusseau, G. Biocomplexity Institute Awarded $1 Million Grant To Expand Data-Driven Governance And Advance Economic Mobility In Rural Communities. https://biocomplexity.virginia.edu/news/biocomplexity-institute-awarded-1-million-grant-expand-data-driven-governance-and-advance
  • 6/24/20 National Community Learning Network video, https://www.youtube.com/watch?v=uKDdgY1E42s
  • 10/5/20 Dusseau, G. Economic Mobility Project Team Designs New National Community Learning Network. https://biocomplexity.virginia.edu/news/economic-mobility-project-team-designs-new-national-community-learning-network
Oregon Media:
  • 7/30/20 Branam, Chris. OSU data science initiative addresses issues in Oregon's rural communities, Corvallis, Oregon, https://today.oregonstate.edu/news/osu-data-science-initiative-addresses-issues-oregon's-rural-communities
  • 8/25/20 Carlson, Brad. OSU Researchers build app to monitor irrigation water quality. Capital Press. https://www.capitalpress.com/ag_sectors/orchards_nuts_vines/researchers-build-app-to-monitor-irrigation-water-quality/article_6e5ad122-e6ed-11ea-8e1c-afd293dfea4c.html
Iowa Media:
  • 1/10/20, Shawn Dorius is part of an ISU research team that received a $1 million grant, Department of Sociology: Publications and Accolades, https://soc.iastate.edu/2020/01/10/shawn-dorius-is-part-of-an-isu-research-team-that-received-a-1-million-grant/
  • 1/21/20, Marshalltown Realizes Benefits of Data Science for Public Good Project, Sandra Oberbroeckling, Christopher Seeger, CES news, https://www.extension.iastate.edu/news/marshalltown-realizes-benefits-data-science-public-good-project
  • 2/17-18/20, Initiative takes advantage of existing data to solve community problems, News Service, https://www.news.iastate.edu/news/2020/02/17/bigdata; NewsWise, https://www.newswise.com/articles/initiative-takes-advantage-of-existing-data-to-solve-community-problems; Retweeted 4 times as of 2/18, https://twitter.com/IowaStateUNews/status/1229454409572028419; Tweet, Tippie Analytics (U of Iowa) Twitter, https://twitter.com/TippieAnalytics/status/1229505710926417920; College of Human Sciences Weekly News Email 'Happenings', https://request.hs.iastate.edu/news/email/?d1=02/18/2020&d2=02/18/2020&utm_source=newsletter-noHTML&utm_medium=email&utm_campaign=hshappenings
  • 2/18/20, Cass Dorius did interview with Trent Rice at WHO Radio about the program, WHO Radio
  • 3/2/20, News from the Vice President for Extension and Outreach, John Lawrence and Team, https://blogs.extension.iastate.edu/didyouknow/author/sternweis/
  • 3/5/20, Marshalltown, ISU partner to improve city transit system, Thomas Nelson, Times-Republican, https://www.timesrepublican.com/news/todays-news/2020/03/marshalltown-isu-partner-to-improve-city-transit-system/ and Announcement Email about presenters that attended research day
  • 7/29/20, DSPG Young Scholars Use Data to Solve Local Community Problems, Chris Seeger, Iowa State University Extension & Outreach News, https://www.extension.iastate.edu/news/dspg-young-scholars-use-data-solve-local-community-problems
  • 10/1/2020, DSPG Young Scholars Use Data to Solve Local Community Problems, Chris Seeger, Iowa State University Community Matters Newsletter, https://www.extension.iastate.edu/communities/files/newsletter/files/issue4_now_accessible.pdf https://www.extension.iastate.edu/communities/dspg-young-scholars-use-data-solve-local-community-problems
  • 11/1/20, Dorius, C. Iowa League of Cities: Cityscape, City focus: In Support of Data-Driven Governance Updates. November, https://mydigitalpublication.com/publication/?m=26842&i=679035&view=contentsBrowser&ver=html5
  • 2/8/21 ISU researchers use data to help communities discover and solve biggest problems, Iowa State University News Service, https://www.news.iastate.edu/news/2021/02/08/data-science
What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported

Impacts
What was accomplished under these goals?
Accomplishments this period:
  • Created the National Advisory Panel (Chaired by Dr. Cathie Woteki, University of Virginia, Biocomplexity Institute, Visiting Distinguished Institute Professor; University of Iowa, Professor, Food Science & Human Nutrition; undersecretary for United States Department of Agriculture's Research, Education, and Economics mission area, as well as the department's chief scientist, 2010-2017)
  • Developed CES engagement strategy
  • Introduced CES within each state to the Community Learning through Data Driven Discovery (CLD3) process through multiple presentations and webinars. See https://biocomplexity.virginia.edu/project/towards-national-community-learning-network-advance-economic-mobility
  • Presented CLD3 and CES engagement strategy to the Western extension system.
  • Developed evaluation plan and final evaluation report: Shipp, S., Thurston J., Lambur, M., and Dorius, C. (2021). Evaluation - Towards a Community Learning Network to Advance Economic Mobility. Proceedings of the Biocomplexity Institute, Technical Report. TR# 2021-021. University of Virginia. https://doi.org/10.18130/p4np-qc43
  • Developed the scope of CES training materials for CES engagement. See https://datascienceforthepublicgood.org/economic-mobility/training
  • Developed the DSPG Young Scholars' training and schedule
  • Launched DSPG websites at each university to advertise the 10-week summer program and provide student application materials
  • Recruited and selected DSPG Fellows and Interns for all five DSPG Young Scholars Programs
  • Solicited and implemented DSPG project proposals from across CES in the three states
  • Selected 20 projects in 2020: 8 in Virginia, 6 in Iowa, and 6 in Oregon.
  • Selected 11 projects in 2021: 7 in Virginia, 4 in Iowa.
  • Held national symposiums on August 7, 2020 and August 6, 2021
  • Oregon held a regional symposium on August 21, 2020.
The community-based research CLD3 projects across the network are below.
Virginia:
  • Stroke and COVID Population: A Health Equity Analysis
  • Fostering Data Reuse: Measuring the Usability of Publicly Accessible Research Data
  • Manpower Planning Using Skill Data
  • A Racial Equity Case Study of the Provision of Parks and Other Amenities in Arlington County?
  • Evaluating Residential Property Data Quality
  • Understanding the Impact of COVID-19 on the Delivery of Emergency Medical Services
  • Assessing factors of Economic Mobility through a Political Capital lens
  • Fairfax County Labor Markets: Characterizing Local Workforce and Employment Networks
  • Barriers to Health Care Access and Use in Patrick County, VA
  • Halifax County: Factors of Incarceration and Recidivism
  • Using PICES Data to Visualize District Level Multidimensional Poverty in Zimbabwe
  • Availability of Services: Evolving Demographics, Housing, and Traffic in Rappahannock County
  • Analyzing Vegetative Health using Landsat 8 Satellite Imagery
  • Service Provision for Vulnerable Transition Aged Youth in Loudoun County, Virginia
  • Tracking Indicators of the Economic and Social Mobility of the Black Community in Hampton Roads
  • Water Resource Management and Industry and Residential Growth in Floyd County
  • Exploring the Skill Content of Jobs in Appalachia
  • Measuring Economic and Social Infrastructure: Intergenerational Poverty in Page County
  • Measuring Regional Food Insecurity and the Role of a Loudoun County Food Hub
  • Industry and Workforce Attraction and Retention in Wythe County
  • Understanding Evictions in Richmond City: Policy, Community, and Individual Factors
  • Demonstrating the value of the Appomattox River Trail (ART)
  • Factors contributing to Health Care Inequities in Petersburg, VA
  • Equity in Access to Parks in Chesterfield, Virginia
  • Understanding Unemployment in the Prince George and Hopewell Region
Iowa:
  • Iowa's Integrated Data System for Decision-Making (Early Childhood Iowa)
  • Supporting Eat Greater Des Moines and Food Rescue in Central Iowa
  • DHR "Just the Facts" on Educational Attainment
  • DHR "Just the Facts" on Economic and Workforce Development
  • Quality of Life in Small and Shrinking Iowa Cities
  • Assessing the Impact of Publicly Accessible Research Data: What Can Repositories Tell Us About Data Reuse?
  • Develop a Community Capitals Data Infrastructure to Support Community Economic Resilience
  • Enlarge the ISU Extension Community Helpline Services
  • Greatest Need of Excessive Alcohol Prevention Efforts
  • Identify Communities Ready and Able to Support Substance Abuse Recovery Centers
  • Pilot 'Systems of Care' Data Infrastructure to Support State Prevention, Treatment and Safety Response Efforts
Oregon:
  • Impacts of Dam Water Release Policy on Deschutes River Health and Recreation
  • Regulatory Impacts on Economic Development in the Eastern Oregon Border Region
  • Forecasting Tools for Cost Analysis of Water and Wastewater Facilities in Oregon Small Towns and Cities
  • Water Quality Requirements for Fresh Produce Growers in Oregon
  • Wintertime Air Quality Health Impacts in Oakridge and Westfir, Oregon
  • Economic Mobility Baseline and Comparative Analysis for the South Wasco County School District Area, Oregon

Publications

  • Type: Other Status: Published Year Published: 2021 Citation: Keller, S. A., Shipp, S. S. (2021). Data Acumen in Action. AMS Notices, October. https://www.ams.org/journals/notices/202109/noti2353/noti2353.html?adat=October%202021&trk=2353&galt=feature&cat=feature&pdfissue=202109&pdffile=rnoti-p1468.pdf


Progress 04/01/20 to 03/31/21

Outputs
Target Audience: There are four target audiences:
  • Local community stakeholders. We are working with local government and community stakeholders in Iowa, Oregon, and Virginia to address their questions using the Community Learning Data Driven Discovery (CLD3) process.
  • Cooperative Extension System (CES). CES professionals are our partners in working with communities and their local governments.
  • Data Science for the Public Good Interns and Fellows. Graduate fellows and undergraduate interns participating in the Data Science for the Public Good Young Scholars Programs this past summer received immersive data science training and became part of the project teams that included academic mentors, CES professionals, and local community stakeholders.
  • Researchers from Public and Land-grant Universities. This project brings together researchers from Public and Land-Grant Universities from Virginia (University of Virginia, Virginia Tech, and Virginia State University), Iowa (Iowa State University), and Oregon (Oregon State University), each with its own approach to data science, community-based research, and CES management to work together with CES professionals and DSPG young scholars.
Changes/Problems: An important facet underlying this project was the effects of the COVID-19 pandemic that took hold in March 2020, producing unforeseen adjustments in the project activities. Universities had moved to virtual instruction and local communities were in crisis, increasing both the need for and the challenges of participating in the CLD3 process. The Three-State Coordination Network pivoted quickly to move the program to virtual engagements. This included CES and project team training for faculty and students being conducted online within each state.
What opportunities for training and professional development has the project provided? Two sets of training were developed: one for Cooperative Extension System (CES) professionals and a second for Data Science for the Public Good (DSPG) Young Scholars graduate fellows and undergraduate interns.
1. The CES training included a Community Catalyst Training Series. Each live (virtual) session drew 75 to 135 people from across the United States and territories. The sessions were recorded and are available on a CES resource (https://datascienceforthepublicgood.org/economic-mobility/training). The segments include: The Process: Community Learning through Data Driven Discovery; The Community: Community Capitals Economic Mobility Framework; The Team: Team Science and Working Virtually as a Team; The Discovery: Data Discovery Process; The Culture: Building a Data Ready Culture; The Action Plan: Finding Insights in Data.
2. DSPG young scholars training includes the following modules that will be posted on our website by May 2021: Intro to Git Data Discovery Naming and Coding Standards Literature Reviews Intro to R Three Ways of R Data Visualization with ggplot2 Research Ethics Web Scraping Exploratory Data Analysis Census and ACS Data Profiling for Repurposing Regular Expressions Text Mining Data Restructuring Data Cleaning Probabilities Random Variables Distributions Estimation and CIs Linear Models Data Linkage Scale Creation Python in R QGIS Spatial in R Shiny ML Classification ML Regression Science Media Writing Constructing Visualizations Overleaf
How have the results been disseminated to communities of interest? Through peer-reviewed publications, presentations, webinars, and virtual training sessions. These materials are available through our websites summarized below. For all communities of interest, the list of publications given earlier includes both methodological research associated with doing data science and the outcomes from the community-based research projects.
For community stakeholders and CES professionals: A Community Insights website (https://datascienceforthepublicgood.org/economic-mobility/community-insights) was created that captures the results from the CLD3 DSPG project teams and an Economic Mobility Data Infrastructure that emerged from the collection of activities.
For CES professionals: National Action Dialogue. We engaged the coordination innovation network with the national organization eXtension, resulting in an invitation to participate in their National Action Dialogue - Extension Futures event (https://impact.extension.org/2020/08/national-action-dialogue-extension-futures-summary-report/), which is documented in their report. The "National Community Learning Network" was one of the five topics discussed through a series of breakout sessions with CES professionals from across the country.
For all communities of interest, with a special focus on DSPG interns and fellows: Two symposiums were held to showcase the coordination innovation network and the CLD3 project teams. A national symposium was held on August 7, 2020 with keynote speaker Ken Prewitt, Carnegie Professor of Public Affairs, Special Advisor to the President, and former US Census Bureau Director. Virginia, Iowa, and Oregon DSPG teams presented their research through online breakout rooms. Oregon held a regional symposium on August 21, 2020 with keynote speaker Charisse Madlock-Brown, Professor of Health Informatics and Information Management at the University of Tennessee Health Science Center. The Oregon DSPG projects and selected projects from Iowa and Virginia were presented.
The coordination innovation network also received media attention:
Virginia Media:
1/21/20 Dusseau, G. Biocomplexity Institute Awarded $1 Million Grant To Expand Data-Driven Governance And Advance Economic Mobility In Rural Communities.
https://biocomplexity.virginia.edu/news/biocomplexity-institute-awarded-1-million-grant-expand-data-driven-governance-and-advance
6/24/20 National Community Learning Network video, https://www.youtube.com/watch?v=uKDdgY1E42s
10/5/20 Dusseau, G. Economic Mobility Project Team Designs New National Community Learning Network. https://biocomplexity.virginia.edu/news/economic-mobility-project-team-designs-new-national-community-learning-network
Oregon Media:
7/30/20 Branam, Chris. OSU data science initiative addresses issues in Oregon's rural communities, Corvallis, Oregon, https://today.oregonstate.edu/news/osu-data-science-initiative-addresses-issues-oregon's-rural-communities
8/25/20 Carlson, Brad. OSU Researchers build app to monitor irrigation water quality. Capital Press. https://www.capitalpress.com/ag_sectors/orchards_nuts_vines/researchers-build-app-to-monitor-irrigation-water-quality/article_6e5ad122-e6ed-11ea-8e1c-afd293dfea4c.html
Iowa Media:
1/10/20, Shawn Dorius is part of an ISU research team that received a $1 million grant, Department of Sociology: Publications and Accolades, https://soc.iastate.edu/2020/01/10/shawn-dorius-is-part-of-an-isu-research-team-that-received-a-1-million-grant/
1/21/20, Marshalltown Realizes Benefits of Data Science for Public Good Project, Sandra Oberbroeckling, Christopher Seeger, CES news, https://www.extension.iastate.edu/news/marshalltown-realizes-benefits-data-science-public-good-project
2/17-18/20, Initiative takes advantage of existing data to solve community problems, News Service, https://www.news.iastate.edu/news/2020/02/17/bigdata; NewsWise, https://www.newswise.com/articles/initiative-takes-advantage-of-existing-data-to-solve-community-problems; Retweeted 4 times as of 2/18, https://twitter.com/IowaStateUNews/status/1229454409572028419
Tweet, Tippie Analytics (U of Iowa) Twitter https://twitter.com/TippieAnalytics/status/1229505710926417920
College of Human Sciences Weekly News Email 'Happenings', https://request.hs.iastate.edu/news/email/?d1=02/18/2020&d2=02/18/2020&utm_source=newsletter-noHTML&utm_medium=email&utm_campaign=hshappenings
2/18/20, radio interview, Cass Dorius was interviewed by Trent Rice at WHO Radio about the program, WHO Radio
3/2/20, News from the Vice President for Extension and Outreach, John Lawrence and Team, Email, https://blogs.extension.iastate.edu/didyouknow/author/sternweis/
3/5/20, Marshalltown, ISU partner to improve city transit system, Thomas Nelson, Times-Republican, https://www.timesrepublican.com/news/todays-news/2020/03/marshalltown-isu-partner-to-improve-city-transit-system/ and Announcement Email about presenters that attended research day
7/29/20, DSPG Young Scholars Use Data to Solve Local Community Problems, Chris Seeger, Iowa State University Extension & Outreach News, https://www.extension.iastate.edu/news/dspg-young-scholars-use-data-solve-local-community-problems
10/1/2020, DSPG Young Scholars Use Data to Solve Local Community Problems, Chris Seeger, Iowa State University Community Matters Newsletter, https://www.extension.iastate.edu/communities/files/newsletter/files/issue4_now_accessible.pdf https://www.extension.iastate.edu/communities/dspg-young-scholars-use-data-solve-local-community-problems
11/1/20, Dorius, C. Iowa League of Cities: Cityscape, City focus: In Support of Data-Driven Governance Updates. November, https://mydigitalpublication.com/publication/?m=26842&i=679035&view=contentsBrowser&ver=html5
2/8/21 ISU researchers use data to help communities discover and solve biggest problems, Iowa State University News Service, https://www.news.iastate.edu/news/2021/02/08/data-science
What do you plan to do during the next reporting period to accomplish the goals?
Complete and disseminate the DSPG Young Scholars Program Blueprint for the implementation of similar Data Science for the Public Good Young Scholars Programs across the country. Complete the evaluation report for the three-state coordination innovation network's DSPG Young Scholars programs.

Impacts
What was accomplished under these goals? Accomplishments this period:
  • Created the National Advisory Panel (chaired by Dr. Cathie Woteki)
  • Developed the strategy for CES engagement
  • Introduced CES within each state to the Community Learning through Data Driven Discovery (CLD3) process through multiple presentations and webinars
  • Developed the evaluation plan
  • Developed the scope of the CES training materials for CES engagement
  • Developed the DSPG Young Scholars' training and schedule
  • Launched DSPG websites at each university to advertise the 10-week summer program and provide student application materials
  • Recruited and selected DSPG Fellows and Interns for all five DSPG Young Scholars Programs
  • Solicited DSPG project proposals from across CES in the three states, then selected and implemented 20 projects: 8 in Virginia, 6 in Iowa, and 6 in Oregon
  • Held a national symposium on August 7, 2020
  • Held a regional symposium in Oregon on August 21, 2020
The community-based research CLD3 projects across the network are described below.
Virginia:
  • Addressing Barriers to Health Care Access and Use in Patrick County, Virginia: Patrick County's Cooperative Extension agent and University of Virginia researchers are working with Healthy Patrick County, a stakeholder group, and the Virginia Department of Health's local Population Health Manager to identify areas to improve critical social determinants of health, including access to healthy food, technology and internet, and resources for their elderly population.
  • Understanding Incarceration and Recidivism in Halifax County, Virginia: Working together, the county administrator, regional VCE agents, and University of Virginia researchers are creating a social and economic profile of the county focused on factors that affect prisoner reentry and recidivism, e.g., unemployment, poverty, teenage pregnancy, and drug use, as well as job opportunities in the county.
  • Employment Opportunities for Incarcerated Residents in Recovery in Page County, Virginia: Page County has a significant population of incarcerated residents in recovery. The Page County Cooperative Extension agent and Virginia Tech researchers are creating data insights to identify areas in which to assist Page County's formerly incarcerated residents, including food insecurity, homelessness, low job skills, and substance abuse recovery.
  • Establishing a Food Hub in Loudoun County, Virginia: Loudoun County's Cooperative Extension agent and Virginia Tech researchers are examining food hubs in Charlottesville and other areas as well as creating data-driven profiles of Loudoun County's low-income populations to inform the feasibility of creating a food hub.
  • Attracting Industry to Wythe County, Virginia: The Virginia Cooperative Extension agent and Virginia Tech researchers are investigating factors that affect Wythe's ability to attract companies, as well as the benefits and costs of the Wythe industrial park.
  • Improving Financial Literacy to Reduce Evictions in Richmond City, Virginia: Richmond City's Cooperative Extension agent and Virginia Tech researchers are using data provided by the city combined with state and federal sources of data to improve their understanding of who the city is serving, including their spending habits and whether they have enough resources to manage.
  • Creating Metrics to Support Completion of Appomattox River Trail (ART): The regional VCE agents and Virginia State University researchers are working with FOLAR to create metrics to describe the value of this project to stakeholders and to support fundraising. Data sources and metrics will track changes in health outcomes, tourism and business growth, and property values.
  • Using a Collective Impact Model to Improve Health Outcomes in Greater Petersburg, Virginia: The regional Virginia Cooperative Extension agents and Virginia State University researchers are discovering and acquiring demographic data and other data sources (such as health outcomes, transportation infrastructure, education, employment) and mapping them at the neighborhood level.
Iowa:
  • Develop a Community Capitals Data Infrastructure to Support Community Economic Resilience: Iowa State University Extension faculty and researchers are working together to develop measures that support economic mobility based on the Community Capitals framework in four key areas (economic, human, social, and natural).
  • Enlarge the ISU Extension Community Helpline Services: Iowa State University Extension faculty and researchers are working with hotline managers to understand the system architecture, data structure and quality, and past uses of the data generated by the hotlines.
  • Identify Communities in Greatest Need of Excessive Alcohol Prevention Efforts: Iowa State University Extension specialists and researchers are developing maps and analytic tools to identify where additional prevention resources are needed to manage alcohol-related problems.
  • Identify Communities Ready and Able to Support Substance Use Recovery Centers: Iowa State University Extension faculty and researchers are identifying existing, unincorporated, substance use recovery infrastructure by type (e.g., housing, treatment centers, drinking driver education classes, gambling treatment, and substance use meetings).
  • Pilot 'Systems of Care' Data Infrastructure to Support State Prevention, Treatment and Safety Response Efforts: Extension faculty and researchers are developing interactive data tools and insights to improve awareness of these resources. This pilot will inform the board's ongoing work to develop a health information platform to monitor and promote improved health, abstinence from substances, improved quality of life, social connectedness, employment and education, and reduced criminal justice involvement.
Oregon:
  • Regulatory Challenges and Impact on Economic Development in the Eastern Oregon Border Region: Oregon State University Extension professionals and researchers are undertaking data discovery, profiling, and exploration to acquire relevant data sets. The goal is to identify where there is a mismatch in resources (skilled labor, childcare, and housing) and options to alleviate these mismatches.
  • Forecasting Tools for Cost Analysis of Water and Wastewater Facilities in Oregon Small Towns and Cities: Oregon State Extension and University researchers are identifying the key variables influencing facility capital and operating costs to develop forecasting tools for cost analysis of water and wastewater facilities in Oregon's small towns and cities.
  • Water Quality Requirements for Fresh Produce Growers in Oregon: Oregon Extension and University researchers are evaluating historical water quality data from the Tualatin River and the Treasure Valley area to put the current water quality criteria into context with the rule requirements.
  • Air Quality Impacts in Oakridge and Westfir, Oregon: Oregon State Extension and university researchers are using a variety of air quality and health data to statistically analyze and present results using data visualization and geospatial data analysis tools.
  • Creating an Economic Mobility Baseline for South Wasco County Area, Oregon: Working together, the regional Cooperative Extension professional, an NGO Coordinating Stakeholder from the South Wasco Alliance, and University of Virginia researchers are evaluating methods to construct economic mobility models and adjust the approach for application in a rural environment.

Publications

  • Type: Other Status: Published Year Published: 2021 Citation: Dorius, C., Dorius, S.F., and Seeger, C. (2020). In Support of Data Driven Governance. Cityscape, 75(5):14. https://www.bluetoad.com/publication/?m=26842&i=679035&p=14
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Busato, S., Bevans, S.J., Gates, D.A., and Heppell, S.A. (2021) Fishy Business in the Deschutes? Estimating the Effects of Dam Operations on River Ecology. MethodSpace. SAGE Publishing
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Chen, S., Haskins, C., Holt, M., Ramsey, F., Beverly, J., Glover, D., Tabassum, A., Wells, A. (2021). Industry and Workforce Attraction and Development in Wythe County. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Chen, S., Haskins, C., Holt, M., Ramsey, F., Beverly, J., Pranteddu, L., Tabassum, A., Grant, T. (2021). Intergenerational Poverty and Economic Mobility in Page County. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Chen, S., Haskins, C., Holt, M., Ramsey, F., Zhang, B., Pranteddu, L., Wells, A., Grant, T. (2021) Measuring Regional Food Insecurity and the Role of a Loudoun County Food Hub. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Chen, S., Haskins, C., Holt, M., Ramsey, F., Zhang, B., Glover, D., Pranteddu, L., Tabassum, A., Wells, A. (2021). Understanding Evictions in Richmond City: Policy, Community, and Individual Factors. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Dorius, C. J., Abraham, W., & Seeger, C. J. (2021) Develop a Community Capitals Data Infrastructure to Support Community Economic Mobility. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Dorius, S. F., Dorius, C. J., Hofmann, H., Rajabalizadeh, A., & Bustin, J. (Pending) A Computational Social Science Approach to Community-based Substance Use Recovery Resource Mapping. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Guenevere Patty, G., Conrad, A., Aga, P., Thessen, A. (2021). The Health Effects of Woodsmoke in Oakridge and Westfir, Oregon. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Hofmann, H., Jeppson, H., Dorius, S. F., Dorius, C. J. (2021). R Data Packages as Scaffolding System for Good Data Practices in the DSPG Program. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Ray, Susweta, McDonald, Sarah, Hart, Owen, Pietrowicz, Sean, Pristavec, Teja, Kramer, Brandon, Tobin, Joy (2021). Fairfax County Labor Markets: Characterizing Local Workforce and Employment Networks. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Roberts, F., Berry, R., Graham, E., Walker, M., Thurston, J., Oh, E., Mikytuck, A., & Ratcliff, N. (2021) Halifax County Incarceration and Recidivism. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Seeger, C., Jahic, I., Zhu, X., & Dorius S. F. (2021). Using Data to Identify Communities in Greatest Need of Alcohol Prevention Efforts. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Sukul, A., Maloney, A., & Dorius, S. F. (2021). Enlarge the ISU Extension Community Helpline Services. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Vasan, T., Ma, M.C., Robinson, C., & Reitz, S. (2021) Barriers to Economic Opportunity in the Eastern Oregon Border Region. MethodSpace. SAGE Publishing.
  • Type: Journal Articles Status: Published Year Published: 2021 Citation: Stockham, Morgan, Chowdhury, Tasfia, Gomez, Isabel, Pristavec, Teja, Keller, Sallie. (2021). Addressing Barriers to Health in Patrick County, Virginia. MethodSpace. SAGE Publishing.


Progress 04/01/19 to 03/31/20

Outputs
Target Audience:There are four target audiences: Local government sponsors. We are working with local government sponsors in Iowa, Oregon, and Virginia to address their questions using the Community Learning Data Driven Discovery (CLD3) process. Cooperative Extension System. The CES are our partners in working with communities. This project has a special focus on CES professionals in Iowa, Oregon, and Virginia. This partnership will lead to advances in the applications of big data across problems in health and nutrition, food and agriculture, youth development, and community resource development. Data Science for the Public Good Interns and Fellows. Graduate fellows and undergraduate interns will participate in Data Science for the Public Good Young Scholars Programs this summer. They will participate in some immersive data science training and become part of the project teams that include academic mentors, CES professionals, and local community stakeholders. Public and Land-grant Universities. This project brings together Public and Land-Grant Universities from Virginia (University of Virginia, Virginia Tech, and Virginia State University), Iowa (Iowa State University), and Oregon (Oregon State University), each with its own approach to data science, community-based research, and Extension management. Working together across these diverse entities will lead to the formulation of best practices and a Blueprint for the implementation of similar Data Science for the Public Good Young Scholars Programs across the country. Changes/Problems:The USDA AFRI award has allowed us to accelerate our vision to bring the Community Learning Through Data-Driven Discovery (CLD3) and the Data Science for the Public Good young scholars program to local communities, especially rural ones. We want to sustain this effort in Iowa, Oregon, and Virginia. We also want to deploy the CLD3 process and DSPG program to other states. 
We are seeking funding from other sources and are planning to submit a renewal proposal this summer through the NIFA/USDA Agriculture and Food Research Initiative (AFRI) Competitive Grants Program - Food and Agriculture Cyberinformatics and Tools (FACT). Right now, we are confronted with the unexpected COVID-19 crisis. To respond, we have had to pivot from intensive face-to-face activities to completely virtual programs. This includes our DSPG Young Scholars Programs and our CES training.
What opportunities for training and professional development has the project provided? Two sets of training are under development: (1) for Cooperative Extension System (CES) professionals and (2) for the Data Science for the Public Good Young Scholars program.
The CES training will focus on:
  • Defining the Community Learning through Data Driven Discovery (CLD3) process
  • Legal and ethical considerations of using data
  • Data Discovery process
  • Data science literacy
  • Sustaining the DSPG and CLD3 processes
The DSPG training plan for students will include:
  • Community Learning through Data Driven Discovery (CLD3) process
  • Ethics and responsible conduct of research
  • Doing data science and the data science framework
  • Open science and use of Git, GitHub, and Rmarkdown
  • Becoming proficient using tools to facilitate the research, e.g., literature searches, learning to code in R and GIS/QGIS
  • Communication and dissemination of project outcomes, including media training
  • Team science and engaging with internal and external researchers and subject matter experts
  • Special topics, e.g., social network analysis, machine learning, deep learning, classification, clustering, and visualization
How have the results been disseminated to communities of interest? Through peer-reviewed publications, presentations, webinars, media articles, and our websites.
What do you plan to do during the next reporting period to accomplish the goals? Major activities planned are:
  • Implement the DSPG program at UVA, ISU, OSU, and VT/VSU virtually.
  • Prepare for and host the DSPG Symposium virtually, to be held on August 6.
  • Conduct a debriefing and report writing session to outline the DSPG Blueprint.
  • Coordinate evaluation of the program at each site and overall.
  • Complete the final report, including challenges, best practices, evaluation of results, and a blueprint for deployment of similar programs in other states.
  • Participate in conference presentations and peer-reviewed journal publications.
  • Seek additional funding to sustain the momentum gained during this project.

Impacts
What was accomplished under these goals? University of Virginia (UVA) hosted the three-state coordination network project kickoff in September 2019, bringing together the teams from Iowa State University (ISU), Oregon State University (OSU), Virginia Tech (VT), and Virginia State University (VSU). At this meeting, the group identified project teams and leads, established timelines, and agreed to meet monthly to track progress. As a result of the COVID-19 situation, the DSPG programs and CES training have pivoted and will be implemented virtually. This is no small undertaking. Across the five universities there will be 57 students participating. The data science training as part of the DSPG Young Scholars programs will be synchronous and fully integrated across the coordination network. Project team interactions with CES professionals and local community stakeholders will also be completely virtual. During this period, the three-state coordination network accomplished the following:
  • Created the National Advisory Panel (UVA-led)
  • Developed the strategy for CES engagement (full coordination network, led by ISU and UVA)
  • Introduced CES within each state to the Community Learning through Data Driven Discovery (CLD3) process through multiple presentations and webinars (full coordination network)
  • Developed the evaluation plan (full coordination network, UVA-led)
  • Developed the scope of the CES training materials for CES engagement (curriculum led by UVA and training material dissemination led by OSU)
  • Developed the DSPG Young Scholars' training and schedule (full coordination network, UVA-led)
  • Launched DSPG websites at each university to advertise the 10-week summer program and provide student application materials (each state): UVA DSPG - https://biocomplexity.virginia.edu/social-decision-analytics/dspg-program; ISU DSPG - https://dspg.iastate.edu/; OSU DSPG - https://workspace.oregonstate.edu/data-science-for-the-public-good; VT/VSU DSPG - https://aaec.vt.edu/academics/undergraduate/beyond-classroom/dspg.html
  • Recruited DSPG Fellows and Interns for all five DSPG Young Scholars Programs (full coordination network)
  • Solicited DSPG project proposals from across CES in the three states and selected 20 projects: 8 in Virginia, 6 in Iowa, and 6 in Oregon (full coordination network)

Publications

  • Type: Journal Articles Status: Published Year Published: 2020 Citation: Keller, S. A., Shipp, S. S., Schroeder, A. D., & Korkmaz, G. (2020). Doing Data Science: A Framework and Case Study. Harvard Data Science Review, 2(1).