Recipient Organization
UNIVERSITY OF NEBRASKA
(N/A)
LINCOLN, NE 68583
Performing Department
(N/A)
Non Technical Summary
To create innovative, data-driven agricultural production systems, producers will need access to data, simulations, forecasts, and modeling to learn how they can transform their operations to enhance sustainability. The National Agricultural Producer Data Cooperative (NAPDC), in partnership with land grant universities, stakeholder organizations, and private-sector representatives, will continue the development of a national framework to foster agricultural innovation by addressing four critical challenges: (1) leverage and enhance ongoing projects that support stakeholders and collaborative partnerships to demonstrate the effectiveness and scalability of digital resources in capturing, storing, and accessing producer and public data; (2) develop a clear and complete sustainability plan for updates, expansion, and user support that delineates principles and examples of specific mechanisms for capacity integration that will support effective use of the data; (3) develop feasible, appropriate assurances of regulatory and legal compliance, including a data trust structure and FAIR principles; and (4) further next steps for framework growth and expansion through engagement and support of diverse participation in NAPDC activities, integration of feedback, and communication and dissemination of findings. Expected outcomes include further development of a data repository that leverages tools widely accepted in the industry to ensure access, privacy, and transparency while enabling on-farm research; digital tools and resources (e.g., APIs) that enable farmer and livestock producer access to and use of both private and public data resources; governance and sustainability plans that ensure long-term viability, data quality, and privacy assurances; and white papers, peer-reviewed publications, and other products that help define the future of a national data framework.
A diverse and experienced PI team, a comprehensive group of partner organizations, and a scientific advisory board of national experts will assist the NAPDC in meeting its objectives and ensuring that its activities coordinate with and complement existing programs in agricultural data services and stewardship.
Animal Health Component
30%
Research Effort Categories
Basic
20%
Applied
30%
Developmental
50%
Goals / Objectives
Consistent with the RFA, the overall objective of this project is to continue to develop a national data framework to foster agricultural innovation. The proposed activities will engage agricultural stakeholders across diverse systems and provide knowledge related to interoperability, security, privacy, sustainability, and governance that is required for such a framework to succeed.
1. Leverage and enhance ongoing projects that support stakeholders and collaborative partnerships to demonstrate the effectiveness and scalability of digital resources in capturing, storing, and accessing producer and public data [advanced cyberinfrastructure; data governance; cybersecurity and access control; prototypes and key use cases].
2. Develop a clear and complete sustainability plan for updates, expansion, and user support that delineates principles and examples of specific mechanisms for capacity integration that will support effective use of the data [data fusion and data system interoperability; inventories of existing data sources and platforms; sustainability and resiliency].
3. Develop feasible appropriate assurances of regulatory and legal compliance including a data trust structure and FAIR principles [data curation, representation, authentication, and stewardship; data governance; cybersecurity and access control].
4. Further next steps for framework growth and expansion through engagement and support of diverse participation in NAPDC activities, integration of feedback, and communication and dissemination of findings [education, training, and workforce development; demonstrating value and return on investment].
Project Methods
Objective 1: Leverage and enhance ongoing projects that support stakeholders and collaborative partnerships to demonstrate the effectiveness and scalability of digital resources in capturing, storing, and accessing producer and public data.
The Agricultural Data Coalition (ADC) has been working since 2014 to establish a farmer-controlled data cooperative entity to provide communication and sustainability for a data repository. The goal of this project is to continue enhancing the data repository to enable broader use within the university research and extension systems, including on-farm research results and integration with AgDataCommons. We will build upon the repository functionality by providing a means not only to store the various files in a common database and structure, while maintaining an unaltered archive of the original files for the farmer, but also to implement metadata standards on that common data structure. Data within this common model also require a shared meaning. The team will continue its collaboration with AgGateway to implement industry-standard semantic resources. While the ADC is a collaborative effort to create a neutral, independent, farmer-centric repository for farmers to store data (Objective 1), the Open Ag Data Alliance (OADA) is an open-source effort with the goal of enabling data security, privacy, and interoperability for the entire industry. The ADC team will determine whether the OADA AGAPECert and Mask & Link tools enable farmers to generate certifications about their data without needing to disclose the data themselves. Critical input and feedback regarding database functionality for cover and specialty crops will be provided by project participants representing land grant universities, nonprofits, and industry partners.
The full value of agricultural data is only realized when it can be integrated from multiple, context-rich sources in ways that adhere to the trust requirements of its owners.
We propose to enable the USDA ARS National Animal Germplasm Program (NAGP) to upload and query phenotypic/genotypic data with robust data quality checks, and to access both the Animal Genetic Resources Information Network (Animal-GRIN) and the Bovine Genome Database (BGD). The NAGP maintains a database information system that documents the collection and provides basic descriptors for all species. However, this database requires additional functionality and interoperability to handle larger amounts of high-throughput sequencing data with metadata descriptors. NextGen sequence data has historically been a research tool to enable more refined searches for genome-phenome associations. However, cost declines make this technology appealing in some cases for more routine uses, such as commercial genetic prediction. In both cases, research and industry use, the ability to share sequence data and accompanying metadata is critical. As technologies such as NextGen sequencing become available to industry for routine use, leveraging both industry and research community data is needed to avoid duplication of effort and cost. Moreover, data sharing among industry segments enables more accurate management decisions as animals are bought and sold, and enables genome-phenome discoveries when critical phenotypes might only be expressed in industry segments that do not currently use centralized data repositories. With these unique industry features and needs in mind, we will continue to work with UNL beef cattle populations (cow-calf, seedstock, feedlot, on-farm research) and the FAANG consortium to develop APIs that enable data transfer and sharing among producers, researchers, and the Germplasm Repository, while we host discussions with stakeholders about a livestock data framework.
Standards, expectations, and guidelines for uploading and hosting genotypic and phenotypic data (e.g., BioProject, BioSample) will be adapted from the frameworks for the NCBI Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) and dbGaP (https://www.ncbi.nlm.nih.gov/gap/).
Objective 2: Develop a clear and complete sustainability plan for updates, expansion, and user support that delineates principles and examples of specific mechanisms for capacity integration that will support effective use of the data.
The sustainability of these framework advances currently rests on several ongoing efforts. We continue to leverage the Agricultural Data Coalition and AgGateway, whose members and industry partners provide financial support for the maintenance of the database in Objective 1. The APIs and tools developed are based on open-source code and are supported by the OATS Center and the NAGP infrastructure (part of ARS). Demand from ADC and AgGateway members, and from the NAGP user community, will drive updates, expansion, and user support beyond the award period. Further updates and appropriate integration with current and future tool development, cyberinfrastructure facilities, and programs that will support effective use of the data will rely on both 'push' and 'pull' mechanisms. For the purposes of this project, integration of tools and facilities will be carried out with input solicited from the Scientific Advisory Board.
That said, we will pursue a more formalized approach and develop a roadmap to sustainability motivated by the process adopted by CyVerse and the AgBioData consortium.
Objective 3: Develop feasible appropriate assurances of regulatory and legal compliance including a data trust structure and FAIR principles.
The 2008 Farm Bill prohibited USDA from releasing producer information at an identifiable level, but it did not prohibit the collection and analysis of the information at an aggregate level (e.g., Confidential Information Protection and Statistical Efficiency Act or CIPSEA). However, many producers remain distrustful of USDA and other government agencies, citing concerns about inappropriate data use and consequent financial risk. Any national data framework must maintain the trust of producers, as the primary data providers, through transparency, privacy, security, and accountability. The framework that this project will contribute to developing will be "Ag Data Transparent", i.e., it will abide by the Privacy and Security Principles for Farm Data as endorsed by almost 40 agriculture groups and businesses. The framework will treat data as a trade secret and prohibit the copying or distribution of data without the producer's consent. These principles are consistent with the NIST Cybersecurity and Privacy Frameworks and cover all five privacy risk management areas: Identify, Govern, Control, Communicate, and Protect. The ADC and AgGateway, along with the members of our Scientific Advisory Board, will use their knowledge of these principles to contribute to the development of feasible appropriate assurances of regulatory and legal compliance.
Objective 4: Further next steps for framework growth and expansion through engagement and support of diverse participation in NAPDC activities, integration of feedback, and communication and dissemination of findings.
Several activities and aspects of this project are critical to framework growth and expansion.
These are outlined below and include a survey to gather stakeholder input and robust communication plans between the project team and stakeholders. These complement any next steps outlined in Objectives 1-3 above. The survey will be launched at the 2023 NAPDC conference in May 2023, when attendees will be asked to complete the survey online. A second round of data collection will follow in Fall 2023 to capture input from the farming community, many of whom are unavailable during the summer season. The results of this survey will be collated and form the basis of a peer-reviewed publication. This publication will be made open access for the broader community.