Progress 09/15/23 to 09/14/24
Outputs Target Audience: Our audience includes undergraduate and graduate students, research scientists, plant and animal breeding professionals, and data-sharing stakeholders such as field-based plant and animal breeders.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided?
We organized two stakeholder advisory meetings to discuss our progress and proposed solutions with stakeholders, including editors from scientific journals and representatives from plant and animal genetics companies.
How have the results been disseminated to communities of interest?
We organized two stakeholder advisory meetings to discuss our progress and proposed solutions with stakeholders, including editors from scientific journals and representatives from plant and animal genetics companies.
What do you plan to do during the next reporting period to accomplish the goals?
We will continue improving the database. To this end, we will reach out to various editors and colleagues, grant them access to the database, and seek their suggestions for improvements. Additionally, we will finalize and submit the manuscript currently in preparation and make the searchable database publicly available. Furthermore, we will continue to develop and validate the encryption methods and federated learning techniques.
Impacts What was accomplished under these goals?
We created the searchable database infrastructure. Specifically, we (i) defined the metadata required to identify data sets and to enable searches; (ii) built the database infrastructure; (iii) developed a web interface for the searchable database; and (iv) deployed a beta version of the database online. We identified 137 publicly available genomic data sets and populated the database with the metadata for these data sets. For each data set, we also created a data loader (accessible through the searchable database) that can be used to load the data set into an R environment after downloading it. Thus, the searchable database is functional and populated with 137 data sets. We outlined a manuscript (to be submitted as a 'resource' paper) that we aim to publish to introduce this resource to the genomic selection community. In this manuscript, we will present an application of the resource: benchmarking standard genomic prediction models on 43 selected data sets. The analyses will serve two purposes: (i) to illustrate possible uses of the searchable database, and (ii) to provide benchmarks that can be used in future methods and software research. To facilitate the use of these benchmarks, we will share a DOI for all the data sets used in the benchmarks, formatted and ready to be analyzed. We further developed our statistical methods for data encryption and federated learning, and validated these methods using real data.
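As an illustration of what benchmarking a standard genomic prediction model can look like, the following is a minimal sketch and not the project's actual pipeline. It assumes a ridge regression on marker genotypes (a common baseline of this kind), evaluated by k-fold cross-validation, and it uses simulated genotypes and phenotypes in place of a data set from the searchable database. All function names, the penalty value, and the simulation settings are hypothetical.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Fit ridge regression on centered markers with an unpenalized intercept."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    n_markers = X.shape[1]
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - mean(y)) for the marker effects.
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers),
                           Xc.T @ (y - y_mean))
    return x_mean, y_mean, beta


def ridge_predict(model, X):
    """Predict phenotypes for new genotypes from a fitted ridge model."""
    x_mean, y_mean, beta = model
    return y_mean + (X - x_mean) @ beta


def cv_accuracy(X, y, lam, n_folds=5, seed=0):
    """Mean correlation between predicted and observed phenotypes over folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correlations = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        model = ridge_fit(X[train_mask], y[train_mask], lam)
        y_hat = ridge_predict(model, X[test_idx])
        correlations.append(np.corrcoef(y_hat, y[test_idx])[0, 1])
    return float(np.mean(correlations))


def simulate(n=300, p=200, h2=0.5, seed=1):
    """Simulate 0/1/2 genotypes and a phenotype with heritability near h2.

    Purely illustrative stand-in for a real data set.
    """
    rng = np.random.default_rng(seed)
    X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
    g = X @ rng.normal(size=p)
    g = (g - g.mean()) / g.std() * np.sqrt(h2)
    e = rng.normal(scale=np.sqrt(1.0 - h2), size=n)
    return X, g + e


if __name__ == "__main__":
    X, y = simulate()
    # lam=100.0 is an assumed penalty; in practice it would be tuned or
    # derived from variance component estimates.
    print("5-fold CV prediction correlation:", cv_accuracy(X, y, lam=100.0))
```

Reporting the cross-validated correlation per data set and per model is one simple way such benchmarks could be tabulated; the metrics and models used in the manuscript may differ.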
Publications
- Type:
Other Journal Articles
Status:
Published
Year Published:
2024
Citation:
Zhao, T., Wang, F., Mott, R., Dekkers, J., Cheng, H., 2024, Using encrypted genotypes and phenotypes for collaborative genomic analyses to maintain data confidentiality, Genetics
- Type:
Conference Papers and Presentations
Status:
Accepted
Year Published:
2024
Citation:
Donna Li, Tianjing Zhao, Jack C M Dekkers, Richard Mott, and Hao Cheng, Using Encrypted Genotypes and Phenotypes for Reproducible and Collaborative Genomic Analyses to Maintain Data Confidentiality, AGBT Ag, 2024
- Type:
Conference Papers and Presentations
Status:
Accepted
Year Published:
2024
Citation:
Gustavo de los Campos, Hao Cheng, Jack Dekkers, Juan Steibel, Regularization And Transfer Learning in Genomic Models Through Gradient Descent with Early Stopping, AGBT Ag, 2024
- Type:
Conference Papers and Presentations
Status:
Accepted
Year Published:
2024
Citation:
Blessing Olabosoye, Hao Cheng, Jack Dekkers, Juan Steibel, Gustavo de los Campos, Transfer learning and meta-analysis strategies for optimizing genomic prediction accuracy without data sharing, AGBT Ag, 2024
- Type:
Conference Papers and Presentations
Status:
Accepted
Year Published:
2024
Citation:
Hao Cheng, Jack Dekkers, Juan Steibel, Gustavo de los Campos, Platforms and Methods for Sharing and Collaboration on Agricultural Genome to Phenome Using Public and Confidential Data, AGBT Ag, 2024