Progress 01/01/24 to 12/31/24
Outputs Target Audience: The results of this project will be of interest to geneticists (human and livestock), animal breeders, animal scientists, animal breeding and genetics instructors, and progressive livestock producers. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? During the reporting period, two PhD students were mentored under the supervision of Co-PD Li on this project. One PhD student attended the NAACL 2024 conference and presented there. Both PhD students presented a poster of the published NAACL paper in the Department of Computer Science. One PhD student graduated and joined Amazon as a Research Scientist. How have the results been disseminated to communities of interest? Co-PD Li, in collaboration with the ISU NRT-D4 Graduate Traineeship program, presented at an AI workshop for high school teachers. She introduced large language models and explained how LLMs and text mining can be used to accelerate scientific discovery. All new QTLdb data are ported to NCBI, Ensembl, UCSC genome browser, and Reuters Data Citation Index upon each data release. What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
Impact statement: This project specifically addresses the USDA-AFRI Animal Breeding and Functional Annotation of Genomes program by annotating livestock genomes with genotype-by-phenotype association data. The data we curate and the tools we develop will benefit all agriculturally important species. Objective 1: Machine learning-assisted curation of genotype-to-phenotype and correlation/heritability data, automated semantic annotation, and ontology enrichment to annotate livestock and aquaculture genomes coupled with an intelligent document search system. Throughout the reporting period, we have developed two text mining models for CorrDB and QTLdb. We developed an entity-linking tool named GenDecider. It is a novel re-ranking approach for the Zero-Shot Entity Linking (ZSEL) task. It detects scenarios where the correct entity is not among the retrieved candidates, a common oversight in existing re-ranking methods. We fine-tuned a LLaMA model using Wikipedia. To assess the entity-linking system's performance, we manually created three testing datasets by annotating around 140 trait mention-entity pairs and linking them to three ontologies: VT, LPT, and CMO. The retriever included the correct entity among the top 64 candidates for about 80% of linked mentions and among the top 10 candidates for about 70% of linked mentions. The re-ranker achieved 63% to 76% precision, meaning that, of the trait mentions it linked to an entity, over 60% were linked correctly. However, the model had relatively low recall (40%-54%), meaning that the re-ranker often output "None" for mentions that did have a correct entity. As an assistant tool, this system provides curators with both the top 10 candidates from the retriever and the re-ranker's output. This helps curators more quickly identify both whether a mention is linkable and potential entity candidates, significantly reducing manual search efforts. 
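The retriever and re-ranker figures above can be read as three simple ratios. The following is a minimal sketch of how such ratios might be computed, not the project's evaluation code: the function names, the convention that `None` marks "no entity linked", and the toy identifiers (`T1`, `T2`, ...) are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def recall_at_k(gold: Sequence[Optional[str]],
                candidates: Sequence[Sequence[str]],
                k: int) -> float:
    """Share of linkable mentions whose gold entity is among the top-k candidates."""
    linkable = [(g, c) for g, c in zip(gold, candidates) if g is not None]
    if not linkable:
        return 0.0
    hits = sum(1 for g, c in linkable if g in c[:k])
    return hits / len(linkable)


def precision_recall(gold: Sequence[Optional[str]],
                     predicted: Sequence[Optional[str]]) -> Tuple[float, float]:
    """Precision over non-None predictions; recall over linkable (non-None gold) mentions."""
    linked = [(g, p) for g, p in zip(gold, predicted) if p is not None]
    correct = sum(1 for g, p in linked if g == p)
    n_linkable = sum(1 for g in gold if g is not None)
    precision = correct / len(linked) if linked else 0.0
    recall = correct / n_linkable if n_linkable else 0.0
    return precision, recall


# Toy data with made-up identifiers; None means "no matching entity".
gold: List[Optional[str]] = ["T1", "T2", "T3", None, "T5"]
candidates = [["T1", "T9"], ["T8", "T2"], ["T7", "T6"], ["T4", "T1"], ["T5", "T2"]]
predicted: List[Optional[str]] = ["T1", None, None, None, "T2"]

top1 = recall_at_k(gold, candidates, 1)
top2 = recall_at_k(gold, candidates, 2)
precision, recall = precision_recall(gold, predicted)
```

In this toy case the re-ranker answers "None" for two linkable mentions, which lowers recall without affecting precision, the same pattern the report describes.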
During the previous reporting year, we designed an entity recognition method called CuPUL and manually annotated 102 abstracts with QTL for evaluation. This machine learning-based method utilizes a dictionary to identify entities of interest in the article. During this reporting period, we further evaluated the model on existing benchmark datasets to validate its effectiveness under different settings and evaluated existing state-of-the-art NER methods on QTL datasets. We found that CuPUL significantly outperformed state-of-the-art models on the QTL dataset and obtained state-of-the-art results on benchmark datasets. We confirmed the effectiveness of CuPUL, which can significantly reduce the workload associated with manual curation. The software for the two models is released on GitHub. Objective 2: Manual curation of genotype-to-phenotype and correlation/heritability data, and ontology development. In 2024, a total of 25,435 new QTL/association data were curated into QTLdb by November 20. Currently, there are 55,688 porcine QTL/associations, 192,247 bovine QTL/associations, 18,602 chicken QTL/associations, 2,216 horse QTL/associations, 4,743 sheep QTL/associations, 2,145 goat QTL/associations, and 2,201 rainbow trout QTL/associations released to the public domain. These totals were affected by the archival of 13,309 older linkage map-based data and a shift to the exclusive use of genome/SNP-based maps. All new data have been ported to NCBI, Ensembl, UCSC genome browser, and Reuters Data Citation Index for data sharing upon each data release. Therefore, users can utilize the browsing and data mining tools at these database sites to explore animal QTL/association data. In 2024, a total of 5,038 new correlations and 477 heritability estimates were curated into CorrDB. 
Currently, the CorrDB contains 37,974 correlations (15,687 cattle; 2,609 chicken; 3,755 goat; 209 horse; 12,210 pig; and 3,504 sheep) and 6,423 heritabilities (2,787 cattle; 536 chicken; 487 goat; 131 horse; 1,928 pig; and 554 sheep) that have been released to the public domain. (The released numbers will be higher by year-end for both databases.) In addition to the curation of new data, 6,346 previously curated QTL/association data, 8,374 correlation data, and 1,456 heritability data were updated as part of an overhaul of the trait management system within the QTLdb/CorrDB curator tools. The new system was developed to deal with an ever-increasing list of traits created to describe the varying circumstances under which traits are assessed, such as muscle pH in different muscles or at multiple post-mortem time points, litter size in different parities, or subcutaneous fat thickness at different locations on the body. Instead of creating a new "sibling trait" associated with QTL/association data (e.g., semimembranosus pH 24 hr post-mortem), the structure now relies on the creation of "trait variants," in which the base trait (muscle pH) is "modified" by one or more additional terms related to a defined set of properties such as anatomical location, environment, time of measurement, etc. This system allows the hierarchy of traits within the databases to remain manageable, since all data for a specific trait, regardless of the conditions of a particular study, are linked back to the same base trait. Continued development of VT/LBO: As an important part of Animal QTLdb/CorrDB development, expansion and improvement of the Vertebrate Trait Ontology (VT) have been ongoing. During 2024, we released 28 new versions of the VT. In collaboration with the Rat Genome Database at the Medical College of Wisconsin, we added cross-references to nearly 700 terms from the Experimental Factor Ontology (EFO) to related terms in the VT. 
This facilitates ontology interoperability and the linking of related data. In addition, we have released nine versions of the Livestock Breed Ontology (LBO), which is used for annotation of QTL/associations with breed data. Each release of the updated VT/LBO data has been made available on BioPortal, GitHub, and the AnimalGenome.org website.
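The "trait variant" structure described above (a base trait modified by terms drawn from a defined set of properties) can be pictured with a small data model. This is a hypothetical sketch only: the class names, field names, record identifiers, and the "longissimus" and "first parity" terms are invented for illustration and do not reflect the actual QTLdb/CorrDB schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Modifier:
    prop: str  # property being specified, e.g. "anatomical location"
    term: str  # value of that property, e.g. "semimembranosus"


@dataclass(frozen=True)
class TraitVariant:
    base_trait: str
    modifiers: Tuple[Modifier, ...] = ()

    def label(self) -> str:
        """Readable name: base trait followed by its modifiers, if any."""
        if not self.modifiers:
            return self.base_trait
        details = "; ".join(f"{m.prop}: {m.term}" for m in self.modifiers)
        return f"{self.base_trait} [{details}]"


def group_by_base_trait(
    records: Sequence[Tuple[str, TraitVariant]]
) -> Dict[str, List[str]]:
    """Collect record IDs under their base trait, ignoring study-specific modifiers."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for record_id, variant in records:
        grouped[variant.base_trait].append(record_id)
    return dict(grouped)


# Illustrative records; IDs and some terms are made up.
records = [
    ("rec-1", TraitVariant("muscle pH", (
        Modifier("anatomical location", "semimembranosus"),
        Modifier("time of measurement", "24 hr post-mortem"),
    ))),
    ("rec-2", TraitVariant("muscle pH", (
        Modifier("anatomical location", "longissimus"),
    ))),
    ("rec-3", TraitVariant("litter size", (Modifier("parity", "first parity"),))),
]
grouped = group_by_base_trait(records)
```

Because the measurement conditions live in the modifiers rather than in new sibling traits, both muscle pH records fall under a single base trait, which is the property that keeps the hierarchy manageable.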
Publications
- Type:
Conference Papers and Presentations
Status:
Accepted
Year Published:
2024
Citation:
Li, Y., K. Zhou, Q. Qiao, and Q. Li. 2024. Re-examine distantly supervised NER: A new benchmark and a simple approach. The 31st International Conference on Computational Linguistics, 2025. arXiv:2402.14948. https://arxiv.org/abs/2402.14948
- Type:
Conference Papers and Presentations
Status:
Accepted
Year Published:
2024
Citation:
Zhou, K., Y. Li, Q. Wang, Q. Qiao, and Q. Li. 2024. GenDecider: Integrating None of the Candidates Judgments in Zero-Shot Entity Linking Re-ranking. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2, 239-45.
- Type:
Peer Reviewed Journal Articles
Status:
Submitted
Year Published:
2024
Citation:
Mullen, K.R., I. Tammen, N.A. Matentzoglu, M. Mather, C.J. Mungall, M.A. Haendel, F.W. Nicholas, S. Toro, and the Vertebrate Breed Ontology Consortium. 2024. The Vertebrate Breed Ontology: Towards effective breed data standardization. arXiv:2406.02623.
https://doi.org/10.48550/arXiv.2406.02623
Progress 01/01/22 to 12/31/24
Outputs Target Audience: The results of this project will be of interest to geneticists (human and livestock), animal breeders, animal scientists, animal breeding and genetics instructors, and progressive livestock producers. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? During the reporting period, two PhD students were mentored under the supervision of Co-PD Li on this project. One PhD student attended the NAACL 2024 conference and presented there. Both PhD students presented a poster of the published NAACL paper in the Department of Computer Science. One PhD student graduated and joined Amazon as a Research Scientist. How have the results been disseminated to communities of interest? Co-PD Li, in collaboration with the ISU NRT-D4 Graduate Traineeship program, presented at an AI workshop for high school teachers. She introduced large language models and explained how LLMs and text mining can be used to accelerate scientific discovery. All new QTLdb data are ported to NCBI, Ensembl, UCSC genome browser, and Reuters Data Citation Index upon each data release. What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
Impact statement: This project specifically addresses the USDA-AFRI Animal Breeding and Functional Annotation of Genomes program by annotating livestock genomes with genotype-by-phenotype association data. The data we curate and tools we develop will benefit all agriculturally important species. Objective 1: Machine learning-assisted curation of genotype-to-phenotype and correlation/heritability data, automated semantic annotation, and ontology enrichment to annotate livestock and aquaculture genomes coupled with an intelligent document search system. Throughout the reporting period, we have developed two text mining models for CorrDB and QTLdb. We developed an entity-linking tool named GenDecider. It is a novel re-ranking approach for the Zero-Shot Entity Linking (ZSEL) task. It detects scenarios where the correct entity is not among the retrieved candidates, a common oversight in existing re-ranking methods. We fine-tuned a LLaMA model using Wikipedia. To assess the entity-linking system's performance, we manually created three testing datasets by annotating around 140 trait mention-entity pairs and linking them to three ontologies: VT, LPT, and CMO. The retriever included the correct entity among the top 64 candidates for about 80% of linked mentions and among the top 10 candidates for about 70% of linked mentions. The re-ranker achieved 63% to 76% precision, meaning that, of the trait mentions it linked to an entity, over 60% were linked correctly. However, the model had relatively low recall (40%-54%), meaning that the re-ranker often output "None" for mentions that did have a correct entity. As an assistant tool, this system provides curators with both the top 10 candidates from the retriever and the re-ranker's output. This helps curators more quickly identify both whether a mention is linkable and potential entity candidates, significantly reducing manual search efforts. 
During the previous reporting year, we designed an entity recognition method called CuPUL and manually annotated 102 abstracts with QTL for evaluation. This machine learning-based method utilizes a dictionary to identify entities of interest in the article. During this reporting period, we further evaluated the model on existing benchmark datasets to validate its effectiveness under different settings and evaluated existing state-of-the-art NER methods on QTL datasets. We found that CuPUL significantly outperformed state-of-the-art models on the QTL dataset and obtained state-of-the-art results on benchmark datasets. We confirmed the effectiveness of CuPUL, which can significantly reduce the workload associated with manual curation. The software for the two models is released on GitHub. Objective 2: Manual curation of genotype-to-phenotype and correlation/heritability data, and ontology development. In 2024, a total of 25,435 new QTL/association data were curated into QTLdb by November 20. Currently, there are 55,688 porcine QTL/associations, 192,247 bovine QTL/associations, 18,602 chicken QTL/associations, 2,216 horse QTL/associations, 4,743 sheep QTL/associations, 2,145 goat QTL/associations, and 2,201 rainbow trout QTL/associations released to the public domain. These totals were affected by the archival of 13,309 older linkage map-based data and a shift to the exclusive use of genome/SNP-based maps. All new data have been ported to NCBI, Ensembl, UCSC genome browser, and Reuters Data Citation Index for data sharing upon each data release. Therefore, users can utilize the browsing and data mining tools at these database sites to explore animal QTL/association data. In 2024, a total of 5,038 new correlations and 477 heritability estimates were curated into CorrDB. 
Currently, the CorrDB contains 37,974 correlations (15,687 cattle; 2,609 chicken; 3,755 goat; 209 horse; 12,210 pig; and 3,504 sheep) and 6,423 heritabilities (2,787 cattle; 536 chicken; 487 goat; 131 horse; 1,928 pig; and 554 sheep) that have been released to the public domain. (The released numbers will be higher by year-end for both databases.) In addition to the curation of new data, 6,346 previously curated QTL/association data, 8,374 correlation data, and 1,456 heritability data were updated as part of an overhaul of the trait management system within the QTLdb/CorrDB curator tools. The new system was developed to deal with an ever-increasing list of traits created to describe the varying circumstances under which traits are assessed, such as muscle pH in different muscles or at multiple post-mortem time points, litter size in different parities, or subcutaneous fat thickness at different locations on the body. Instead of creating a new "sibling trait" associated with QTL/association data (e.g., semimembranosus pH 24 hr post-mortem), the structure now relies on the creation of "trait variants," in which the base trait (muscle pH) is "modified" by one or more additional terms related to a defined set of properties such as anatomical location, environment, time of measurement, etc. This system allows the hierarchy of traits within the databases to remain manageable, since all data for a specific trait, regardless of the conditions of a particular study, are linked back to the same base trait. Continued development of VT/LBO: As an important part of Animal QTLdb/CorrDB development, expansion and improvement of the Vertebrate Trait Ontology (VT) have been ongoing. During 2024, we released 28 new versions of the VT. In collaboration with the Rat Genome Database at the Medical College of Wisconsin, we added cross-references to nearly 700 terms from the Experimental Factor Ontology (EFO) to related terms in the VT. 
This facilitates ontology interoperability and the linking of related data. In addition, we have released nine versions of the Livestock Breed Ontology (LBO), which is used for annotation of QTL/associations with breed data. Each release of the updated VT/LBO data has been made available on BioPortal, GitHub, and the AnimalGenome.org website.
Publications
- Type:
Conference Papers and Presentations
Status:
Accepted
Year Published:
2024
Citation:
Li, Y., K. Zhou, Q. Qiao, and Q. Li. 2024. Re-examine distantly supervised NER: A new benchmark and a simple approach. The 31st International Conference on Computational Linguistics, 2025. arXiv:2402.14948. https://arxiv.org/abs/2402.14948
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2024
Citation:
Zhou, K., Y. Li, Q. Wang, Q. Qiao, and Q. Li. 2024. GenDecider: Integrating None of the Candidates Judgments in Zero-Shot Entity Linking Re-ranking. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2, 239-45.
- Type:
Peer Reviewed Journal Articles
Status:
Published
Year Published:
2024
Citation:
Mullen, K.R., I. Tammen, N.A. Matentzoglu, M. Mather, C.J. Mungall, M.A. Haendel, F.W. Nicholas, S. Toro, and the Vertebrate Breed Ontology Consortium. 2024. The Vertebrate Breed Ontology: Towards effective breed data standardization. arXiv:2406.02623.
Progress 01/01/23 to 12/31/23
Outputs Target Audience: The results of this project will be of interest to geneticists (human and livestock), animal breeders, animal scientists, and progressive livestock producers. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? During the reporting period, two PhD students were mentored under the supervision of Co-PD Li on this project. One PhD student attended the AAAI 2023 conference and presented there. One PhD student served on the Program Committee for AAAI-2023's main track. Both PhD students presented a poster of the published AAAI paper in the Department of Computer Science. Both PhD students interned at Amazon as Applied Scientists working on information extraction-related projects. How have the results been disseminated to communities of interest? Co-PD Li, in collaboration with ISU Extension for Iowa-4H and the NRT-D4 Graduate Traineeship program, organized Data Science Workshops in Spencer, Iowa, and introduced how AI and text mining can be used to accelerate scientific discovery. All new QTLdb data are ported to NCBI, Ensembl, UCSC genome browser, and Reuters Data Citation Index upon each data release. What do you plan to do during the next reporting period to accomplish the goals? Objective 1: Using the trait entities recognized by the CuPUL method, we will develop a machine learning-based method to link them to their normalized or official names in the Vertebrate Trait Ontology (VT) and Livestock Product Trait Ontology (LPT), with the aim of reducing the curation effort spent on trait entity disambiguation. Objective 2: During 2024, we will continue curation of relevant data into the QTLdb and CorrDB, as well as development of the related ontologies, VT, LPT, and LBO.
Impacts What was accomplished under these goals?
Impact statement: This project specifically addresses the USDA-AFRI Animal Breeding and Functional Annotation of Genomes program by annotating livestock genomes with genotype-by-phenotype association data. The data we curate and tools we develop will benefit all agriculturally important species. Objective 1: Machine learning-assisted curation of genotype-to-phenotype and correlation/heritability data, automated semantic annotation, and ontology enrichment to annotate livestock and aquaculture genomes coupled with an intelligent document search system. Throughout the reporting period, we have developed two text mining models for CorrDB and QTLdb. We have trained a deep learning model to pre-screen articles retrieved through keyword search from PubMed. This model filters out articles that are not relevant to our databases before they reach the curation stage. We deployed this model on a batch of 20,508 articles, of which it identified only 13% as relevant to our databases. Human evaluation on a selected subset of these articles showed that the model has high recall, identifying over 95% of the relevant articles. This has led to a significant decrease in the amount of time and effort our curators need to spend on filtering out irrelevant content. We also developed an entity recognition method called CuPUL. This machine learning-based method utilizes a dictionary to identify entities of interest in the article. We gathered a dictionary containing 3,884 trait names and the abstracts of 1,716 articles to train the machine learning models in CuPUL. The trained models are capable of marking trait entities in relevant articles, providing a foundation for semantic annotation and ontology development. 
Additional verification revealed that CuPUL achieved a 98.5% success rate in retrieving trait entities in 102 manually annotated articles, while its precision was 54.4%. This method significantly reduces the workload associated with manual curation. Objective 2: Manual curation of genotype-to-phenotype and correlation/heritability data, and ontology development. In 2023, a total of 22,883 new QTL/association data were curated into QTLdb by November 20. Currently, there are 54,816 porcine QTL/associations, 195,011 bovine QTL/associations, 18,646 chicken QTL/associations, 2,649 horse QTL/associations, 4,729 sheep QTL/associations, 558 goat QTL/associations, and 2,329 rainbow trout QTL/associations released to the public domain. All new data have been ported to NCBI, Ensembl, UCSC genome browser, and Reuters Data Citation Index for data sharing upon each data release. Therefore, users can utilize the browsing and data mining tools at these database sites to explore animal QTL/association data. In 2023, a total of 5,709 new correlations and 783 heritability estimates were curated into CorrDB. Currently, the CorrDB contains 29,947 correlations (14,839 cattle; 2,053 chicken; 834 goat; 209 horse; 9,308 pig; and 2,704 sheep) and 5,458 heritabilities (2,636 cattle; 400 chicken; 72 goat; 131 horse; 1,781 pig; and 438 sheep) that have been released to the public domain. (The released numbers will be higher by year-end for both databases.) In addition to the curation of new data, 19,448 previously curated QTL/association data, 6,454 correlation data, and 58 heritability data were updated as part of an overhaul of the trait management system within the QTLdb/CorrDB curator tools. 
The new system was developed to deal with an ever-increasing list of traits created to describe the varying circumstances under which traits are assessed, such as muscle pH in different muscles or at multiple post-mortem time points, litter size in different parities, or subcutaneous fat thickness at different locations on the body. Instead of creating a new "sibling trait" associated with QTL/association data (e.g., semimembranosus pH 24 hr post-mortem), the structure now relies on the creation of "trait variants," in which the base trait (muscle pH) is "modified" by one or more additional terms related to a defined set of properties such as anatomical location, environment, time of measurement, etc. This system allows the hierarchy of traits within the databases to remain manageable, since all data for a specific trait, regardless of the conditions of a particular study, are linked back to the same base trait. Continued development of VT/LPT: As an important part of Animal QTLdb/CorrDB development, expansion and improvement of the Vertebrate Trait Ontology (VT) have been ongoing. During 2023, we released 14 new versions of VT. In addition, we have released 10 versions of the Livestock Breed Ontology (LBO), which is used for annotation of QTL/associations with breed data. Each release of the updated VT/LBO data has been made available on BioPortal, GitHub, and the AnimalGenome.org website.
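The Objective 1 percentages reported for 2023 can be turned into rough workload numbers with simple arithmetic. The sketch below is illustrative only: the report gives 13% as a rounded share, so the article counts are approximate, and treating CuPUL's 98.5% retrieval rate as recall in order to derive an F1 score is an assumption; the report itself does not state an F1.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Pre-screening: 20,508 articles, about 13% flagged as relevant (rounded in the report).
n_articles = 20_508
relevant_fraction = 0.13
approx_relevant = round(n_articles * relevant_fraction)  # articles passed to curators
approx_filtered = n_articles - approx_relevant           # articles screened out

# CuPUL on 102 annotated articles: precision 54.4%; 98.5% read as recall (assumption).
cupul_f1 = f1_score(precision=0.544, recall=0.985)
```

Under these assumptions, curators would review roughly 2,700 articles instead of more than 20,000, and the high-recall, moderate-precision profile of CuPUL corresponds to an F1 of about 0.70: curators discard some false positives but rarely miss a trait mention.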
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2023
Citation:
Hu, Z.-L., C.A. Park, and J.M. Reecy. 2023. A combinatorial approach implementing new database structures to facilitate practical data curation management of QTL, association, correlation, and heritability data on trait variants. Database (Oxford). 2023:baad024. https://doi.org/10.1093/database/baad024
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2023
Citation:
Hu, Z.-L., C.A. Park, and J.M. Reecy. 2023. An implementation of new approaches to extend livestock trait ontologies for practical curation management of QTL, association, correlation, and heritability data. Presented at Plant & Animal Genome 30 Meeting, January 8-12, 2023. San Diego, CA. https://animalgenome.org/QTLdb/publications/2023PAG.pdf
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2023
Citation:
Zhou, K., Q. Qiao, Y. Li, and Q. Li. 2023. Improving distantly supervised relation extraction by natural language inference. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 11, pp. 14047-14055).
https://doi.org/10.1609/aaai.v37i11.26644
- Type:
Conference Papers and Presentations
Status:
Submitted
Year Published:
2023
Citation:
Li, Y., K. Zhou, Q. Qiao, Q. Wang, and Q. Li. 2024. Improving Distantly Supervised NER via Token-Level Curriculum-Based Positive-Unlabeled Learning. NAACL 2024
Progress 01/01/22 to 12/31/22
Outputs Target Audience: The results of this project will be of interest to geneticists (human and livestock), animal breeders, animal scientists, and progressive livestock producers. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? During the reporting period, two Ph.D. students were mentored under the supervision of Co-PD Li on this project. One Ph.D. student attended the ACL 2022 conference and presented there. How have the results been disseminated to communities of interest? The implementation of ConfMPU is publicly available on GitHub. All new QTLdb data are ported to NCBI, Ensembl, UCSC genome browser, and Reuters Data Citation Index upon each data release. What do you plan to do during the next reporting period to accomplish the goals? Objective 1: We will include the two text mining models in the current curation pipeline of CorrDB and QTLdb. We will design an automated ontology enrichment method to assist the development of the Vertebrate Trait Ontology (VT) and Livestock Product Trait Ontology (LPT). Objective 2: During 2023, we will continue curation of relevant data into the QTLdb and CorrDB, as well as ontology development of the VT, LPT, and LBO.
Impacts What was accomplished under these goals?
Impact statement: This project specifically addresses the USDA-AFRI Animal Breeding and Functional Annotation of Genomes program by annotating livestock genomes with genotype-by-phenotype association data. The data we curate and the tools we develop will benefit all agriculturally important species. Objective 1: Machine learning-assisted curation of genotype-to-phenotype and correlation/heritability data, automated semantic annotation, and ontology enrichment to annotate livestock and aquaculture genomes coupled with an intelligent document search system. Throughout the reporting period, we have developed two text mining models for CorrDB and QTLdb. We have trained a deep learning model to pre-screen articles retrieved through keyword search from PubMed. This model can filter irrelevant articles before the curation step. We also developed a new entity recognition method called ConfMPU. It is a machine learning method that uses a dictionary to recognize entity names of interest in the articles. ConfMPU significantly improved on state-of-the-art distantly supervised named entity recognition methods on multiple benchmark datasets from different domains and achieved results comparable with fully supervised methods. This method has been used to train a trait entity recognition model for CorrDB and QTLdb. It built a solid foundation for semantic annotation and ontology development. Objective 2: Manual curation of genotype-to-phenotype and correlation/heritability data, and ontology development. In 2022, a total of 22,320 new QTL/association data were curated into QTLdb. Currently, there are 36,725 porcine QTL/associations, 193,641 bovine QTL/associations, 18,313 chicken QTL/associations, 2,649 horse QTL/associations, 4,504 sheep QTL/associations, 129 goat QTL/associations, and 2,329 rainbow trout QTL/associations in the database. All new data have been ported to NCBI, Ensembl, UCSC genome browser, and Reuters Data Citation Index for data sharing upon each data release. 
Therefore, users can utilize the browsing and data mining tools at these database sites to explore animal QTL/association data. In 2022, a total of 2,735 new correlations and 459 heritability estimates were curated into CorrDB. Currently, the CorrDB contains 26,839 correlations (13,471 cattle; 1,835 chicken; 311 goat; 209 horse; 8,902 pig; and 2,111 sheep) and 4,778 heritabilities (2,330 cattle; 377 chicken; 5 goat; 123 horse; 1,607 pig; and 336 sheep). In addition to the curation of new data, 16,227 previously curated QTL/association data, 5,573 correlation data, and 415 heritability data were updated as part of an overhaul of the trait management system within the QTLdb/CorrDB curator tools. The new system was developed to deal with an ever-increasing list of traits created to describe the varying circumstances under which traits are assessed, such as muscle pH in different muscles or at multiple post-mortem time points, litter size in different parities, or subcutaneous fat thickness at different locations on the body. Instead of creating a new "sibling trait" associated with QTL/association data (e.g., semimembranosus pH 24 hr post-mortem), the structure now relies on the creation of "trait variants," in which the base trait (muscle pH) is "modified" by one or more additional terms related to a defined set of properties such as anatomical location, environment, time of measurement, etc. This system allows the hierarchy of traits within the databases to remain manageable, since all data for a specific trait, regardless of the conditions of a particular study, are linked back to the same base trait. Continued development of VT/LPT: As an important part of Animal QTLdb/CorrDB development, expansion and improvement of the Vertebrate Trait Ontology (VT) and Livestock Product Trait Ontology (LPT) have been ongoing. During 2022, we released nine new versions of VT and three new versions of LPT. 
In addition, we have released 15 versions of the Livestock Breed Ontology (LBO), which is used for annotation of QTL/associations with breed data. Each release of the updated VT/LPT/LBO data has been made available on BioPortal, GitHub, and the AnimalGenome.org website.
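Dictionary-based (distantly supervised) entity recognition, as mentioned under Objective 1, typically starts by matching dictionary entries against article text to produce noisy labels. The sketch below shows only that matching step under simple assumptions (whitespace tokens, case-insensitive longest match, BIO tags); it is not the ConfMPU method, and the function name, tag names, and the tiny dictionary are invented for illustration.

```python
from typing import List, Set


def distant_labels(tokens: List[str], dictionary: Set[str], max_len: int = 4) -> List[str]:
    """Tag tokens with B-Trait/I-Trait where the longest dictionary phrase matches, else O."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest span first so multi-word trait names win over shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(lowered[i:i + n]) in dictionary:
                matched = n
                break
        if matched:
            labels[i] = "B-Trait"
            for j in range(i + 1, i + matched):
                labels[j] = "I-Trait"
            i += matched
        else:
            i += 1
    return labels


# Hypothetical mini-dictionary (lowercase) and sentence.
trait_dictionary = {"muscle ph", "litter size", "backfat thickness"}
sentence = "A QTL for litter size and muscle pH was detected".split()
tags = distant_labels(sentence, trait_dictionary)
```

Tokens left as "O" by this step are better treated as unlabeled than as true negatives, because no dictionary is complete; handling that uncertainty is the part addressed by positive-unlabeled learning methods such as the one named in the report.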
Publications
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2022
Citation:
Hu, Z.-L., C.A. Park, and J.M. Reecy. 2022. A database structural improvement for efficient trait variation curation in Animal QTLdb and CorrDB. Presented at the World Congress on Genetics Applied to Livestock Production (WCGALP). The Netherlands.
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2022
Citation:
Zhou, K., Y. Li, and Q. Li. 2022. Distantly Supervised Named Entity Recognition via Confidence-Based Multi-Class Positive and Unlabeled Learning. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).