Progress 04/10/13 to 04/09/18
Outputs Progress Report
Objectives (from AD-416):
Objective 1: Support stewardship of maize genome sequences and forthcoming diverse maize sequences.
Objective 2: Create tools to enhance access to expanded datasets that reveal gene function and datasets for genetic and breeding analyses.
Objective 3: Deploy tools to increase user-specified flexible queries.
Objective 4: Provide community support services, training and documentation, meeting coordination, and support for community elections and surveys.
Objective 5: Facilitate the use of genomic and genetic data, information, and tools for germplasm improvement, thus empowering ARS scientists and partners to use a new generation of computational tools and resources.
Approach (from AD-416):
For applied researchers to benefit from basic investigations, generated data must be made freely and easily accessible. MaizeGDB, the Maize Genetics and Genomics Database (http://www.maizegdb.org), is the research community's central repository for genetics and genomics information. The overall aim of our work is to create and maintain unified public resources that facilitate access to the outcomes of maize research. We will support reference genome stewardship within the context of extensive genomic diversity by adopting, developing, and deploying tools that enable community members to annotate and document updates to the genome and by developing and deploying genome visualization tools that enable user-friendly interaction with reference genome and diversity data. We will deploy datasets and tools that reveal gene function and support genetic and breeding analyses by adopting and populating network analysis software to support all types of data that can be represented by such means including gene networks, interaction data, and pedigree information. In addition, we will enable researchers to access data in a customized and flexible manner by deploying tools that enable direct interaction with the MaizeGDB database.
Continued efforts to engage in education, outreach, and organizational needs of the maize research community will involve the creation and deployment of video and one-on-one tutorials, updating maize Cooperators on developments of interest to the community, and supporting the information technology needs of the Maize Genetics Executive Committee and Annual Maize Genetics Conference Steering Committee.
ARS scientists working on the Maize Genetics and Genomics Database (MaizeGDB) in Ames, Iowa provided tools and resources that make the maize genome sequence useful for investigative research and crop improvement. Genome sequences served include the latest version of the B73 representative genome (RefGen_v4) and the recently released genome assemblies of 10 additional maize inbred lines. Genotype data for thousands of other lines and individuals that represent the broad diversity of the Zea genus (i.e., maize and its near relatives) are also available as downloads or through a recently developed tool to visualize maize diversity. MaizeGDB focuses on curating high-quality, high-impact data sets for the maize research community. This includes the curation of over 75 datasets from the maize community to be added to the MaizeGDB genome browser. The current versions of the genome browser have over 150 datasets (tracks). In addition, MaizeGDB now supports 14 genome browsers for recently released maize genomes and over 90 data sets that can be used as targets for sequence similarity searches. This allows researchers to leverage research outcomes from many different sources and data types, all within the context of the maize reference assemblies. To enable a better understanding of how the genes in a plant define the potential phenotypes that will be observed in farmers' fields, we maintain a pathway view tool suite called CornCyc. This resource helps researchers determine which genes and pathways to select for targeted crop improvement.
The redesigned graphical user interface of MaizeGDB has been continually updated to provide improved access for maize researchers. Additional tools have been developed, including a tool to visually explore maize variation in over 17,000 maize lines. Work carried out by the MaizeGDB team has resulted in improved communication among maize researchers worldwide, increased ability to document the results of experiments, and increased availability of information relevant to high-impact research.
Accomplishments
01 Tools, resources, and curation have been updated at MaizeGDB. In genetics and genomics research, model organism databases are critical, acting as both a data repository and a resource to search, integrate, analyze, and visualize the data. For maize, this need is met by MaizeGDB, the Maize Genetics and Genomics Database. ARS researchers in Ames, Iowa have expanded the capabilities of MaizeGDB to meet the needs of the maize research community. MaizeGDB provides ways to visually explore genomes, search based on sequence similarity, visualize variation across germplasm, and display webpages of curated data, and it provides consistent nomenclature, standardized data, and quality statistics. MaizeGDB has expanded its public resources to support sequenced genomes beyond B73, a long-established maize inbred line that was used as a founder of many public and private breeding programs. Data and tools are available for fourteen additional maize genome assemblies. MaizeGDB hosts a wide range of data, with recent support for new data types including genome metadata, gene expression, protein analysis, and variation data. To improve access to and visualization of data, several new tools have been implemented to explore the variation in maize, compare how genes are expressed, visualize pedigrees, link genes with images, and enable improved access. MaizeGDB continues to be the community hub for maize research, coordinating activities and providing technical support.
These resources provide long-term support, stability, and maintenance to maize research data and accelerate maize trait analysis, germplasm analysis, genetic studies, and breeding through better data access and utilization.
Impacts (N/A)
Publications
- Walsh, J., Schaeffer, M.L., Zhang, P., Rhee, S., Dickerson, J., Sen, T.Z. 2016. The quality of metabolic pathway resources depends on initial enzymatic function assignments and level of manual curation: A case for maize. BMC Systems Biology. 10:129.
Progress 10/01/16 to 09/30/17
Outputs Progress Report
Objectives (from AD-416):
Objective 1: Support stewardship of maize genome sequences and forthcoming diverse maize sequences.
Objective 2: Create tools to enhance access to expanded datasets that reveal gene function and datasets for genetic and breeding analyses.
Objective 3: Deploy tools to increase user-specified flexible queries.
Objective 4: Provide community support services, training and documentation, meeting coordination, and support for community elections and surveys.
Objective 5: Facilitate the use of genomic and genetic data, information, and tools for germplasm improvement, thus empowering ARS scientists and partners to use a new generation of computational tools and resources.
Approach (from AD-416):
For applied researchers to benefit from basic investigations, generated data must be made freely and easily accessible. MaizeGDB, the Maize Genetics and Genomics Database (http://www.maizegdb.org), is the research community's central repository for genetics and genomics information. The overall aim of our work is to create and maintain unified public resources that facilitate access to the outcomes of maize research. We will support reference genome stewardship within the context of extensive genomic diversity by adopting, developing, and deploying tools that enable community members to annotate and document updates to the genome and by developing and deploying genome visualization tools that enable user-friendly interaction with reference genome and diversity data. We will deploy datasets and tools that reveal gene function and support genetic and breeding analyses by adopting and populating network analysis software to support all types of data that can be represented by such means including gene networks, interaction data, and pedigree information. In addition, we will enable researchers to access data in a customized and flexible manner by deploying tools that enable direct interaction with the MaizeGDB database.
Continued efforts to engage in education, outreach, and organizational needs of the maize research community will involve the creation and deployment of video and one-on-one tutorials, updating maize Cooperators on developments of interest to the community, and supporting the information technology needs of the Maize Genetics Executive Committee and Annual Maize Genetics Conference Steering Committee.
ARS scientists working on the Maize Genetics and Genomics Database (MaizeGDB) in Ames, Iowa improved tools that make the maize genome sequence useful for investigative research and crop improvement. Genome sequences served include the latest version of the B73 representative genome (RefGen_v4) and the newly released genome assemblies of B104, CML247, EP1, F7, Mo17, PH207, and W22. Skim sequences of thousands of other lines and individuals that represent the broad diversity of the Zea genus (i.e., maize and its near relatives) are also available as downloads or through a recently developed tool to visualize maize diversity. MaizeGDB focuses on curating high-quality, high-impact data sets for the maize research community. This includes the curation of over 50 datasets from the maize community to be added to the MaizeGDB genome browser. The current versions of the genome browser have over 150 datasets (tracks). In addition, MaizeGDB has created eight new genome browsers for recently released maize genomes and over 21 data sets that can be used as targets for sequence similarity searches. This allows researchers to leverage research outcomes from many different sources and data types, all within the context of the maize reference assemblies. To enable a better understanding of how the genes in a plant define the potential phenotypes that will be observed in farmers' fields, we maintain a pathway view tool suite called CornCyc. This resource helps researchers determine which genes and pathways to select for targeted crop improvement.
The redesigned graphical user interface of MaizeGDB has been continually updated to provide improved access for maize researchers. The hardware for the interface has been updated to allow for quicker and more reliable access to the data. Work carried out by the MaizeGDB team has resulted in improved communication among maize researchers worldwide, increased ability to document the results of experiments, and increased availability of information relevant to high-impact research.
Accomplishments
01 MaizeGDB provides in-depth support for the maize representative reference genome. Since 2009, there have been four versions of the representative maize reference genome B73. In 2017, ARS researchers from Cold Spring Harbor, New York released the fourth version (RefGen_v4). This assembly was based on longer sequence reads and represents a major leap forward in whole-genome sequencing technology, which will impact MaizeGDB's role in genome assembly stewardship. MaizeGDB has updated all of its tools and resources to the latest version of B73. MaizeGDB incorporated the current B73 reference assembly into the Genome Reference Consortium (GRC), which has tools to visualize the quality of an assembly and to explore and fix assembly issues. These tools provide a resource that leverages the knowledge and expertise in the maize research community to improve the reference assembly, which in turn improves research in maize genetics, genomics, and breeding. This provides a path forward to creating future B73 genome assemblies driven by input from the maize research community.
02 Data for nine maize genome assemblies are available at MaizeGDB. In addition to the maize reference genome that has been publicly available since 2009, several additional reference-quality genomes are now available to the research community. MaizeGDB currently has information on 11 maize genomes.
ARS and Iowa State University personnel from Ames, Iowa have developed a plan to standardize information related to maize assemblies. To address the long-term stability of these genomes, MaizeGDB provides ways to visually explore the genomes (genome browsers) and search based on sequence similarity (BLAST tools), along with webpages of curated data for the genome and genes, consistent nomenclature, standardized data downloads, and quality statistics. MaizeGDB also handles the submissions for many of these reference-quality assemblies to a federally funded data repository (GenBank). These resources provide long-term support, stability, and maintenance to maize reference assemblies that allow researchers to better utilize the data.
03 The MaizeGDB Genotype Visualization tool. One of the priorities set by the Breeders Tools and Resources survey conducted in 2015 was displaying DNA changes in a region for a given list of lines. To this end, ARS and Iowa State University personnel in Ames, Iowa, developed a tool to help maize researchers select a customized set of maize lines and a genomic region, and visualize DNA changes for that genomic region. The datasets contain approximately 1 million regions of DNA diversity for over 17,000 public maize lines. The tool has been upgraded to improve the flexibility of searches, produce more detailed results, and provide estimates of query runtimes. This tool has utility for maize geneticists trying to identify and clone genes of interest, relate genomic regions to phenotype, and understand the diversity in maize. This tool will help maize breeders identify appropriate maize lines to use in breeding projects, which will improve available germplasm for researchers and farmers.
04 MaizeGDB has updated genome nomenclature rules. The addition of numerous whole-genome assemblies and their associated gene models has necessitated a rethinking of the current nomenclature rules that are in effect for Zea mays and closely associated Zea species.
It is important to establish new nomenclature early, as non-standardized names create confusion in both the community and the literature. ARS personnel from Ames, Iowa have expanded the current nomenclature rules and, in doing so, have tried to take into account what has worked well for other model organism databases. The expanded nomenclature rules have been accepted by the Maize Nomenclature Committee and adopted by all of the most recent maize sequencing projects. These new rules take into account the many issues and complications that arise when hundreds of new whole-genome assemblies and their associated gene models become available. The standardized nomenclature will facilitate multiple genome-wide comparisons that will provide insight into, and future improvements to, the publicly available maize germplasm.
05 ARS staff trained, informed, and provided collaboration opportunities to students and scientists at workshops during the Annual Maize Genetics Conference. ARS personnel hosted three workshops titled "The Maize Genomes", "Losing the Fear of the Command Line", and "Maize Tools & Resources". The "Maize Genomes" workshop had eight speakers presenting on the reference-quality genomes that are available to the research community. These reference assemblies are very valuable to the maize research community, but to best utilize these resources, the groups need to work together to provide a common set of nomenclature, data sets, and tools. The "Losing the Fear of the Command Line" workshop was designed to give hands-on experience with using the command line. The workshop was led by four members of the MaizeGDB staff, along with seven volunteers from the maize research community. The "Maize Tools & Resources" workshop had nine speakers give short 10-15 minute talks on data, tools, and resources available for the maize research community.
ARS staff spearheaded an effort to organize these workshops to train, inform, and foster collaboration within the maize research community. Each of these workshops took place in St. Louis, Missouri prior to the 59th Annual Maize Genetics Conference. Greater communication and transparency between these groups and within the community will lead to greater collaboration and improved quality of research for our stakeholders.
Impacts (N/A)
Publications
- Sen, T.Z., Braun, B., Schott, D., Portwood II, J.L., Schaeffer, M.L., Harper, E.C., Cannon, E.K., Andorf, C.M. 2017. Surveying the maize community for their diversity and pedigree visualization needs to prioritize tool development and curation. Database: The Journal of Biological Databases and Curation. doi: 10.1093/database/bax031.
- Walsh, J.R., Schaeffer, M.L., Zhang, P., Rhee, S.Y., Dickerson, J.A., Sen, T.Z. 2016. The quality of metabolic pathway resources depends on initial enzymatic function assignments: a case for maize. BMC Systems Biology. 10:129. doi: 10.1186/s12918-016-0369-x.
Progress 10/01/15 to 09/30/16
Outputs Progress Report
Objectives (from AD-416):
Objective 1: Support stewardship of maize genome sequences and forthcoming diverse maize sequences.
Objective 2: Create tools to enhance access to expanded datasets that reveal gene function and datasets for genetic and breeding analyses.
Objective 3: Deploy tools to increase user-specified flexible queries.
Objective 4: Provide community support services, training and documentation, meeting coordination, and support for community elections and surveys.
Objective 5: Facilitate the use of genomic and genetic data, information, and tools for germplasm improvement, thus empowering ARS scientists and partners to use a new generation of computational tools and resources.
Approach (from AD-416):
For applied researchers to benefit from basic investigations, generated data must be made freely and easily accessible. MaizeGDB, the Maize Genetics and Genomics Database (http://www.maizegdb.org), is the research community's central repository for genetics and genomics information. The overall aim of our work is to create and maintain unified public resources that facilitate access to the outcomes of maize research. We will support reference genome stewardship within the context of extensive genomic diversity by adopting, developing, and deploying tools that enable community members to annotate and document updates to the genome and by developing and deploying genome visualization tools that enable user-friendly interaction with reference genome and diversity data. We will deploy datasets and tools that reveal gene function and support genetic and breeding analyses by adopting and populating network analysis software to support all types of data that can be represented by such means including gene networks, interaction data, and pedigree information. In addition, we will enable researchers to access data in a customized and flexible manner by deploying tools that enable direct interaction with the MaizeGDB database.
Continued efforts to engage in education, outreach, and organizational needs of the maize research community will involve the creation and deployment of video and one-on-one tutorials, updating maize Cooperators on developments of interest to the community, and supporting the information technology needs of the Maize Genetics Executive Committee and Annual Maize Genetics Conference Steering Committee.
ARS scientists working on the Maize Genetics and Genomics Database (MaizeGDB) in Ames, Iowa, Columbia, Missouri, and Albany, California improved tools that make the maize genome sequence useful for investigative research and crop improvement. Genome sequences served include the latest version of the B73 reference genome, the upcoming releases of W22, B104, and CML247, and skim sequences of thousands of other lines and individuals that represent the broad diversity of the Zea genus (i.e., maize and its near relatives). Members of the MaizeGDB team improved and developed tools to visualize pedigree relationships and maize diversity data (large SNP-based datasets), and improved research capabilities at MaizeGDB. In addition, staff curated over 15 datasets from the maize community to be added to the MaizeGDB genome browser. The current versions of the genome browser have over 75 datasets (tracks). This allows researchers to leverage research outcomes from many different sources and data types, all within the context of the maize reference assembly. To enable a better understanding of how the genes in a plant define the potential phenotypes that will be observed in farmers' fields, we maintain two pathway view tool suites: CornCyc and MaizeCyc. These resources help researchers determine which genes and pathways to select for targeted crop improvement. The new, redesigned graphical user interface of MaizeGDB has been continually updated to provide improved access for maize researchers.
The hardware for the interface has been updated to allow for quicker and more reliable access to the data. Work carried out by the MaizeGDB team has resulted in improved communication among maize researchers worldwide, increased ability to document the results of experiments, and increased availability of information relevant to high-impact research.
Accomplishments
01 Maize integrated into the Genome Reference Consortium. ARS researchers from Cold Spring Harbor released the third version of the maize reference genome. Although the reference assembly improves with each version, assembly issues persist, so MaizeGDB incorporated the current B73 reference assembly into the Genome Reference Consortium (GRC), which has tools to visualize the quality of an assembly and to explore and fix assembly issues. A set of standard operating procedures (SOPs) exists for handling publicly submitted assembly issues, informing researchers about progress on resolving the issues, and releasing resolved issues as patch assembly releases. MaizeGDB has collected 547 assembly and annotation issues to date, from the literature and directly from the maize research community. An important aspect of a patch release is that it does not change assembly coordinates but does give the community access to improvements between major releases of the assembly. These tools provide a resource that leverages the knowledge and expertise in the maize research community to improve the reference assembly, which in turn improves research in maize genetics, genomics, and breeding.
02 Genome stewardship support for maize genomes. Several additional reference-quality genomes will soon be available to the research community. Potentially hundreds to thousands of new genomes may be available in the future. There is a strong need to standardize information related to maize assemblies and for a resource that provides long-term support, maintenance, and stability to the assemblies.
MaizeGDB has worked closely with three of these projects (W22, CML247, and B104) to develop a plan to meet the above needs. To address the long-term stability of these genomes, MaizeGDB is handling the GenBank submissions for each of these three reference-quality assemblies. The W22 genome was successfully submitted to GenBank. MaizeGDB has developed web pages for maize genomes. These web pages provide a centralized hub for maize genome assemblies. The content of the web pages includes consistent nomenclature, standardized data downloads, quality statistics, genome browsers, sequence alignment tools, and general annotation pages. This resource provides long-term support, stability, and maintenance to maize reference assemblies that allow researchers to better utilize the data.
03 The MaizeGDB Genotype Visualization Tool. One of the priorities set by the Breeders Tools and Resources survey conducted in 2015 was displaying DNA changes in a region for a given list of lines. To this end, the MaizeGDB Team in Ames, Iowa, developed a tool to help maize researchers select a customized set of maize lines and a genomic region, and visualize DNA changes for that genomic region. The datasets contain approximately 1 million regions of DNA diversity for over 17,000 public maize lines. This tool has utility for maize geneticists trying to identify and clone genes of interest, relate genomic regions to phenotype, and understand the diversity in maize. This tool will help maize breeders identify which maize lines to cross, which will improve available germplasm for researchers and farmers.
04 The MaizeGDB Pedigree Viewer. A priority set by the Breeders Tools and Resources survey was to display pedigree relationships. In response to this stakeholder feedback, ARS in Ames, Iowa, created the MaizeGDB Pedigree Viewer. The Viewer is based on a pedigree network of 5,487 maize lines that are currently available in the MaizeGDB Stock Pages.
The viewer shows a network view and allows the user to apply a number of filters, select or upload their own breeding relationships, center a pedigree network on a maize line, and display a shortest path between two maize lines on the pedigree network. This tool will help maize breeders visualize pedigree relationships between maize germplasm and help identify which lines to cross. This will lead to improved germplasm for researchers and farmers.
05 New Zea mays genome assemblies workshop. Several additional reference-quality genomes will soon be available to the research community. These reference assemblies will be very valuable to the maize research community, but to best utilize these resources the groups need to work together to provide a common set of nomenclature, data sets, and tools. MaizeGDB ARS staff in Ames, Iowa, spearheaded an effort to organize a workshop to inform and foster collaboration within the maize research community regarding new reference-quality maize genome assemblies. This workshop took place in Jacksonville, Florida prior to the 58th Annual Maize Genetics Conference. Representatives from seven different labs working with new maize genome assemblies and annotations presented at this workshop. Greater communication and transparency between these groups will lead to greater collaboration and improve the utility of future maize genome assemblies for our stakeholders.
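To illustrate the shortest-path display described for the MaizeGDB Pedigree Viewer under accomplishment 04, the following is a minimal sketch and not the Viewer's actual implementation. It assumes the pedigree network can be treated as an undirected graph of parent-offspring links and uses a breadth-first search; the function name, the edge-list format, and the line names are hypothetical.

```python
from collections import deque


def shortest_pedigree_path(edges, start, goal):
    """Return one shortest path between two lines, or None if no path exists.

    `edges` is a list of (parent, offspring) pairs. Links are treated as
    undirected, which is an assumption made for this illustration.
    """
    # Build an adjacency map from the parent-offspring pairs.
    neighbors = {}
    for parent, child in edges:
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)

    if start not in neighbors or goal not in neighbors:
        return None

    # Breadth-first search: the first time the goal is dequeued,
    # the recorded predecessors trace a path with the fewest links.
    previous = {start: None}
    queue = deque([start])
    while queue:
        line = queue.popleft()
        if line == goal:
            path = []
            while line is not None:
                path.append(line)
                line = previous[line]
            return path[::-1]
        for nxt in sorted(neighbors[line]):
            if nxt not in previous:
                previous[nxt] = line
                queue.append(nxt)
    return None


# Hypothetical toy pedigree; these are not real MaizeGDB stock records.
toy_edges = [
    ("LineA", "LineC"),
    ("LineB", "LineC"),
    ("LineC", "LineD"),
    ("LineB", "LineE"),
]
path = shortest_pedigree_path(toy_edges, "LineA", "LineE")
print(" -> ".join(path))
```

In this toy network, LineA and LineE are connected only through their shared relative LineC and the parent LineB, so the search returns that four-line chain. Breadth-first search is used here because every pedigree link counts equally; a weighted network would call for a different algorithm.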
Impacts (N/A)
Publications
- Jones, D.C., Zheng, W., Huang, S., Du, C., Zhao, X., Yennamalli, R., Sen, T.Z., Nettleton, D., Wurtele, E.S., Li, L. 2016. A clade-specific Arabidopsis gene connects primary metabolism and senescence. Frontiers in Plant Science. 7:983. doi: 10.3389/fpls.2016.00983.
- Andorf, C.M., Cannon, E., Portwood II, J.L., Gardiner, J.M., Harper, E.C., Schaeffer, M.L., Braun, B.L., Campbell, D.A., Vinnakota, A.G., Sribalusa, V.V., Huerta, M., Cho, K., Wimalanathan, K., Richter, J.D., Mauch, E.D., Rao, B.S., Birkett, S.M., Sen, T.Z., Lawrence, C.J. 2016. MaizeGDB update: New tools, data, and interface for the maize model organism database. Nucleic Acids Research. 44 (D1):D1195-201. doi:10.1093/nar/gkv1007.
- Harper, E.C., Gardiner, J., Andorf, C.M., Lawrence, C.J. 2016. MaizeGDB: The Maize Genetics and Genomics Database. In: Edwards, D. Plant Bioinformatics, Methods and Protocols. 2nd Edition. New York, NY: Humana Press. p.187-202.
Progress 10/01/14 to 09/30/15
Outputs Progress Report
Objectives (from AD-416):
Objective 1: Support stewardship of maize genome sequences and forthcoming diverse maize sequences.
Objective 2: Create tools to enhance access to expanded datasets that reveal gene function and datasets for genetic and breeding analyses.
Objective 3: Deploy tools to increase user-specified flexible queries.
Objective 4: Provide community support services, training and documentation, meeting coordination, and support for community elections and surveys.
Objective 5: Facilitate the use of genomic and genetic data, information, and tools for germplasm improvement, thus empowering ARS scientists and partners to use a new generation of computational tools and resources.
Approach (from AD-416):
For applied researchers to benefit from basic investigations, generated data must be made freely and easily accessible. MaizeGDB, the Maize Genetics and Genomics Database (http://www.maizegdb.org), is the research community's central repository for genetics and genomics information. The overall aim of our work is to create and maintain unified public resources that facilitate access to the outcomes of maize research. We will support reference genome stewardship within the context of extensive genomic diversity by adopting, developing, and deploying tools that enable community members to annotate and document updates to the genome and by developing and deploying genome visualization tools that enable user-friendly interaction with reference genome and diversity data. We will deploy datasets and tools that reveal gene function and support genetic and breeding analyses by adopting and populating network analysis software to support all types of data that can be represented by such means including gene networks, interaction data, and pedigree information. In addition, we will enable researchers to access data in a customized and flexible manner by deploying tools that enable direct interaction with the MaizeGDB database.
Continued efforts to engage in education, outreach, and organizational needs of the maize research community will involve the creation and deployment of video and one-on-one tutorials, updating maize Cooperators on developments of interest to the community, and supporting the information technology needs of the Maize Genetics Executive Committee and Annual Maize Genetics Conference Steering Committee.
ARS scientists working on the Maize Genetics and Genomics Database (MaizeGDB) in Ames, Iowa, Columbia, Missouri, and Albany, CA improved tools that make the maize genome sequence useful for investigative research and crop improvement. Genome sequences served include the new version of the B73 reference genome as well as skim sequences of thousands of other lines and individuals that represent the broad diversity of the Zea genus (i.e., maize and its near relatives). Members of the MaizeGDB team improved phenotypic descriptions by applying standardized terms that are used to describe traits in all flowering plants. Use of these terms enables cross-species queries of traits and phenotypes shared among various plants, which enables multiple plant data repositories to be searched simultaneously. This allows researchers to leverage research outcomes from, e.g., rice investigations to hypothesize function in maize. This effort is one example of how scientists at MaizeGDB support Open Data for agriculture. Documentation of genome assembly and gene model errors was maintained in an issue collection system deployed at MaizeGDB. Issues are being reported on relevant record pages, including a record of the work being done to resolve the error. To enable a better understanding of how the genes in a plant define the potential phenotypes that will be observed in farmers' fields, we maintain two pathway view tool suites: CornCyc and MaizeCyc. These resources help researchers determine which genes and pathways to select for targeted crop improvement.
The new, redesigned graphical user interface of MaizeGDB is available to provide improved access for maize researchers. Work carried out by the MaizeGDB team has resulted in improved communication among maize researchers worldwide, increased ability to document the results of experiments, and increased availability of information relevant to high-impact research.
Accomplishments
01 MaizeGDB's pathway view tool suite now includes new, improved metabolic pathway resources for maize. ARS researchers in Ames, IA, Columbia, MO, and Albany, CA, in collaboration with researchers at Iowa State University and the Carnegie Institution for Science at Stanford University, have deployed an improved metabolic pathway resource: CornCyc. CornCyc is useful for deciphering how enzymes interact to create plant phenotypes. The resource relies on predicted and experimentally confirmed pathway information. Metabolism annotation in CornCyc for experimentally verified maize genes was performed for enzymes in photosynthesis, carbohydrate metabolism, and response to diseases and pests. This resource provides a view of how gene products function to help researchers determine which genes and/or biochemical pathways might be good targets for crop improvement.
02 Survey to identify information for corn breeders. MaizeGDB conducted a "Diversity and Pedigree Visualization Tools for Breeders" survey on behalf of the Maize Genetics Executive Committee to guide the development of tools and resources for a Corn Breeders Data Center at MaizeGDB. The anticipated tools will allow visualization of diversity, pedigrees, and network representations. The survey was developed by ARS researchers in Ames, IA, Columbia, MO, Albany, CA, and Ithaca, NY in collaboration with academic researchers at Iowa State University and the University of Wisconsin-Madison.
Survey results indicated that maize researchers, especially corn breeders, need to visually integrate breeding and genomic information in order to identify markers that facilitate the development of better corn varieties. 03 MaizeGDB interface redesign. The MaizeGDB team completed a multi-year effort to redesign the web interface and underlying infrastructure. The redesign expands the overall functionality of the MaizeGDB website while simultaneously creating a clean, modern interface with enhanced user interaction and better response times. The redesign involved creating a new look and feel as well as reorganizing existing data and incorporating new data, data types, and analysis tools (including, e.g., gene models, diversity data, and functional genomics datasets) into the MaizeGDB resource. A key component of the redesign has been community involvement, with community members offering their perspectives via email, website feedback, and personal interactions. Over the 6-month period since the release of the interface, usage of MaizeGDB has increased by over 60%, which indicates that the interface has become more useful for maize researchers. 04 New maize genome assembly. The MaizeGDB team has integrated the latest version of the maize reference assembly (B73 RefGen_v3) into the maize community database. The assembly of over 2.3 billion base pairs has over 100,000 structural annotations. Many of these annotations have been integrated with known maize genes. MaizeGDB also created a new instance of the MaizeGDB Genome Browser for the latest assembly that currently has over 35 tracks including data related to assembly features, diversity, proteomics, expression, structural annotations, genetic maps, insertions, and repetitive elements. 
MaizeGDB has collaborated with the Genome Reference Consortium to facilitate reporting of assembly issues to help improve the next version of the assembly, as greater accuracy in genome assembly leads to better identification of genes responsible for agriculturally important traits. 05 MaizeGDB staff spearheaded an effort to organize a workshop, held in January 2015 at the Plant and Animal Genome Conference, to further collaboration on Big Data among ARS scientists in data-intensive research projects from Ames, IA, Albany, CA, and Cold Spring Harbor, NY, among others. Because ARS agricultural database personnel face similar Big Data-related challenges, subgroups were formed to increase collaboration among research projects and to improve how Big Data are managed, analyzed, and visualized. Greater communication will lead to greater collaboration among databases to overcome common challenges in data management and representation, and will improve the ease of use and usefulness of the agricultural databases for stakeholders.
Impacts (N/A)
Publications
- Law, M., Childs, K.L., Campbell, M.S., Stein, J.C., Olson, A.J., Holt, C., Panchy, N., Lei, J., Jiao, D., Andorf, C.M., Lawrence, C.J., Ware, D., Shiu, S., Sun, Y., Jiang, N., Yandell, M. 2015. Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes. Plant Physiology. 167(1):25-39.
- Andorf, C.M., Kopylov, M., Dobbs, D., Koch, K., Stroupe, M., Lawrence, C.J., Bass, H. 2014. G-quadruplex (G4) motifs in the maize (Zea mays L.) genome are enriched at specific locations in thousands of genes coupled to energy status, hypoxia, low sugar, and nutrient deprivation. Journal of Genetics and Genomics. 41(12):627-647. DOI: 10.1016/j.jgg.2014.10.004.
Progress 10/01/13 to 09/30/14
Outputs Progress Report Objectives (from AD-416): Objective 1: Support stewardship of maize genome sequences and forthcoming diverse maize sequences. Objective 2: Create tools to enhance access to expanded datasets that reveal gene function and datasets for genetic and breeding analyses. Objective 3: Deploy tools to increase user-specified flexible queries. Objective 4: Provide community support services, training and documentation, meeting coordination, and support for community elections and surveys. Objective 5: Facilitate the use of genomic and genetic data, information, and tools for germplasm improvement, thus empowering ARS scientists and partners to use a new generation of computational tools and resources. Approach (from AD-416): For applied researchers to benefit from basic investigations, generated data must be made freely and easily accessible. MaizeGDB, the Maize Genetics and Genomics Database (http://www.maizegdb.org), is the research community's central repository for genetics and genomics information. The overall aim of our work is to create and maintain unified public resources that facilitate access to the outcomes of maize research. We will support reference genome stewardship within the context of extensive genomic diversity by adopting, developing, and deploying tools that enable community members to annotate and document updates to the genome and by developing and deploying genome visualization tools that enable user-friendly interaction with reference genome and diversity data. We will deploy datasets and tools that reveal gene function and support genetic and breeding analyses by adopting and populating network analysis software to support all types of data that can be represented by such means including gene networks, interaction data, and pedigree information. In addition, we will enable researchers to access data in a customized and flexible manner by deploying tools that enable direct interaction with the MaizeGDB database. 
Continued efforts to engage in education, outreach, and organizational needs of the maize research community will involve the creation and deployment of video and one-on-one tutorials, updating maize Cooperators on developments of interest to the community, and supporting the information technology needs of the Maize Genetics Executive Committee and Annual Maize Genetics Conference Steering Committee. ARS scientists working on the Maize Genetics and Genomics Database (MaizeGDB) in Ames, Iowa, Columbia, Missouri, and Albany, CA improved tools that make the maize genome sequence useful for investigative research and crop improvement. Genome sequences served include the new version of the B73 reference genome as well as the Mexican Palomero Toluqueno and skim sequence of thousands of various other lines and individuals that represent the broad diversity of the Zea genus (i.e., maize and its near relatives). Members of the MaizeGDB team improved phenotypic descriptions by applying standardized terms that are used to describe traits in all flowering plants. Use of these terms enables cross-species queries of traits and phenotypes shared among various plants, which enables multiple plant data repositories to be searched simultaneously. This allows researchers to leverage research outcomes from, e.g., rice investigations to hypothesize function in maize. This effort is one example of how scientists at MaizeGDB support Open Data for agriculture. A high-availability infrastructure that obviates the need for disaster recovery was deployed to ensure availability of MaizeGDB for researchers' use. To enable a better understanding of how the genes in a plant define the potential phenotypes that will be observed in farmers' fields, we maintain two pathway view tool suites: CornCyc and MaizeCyc. These resources help researchers determine which genes and pathways to select for targeted crop improvement. 
The new, redesigned graphical user interface of MaizeGDB is available to provide improved access for maize researchers. Work carried out by the MaizeGDB team has resulted in improved communication among maize researchers worldwide, increased ability to document the results of experiments, and increased availability of information relative to high impact research. Significant Activities that Support Special Target Populations: One ARS researcher in Ames, IA served as a judge of high school projects for the Iowa State Science & Technology Fair, where 62% of the exhibitors are female. Accomplishments 01 MaizeGDB full website redesign completed and released to the public. ARS researchers in Ames, IA, Columbia, MO, and Albany, CA in collaboration with researchers at Iowa State University have improved the look and feel of the new MaizeGDB website. The site is more modern: its technologies have been updated to current software and systems, the database is separated from the interface (which will enable a facile near-future transition from an expensive database system to an inexpensive open-source alternative), and the design enables access to data stored off-site from within a context-appropriate site. The new release will significantly enhance efforts made by maize geneticists to improve the crop. 02 A new version of the maize genome assembly (Version 3) is now available at the MaizeGDB Genome Browser with over 25 tracks that have recently been added. The new and improved assembly was created with the help of ARS researchers located in Cold Spring Harbor, NY. This higher-quality assembly will enable maize researchers to identify genomic contexts and regulation of agronomically important genes. 03 MaizeGDB's pathway view tool suite now includes new, improved metabolic pathway resources for maize. 
ARS researchers in Ames, IA, Columbia, MO, Albany, CA, and Cold Spring Harbor, NY in collaboration with researchers at Iowa State University, Oregon State University, and Stanford University have deployed two pathway view tools: CornCyc and MaizeCyc. Both are useful for deciphering the metabolic pathways and gene product interactions encoded by the B73 reference genome sequence. Both rely on predicted and experimentally confirmed pathway information. These resources provide a view of how gene products function to help researchers determine which genes and/or biochemical pathways might be good targets for crop improvement. 04 Training scientists abroad. MaizeGDB staff trained 105 Chinese, Taiwanese, and Korean scientists in the use of tools and resources at MaizeGDB. China's largest crop is now maize, and the use of MaizeGDB by Chinese researchers is second only to use by American researchers. These new contacts have facilitated the transfer of data between China and the United States. Access to Big Data now generated in China will help researchers in the United States to design breeding populations and to understand the function of agronomically important genes.
Impacts (N/A)
Publications