Source: CLEMSON UNIVERSITY submitted to NRP
COMPLETION OF THE PEACH GENOME DATABASE: A REFERENCE GENOME FOR ROSACEAE
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
0203079
Grant No.
2005-35300-15452
Cumulative Award Amt.
(N/A)
Proposal No.
2005-00756
Multistate No.
(N/A)
Project Start Date
May 1, 2005
Project End Date
Apr 30, 2009
Grant Year
2005
Program Code
[52.1]- (N/A)
Recipient Organization
CLEMSON UNIVERSITY
(N/A)
CLEMSON,SC 29634
Performing Department
BIOCHEMISTRY MICROBIOLOGY AND MOLECULAR GENETICS
Non Technical Summary
The importance of high-quality fruit, and the intrinsic difficulties of breeding in a perennial species, require linkage mapping technology for the sustained improvement of peach and other fruiting trees. Knowledge of the genetic basis of traits, and their tagging with molecular markers, reduces the effort and time needed to introgress important traits and produce a new improved variety. Field evaluation can then be limited to trees containing the genes of interest, significantly reducing the costs associated with maintaining undesirable trees to maturity. The ability to preselect seedlings for fruit quality traits using DNA-based markers, while introgressing traits such as biotic and abiotic stress resistance, speeds the development of commercially acceptable cultivars. For the study of Rosaceae genomics, we established a consortium of laboratories in the US and worldwide. These laboratories are carrying out research on the genome organization of agriculturally important fruiting plant species. The consortium has two primary objectives: 1) Integration of the genetic information of the individual species into a common genomics database that will enable fruit breeders to optimize breeding schemes based on marker-assisted selection in any Rosaceae fruit crop. 2) Establishment of the peach genome as the core physically mapped genome of this database, to provide structural and functional genomics tools for the research and fruit breeding programs of the many economically important Rosaceae species. Together these objectives comprise the overall goal of this project.
Animal Health Component
(N/A)
Research Effort Categories
Basic
40%
Applied
(N/A)
Developmental
60%
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
201 | 1114 | 1080 | 100%
Knowledge Area
201 - Plant Genome, Genetics, and Genetic Mechanisms;

Subject Of Investigation
1114 - Peach;

Field Of Science
1080 - Genetics;
Goals / Objectives
The goal of our research is to develop peach as a model genetic resource for the identification, characterization and cloning of important genes of Rosaceae species. For this purpose, the value of having as complete a genomics database as possible (integrated genetic/physical map and mapped EST database) for a Rosaceae model species is clear. A physical map serves as an ideal tool to compare maps across species and to identify cloned genomic regions containing important gene loci, thus facilitating gene marking and gene discovery in related species. In addition, physical maps provide marker resources that can bridge the gap between the mapping of specific characters and the implementation of marker-assisted breeding schemes. For this reason, in the previous grant award in 2001, we proposed to initiate construction of a physical map database for peach and to anchor the physical map on the general Prunus molecular genetic map. At this point, we have completed an initial framework physical map anchored on the genetic map and are in the process of merging BAC contigs to finalize an initial anchored physical map of the genome. However, to provide the most complete model genomics database for peach, it is necessary to complete BAC tiling paths across each peach chromosome; to place on the map as many unique genes as possible to serve as a candidate gene database; to integrate all this genomic information into a publicly accessible database; and to develop a highly saturated mapped set of genetic markers to enable breeders to easily manipulate important genes for crop improvement of Rosaceae species. With these goals in mind, we will carry out the following specific aims:
1. Complete construction of the physical/genetic map of peach using HICF (high information content fingerprinting) of the existing physical map framework contig clones and those of larger insert libraries of the haploid cultivar Lovell.
2. Incorporate the physical map data into the Genome Database for Rosaceae (GDR) to provide a reference genome for identification and cloning of genes important to Rosaceous crop development and sustainability.
3. Complete development of a high density genetic marker set anchored on the physical map to provide the tools necessary for marker-assisted selection, comparative mapping and molecular map development in the less well characterized Rosaceae species.
Project Methods
Specific aim 1: At this juncture we have over 60% of the peach genome in statistically high confidence contigs (cutoff of e-10 to e-12), built from BAC DNAs prepared and fingerprinted on acrylamide using published procedures. To finalize the peach physical map, we will employ the HICF technique. Using this technology we can rapidly fingerprint the haploid Lovell BAC library and, by refingerprinting with HICF the contigs developed in our initial mapping efforts, we expect to merge the earlier results and complete the peach physical map. The HICF-fingerprinted initial high confidence contigs (developed by fingerprinting and hybridization of in excess of 4,000 probes), which cover a substantial portion of the genome, serve as a framework to substantiate the physical map build based on the addition of 20,000-30,000 fingerprints from a new Lovell library that we will construct with a different enzyme and a projected average insert size of 125 kb. The fingerprints will be assembled using standard bioinformatics programs [Gene Profiler and FPC (V4.7)], and final contigs will be merged using a combination of EST hybridization data, BAC end sequencing, and hybridization of mapped genetic markers.
Specific aim 2: The Prunus Genome Database utilizes the Oracle database management system. To date, we have designed a relational schema that models the various biological entities relevant to the peach genome database. This includes data to support: (1) genetic maps and associated genetic markers of all crops that belong to Prunus; (2) peach physical maps and associated genetic marker and EST clone probes; (3) breeding crosses and their segregating populations; (4) EST sequence and analysis results (SSRs, consensus sequences, homology, unigenes); (5) publications, collaborator information and peach genomic links. All data generated in this project are housed in the Prunus genome database, publicly available in the Genome Database for Rosaceae (www.genome.clemson.edu/GDR).
Specific aim 3: We have available the 460 sequences for all the SSR loci currently on the general Prunus map. Physical mapping of known SSR loci: We will use overgo probes from the unique flanking sequences of SSRs to identify the corresponding peach BACs carrying these SSR loci, employing an overgo hybridization strategy to locate the genetically mapped set of Prunus SSRs on the physical map. Overgo probes to SSR flanking sequences are designed essentially with the protocols outlined for overgo construction from BAC end sequences, using the online program Overgo Maker (J.D. McPherson, http://genome.wustl.edu/gsc/overgo/overgo.html) after checking for repetitive sequences by BLAST search and RepeatMasker (A.F.A. Smit and P. Green, http://ftp.genome.washington.edu/cgi-bin/ReapeatMasker). Labeling and hybridization of overgo probes on BAC filters are performed following the protocols described by J.D. McPherson at http://www.tree.caltech.edu/protocols/overgo.html. We will employ essentially the same overgo probe pooling strategies that we used previously to map approximately 4,000 ESTs.
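As a rough illustration of the overgo design step described under Specific aim 3, the sketch below derives a pair of overlapping oligos from a 40-base unique flanking sequence. It assumes the commonly used overgo layout (two 24-mers whose 3' ends overlap by 8 complementary bases, filled in to a 40-bp probe); this report does not state the oligo lengths used, and the function names and the example sequence are hypothetical. It is not a reimplementation of Overgo Maker and does not perform the BLAST or RepeatMasker screening.

```python
# Sketch only: assumes 24-mer oligos with an 8-base 3' overlap (40-base target).
# The sequence below is made up for illustration, not a real Prunus SSR flank.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_overgo(target: str, oligo_len: int = 24, overlap: int = 8):
    """Split a target into a forward oligo and a reverse-strand oligo.

    The forward oligo is the first `oligo_len` bases of the target; the
    reverse oligo is the reverse complement of the last `oligo_len` bases.
    Their 3' ends are complementary over `overlap` bases.
    """
    target = target.upper()
    expected = 2 * oligo_len - overlap
    if len(target) != expected:
        raise ValueError(f"target must be {expected} bases, got {len(target)}")
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    forward = target[:oligo_len]
    reverse = reverse_complement(target[oligo_len - overlap:])
    return forward, reverse


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases, a common sanity check for probe design."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    flank = "ATGGCTAGCTTGACCGATCGTACGGATCCAGTTCGAAGCT"  # hypothetical 40-mer
    fwd, rev = design_overgo(flank)
    print("forward oligo:", fwd)
    print("reverse oligo:", rev)
    print("GC fraction of target:", round(gc_fraction(flank), 2))
```

In practice the 40-base window would be chosen from the SSR flanking sequence only after repeat screening, so that the probe hits a unique region of the BAC library.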

Progress 05/01/05 to 04/30/09

Outputs
OUTPUTS: Integrated physical/genetic maps are of critical importance for high-throughput EST mapping, QTL fine-mapping and effective positional cloning of genes. To construct the physical map for peach we employed essentially the strategies used to develop the physical maps for Arabidopsis thaliana and Drosophila melanogaster. The approach combines hybridization of genetically mapped markers with BAC DNA fingerprinting and, in our case, hybridization of EST sequences as well. Manual sequencing gel-based fingerprinting has proven to be a reliable and cost-effective technique for BAC fingerprinting and under certain circumstances performs better than other traditional fingerprinting methods. An initial acrylamide gel-based physical framework for peach was established and released recently. This framework was based on random fingerprinting of 3x peach genome equivalents, covered at least 50% of the genome and included hybridization data for 673 of 3,384 ESTs of the peach unigene set (PP_LE). On this framework, genetically anchored BAC contigs provided landmarks for a Prunus-Arabidopsis microsynteny study and for further development of the Prunus transcript map. Since the first release, the peach physical framework has undergone further enhancements. We have integrated global hybridization data that include additional ESTs, AFLPs (amplified fragment length polymorphisms), SSRs (microsatellites), gene-specific genomic probes and overgo probes derived from BAC-end sequences. The total number of markers incorporated into the physical framework increased to 2,636. We also selectively fingerprinted all hybridization-positive BACs omitted during random fingerprinting. As a result, we incorporated an additional 1x peach genome equivalent composed of marker-positive BACs. Finally, we took advantage of the HICF (high-information-content fingerprinting) technique along with improved FPC v8.5.2 software to increase the average number of bands per BAC clone and improve accuracy in contig assembly. The initial HICF physical map for peach consists of 2,138 contigs. Of these, 252 contigs are anchored to the eight linkage groups of the Prunus reference map. The total physical length of the map contigs has been estimated at 303 Mb, which exceeds the estimated size of the peach genome (~270 Mb). Due to the abundance of hybridization data, the HICF physical map for peach is biased toward the expressed genome regions and thus substantially covers the euchromatic portion of the peach genome. Currently, we are completing the hybridization of genetically mapped SSR loci to merge orphan contigs and place them at specific locations on the Prunus general genetic map. All the map data are housed in the GDR (www.rosaceae.org).
PARTICIPANTS: This project involved collaborators at several partner organizations: Dr. Bryon Sosinski at North Carolina State University, Dr. Pere Arus at the IRTA in Cabrils, Spain, and Dr. Dorrie Main and Dr. Sook Jung at Washington State University. Through this project, at least five undergraduate students, four graduate students at Clemson, five graduate students from other countries, five postdoctoral associates and six visiting scientists benefited in their professional development. In addition, this project hosts a number of continuing collaborations with scientists around the world, in almost all Western European and Asian countries.
TARGET AUDIENCES: This work is targeted at providing the most current molecular genetic resources for peach.
Those individuals that benefit from this information include basic Rosaceae researchers, fruit tree breeders, scientists interested in individual candidate genes controlling characters important to fruit tree growth and sustainability, molecular plant geneticists and eventually through the application of this information in breeding programs, the consumer. PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.

Impacts
This work directly impacts the fruit industry in the US and worldwide by providing the tools to identify and manipulate genes critical to the fruit tree industry. It integrates the mapping information on important fruiting tree characters of interest to breeding programs directly with the genes that control these characters. This streamlines the breeding process for the introduction or enhancement of genes that control characters such as disease resistance, fruit quality, tree architecture, and growth characteristics. The maps and tools developed in this program are made publicly available through the GDR at www.rosaceae.org. Additionally, the whole genome sequence of the peach is substantially complete and under assembly. This physical/genetic map will provide critical information for final assembly of the genome sequence.

Publications

  • No publications reported this period


Progress 05/01/07 to 04/30/08

Outputs
OUTPUTS: Integrated physical/genetic maps are of critical importance for high-throughput EST mapping, QTL fine-mapping and effective positional cloning of genes. To construct the physical map for peach we employed essentially the strategies used to develop the physical maps for Arabidopsis thaliana and Drosophila melanogaster. The approach combines hybridization of genetically mapped markers with BAC DNA fingerprinting and, in our case, hybridization of EST sequences as well. Manual sequencing gel-based fingerprinting has proven to be a reliable and cost-effective technique for BAC fingerprinting and under certain circumstances performs better than other traditional fingerprinting methods. An initial acrylamide gel-based physical framework for peach was established and released recently. This framework was based on random fingerprinting of 3x peach genome equivalents, covered at least 50% of the genome and included hybridization data for 673 of 3,384 ESTs of the peach unigene set (PP_LE). On this framework, genetically anchored BAC contigs provided landmarks for a Prunus-Arabidopsis microsynteny study and for further development of the Prunus transcript map. Since the first release, the peach physical framework has undergone further enhancements. We have integrated global hybridization data that include additional ESTs, AFLPs (amplified fragment length polymorphisms), SSRs (microsatellites), gene-specific genomic probes and overgo probes derived from BAC-end sequences. The total number of markers incorporated into the physical framework increased to 2,636. We also selectively fingerprinted all hybridization-positive BACs omitted during random fingerprinting. As a result, we incorporated an additional 1x peach genome equivalent composed of marker-positive BACs. Finally, we took advantage of the HICF (high-information-content fingerprinting) technique along with improved FPC v8.5.2 software to increase the average number of bands per BAC clone and improve accuracy in contig assembly. The initial HICF physical map for peach consists of 2,138 contigs. Of these, 252 contigs are anchored to the eight linkage groups of the Prunus reference map. The total physical length of the map contigs has been estimated at 303 Mb, which exceeds the estimated size of the peach genome (~270 Mb). Due to the abundance of hybridization data, the HICF physical map for peach is biased toward the expressed genome regions and thus substantially covers the euchromatic portion of the peach genome. Currently, we are completing the hybridization of genetically mapped SSR loci to merge orphan contigs and place them at specific locations on the Prunus general genetic map. All the map data are housed in the GDR (www.rosaceae.org).
PARTICIPANTS: This project involved collaborators at several partner organizations: Dr. Bryon Sosinski at North Carolina State University, Dr. Pere Arus at the IRTA in Cabrils, Spain, and Dr. Dorrie Main and Dr. Sook Jung at Washington State University. Through this project, at least five undergraduate students, four graduate students at Clemson, five graduate students from other countries, five postdoctoral associates and six visiting scientists benefited in their professional development. In addition, this project hosts a number of continuing collaborations with scientists around the world, in almost all Western European and Asian countries.
TARGET AUDIENCES: This work is targeted at providing the most current molecular genetic resources for peach.
Those individuals that benefit from this information include basic Rosaceae researchers, fruit tree breeders, scientists interested in individual candidate genes controlling characters important to fruit tree growth and sustainability, molecular plant geneticists and eventually through the application of this information in breeding programs, the consumer. PROJECT MODIFICATIONS: No major changes to the proposed project.

Impacts
This work directly impacts the fruit industry in the US and worldwide by providing the tools to identify and manipulate genes critical to the fruit tree industry. It integrates the mapping information on important fruiting tree characters of interest to breeding programs directly with the genes that control these characters. This streamlines the breeding process for the introduction or enhancement of genes that control characters such as disease resistance, fruit quality, tree architecture, and growth characteristics. The maps and tools developed in this program are made publicly available through the GDR at www.rosaceae.org. Additionally, the whole genome sequence of the peach is substantially complete and under assembly. This physical/genetic map will provide critical information for final assembly of the genome sequence.

Publications

  • Jung, S., M. Staton, T. Lee, A. Blenda, R. Svancara, A. Abbott and D. Main. 2007. GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data. Nucleic Acids Res. pp. 1-7
  • Bielenberg DG, Wang Y, Li Z, Zhebentyayeva T, Fan S, Reighard GL, Scorza R, Abbott AG (2008) Sequencing and annotation of the evergrowing locus in peach [Prunus persica (L.) Batsch] reveals a cluster of six MADS-box transcription factors as candidate genes for regulation of terminal bud formation. Tree Genetics and Genomes, 4: 495-507
  • Sicard O, Marandel G, Soriano JM, Lalli DA, Lambert P, Salava J, Badenes L, Abbott AG, Decroocq V (2008) Flanking the major Plum pox virus resistance locus in apricot with co-dominant markers (SSRs) derived from candidate resistance genes. Tree Genetics & Genomes 4:359-365
  • Zhebentyayeva TN, G Swire-Clark, LL Georgi, L Garay, S Jung, S Forrest, AV Blenda, B Blackmon, J Mook, R Horn, W Howad, P Arus, D Main, JP Tomkins, B Sosinski, WV Baird, GL Reighard, AG Abbott. 2008. A framework physical map for peach, a model Rosaceae species. Tree Genetics and Genomes. DOI: 10.1007/s11295-008-0147-z
  • Zhebentyayeva T. N., Reighard G., Lalli D., Gorina V.M., Krska B., and Abbott A.G. (2008) Origin of resistance to Plum Pox Virus in apricot: what new AFLP and targeted SSR data analyses tell. Tree Genetics & Genomes, 4: 403-417
  • Vladimir Shulaev, Schuyler S. Korban, Bryon Sosinski, Albert G. Abbott, Herb S. Aldwinckle, Kevin M. Folta, Amy Iezzoni, Dorrie Main, Pere Arus, Abhaya M. Dandekar, Kim Lewers, Susan K. Brown, Thomas M. Davis, Susan E. Gardiner, Daniel Potter, and Richard E. Veilleux. 2008. Multiple Models for Rosaceae Genomics. Plant Physiol. July; 147(3):985-1003.


Progress 05/01/06 to 04/30/07

Outputs
Integrated physical/genetic maps are of critical importance for high-throughput EST mapping, QTL fine-mapping and effective positional cloning of genes. To construct the physical map for peach we employed essentially the strategies used to develop the physical maps for Arabidopsis thaliana and Drosophila melanogaster. The approach combines hybridization of genetically mapped markers with BAC DNA fingerprinting and, in our case, hybridization of EST sequences as well. Manual sequencing gel-based fingerprinting has proven to be a reliable and cost-effective technique for BAC fingerprinting and under certain circumstances performs better than other traditional fingerprinting methods. An initial acrylamide gel-based physical framework for peach was established and released recently. This framework was based on random fingerprinting of 3x peach genome equivalents, covered at least 50% of the genome and included hybridization data for 673 of 3,384 ESTs of the peach unigene set (PP_LE). On this framework, genetically anchored BAC contigs provided landmarks for a Prunus-Arabidopsis microsynteny study and for further development of the Prunus transcript map. Since the first release, the peach physical framework has undergone further enhancements. We have integrated global hybridization data that include additional ESTs, AFLPs (amplified fragment length polymorphisms), SSRs (microsatellites), gene-specific genomic probes and overgo probes derived from BAC-end sequences. The total number of markers incorporated into the physical framework increased to 2,636. We also selectively fingerprinted all hybridization-positive BACs omitted during random fingerprinting. As a result, we incorporated an additional 1x peach genome equivalent composed of marker-positive BACs. Finally, we took advantage of the HICF (high-information-content fingerprinting) technique along with improved FPC v8.5.2 software to increase the average number of bands per BAC clone and improve accuracy in contig assembly. The initial HICF physical map for peach consists of 2,138 contigs. Of these, 252 contigs are anchored to the eight linkage groups of the Prunus reference map. The total physical length of the map contigs has been estimated at 303 Mb, which exceeds the estimated size of the peach genome (~270 Mb). Due to the abundance of hybridization data, the HICF physical map for peach is biased toward the expressed genome regions and thus substantially covers the euchromatic portion of the peach genome. Currently, we are completing the hybridization of genetically mapped SSR loci to merge orphan contigs and place them at specific locations on the Prunus general genetic map. All the map data are housed in the GDR (www.rosaceae.org).
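The map figures reported above can be checked with simple arithmetic. The sketch below uses only numbers stated in this report (2,138 contigs, 252 anchored, 303 Mb of summed contig length, a genome of roughly 270 Mb) to compute the anchored fraction and the amount by which the map length exceeds the genome estimate. The function and key names are illustrative, and reading the excess as a sign of undetected overlaps between contigs is an assumption rather than a conclusion stated in the report.

```python
def map_summary(total_contigs: int, anchored_contigs: int,
                map_length_mb: float, genome_size_mb: float) -> dict:
    """Summarize how a contig map compares with the estimated genome size."""
    if total_contigs <= 0 or genome_size_mb <= 0:
        raise ValueError("contig count and genome size must be positive")
    return {
        # Share of contigs tied to a linkage group of the genetic map.
        "anchored_fraction": anchored_contigs / total_contigs,
        # Summed contig length relative to the genome estimate (>1 means excess).
        "length_ratio": map_length_mb / genome_size_mb,
        # Absolute excess in Mb; possibly contig overlaps not yet merged.
        "excess_mb": map_length_mb - genome_size_mb,
    }


if __name__ == "__main__":
    summary = map_summary(2138, 252, 303.0, 270.0)
    print(f"anchored contigs: {summary['anchored_fraction']:.1%}")
    print(f"map length / genome size: {summary['length_ratio']:.2f}")
    print(f"excess length: {summary['excess_mb']:.0f} Mb")
```

Only about one contig in eight is anchored so far, which is consistent with the report's statement that SSR hybridization is still being used to place orphan contigs.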

Impacts
This work directly impacts the fruit industry in the US and worldwide by providing the tools to identify and manipulate genes critical to the fruit tree industry. It integrates the mapping information on important fruiting tree characters of interest to breeding programs directly with the genes that control these characters. This streamlines the breeding process for the introduction or enhancement of genes that control characters such as disease resistance, fruit quality, tree architecture, and growth characteristics. The maps and tools developed in this program are made publicly available through the GDR at www.rosaceae.org. Additionally, the whole genome sequence of the peach is projected to be finished in 2008. This physical/genetic map will provide critical information for final assembly of the genome sequence.

Publications

  • Zhebentyayeva, T. N., R. Horn, J. Mook, A. Lecouls, L. Georgi, A.G. Abbott, G. L. Reighard, G. Swire-Clark, and W.V. Baird. 2006. A physical framework for the peach genome. Acta Hort. 713: 83-88.
  • TN Zhebentyayeva, G Swire-Clark, LL Georgi, L Garay, S Jung, S Forrest, AV Blenda, B Blackmon, J Mook, R Horn, W Howad, D Main, JP Tomkins, WV Baird, GL Reighard, P Arus, AG Abbott. 2007. A framework physical map for peach, a model Rosaceae species. Submitted, Tree Genetics and Genomes
  • T. N. Zhebentyayeva, D. Jiwan, J.H. Jun, D.A. Lalli, S. Forrest, J. Duncan, D. Main, G.L. Reighard, A. Callahan, R. Scorza and A.G. Abbott, 2007, in press Acta Hort.


Progress 05/01/05 to 04/30/06

Outputs
We are constructing a peach physical map using the current BAC library resources. We are employing essentially the strategies used to develop the Drosophila physical map and others. The approach combines hybridization of mapped markers with BAC fingerprinting and, in our case, hybridization of EST sequences as well. With the current Prunus molecular marker map resources, we have completed hybridizing 210 low-copy mapped RFLP markers, 4,000 peach fruit ESTs, resistance gene analogs, other cDNAs and specific AFLP markers. We have fingerprinted approximately 35,000 BACs (20,000 from the Nemared library and 15,000 from the haploid Lovell library), of which approximately 15,000 have been used to construct an initial physical map. Although the average insert size of the physical mapping clones from both libraries is perceived to be somewhat small (60 kb for one library, 85 kb for the other), physical mapping with these libraries has in fact proceeded satisfactorily. We have used FPC (V4.7) to construct an initial physical map of the peach genome. Initially, we construct the map at a cutoff of e-10 to e-12 and a tolerance of 5 to obtain all high confidence contigs. These are merged by testing end clones at cutoff values ranging from e-8 to e-11. As we have a significant amount of hybridization data, we can in many cases establish merges based on common hybridization of BACs in different contigs. In other cases, where we have no data other than the fingerprint data, we make note of the merge points for further testing. At this juncture, the framework map, developed predominantly from the smaller insert 'Nemared' library, is composed of 1,367 contigs containing approximately 11,193 clones. Estimates from FPC evaluation of the data suggest that the coverage of the peach genome is 210-230 Mb. We are currently increasing our fingerprinting to include another 10,000 fingerprints from the haploid Lovell library, as well as additional fingerprints from another library of the doubled haploid of Lovell. We are now using HICF fingerprinting to develop the final map. We have 831 contigs that contain at least one hybridization marker, which helps to verify the contig assembly. As we have included marker hybridization data from the general Prunus genetic map, the developing physical map is anchored to the genetic map. From initial analysis of the integrated genetic/physical map, we already have evidence for duplication of some regions of the peach genome. The developing physical map is housed in the Prunus genome website within the Genome Database for Rosaceae (GDR), www.genome.clemson.edu/gdr/.
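To make the cutoff values mentioned above more concrete, here is a minimal sketch of a Sulston-style coincidence score, the kind of probability that FPC compares against its cutoff when deciding whether two fingerprinted clones overlap. It assumes the binomial form usually given for this score. The tolerance of 5 comes from this report; the gel length of 3300 positions, the band counts in the example, and all function names are illustrative assumptions, and the code is not FPC's own implementation.

```python
from math import comb


def sulston_score(n_low: int, n_high: int, shared: int,
                  tolerance: int = 5, gel_length: int = 3300) -> float:
    """Probability that two clones share at least `shared` bands by chance.

    n_low / n_high: band counts of the clone with fewer / more bands.
    tolerance: allowed difference in band position for a match.
    gel_length: number of resolvable positions (assumed value).
    """
    if not 0 <= shared <= n_low <= n_high:
        raise ValueError("require 0 <= shared <= n_low <= n_high")
    # Chance that one band of the smaller clone matches any band of the larger.
    p = 1.0 - (1.0 - 2.0 * tolerance / gel_length) ** n_high
    # Binomial tail: at least `shared` of the n_low bands match by chance.
    return sum(comb(n_low, m) * p ** m * (1.0 - p) ** (n_low - m)
               for m in range(shared, n_low + 1))


def passes_cutoff(score: float, cutoff: float = 1e-10) -> bool:
    """Clones are treated as overlapping when the score is at or below the cutoff."""
    return score <= cutoff


if __name__ == "__main__":
    # Hypothetical clones with 25 and 30 bands.
    for shared in (5, 15, 25):
        score = sulston_score(25, 30, shared)
        print(shared, f"{score:.3e}", passes_cutoff(score))
```

A stricter cutoff such as e-12 demands more shared bands before two clones are joined, which is why the initial build uses it to obtain high confidence contigs and the end-clone merges are then tested at more relaxed values.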

Impacts
This work will directly impact the fruit industry in the US and worldwide by providing the tools to identify and manipulate genes critical to the fruit tree industry. It integrates the mapping information on important fruiting tree characters of interest to breeding programs directly with the genes that control these characters. This streamlines the breeding process for the introduction or enhancement of genes that control characters such as disease resistance, fruit quality, tree architecture, and growth characteristics. The maps and tools developed in this program are made publicly available through the GDR at www.rosaceae.org.

Publications

  • Sook Jung, Albert Abbott, Christopher Jesudurai, Jeff Tomkins and Dorrie Main. Frequency, Type, Localization and Annotation of SSRs in Rosaceae ESTs, Functional and Integrated Genomics, 2005 Jul;5(3):136-43.
  • D. A. Lalli, V. Decroocq, A. Blenda, L. Garay, V. Schurdi-Levraud, O. Le Gall, V. Damsteegt, G. L. Reighard, and A. G. Abbott. 2005. Identification and mapping of resistance gene analogs (RGAs) in Prunus: a resistance map for Prunus. Theor. Appl. Genet. 111:1504-1513
  • S. Jung, D. Main, M. Staton, I. Cho, T. Zhebentyayeva, P. Arus and A. Abbott, 2006. Synteny conservation between the Prunus genome and both the present and ancestral Arabidopsis genomes. BMC Genomics, in press