Progress 04/15/05 to 10/14/08
Outputs OUTPUTS: The two major goals of the project are: 1) develop a 300-gene transcript map of common bean using the community-wide BAT93 x Jalo EEP558 mapping population; and 2) develop an on-line tool from which sequence data can be used to design primers that will be used to amplify target genes from species severely lacking genomic sequence resources. We have already completed the first goal. All of the data have been collected, CAPS and dCAPS markers were developed, the parents of the major common bean mapping populations were scored for these polymorphic loci, and all of the data are currently loaded into the LIS database at NCGR. We are going beyond this goal and developing a low-density SNP map from the sequence data that we have collected. In addition, we have collaborated with the Jackson lab (Purdue) project to develop and integrate a physical map of common bean with the genetic map. We are also working with the soybean research community to trace the ancestry of the soybean genome using the common bean transcript map as a reference. All of these last activities were beyond the initial goals of the project. For goal two, we have implemented or developed all of the tools necessary to extract the necessary sequence data, define appropriate gene families, discover appropriate primer site targets, and report primer sequences. We are currently implementing these in our WWW interface. PARTICIPANTS: Nothing significant to report during this reporting period. TARGET AUDIENCES: Nothing significant to report during this reporting period. PROJECT MODIFICATIONS: Not relevant to this project.
Impacts The development of the database will enable researchers working on species without significant sequence data to apply the modern candidate gene genomic approach to their research. The gene-based map of common bean will provide a framework for 1) gene cloning of important agronomic target genes in common bean and 2) the application of comparative legume genome analysis to the improvement of the crop.
Publications
- Schlueter JA, Goichochea JL, Gill V, Lin J-Y, Yu U, Collura K, Vallejos, Thome J, Blair M, McClean P, Wing R, Jackson SA. 2008. BAC-end sequence and a draft physical map of the common bean (Phaseolus vulgaris L.) genome. Tropical Plant Biology 1:40-48.
- Mamidi, S., Lee, R.K., Terpstra, J., Schlueter, J.A., Dixon, P., Shoemaker, R.C., Lavin, M., and McClean, P.E. 2008. Investigating gene duplication events in legumes using EST sequence data. Plant and Animal Genome XVI Abstracts. http://www.intl-pag.org/16/abstracts/PAG16_P05f_378.html.
- McConnell, M.D., Lee, R.K., Choi, I.Y., Song, Q., Song, Q.J., Cregan, P., McClean, P.E. 2008. Macrosyntenic relations between common bean (Phaseolus vulgaris L.), Medicago, Arabidopsis, and Poplar. Plant and Animal Genome XVI Abstracts. http://www.intl-pag.org/16/abstracts/PAG16_P05f_420.html.
- Gepts P, Aragao F, de Barros E, Blair, MW, Brondani R, Broughton WJ, Galasso I, Hernandez G, Kami J, Lariguet P, McClean P, Melotto M, Miklas P, Pauls P, Pedrosa-Harand A, Porch T, Sanchez F, Sparvoli F and Yu K. 2008. Genomics of Phaseolus beans, a major source of dietary protein and micronutrients in the Tropics. In: Moore PH, and Ming R (Eds), Genomics of Tropical Crop Plants. Springer, Berlin, 113-143.
Progress 04/15/06 to 04/14/07
Outputs Database development: The latest version of the database can be viewed at: http://134.129.125.203/. The current version of the database has full CDS sequences for Viridiplantae species in GenBank (as of November 2006) except Arabidopsis, rice, and Medicago. For those three species, the database is populated with the gene models generated from the on-going sequencing projects. We have implemented a phylogenetic approach to the database. The user selects the level of the taxonomic hierarchy for which they want data. For example, a legume researcher can expand Angiosperms/Eudicots/Core Eudicots/Rosids and then select Legumes. Alternatively, the user can select any level above that to obtain a broader survey of sequences. After the user enters a query, the results show the individual records in the database that meet the query criteria. Next, the user selects one of the records, and other sequences that are similar at a user-selected E-value are returned. We are currently working on implementing several other features. First, the user will be able to view a multiple alignment of the sequences (using the MultAlin algorithm). Next, they will be able to create primers based on the Primer3 algorithm. They will also be able to download the nucleotide and amino acid sequences for the sequences in the cluster. Once all of the features are in place, we intend to completely update the database by including all of the Viridiplantae CDS sequences along with all of the full gene models from all sequenced plant genomes. When complete, the database should include over 125,000 records for orthology and paralogy searches as well as primer design.
Gene-by-gene sequencing in common bean: 300 genes were mapped using markers from the core BJ map (Freyre et al. 1998) as a guide for linkage group assignment. The completed map is 1586.7 cM in length and contains 205 markers mapped at a LOD score of 3.0 or better, 159 of which are the new gene-based markers. A total of 303 markers were mapped at LOD 2.0 or better, 214 of which are gene-based. In addition, 139 markers could be assigned to bins between markers on the LOD 2.0 map. In total, 285 gene-based markers could be assigned to a location. The map also contains the locations of other previously mapped markers. Linkage group (LG) 2 is the longest linkage group, at 207.5 cM, and LG 5 is the shortest, at 77.2 cM. The number of markers on each linkage group ranges from 27 (LG 4 and LG 10) to 56 (LG 2). The average number of markers on each linkage group is 41, and the average number of gene-based markers on each linkage group is 26. We have deposited the map and all associated data into the Legume Information System (LIS) database (see http://www.comparative-legumes.org/cgi-bin/cmap/map_set_info?species_acc=Pv&map_type_acc=-1 and select the maps associated with McClean (NDSU) 2007). We have begun studying synteny with other species and found relationships that extend over distances of tens of megabases in Medicago and Poplar and tens of centimorgans in common bean. As with other plant species, we observed synteny over a megabase distance in Arabidopsis and a few centimorgans in common bean.
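The query flow described under Database development (pick a level of the taxonomic hierarchy, then retrieve sequences similar to a chosen record at a user-selected E-value) can be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: the `Record` class, the accession names, the lineage tuples, and the precomputed `hits` table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    accession: str   # hypothetical identifier, not a real database key
    lineage: tuple   # taxonomy path from the top level down to the record's clade


def records_under(records, node):
    """Return the records whose lineage passes through the selected node."""
    return [r for r in records if node in r.lineage]


def similar_to(seed, hits, max_evalue):
    """Return accessions similar to `seed` at or below the E-value cutoff.

    `hits` maps (query, subject) pairs to a precomputed E-value (assumed
    to come from an earlier all-against-all search). Best hits come first.
    """
    matches = [(e, subject) for (query, subject), e in hits.items()
               if query == seed and e <= max_evalue]
    return [subject for e, subject in sorted(matches)]


# Hypothetical example data.
records = [
    Record("bean_1", ("Angiosperms", "Eudicots", "Core Eudicots", "Rosids", "Legumes")),
    Record("poplar_1", ("Angiosperms", "Eudicots", "Core Eudicots", "Rosids")),
    Record("maize_1", ("Angiosperms", "Monocots")),
]
hits = {("bean_1", "poplar_1"): 1e-40, ("bean_1", "maize_1"): 1e-5}

legume_records = records_under(records, "Legumes")   # narrow survey
rosid_records = records_under(records, "Rosids")     # broader survey
close_relatives = similar_to("bean_1", hits, 1e-10)  # user-selected cutoff
```

Selecting a higher node such as Rosids returns a superset of the Legumes records, which mirrors the "broader survey" option described above.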
Impacts The development of the database will enable researchers working on species without significant sequence data to apply the modern candidate gene genomic approach to their research. The gene-based map of common bean will provide a framework for 1) gene cloning of important agronomic target genes in common bean and 2) the application of comparative legume genome analysis to the improvement of the crop.
Publications
- Rossi M, Mamidi S, Bellucci E, McConnell MD, Lee RD, Papa R, McClean PE. (2007) The effect of selection on loci within close proximity of domestication loci in common bean (Phaseolus vulgaris L.) Phaseomic V Abstracts. p. 9.
- McConnell MD, Mamidi S, Rossi M, Lee RK, and McClean PE. A gene-based linkage map of common bean (Phaseolus vulgaris L.). 2007. Plant and Animal Genome XV Abstracts. (http://www.intl-pag.org/15/abstracts/PAG15_P05f_414.html)
- Mamidi S, Rossi M, McConnell MD, Lee RK, Papa R, McClean PE, and Bellucci E. 2007. Investigation of the domestication process in common bean (Phaseolus vulgaris L.) using multilocus data. Plant and Animal Genome XV Abstracts. (http://www.intl-pag.org/15/abstracts/PAG15_P05f_416.html)
- Buchfink DJ, Denton A, McClean P. 2007. Database and tools for primer design. Plant and Animal Genome XV Abstracts. (http://www.intl-pag.org/15/abstracts/PAG15_P08a_843.html)
Progress 04/15/05 to 04/14/06
Outputs Database development: The goal of this aspect of the project is to develop a database from which users interested in gene-by-gene sequencing can make a query and be offered a suite of primers with which the target gene could be amplified from their species of interest. To that end, we created a database structure to store all relevant data. We have downloaded all of the publicly available gene models for Arabidopsis, rice, Medicago, and maize from the databases that curate these data. In addition, we downloaded all full gene models available from GenBank for species other than these model species. Collectively, we have over 100,000 sequences. These sequences were analyzed in an all-against-all manner using blastp and clustered using complete linkage clustering. Alignments were performed for all clusters at specific similarity levels using the MultAlin algorithm. An algorithm was developed to search the alignment for the best regions for primer development. A Perl script was tested to pass specific sequences to Primer3 for primer development. More specifically, we implemented a BioSQL schema in a PostgreSQL database. As a clustering constraint, 50% of each gene was required to be involved in the cluster. Finally, to evaluate the clustering, we used a histogram-based evaluation measure and determined that complete linkage clustering provided better alignments than single linkage clustering.
Gene-by-gene sequencing of common bean: The second goal of the project is to perform gene-by-gene sequencing with common bean. We collected all of the known tentative consensus (TC) sequences of common bean and compared them with all of the gene models from Arabidopsis. Genes to be sequenced were selected based on similarity in a BLAST search. TC sequences were used as a query for an all-against-all blastp analysis against individual databases containing Arabidopsis thaliana genes with mutant phenotypes, genes under selection during domestication in maize, A. thaliana genes involved in biochemical pathways, and all A. thaliana genes. A gene was selected for sequencing if it had at least 100 nucleotides in the 3' UTR and an E-value less than 1e-30 with the top hit. Primers were designed with Primer3 with a target TC fragment size of 450-500 nucleotides, a primer size of 18-28 nucleotides, and a Tm of about 58 °C for all primers. The 3' primer was targeted to a location 150 nt downstream of the putative stop codon. Fragments were amplified from BAT93 (Bat) and Jalo EEP558 (Jalo) genomic DNA and directly sequenced. Of the more than 1000 genes analyzed to date, DNA sequence data for the two genotypes were obtained for 322. Of these, 222 genes were polymorphic. A total of 1003 polymorphisms were detected, and of these 85.5% were SNPs. On average, one SNP was detected every 151 nt, and one indel was observed every 897 nt. Of the polymorphisms, 44.1% were located in introns, 38.7% in exons, and 17.0% in the 3' UTR. SNPs were evenly distributed between introns and exons, whereas indels were largely found within introns. The sequence polymorphism data were used to develop CAPS markers, and to date, we have mapped 52 genes on the Bat x Jalo linkage map.
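The gene selection rule and primer design settings reported above can be written down compactly. The sketch below is a hedged illustration, not the project's Perl script: the function names are hypothetical, the tag names follow the Boulder-IO input format of Primer3 2.x (the version used in the project may have used different tags), and the placement of the 3' primer 150 nt downstream of the stop codon is not modeled.

```python
def passes_selection(utr3_length, top_hit_evalue,
                     min_utr3=100, max_evalue=1e-30):
    """Apply the reported rule: at least 100 nt of 3' UTR and a
    top-hit E-value below 1e-30 (strictly less than the cutoff)."""
    return utr3_length >= min_utr3 and top_hit_evalue < max_evalue


def boulder_record(seq_id, template):
    """Build one Primer3 Boulder-IO input record with the reported settings.

    Tag names are Primer3 2.x tags (an assumption about the version).
    """
    settings = {
        "SEQUENCE_ID": seq_id,
        "SEQUENCE_TEMPLATE": template,
        "PRIMER_PRODUCT_SIZE_RANGE": "450-500",  # target fragment size, nt
        "PRIMER_MIN_SIZE": 18,                   # shortest allowed primer
        "PRIMER_MAX_SIZE": 28,                   # longest allowed primer
        "PRIMER_OPT_TM": 58.0,                   # Tm of about 58 degrees C
    }
    lines = [f"{tag}={value}" for tag, value in settings.items()]
    # A Boulder-IO record is terminated by a line containing only "=".
    return "\n".join(lines) + "\n=\n"


# Hypothetical candidates: (name, 3' UTR length in nt, top-hit E-value).
candidates = [("TC_a", 150, 1e-45), ("TC_b", 80, 1e-45), ("TC_c", 150, 1e-12)]
selected = [name for name, utr, e in candidates if passes_selection(utr, e)]
```

The resulting record text could be passed to the `primer3_core` program on standard input; that step is omitted here so the sketch runs without Primer3 installed.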
Impacts The development of the database will enable researchers working on species without significant sequence data to apply the modern candidate gene genomic approach to their research. The gene-based map of common bean will provide a framework for 1) gene cloning of important agronomic target genes in common bean and 2) the application of comparative legume genome analysis to the improvement of the crop.
Publications
- McConnell, M., Mamidi, S., Lee, R. and McClean, P.E. 2006. DNA sequence polymorphisms among common bean genes. Annu. Rept. Bean Improv. Coop. 49: in press.
- Kar, A., Dorr, D., Denton, A. and McClean, P. 2006. Evaluating clusterings for primer design. Plant and Animal Genome XIV Abstracts. p. 322.
- Dorr, D., and Denton, A. 2006. Clustering sequences by length alignment. Plant and Animal Genome XIV Abstracts. p. 327.
- McClean, P.E., Lee, R.D., McConnell, M.D., Mamidi, S. and White, A. 2006. Sequence and marker-based diversity in common bean. Plant and Animal Genome XIV Abstracts. p. 40.
- McConnell, M., Mamidi, S., Lee, R., and McClean, P. 2006. Mapping putative functional genes in Phaseolus vulgaris. Plant and Animal Genome XIV Abstracts. p. 212.