Source: AGRICULTURAL RESEARCH SERVICE submitted to
DEFINING THE GENETIC DIVERSITY AND STRUCTURE OF THE SOYBEAN GENOME AND APPLICATIONS TO GENE DISCOVERY IN SOYBEAN AND WHEAT GERMPLASM
Sponsoring Institution
Agricultural Research Service/USDA
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0413479
Grant No.
(N/A)
Project No.
1245-21000-263-00D
Proposal No.
(N/A)
Multistate No.
(N/A)
Program Code
(N/A)
Project Start Date
May 14, 2008
Project End Date
May 13, 2013
Grant Year
(N/A)
Project Director
CREGAN P B
Recipient Organization
AGRICULTURAL RESEARCH SERVICE
RM 331, BLDG 003, BARC-W
BELTSVILLE,MD 20705-2351
Performing Department
(N/A)
Non Technical Summary
(N/A)
Animal Health Component
(N/A)
Research Effort Categories
Basic
60%
Applied
10%
Developmental
30%
Classification

Knowledge Area (KA)   Subject of Investigation (SOI)   Field of Science (FOS)   Percent
201                   1540                             1040                     18%
203                   1820                             1040                     64%
204                   1549                             1040                     18%
Goals / Objectives
The three objectives of the research are firstly, to define linkage disequilibrium and recombination rates across the soybean genome to facilitate efficient discovery of quantitative trait loci (QTL) through Association Analysis and efficient introgression of exotic germplasm, secondly, to define genome regions in cultivated soybean that are associated with domestication for the discovery of genetic variation lost through the domestication bottleneck that can be used to improve soybean and thirdly, to discover QTL and genes controlling biotic and abiotic stress resistance and quality traits in soybean and wheat, and develop DNA markers that define haplotype variation across these and previously identified regions.
Project Methods
Single nucleotide polymorphism (SNP) DNA markers will be discovered using high throughput genome sequence analysis in combination with the newly developed whole genome soybean sequence from the Department of Energy, Joint Genome Institute. A set of 50,000 SNPs, selected from across the genome, will be identified and genetically mapped in cultivated soybean as well as in a newly created cultivated x wild soybean population. The same SNPs will be used to characterize 16,795 soybean landraces as well as a set of 96 elite soybean cultivars and 1,116 wild soybean genotypes. This will allow an assessment of linkage disequilibrium and population structure across the genomes of the landraces, elite cultivars and wild soybeans. Association Analysis will be assessed as a new approach to detect genes/QTL underlying the important trait of seed protein concentration. The high resolution genetic maps in both cultivated x cultivated and cultivated x wild soybean populations combined with QTL analysis of traits related to soybean domestication will facilitate the identification of regions in cultivated soybean which, in comparison to wild soybean, have little or no genetic variation as a result of "selective sweeps" that occurred during soybean domestication. A universal set of 1536 soybean SNPs with high rates of polymorphism and even distribution across the genome will be developed and used to discover QTL underlying a number of disease resistance and quality traits in soybean. In addition, DNA marker development in hexaploid wheat will be continued and these markers and other SSR markers previously developed in our laboratory will be used in QTL analysis for a number of important traits in hexaploid wheat.
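
As a rough illustration of the linkage disequilibrium assessment mentioned above, the sketch below computes the widely used r^2 statistic between two SNPs. It is not the project's software. It assumes fully inbred (homozygous) accessions, so each accession contributes one haplotype, and it codes the two alleles of each SNP as 0 and 1. The function name `ld_r2` and the genotype vectors are hypothetical.

```python
from typing import Sequence


def ld_r2(snp_a: Sequence[int], snp_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic SNPs coded 0/1, one value per inbred accession."""
    if len(snp_a) != len(snp_b) or not snp_a:
        raise ValueError("SNP vectors must be non-empty and of equal length")
    n = len(snp_a)
    p_a = sum(snp_a) / n  # frequency of allele 1 at the first SNP
    p_b = sum(snp_b) / n  # frequency of allele 1 at the second SNP
    # Frequency of the haplotype carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Made-up genotypes for eight accessions (assumption, for illustration only).
snp1 = [0, 0, 1, 1, 0, 1, 1, 0]
snp2 = [0, 0, 1, 1, 0, 1, 0, 0]
print(round(ld_r2(snp1, snp1), 3))
print(round(ld_r2(snp1, snp2), 3))
```

Values of r^2 near 1 indicate that the two SNPs are almost always inherited together, and how quickly r^2 falls with physical distance gives a sense of the marker density needed for Association Analysis.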

Progress 05/14/08 to 05/13/13

Outputs
Progress was made in completing the analysis of the USDA Soybean Germplasm Collection with 50,000 single nucleotide polymorphism (SNP) DNA markers. The USDA Soybean Germplasm Collection has approximately 18,480 cultivated and 1,165 wild soybean accessions that represent a wide diversity of genetic types. Over the past four years the DNA of each accession was isolated and analyzed with 52,041 SNP DNA markers. The analysis of the resulting data indicated that 42,509 SNP DNA markers produced high quality data. The entire dataset consisting of 19,652 cultivated and wild soybean accessions analyzed with the 42,509 SNP DNA markers was submitted to SoyBase, the USDA, ARS, Soybean Genome Database. In the near future, SoyBase will make all of the genetic marker data available to soybean researchers around the world at http://SoyBase.org. These data will provide information for a range of analyses of soybean genetic and genome variability and for the discovery of genes impacting important traits including resistance to biotic and abiotic stresses, seed composition, growth habit and seed yield.
Accomplishments over the life of the project included the discovery of SNP DNA markers in soybean, common bean and wheat and the development of "genechips" for high throughput DNA marker analysis of soybean and common bean. In soybean, these markers were used to create a genetic map with more than 5,800 DNA markers that was used to anchor the soybean whole genome DNA sequence to the 20 soybean chromosomes and resulted in the publication of the soybean whole genome sequence in 2010. A carefully selected set of 1536 SNP DNA markers in soybean was used to create the Universal Soy Linkage Panel 1.0, which was widely used to analyze soybean populations from breeders and geneticists around the U.S. These analyses resulted in the discovery of quantitative trait loci (QTL), i.e., genes, controlling numerous traits including resistance to biotic and abiotic stresses, seed composition, as well as traits related to plant growth and development and seed yield. One notable application of SNP marker technology in soybean was the use of Genome Wide Association Analysis (GWAS) for the detection of QTL/genes controlling seed protein and oil content. The GWAS detected 17 regions across the 20 soybean chromosomes containing QTL/genes controlling the level of seed protein. In the case of common bean, 3 genechips with more than 5,000 SNP markers each were used to analyze a set of more than 500 common bean accessions. A portion of these SNPs was also used to analyze a common bean genetic mapping population of 277 progeny to create a genetic map with 7,019 SNP DNA markers. In collaboration with the Dept. of Energy, Joint Genome Institute and North Dakota State Univ., the resulting genetic map has been used to anchor the common bean DNA sequence to the 11 common bean chromosomes and will soon result in the publication of the common bean whole genome sequence.

Accomplishments

01 Development of a "Core Collection" of cultivated soybean. Plant germplasm collections often contain many thousands of accessions that represent the total diversity of a plant species. Such is the case of the USDA Soybean Germplasm Collection, which contains approximately 18,480 cultivated soybean accessions. Such large numbers of accessions would be difficult to evaluate in most research programs and thus the concept of Core Collections was developed. A Core Collection or Core Set is a selected subset of 10 to 20% of the entire collection that represents a large proportion of the diversity of the entire collection. Based upon the analysis of 18,489 cultivated soybean accessions from the USDA Soybean Germplasm Collection with 42,509 single nucleotide polymorphism (SNP) DNA markers, ARS researchers at Beltsville, MD eliminated 4,305 accessions that were 99.9% genetically similar to another accession. Based upon a genetic analysis of the SNP DNA marker data of the 14,184 remaining accessions, a Core Collection of 1,418 accessions that represented 79% of the genetic diversity of the entire collection was identified. The cultivated soybean Core Collection will allow soybean breeders and geneticists to analyze and search for useful genetic variation in a relatively small and manageable set of accessions that is still anticipated to contain most of the genetic variability of the entire USDA Soybean Germplasm Collection.

02 Design of a new "beadchip" for the genetic analysis and genetic improvement of soybean. DNA markers are defined positions that are interspersed within and among the genes along the chromosomes of higher organisms. DNA markers can be used to create genetic maps in which the order and the distance between the positions of the DNA markers along each chromosome are defined. In plant genetic research such maps are used in a variety of ways to define the positions of genes on the chromosomes and to identify breeding lines that carry the form of a gene or genes that offer resistance to disease, abiotic stress or improved product quality. ARS researchers at Beltsville, MD selected an optimal set of 6,000 single nucleotide polymorphism (SNP) DNA markers from those they had previously developed and whose position on the soybean genetic map they had previously defined. These markers are evenly distributed across the 20 pairs of soybean chromosomes. The BARCSoySNP6K Illumina beadchip with 6,000 SNP DNA markers provides an extensive set of well positioned DNA markers that can be used by soybean geneticists and breeders to discover genes of interest and to rapidly incorporate such genes into new soybean varieties with enhanced stress resistance and nutritional quality.
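
The redundancy-removal step described under Accomplishment 01 above (dropping accessions that are 99.9% genetically similar to another accession) could be sketched roughly as follows. This is a minimal illustration, not the procedure the researchers used: it assumes similarity is the fraction of marker positions with identical allele calls, it uses a simple greedy pass in input order, and the accession names, genotype strings, and function names are all hypothetical.

```python
from typing import Dict, List


def similarity(g1: str, g2: str) -> float:
    """Fraction of marker positions at which two accessions carry the same allele."""
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype strings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(g1, g2)) / len(g1)


def drop_near_duplicates(genotypes: Dict[str, str], threshold: float = 0.999) -> List[str]:
    """Greedily keep an accession only if it is below the threshold against every kept one."""
    kept: List[str] = []
    for name, calls in genotypes.items():
        if all(similarity(calls, genotypes[k]) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data with only 10 markers per accession, so the threshold is relaxed
# from 0.999 to 0.95 purely for illustration.
toy = {
    "ACC_1": "AAGGTTCCAA",
    "ACC_2": "AAGGTTCCAA",  # identical to ACC_1, so it is dropped
    "ACC_3": "AAGGTTCCAG",  # 90% similar to ACC_1, so it is kept
    "ACC_4": "TTCCAAGGTT",
}
print(drop_near_duplicates(toy, threshold=0.95))
```

The counts reported in the text are consistent with each other: 18,489 accessions minus 4,305 near-duplicates leaves 14,184, and the 1,418-accession Core Collection is about 10% of that remainder.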

Impacts
(N/A)

Publications

  • Song, Q., Hyten, D., Jia, G., Quigley, C.V., Fickus, E.W., Nelson, R.L., Cregan, P.B. 2013. Development and evaluation of SoySNP50K, a high-density genotyping array for soybean. PLoS ONE. 8(1):e54985.
  • Bales-Arcelo, C., Zhang, A., Liu, M., Mensah, C., Gu, C., Song, Q., Hyten, D., Cregan, P.B., Wang, D. 2013. Mapping soybean aphid resistance genes in PI 567598B. Theoretical and Applied Genetics. 126:2081-2091.


Progress 10/01/11 to 09/30/12

Outputs
Progress was made in assessing the application of Association Analysis for the determination of the position of genes that control seed protein content in 302 accessions from the USDA Soybean Germplasm Collection. Seed protein content in the accessions, determined in field trials, ranged from 35.5 to 50.5%. The 302 accessions were genetically analyzed with more than 32,000 SNP DNA markers. The Association Analysis using the seed protein and the DNA marker data detected 18 positions on the 20 pairs of soybean chromosomes that contained a gene or genes impacting seed protein concentration. The results of the Association Analysis appear to be reliable because 11 of the 18 positions have been previously reported as containing genes that impact seed protein concentration.

Progress was made in the analysis of the USDA Soybean Germplasm Collection with the 50,000 single nucleotide polymorphism (SNP) DNA markers that are spread across the 20 pairs of soybean chromosomes. The USDA Soybean Germplasm Collection has over 18,680 cultivated and 1,115 wild soybean accessions that represent a wide diversity of genetic types.
Over the past three years the DNA of each accession was isolated and analyzed with more than 51,000 SNP DNA markers. The initial analysis of the resulting data indicated that 42,509 SNP DNA markers produced high quality data. It was also determined that the data from more than 2,870 accessions were not completely reliable and these accessions have now been reanalyzed. Based upon the accessions with high quality data, the average genetic distance between any pair of cultivated soybean accessions was 0.29 as determined by the DNA marker analysis. This indicates that, on average, a different form or "allele" of the DNA marker is present at 29% of the DNA marker positions in a comparison of any two soybean germplasm accessions. The complete DNA marker dataset consisting of more than 19,700 accessions with data for 42,509 DNA markers is being further analyzed in preparation for submission to SoyBase, the USDA, ARS, Soybean Genome Database.

Accomplishments

01 Discovery of regions of the soybean genome likely to contain genes important in the domestication of soybean. The cultivated soybean that is grown on more than 70 million acres in the U.S. was domesticated about 5,000 years ago in China from the viney, small, black-seeded wild soybean. In the process of the domestication of soybean and other crops, a large proportion of the genetic variation present in the wild progenitor species is lost due to what is referred to as a "genetic bottleneck". This loss of genetic variability is desirable in that it may eliminate many of the undesirable traits present in the wild species, but it also may eliminate genetic variation that could be useful in modern crop improvement. Thus, it is important to define the regions of the soybean chromosomes and the specific genes in those regions that were responsible for the genetic improvement associated with domestication. Using 42,000 DNA markers analyzed on 96 wild soybeans and 96 cultivated soybeans, ARS researchers at Beltsville, MD defined 18 regions across the 20 pairs of soybean chromosomes where there was significantly reduced genetic variation in the cultivated versus the wild soybean. These regions are putatively associated with soybean domestication. Little or no genetic variation is available in cultivated soybean in these regions. Thus, they provide targets to mine for genes that are currently not available in cultivated soybean germplasm that can be used in the genetic improvement of cultivated soybean.

02 Creation of a high density genetic map of the common bean. DNA markers are defined positions that are interspersed within and among the genes along the chromosomes of higher organisms. DNA markers can be used to create genetic maps in which the order and the distance between the positions of the DNA markers along each chromosome are defined. In plant genetic research such maps are used in a variety of ways to define the positions of genes on the chromosomes and to identify breeding lines that carry the form of a gene or genes that offer resistance to disease, abiotic stress or improved product quality. ARS researchers at Beltsville, MD developed and mapped more than 7,000 new single nucleotide polymorphism (SNP) DNA markers and created a genetic map of the 11 pairs of common bean chromosomes. This map provides an extensive set of well positioned DNA markers that can be used by common bean geneticists and breeders to discover genes of interest and to rapidly incorporate such genes into new common bean varieties with enhanced stress resistance and nutritional quality. In addition, the DNA sequence associated with each of the SNP DNA markers is being used to "assemble" the DNA sequence of the whole common bean genome, which is being completed by researchers at the Department of Energy, Joint Genome Institute.
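
The average genetic distance of 0.29 reported above is described as the proportion of DNA marker positions at which two accessions carry different alleles. A minimal sketch of that calculation is shown below. It assumes homozygous single-character allele calls and codes missing calls as "N" (both assumptions), and the accession names, genotype strings, and function names are hypothetical.

```python
from itertools import combinations
from typing import Dict


def pairwise_distance(g1: str, g2: str, missing: str = "N") -> float:
    """Proportion of jointly called marker positions where the two alleles differ."""
    if len(g1) != len(g2):
        raise ValueError("genotype strings must be of equal length")
    compared = differing = 0
    for a, b in zip(g1, g2):
        if a == missing or b == missing:
            continue  # skip positions without a call in both accessions
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no marker positions were called in both accessions")
    return differing / compared


def mean_pairwise_distance(genotypes: Dict[str, str]) -> float:
    """Average distance over all unordered pairs of accessions."""
    pairs = list(combinations(genotypes.values(), 2))
    if not pairs:
        raise ValueError("need at least two accessions")
    return sum(pairwise_distance(a, b) for a, b in pairs) / len(pairs)


# Toy genotypes at 10 marker positions (made up for illustration).
toy = {
    "ACC_1": "AAGGTTCCAA",
    "ACC_2": "AAGGTTCCGG",
    "ACC_3": "AAGGTTNCGA",
}
print(round(mean_pairwise_distance(toy), 3))
```

For a collection of roughly 19,000 accessions, the number of pairs is on the order of 180 million, so a real analysis would likely use vectorized or sampled comparisons rather than this pure-Python loop.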

Impacts
(N/A)

Publications

  • Jiang, G., Wang, X., Green, M., Scott, R.A., Hyten, D., Cregan, P.B. 2012. QTL analysis of saturated fatty acids in a population of recombinant inbred lines of soybean. Molecular Breeding. 30:1163-1179.
  • De Souza, T., De Barros, E.G., Bellato, C.M., Fickus, E.W., Cregan, P.B., Pastor Corrales, M.A. 2011. Single nucleotide polymorphism (SNP) discovery in common bean. Molecular Breeding. 30:419-428.
  • Mamidi, S., Chikara, S., Goos, R.J., Hyten, D.L., Moghaddam, S.M., Cregan, P.B., Mcclean, P.E. 2012. Genome-wide association analysis identifies candidate genes associated with iron deficiency chlorosis in soybean. The Plant Genome. 11:154-164.
  • Du, J., Tian, Z., Sui, Y., Zhao, M., Song, Q., Cannon, S.B., Cregan, P.B., Ma, J. 2012. Pericentromeric effects shape the patterns of divergence, retention, and expression of duplicated genes in the Paleopolyploid Soybean. The Plant Cell. 24(1):21-32.


Progress 10/01/10 to 09/30/11

Outputs
Progress was made in the genetic analysis of single nucleotide polymorphism (SNP) DNA markers in two very large soybean populations, each consisting of the progeny derived from hybridizing two soybeans. These SNP genetic markers are landmarks along the 20 soybean chromosomes and are interspersed with more than 45,000 genes in the soybean genome. The markers were mapped using the Genechip that we had previously developed. Both of the genetic populations had the soybean Williams 82 as one parent. Williams 82 was the soybean whose DNA sequence (nearly 1 billion DNA units) was completed in 2010. The first population consisted of 1,098 soybean lines developed from crossing Williams 82 with a wild soybean, PI 479752, and the second consisted of 961 lines from the cross of Williams 82 with the Essex soybean, which was commonly grown and used as a parent in U.S. soybean breeding. A total of 23,814 of the SNP DNA markers were genetically analyzed in the Williams 82 x PI 479752 population and 14,933 in the Williams 82 x Essex population. The result is a very dense map of the soybean genome.
These genetic maps will assist in improving the assembly of the Williams 82 soybean genome sequence. That is, the pieces or "scaffolds" of DNA sequence will be more accurately positioned onto the 20 soybean chromosomes.

Progress was made in the analysis of the USDA Soybean Germplasm Collection with 50,000 single nucleotide polymorphism (SNP) DNA markers. The USDA Soybean Germplasm Collection has over 19,000 cultivated and wild soybean accessions that have been collected over the past 90 years and which represent a wide diversity of genetic types. Over the past two years, DNA was isolated from each of these soybean accessions and was analyzed with the 50,000 SNP DNA markers. The 50,000 SNP DNA markers were selected to be positioned across each of the 20 soybean chromosomes. At this time the process of analyzing the resulting SNP genetic marker data is proceeding and requires the determination of the form of the SNP or "allele" that is present at each of the 50,000 SNP positions in each of the more than 19,000 soybean accessions. The initial determination of the alleles present in all accessions is nearly complete and it has been determined that DNA of about 2,000 accessions will have to be re-isolated and reanalyzed with the 50,000 SNP beadchip.

Accomplishments

01 Single Nucleotide Polymorphism (SNP) DNA marker discovery in common bean and the development of high throughput SNP DNA marker analysis. DNA markers are defined positions that are interspersed among the genes along the chromosomes of higher organisms. Because of their proximity to genes, DNA markers are used in plant and animal breeding to select individuals that carry the forms of particular genes that condition improved disease or stress resistance, improved quality traits or greater productivity. In the case of common bean, only a relatively small set of genetic markers is available to plant breeders and geneticists. To expedite genetic marker discovery, ARS researchers at Beltsville, MD, used a combination of two "next generation" DNA sequencers, the Roche 454-FLX and the Illumina Genome Analyzer II, to sequence specially constructed DNA libraries of two diverse common bean accessions. Careful analysis and comparison of the DNA sequence data derived from the two common bean accessions resulted in the discovery of 3,487 SNP DNA markers. A high throughput Illumina Inc. GoldenGate SNP assay was developed which contained 1,050 of the 3,487 predicted SNP markers. A total of 827 of the 1,050 SNP markers produced high quality data as demonstrated via the analysis of 48 cultivated and wild common bean accessions. Genetic mapping defined the chromosome positions of 649 of the SNP markers. This initial SNP marker discovery in common bean will impact breeding and cultivar development by providing a set of markers that can be used in high throughput genetic analyses of common bean breeding populations to discover genes controlling traits of interest and for the selection of breeding lines that carry the forms of genes required for enhanced disease or stress resistance, improved quality traits or greater productivity.
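
The marker attrition described in the accomplishment above (3,487 predicted SNPs, 1,050 placed on the assay, 827 with high quality data, 649 mapped) can be summarized with simple arithmetic. The short sketch below uses only the counts given in the text; the stage labels are paraphrased and the function name `retention` is hypothetical.

```python
# Counts taken from the accomplishment text above; labels are paraphrased.
stages = [
    ("SNPs discovered by sequencing", 3487),
    ("SNPs placed on the GoldenGate assay", 1050),
    ("SNPs giving high quality data", 827),
    ("SNPs genetically mapped", 649),
]


def retention(stage_counts):
    """Return (label, count, fraction of the previous stage) for each stage after the first."""
    rows = []
    for (_, prev), (label, count) in zip(stage_counts, stage_counts[1:]):
        rows.append((label, count, count / prev))
    return rows


for label, count, frac in retention(stages):
    print(f"{label}: {count} ({frac:.1%} of previous stage)")
```

Roughly four in five assayed SNPs yielded high quality data, and a similar share of those could be placed on the genetic map.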

Impacts
(N/A)

Publications

  • Hyten, D.L., Song, Q., Fickus, E.W., Choi, I., Quigley, C.V., Hwang, E., Pastor Corrales, M.A., Cregan, P.B. 2010. High-throughput SNP discovery and assay development in Common Bean. Biomed Central (BMC) Genomics. 11:475.
  • Kendrick, M.D., Harris, D.K., Ha, B., Hyten, D.L., Cregan, P.B., Frederick, R.D., Boerma, H.R., Pedley, K.F. 2011. Identification of a second Asian soybean rust resistance gene in Hyuuga soybean. Phytopathology. 101:535-543.
  • Haun, W.J., Hyten, D.L., Xu, W.W., Gerhardt, D.J., Albert, T.J., Richmond, T., Jeddeloh, J.A., Springer, N.M., Vance, C.P., Stupar, R. 2011. The composition and origins of intravarietal genomic heterogeneity in soybean. Plant Physiology. 155:645-655.
  • Kim, M., Hyten, D.L., Niblack, T.L., Diers, B.W. 2011. Stacking resistance alleles from wild and domestic soybean sources improves soybean cyst nematode resistance. Crop Science. 51:934-943.


Progress 10/01/09 to 09/30/10

Outputs
Progress was made in development of the Soybean Illumina Infinium beadchip, which is capable of analyzing 50,000 single nucleotide polymorphism (SNP) DNA markers in the DNA of the soybean genome. The 50,000 SNPs were selected from among approximately 200,000 SNP markers developed using "next generation" Illumina Genome Analyzer DNA sequencing of a set of six cultivated and one wild soybean. The sequences were aligned to each other and to the newly completed whole genome sequence of the soybean for SNP discovery and to determine the position of each putative SNP on the 20 soybean chromosomes. The 50,000 SNPs were then selected from the total of 200,000 SNPs based upon their position on the 20 soybean chromosomes to assure an even distribution of SNP markers across each of the 20 soybean chromosomes. The Soybean Illumina Infinium beadchip was used to analyze more than 10,000 accessions from the USDA Soybean Germplasm Collection. The analyzed accessions included accessions with high seed protein concentration.

Progress was made in discovery of genes controlling the level of protein in the soybean seed.
In collaborative research, 48 soybean germplasm accessions with high seed protein (48% or more) were mated to a high- yielding cultivar with normal seed protein (42% or less) of the same maturity to generate the 48 populations. A total of 240 F2 plants per mating were grown in the field and a leaf sample was collected from each for DNA isolation. The F3 seeds from the 240 plants from each mating were analyzed for seed protein concentration. Using a techniques called �selective genotyping� DNA from the 22 F2 plants that produced the seed with the highest seed protein concentration and the 22 plants that produced the seed with the lowest seed protein concentration from each mating were analyzed along with the two parental lines with the 1536 SNP DNA markers in the Universal Soybean Linkage Panel 1.0. To date significant genetic effects controlling seed protein that have not been previously reported have been detected on five different soybean chromosomes. Progress was made in the discovery of SNP DNA markers in wheat using next generation Roche 454 and Illumina Genome Analyzer DNA sequencing. Genomic DNA libraries were constructed by restriction enzyme digestion and size selection of DNA fragments in the 400-600 basepair size range of two hard red winter wheats, two hard red spring wheats two soft red winter wheats and one durum wheat. These libraries were further digested and sub-libraries with DNA fragments in the 180-240 basepair size range were isolated. The 400-600 basepair fragments of the library of the cultivar Chinese Spring were sequenced with the Roche 454 DNA sequencer and each of the seven sub-libraries were sequenced with the Illumina Genome Analyzer. The resulting DNA sequences are being aligned and analyzed to identify SNP DNA markers in preparation for the design of a 6, 000 SNP Illumina Infinium beadchip. Accomplishments 01 Discovery, evaluation, and release of 33,065 Simple Sequence Repeat (SSR DNA markers in soybean. 
DNA markers serve as genetic landmarks and are interspersed among and within the genes throughout the genome of higher organisms including the soybean. If a marker is located near a gene of interest, the marker can be used to select for the desired form of the gene. For example, a soybean breeder can use a DNA marker to identify plants that carry the form of the gene that gives resistance to a disease rather than the form that leads to susceptibility. With the recent release of the whole DNA sequence of the soybean genome it became possible to identify thousands of DNA markers called Simple Sequence Repeat or SSR markers across the 20 soybean chromosomes. ARS scientists at the Soybean Genomics and Improvement Laboratory in Beltsville, MD, with collaborating ARS scientists at Ames, IA, screened the DNA sequence of the 20 soybean chromosomes and identified more than 33,000 SSR markers with a high probability of functioning well for use in DNA marker assisted soybean breeding and for the discovery of the positions of genes on the soybean chromosomes. A database called BARCSOYSSR_1.0 was created which contains the information required for the use of each of the more than 33,000 SSR DNA markers as well as the specific position of each marker on one of the 20 soybean chromosomes. This information is available on SoyBase (http://soybase.org), the USDA, ARS, Soybean Genome Database. The information in this database will be useful to soybean breeders and soybean geneticists to select useful DNA markers at any position on any one of the 20 soybean chromosomes to facilitate gene cloning or DNA marker assisted soybean breeding.
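As a rough illustration of the kind of sequence screen described above, the following Python sketch scans a DNA string for perfect tandem repeats of short motifs. It is not the procedure used to build BARCSOYSSR_1.0: the function name `find_ssrs`, the motif lengths (2 to 4 bases), and the minimum of five repeats are assumptions chosen only for illustration.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=4, min_repeats=5):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    The thresholds are illustrative assumptions, not the criteria
    used for any published SSR database.
    """
    seq = seq.upper()
    # A motif of min_unit..max_unit bases (shortest tried first),
    # followed by at least (min_repeats - 1) further copies of itself.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip single-base runs such as "AAAA...", which are not
        # informative di-, tri-, or tetranucleotide repeats.
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy sequence (hypothetical, not soybean data): one (AT)6 and one (AAT)5 repeat.
example = "GGG" + "AT" * 6 + "CCC" + "AAT" * 5 + "G"
for start, motif, count in find_ssrs(example):
    print(start, motif, count)
```

In practice, a marker would also need unique flanking sequence for primer design, which this sketch does not check.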

Impacts
(N/A)

Publications

  • Schmutz, J., Cannon, S.B., Schlueter, J., Ma, J., Hyten, D.L., Song, Q., Mitros, T., Nelson, W., May, G.D., Gill, N., Peto, M.F., Shu, S., Goodstein, D., Thelen, J.J., Cheng, J., Sakurai, T., Umezawa, T., Shinozaki, K., Du, J., Bhattacharyya, M., Sandhu, D., Grant, D.M., Joshi, T., Libault, M., Zhang, X., Nguyen, H., Valliyodan, B., Xu, D., Futrell-Griggs, M., Abernathy, B., Hellsten, U., Berry, K., Grimwood, J., Yu, Y., Wing, R.A., Cregan, P.B., Stacey, G., Specht, J., Rokhsar, D., Shoemaker, R.C., Jackson, S. 2010. Genome Sequence of the Paleopolyploid Soybean (Glycine max (L.) Merr.). Nature. 463:178-183.
  • Hyten, D.L., Choi, I., Song, Q., Specht, J.E., Carter Jr, T.E., Shoemaker, R.C., Hwang, E., Matukumalli, L.K., Cregan, P.B. 2010. A high density integrated genetic linkage map of soybean and the development of a 1,536 Universal Soy Linkage Panel for QTL mapping. Crop Science. 50:960-968.
  • Hyten, D.L., Cannon, S.B., Song, Q., Weeks, N.T., Fickus, E.W., Shoemaker, R.C., Specht, J.E., May, G.D., Cregan, P.B. 2010. High-Throughput SNP Discovery through Deep Resequencing of a Reduced Representation Library to Anchor and Orient Scaffolds in the Soybean Whole Genome Sequence. Biomed Central (BMC) Genomics. 11:38.
  • Kim, K.S., Bellendir, S., Hudson, K.A., Hill, C.B., Hartman, G.L., Hyten, D., Hudson, M.E., Diers, B.W. 2010. Fine mapping the soybean aphid resistance gene Rag1 in soybean. Theoretical and Applied Genetics. 120(5):1063-1071.


Progress 10/01/08 to 09/30/09

Outputs
Progress Report Objectives (from AD-416)
The three objectives of the research are firstly, to define linkage disequilibrium and recombination rates across the soybean genome to facilitate efficient discovery of quantitative trait loci (QTL) through Association Analysis and efficient introgression of exotic germplasm; secondly, to define genome regions in cultivated soybean that are associated with domestication for the discovery of genetic variation lost through the domestication bottleneck that can be used to improve soybean; and thirdly, to discover QTL and genes controlling biotic and abiotic stress resistance and quality traits in soybean and wheat, and develop DNA markers that define haplotype variation across these and previously identified regions.
Approach (from AD-416)
Single nucleotide polymorphism (SNP) DNA markers will be discovered using high throughput genome sequence analysis in combination with the newly developed whole genome soybean sequence from the Department of Energy, Joint Genome Institute. A set of 50,000 SNPs, selected from across the genome, will be identified and genetically mapped in cultivated soybean as well as in a newly created cultivated x wild soybean population. The same SNPs will be used to characterize 16,795 soybean landraces as well as a set of 96 elite soybean cultivars and 1,116 wild soybean genotypes. This will allow an assessment of linkage disequilibrium and population structure across the genomes of the landraces, elite cultivars and wild soybeans. Association Analysis will be assessed as a new approach to detect genes/QTL underlying the important trait of seed protein concentration.
The high resolution genetic maps in both cultivated x cultivated and cultivated x wild soybean populations combined with QTL analysis of traits related to soybean domestication will facilitate the identification of regions in cultivated soybean which, in comparison to wild soybean, have little or no genetic variation as a result of "selective sweeps" that occurred during soybean domestication. A universal set of 1536 soybean SNPs with high rates of polymorphism and even distribution across the genome will be developed and used to discover QTL underlying a number of disease resistance and quality traits in soybean. In addition, DNA marker development in hexaploid wheat will be continued and these markers and other SSR markers previously developed in our laboratory will be used in QTL analysis for a number of important traits in hexaploid wheat.
Significant Activities that Support Special Target Populations
Progress was made in discovery of SNPs for the genetic analysis of soybean germplasm with 37,000 SNP markers. Reduced representation DNA libraries were created from genomic DNA of the soybean cultivars Evans, Essex, Archer, Minsoy, Peking, and Noir 1, as well as of the wild soybean PI 468916. In addition, similar libraries were created with a mix of DNA of the genotypes Minsoy, Noir 1, Archer, Evans, and Peking. The genomic DNA libraries consisted of restriction digested fragments that were "size selected" to include only DNA fragments in the 100-150 basepair size range. Following numerous runs on the Solexa/Illumina Genome Analyzer, more than 7 billion bases of DNA sequence were obtained from these various libraries. These were aligned with the whole genome sequence of soybean to facilitate the discovery of more than 60,000 putative SNPs.
Progress was made in the anchoring of the whole soybean genome sequence produced by the Department of Energy, Joint Genome Institute to the 20 chromosomes that make up the soybean genome.
The Consensus Soybean Linkage Map 4.0 contains more than 4,800 DNA sequence-based markers including 3,792 SNP-containing sequence tagged sites and 1,009 simple sequence repeat markers. An additional 1,240 SNP markers were developed specifically to anchor segments of the genome that were not anchored by markers in Consensus Soybean Linkage Map 4.0. These markers were genetically mapped in a high resolution mapping population of 470 recombinant inbred lines developed from a cross of Williams 82 with the wild soybean PI 468916. As a result of these efforts, a total of 97% of the whole soybean genome sequence was anchored to the 20 soybean chromosomes.
Progress was made in the discovery of wheat SNP markers using next generation Solexa/Illumina DNA sequencing. Genomic DNA libraries were constructed by restriction enzyme digestion and size selection of DNA fragments in the 100-150 basepair size range of the ITMI (International Triticeae Mapping Initiative) genetic mapping parents Opata 85 and W7984. These DNA libraries were sequenced on the Solexa/Illumina Genome Analyzer. Homologous fragments were aligned and compared for the identification of defining DNA sequence differences or SNPs. To date a total of 1,742 putative SNPs have been identified.
Technology Transfer
Number of Other Technology Transfer: 2
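The comparison of aligned homologous fragments described in the wheat SNP discovery notes above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the two fragments are taken to be already aligned to equal length, gaps are written as "-", and the function name `putative_snps` and the toy sequences are hypothetical. Real SNP discovery would also weigh read depth and base quality, which are not modeled here.

```python
def putative_snps(fragment_a, fragment_b):
    """Return (position, base_a, base_b) for each single-base difference.

    Assumes the fragments are pre-aligned and of equal length;
    positions are 0-based offsets into the alignment.
    """
    if len(fragment_a) != len(fragment_b):
        raise ValueError("fragments must be aligned to equal length")
    snps = []
    for pos, (a, b) in enumerate(zip(fragment_a.upper(), fragment_b.upper())):
        # Count only true base substitutions; ignore gaps and ambiguous calls.
        if a != b and a in "ACGT" and b in "ACGT":
            snps.append((pos, a, b))
    return snps


# Toy aligned fragments (hypothetical, not wheat data).
parent_1 = "ACGTACGTAC"
parent_2 = "ACGAACGT-C"
print(putative_snps(parent_1, parent_2))
```

Treating gaps and ambiguous bases as non-SNPs keeps insertion/deletion differences out of the SNP list, which is one possible design choice among several.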

Impacts
(N/A)

Publications

  • Gaitan-Solis, E., Choi, I., Quigley, C.V., Cregan, P.B., Tohme, J. 2008. Single nucleotide polymorphisms in common bean: their discovery and genotyping using a multiplex detection system. The Plant Genome. 1:125-134.
  • Hyten, D.L., Smith, J.R., Frederick, R.D., Tucker, M.L., Song, Q., Cregan, P.B. 2009. Bulked Segregant Analysis using the GoldenGate Assay to Locate the Rpp3 Locus that Confers Resistance to Phakopsora pachyrhizi (Soybean Rust) in Soybean. Crop Science. 49:265-271.
  • Wu, X., Blake, S., Sleper, D.A., Shannon, J.G., Cregan, P.B., Nguyen, H.T. 2008. QTL and Additive and Epistatic Effects for SCN Resistance in PI 437654. Theoretical and Applied Genetics. 118:1093-1105.
  • Malkus, A., Song, Q., Cregan, P.B., Arseniuk, E., Ueng, P.P. 2009. Genetic linkage map of Phaeosphaeria nodorum, the causal agent of stagonospora nodorum blotch disease of wheat. European Journal of Plant Pathology. 124:681-690.
  • Chakraborty, N., Curley, J., Frederick, R.D., Hyten, D.L., Nelson, R.L., Hartman, G.L., Diers, B.W. 2009. Mapping and Confirmation of a New Allele at Rpp1 from Soybean PI 504538A Conferring RB Lesion Type Resistance to Soybean Rust. Crop Science. 49:783-790.