Source: AGRICULTURAL RESEARCH SERVICE submitted to NRP
GENOME ASSEMBLY: GENETIC ANCHORING OF CONTIGS
Sponsoring Institution
Agricultural Research Service/USDA
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
0412214
Grant No.
(N/A)
Cumulative Award Amt.
(N/A)
Proposal No.
(N/A)
Multistate No.
(N/A)
Project Start Date
Jun 1, 2007
Project End Date
Apr 30, 2011
Grant Year
(N/A)
Program Code
[(N/A)]- (N/A)
Recipient Organization
AGRICULTURAL RESEARCH SERVICE
RM 331, BLDG 003, BARC-W
BELTSVILLE,MD 20705-2351
Performing Department
(N/A)
Non Technical Summary
(N/A)
Animal Health Component
10%
Research Effort Categories
Basic
60%
Applied
10%
Developmental
30%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
201                    1540                              1040                      18%
204                    1820                              1040                      64%
203                    1549                              1040                      18%
Goals / Objectives
The first objective of this research is to create a genetic mapping population via a cross of the soybean cultivar Williams 82 with the wild soybean (G. soja) parent PI 468916. The second objective is to create and sequence a plasmid library of PI 468916 DNA and compare the sequences of the plasmid clones with the draft whole genome sequence of Williams 82 that is being created by the Department of Energy, Joint Genome Institute (DOE, JGI). This comparison will allow the discovery of single nucleotide sequence differences (single nucleotide polymorphisms, SNPs) between Williams 82 and PI 468916. The third objective is to map the SNP DNA markers in the Williams 82 x PI 468916 mapping population. The resulting mapping data will anchor sequence contigs of Williams 82 to the soybean genetic map and assist in the assembly of the Williams 82 whole genome sequence.
Project Methods
A plasmid library of approximately 30,000 clones with an insert size of 2000 basepairs will be created using genomic DNA of PI 468916. Each clone will be sequenced from both ends using Sanger sequencing on the ABI3730. The resulting sequence data will be compared to the DOE, JGI draft sequence of Williams 82. This comparison will first eliminate sequences that are repetitive. Sequences that align with Williams 82 contigs that are already anchored to the genetic map will then be discarded. Sequence variants will be identified in the remaining PI 468916-Williams 82 alignments using Phred sequence quality scores. Only high-quality variants (Phred score > 25), which have a high probability of being real variants (SNPs) rather than sequencing errors, will be considered. In addition, only SNPs that occur in fragments with 100 basepairs of high-quality sequence data on either side of the SNP will be selected for the development of SNP detection assays using the Illumina GoldenGate assay. SNPs will be genetically mapped in the Williams 82 x PI 468916 mapping population using the Illumina BeadStation 500.
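The two SNP selection criteria above (Phred score > 25 at the variant, and 100 basepairs of high-quality sequence on either side) can be sketched in Python as follows. This is a minimal illustration, not the project's actual pipeline: the `CandidateSNP` record and the `passes_filter` function are hypothetical names, and the sketch assumes that "high quality" flanking sequence means every flanking base also exceeds the Phred threshold, which the text does not specify.

```python
from dataclasses import dataclass
from typing import List

MIN_PHRED = 25   # from the text: variants must have Phred score > 25
FLANK_BP = 100   # from the text: 100 bp of high-quality sequence on each side


@dataclass
class CandidateSNP:
    """Hypothetical record for one variant found in a PI 468916 read."""
    read_id: str       # identifier of the Sanger read (illustrative)
    position: int      # 0-based index of the variant base within the read
    quals: List[int]   # per-base Phred scores for the whole read


def passes_filter(snp: CandidateSNP,
                  min_phred: int = MIN_PHRED,
                  flank: int = FLANK_BP) -> bool:
    """Return True if the SNP meets both selection criteria."""
    pos, quals = snp.position, snp.quals
    # A full flank must exist on both sides of the variant.
    if pos < flank or pos + flank >= len(quals):
        return False
    # The variant base itself must exceed the threshold.
    if quals[pos] <= min_phred:
        return False
    left = quals[pos - flank:pos]
    right = quals[pos + 1:pos + 1 + flank]
    # Assumption: every flanking base must also exceed the threshold.
    return all(q > min_phred for q in left + right)


# Usage: a 201-base read with uniformly high quality and the SNP in the middle.
example = CandidateSNP("read_0001", 100, [40] * 201)
print(passes_filter(example))
```

A stricter or looser definition of the flank (for example, a mean quality over the window) would change only the last line of `passes_filter`.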

Progress 06/01/07 to 04/30/11

Outputs
Progress Report Objectives (from AD-416) and Approach (from AD-416): as stated under Goals / Objectives and Project Methods above.

This project is funded by the United Soybean Board, Project #7268, entitled "Genome Assembly: Genetic Anchoring of Contigs", and is conducted in collaboration with researchers at USDA-ARS, Ames, IA. More than 23,000 single nucleotide polymorphism (SNP) DNA markers were used to analyze more than 1,000 progeny derived from a cross of the cultivated soybean Williams 82 with the wild soybean PI 479752. These DNA markers are distributed along the 20 soybean chromosomes, and their analysis in this large number of progeny allows us to determine the order of the markers along each chromosome and the approximate distances between them. The large number of DNA markers analyzed in this very large population will allow us to provide an updated and more accurate assembly of the Williams 82 whole soybean genome sequence that was completed in 2010.
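How genotype data on many progeny yield an approximate distance between two markers can be illustrated with a short Python sketch. It is an assumption-laden example rather than the method used in this project: the function names are hypothetical, genotypes are coded "A" (Williams 82 allele) and "B" (wild parent allele), and the Kosambi mapping function is chosen only for illustration. The sketch also ignores the extra recombination that accumulates during inbreeding, which a real analysis of inbred lines would need to correct for.

```python
import math
from typing import Sequence

PARENTAL = {"A", "B"}  # assumed coding: A = Williams 82 allele, B = wild parent allele


def recombination_fraction(marker1: Sequence[str], marker2: Sequence[str]) -> float:
    """Fraction of lines carrying different parental alleles at the two markers.

    Lines with a missing or heterozygous call at either marker are skipped.
    """
    informative = [(x, y) for x, y in zip(marker1, marker2)
                   if x in PARENTAL and y in PARENTAL]
    if not informative:
        raise ValueError("no informative lines for this marker pair")
    recombinants = sum(1 for x, y in informative if x != y)
    return recombinants / len(informative)


def kosambi_cm(r: float) -> float:
    """Convert a recombination fraction (0 <= r < 0.5) to centimorgans."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


# Usage: ten lines scored at two markers; "-" marks a missing call.
m1 = list("AAAAABBBBB")
m2 = list("AAAABBBBBA")
r = recombination_fraction(m1, m2)
print(r, round(kosambi_cm(r), 2))
```

Markers are then ordered so that adjacent pairs have the smallest distances; with more progeny, each recombination fraction is estimated more precisely, which is why a large population improves the map.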

Impacts
(N/A)

Publications


    Progress 10/01/08 to 09/30/09

    Outputs
    Progress Report Objectives (from AD-416) and Approach (from AD-416): as stated under Goals / Objectives and Project Methods above.

    Significant Activities that Support Special Target Populations: Funds are being provided by the United Soybean Board, Project #7268, entitled "Genome Assembly: Genetic Anchoring of Contigs", a project conducted in collaboration with researchers at USDA-ARS, Ames, IA. A total of 106 simple sequence repeat (SSR) markers were genotyped in each of 470 F5-derived recombinant inbred lines (RILs) of the Williams 82 x PI 468916 cultivated x wild soybean mapping population. The 106 markers included SSR markers previously mapped on the Soybean Consensus Map; genotyping them allowed us to tie the linkage groups (LGs) on the new map to the corresponding LGs on the Soybean Consensus Map. To date, 71 SSRs from the current Soybean Consensus Map have been genotyped in the population. In addition, 35 new SSR markers were developed and mapped. Primers for these new SSRs were selected from sequence "scaffolds", or segments, of the new Department of Energy, Joint Genome Institute (DOE, JGI) whole genome sequence of Williams 82. The scaffolds chosen were either not anchored to the Soybean Consensus Map or unoriented in the sequence assembly.
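The idea of orienting a scaffold with mapped markers can be sketched in Python as follows. This is a hedged illustration, not the procedure used in the project: `orient_scaffold` is a hypothetical function, and each marker is assumed to be given as a pair of its basepair position on the scaffold and its centimorgan position on the linkage group. The scaffold is called forward ("+") when genetic position tends to increase with physical position, reverse ("-") when it tends to decrease, and undetermined ("?") when the markers carry no ordering information.

```python
from typing import List, Tuple


def orient_scaffold(markers: List[Tuple[int, float]]) -> str:
    """Infer scaffold orientation from (bp on scaffold, cM on linkage group) pairs."""
    score = 0
    # Compare every pair of markers and count concordant vs. discordant orderings.
    for i in range(len(markers)):
        for j in range(i + 1, len(markers)):
            d_bp = markers[j][0] - markers[i][0]
            d_cm = markers[j][1] - markers[i][1]
            product = d_bp * d_cm
            if product > 0:
                score += 1      # physical and genetic order agree
            elif product < 0:
                score -= 1      # physical and genetic order disagree
    if score > 0:
        return "+"
    if score < 0:
        return "-"
    return "?"  # one marker, co-segregating markers, or conflicting evidence


# Usage: three markers whose map positions rise along the scaffold.
print(orient_scaffold([(1_000, 10.0), (50_000, 12.5), (90_000, 15.0)]))
```

A scaffold with a single mapped marker can be anchored to a linkage group but not oriented, which is one reason to develop additional markers on unoriented scaffolds.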

    Impacts
    (N/A)

    Publications


      Progress 10/01/07 to 09/30/08

      Outputs
      Progress Report Objectives (from AD-416) and Approach (from AD-416): as stated under Goals / Objectives and Project Methods above.

      Significant Activities that Support Special Target Populations: Funds are being provided by the United Soybean Board, Project #7268, entitled "Genome Assembly: Genetic Anchoring of Contigs", a project conducted in collaboration with researchers at USDA-ARS, Ames, IA. Work was undertaken to discover single nucleotide polymorphism (SNP) DNA markers that distinguish the soybean cultivar Williams 82 from the wild soybean (G. soja) PI 468916. A cross was made between these two genotypes, and fertile hybrids were produced. More than 400 recombinant inbred lines were developed from this cross. These lines are being characterized with the newly discovered SNP markers as well as with simple sequence repeat (SSR) markers that were previously developed and genetically mapped in our laboratory. The genetic mapping information is allowing us to anchor sequence contigs from the Williams 82 whole genome sequence to the genetic map. Research progress is monitored through frequent e-mail and telephone communication, exchange of genetic mapping data, and face-to-face meetings twice a year.

      Impacts
      (N/A)

      Publications