Source: PURDUE UNIVERSITY submitted to
DATABASE OF REPETITIVE AND TRANSPOSABLE ELEMENTS IN SOYBEAN: GENOME ANNOTATION
Sponsoring Institution
Agricultural Research Service/USDA
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0412307
Grant No.
(N/A)
Project No.
3625-21000-052-04S
Proposal No.
(N/A)
Multistate No.
(N/A)
Program Code
(N/A)
Project Start Date
Sep 6, 2007
Project End Date
Dec 31, 2010
Grant Year
(N/A)
Project Director
SHOEMAKER R C
Recipient Organization
PURDUE UNIVERSITY
(N/A)
WEST LAFAYETTE,IN 47907
Performing Department
AGRONOMY
Non Technical Summary
(N/A)
Animal Health Component
(N/A)
Research Effort Categories
Basic
20%
Applied
80%
Developmental
0%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
201                    2499                              1080                      10%
201                    1820                              1080                      90%
Goals / Objectives
Additional research objectives include 1) develop repeat junction markers for mapping and identification of homoeologous centromeres in soybean, 2) anchor repetitive DNA to the soybean genetic map and whole-genome draft sequence, and 3) identify genomic rearrangements and genes or gene fragments caused by transposable elements.
Project Methods
Putative repeat markers will be recognized by analysis of the distribution and organization of transposable elements in the draft sequence of the soybean genome. Polymorphisms will be tested by PCR amplification and transposon display in the parental lines of a segregating population; markers that are polymorphic between the parental lines will be mapped using the segregating population and integrated into the draft sequence of the soybean genome. Genomic rearrangements and genes or gene fragments caused by transposable elements will be identified by a combined approach to transposon and gene annotation.

Progress 09/06/07 to 12/31/10

Outputs
During this reporting period the project identified a nearly complete set of transposable elements (32,552 retrotransposons and 6,029 DNA transposons) in the sequenced soybean reference genome and constructed a comprehensive soybean transposable element database. These data laid the foundation for an accurate assessment of the composition of the genome (~46,000 soybean genes) and are freely accessible through the database website, SoyTEdb.org, which helps users make better use of soybean genomic resources. The elements deposited in this database, together with numerous truncated elements or remnants, make up ~50% of the soybean genome and are particularly enriched in the recombination-suppressed regions surrounding the centromeres.
In addition, the project developed a computational approach to predicting centromeric retrotransposon families in any sequenced plant genome, identified a subset of transposable elements harboring gene fragments, and characterized molecular mechanisms that maintain the partnership between autonomous and non-autonomous retrotransposons. Progress on this project is monitored through periodic conference calls, written reports, and annual face-to-face meetings.

Impacts
(N/A)

Publications


Progress 10/01/09 to 09/30/10

Outputs
We investigated 510 retrotransposon families, so-called "jumping genes" because of their ability to move from place to place in chromosomes. In soybean, these families comprise 32,370 elements. Approximately 87% of these elements are located in chromosomal regions where genetic recombination is suppressed. Further study of the evolutionary relationships among the families uncovered elements that had diverged before the split between dicots (flowering plants with two seed leaves, e.g., Arabidopsis, soybean, alfalfa) and monocots (flowering plants with one seed leaf, e.g., rice, corn, orchids). Analysis of the physical association of the families with the repeated DNA sequences common to chromosomal structures called centromeres indicated that the function of the repeats predates the evolutionary split of dicots and monocots. Our data also suggest a single ancient origin of retrovirus-like elements in flowering plants.
The details of DNA sequence, evolutionary relationships, and chromosomal positions were all placed into a publicly available database. Progress on this project is monitored through periodic conference calls, written reports, and annual face-to-face meetings.
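As a rough worked example of the proportions quoted above, the sketch below splits the 32,370 elements by chromosomal region using the reported figure of approximately 87%. Because the report gives that share only approximately, the resulting counts are estimates. The function name `split_by_region` is illustrative and is not part of the project's tooling.

```python
# Figures quoted in the progress report above.
TOTAL_ELEMENTS = 32_370       # elements across the 510 retrotransposon families
SUPPRESSED_FRACTION = 0.87    # "approximately 87%", so the results are estimates


def split_by_region(total: int, suppressed_fraction: float) -> tuple[int, int]:
    """Return (suppressed_region_count, chromosome_arm_count) as whole elements."""
    if not 0.0 <= suppressed_fraction <= 1.0:
        raise ValueError("suppressed_fraction must be between 0 and 1")
    suppressed = round(total * suppressed_fraction)
    return suppressed, total - suppressed


suppressed, arms = split_by_region(TOTAL_ELEMENTS, SUPPRESSED_FRACTION)
print(f"recombination-suppressed regions: ~{suppressed:,} elements")
print(f"chromosome arms:                  ~{arms:,} elements")
```

Under these figures, roughly 28,000 elements fall in recombination-suppressed regions and roughly 4,000 in the chromosome arms.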

Impacts
(N/A)

Publications


Progress 10/01/08 to 09/30/09

Outputs
Significant Activities that Support Special Target Populations
Using a semi-automated program developed in the cooperator's laboratory, the cooperator identified 32,552 retrotransposons (Class I) and 6,029 DNA transposons (Class II) with clear insertion sites. These elements, together with numerous truncated elements and fragments, make up approximately 61% of the soybean genome. As in other crop species, soybean transposable elements are particularly enriched in centromeric and pericentromeric regions. SoyTE, "the soybean transposable element database," has been set up and will be freely accessible for users to search, browse, visualize, and download the transposable element (TE) sequences (http://www.soytedb.org). SoyTE was also integrated with the soybean physical and genetic maps, helping users make better use of soybean genetic resources.
The 14,106 intact elements and 18,264 solo long terminal repeats (LTRs) are classified into 510 distinct families. Of these, 353 families were assigned as Gypsy-like and the remaining 157 as Copia-like. The ratio of solo LTRs to intact elements (S/I) in recombination-suppressed regions is significantly lower than in chromosome arms. Using the rice and maize centromeric retrotransposons for comparison, two centromere-specific or centromere-enriched families in soybean were identified and characterized. This finding supports the hypothesis that centromeric retrotransposons predate the divergence of dicot and monocot species. The age distribution showed that most LTR-retrotransposons (approximately 95%) in soybean were amplified within the past three million years.
Close examination of the largest family in soybean, Soybean Nonautonomous and Autonomous Retrotransposon Element (SNARE), revealed several new observations regarding the interaction between autonomous and non-autonomous retrotransposons. First, SNARE contains two autonomous subfamilies and one non-autonomous subfamily, and inter-element recombination mediated by reverse transcription during retrotransposition was frequently observed. Second, the bifurcation of non-autonomous elements that arose recently from a single lineage into two distinct subgroups corresponds to two autonomous partners that appear to have been brought together in the present soybean genome by a polyploidization event about 15 million years ago. Third, the "proliferation" of an unrelated "piggy-backing" retroelement was mediated by rapid amplification of a single non-autonomous element, supporting the idea that nested retrotransposons have the potential to amplify and transpose in the plant genome.
Progress on this project has been documented through written reports, conference calls, and face-to-face meetings. Three manuscripts are under review or in preparation.
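The solo-LTR-to-intact ratio (S/I) and the family counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the genome-wide numbers given in this report. The report does not list per-region counts, so comparing recombination-suppressed regions with chromosome arms would need counts supplied by the reader. The function name `solo_to_intact_ratio` is illustrative.

```python
# Genome-wide counts quoted in the progress report above.
INTACT_ELEMENTS = 14_106   # intact LTR-retrotransposons
SOLO_LTRS = 18_264         # solo LTRs
GYPSY_FAMILIES = 353
COPIA_FAMILIES = 157


def solo_to_intact_ratio(solo: int, intact: int) -> float:
    """Return the S/I ratio: number of solo LTRs per intact element."""
    if intact <= 0:
        raise ValueError("intact element count must be positive")
    return solo / intact


total_elements = INTACT_ELEMENTS + SOLO_LTRS
total_families = GYPSY_FAMILIES + COPIA_FAMILIES
genome_wide_si = solo_to_intact_ratio(SOLO_LTRS, INTACT_ELEMENTS)

print(f"total LTR elements: {total_elements:,}")
print(f"total families:     {total_families}")
print(f"genome-wide S/I:    {genome_wide_si:.2f}")
```

The same function can be called once per region (for example, with counts from pericentromeric regions and from chromosome arms) to reproduce the kind of comparison the report describes.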

Impacts
(N/A)

Publications


Progress 10/01/07 to 09/30/08

Outputs
Approach (from AD-416)
To fully annotate the soybean genome, it is important to accurately identify LTR-retrotransposons, DNA transposons, Helitrons, and centromeric repetitive DNAs. Because manual inspection is very time-consuming, the first year will focus on the identification of repeats. Because transposable elements and other classes of repetitive DNA diverge rapidly in plants, inter-species comparison or homology-based searches against available repeat databases will have limited power to identify new repeats in the soybean genome. We will therefore take a semi-automated approach to identifying LTR-retrotransposons, DNA transposons, Helitrons, centromere satellite repeats, etc., by employing multiple currently available sequence analysis tools, such as CROSS_MATCH, LTR_STRUC, RECON, RepeatMasker, and Tandem Repeats Finder, and by manual annotation and inspection based on structural analysis of some classes of repeats. Perl-based scripts and programs will be developed to facilitate data mining and processing. The database of characterized and classified repeat sequences (FASTA files) will be made accessible and searchable via SoyBase and The Soybean Breeders ToolBox, and will provide information on repeat nature, repeat percentage, repeat structure, and repeat evolution in the soybean genome.
Significant Activities that Support Special Target Populations
Using a semi-automated program developed in the Cooperator's laboratory, the Cooperator identified 6,456 intact LTR-retrotransposons and 9,238 solo LTRs, which are classified into 486 distinct families. These elements, together with numerous truncated elements or remnants, make up ~30% of the soybean genome. Based on the chromosomal distribution of LTR-retrotransposons and their association with centromere satellite repeats, soybean centromere-enriched retrotransposons have been identified. The data suggest that most intact LTR-retrotransposons were amplified in the soybean genome within the past three million years, and that removal of retrotransposon DNA by unequal recombination and illegitimate recombination is a rapid process counteracting the genome expansion caused by retrotransposon proliferation. The evolutionary dynamics of LTR-retrotransposons will be investigated. This dataset has been submitted to the Soybean Genome Annotation Team to help annotate the soybean genes. The incumbent is currently developing programs to identify Helitrons and DNA transposons such as Mutator, CACTA, and hAT elements. So far, 172 Mutator elements, 27 CACTA elements, and 10 Helitrons have been mined, and more elements will be identified and manually inspected.
This project is part of NP 301 Plant Genetic Resources, Genomics, and Genetic Improvement, and fits within Action Plan Component 2, Crop Informatics, Genomics and Genetic Analyses, Problem Statement 2A: Genome Database Stewardship and Informatics Tool Development, and Problem Statement 2B: Structural Comparison and Analysis of Crop Genomes. Progress on this project has been documented through written reports and conference calls.
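The report does not say how the amplification times of intact LTR-retrotransposons were estimated. A common approach, sketched here as an assumption and not as the project's method, uses the fact that the two LTRs of an element are identical at insertion and then diverge independently, so the insertion age is roughly T = K / (2r), where K is the corrected divergence between the two LTRs and r is the substitution rate. The rate of 1.3e-8 substitutions per site per year is an illustrative value only, and all function names are hypothetical. The report's tooling was planned in Perl; Python is used here only for illustration.

```python
import math

# ASSUMPTION: illustrative substitution rate (per site per year); not from this report.
ASSUMED_RATE = 1.3e-8


def ltr_divergence(ltr5: str, ltr3: str) -> float:
    """Proportion of mismatched sites between two pre-aligned LTRs (gap columns skipped)."""
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(ltr5.upper(), ltr3.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor correction of an observed mismatch proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age_years(ltr5: str, ltr3: str, rate: float = ASSUMED_RATE) -> float:
    """Estimate insertion age as T = K / (2 * rate)."""
    k = jukes_cantor_distance(ltr_divergence(ltr5, ltr3))
    return k / (2.0 * rate)


# Toy example: two 100-bp "LTRs" differing at 2 sites (2% observed divergence).
example_age = insertion_age_years("A" * 100, "A" * 98 + "CC")
print(f"estimated insertion age: {example_age:,.0f} years")
```

Under this assumed rate, an age of three million years corresponds to a corrected LTR divergence of about 0.078 (2 x 1.3e-8 x 3,000,000), so elements with less diverged LTR pairs would be classed as younger than that threshold.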

Impacts
(N/A)

Publications