Source: UNIVERSITY OF CALIFORNIA, DAVIS submitted to
THE POSITIVE EFFECTS OF INTRONS ON PLANT GENE EXPRESSION
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0207144
Grant No.
2006-35301-17072
Project No.
CA-D*-MCB-7538-CG
Proposal No.
2006-01141
Multistate No.
(N/A)
Program Code
52.2
Project Start Date
Aug 1, 2006
Project End Date
Jul 21, 2009
Grant Year
2006
Project Director
Rose, A. B.
Recipient Organization
UNIVERSITY OF CALIFORNIA, DAVIS
410 MRAK HALL
DAVIS,CA 95616-8671
Performing Department
MOLECULAR & CELLULAR BIOLOGY
Non Technical Summary
The information contained in most plant genes is interrupted by non-coding sequences known as introns. Introns are deleted after the gene has been transcribed into mRNA but before the mRNA is translated into protein. Introns are not simply packaging material that must be removed before the information in a gene can be read. In many examples, the amount of protein derived from a gene is greatly increased by the presence of an intron. The goal of this project is to determine the mechanistic basis for the positive effect of introns on the expression of plant genes. The parts of an intron involved in regulating the degree of stimulation will be identified by analyzing a series of hybrids that were made with parts of two introns that boost expression to very different degrees. Also, a computer program designed to predict the enhancing ability of introns in Arabidopsis will be tested, refined, and adapted for use in crop species. This research will increase our understanding of a very important but poorly understood fundamental aspect of plant gene expression. A deeper understanding of the factors needed for abundant gene expression could have great practical benefit because there are many scientific, agricultural, and commercial applications in which a high level of protein synthesis is desirable.
Animal Health Component
(N/A)
Research Effort Categories
Basic
100%
Applied
(N/A)
Developmental
(N/A)
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
201 | 2420 | 1040 | 100%
Goals / Objectives
The long-term aim of this research is to investigate the mechanisms by which introns increase the expression of many plant genes. The goal of this proposal is to identify the intron sequences that control the magnitude of enhancement and are therefore likely to be directly involved in the mechanism. This goal will be achieved in two ways. The first is to define the regions of a well-characterized intron required for it to elevate expression, and the second is to use a computational approach to identify sequences whose occurrence in various introns correlates with their ability to stimulate expression. For ease of study and to build on previous work, these experiments will begin with an analysis of Arabidopsis introns in Arabidopsis. The studies then will be extended to introns in rice and maize. Preliminary results obtained with two structurally similar introns with very different effects on expression suggest that the sequences responsible for enhancement are redundant, dispersed, and more prevalent or active in some introns than others. To eliminate the variation in expression between lines with different numbers of transgenes, the first specific aim is to identify, from existing lines containing a reporter gene with various deletion-containing and hybrid introns, those with a single copy of the transgene. A careful quantitative evaluation of reporter gene expression in each will be performed, with the aim of determining the distribution of enhancing sequences in an intron. In the second specific aim, a complementary approach will be used to identify potential enhancing sequences by computational means. An algorithm (named the IMEter) has been devised that compares an individual intron to all the introns in genes that are predicted to be highly expressed, based on their codon entropy. When applied to Arabidopsis introns whose effect on expression has been characterized, the IMEter generates a score that correlates with that intron's ability to boost expression. 
The ability of the IMEter to predict the enhancing capability of an intron will be tested, and the original algorithm will be refined to optimize the match between IMEter score and experimental data. In addition, the data generated using the IMEter will be searched for short sequences whose occurrence in an intron contributes to a high IMEter score: these sequences are candidates for being involved in the enhancement mechanism. Finally, the IMEter will be reprogrammed using the rice genome sequence, and the rice IMEter will be evaluated using rice and maize introns known to enhance expression in those species.
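The gene-selection step described above can be sketched in Python. This is only an illustration: the report does not define how the IMEter computes codon entropy, so the sketch assumes plain Shannon entropy over the codon frequencies of a single coding sequence, and the function names (codon_entropy, lowest_entropy_genes) are hypothetical. Raw entropy also depends on gene length, which this sketch does not correct for.

```python
import math
from collections import Counter


def codon_entropy(cds: str) -> float:
    """Shannon entropy (in bits) of codon usage in one coding sequence.

    Assumption: the IMEter's actual entropy measure may differ.
    """
    cds = cds.upper()
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def lowest_entropy_genes(cds_by_gene: dict, fraction: float = 0.2) -> list:
    """Return the gene names in the lowest-entropy fraction of the input."""
    ranked = sorted(cds_by_gene, key=lambda g: codon_entropy(cds_by_gene[g]))
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]
```

Introns from the genes returned by lowest_entropy_genes would then serve as the reference set against which a test intron is compared.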
Project Methods
The first aim explores the sequence dependence for the very different effects on expression mediated by structurally similar introns from the UBQ10 and COR15a genes of Arabidopsis. A few nucleotides in each intron were changed to generate restriction sites, which were then used to create deletions in the strongly enhancing UBQ10 intron and hybrids containing regions from each intron. Transgenic lines were generated that contain a reporter gene into which these introns were inserted. Preliminary evidence from these lines suggests that enhancing sequences are redundant and dispersed throughout the UBQ10 intron. Accurate quantitation will be performed using lines containing a single copy of the transgene, and these will be identified by genomic DNA blot. Reporter expression relative to an intronless control will be measured by RNA blot and enzyme assay, and intron splicing efficiency will be monitored by RT-PCR using primers that flank the intron. The outcome will reveal which portions of the UBQ10 intron are necessary and sufficient for maximal stimulation. To test the IMEter, four pairs of introns were selected from the genome. The introns in each pair are carefully matched in numerous ways except that one generates a much higher IMEter score than the other. All eight introns will be inserted into the same reporter, single-copy transgenic lines containing each will be identified, and reporter gene expression and splicing efficiency will be assessed as described above. The results from these eight introns and the six previously analyzed will be used to evaluate the correlation between an intron's IMEter score and its ability to boost expression. Refinements will be made to the program such as adjusting for intron length, considering only the first introns from genes, and describing intron composition using tetramers and hexamers rather than pentamers as in the original IMEter. 
Modifications will be retained that improve the correlation between IMEter score and enhancing ability of all the well-characterized introns. A rice IMEter will be generated by determining the codon entropy of all the genes in the rice genome, calculating the frequency at which all possible pentamers are found in all the introns in the 20% of genes with the lowest entropy, and asking if a test intron fits that pentamer profile better than the profile of introns from genes with high codon entropy. This algorithm will be evaluated using rice introns known to stimulate expression. Even though a complete maize IMEter will not be possible until the maize genome is sequenced, the rice IMEter will be tested on maize introns previously shown to boost expression.
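The pentamer-profile comparison described above can be approximated with a log-odds word score, sketched below in Python. This is not the IMEter source code: the pseudocount, the log-odds formulation, and the function names (kmer_frequencies, build_log_odds, score_intron) are assumptions made for illustration, and the word length k is left adjustable so that tetramers or hexamers can be tried as the text proposes.

```python
import math
from collections import Counter
from itertools import product


def kmer_frequencies(introns, k=5, pseudocount=1.0):
    """Relative frequency of every possible k-mer across a set of introns."""
    counts = Counter()
    for seq in introns:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            # Skip words containing ambiguous bases such as N.
            if set(word) <= set("ACGT"):
                counts[word] += 1
    all_words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts.values()) + pseudocount * len(all_words)
    return {w: (counts[w] + pseudocount) / total for w in all_words}


def build_log_odds(low_entropy_introns, high_entropy_introns, k=5):
    """Log2 ratio of k-mer frequency in the two reference intron sets."""
    fg = kmer_frequencies(low_entropy_introns, k)
    bg = kmer_frequencies(high_entropy_introns, k)
    return {w: math.log2(fg[w] / bg[w]) for w in fg}


def score_intron(seq, log_odds, k=5):
    """Sum the log-odds of every k-mer in a test intron."""
    seq = seq.upper()
    return sum(log_odds.get(seq[i:i + k], 0.0)
               for i in range(len(seq) - k + 1))
```

A positive score means the test intron resembles the introns of low-entropy genes more than those of high-entropy genes. Because the score is a sum over all words, it grows with intron length, which is one reason a length adjustment is listed among the planned refinements.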

Progress 08/01/06 to 07/21/09

Outputs
OUTPUTS: The main activity during the funding period was to conduct and analyze all of the experiments proposed. This included the training of numerous students and postdoctoral scientists. In the Rose lab, a total of six undergraduate students were involved in performing the research, as were three graduate students doing laboratory rotations. In addition, two technicians were hired for a combined total of sixteen months. The Korf lab trained two postdocs and two graduate students in the computational analysis of introns. The services provided during the period of funding included consulting for a locally owned small biotechnology company that is interested in using introns to increase gene expression in insect cells. A tangible product to arise from this research is the IMEter, which is software that reliably predicts the enhancing ability of introns in rice and Arabidopsis. An online version of the IMEter is available for public use at http://korflab.ucdavis.edu/cgi-bin/web-imeter.pl and a link to the IMEter was established in the TAIR database as a Gene Expression Resources portal (http://www.arabidopsis.org/portals/expression/index.jsp). In addition to the publications listed below, the results of this research were disseminated through the presentation of posters at four scientific meetings (the Cold Spring Harbor Eukaryotic mRNA Processing meetings in August 2007 and 2009, the Biology of Post-transcriptional Gene Regulation Gordon Conference in July 2008, and the Plant Gene Expression Center 20 Year Symposium in Berkeley, June 2007). Also, Dr. Korf gave a talk on this subject at the Biology and Mathematics in the Bay Area meeting in November 2008. PARTICIPANTS: Alan B. Rose, PI, carried out the expression experiments, constructed plasmids and did plant transformations, and supervised the students in his lab. 
This included six undergraduate students (Aruna Dutta, Tamika Rozema, Janice Kwong, Angie Hood, Ilona Gurkova, and Jeff Lu) who mostly performed plasmid construction and helped with the growth and maintenance of the plants. Three graduate students (Amanda Schrager, Daniel Chu, and Phillip Conklin) performed laboratory rotations in the Rose lab, during which they constructed plasmids, did preliminary expression studies, or isolated introns from rice and Arabidopsis. Two people worked in the Rose lab as Senior Research Associate II (Dinah Arumainayagam from Aug. 2007 to May 2008 and Shelley Martin from Aug. 2006 to Dec. 2006), doing plasmid construction and helping with expression experiments. The Co-PI, Ian F. Korf, performed most of the initial software development and supervised the people in his lab. This included two graduate students (Shahram Emami and Tali Elfersi) and two postdocs (Genis Parra and Keith Bradnam), all of whom helped to test and refine the IMEter algorithm. TARGET AUDIENCES: The target audiences for this work include scientists involved in basic research as well as those involved in biotechnology applications in which a high level of gene expression is desired. The work provided insights into the general mechanisms of eukaryotic gene expression and is expected to be of interest as a fundamental biological process. The success of many biotechnology ventures often rests on the ability to produce a large amount of product from a specific gene, and scientists attempting this would benefit from the increased yield that introns can provide. Intron-mediated enhancement has been observed in many diverse organisms in addition to plants, making this research relevant to a broad audience. PROJECT MODIFICATIONS: None. The project was completed as proposed.

Impacts
The revised objectives for this funding period were to identify the intron sequences responsible for the very different enhancement mediated by the UBQ10 and COR15a introns, and to test, refine, and adapt for crop plants the IMEter algorithm. Both of these aims were fully accomplished, and the results generated new insights into intron structure and the ways introns could be stimulating gene expression. One change in knowledge to result from this work is a new appreciation that introns early in genes are compositionally distinct from those that occur further from the start of transcription, and that this difference in composition is related to the effect of promoter-proximal introns on gene expression. Another is that the intron sequences responsible for boosting gene expression are distributed throughout enhancing introns. This is significant because it eliminates all possible mechanisms that involve the binding of a factor to a specific or localized sequence, or that depend on a particular RNA secondary structure. Prior to this work, the only way to identify introns that boost gene expression for scientific purposes or for plant biotechnology was trial and error. The development of the IMEter algorithm greatly facilitates the identification of enhancing introns, a change of action that should be of significant benefit for all practical applications in which a high level of gene expression is desired. In the future, the ability to increase expression of a specific gene through the use of stimulating introns has potential to change conditions by improving the nutritional quality of foods and by providing a means to generate new value-added crops.

Publications

  • Rose, A.B. (2007) Book Review: Plant Gene Expression. Science STKE, pe26.
  • Korf, I.F. and A.B. Rose (2009) Applying word-based algorithms: The IMEter. In Plant Systems Biology (D. Belostotsky ed.) Methods in Molecular Biology (Clifton, NJ) 553:287-301.
  • Rose, A.B., T. Elfersi, G. Parra, and I. Korf (2008) Promoter-proximal introns in Arabidopsis thaliana are enriched in dispersed signals that elevate gene expression. Plant Cell 20:543-551.
  • Rose, A.B. (2008). Intron-mediated regulation of gene expression. In Nuclear pre-mRNA Processing in Plants (A.S.N. Reddy and M. Golovkin, eds.) Current Topics in Microbiology and Immunology 326:277-290.


Progress 08/01/07 to 07/31/08

Outputs
OUTPUTS: The goal of my research is to understand how introns enhance gene expression in plants. The aims of the current funding are to localize the intron sequences responsible for boosting expression using deletions and hybrids between two introns that affect expression to different degrees, and to test, refine and expand an algorithm for predicting the enhancing ability of introns. Significant progress has been made on both of these aims this year. Measurements of the mRNA accumulation and enzyme activity derived from reporter genes containing hybrid or deletion-containing introns confirm that the sequences responsible for stimulating expression are distributed throughout the enhancing UBQ10 intron. That is, the removal of any part of this strongly enhancing intron has a negligible effect on its ability to boost expression, and all regions of this intron have some enhancing capacity when inserted into a non-stimulating intron. The dispersed nature of the enhancing sequences indicates that introns must influence gene expression by a unique mechanism. The algorithm used in the second aim is a word-based method for comparing an individual intron to promoter-proximal introns as a group. We had previously noted a strong correlation between the ability of an intron to stimulate expression and the score it generates with this algorithm. Six new introns were tested and all increased expression in proportion to their score, demonstrating that the algorithm is capable of predicting the enhancing ability of introns. The quantitative data from these six introns and the six previously tested in the same gene were used to optimize the variable parameters of the algorithm. The algorithm was adapted to use rice intron sequences, and this version gives a high score to virtually all introns known to increase expression in rice and most that boost expression in maize and Arabidopsis. 
Motifs were identified in two ways: by searching for sequences that are over-represented in the highest-scoring introns, and by using the algorithm to scan intron sequences in a small sliding window. Both methods suggest the sequence TCGATT is important in generating high scores and therefore may be involved in the mechanism of enhancement. The algorithm was also incorporated into a computational approach to design synthetic introns with higher scores than any natural intron. These could be useful if they stimulate more strongly than existing introns, and the approach provides a way to generate an infinite supply of new enhancing introns in a variety of species. The algorithm is now publicly available from a web site (http://korflab.ucdavis.edu/cgi-bin/web-imeter.pl), and a link to this web site has been established in the TAIR database of Arabidopsis genome information as a Gene Expression Resources portal (http://www.arabidopsis.org/portals/expression/index.jsp). PARTICIPANTS: Alan Rose performed and supervised all of the research involving plants and plasmid construction. Ian Korf performed and supervised all of the computational research. During the period covered by this report, this research provided training for two graduate students doing laboratory rotations and three undergraduates doing research for course credit. Research collaborations were recently initiated with two other groups: Karen McDonald (Chemical Engineering, UC Davis) and Tracy Johnson (UC San Diego). TARGET AUDIENCES: Not relevant to this project. PROJECT MODIFICATIONS: Not relevant to this project.
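A sliding-window scan of the kind mentioned above might look like the following Python sketch. The window length, step, and word length are assumptions (the report says only that the window was small), the function names are hypothetical, and word_scores stands in for a per-word score table that would come from the trained algorithm.

```python
def sliding_window_scores(seq, word_scores, k=6, window=20, step=1):
    """Score each window of an intron as the sum of its k-mer scores.

    Returns a list of (window_start, score) tuples.
    """
    seq = seq.upper()
    # Score contributed by the k-mer starting at each position.
    per_pos = [word_scores.get(seq[i:i + k], 0.0)
               for i in range(len(seq) - k + 1)]
    words_per_window = window - k + 1
    results = []
    for start in range(0, len(per_pos) - words_per_window + 1, step):
        results.append((start, sum(per_pos[start:start + words_per_window])))
    return results


def best_window(seq, word_scores, k=6, window=20):
    """Return the first highest-scoring (start, score) pair, or None."""
    scored = sliding_window_scores(seq, word_scores, k, window)
    return max(scored, key=lambda pair: pair[1]) if scored else None
```

Windows that score well in many high-scoring introns point to candidate motifs, which can then be compared with the words found to be over-represented by direct counting.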

Impacts
The benefits of this research will be both practical and scientific. The success of many biotechnology ventures depends on the production of large amounts of a specific protein. Most of the candidate genes involved in such applications will be from other species and will lack introns naturally (e.g. viral or bacterial genes), or their introns will have been removed to avoid problems of splicing foreign introns. The expression of many of these genes will be increased by the addition of an intron from the host species. Thus, a collection of easily inserted introns, and knowledge of how best to use them, will help boost the production of valuable proteins for scientific, agricultural, or commercial purposes. The algorithm to predict the enhancing ability of introns will be particularly useful because it will allow new and stronger stimulatory introns to be found or made. Identifying the intron sequences involved in enhancement will allow us to engineer a desired level of stimulation by changing the sequence of an intron, and provides the basis for future biochemical isolation of other factors involved in the mechanism of enhancement. The algorithm to predict enhancement can be adapted readily for use in many organisms including crop plants. In addition to these practical considerations, the study of introns will continue to increase our understanding of the fundamental events of gene expression in plants and other organisms. An indication of the significance of this work is that the article reporting the results was featured in the In Brief section of the Plant Cell (Farquharson, K.L. 2008. The IMEter Predicts an Intron's Ability to Boost Gene Expression. Plant Cell 20:498), and has been evaluated by Faculty of 1000 Biology (http://www.f1000biology.com/article/id/1104324/evaluation).

Publications

  • Rose, A.B. 2007. Book Review: Plant Gene Expression. Science STKE, pe26.
  • Rose, A.B., Elfersi, T., Parra, G., and Korf, I. 2008. Promoter-Proximal Introns in Arabidopsis thaliana are Enriched in Dispersed Signals that Elevate Gene Expression. Plant Cell 20: 543-551.


Progress 08/01/06 to 08/01/07

Outputs
The goal of my research is to understand how introns enhance gene expression in plants. The aims of the current funding are to analyze hybrid introns formed using parts of two different introns, only one of which significantly boosted gene expression, and to test, refine and expand an algorithm for predicting the enhancing ability of introns. Since funding for this project began in August 2006, significant progress has been made on both of these aims. For the first aim, homozygous single-copy transgenic lines containing a reporter gene with one of twelve different hybrid introns or controls were identified. For each, enzyme activity, mRNA accumulation, and intron splicing efficiency have been determined, with the following findings. First, the minor changes made to introduce restriction sites to facilitate the construction of hybrid introns have no discernible effect on the enhancing ability of the two parental introns. Second, the enhancing sequences are roughly evenly distributed throughout the first three quarters of the enhancing intron, and each enhancing region has an additive effect on expression. Third, any hybrid intron containing enhancing sequences at the beginning of the intron seems to have an additional two-fold effect on translational efficiency. That is, the enhancement seen at the level of enzyme activity is twice that seen at the level of mRNA accumulation, as if more enzyme is being produced per unit of mRNA. Perhaps the splicing of those introns promotes either export of the mRNA from the nucleus to the cytoplasm or association of the mRNA with ribosomes. Fourth, all of the hybrid introns are completely spliced, indicating that any differences between them are not due to splicing efficiency. To test the predictive abilities of the algorithm (the IMEter) in the second aim, transgenic lines with a single copy of a reporter gene containing introns predicted to differ in their effect on expression have been identified. 
Of the eight introns, two clearly prevent expression of the reporter. The annotation for one of these was recently modified in the Arabidopsis database to indicate that the intron is actually 38 nt shorter than originally predicted, which would cause a reading frame shift. Expression studies on the remaining six introns will begin in the new year. An analysis of introns in rice, Drosophila, and C. elegans using the IMEter has revealed that first introns are structurally different from later introns in Arabidopsis, rice, and Drosophila, but not in C. elegans. A program has been devised to very rapidly evaluate the effect of changing various parameters of the IMEter and to incorporate new information as it is generated, with the goal of optimizing the correlation between IMEter scores and expression data. The IMEter is currently being used on the hybrid introns in hopes of pinpointing sequences that are potentially responsible for the enhancement observed.
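The parameter evaluation described above can be illustrated with a short Python sketch. The report does not state which correlation statistic or search procedure was used, so Pearson's r and the simple pick-the-best loop below are assumptions, and the function names (pearson_r, best_setting) are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with >= 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def best_setting(scores_by_setting, expression):
    """Return the parameter setting whose scores correlate best.

    scores_by_setting maps a setting label (for example a word length)
    to the list of scores it gives the tested introns, in the same
    order as the measured expression values.
    """
    return max(scores_by_setting,
               key=lambda s: pearson_r(scores_by_setting[s], expression))
```

In use, each candidate setting would be scored against the same set of experimentally characterized introns, and settings would be kept only if they raise the correlation, as the Project Methods propose.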

Impacts
The unparalleled biosynthetic capabilities of plants can be harnessed to synthesize many valuable substances such as medicines, edible vaccines, or unusual oils. In most cases, success depends on making large amounts of a specific protein, and ways to increase gene expression are enormously beneficial. Most of the candidate genes for such practical applications will be from non-plant sources and will lack introns naturally (e.g. viral or bacterial genes), or their introns will have been removed because plants are poor at splicing foreign introns. The expression of many of these genes in plants will be increased by the addition of a plant intron. Thus, a collection of easily inserted introns, and knowledge of how best to use them, will help boost the production of valuable proteins for scientific, agricultural, or commercial purposes. Identifying the intron sequences involved in enhancement will allow us to engineer a desired level of stimulation by changing the sequence of an intron. The algorithm to predict the enhancing ability of introns will be particularly useful because it will allow new and stronger stimulatory introns to be found. The knowledge gained, and the algorithm to predict enhancement, can be adapted readily for use in many organisms including crop plants. In addition to these practical considerations, the study of introns will continue to increase our understanding of the fundamental events of gene expression in plants.

Publications

  • No publications reported this period