Source: MICHIGAN TECHNOLOGICAL UNIV submitted to
SYSTEMATIC IDENTIFICATION AND CHARACTERIZATION OF OVERLAPPING SENSE/ANTISENSE GENE LOCI IN POPULUS GENOME
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0228277
Grant No.
2012-67014-19445
Project No.
MICW-2011-04390
Proposal No.
2011-04390
Multistate No.
(N/A)
Program Code
A1101
Project Start Date
Apr 15, 2012
Project End Date
Apr 14, 2016
Grant Year
2012
Project Director
Yuan, Y.
Recipient Organization
MICHIGAN TECHNOLOGICAL UNIV
1400 Townsend Drive
HOUGHTON,MI 49931
Performing Department
School of Forest Resources and Environmental Science
Non Technical Summary
The genomic structure and molecular features of overlapping sense-antisense gene loci, especially those involved in plant adaptation to climate stress, have not been globally described. This project will establish a comprehensive, genome-wide catalog of sense-antisense loci associated with drought and heat stress in Populus, and provide both a novel dataset and unprecedented insight into the role of sense-antisense loci in regulating stress response, which can be exploited to develop stress-tolerant bioenergy cultivars to improve biomass production under a changing climate.
Animal Health Component
(N/A)
Research Effort Categories
Basic
100%
Applied
(N/A)
Developmental
(N/A)
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
206                    0699                              1040                      100%
Goals / Objectives
The goal of this research is to investigate the role of sense-antisense loci in regulating the stress response of plants. We will systematically determine and catalog the genome-wide occurrence of overlapping sense-antisense loci under drought and heat, and characterize the genomic context and gene organization of stress-responsive sense-antisense pairs in the Populus genome. We will also test the biological significance of candidate sense-antisense loci through functional studies, using RNA interference (RNAi), of a few putative sense-antisense loci that are evolutionarily conserved between Arabidopsis and Populus. By the end of these studies we will have generated both a novel dataset of stress-associated sense-antisense loci in Populus and several RNAi transgenic lines in both Populus and Arabidopsis ready for future functional and genetic analysis.
Project Methods
We will use Populus, a model system for woody biomass crops, to understand the co-regulation mechanism of sense-antisense pairs in plant response to climate stress. We will employ an experimental approach to enrich overlapping sense and antisense transcripts from leaf and root tissue under drought and heat. These enriched strand-specific cDNAs will be sequenced on the Illumina platform. Sequence data will be mapped to the Populus reference genome and analyzed with bioinformatics tools such as TopHat and Cufflinks. This targeted transcriptome sequencing will allow us to identify stress-responsive sense-antisense pairs genome-wide, including low-abundance ones, cost-effectively and efficiently. We will also initiate RNAi knockdown analysis to reveal the functional significance of a subset of evolutionarily conserved sense-antisense loci in Arabidopsis and Populus. This analysis will eventually disclose the biological function of sense-antisense loci in key functional processes of plants.
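To illustrate what identifying overlapping sense-antisense loci from mapped data involves, the sketch below pairs transcripts that lie on opposite strands of the same chromosome and share at least one base. It is a minimal Python sketch under stated assumptions, not the project's pipeline: the `Transcript` record, the `find_sas_pairs` function and the example coordinates are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from collections import namedtuple

# Hypothetical record for an assembled transcript (1-based, inclusive coordinates).
Transcript = namedtuple("Transcript", "name chrom start end strand")


def find_sas_pairs(transcripts):
    """Return (plus_name, minus_name) for every pair of transcripts that
    overlap on the same chromosome but lie on opposite strands."""
    by_chrom = {}
    for t in transcripts:
        by_chrom.setdefault(t.chrom, []).append(t)

    pairs = []
    for chrom_transcripts in by_chrom.values():
        plus = sorted((t for t in chrom_transcripts if t.strand == "+"),
                      key=lambda t: t.start)
        minus = sorted((t for t in chrom_transcripts if t.strand == "-"),
                       key=lambda t: t.start)
        for p in plus:
            for m in minus:
                if m.start > p.end:
                    # minus is sorted by start, so no later transcript can overlap p
                    break
                if m.end >= p.start:
                    pairs.append((p.name, m.name))
    return pairs


# Illustrative (made-up) coordinates
example = [
    Transcript("geneA", "Chr01", 100, 500, "+"),
    Transcript("asA", "Chr01", 400, 800, "-"),
    Transcript("geneB", "Chr01", 1000, 1500, "+"),
    Transcript("geneC", "Chr02", 100, 500, "-"),
]
print(find_sas_pairs(example))
```

In this made-up example only geneA and asA form a pair; geneB has no opposite-strand partner, and geneC sits alone on another chromosome.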

Progress 04/15/12 to 04/14/16

Outputs
Target Audience: We have successfully developed a novel, cost-effective RNA sequencing method to identify and profile sense and antisense transcript pairs genome-wide in Populus plants under drought. Our method demonstrated that over 50% of the Populus genome, involving at least 67% of annotated Populus gene loci, undergoes both sense and antisense transcription. During this funded research we trained four undergraduate students with majors in chemistry, cheminformatics and forestry, as well as one visiting scientist and one graduate student. The training included bioinformatics data mining, computer script development and drought treatment of plants in the greenhouse.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? We trained four undergraduate students, three from the Chemistry department and one from Forestry. One visiting scientist from China and one graduate student from the University of Georgia were also involved in this project. The students and the visiting scientist were trained in bioinformatics data analysis, computer script development and drought treatment of Populus plants in the greenhouse.
How have the results been disseminated to communities of interest? Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? We have successfully developed a novel method to enrich sense and antisense (S/AS) transcript pairs from Populus tissues. We further employed this method to profile genome-wide S/AS transcript changes in Populus under drought. We demonstrate that over 50% of the Populus genome, involving at least 67% of annotated Populus gene loci, undergoes both sense and antisense transcription. We identified 7353 S/AS pairs that are significantly differentially expressed (p < 0.05) between control (no water stress) and drought plants in apex tissue, and 2739 S/AS pairs that are differentially expressed in mature leaves. We were also able to identify several S/AS pairs that are reversely regulated under drought in apex tissue but not in mature leaves. In conclusion, over 50% of the Populus genome is transcribed from both DNA strands; the vast majority of the drought-regulated S/AS pairs are co-regulated and very few are reversely regulated. We employed strand-specific RNA-seq to validate our method. This approach allows us to evaluate our enrichment method at the genome-wide level and to generate a valuable dataset revealing the complete transcriptional landscape of the Populus genome under drought. Strand-specific RNA-seq also serves as a good point of comparison with our enrichment method in terms of discovery rate per unit of sequencing cost. Over 90% of the S/AS pairs (14035) identified by strand-specific RNA-seq are present in our enriched libraries. Our enrichment method outperforms strand-specific RNA-seq in the discovery of S/AS genes with respect to sequencing cost and efficiency: at around 1/3 of the cost of traditional strand-specific RNA-seq, we identified about twice as many S/AS transcript pairs from our enriched libraries as from strand-specific RNA-seq.
Moreover, genome-wide transcript profiling of the enriched S/AS libraries derived from apex tissue under early drought stress identified 7353 co-regulated S/AS pairs (p < 0.05), while standard strand-specific RNA-seq identified only 339 co-regulated and 106 reversely regulated S/AS pairs (p < 0.05). These results demonstrate that the enriched S/AS transcript sequencing developed in this project is not only a powerful tool for discovery of S/AS pairs at the genome-wide level, but also an efficient approach to identify drought-regulated S/AS genes at a much lower cost. With this enrichment method and the strand-specific RNA-seq analysis, we discovered that apex and mature leaves respond to drought quite differently at the transcriptome level. Transcriptional change under drought is more dynamic in apex than in mature leaves, with both sense and antisense transcription changing dramatically in response to the drought signal. We investigated the transcriptome landscape of the Populus genome under long-term drought stress with 16 strand-specific mRNA libraries from 4 tissue types (apex, young leaves, mature leaves and root) of 2 control and 2 drought-treated plants. The overall transcriptome response in root showed more significant changes in both sense and antisense transcription under drought than the leaf transcriptome, which was not very responsive to drought. In particular, the apex transcriptome is the least responsive to drought, with only 1 de_SAS transcript pair identified from expression analysis. We used GOseq gene enrichment analysis to functionally categorize drought-induced sense and antisense genes. Our enriched S/AS transcript profiling reveals that both sense and antisense transcripts of transcription factors are important regulators of drought response in young tissue of plants. In conclusion, we have functionally categorized drought-regulated S and AS genes into different functional groups based on GOseq enrichment analysis.
We were able to identify a set of genes involved in nucleosome assembly that are regulated differently under different drought treatments and in different tissues. Nucleosomes are the basic units of chromatin, and chromatin regulation is essential for regulating gene and genome activity, but it is not clear whether or how genes involved in nucleosome assembly are regulated by stresses such as drought. In this research we demonstrated that a substantial number of genes associated with nucleosome assembly show antisense transcription, and that such AS genes are also differentially co-regulated by drought together with their sense genes. Such knowledge will help us to understand how plants respond to drought at the genome level. These data also provide us with an opportunity to carry out further experiments to investigate the functional role of antisense transcription in the drought response of Populus. By investigating the distribution pattern of sense and antisense reads across annotated gene regions, we identified three major types of S/AS pairs: divergent (5' UTR overlapping), convergent (3' UTR overlapping) and embedded (one transcript entirely contained within the other). Contrary to previous reports, we found that the majority of S/AS pairs belong to the embedded type. When we separate the S/AS pairs by regulation pattern (co-regulated versus reversely regulated), it is even clearer that the embedded and divergent types are overwhelmingly predominant among co-regulated S/AS pairs, and the divergent and embedded types among reversely regulated S/AS pairs. We also inspected the distribution of sense and antisense reads across different regions of genes, including introns and exons. We discovered that antisense reads are enriched in the intron regions of Populus genes. Together with the observation that the majority of Populus genes show intron retention, this implies that there may be an extensive relationship between antisense transcription and alternative splicing, a finding that merits further investigation.
We have identified a subset of drought-regulated genes involved in nucleosome assembly for functional analysis. Since the majority of the S/AS pairs that we studied are of the embedded type, we have decided to mainly employ CRISPR for targeted mutation of these genes. For several S/AS pairs with distinct 5' and 3' UTR sequences we will also design constructs for RNAi. Since plant transformation in Populus is a very tedious procedure, we plan to continue this work with support from the School of Forest Resources and Environmental Science at Michigan Technological University after the project has ended.
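The three overlap types named above (divergent, convergent, embedded) can be told apart from transcript coordinates alone. The sketch below is one possible way to do so, assuming 1-based inclusive coordinates and one transcript on each strand; the function name `classify_sas_pair` and the example coordinates are hypothetical, and this is not presented as the classification procedure actually used in the project.

```python
def classify_sas_pair(plus, minus):
    """Classify a sense/antisense pair by how the two transcripts overlap.

    plus  -- (start, end) of the transcript on the + strand
    minus -- (start, end) of the transcript on the - strand
    Coordinates are assumed 1-based and inclusive, with start <= end.
    """
    p_start, p_end = plus
    m_start, m_end = minus

    if p_end < m_start or m_end < p_start:
        return "non-overlapping"

    # One transcript lies entirely within the other.
    if (p_start <= m_start and m_end <= p_end) or \
       (m_start <= p_start and p_end <= m_end):
        return "embedded"

    # The + transcript's 3' end is its right edge and the - transcript's
    # 3' end is its left edge, so a + transcript starting first means the
    # two 3' ends overlap (tail-to-tail).
    if p_start < m_start:
        return "convergent"

    # Otherwise the - transcript starts first and the two 5' ends overlap
    # (head-to-head).
    return "divergent"


# Illustrative (made-up) coordinates
print(classify_sas_pair((100, 500), (400, 800)))   # convergent
print(classify_sas_pair((400, 800), (100, 500)))   # divergent
print(classify_sas_pair((100, 900), (300, 600)))   # embedded
```

Working from transcript boundaries rather than annotated UTR coordinates keeps the sketch short; a fuller version would check that the overlap actually falls within the UTRs.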

Publications


    Progress 04/15/14 to 04/14/15

    Outputs
    Target Audience: Overlapping sense-antisense transcripts are encoded from both strands of the same genomic locus. The gene structure, molecular features and biological function of such loci, especially those involved in plant adaptation to climate stress such as drought, have not been globally described and documented. This project will establish a comprehensive, genome-wide catalog of sense-antisense loci associated with drought and heat stress in Populus, and provide both a novel dataset and unprecedented insight into the role of sense-antisense loci in regulating stress response, which can be exploited to develop stress-tolerant bioenergy cultivars to improve biomass production under adverse climate.
    Changes/Problems: Nothing Reported
    What opportunities for training and professional development has the project provided? Dr. Su Chen, a visiting scholar from China, has been involved in this project. With help from the PI, Dr. Chen wrote custom scripts for data mapping and expression analysis. Ms. Xi Gu, a Ph.D. candidate at the University of Georgia, has also been involved in this project. With guidance from the PI, she wrote a custom script to divide annotated Populus genes into different groups to assist data analysis.
    How have the results been disseminated to communities of interest? Nothing Reported
    What do you plan to do during the next reporting period to accomplish the goals? We are going to validate the expression analysis data with qPCR. Based on the qPCR data we are going to design gene-specific constructs to genetically modify individual S/AS loci (around 10) in Populus. Meanwhile we are going to publish the data we have analyzed and submit the RNA-seq data to NCBI for public access.

    Impacts
    What was accomplished under these goals? The major goal of this project is to identify drought-responsive sense/antisense (S/AS) gene loci in the Populus genome. We have now developed a technology to enrich sense/antisense gene pairs from drought-stressed tissues of Populus. This technology allows us to work with small amounts of input RNA (as low as 100 ng with our method vs. 150 ug with a previously published method). Since the amount of input RNA is no longer a limiting factor, we are able to use this technology to enrich sense/antisense pairs from any stressed tissue, including challenging tissues such as apex and root tips. More importantly, we adapted this method to the Illumina TruSeq kit for high-throughput Illumina sequencing. Thus this technology can be applied to any species for maximized discovery of sense/antisense gene pairs in any tissue or cell type. With this established technology we have generated 36 sense-antisense enriched libraries from leaves (including apex, young leaves and mature leaves) and 9 libraries from root tissues of drought and control plants. 12 libraries have been sequenced on a HiSeq 2000 sequencer at the University of Michigan, 2 libraries on an Illumina MiSeq sequencer at Michigan State University, and 15 libraries have recently been sequenced by the PI on a MiSeq sequencer located in our lab. We have obtained over 34G of data from enriched libraries and over 35G of data from non-enriched strand-specific RNA libraries. With this amount of data we successfully constructed a pipeline for gene expression analysis of strand-specific RNA data and our enriched sense-antisense data. Sequenced reads are cleaned with Trimmomatic and mapped to the Populus reference genome with TopHat. The mapping rates are very similar for enriched and non-enriched strand-specific RNA sequencing, with over 75% for HiSeq reads and over 90% for our MiSeq reads.
We identified 13,534 overlapping S/AS loci from over 20G of data derived from a non-enriched strand-specific RNA library of drought-stressed young leaves. In contrast, we detected 26,661 overlapping S/AS loci with 2G of data from our enriched libraries. Therefore, at 1/12 of the cost of non-enriched libraries, we are able to detect twice as many S/AS loci with our enrichment method. Furthermore, through expression analysis of enriched S/AS libraries with 2 drought samples and 2 control samples, we identified 9,331 drought-responsive S/AS loci in apex tissue, with 4,488 upregulated and 4,843 downregulated by drought. With the same approach, we also sequenced and analyzed S/AS loci from mature leaves. We identified 3,746 drought-responsive S/AS loci, with 1,843 upregulated and 1,903 downregulated by drought. We designed a unique sampling technique to identify potential negatively regulated S/AS pairs: S/AS transcripts are enriched by hybridization of an equal mixture of control and drought tissues. When this was combined with expression analysis, we identified 9 potential negatively regulated S/AS pairs from apex tissue and 6 from mature leaves. Since we are aiming to develop a technology with high efficiency and low cost, we took advantage of our own MiSeq sequencer to sequence our enriched libraries. We multiplexed 6 enriched root libraries (2 drought-stressed, 2 control and 2 mixed-sample) into one lane on the MiSeq sequencer. For non-enriched libraries we pooled 4 libraries (2 drought-stressed and 2 control) for one run. On average, we identified 5,196 S/AS loci from each non-enriched library and 26,346 from each enriched library. The discovery rate is thus 5 times higher with our enriched libraries with just one MiSeq run. We carried out expression analysis with non-enriched strand-specific libraries and enriched libraries.
For non-enriched libraries we identified 75 drought-regulated S/AS pairs, and for enriched libraries we identified 3,516 drought-responsive S/AS loci, with 1,558 upregulated and 1,958 downregulated by drought. Notably, 47 of the 75 drought-responsive S/AS loci identified in non-enriched libraries were also identified as drought-responsive S/AS loci in enriched libraries. Of the remaining 28, 23 are not significantly different between drought and control in enriched libraries and 5 were not detected in enriched libraries. For the drought-responsive S/AS loci identified from expression analysis, we performed Gene Ontology (GO) enrichment analysis to identify the enriched GO terms. We found that the enriched GO terms in apex and mature leaves are quite distinct. Ribosomal RNA and proteins, genes involved in translation, lipid metabolic process and microtubule motor activity are significantly enriched in drought-stressed apex tissue, while genes controlling oxidation-reduction process, transferase activity and heme binding are significantly enriched in drought-stressed mature leaves, which share a similar pattern with the drought-treated root transcriptome. To validate these drought-responsive S/AS loci we are going to carry out qPCR and transgenic analysis, which will ultimately demonstrate the functionality of such loci in the drought response of Populus.
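The discovery-rate comparisons and locus counts reported above can be re-derived with a few lines of arithmetic. The sketch below uses only figures quoted in this report; the helper names `loci_per_g` and `parts_sum_to_total` are hypothetical. Because the non-enriched library is described as "over 20G", its per-G rate is an upper bound and the resulting fold difference a lower bound.

```python
def loci_per_g(loci, data_g):
    """S/AS loci detected per G of sequence data."""
    return loci / data_g


def parts_sum_to_total(total, *parts):
    """Check that reported sub-counts add up to the reported total."""
    return sum(parts) == total


# Young leaves: non-enriched (over 20G) vs. enriched (2G) libraries
non_enriched_rate = loci_per_g(13534, 20)
enriched_rate = loci_per_g(26661, 2)
fold_per_g = enriched_rate / non_enriched_rate        # roughly 19.7

# Root: average loci per library, enriched vs. non-enriched
fold_per_library = 26346 / 5196                       # roughly 5.07

# Share of non-enriched drought-responsive loci recovered by enrichment
recovered_fraction = 47 / 75                          # roughly 0.63

print(round(fold_per_g, 1), round(fold_per_library, 2),
      round(recovered_fraction, 2))

# Up- plus down-regulated counts match the reported totals
print(parts_sum_to_total(9331, 4488, 4843))   # apex
print(parts_sum_to_total(3746, 1843, 1903))   # mature leaves
print(parts_sum_to_total(3516, 1558, 1958))   # root, enriched
print(parts_sum_to_total(75, 47, 23, 5))      # root, non-enriched
```

All four consistency checks hold for the numbers as reported.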

    Publications


      Progress 04/15/13 to 04/14/14

      Outputs
      Target Audience: One undergraduate student, Dylan Malone, from the School of Forest Resources and Environmental Science of Michigan Technological University, has been involved in growing Populus plants in the greenhouse for this project. Dylan was a civil engineering major before transferring to Forestry. He had no experience working with plants, but he was eager to learn about the effects of drought on plant growth and was passionate about the project. After two weeks of training he was capable of taking care of plants in the greenhouse, and he always worked on the project over weekends.
      Changes/Problems: Nothing Reported
      What opportunities for training and professional development has the project provided? Nothing Reported
      How have the results been disseminated to communities of interest? Nothing Reported
      What do you plan to do during the next reporting period to accomplish the goals? We have submitted two libraries derived from short-term drought-treated young leaf tissue for Illumina MiSeq sequencing. We expect to get the sequence data in April. We plan to submit more libraries derived from different drought treatments, including long-term drought treatment, and from different tissues such as mature leaves, apex and root. After we obtain the high-throughput sequence data, we are going to map them to the Populus reference genome and quantify the expression level of sense/antisense pairs in drought-treated samples and controls. We expect to identify sense/antisense gene pairs that are related to drought stress. Detailed analysis of those drought-responsive sense/antisense gene pairs will be carried out, including gene structure characterization, evolutionary conservation between Arabidopsis and Populus, and RNAi knockdown analysis of several targeted sense/antisense gene pairs.

      Impacts
      What was accomplished under these goals? The major goal of this project is to identify drought-responsive sense/antisense (S/AS) gene loci in the Populus genome. We have now developed a technology to enrich sense/antisense gene pairs from drought-stressed tissues of Populus. This technology allows us to work with small amounts of input RNA (as low as 100 ng with our method vs. 150 ug with a previously published method). Since the amount of input RNA is no longer a limiting factor, we are able to use this technology to enrich sense/antisense pairs from any stressed tissue, including challenging tissues such as apex and root tips. More importantly, we adapted this method to the Illumina TruSeq kit for high-throughput Illumina sequencing. Thus this technology can be applied to any species for maximized discovery of sense/antisense gene pairs in any tissue or cell type. With this established technology we have generated 4 libraries from young leaf tissues and 4 libraries from root tissues. Two libraries have been submitted for Illumina MiSeq sequencing. All these libraries have also been cloned into the Zero Blunt TOPO PCR vector. For routine library validation, 20 colonies are picked from each library for colony PCR to estimate the size range of the libraries. The size range of the libraries is around 200 bp to 900 bp, and the average size is about 500 bp. 10 colonies from each library are sequenced by traditional Sanger sequencing. So far we have sequenced 70 clones from the libraries that we have made, and 55 produced good reads with EST and mRNA support. We mapped all the reads to the Populus genome with blastn. The S/AS gene pairs mapped through our small-scale cloning and sequencing approach contain at least one protein-coding gene per gene pair.
Although the majority of the mapped regions are characterized as hypothetical or putative proteins, we have identified several known genes with antisense transcription, such as auxin response factor 1 family protein, heat shock protein 70, splicing factor PWI domain-containing protein and chaperone protein. These proteins have been linked with various stress conditions such as water stress and heat. The antisense transcription of such stress-responsive genes identified in our project could play a critical role in mediating plant response to stress. We picked 9 of the 55 clones for strand-specific RT-PCR analysis to confirm the antisense origin of these genes. Since conventional strand-specific RT-PCR analysis produces a high rate of false-positive PCR artifacts due to primer-independent first-strand cDNA synthesis, we developed a technique to improve the accuracy and strand specificity of strand-specific RT-PCR analysis by taking advantage of dideoxynucleotide triphosphates (ddNTPs). ddNTPs are chain-terminating inhibitors of DNA polymerase used in Sanger sequencing. We pretreated total RNA, free of genomic DNA contamination, with ddNTPs in the presence of a reverse transcriptase (SuperScript III in our study). During this treatment, ddNTPs block all the possible self-priming sites, thus allowing gene-specific primers to initiate first-strand cDNA synthesis more efficiently in the next round of RT reaction. With this technique we were able to confirm 8 genes that contain antisense transcription at their protein-encoding loci. Three of these 8 confirmed sense-antisense gene pairs show alternative splicing. For the one sense-antisense gene pair that we have not confirmed with our stressed samples, we plan to test it on control (non-stressed) samples. Since this clone is derived from a library of mixed RNA source (stressed sample mixed with control), there is a possibility that the antisense transcript of this gene is present only in unstressed samples.
Therefore we are going to carry out strand-specific RT-PCR with control RNAs.

      Publications


        Progress 04/15/12 to 04/14/13

        Outputs
        Target Audience: We trained two undergraduate students on our research during this reporting period. Philip Olivares is a Chemistry student at MTU whose major is biochemistry. He has been actively involved in growing Populus cuttings and in drought stress treatments. He received extensive training on tree growth, the effect of drought on plant development and tissue sampling techniques. He volunteered to work on weekends to carry out the drought treatment. Another undergraduate student, Dylan Malone, is a Forestry student. He is now working on another batch of Populus cuttings. He has shown strong interest in this project. He will also be trained in sequencing, molecular cloning and other advanced plant genomic technologies.
        Changes/Problems: The heat stress treatment could not be completed successfully due to the breakdown of our growth chambers. We will try to repeat this treatment with the repaired machine. If high temperature is still a problem for our old growth chamber, we will have to modify this approach. One alternative would be to grow the Populus plants during the summer months.
        What opportunities for training and professional development has the project provided? Two undergraduate students have been involved in growing Populus cuttings in the greenhouse and in drought treatments.
        How have the results been disseminated to communities of interest? Nothing Reported
        What do you plan to do during the next reporting period to accomplish the goals? We plan to have the sequencing data mapped to the Populus genome and to identify the potential stress-responsive sense-antisense loci. We will also repeat the drought treatment and confirm the identified putative drought-responsive overlapping sense-antisense loci in both the Populus and Arabidopsis genomes.

        Impacts
        What was accomplished under these goals? Green stem cuttings of Nisqually-1 were requested from Dr. Steven Strauss of Oregon State University and Dr. Gerald Tuskan of Oak Ridge National Laboratory. Serial vegetative propagation was carried out from those cuttings in a misting chamber in our greenhouse. After one month of growth in the misting chamber, the rooted cuttings were planted in 3 gal. pots filled with Sunshine soil 1# and were watered to field capacity for 60 days in a greenhouse with an average day temperature of around 25°C and a night temperature of 18°C. The drought treatments were started on day 61. After assessing an initial trial of drought treatment, we selected two watering regimes: 100% and 40% of field capacity for control plants and drought-stressed plants, respectively. For control plants, the pots were weighed every day and watered to 100% of field capacity. For water-stressed plants, water was withheld until the soil water level reached 40% of field capacity. For the short-term water deficit treatment, stressed plants were kept at 40% of field capacity for 24 hours. We also carried out a long-term water deficit treatment by allowing the stressed plants to grow for 15 days at 40% of field capacity. The drought treatments were repeated at least 4 times. Plant growth and leaf damage were recorded during drought treatment. Compared with control plants, stressed plants formed fewer new leaves during the 15-day long-term water deficit treatment, and stem growth was also dramatically reduced, averaging 5.2 cm for stressed plants versus 10 cm for control plants. Sampling was done between 11 am and 3 pm. We collected apex, young leaves, mature leaves and roots from both control and stressed plants and flash-froze them in liquid N2. We attempted to apply heat stress (42°C) to water-stressed plants. The growth chambers that our school owns, however, are too old and could not maintain consistent heat.
Therefore the heat stress experiment was not completed. We are now in the process of RNA extraction and sample preparation for RNA sequencing. We expect to get the sequence data by the end of August. PI Yuan is the only person working on this project. She was on maternity leave from October 25, 2013 to January 25. As a result, the project is slightly behind our research plan. Yuan is now fully back at work on the project, and we expect to catch up with the plan very soon.
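The watering regime described above (pots weighed daily, controls held at 100% of field capacity and stressed plants at 40%) can be expressed as a target pot weight. The sketch below assumes soil water is tracked gravimetrically, which the report does not state explicitly; the function names and the example weights (pot dry weight, weight at field capacity) are hypothetical.

```python
def target_pot_weight(dry_weight_kg, field_capacity_weight_kg, fraction):
    """Pot weight corresponding to a given fraction of field capacity.

    dry_weight_kg            -- pot plus oven-dry soil (assumed known)
    field_capacity_weight_kg -- pot weight after saturation and free drainage
    fraction                 -- e.g. 1.0 for controls, 0.4 for drought
    """
    water_at_field_capacity = field_capacity_weight_kg - dry_weight_kg
    return dry_weight_kg + fraction * water_at_field_capacity


def water_to_add(current_weight_kg, target_weight_kg):
    """Water (kg, roughly litres) needed to bring a pot back up to target.
    Returns 0 when the pot is still above target and should be left to dry."""
    return max(0.0, target_weight_kg - current_weight_kg)


# Made-up example: 8.0 kg dry, 11.0 kg at field capacity
control_target = target_pot_weight(8.0, 11.0, 1.0)
drought_target = target_pot_weight(8.0, 11.0, 0.4)

print(water_to_add(10.4, control_target))   # control pot needs topping up
print(water_to_add(9.6, drought_target))    # drought pot still above target
```

Plant fresh weight adds to the pot weight as the plants grow, so a fuller version would correct for it over a 15-day treatment.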

        Publications