Source: UNIVERSITY OF CALIFORNIA, DAVIS submitted to
GENOME WIDE IDENTIFICATION AND ANNOTATION OF FUNCTIONAL REGULATORY REGIONS IN LIVESTOCK SPECIES
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
1005339
Grant No.
2015-67015-22940
Project No.
CA-D-ASC-2283-CG
Proposal No.
2014-05920
Multistate No.
(N/A)
Program Code
A1201
Project Start Date
Jan 1, 2015
Project End Date
Dec 31, 2019
Grant Year
2015
Project Director
Zhou, H.
Recipient Organization
UNIVERSITY OF CALIFORNIA, DAVIS
410 MRAK HALL
DAVIS,CA 95616-8671
Performing Department
Animal Science
Non Technical Summary
Genomics is playing an increasingly significant role in improving animal production, health, and well-being. The utility of a genome depends critically on how well it is annotated. The goal of this project is to improve the annotation of regulatory regions for three of the most important farm animal genomes: chicken, cow, and pig. These genomes have been assembled, but there is very limited information on the functional roles of the regulatory regions of their genes. Recent work by the human and mouse ENCODE projects provides a blueprint for identifying the functional roles of regulatory elements in the human and mouse genomes, and that blueprint can be applied to similar efforts on animal genomes. Our specific goals are to 1) identify regulatory elements of the chicken, cow, and pig genomes, 2) determine the functional roles of regulatory regions, and 3) freely distribute all data using the UCSC and Ensembl genome browsers. The proposed research addresses the program area priority: Application of genome-wide methods for identification of gene regulatory regions. This study will set a cornerstone for initiating an animal ENCODE project by providing a valuable resource for exploring the functional landscapes of the chicken, bovine, and swine genomes, and a valuable tool for a deeper and more meaningful understanding of complex biological systems.
Animal Health Component
0%
Research Effort Categories
Basic
80%
Applied
(N/A)
Developmental
20%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
304                    3210                              1080                      34%
304                    3310                              1080                      33%
304                    3510                              1080                      33%
Goals / Objectives
The overall goal of this project is to generate a comprehensive resource of functional regulatory elements for the chicken, cattle, and pig genomes. Our specific goals are:
1) Annotate chromatin states corresponding to DNase I hypersensitivity, four histone modifications (H3K4me3, H3K27me3, H3K27ac, and H3K4me1), and one insulator element (insulator-binding protein CCCTC-binding factor).
2) Identify and annotate promoters, enhancers, and silencers by integrating information from RNA-seq, DNase I hypersensitivity, and ChIP-seq for four histone modifications and one insulator element.
3) Freely distribute all raw and annotated data via UCSC Genome Browser and Ensembl.
Project Methods
DNase-seq, RNA-seq, and ChIP-seq data will be generated from 8 tissues. For DNase-seq, nuclei will be isolated from each tissue, and libraries will be prepared and sequenced on an Illumina HiSeq 2000. For RNA-seq, total RNA will be isolated, and libraries will be prepared and sequenced on an Illumina HiSeq 2000. For ChIP-seq, chromatin will be isolated from each tissue and fragmented by sonication to an average size of 200-300 bp using a Bioruptor apparatus. The sonicated chromatin will then be used for immunoprecipitation, with a specific antibody for each histone modification mark. Chromatin will be precipitated using protein A/G coated magnetic beads: individual antibodies will first be bound to the beads, and the beads with bound antibodies and targets will then be purified using a magnet. DNA will be precipitated from each antibody's immunoprecipitate and from the input control sample and used to prepare sequencing libraries indexed with TruSeq adaptors. Libraries will be sequenced on a HiSeq 2000 with up to 8 samples per lane in a 100 bp single-read run. Promoter, enhancer, and insulator elements will be identified from the mapped features, and hidden Markov models will be used to determine the functional roles of regulatory regions. Annotated genomes will be distributed via the UCSC and Ensembl browsers.

Progress 01/01/15 to 12/31/19

Outputs
Target Audience: Livestock scientific community, livestock industry
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Graduate students, undergraduate students, and a postdoctoral scholar have received training in animal genetics, bioinformatics, and genomic analysis.
How have the results been disseminated to communities of interest? Oral and poster presentations were given at scientific meetings and institutes in the US, Italy, Kenya, Spain, France, and the UK. Participated in a workshop with diverse participants, including scientists from institutes and funding agencies.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? We have completed data generation for all ChIP-seq marks (H3K4me3, H3K27ac, H3K27me3, H3K4me1, and CTCF) for eight tissues and two male biological replicates of chicken, pig, and cattle. Across all species, an average of 24,986,402 aligned and filtered reads were obtained per H3K4me3 library, 27,428,564 per H3K4me1 library, 26,874,867 per H3K27ac library, 30,394,847 per CTCF library, and 48,702,912 per H3K27me3 library. Many quality metrics were used to ensure high data quality, including the ENCODE data standards. We found the Jensen-Shannon distance (JSD) metric, which is not part of the ENCODE standards, to be the most informative indicator of data quality. All libraries were required to exceed a JSD of 0.1 for narrow marks (all except H3K27me3) and a JSD of 0.05 for H3K27me3. These cut-offs were determined by manual inspection of a wide variety of data. Average JSDs of 0.37, 0.17, 0.25, 0.15, and 0.09 were obtained for the H3K4me3, H3K4me1, H3K27ac, CTCF, and H3K27me3 marks, respectively. The fraction of reads in peaks (FRiP), which is widely used in ChIP-seq studies, was another key data quality metric. Average FRiPs of 0.38, 0.22, 0.26, 0.14, and 0.23 were achieved for the H3K4me3, H3K4me1, H3K27ac, CTCF, and H3K27me3 marks, respectively. ChromHMM was used to build a 14-state model integrating all the ChIP-seq data to predict genome-wide chromatin states per tissue in each species. The chromatin states were used to annotate an average of 16,653 active enhancers (2,718 tissue-specific) in each chicken tissue, 39,811 (6,729 tissue-specific) in each pig tissue, and 31,339 (7,883 tissue-specific) in each cattle tissue. An average of 13,413, 13,962, and 15,345 active promoters were identified in chicken, pig, and cattle, respectively, with 409 being tissue-specific on average across all tissues and species.
Insulators, marked by CTCF, were identified at an average of 29,192 per tissue (9,395 tissue-specific) across all species. Open chromatin data (DNase-seq in chicken, ATAC-seq in pig and cattle) and RRBS-seq data were also generated for all samples used to generate the ChIP-seq data. Comparative analysis across species has begun in order to investigate the evolutionary conservation of these regulatory elements; it combines the chicken, pig, and cattle data generated for this project with additional data from horse (Equine FAANG) and from human and mouse (ENCODE).
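The two quality metrics named above can be illustrated with a minimal Python sketch. It assumes that the JSD is taken between a ChIP library and its input control, each summarized as a vector of counts over the same categories; the report does not state which tool or binning was used, so the function names and inputs here are hypothetical. Only the cut-offs (0.1 for narrow marks, 0.05 for H3K27me3) come from the report.

```python
import math

# Cut-offs taken from the report; everything else in this sketch is an assumption.
NARROW_MARK_CUTOFF = 0.1
JSD_CUTOFFS = {"H3K27me3": 0.05}


def frip(reads_in_peaks, total_reads):
    """Fraction of reads in peaks: reads overlapping peaks / all aligned, filtered reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_peaks / total_reads


def _normalize(counts):
    """Turn a vector of non-negative counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def _kl_bits(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(chip_counts, input_counts):
    """Square root of the Jensen-Shannon divergence (base 2), so the range is 0 to 1."""
    p = _normalize(chip_counts)
    q = _normalize(input_counts)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * _kl_bits(p, m) + 0.5 * _kl_bits(q, m)
    return math.sqrt(max(divergence, 0.0))


def passes_jsd(mark, jsd):
    """Apply the report's rule: a library must exceed the cut-off for its mark."""
    return jsd > JSD_CUTOFFS.get(mark, NARROW_MARK_CUTOFF)
```

Under these cut-offs, the reported average JSD of 0.09 for H3K27me3 passes its 0.05 threshold, whereas the same value would fail the 0.1 threshold applied to the other marks.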

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2019 Citation: Kern, C. P. Y. Wang, P. Saelao, K. Chanthavixay, I. Korf, M. Delany, H. Cheng, J. Ross, Zhou, H. 2019. Allele-specific chromatin accessibility and histone modifications in an F1 cross of MDV resistant and susceptible chicken lines. 37th Conference for the International Society of Animal Genetics, Lleida, Spain
  • Type: Conference Papers and Presentations Status: Published Year Published: 2019 Citation: Halstead M., Kern, C. P. Y. Wang, P. Saelao, S. M. Waters, X. Xu. I. Korf, M. Delany, H. Cheng, J. F. Medrano, A. Van Eenennaam, C. K. Tuggle, C. W. Ernst. Zhou, H. J. Ross. 2019. Identification of orthologous tissue-specific enhancer-gene pairs across chicken, pig and cattle. 37th Conference for the International Society of Animal Genetics, Lleida, Spain
  • Type: Conference Papers and Presentations Status: Published Year Published: 2019 Citation: Kern, C. P. S. M. Waters, X. Xu. I. Korf, M. Delany, H. Cheng, J. F. Medrano, A. Van Eenennaam, C. K. Tuggle, C. W. Ernst. J. Ross, Zhou, H. Comparison of epigenomes from eight bovine tissues identify regulatory elements with tissue-specific activity. 2019. Keystone Symposia Conference: Epigenetics and Human Disease, Banff, Alberta, Canada.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2019 Citation: Kern, C. P. Y. Wang, P. Saelao, S. M. Waters, X. Xu. I. Korf, M. Delany, H. Cheng, J. F. Medrano, A. Van Eenennaam, C. K. Tuggle, C. W. Ernst. J. Ross, Zhou, H. 2019. Correlating Gene Expression with the Histone Modifications H3K4me3 and H3K27ac in High and Low CpG Content Promoters of Chickens, Cattle, and Pigs. Plant & Animal Genome XXVII, San Diego, CA.


Progress 01/01/18 to 12/31/18

Outputs
Target Audience: Livestock scientific community, livestock industry
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Graduate students, undergraduate students, and a postdoctoral scholar have received training in animal genetics, bioinformatics, and genomic analysis.
How have the results been disseminated to communities of interest? Oral and poster presentations were given at scientific meetings and institutes in the US, France, and the UK. Participated in a workshop with diverse participants, including scientists from institutes and funding agencies.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? We have performed RNA-seq, DNase-seq/ATAC-seq, and five ChIP-seq assays on eight tissues in three farm animal species. Currently, we have completed data collection for eight tissues in cattle, seven tissues in pig, and six tissues in chicken. Three ChromHMM models, one for each species, were trained to predict genome-wide chromatin states specific to each tissue. To facilitate comparison of models between species, all models were created with 14 states. Among the three species, 11 states were consistent between models, while the remaining three states in each model were either unique to that species or appeared in the models for only two species. Once all data collection is completed, a more robust comparison between tissues and species could provide novel insights into the evolutionary divergence of regulatory elements.
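The report does not say how states were judged consistent between the species models. One plausible approach, sketched here purely as an assumption, is to treat each state as a vector of per-mark emission probabilities and pair states that are each other's nearest neighbour. The function names, the distance threshold, and the toy emission values are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two emission vectors with marks in the same order."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mutual_best_matches(model_a, model_b, max_distance=0.5):
    """Pair states that are each other's nearest neighbour across two models.

    model_a and model_b map a state name to its list of emission probabilities.
    max_distance is an arbitrary illustrative threshold, not a value from the report.
    """
    best_ab = {
        sa: min(model_b, key=lambda sb: euclidean(ea, model_b[sb]))
        for sa, ea in model_a.items()
    }
    best_ba = {
        sb: min(model_a, key=lambda sa: euclidean(eb, model_a[sa]))
        for sb, eb in model_b.items()
    }
    pairs = []
    for sa, sb in best_ab.items():
        close_enough = euclidean(model_a[sa], model_b[sb]) <= max_distance
        if best_ba[sb] == sa and close_enough:
            pairs.append((sa, sb))
    return pairs


# Toy emissions in the order H3K4me3, H3K27ac, H3K4me1, CTCF, H3K27me3 (made-up values).
chicken = {
    "E1": [0.90, 0.80, 0.10, 0.00, 0.00],
    "E2": [0.00, 0.70, 0.80, 0.00, 0.00],
    "E3": [0.00, 0.00, 0.00, 0.00, 0.90],
}
pig = {
    "S1": [0.00, 0.00, 0.00, 0.00, 0.80],
    "S2": [0.85, 0.75, 0.15, 0.05, 0.00],
    "S3": [0.00, 0.00, 0.00, 0.90, 0.00],
}
matched = mutual_best_matches(chicken, pig)
```

In this toy case the promoter-like and repressed states pair up across the two models, while the enhancer-like chicken state and the CTCF-only pig state remain unmatched, which mirrors the kind of outcome the report describes as species-specific states.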

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: Y. Wang, X. Li, Saelao, P., K. Chanthavixay, T.R. Kelly, J.M. Dekkers, R. Gallardo, S.J. Lamont. Zhou, H. 2018. Transcriptome Analysis of Host Response to NDV Infection under Heat Stress in Two Genetically Distinct Chicken Inbred Lines. Plant & Animal Genome XXVI, San Diego, CA
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: M. Halstead, Kern, C. P. Saelao, Y. Wang, M. Delany, H. Cheng, Zhou, H. P. J. Ross. 2018. Modification of ATAC-Seq Permits Profiling of Open Chromatin in Cryopreserved Chicken Lung. Plant & Animal Genome XXVI, San Diego, CA
  • Type: Journal Articles Status: Published Year Published: 2018 Citation: Kern, C. P. P. Saelao, Y. Wang, M. Halstead, J. L. Chitwood, I. Korf, M. Delany, H. Cheng, J. F. Medrano, A. Van Eenennaam, C. W. Ernst. J. Ross, Zhou, H. 2018. Genome-wide identification of tissue-specific long non-coding RNA in three farm animal species. BMC Genomics doi.org/10.1186/s12864-018-5037-7.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: Christopher K. Tuggle, Huaijun Zhou, Catherine Ernst, Crystal L. Loving, James Koltes, James M. Reecy, Pablo Ross, Dan Nonneman, Tim P.L. Smith, Juan Steibel, Wen Huang. 2018. Functional Annotation of the Porcine Genome. Livestock Genomics, Hinxton, UK.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: S. M. Waters. P. J. Ross, Zhou, H. 2018. Tissue Specific ChIP-Seq Analysis of Four Histone Modifications and an Insulator Element in Bovine Adult Male Tissue Samples. Plant & Animal Genome XXVI, San Diego, CA
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: Kingsley, N. B. C. J. Finno, G. Berguet, J.-J. Wesselink, M. Tammoh, C. Creppe, S. Schvartzman, C. Sabatel, C. Kern, E. N. Burns, H. Zhou, J. L. Petersen, R. R. Bellone. 2018. Optimization of Equine ChIP-Seq for the Functional Annotation of Animal Genomes. Plant & Animal Genome XXVI, San Diego, CA
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: Kern, C. P. J. Ross, P. Saelao, Y. Wang, M. Halstead, J. L. I. Korf, M. Delany, H. Cheng, J. F. Medrano, A. Van Eenennaam, C. K. Tuggle, C. W. Ernst. Zhou, H. 2018. Genome-Wide Identification and Analysis of CTCF Binding Sites in Chickens and Pigs. Plant & Animal Genome XXVI, San Diego, CA
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: P. J. Ross, Kern, C. Y. Wang, P. Saelao, K. Chanthavixay, I. Korf, C.K. Tuggle, C. Ernst, Zhou, H. 2018. Identification of Regulatory Regions in the Swine Genome. Plant & Animal Genome XXVI, San Diego, CA
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: Zhou, H. P. J. Ross, Kern, C. P. Saelao, Y. Wang, M. Halstead, K. Chanthavixay, I. Korf, M. Delany, H. Cheng, J. F. Medrano, A. Van Eenennaam, C. K. Tuggle, C. W. Ernst. 2018. Identification of regulatory elements in livestock species. Plant & Animal Genome XXVI, San Diego, CA


Progress 01/01/17 to 12/31/17

Outputs
Target Audience: Livestock scientific community, livestock industry
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Graduate students, undergraduate students, and a postdoctoral scholar have received training in genomics and bioinformatics analysis.
How have the results been disseminated to communities of interest? Oral and poster presentations were given at scientific meetings and institutes in the US and Ireland. Participated in a workshop with diverse participants, including scientists from institutes and funding agencies.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? Development of ChIP-seq and ATAC-seq assays to identify and annotate regulatory elements in livestock genomes. Development of a bioinformatic pipeline to functionally annotate livestock genomes.

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2017 Citation: Kern, C. P. Saelao, Y. Wang, M. Halstead, J. L. Chitwood, T. Kim, P. J. Ross, I. Korf, M. Delany, H. Cheng, Zhou, H. 2017. The Open Chromatin Landscape of the Chicken Genome. Plant & Animal Genome XXV, San Diego, CA.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2017 Citation: X. Zhou, Y. Zhang, J. N. Brutman, J.J. Michal, H. Jiang, Y. Wang, P. Ross, M. Delany, H. Zhou, H. Cheng, J. F. Davis, Z. Jiang. 2017. Non-Coding RNAs: Striking Features in Use of Alternative Polyadenylation Sites. Plant & Animal Genome XXV, San Diego, CA.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2017 Citation: N. Trakooljul, Zhou, H. P. J. Ross, I. Korf, M. Delany, H. Cheng, S. Ponsuksili, K. Wimmers. 2017. Comparative DNA Methylome of the Chicken and Pig: An Evolutionary Bridge Between Avian and Mammalian. Plant & Animal Genome XXV, San Diego, CA.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2017 Citation: N. Trakooljul, H. Zhou, P. Ross, I. Korf, M. E. Delany, H. H. Cheng, C. Ernst, C. Kern, F. Hadlich, S. Ponsuksili, and K. Wimmers. 2017. Update on DNA methylation datasets of FAANG reference samples for the chicken and pig. 36th Conference for the International Society of Animal Genetics, Dublin, Ireland.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2017 Citation: Zhou, H. P. J. Ross, Kern, C. P. Saelao, Y. Wang, M. Halstead, K. Chanthavixay, I. Korf, M. Delany, H. Cheng, J. F. Medrano, A. Van Eenennaam, C. K. Tuggle, C. W. Ernst. 2017. Identification of regulatory elements in livestock species. 36th Conference for the International Society of Animal Genetics, Dublin, Ireland.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2017 Citation: Kern, C. P. Saelao, Y. Wang, K. Chanthavixay, I. Korf, C. K. Tuggle, C. W. Ernst P. J. Ross, Zhou, H. 2017. Genome-wide analysis of H3K4me3 and H3K27me3 in three tissues in pigs. 36th Conference for the International Society of Animal Genetics, Dublin, Ireland.


Progress 01/01/16 to 12/31/16

Outputs
Target Audience: Animal genome scientific community, livestock industry
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Graduate students, undergraduate students, and a postdoctoral scholar have received training in genomics assays and bioinformatics analysis.
How have the results been disseminated to communities of interest? Oral and poster presentations were given at scientific meetings and institutes in the US, New Zealand, and China. Participated in a workshop with diverse participants, including scientists from institutes, funding agencies, and NGOs.
What do you plan to do during the next reporting period to accomplish the goals? Continue work on the other three ChIP-seq assays in all three species. For fat and muscle, additional optimization is needed for ChIP-seq. The ATAC-seq assay will be further optimized. Integrative analysis of DNase-seq, ChIP-seq, and RNA-seq will be conducted.

Impacts
What was accomplished under these goals? To identify regulatory elements in the genomes by integrating RNA-seq, DNase-seq, and ChIP-seq data from each tissue, we have generated and analyzed data from the three species, including analysis of forty-eight RNA-seq libraries (sixteen per species) collected from two biological replicates across eight tissues: adipose, cerebellum, cortex, hypothalamus, liver, lung, muscle, and spleen. For chicken, an analysis of 15 DNase-seq libraries from all tissues (two replicates each, except hypothalamus with one replicate) shows that the identified tissue-specific DNase hypersensitivity (DHS) sites are associated with genes that relate to unique biological functions of the organs or tissues. Using the two replicates to construct a set of DHS sites that are present in each replicate, we found 29,190 sites in cerebellum, 43,672 in cortex, 52,337 in liver, 64,149 in lung, 27,433 in muscle, and 63,605 in spleen. In addition, 24 ChIP-seq datasets from all tissues (two replicates) except adipose and muscle in chickens and 12 ChIP-seq datasets from liver, spleen, and lung in pigs (H3K4me3 and H3K27me3 histone modification marks) were generated. Integrative analysis of DHS sites, ChIP-seq, and RNA-seq allows the identification of genome-wide active and inactive promoter regions in chickens, which enables an in-depth comparison of the regulatory landscapes of multiple tissues within these species. ATAC-seq in chicken tissues was optimized, and some promising results have been achieved.
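Building a set of DHS sites present in both replicates amounts to intersecting two peak lists. The sketch below shows one simple way to do that in Python. It assumes BED-style half-open intervals, assumes peaks within a single replicate do not overlap each other, and keeps only the shared segment of each overlapping pair; the report does not specify whether the shared segment, the merged region, or a minimum overlap was used, so these choices and the function names are illustrative.

```python
from collections import defaultdict


def _group_by_chrom(peaks):
    """Group (chrom, start, end) peaks by chromosome and sort them by position."""
    grouped = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start, end))
    for intervals in grouped.values():
        intervals.sort()
    return grouped


def consensus_sites(rep1, rep2):
    """Return the segments covered by a peak in both replicates.

    Uses a two-pointer sweep per chromosome, which is valid when the peaks
    inside each replicate are sorted and non-overlapping.
    """
    a_by_chrom = _group_by_chrom(rep1)
    b_by_chrom = _group_by_chrom(rep2)
    shared = []
    for chrom in sorted(a_by_chrom.keys() & b_by_chrom.keys()):
        a, b = a_by_chrom[chrom], b_by_chrom[chrom]
        i = j = 0
        while i < len(a) and j < len(b):
            start = max(a[i][0], b[j][0])
            end = min(a[i][1], b[j][1])
            if start < end:  # the two peaks overlap by at least one base
                shared.append((chrom, start, end))
            # Advance whichever peak ends first; it cannot overlap anything later.
            if a[i][1] < b[j][1]:
                i += 1
            else:
                j += 1
    return shared


# Toy peak lists with made-up coordinates.
replicate_1 = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 80)]
replicate_2 = [("chr1", 150, 350), ("chr3", 0, 10)]
sites = consensus_sites(replicate_1, replicate_2)
```

Counting the entries returned for each tissue would give per-tissue totals of the kind listed above, subject to the overlap rule chosen.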

Publications

  • Type: Journal Articles Status: Published Year Published: 2016 Citation: Tuggle, C., E. Giuffra, S. N. White, L. Clarke, H. Zhou, P. J. Ross, H. Acloque, J. M. Reecy, A. Archibald, R. R. Bellone, M. Boichard, A. Chamberlain, H. Cheng, R. P.M.A. Crooijmans, M. E. Delany, C. J. Finno, M. A. M. Groenen, B. Hayes, J. K. Lunney, J. L. Petersen, G. S. Plastow, C. J. Schmidt, J. Song, M. Watson. 2016. GO-FAANG meeting: a Gathering On Functional Annotation of ANimal Genomes Animal Genetics DOI: 10.1111/age.12466.
  • Type: Other Status: Published Year Published: 2016 Citation: Zhou, H. P. J. Ross, Kern, C. P. Saelao, Y. Wang, J. L. Chitwood, I. Korf, M. Delany, H. Cheng. 2016. Genome-wide Functional Annotation of Regulatory Elements in chickens. Pp:48-52. The Proceedings of XXV World's Poultry Congress, Beijing, China.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2016 Citation: Kern, C. P. Saelao, Y. Wang, M. Halstead, J. L. Chitwood, T. Kim, P. J. Ross, I. Korf, M. Delany, H. Cheng, Zhou, H. 2016. Identification of tissue-specific promoters in chickens. 35th Conference for the International Society of Animal Genetics. Salt Lake City, UT.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2016 Citation: M. Halstead, Kern, C. P. Saelao, Y. Wang, Zhou, H. P. J. Ross. 2016. Profiling of open chromatin in chicken tissues using ATAC-seq. 35th Conference for the International Society of Animal Genetics. Salt Lake City, UT.


Progress 01/01/15 to 12/31/15

Outputs
Target Audience: Animal genome scientists, livestock industry
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Nothing Reported
How have the results been disseminated to communities of interest? Research findings have been presented in both oral and poster formats.
What do you plan to do during the next reporting period to accomplish the goals? Continue work on the other three ChIP-seq assays for chicken, and work on DNase-seq and all ChIP-seq assays for pig and cattle. Integrative analysis of the 3 assays will be conducted.

Impacts
What was accomplished under these goals? The identification of regulatory elements is a key step in understanding how an organism's genotype determines the phenotype. The technologies and assays developed in the human and mouse ENCODE projects provide a solid foundation to functionally annotate the chicken, pig, and cattle genomes. These three livestock species, which are core to the world's food production, lack robust functional annotations that could be leveraged to improve the production efficiency of these industries. We will present the current progress in generating and analyzing data from these three important species, including analysis of forty-eight RNA-seq libraries (sixteen per species) collected from two biological replicates across eight tissues: adipose, cerebellum, cortex, hypothalamus, liver, lung, muscle, and spleen. Transcripts detected in these tissues show good coverage of the Ensembl gene sets for the three species, and an initial analysis has identified putative long non-coding RNAs, both tissue-specific and expressed across all tissues. For chicken, an analysis of 12 DNase-seq libraries from liver, lung, cerebellum, and spleen (two replicates) and hypothalamus, cortex, muscle, and adipose (one replicate) shows that the identified tissue-specific DNase hypersensitivity (DHS) sites are associated with genes that relate to unique biological functions of the organs or tissues. Twelve datasets for the H3K4me3 and H3K27me3 histone modifications in chickens were analyzed. Integrative analysis of DHS sites, ChIP-seq (H3K4me3 and H3K27me3 histone modification marks), and RNA-seq allows the identification of genome-wide active and inactive promoter regions in chickens, which enables an in-depth comparison of the regulatory landscapes of multiple tissues within these species.
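As an illustration of how DHS, ChIP-seq, and RNA-seq evidence might be combined to label promoters, here is a minimal rule-based Python sketch. It relies on the general convention that H3K4me3 marks active promoters and H3K27me3 marks repressed chromatin. The rules, the expression threshold, and every name in the code are assumptions made for this example; the report does not describe the actual integration procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromoterEvidence:
    """Per-promoter evidence for one tissue (hypothetical record layout)."""
    gene_id: str
    has_h3k4me3: bool   # H3K4me3 peak overlaps the promoter
    has_h3k27me3: bool  # H3K27me3 signal overlaps the promoter
    has_dhs: bool       # DNase hypersensitivity site overlaps the promoter
    expression: float   # RNA-seq expression in arbitrary normalized units


def classify_promoter(evidence, min_expression=1.0):
    """Label a promoter as active, inactive, or unresolved.

    min_expression is an arbitrary illustrative threshold, not a project value.
    """
    expressed = evidence.expression >= min_expression
    if (evidence.has_h3k4me3 and evidence.has_dhs
            and not evidence.has_h3k27me3 and expressed):
        return "active"
    if evidence.has_h3k27me3 and not evidence.has_h3k4me3 and not expressed:
        return "inactive"
    # Mixed or missing evidence is left unlabeled rather than forced into a class.
    return "unresolved"


# Toy records with made-up gene identifiers and values.
examples = [
    PromoterEvidence("geneA", True, False, True, 25.0),
    PromoterEvidence("geneB", False, True, False, 0.1),
    PromoterEvidence("geneC", True, True, True, 0.5),
]
labels = {ev.gene_id: classify_promoter(ev) for ev in examples}
```

Keeping an explicit "unresolved" class avoids overstating confidence where the marks disagree or a dataset is missing for a tissue.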

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2015 Citation: H. Zhou, P. Ross, C. Kern, P. Saelao, Y. Wang, M. Halstead, J. Chitwood, T. Kim, I. Korf, M. E. Delany, H. H. Cheng, J. F. Medrano, A. L. Van Eenennaam, C. K. Tuggle, C. Ernst. 2015. Genome wide identification and annotation of functional regulatory regions in livestock species. GO-FAANG Workshop, Washington D.C. October