Source: UNIVERSITY OF CALIFORNIA, DAVIS submitted to
FUNCTIONAL ANNOTATION AND VALIDATION OF REGULATORY ELEMENTS IN THE CHICKEN GENOME
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
ACTIVE
Funding Source
Reporting Frequency
Annual
Accession No.
1021692
Grant No.
2020-67015-31175
Cumulative Award Amt.
$500,000.00
Proposal No.
2019-05590
Multistate No.
(N/A)
Project Start Date
May 1, 2020
Project End Date
Apr 30, 2025
Grant Year
2020
Program Code
[A1201]- Animal Health and Production and Animal Products: Animal Breeding, Genetics, and Genomics
Project Director
Zhou, H.
Recipient Organization
UNIVERSITY OF CALIFORNIA, DAVIS
410 MRAK HALL
DAVIS,CA 95616-8671
Performing Department
Animal Science
Non Technical Summary
The U.S. is one of the world's largest producers and exporters of poultry meat. The chicken genome assembly is at the stage where significant improvements in annotation would enable investigators to better utilize genomic information to improve production traits. This project aims to efficiently generate a resource of epigenomic data for annotation of functional elements in the chicken genome and to enhance our knowledge of the regulation of genes associated with economically important traits such as immune function, feed efficiency, egg production and quality, and disease resistance. The expected results will provide enhanced biological understanding of key genome elements that are potentially associated with important agronomic traits, which should translate into significant improvements for U.S. poultry producers and industry.
Animal Health Component
5%
Research Effort Categories
Basic
70%
Applied
5%
Developmental
25%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
304                    3210                              1080                      45%
304                    3220                              1080                      15%
304                    3210                              1040                      30%
304                    3220                              1040                      10%
Goals / Objectives
The advent of the FAANG Consortium signified the critical need for annotating the chicken genome. Specifically, regulatory elements have not been functionally characterized and, more importantly, the target genes of distal regulatory elements remain largely unknown, although identifying them is essential to understanding spatio-temporal transcriptional regulation. Building on our current efforts, we propose to extend annotation to other biologically important tissues and to address the critical gaps of validating functional enhancers and assigning them to their target genes. Our specific aims are: (1) Identify and annotate regulatory elements in chicken gut and reproductive tissues by integrating information from RNA-seq, ATAC-seq, and ChIP-seq; (2) Use lentiMPRA in a model cell line to functionally validate a set of enhancers; (3) Employ promoter capture Hi-C to identify target genes of enhancers.
Project Methods
We plan to employ a variety of high-throughput genomic and epigenomic assays, including RNA-seq, ATAC-seq, ChIP-seq for four histone modifications and one insulator element (CTCF), the lentiMPRA assay, and promoter capture Hi-C, to identify the target genes of enhancers and validate the functions of tissue-specific enhancers.

Progress 05/01/23 to 04/30/24

Outputs
Target Audience: Poultry scientific community; livestock and poultry industry.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Graduate students, undergraduate students, and postdoctoral scholars have received training in animal genetics, bioinformatics, and genomic analysis.
How have the results been disseminated to communities of interest? The research findings have been presented at a variety of national and international conferences with audiences from both academia and the livestock industry.
What do you plan to do during the next reporting period to accomplish the goals? Moving forward, we plan to scale up our MPRA to include 15,000 sequences, categorized into two sets.
Set 1: FAANG-predicted enhancers that are (a) active in multiple tissues, (b) active in immune-related tissues, or (c) tissue-specific. The lengths of the predicted enhancers range from 200 bp to 8,000 bp, and we will select one or multiple 269-bp regions from each putative enhancer as candidate sequences for testing. The selected 269-bp regions meet either of the following criteria: (a) they are near the center of the H3K4me1 and/or H3K27ac peaks, or (b) they contain motifs. In total, we will have 8,000 candidate sequences.
Set 2: Candidate variants that may result in allele-specific expression (ASE) in response to Marek's disease virus (MDV) infection in chickens. We will assess 3,000 candidate variants in strong enhancers whose putative target genes, retrieved from previous studies, show an ASE effect in response to MDV. We devised 6,000 candidate sequences, two alleles for each variant, carrying the variants in haplotype if a sequence contains multiple SNPs.
The construction of the MPRA validation system will enable genome-wide discovery and functional characterization of enhancers and regulatory variants in the chicken genome.
The resulting data will significantly expand the catalog of characterized chicken enhancers, providing a crucial knowledge base for systematic exploration of their roles in chickens. This knowledge will also aid in identifying causal variants associated with economically important traits in chickens.
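
As an illustration of the first selection criterion (a 269-bp region near the center of a histone-mark peak), the following is a minimal sketch only. The function name `centered_window`, the 0-based half-open coordinate convention, and the clamping at the chromosome start are assumptions made for this example and are not taken from the project's pipeline.

```python
def centered_window(peak_start: int, peak_end: int, width: int = 269) -> tuple[int, int]:
    """Return a 0-based, half-open [start, end) window of `width` bp
    centered on the midpoint of a peak (hypothetical helper)."""
    if width <= 0:
        raise ValueError("width must be positive")
    if peak_end <= peak_start:
        raise ValueError("peak_end must be greater than peak_start")
    center = (peak_start + peak_end) // 2
    # Clamp at 0 so a peak close to the chromosome start still yields `width` bp.
    start = max(0, center - width // 2)
    return start, start + width


# Example: a 1,000-bp peak spanning positions 1000-2000.
window = centered_window(1000, 2000)
print(window, window[1] - window[0])
```

A production version would also need to clip the window at the chromosome end and handle enhancers that contribute several regions; those steps are omitted here.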

Impacts
What was accomplished under these goals? Over the past year, we have made significant progress in developing the massively parallel reporter assay (MPRA) for the chicken genome, focusing on a subset of candidate sequences targeting variants associated with chicken gene expression and egg quality traits. We selected 340 variants that exhibited statistical association with both gene expression (eQTLs) and chicken egg quality traits (GWAS). From these, we devised 680 candidate sequences, each spanning 269 bp of the chicken genome, with a central variant and two alleles. Additionally, we included 190 positive control sequences, 100 negative control sequences, and 29 fragment sequences from 3 previously validated enhancers (P4HB, MIIP, GATA3), for a total of 999 oligos. Oligos were synthesized by Twist Bioscience as 299-nt sequences containing 269 nt of genomic context and 15 nt of adapter sequence on either end. Unique 20-bp barcodes were added by PCR along with an additional constant sequence for subsequent incorporation into a backbone vector by Gibson assembly. We have successfully constructed two plasmid libraries: (1) pMPRA:Δorf, which contains the synthesized oligos and unique barcodes; and (2) the transfection plasmid library pMPRA:eGFP/minP, which incorporates a minimal promoter and an enhanced green fluorescent protein (eGFP) fragment into pMPRA:Δorf. Each library was expanded by electroporation into E. coli with a transformation efficiency greater than 5 x 10^10 cfu/μg pUC19, ensuring the diversity and complexity of the libraries. Sequencing (Illumina MiSeq) was performed on pMPRA:Δorf to acquire barcode/oligo pairings. From the sequencing results, we identified 10.6 million unique barcodes, with each oligo coupled to an average of 10,608 barcodes. We tested various conditions for transfection of DF-1 cells, and the highest transfection efficiency (~40%) was obtained with Lipofectamine 3000 reagent at a lipid dose of 7.5 μL/well. 
Library pMPRA:eGFP/minP was independently transfected into DF-1 cells three times under this optimized condition. The minimal promoter alone proved insufficient to generate observable GFP expression in DF-1 cells. Following insertion of the MPRA library, however, we observed green fluorescence, indicating substantial enhancement of GFP expression by the introduced library. Total RNA was harvested 48 hours post-transfection, followed by DNA digestion, capture of the GFP transcripts, and cDNA synthesis. Sequencing (NovaSeq) was then performed on pMPRA:eGFP/minP to acquire a baseline representation of each oligo within the library, and on the cDNA library to determine the abundance of each barcode, which represents the enhancer activity of the corresponding candidate sequence. Our assay achieved a capture rate of 99.9% (998 out of 999) for tested oligos at a minimum depth of 20 reads. Each oligo was captured by an average of 2,838 unique barcodes per replicate during barcode sequencing, with an average total read count of 61,265 per replicate. Additionally, we observed high reproducibility in our assay, with an average correlation of 0.99 between experimental replicates. We found that 26% of tested variants (88 out of 340) had an effect on reporter expression for at least one of the two alleles, and we refer to these as 'active' sequences. Among these, 82% (72 out of 88) enhanced expression of the reporter. We then identified variants that exhibited differential expression between the reference and alternate alleles ("allelic skew"). Of the active sequences, only 7 showed allelic skew, all with modest expression differences between alleles (fold change ranged from 1.05 to 1.57).
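
The allelic-skew comparison described above can be illustrated with a small sketch. The report does not state how fold change or statistical significance was computed, so the functions below (`allelic_skew`, `fold_change`) and their definitions are assumptions for illustration: each allele's activity is taken to be a positive RNA/DNA ratio, skew is expressed as log2(alt/ref), and fold change is reported as a value of at least 1 regardless of direction.

```python
import math


def allelic_skew(ref_activity: float, alt_activity: float) -> float:
    """log2(alt / ref) for two allele activities (hypothetical definition).

    Positive values mean the alternate allele drives higher reporter expression.
    """
    if ref_activity <= 0 or alt_activity <= 0:
        raise ValueError("activities must be positive")
    return math.log2(alt_activity / ref_activity)


def fold_change(ref_activity: float, alt_activity: float) -> float:
    """Direction-independent fold change between alleles, always >= 1."""
    if ref_activity <= 0 or alt_activity <= 0:
        raise ValueError("activities must be positive")
    ratio = alt_activity / ref_activity
    return max(ratio, 1.0 / ratio)


# Illustrative values only; these are not measurements from the project.
print(allelic_skew(1.0, 2.0))
print(fold_change(2.0, 1.0))
```

In practice a skew call would also require a statistical test across replicates and barcodes; that step is not shown here.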

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2023 Citation: H. Zhou. 2023. Functional role of regulatory elements informs complex trait variation and domestication selection in farm animals. Midwest Section ASAS 2023
  • Type: Conference Papers and Presentations Status: Published Year Published: 2023 Citation: H. Zhou. Dissection of evolution of cis-regulatory elements and its application on genetic control of complex traits in farm animals. The Central Dogma of Phenomics symposium, Albuquerque, ASAS 2023 USA
  • Type: Journal Articles Status: Published Year Published: 2023 Citation: Pan Z, Y. Wang, M. Wang, Y. Wang, X. Zhu, S. Gu, C. Zhong, L. An, M. Shan, J. Damas, M. M. Halstead, D. Guan, N. Trakooljul, K. Wimmers, Y. Bi, S. Wu, M. E. Delany, X. Bai, H.H. Cheng, C. Sun, N. Yang, X. Hu, H. A. Lewin, L. Fang, H. Zhou. 2023. An atlas of regulatory elements in chicken: a resource for chicken genetics and genomics. Science Advances 9, eade1204 (2023). DOI: 10.1126/sciadv.ade1204.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: L. An, D. Guan, Y. Wang, D. Lubritz, P. Settar, K. Rowland, A. Wolc, H. Cheng, H. Zhou. 2024. Identification and functional validation of variants in chicken gene expression and egg quality traits. Plant & Animal Genome XXXI San Diego


Progress 05/01/22 to 04/30/23

Outputs
Target Audience: Poultry scientific community; livestock and poultry industry.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Graduate students, undergraduate students, and postdoctoral scholars have received training in animal genetics, bioinformatics, and genomic analysis.
How have the results been disseminated to communities of interest? The research findings have been presented at a variety of national and international conferences with audiences from both academia and the livestock industry.
What do you plan to do during the next reporting period to accomplish the goals? We will focus on performing MPRAs in chicken DF-1 cells. We will optimize the conditions of the critical steps for MPRAs (barcode addition, electroporation, plasmid maxi prep, transfection, maxi-scale RNA extraction, GFP capture, etc.) and test them on a smaller scale (~1,000 oligos) first, then apply them on a larger scale (~14,000 oligos). For each synthesized oligo, we will add multiple random barcodes by PCR and then clone the products into vectors. We will then perform Illumina sequencing to determine the association between each candidate enhancer and its barcodes. We will generate the MPRA plasmid pool by assembling a GFP amplicon into the vectors containing a candidate enhancer and barcode. We will then transfect the MPRA plasmid pool into DF-1 cells, harvest RNA and DNA from the transfected cells, amplify the barcodes, and perform Illumina sequencing of the barcodes. Finally, we will determine the enhancer activity of each candidate sequence by calculating the RNA/DNA ratio of its barcodes.

Impacts
What was accomplished under these goals? A comprehensive characterization of regulatory elements in the chicken genome across tissues will have substantial impacts on both fundamental and applied research. Here we systematically identified and characterized regulatory elements in the chicken genome by integrating 377 genome-wide sequencing datasets from 23 adult tissues. In total, we annotated 1.57 million regulatory elements, representing 15 distinct chromatin states, and predicted about 1.2 million enhancer-gene pairs and 7,662 super-enhancers. This functional annotation of the chicken genome should have wide utility in identifying regulatory elements accounting for gene regulation underlying domestication, selection, and complex trait regulation, which we explored. In short, this comprehensive atlas of regulatory elements provides the scientific community with a valuable resource for chicken genetics and genomics. It is essential to functionally validate the enhancers we have predicted. We selected a total of 3,501 predicted enhancers whose target genes exhibit relatively high expression in DF-1 cells. This includes 1,180 enhancers active in at least 5 tissue categories; 793 enhancers specifically active in spleen (149), bone marrow (159), bursa (457), or thymus (28); and 1,528 enhancers active in immune-related tissues but not tissue-specific. We chose a 269-bp region from each candidate enhancer and added 15-nt adaptors at both ends. The resulting 299-nt oligos will be synthesized along with 100 positive control sequences and 100 negative control sequences. Each oligo will be coupled to multiple barcodes through PCR to facilitate reproducible and quantitative measurements of regulatory activity. The vectors containing a minimal promoter, synthesized sequence, enhanced green fluorescent protein (eGFP), and associated barcode will be transfected into DF-1 cells. 
We will harvest RNA and DNA from the transfected cells, amplify barcodes, and perform sequencing for barcodes (Illumina NextSeq). We will then determine the enhancer activity of each candidate sequence by calculating the RNA/DNA ratio of its barcodes.
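
The RNA/DNA ratio described above can be sketched as follows. This is an illustrative calculation under stated assumptions, not the project's analysis pipeline: the function name `oligo_activity`, the pooling of counts across all barcodes of an oligo, the normalization by library totals, and the `min_dna` filter are all choices made for this example.

```python
from collections import defaultdict


def oligo_activity(barcode_to_oligo, rna_counts, dna_counts, min_dna=1):
    """Per-oligo activity as (RNA fraction) / (DNA fraction).

    barcode_to_oligo: dict mapping barcode -> oligo ID (from association sequencing)
    rna_counts, dna_counts: dicts mapping barcode -> read count
    min_dna: barcodes with fewer DNA reads than this are ignored (assumed filter)
    """
    rna_sum = defaultdict(int)
    dna_sum = defaultdict(int)
    for barcode, oligo in barcode_to_oligo.items():
        dna = dna_counts.get(barcode, 0)
        if dna < min_dna:
            continue  # barcode not reliably represented in the plasmid pool
        rna_sum[oligo] += rna_counts.get(barcode, 0)
        dna_sum[oligo] += dna

    rna_total = sum(rna_sum.values())
    dna_total = sum(dna_sum.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("no usable RNA or DNA counts")

    # Normalizing by library totals makes the ratio independent of sequencing depth.
    return {
        oligo: (rna_sum[oligo] / rna_total) / (dna_sum[oligo] / dna_total)
        for oligo in dna_sum
    }


# Toy counts for illustration only.
pairs = {"AAA": "enh1", "AAC": "enh1", "GGG": "enh2"}
rna = {"AAA": 30, "AAC": 30, "GGG": 20}
dna = {"AAA": 10, "AAC": 10, "GGG": 20}
print(oligo_activity(pairs, rna, dna))
```

Pooling barcodes before taking the ratio reduces the influence of sparsely sampled barcodes; computing a ratio per barcode and then taking a median is a common alternative.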

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2022 Citation: D. Guan, M. L. Fang, and H. Zhou. 2022. The Chicken Genotype-Tissue Expression (ChickenGTEx): a comprehensive atlas of genetic regulatory variants in chicken transcriptome. FAANG Workshop.


Progress 05/01/21 to 04/30/22

Outputs
Target Audience: Poultry scientific community; livestock and poultry industry.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Graduate students, undergraduate students, and postdoctoral scholars have received training in animal genetics, bioinformatics, and genomic analysis.
How have the results been disseminated to communities of interest? Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? Functional annotation of the chicken genome, especially the identification of regulatory elements, will play a significant role in identifying causative genetic variants associated with economically important traits. Using a bioinformatics pipeline that incorporates and analyzes Chromatin Immunoprecipitation Sequencing (ChIP-seq), Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq), and RNA-seq datasets, we extended our annotation efforts to seven gut-associated tissues from two male chickens used in the FAANG pilot project. ChromHMM was used to predict a total of 15 chromatin states and 1,197,186 regulatory elements (excluding Quiescent) spanning 15 tissues (including 8 core tissues analyzed previously), of which 398,050 were newly identified. These regulatory elements included 29,823 strong promoters, 104,560 active strong enhancers, and 73,582 repressors. This result provides a valuable resource for the farm animal genome community to further explore causative variants of complex traits. Active enhancer loci can be identified on the basis of certain epigenetic signatures, chromatin accessibility, and transcription factor and cofactor binding. Thousands of enhancers have been identified through the functional annotation of regulatory elements in the chicken genome by the Functional Annotation of Animal Genomes (FAANG) consortium. While this is an important first step, the FAANG-annotated enhancers may have low concordance with their bona fide functional activities; thus, experimental validation of enhancer activity is required. In this study, we aim to achieve high-throughput functional validation of enhancers in the chicken genome. We first validated two spleen-specific enhancers (Prolyl 4-Hydroxylase Subunit Beta, P4HB, and Migration and Invasion Inhibitory Protein, MIIP) using the standard luciferase reporter assay in the DF-1 chicken cell line. 
We are currently developing massively parallel reporter assays (MPRAs) in DF-1 cells to overcome the throughput limit of reporter gene assays. Our MPRA library involves the synthesis of ~2,000 putative enhancers, each 250 nt in length and each coupled separately to 15 unique 15-nt molecular barcodes. Our candidate enhancers comprise two sets: (1) predicted strong enhancers associated with highly expressed genes in DF-1 cells, and (2) variants in predicted strong enhancers of genes showing allele-specific expression related to genetic resistance to Marek's disease. Positive and negative control sets are also included. The development of this high-throughput enhancer validation system will enable the genome-wide discovery and functional characterization of enhancers in the chicken genome, which could provide a growing knowledge base for the systematic exploration of their role in chicken biology and disease susceptibility.
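
To make the barcode coupling concrete, the sketch below assigns 15 distinct random 15-nt barcodes to each enhancer and keeps them unique across the whole library. It is a simplified illustration under stated assumptions: the function name `assign_barcodes`, the use of uniformly random sequences, and the fixed seed are hypothetical, and real barcode designs typically also filter on GC content, homopolymer runs, and restriction sites.

```python
import random


def assign_barcodes(enhancer_ids, per_enhancer=15, length=15, seed=0):
    """Assign `per_enhancer` library-wide unique random barcodes to each enhancer.

    Hypothetical helper; no sequence-quality filters are applied.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    used = set()
    assignment = {}
    for enhancer_id in enhancer_ids:
        barcodes = []
        while len(barcodes) < per_enhancer:
            barcode = "".join(rng.choice("ACGT") for _ in range(length))
            if barcode not in used:  # reject duplicates across the library
                used.add(barcode)
                barcodes.append(barcode)
        assignment[enhancer_id] = barcodes
    return assignment


# Two placeholder enhancer IDs; the real library would hold ~2,000.
library = assign_barcodes(["enh_0001", "enh_0002"])
print(len(library["enh_0001"]), library["enh_0001"][0])
```

With 4^15 (about 1.07 billion) possible 15-nt sequences and roughly 30,000 barcodes needed for ~2,000 enhancers, collisions are rare, so the rejection loop terminates quickly.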

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2022 Citation: An L, Y. Wang, Z. Pan, D. Guan, H.H. Cheng, H. Zhou. 2022. Functionally Validate Enhancers in the Chicken Genome using MPRAs. FAANG Workshop.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2022 Citation: Z. Pan, Y. Wang, An L, D. Guan, H.H. Cheng, H. Zhou. 2022. A multi-tissue atlas of regulatory elements in the chicken genome. Annual poultry science meeting, San Antonio, TX


Progress 05/01/20 to 04/30/21

Outputs
Target Audience: Poultry scientific community; livestock and poultry industry.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Nothing Reported
How have the results been disseminated to communities of interest? Graduate students, undergraduate students, and postdoctoral scholars have received training in animal genetics, bioinformatics, and genomic analysis.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? For the chicken regulatory element annotation, we have generated 135 epigenomic datasets from 15 tissues, including thymus, trachea, bone marrow, kidney, heart, follicle, shell gland, testis, gizzard, jejunum, colon, ileum, duodenum, cecum, and proventriculus (2 biological replicates per tissue except follicle), from an F1 cross of line 6 by line 7, covering H3K4me3, H3K27ac, H3K4me1, H3K27me3, and ATAC-seq. RNA-seq libraries for all of the above tissues have been prepared and are currently being sequenced. In addition, to validate spleen-specific enhancers identified in the FAANG pilot project, a Renilla luciferase assay was developed and used to validate 2 enhancers in the DF-1 cell line.

Publications