Source: UNIVERSITY OF MISSOURI submitted to
IDENTIFICATION OF EXPRESSION QTL ASSOCIATED WITH FEED EFFICIENCY IN BEEF CATTLE
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
EXTENDED
Funding Source
Reporting Frequency
Annual
Accession No.
1021808
Grant No.
2020-67015-30829
Project No.
MO00067855
Proposal No.
2019-05998
Multistate No.
(N/A)
Program Code
A1231
Project Start Date
Jun 1, 2020
Project End Date
May 31, 2024
Grant Year
2020
Project Director
Schnabel, R. D.
Recipient Organization
UNIVERSITY OF MISSOURI
(N/A)
COLUMBIA,MO 65211
Performing Department
Animal Science
Non Technical Summary
The U.S. has the fourth-largest cattle population in the world by head count, yet it is the largest beef producer. There is a significant amount of genetic variation between animals in their ability to convert inputs (feed) into final product (growth); for example, some animals grow well while eating very little, while others grow poorly while eating a lot. The cost of feed represents up to 75% of the direct cost associated with beef production. For the U.S. to continue to lead in beef production and to increase the profitability of beef production, production efficiency must be optimized. Our overall objective is to identify DNA variants influencing gene expression linked to variation in the feed efficiency of cattle, and to include these variants on commercially available assays so that producers can more accurately select the most efficient animals. To achieve this, we will profile gene expression in three relevant tissues from a large number of animals selected for extremes of feed efficiency. We expect the products of this research to enhance our understanding of the genetic and genomic mechanisms underlying the complex trait of feed efficiency and to serve as a model for future work on related traits. A better understanding of the biological mechanisms responsible for quantitative variation will help create the "map" that can be used by many scientific disciplines, from nutritionists, physiologists and geneticists to genome engineers, enabling the scientific community to help accomplish the grand challenge of advancing our nation's ability to achieve global food security and fight hunger.
Animal Health Component
0%
Research Effort Categories
Basic
90%
Applied
10%
Developmental
0%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
303                    3310                              1080                      50%
304                    3310                              1080                      50%
Goals / Objectives
Our overall objective is to identify putative causal variants influencing gene expression and transcript utilization linked to phenotypic variation of feed efficiency of cattle. The objectives of this proposal are to:
1) Transcriptome profile 107 hypothalamus, 107 small intestine and 150 liver tissues from animals selected to have extreme residual feed intake phenotypes.
2) Generate variant calls from the transcriptome data to integrate with SNP chip genotype data to better represent observed variation in the transcriptome profiled samples.
3) Perform alternative splicing analysis and generate intron/exon counts for all genes in the genome to allow eQTL analyses.
4) Impute SNP chip genotypes on 11,000 animals with measurements of feed efficiency to 850k and whole genome sequence to enable GWAS for feed efficiency related traits.
5) Perform eQTL analysis on the 107/150 transcriptomes using imputed sequence genotypes to identify variants associated with gene expression, splicing and exon usage.
6) Integrate eQTL and GWAS results to identify variants most likely to be biologically causal and predictive of feed efficiency related phenotypes for inclusion on future generations of commercially available genotyping assays.
Project Methods
Data generation: Objective 1

RNA-seq: We define a transcriptome as a single tissue for a single animal. We will collect a total of 253 transcriptomes to augment the 111 transcriptomes already generated and described in the preliminary data. We have identified three tissues for which we will generate stranded RNA-seq libraries, and we shall pool 85 libraries in equimolar amounts to be sequenced on the Illumina NovaSeq S4 flowcell using paired-end 100 bp (2x100 bp) reagents. On average, this will yield 30M fragments (60M reads) per library. This work will generate 107 transcriptomes each for hypothalamus and small intestine and 150 transcriptomes for liver. Using the large number of available transcriptomes, we shall investigate the alternate isoforms produced within and across tissues.

We will perform an eQTL analysis and a transcript splice variant GWAS for each tissue. Expression profiles and intron/exon counts for each sample or gene, respectively, will be used with the imputed WGS SNP variation to identify variants that are associated with gene expression, alternate transcript abundance or intron/exon usage. Initial expression analysis will focus on gene-level expression data for genes annotated by Ensembl and will use the new ARS-UCD1.2 reference genome. Additional analyses will be performed at the isoform and individual intron/exon level.

Imputation: We will impute the chip genotypes of all 11,000 animals to the level of genome sequence using our established imputation pipeline and the 1000 Bulls data or our own internal reference data.

GWAS: While imputed genotypes for tens of millions of variants across 11,000 samples by their very nature contain enormous information content, they also present significant computational challenges. First, because this dataset represents animals from multiple purebred and crossbred populations, one must account for breed composition and background population structure to avoid false positive associations. Our breed composition pipeline CRUMBLER will provide estimates of breed composition for all animals; these can be used either to partition the animals a priori into homogeneous groups for separate analyses or as covariates in the analysis. Second, due to the immense size of the data and the need to account for population structure and relatedness to control false positives, the signal for true positives also tends to be reduced. While we have described several software packages by name, we will not restrict our analyses to those described and will use the tools that are best suited to and capable of analyzing the data generated.

Integration: The primary analysis phase produces the "catalog" of data needed for integration. The integration phase will draw on all of the analyses and will combine the data produced with existing data from the PDs. Both GWAS and eQTL analyses can produce potentially thousands of significant results for each type of analysis. Here, we refer to a significant result as a single variant that surpasses a genome-wide threshold based on the appropriate analysis-specific method for controlling for multiple testing. We seek to answer the basic question: which GWAS variants are also eQTL variants? Importantly, we do not expect all GWAS variants to be eQTL variants, or vice versa. Because of the power of integrating GWAS and eQTL results, many new software tools are currently being developed for this purpose. Because this is such a new field and methods are being developed rapidly, the scientific community has not yet reached a consensus as to the "best" tool(s) for the job. As new tools are developed, we will assess their applicability to the data from this project and use whichever tools enable the best biological insight.
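The basic integration question (which GWAS variants are also eQTL variants?) can be illustrated with a minimal sketch. It is not the project's integration method, which the text leaves open. The sketch assumes a Bonferroni correction as the genome-wide threshold, and the function names, variant IDs and p-values are hypothetical toy values chosen only for illustration.

```python
from typing import Dict, Set


def significant_variants(pvalues: Dict[str, float], n_tests: int,
                         alpha: float = 0.05) -> Set[str]:
    """Return variant IDs whose p-value is below alpha / n_tests.

    Bonferroni is an assumption here; each analysis may use its own
    method of controlling for multiple testing.
    """
    threshold = alpha / n_tests
    return {vid for vid, p in pvalues.items() if p < threshold}


def shared_variants(gwas: Dict[str, float], eqtl: Dict[str, float],
                    n_gwas_tests: int, n_eqtl_tests: int) -> Set[str]:
    """Variants that are significant in both the GWAS and the eQTL analysis."""
    return (significant_variants(gwas, n_gwas_tests)
            & significant_variants(eqtl, n_eqtl_tests))


# Toy data: variant ID -> p-value (hypothetical, not project results).
gwas_p = {"1:1000": 1e-9, "1:2000": 0.2, "2:500": 1e-8}
eqtl_p = {"1:1000": 1e-10, "2:500": 0.01, "3:42": 1e-12}

overlap = shared_variants(gwas_p, eqtl_p,
                          n_gwas_tests=1_000_000, n_eqtl_tests=1_000_000)
print(sorted(overlap))
```

In this toy example only one variant passes both thresholds, which mirrors the expectation that not all GWAS variants are eQTL variants and vice versa.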

Progress 06/01/22 to 05/31/23

Outputs
Target Audience: Nothing Reported
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Two graduate students have received training in bioinformatics methods for processing genomic data.
How have the results been disseminated to communities of interest? On 10/19/2022, PD Schnabel presented a talk titled "Basic animal genetics and managing variants" to the American International Charolais Association Breed Improvement Committee meeting at the Kansas City Airport Marriott. On 1/06/2023, PD Schnabel presented a talk titled "The basic biology and technology behind the genomic EPD" at the 55th Annual Missouri Cattle Industry Convention and Trade Show.
What do you plan to do during the next reporting period to accomplish the goals? All analyses will be concluded in the final year of the no-cost extension.

Impacts
What was accomplished under these goals?
2) Variant calling from the generated transcriptomes was completed.
3) The full analysis pipeline for bulk RNA-seq was finalized and tested. This includes WASP filtering during sequence processing, calculation of the various molecular phenotypes (gene expression, exon expression, exon inclusion ratio and intron exclusion ratio) and final analysis with GCTA-MLMA, including the GRM and PEER factors as covariates. All of this has been automated for speed and reproducibility.
4) A final WGS phasing reference panel was developed that includes 6,137 genomes at approximately 211 million sites. The variants were phased using SHAPEIT5 after extensive optimization of parameters. RNA-seq and/or SNP-chip based genotypes from assayed samples will be imputed to this reference for use in eQTL analysis.
5) Nothing to report.
6) Nothing to report.
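As a rough illustration of the final analysis step, the sketch below assembles a GCTA mixed-linear-model association (MLMA) command line for one molecular phenotype, supplying a GRM and a file of quantitative covariates such as PEER factors. It is not the project's automated pipeline. The executable name `gcta64`, the helper name and all file prefixes are assumptions, and the command is only built, not run.

```python
from typing import List


def build_mlma_command(bfile: str, grm: str, pheno: str, qcovar: str,
                       out: str, mpheno: int = 1,
                       threads: int = 4) -> List[str]:
    """Build an argument list for a GCTA MLMA run.

    bfile  : PLINK binary genotype prefix (hypothetical path)
    grm    : genomic relationship matrix prefix (hypothetical path)
    pheno  : phenotype file, e.g. one column per gene (hypothetical path)
    qcovar : quantitative covariates file, e.g. PEER factors (hypothetical path)
    mpheno : which phenotype column to analyze
    """
    return [
        "gcta64", "--mlma",          # executable name is an assumption
        "--bfile", bfile,
        "--grm", grm,
        "--pheno", pheno,
        "--mpheno", str(mpheno),
        "--qcovar", qcovar,
        "--out", out,
        "--thread-num", str(threads),
    ]


# Hypothetical file prefixes, for illustration only.
cmd = build_mlma_command(
    bfile="liver_imputed", grm="liver_grm",
    pheno="liver_gene_expression.phen", qcovar="liver_peer_factors.qcovar",
    out="liver_gene1_mlma", mpheno=1, threads=8,
)
print(" ".join(cmd))
```

Building the command as a list keeps each argument separate, so it could be passed to `subprocess.run` without shell quoting issues once the input files exist.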

Publications


    Progress 06/01/21 to 05/31/22

    Outputs
    Target Audience: Nothing Reported
    Changes/Problems: The main issue has revolved around the COVID-19 pandemic and the ability to recruit a Ph.D. student to work on the project.
    What opportunities for training and professional development has the project provided? Two graduate students have received training in bioinformatics methods for processing genomic data.
    How have the results been disseminated to communities of interest? Nothing Reported
    What do you plan to do during the next reporting period to accomplish the goals? With the addition of a Ph.D. student devoted to the project, we expect to complete the objectives in the final year.

    Impacts
    What was accomplished under these goals?
    1) Transcriptome data generation was completed in 2021. We leveraged NRSP8 cattle coordinator funding to produce single-nuclei RNA-seq (snRNA-seq) data for liver from three high-efficiency and three low-efficiency samples.
    2) All transcriptome data from the Sequence Read Archive (SRA) were downloaded to augment the eQTL analysis and variant calling (N>8000).
    3) A Ph.D. student devoted to the project started in August 2021. He has evaluated WASP filtering within the STAR alignment software and is building the pipeline to perform transcriptome counts and alternative splicing analysis.
    4) Progress was made in developing the pipeline for imputation to genome sequence. Variant calls have been made using over 5000 genomes to produce the haplotype reference for WGS imputation. The SNP chip imputation reference panel was improved by additional quality control to exclude misplaced markers and correct some marker locations.
    5) Nothing to report.
    6) Nothing to report.
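    A minimal sketch of what WASP filtering of STAR output can look like downstream of alignment. It assumes STAR was run in its WASP mode so that alignments overlapping variants carry a `vW` SAM tag, with `vW:i:1` marking alignments that passed and reads with no `vW` tag not overlapping any variant. That reading of the tag should be checked against the STAR manual for the version in use. The function names and the toy SAM records are hypothetical.

```python
from typing import Iterable, Iterator, Optional


def wasp_tag(sam_line: str) -> Optional[int]:
    """Return the integer value of the vW tag, or None if the tag is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional tags follow the 11 mandatory columns
        if field.startswith("vW:i:"):
            return int(field[len("vW:i:"):])
    return None


def keep_wasp_passing(sam_lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines and alignments that passed or did not need WASP."""
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line, always kept
            yield line
            continue
        tag = wasp_tag(line)
        if tag is None or tag == 1:
            yield line


# Toy SAM records (hypothetical reads, for illustration only).
core = "\t0\t1\t100\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNH:i:1"
records = [
    "@HD\tVN:1.6",
    "r1" + core + "\tvW:i:1",  # overlaps a variant, passed
    "r2" + core + "\tvW:i:4",  # overlaps a variant, failed
    "r3" + core,               # no vW tag, no variant overlap
]
kept = list(keep_wasp_passing(records))
print([line.split("\t")[0] for line in kept])
```

    In practice the same filter would usually be applied to BAM files with a dedicated tool. The plain-text version here only shows the logic.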

    Publications

    • Type: Journal Articles Status: Accepted Year Published: 2022 Citation: Qanbari S, Schnabel RD and Wittenburg D. Evidence of Rare Misassemblies in the Bovine Reference Genome Revealed by Population Genetic Metrics. Animal Genetics. 2022;00:1-8. https://doi.org/10.1111/age.13205


    Progress 06/01/20 to 05/31/21

    Outputs
    Target Audience: Nothing Reported
    Changes/Problems: Nothing Reported
    What opportunities for training and professional development has the project provided? Most conferences were either moved online or cancelled due to COVID-19.
    How have the results been disseminated to communities of interest? Nothing Reported
    What do you plan to do during the next reporting period to accomplish the goals? Objectives 1-4 are scheduled to be completed in year 2 of the project.

    Impacts
    What was accomplished under these goals? All of this work was significantly delayed by the COVID-19 pandemic, initially because of campus closures and restrictions and later because of reagent acquisition delays.
    1) Transcriptome profile 107 hypothalamus, 107 small intestine and 150 liver tissues from animals selected to have extreme residual feed intake phenotypes. We identified all of the samples needed for this project and retrieved them from our freezer system. Testing of RNA extraction procedures was done in 2020. Production RNA extractions were initiated in January 2021, and data generation for the first 100 samples was completed on 4/14/2021. The remaining 250 RNA extractions were completed in May 2021 and submitted for data generation.
    4) Impute SNP chip genotypes on 11,000 animals with measurements of feed efficiency to 850k and whole genome sequence to enable GWAS for feed efficiency related traits. Significant progress was made on building a pipeline to efficiently impute SNP-chip genotypes. We began exploring different approaches to increase imputation accuracy at both the SNP-chip and sequence levels.

    Publications

    • Type: Journal Articles Status: Published Year Published: 2020 Citation: Bickhart DM, McClure JC, Schnabel RD, Rosen BD, Medrano JF, Smith TPL. Advances in sequencing technology herald a new frontier in cattle genomics and genome-enabled selection. 2020. Journal of Dairy Science 103:6, 5278-5290. https://doi.org/10.3168/jds.2019-17693
    • Type: Journal Articles Status: Published Year Published: 2020 Citation: Triant DA, Le Tourneau JJ, Unni DR, Diesh CM, Shamimuzzaman M, Walsh AT, Gardiner J, Goldkamp A, Li Y, Nguyen H, Roberts C, Zhao Z, Alexander LJ, Decker JE, Schnabel RD, Schroeder SG, Sonstegard TS, Taylor JF, Rivera RM, Hagen DE, Elsik CG. Using Online Tools at the Bovine Genome Database to Manually Annotate Genes in the New Reference Genome. Anim Genet. 2020;51(5):675-682. https://doi.org/10.1111/age.12962
    • Type: Journal Articles Status: Published Year Published: 2020 Citation: Silva DBS, Fonseca LFS, Pinheiro DG, Magalhães AFB, Muniz MMM, Ferro JA, Baldi F, Chardulo LAL, Schnabel RD, Taylor JF, Albuquerque LG. Spliced genes in muscle from Nelore Cattle and their association with carcass and meat quality. 2020. Sci Rep 10, 14701.
    • Type: Journal Articles Status: Published Year Published: 2021 Citation: Wang X, Ju Z, Jiang Q, Zhong J, Liu C, Wang J, Hoff JL, Schnabel RD, Zhao H, Gao Y, Liu W, Wang L, Gao Y, Yang C, Hou M, Huang N, Regitano LCA, Porto-Neto LR, Decker JE, Taylor JF, Huang J. Introgression, admixture, and selection facilitate genetic adaptation to high-altitude environments in cattle. 2021. Genomics. https://doi.org/10.1016/j.ygeno.2021.03.023