Progress 05/01/23 to 04/30/24
Outputs Target Audience:
The target audience in year 1 of this grant is the Cannabis genomics community, including both academic research scientists and industry, as we work towards genomic data generation. Working closely with our vendor New West Genetics, a fiber hemp breeding and seed company, we have successfully engaged several scientists within New West Genetics, as well as within HudsonAlpha. A key goal of our proposal is to build an internship program that spans the non-profit HudsonAlpha and New West Genetics, specifically recruiting non-traditional and minority students into the Cannabis genomics and breeding industry. We were not able to recruit a summer intern in year 1 due to the May start date of the proposal, but we have successfully recruited TJ Singh, an undergraduate from Mississippi State University, to begin in summer 2024. We will compensate for our lack of year 1 interns by increasing the number of summer interns in years 2-3. Hannah Mueller, a technician in the Harkess Lab, was responsible for much of the wet-lab work for this proposal; she is a first-generation college student.
Changes/Problems:
We have not had any major changes to this proposal. However, given the falling costs of genome sequencing technologies (e.g. PacBio sequencing on the Revio, and Illumina sequencing on the NovaSeq X), we have been able to expand the scope of Objective 1 to include 10 more genomes through a collaboration with the Smart laboratory, without any change in budget.
What opportunities for training and professional development has the project provided?
PI Alex Harkess has presented this ongoing work at four invited university seminars in year 1: Iowa State University, Harvard University and the Arnold Arboretum, University of Georgia, and University of Alabama. Dr. Sarah Carey has presented some of this work at the Cannabis Genomics session at Plant and Animal Genome 2024.
Hannah Mueller, a technician on this project, has presented some of this work at the American Society of Plant Biologists (ASPB) 2023 meeting in Savannah, Georgia. Hannah has also gained extensive expertise in high molecular weight DNA isolation, RNA isolation, and PacBio and Omni-C library preparation through this project. In April 2024, Hannah left our laboratory to pursue her dream of Physical Therapy school, and she openly credits this laboratory experience with helping her find her path. We are very proud of the opportunities this grant provided to her. Dr. Sarah Carey, although funded by her own USDA NIFA postdoctoral fellowship, has taken ownership of much of the genome assembly for this project as it relates to her funded proposal. For instance, she has independently led the assembly of the five Objective 1 genomes, as they are also very useful for her NIFA postdoctoral fellowship on common hop sex chromosomes. Sarah has become a leader in the field of sex chromosome genomics and assembly. Dr. Julie Robinson was funded by this proposal as a postdoc, and she recently received a 2024 USDA NIFA postdoctoral fellowship on developing inducible sex determination systems in soybean. Again, we are very proud of the opportunities this grant has given the trainees in my laboratory. The training program described under "Broader Impacts" in this proposal recruited its first intern trainee this year; the internship itself will take place in 2024.
How have the results been disseminated to communities of interest?
The project has been disseminated in several ways in year 1, primarily through invited seminars and conference presentations. PI Alex Harkess has presented this ongoing work at four invited university seminars: Iowa State University, Harvard University and the Arnold Arboretum, University of Georgia, and University of Alabama. Dr. Sarah Carey has presented some of this work at the Cannabis Genomics session at Plant and Animal Genome 2024, at an invited seminar at Oregon State University, and at two internal HudsonAlpha research seminars. Hannah Mueller, a technician on this project, has presented some of this work at the American Society of Plant Biologists (ASPB) 2023 meeting in Savannah, Georgia. We are currently preparing several manuscripts with the pangenome data from Objectives 1 and 3, which will likely be submitted in year 2. We do not have any publications from year 1. Upon grant award, we published a press release (https://www.eurekalert.org/news-releases/991632), as did New West Genetics (https://newwestgenetics.com/2023/06/07/hudsonalpha-and-new-west-genetics-collaborate-on-usda-nifa-grant/).
What do you plan to do during the next reporting period to accomplish the goals?
In year 2 of this proposal, we are on track to complete several major goals of the project and are currently ahead of schedule.
First, we will aim to complete all genome assemblies for the 5 genotypes sequenced. Given that the raw data are already in hand, we anticipate no major delays in this goal. The major goal for year 2 is to polish the assemblies; once all of the RNA-seq libraries from Objective 2 have been sequenced, we will have all the material needed for gene annotation. We have expanded this aim somewhat to include more genomes, through a collaboration with Dr. Larry Smart's laboratory, where we are assisting in the data generation and assembly of an additional 10 genomes. In year 2, we will focus heavily on building the pangenome graph for Cannabis.
Second, as mentioned above, we are currently sequencing all of the RNA-seq libraries necessary to complete Objective 2. We will work towards releasing these data as a gene expression atlas drawn on a physical representation of a plant, using the ggplot2-based R package "gganatogram". This will be released as a publicly available module in year 2 on our lab website.
All differential expression analyses will be conducted in year 2, and we will overlap male-specific gene expression with X- and Y-linked genes to narrow in on putative sex determination genes.
Third, for Objective 3 we have generated all of the raw low-pass Illumina genotyping data necessary from 960 individuals to map dioecy and monoecy genes. We are currently using our pangenome graph from Objective 1 to map these low-pass data, and we expect to have narrowed down the candidate sex determination genes within the year 2 reporting period.
One challenge was our inability to recruit a summer intern given the May start date of the proposal, so our proposed internship program was delayed until year 2. We have recruited one international student from Mississippi State University to begin a 4-month internship in year 2.
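The planned overlap of male-specific expression with X- and Y-linked genes, described in the year 2 plans above, amounts to a filter followed by a set intersection. The following is only a minimal sketch of that idea under stated assumptions: the gene IDs, fold changes, adjusted p-values, and thresholds are all hypothetical placeholders, not project data, and the actual analysis may use different tools and cutoffs.

```python
# Illustrative sketch only: gene IDs, values, and thresholds below are
# hypothetical and are not results from this project.

def male_biased_genes(log2fc, padj, min_lfc=1.0, alpha=0.05):
    """Return genes significantly more expressed in males.

    log2fc: gene -> log2 fold change (male vs. female)
    padj:   gene -> adjusted p-value from a differential expression test
    """
    return {
        gene
        for gene, lfc in log2fc.items()
        if lfc >= min_lfc and padj.get(gene, 1.0) < alpha
    }


def candidate_sex_genes(male_biased, sex_linked):
    """Intersect male-biased genes with X- or Y-linked genes."""
    return sorted(male_biased & sex_linked)


# Hypothetical inputs for demonstration.
log2fc = {"geneA": 3.2, "geneB": 0.4, "geneC": 2.1, "geneD": -1.8}
padj = {"geneA": 0.001, "geneB": 0.60, "geneC": 0.20, "geneD": 0.01}
sex_linked = {"geneA", "geneC", "geneD"}

candidates = candidate_sex_genes(male_biased_genes(log2fc, padj), sex_linked)
print(candidates)
```

In this toy input only geneA passes both the fold-change and significance filters and is also sex-linked, so it is the sole candidate; geneC is sex-linked and male-biased in magnitude but fails the significance cutoff.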
Impacts What was accomplished under these goals?
In the first year of this project, we have met or exceeded our goals for data generation and analysis.
Objective 1: We have generated all raw data for 5 chromosome-scale PacBio genomes (3 XY males, 1 XX monoecious individual, and 1 XX female). This includes 60X coverage PacBio HiFi data and 60X coverage Dovetail Omni-C sequencing. These data are currently being assembled into chromosome-scale assemblies. Additionally, due to the falling prices of genome sequencing (in particular Illumina PE150 costs), we collaborated with Dr. Larry Smart's laboratory at USDA/Cornell to assist in Omni-C data generation to scaffold 10 additional Cannabis genomes. Combined, this dataset is poised to produce the largest Cannabis pangenome to date. We are working closely with Smart lab postdoc George Stack, who is leading these genome assemblies.
Objective 2: For the gene expression atlas to uncover sex determination genes on the X and Y, our vendor collaborators at New West Genetics have isolated 96 tissue types in replicate for RNA extraction. We have successfully isolated all RNA from these tissues, and the samples are currently in TruSeq stranded RNA-seq library preparation. These libraries are slated for sequencing in year 2 of this proposal (summer 2024).
Objective 3: Our vendor collaborators at New West Genetics have also provided excellent plant material for several populations of mutants related to sex variation. In year 1, we partnered with Josh Clevenger's laboratory at HudsonAlpha to generate 1X coverage low-pass Illumina genotyping of 960 Cannabis individuals from an F2 population segregating for sex phenotypes: specifically, monoecy, maleness, and femaleness. These data have all been generated, at an average of 1X coverage per haplotype. Using the pangenome graph from Objective 1, we are currently mapping sex variation to the X and Y sex chromosomes using the Clevenger Lab's Khufu genotyping pipeline. Preliminarily, we have hits on the X chromosome that match some known sex determination genes in the Rosales. We believe that this is the densest Cannabis genotyping dataset ever produced, and we are currently performing additional phenotyping on the New West Genetics germplasm that we sequenced in order to map additional traits (oil and flowering time in particular). Further, we are demonstrating that low-pass genotyping data can be mapped to pangenome graphs, which is not currently routine.
Publications