Performing Department
(N/A)
Non Technical Summary
The leaves of Mitragyna speciosa (kratom) produce more than 50 monoterpene indole alkaloids (MIAs) and spirooxindole alkaloids associated with varied pharmaceutical uses. For example, the MIA mitragynine has shown promise as a potential treatment for pain, opioid use disorder, and opioid withdrawal without any demonstrated addiction potential, and the spirooxindole mitraphylline shows anti-tumor activity. The existing knowledge gap in the biosynthesis of bioactive M. speciosa alkaloids prevents systematic advances in engineered production of these high-value compounds, creating a bottleneck for pharmaceutical research and production. This proposal combines the expertise of an international interdisciplinary team to elucidate the biosynthesis of multiple bioactive alkaloids in Mitragyna species. A multi-faceted approach incorporating genomics, transcriptomics, and metabolomics will be applied to generate genomic resources for Mitragyna species, including genome and transcriptome assemblies and annotation, to enable pathway discovery of diverse bioactive alkaloids, and to elucidate the biosynthesis of selected pharmaceutical MIAs and spirooxindole alkaloids. The metabolic engineering and further biosynthetic understanding of Mitragyna alkaloids enabled by this project are expected to have a significant positive impact on the mitigation of the opioid crisis in the U.S. and around the world. In addition, the genomic resources generated in this project will be an invaluable resource for the scientific community in MIA and spirooxindole alkaloid research.
Animal Health Component
10%
Research Effort Categories
Basic
90%
Applied
10%
Developmental
(N/A)
Goals / Objectives
M. speciosa leaves produce more than 50 MIAs and spirooxindole alkaloids that have psychoactive effects and are associated with varied pharmaceutical uses. Most of the studied kratom alkaloids have shown prominent activity at central nervous system targets, with mitragynine receiving the most attention. Other kratom alkaloids are also being investigated for their medicinal properties. The existing knowledge gap in the biosynthesis of M. speciosa alkaloids prevents systematic advances in engineered production of these high-value compounds. A multi-faceted approach will be applied to accomplish the following three specific objectives in this project: Obj 1: Generate genomic resources for Mitragyna species leading to pathway discovery of diverse pharmaceutical alkaloids. Obj 2: Complete the mitragynine and speciogynine biosynthetic pathway in M. speciosa. Obj 3: Elucidate the biosynthetic pathway of the anticancer spirooxindole mitraphylline and related compounds in Mitragyna species.
Project Methods
We will assemble the genomes using Oxford Nanopore Technologies (ONT) sequences that we generated for the selected M. speciosa cv1 and cv3, M. parvifolia, and M. hirsuta chemotypes; generate Illumina short reads for error correction; generate transcriptome data from varying tissue types; annotate the genomes; identify structural variants across the four genomes; and perform synteny analyses across the Rubiaceae family and MIA-producing species with available genomes. To generate Illumina short reads for error correction, genomic DNA will be isolated from young leaves of M. speciosa cv1 and cv3, M. hirsuta, and M. parvifolia using a modified CTAB protocol. Library preparation will be performed using the Illumina TruSeq DNA Nano kit. Sequencing (paired-end, 150 bp) will be performed on an Illumina NovaSeq 6000 to achieve a minimum sequencing depth of 100X. The ONT long reads are being assembled using the Flye assembler (v2.9.2) for other genotypes. The resulting assemblies will be polished with Illumina short reads using ntEdit. Genome assembly completeness will be assessed using the Benchmarking Universal Single-Copy Orthologs (BUSCO) software with the Plantae BUSCO "Embryophyta_odb9" database. For transcriptome analysis, RNA will be extracted from diverse tissue types of M. speciosa cv1 and cv3, M. hirsuta, and M. parvifolia. mRNA-seq libraries will be constructed, sequenced on the Illumina HiSeq 4000, and analyzed. Equimolar amounts of cDNA from all tissues within each accession will be pooled for library preparation using the PacBio SMRTbell Express Template Prep Kit 2.0. Libraries will be sequenced using the Sequel System to generate up to 500,000 full-length, non-concatemer reads per single molecule real-time (SMRT) cell on a total of 4 SMRT cells. Iso-seq generates long-read RNA sequences, removes the need for assembly, and enables the unambiguous identification of transcript isoforms.
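As a rough planning aid for the 100X short-read depth target, the number of paired-end 150 bp read pairs can be estimated from genome size. The sketch below is illustrative only: the function name and the 1 Gb genome size are assumptions chosen for the example, not values from this project.

```python
import math


def read_pairs_for_depth(genome_size_bp, target_depth, read_length_bp=150):
    """Estimate the paired-end read pairs needed to reach a target depth.

    depth = (read pairs * 2 * read length) / genome size,
    so read pairs = depth * genome size / (2 * read length).
    """
    bases_needed = genome_size_bp * target_depth
    bases_per_pair = 2 * read_length_bp
    return math.ceil(bases_needed / bases_per_pair)


# Hypothetical genome size, for illustration only (not taken from the proposal).
ASSUMED_GENOME_SIZE_BP = 1_000_000_000

pairs = read_pairs_for_depth(ASSUMED_GENOME_SIZE_BP, target_depth=100)
print(f"{pairs:,} read pairs of 2 x 150 bp for 100X depth")
```

The estimate ignores reads lost to quality filtering, duplicates, and organellar DNA, so a real sequencing order would normally include a safety margin.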
Iso-seq reads will be processed for each genotype using IsoSeq3 to generate representative circular consensus sequences. Genome-guided hybrid-read transcript assemblies will be generated for each genotype with StringTie2, using both the Iso-seq and Illumina transcriptomes, for a highly resolved final transcriptome. The resulting transcriptome will be used to generate a predicted proteome using the MAKER-2 pipeline. The annotation and predicted proteome will be used to identify syntenic regions across Mitragyna (Rifat and the four genomes in this proposal) and other MIA-producing species using the MCScanX toolkit. Spearman correlation coefficients will be calculated for gene pairs (gene-gene correlation) using the R package GWENA (v1.10.0), a single pipeline for co-expression network analysis, functional enrichment of expressed modules, and phenotypic association. Phenotypic data for association will be a matrix of alkaloid quantities from the same tissue from which RNA was extracted. Key genes in MIA biosynthesis will be used for targeted orthology analysis. Further, the identified potential syntelogs (genes derived from the same lineage, as inferred from synteny) will be analyzed for abundance in each cultivar using the expression matrix. Co-expression analyses by hierarchical clustering and self-organizing maps will be performed to identify novel candidate genes. Phylogenetic analyses of putative genes will be carried out using homology searches and the maximum likelihood method.
We will identify and functionally validate the potential candidates by in vitro enzyme assays and transient expression in N. benthamiana to complete the pathway for mitragynine and speciogynine. Substrates will either be acquired from commercial sources or extracted/semi-synthesized and purified. In vitro assays will be used as an efficient first-pass, high-throughput screening method. The coding sequences of candidate genes will be cloned into the pOPINF vector for expression in E. coli or, for CYP450s specifically, into the pESC-leu2d vector for expression in S. cerevisiae, followed by protein purification. Each candidate protein will be incubated with each substrate library for a total of 6 replicates per timepoint per candidate. Metabolite profiling of each resulting reaction will be performed and interpreted using UPLC-MS/MS. These results will be corroborated by performing N. benthamiana infiltration assays. We will generate constructs for each candidate gene in a similar manner using the binary 3Ω1 vector via InFusion cloning, transform them into Agrobacterium tumefaciens strain GV3101, and infiltrate them into leaves of ten 3-4-week-old N. benthamiana plants. Three days after infiltration, the precursors tryptamine and secologanin will be infiltrated into the same leaves. Leaves will be snap frozen in liquid nitrogen 2 days after precursor infiltration and analyzed by UPLC-MS/MS.
We will combine cross-species phylogenetic analysis using OrthoFinder for candidate gene identification for spirooxindole alkaloids, followed by biochemical characterization in microbial systems, including E. coli and yeast, and in N. benthamiana, as previously performed. Transcriptome and genome sequence data of different M. speciosa chemotypes and species generated in Obj 1 will be correlated to the metabolite analyses for target identification. Co-expression analyses (as described in Obj 1, section 5.1.3) will be used to analyze sequencing data to determine which reductases, oxidases, and other genes display expression profiles consistent with a role in the biosynthesis of these alkaloids. High-throughput approaches in transcriptomics analysis such as unsupervised classification (e.g., hierarchical clustering), machine learning techniques (e.g., super- and self-organizing maps), OrthoFinder, and comparative co-expression network construction and visualization (CoExpNetViz) will be employed to efficiently identify gene candidates.
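The co-expression step described above can be illustrated with a minimal Spearman-correlation sketch in plain Python. It is not the GWENA or CoExpNetViz workflow: the gene names, expression values, and the 0.8 threshold are all hypothetical, and the code only shows the idea of ranking candidates by how closely their expression tracks a known pathway (bait) gene across tissues.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_candidates(expression, bait, min_rho=0.8):
    """Return (gene, rho) pairs co-expressed with the bait, best first."""
    bait_profile = expression[bait]
    hits = []
    for gene, profile in expression.items():
        if gene == bait:
            continue
        rho = spearman(profile, bait_profile)
        if rho >= min_rho:
            hits.append((gene, rho))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical expression values across five tissues (illustration only).
expression = {
    "bait_gene": [1, 5, 20, 40, 80],
    "candidate_A": [2, 3, 30, 50, 90],   # rises with the bait
    "candidate_B": [90, 50, 30, 3, 2],   # falls as the bait rises
    "candidate_C": [5, 5, 5, 5, 5],      # constant, uninformative
}

for gene, rho in rank_candidates(expression, "bait_gene"):
    print(gene, round(rho, 2))
```

The same `spearman` function could be applied between a gene's expression profile and a column of the alkaloid-quantity matrix, provided both are measured in the same tissues.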
Genomic regions adjacent to bait genes will also be analyzed to determine whether any parts of these alkaloid biosynthetic pathways are physically clustered on the genome. Gene candidates will be cloned and expressed in microbial hosts, including E. coli, S. cerevisiae, or Pichia pastoris, and in plant hosts such as N. benthamiana, using the established discovery platform. After heterologous expression, the enzyme candidates will be functionally characterized by in vitro biochemical assays. Pathway genes can be expressed in yeast either from plasmids or by integration into the yeast genome, and the starting substrate will then be fed to a whole-cell yeast culture. Activities of proteins of interest will be analyzed using LC coupled with triple-quadrupole mass spectrometry (LC/TQD-MS). Substrates and reference compounds for enzyme assays will be obtained from commercial sources, isolated from plant materials, and/or semi-synthesized from precursors.
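The search for physically clustered pathway genes around a bait gene can be sketched as a simple window query over gene coordinates. The example below is a hedged illustration: the `Gene` record, the 100 kb window, and all coordinates are assumptions made for the example, and a real analysis would read coordinates from the genome annotation (for example, a GFF3 file).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    chrom: str
    start: int
    end: int


def genes_near_bait(genes, bait_id, window_bp=100_000):
    """Return genes on the bait's chromosome that overlap the flanking window.

    The window extends window_bp upstream of the bait start and
    window_bp downstream of the bait end.
    """
    by_id = {g.gene_id: g for g in genes}
    bait = by_id[bait_id]
    lo, hi = bait.start - window_bp, bait.end + window_bp
    neighbors = [
        g for g in genes
        if g.gene_id != bait_id
        and g.chrom == bait.chrom
        and g.end >= lo
        and g.start <= hi
    ]
    return sorted(neighbors, key=lambda g: g.start)


# Hypothetical annotation records (illustration only).
annotation = [
    Gene("bait", "chr1", 500_000, 502_000),
    Gene("g1", "chr1", 450_000, 452_000),   # 48 kb upstream: inside window
    Gene("g2", "chr1", 700_000, 702_000),   # 198 kb downstream: outside
    Gene("g3", "chr2", 500_500, 501_000),   # different chromosome
]

for gene in genes_near_bait(annotation, "bait"):
    print(gene.gene_id, gene.chrom, gene.start, gene.end)
```

Neighbors returned this way could then be cross-checked against the co-expression results, since a gene that is both adjacent to and co-expressed with a bait is a stronger candidate than one meeting only a single criterion.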