Performing Department
(N/A)
Non Technical Summary
Fruit photosynthesis contributes to yield and fruit nutrient quality, as demonstrated in a wide range of fleshy fruits, ears of cereals, oil seeds, and legumes. Tomato is one of the world's most consumed vegetables, with an economic value of $1.78 billion in the US in 2022 (USDA-NASS). Tomato serves as a great model to study fleshy fruit development, including chloroplast development and chlorophyll accumulation in fruits. Although leaves are the primary photosynthetic organs, up to 20% of the carbon in tomato fruit is derived from fruit photosynthesis itself, impacting yield, nutrient content, flavor, and overall fruit quality. This has been validated in the uniform ripening (u) mutation in tomato, where the transcription factor (TF) GOLDEN 2-LIKE (SlGLK2) is disabled. SlGLK2 determines chloroplast development and chlorophyll accumulation in a developing fruit, thereby contributing to improved fruit quality. The selection of 'u' in modern tomato varieties inadvertently compromised ripe fruit quality for desirable production traits. PD's research identified another TF in tomato, TKN4, a Class I KNOTTED-like homeobox (KNOX) gene that also influences chloroplast development and chlorophyll accumulation in fruit through regulation of SlGLK2 expression. The identification of this KNOX regulatory module highlights an entirely novel role for KNOX TFs in fruit development, which needs to be deciphered. By utilizing a tomato diversity panel, we will identify genome-wide loci underlying fruit chlorophyll accumulation by genome-wide association studies (GWAS). We will perform targeted transcriptome analyses of knox mutants and integrate them with GWAS results to construct gene regulatory networks governing fruit chlorophyll accumulation. We will identify genome-wide binding sites for KNOX and other key TFs involved in fruit chloroplast development.
The molecular basis of chloroplast development and chlorophyll accumulation in fruits identified through this project can potentially translate into crop varieties with improved yield and fruit quality through breeding or gene editing. In the long run, progress in this area has the potential to enable favorable engineering of photosynthesis, contributing to the yield and quality of food crops.
Animal Health Component
15%
Research Effort Categories
Basic
85%
Applied
15%
Developmental
0%
Goals / Objectives
Fruit development is a highly regulated process involving a series of physiological and biochemical processes. One such critical process is fruit photosynthesis. Prior to ripening, the majority of fruits are green photosynthetic organs. Although fruits are regarded as photosynthate sinks, fruit photosynthesis contributes to yield and fruit quality, as demonstrated in a wide variety of fleshy fruits, pods of legumes and oil seeds, and ears of cereals.
Chloroplasts are key organelles in the energy biology of plants that serve as sites of photosynthesis. In addition, chloroplasts harbor a multitude of interacting metabolic pathways, including biosynthesis of amino acids, fatty acids, vitamins, nucleotides, plant hormones, and a range of specialized metabolites. In a developing fruit, chloroplasts transform into chromoplasts for carotenoid formation. There is a well-defined positive correlation between the chloroplast number and chlorophyll content of the developing fruit and the nutritive value of the ripe fruit. Furthermore, gradients in chlorophyll content and chloroplast number are observed in fleshy fruits, decreasing from the stem end to the base of the fruit, as shown in several Solanaceae fruits.
Tomato is one of the world's most consumed vegetables, with an economic value of $1.78 billion in the US in 2022 (USDA-NASS). Tomato serves as a great model to study fleshy fruit development, including chloroplast development in fruits. In tomato, fruit photosynthesis contributes up to 20% of the total fruit carbohydrate. Interestingly, some tomato varieties exhibit fruit chlorophyll gradients and others do not.
Research from our lab and others has identified multiple genes, including transcription factors (TFs), that function in chloroplast development and chlorophyll gradient formation in tomato fruit.
The overall goal of this research is to dissect the molecular basis of fruit-specific chloroplast development and chlorophyll gradient formation by leveraging the tremendous resources available for tomato, including a diverse collection of sequenced accessions, extensive phenotypic variation in fruit chlorophyll content, and a range of chlorophyll gradient phenotypes. The specific objectives of this proposal are:
Obj 1: Perform genome-wide association studies (GWAS) to identify loci underlying fruit chlorophyll content and chlorophyll gradient formation by utilizing a tomato diversity panel.
Obj 2: Identify novel structural and regulatory sequences and define a gene regulatory network (GRN) for fruit chloroplast development. This will be achieved by profiling gene expression in fruit from tomato accessions and mutants that differ in chlorophyll accumulation, integrating the results with GWAS, and further validating the GRN by identifying genome-wide TF binding sites for important TFs.
Key deliverables of this project are the identification of novel structural and regulatory sequences governing fruit-specific chloroplast development and their functional relationships in a regulatory circuit. This research will facilitate strategies to engineer photosynthesis and tailor chloroplast-derived metabolites to increase the productivity and nutrient value of fleshy fruits.
Project Methods
We will perform GWAS utilizing a diverse germplasm collection of 561 accessions with enormous diversity in fruit chlorophyll phenotypes. This set includes 398 modern, heirloom, and wild accessions of tomato (Solanum lycopersicum) and its closest relative, S. pimpinellifolium, and 163 accessions from the Varitome Project representing geographically, morphologically, and genetically diverse tomatoes. We will identify genomic loci for the fruit chlorophyll phenotypes using Efficient Mixed-Model Association eXpedited (EMMAX). The kinship matrix derived from the markers will be used as the variance-covariance matrix for the random effect, and the first ten principal components will be included as fixed effects to control for population structure. Genome-wide significance will be assessed using a Bonferroni correction for multiple testing (uniform threshold, P = 1/n), where n, the effective number of independent SNPs, will be calculated using the Genetic type 1 Error Calculator (GEC). The Haploview software will be used to calculate linkage disequilibrium (LD) with the following parameters: -maxdistance 2000 -minMAF 0.05 -hwcutoff 0. The raw genomic data have been re-aligned and SNPs called for the entire 561-member diversity panel using the Genome Analysis Toolkit (GATK). We possess high-quality SNPs, indels, and structural variants (SVs) and an established computational framework to achieve Obj 1.
For gene expression analyses, total RNA will be isolated from the shoulder, middle, and base sections of 21-DPA fruit pericarp from the knox mutants Cu and ug, the slglk2 mutant u, and wild-type (WT) Ailsa Craig (AC). In addition, two genotypes each from the upper and lower extreme tails of the distribution, selected based on GWAS phenotyping for fruit chlorophyll, will be included to capture gene expression data from large-effect phenotypic variants. 
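The sampling design described above can be sketched in a few lines of Python. This is only an illustration: the accession names and chlorophyll scores are hypothetical placeholders rather than project data, and the helper `extreme_tails` is an assumed name, not part of any pipeline named in this summary. It picks two genotypes from each tail of a phenotype distribution, adds the four named genotypes, and enumerates the genotype x fruit section x replicate combinations.

```python
from itertools import product

# Hypothetical phenotype table: accession -> mean fruit chlorophyll score.
# Names and values are illustrative placeholders, not project data.
phenotypes = {
    "ACC_A": 12.1, "ACC_B": 48.7, "ACC_C": 30.2,
    "ACC_D": 5.4, "ACC_E": 55.0, "ACC_F": 27.9,
}

def extreme_tails(scores, n_per_tail=2):
    """Return the n lowest- and n highest-scoring accessions (ascending order)."""
    ranked = sorted(scores, key=scores.get)
    return ranked[:n_per_tail], ranked[-n_per_tail:]

low, high = extreme_tails(phenotypes)

# Genotypes named in the text: knox mutants Cu and ug, slglk2 mutant u, WT AC.
fixed_genotypes = ["Cu", "ug", "u", "AC"]
genotypes = fixed_genotypes + low + high

sections = ["shoulder", "middle", "base"]
replicates = [1, 2, 3]

# One RNA-seq library per genotype x fruit section x biological replicate.
libraries = list(product(genotypes, sections, replicates))
print(len(genotypes), len(libraries))
```

With four named genotypes plus two accessions per tail, the enumeration yields 8 genotypes and 72 libraries.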
In total, 72 Illumina libraries (8 genotypes x 3 fruit sections x 3 biological replicates) will be sequenced on a NovaSeq 6000 to generate 80 million 150-bp paired-end reads per sample. All sequencing data will be stored and processed on UF's HiPerGator 3.0 computing cluster. Automated data processing scripts will be used to trim the mRNA-seq reads (Trimmomatic) and align them to the tomato Heinz 1706 reference genome with iTAG4.1 gene models using HISAT. Raw counts for each tomato gene will be normalized to fragments per kilobase of transcript per million mapped reads (FPKM). Pairwise Pearson's correlation coefficients between biological replicates will be calculated from log2-transformed FPKM values using the cor function in R. In addition, principal component analysis (PCA) will be performed among fruit sections using the prcomp function in R to confirm that biological replicates cluster tightly, indicating high reproducibility of our data. Differentially expressed genes (DEGs; log fold change of ±2 and a p-value < 0.05 adjusted with a false discovery rate (FDR) < 0.001) between samples, including the shoulder, middle, and base portions of the fruit, mutants vs WT, and combinations thereof, will be determined using the DESeq2 package in R. No deviation from established protocols is necessary to achieve this objective.
DNA affinity purification sequencing (DAP-seq), a high-throughput sequencing-based in vitro TF-DNA interaction assay, identifies binding motifs and target genes of TFs. We will use DAP-seq to identify genome-wide TF binding sites (TFBS) for the KNOX TFs (TKN4, TKN2) and for select TFs chosen based on the GWAS and gene expression analyses. Briefly, TF cDNAs will be amplified with the Phusion High-Fidelity PCR kit, cloned into the pENTR-D/TOPO vector, and then transferred into the pIX-Halo vector, which contains an N-terminal HaloTag affinity sequence [82]. The pIX-Halo-TF recombinant proteins will be produced in vitro using the TnT SP6 High-Yield Wheat Germ Protein Expression System (Promega) and affinity purified. 
Total genomic DNA will be extracted from seedlings of WT AC using the commonly used CTAB method. Illumina TruSeq genomic DNA libraries will be constructed, and the recombinant proteins will be incubated with the genomic DNA library as described previously. Bound DNA molecules will be recovered, PCR amplified, and sequenced on the Illumina platform, generating 100-nt reads; the HALO vector will be used as a negative control. DAP-seq reads will be aligned to the latest tomato genome, SL4.0, with the iTAG4.1 annotation using BWA-MEM, and peaks will be called using the GEM peak caller to identify the binding sites.
Constructing GRNs will serve as a powerful means to interpret GWAS candidate loci. We will construct a fruit-specific GRN from the RNA-seq data from 4.2.1 by building co-expression network modules using averaged FPKM values and the weighted gene co-expression network analysis (WGCNA) package (v1.51) in R. The Cytoscape platform v.3.10.0 will be used to visualize the networks and to determine the hub genes involved in chloroplast development and the connections between TFs and their target genes. To identify novel genes, we will extract a list of significant variants from the GWAS QTLs. Next, we will calculate the LD blocks in the region around the genomic position of each variant using Haploview, and all genes present in each interval will be extracted based on the latest tomato genome, SL4.0, and annotation, iTAG4.1.
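The final step, collecting all annotated genes that fall within an LD block around a significant variant, amounts to an interval-overlap query. The following is a minimal Python sketch under stated assumptions: the gene IDs, chromosome names, and coordinates are hypothetical placeholders (not iTAG4.1 entries), coordinates are taken to be 1-based and inclusive, and `genes_in_interval` is an illustrative helper rather than a function from Haploview or any other tool named here.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

class Gene(NamedTuple):
    gene_id: str
    chrom: str
    start: int
    end: int

def genes_in_interval(block, genes):
    """Return IDs of genes that overlap the LD block by at least one base."""
    return [
        g.gene_id
        for g in genes
        if g.chrom == block.chrom and g.start <= block.end and g.end >= block.start
    ]

# Hypothetical annotation records; real ones would be parsed from a GFF3 file.
annotation = [
    Gene("geneA", "ch10", 1000, 5000),
    Gene("geneB", "ch10", 8000, 12000),
    Gene("geneC", "ch10", 20000, 24000),
    Gene("geneD", "ch04", 9000, 11000),
]

# Hypothetical LD block surrounding one significant GWAS variant.
ld_block = Interval("ch10", 4500, 9000)

candidates = genes_in_interval(ld_block, annotation)
print(candidates)
```

In this toy example, geneA and geneB overlap the block, while geneC lies outside it and geneD is on a different chromosome. A linear scan is adequate for a handful of QTL intervals; an interval tree or a sorted index would be a reasonable substitute for many queries.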