Source: MISSISSIPPI STATE UNIV submitted to NRP
BIG DATA: BIOCOMPUTING, BIOINFORMATICS, AND BIOLOGICAL DISCOVERY
Sponsoring Institution
Agricultural Research Service/USDA
Project Status
ACTIVE
Funding Source
Reporting Frequency
Annual
Accession No.
0431533
Grant No.
(N/A)
Cumulative Award Amt.
(N/A)
Proposal No.
(N/A)
Multistate No.
(N/A)
Project Start Date
Sep 15, 2016
Project End Date
Sep 14, 2021
Grant Year
(N/A)
Program Code
[(N/A)]- (N/A)
Recipient Organization
MISSISSIPPI STATE UNIV
(N/A)
MISSISSIPPI STATE,MS 39762
Performing Department
(N/A)
Non Technical Summary
(N/A)
Animal Health Component
50%
Research Effort Categories
Basic
50%
Applied
50%
Developmental
0%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
201                    1710                              1040                      60%
304                    1820                              1080                      10%
201                    3910                              1040                      30%
Goals / Objectives
The Institute for Genomics, Biocomputing & Biotechnology (IGBB) at Mississippi State University (MS State) and the ARS's Genomics & Bioinformatics Research Unit (GBRU) will continue their collaborative efforts to improve the biocomputing infrastructures of MS State and the ARS while advancing big data biological research. Goals of the current project include (but are not limited to) [a] acquisition of computer hardware/software that will facilitate computational biology research by the IGBB and the ARS; [b] providing assistance to the ARS while it works to establish its own supercomputing facility; [c] generating novel computer scripts and adapting existing scripts for big data analyses; [d] conducting big data genomics and proteomics research on organisms of agricultural importance in collaboration with the GBRU and other ARS units; [e] using differential gene expression analysis to explore health and fertility issues in livestock; [f] using biomolecular data/tools produced in the study of model organisms to advance research on agricultural species; [g] establishing and testing both laboratory and computational pipelines/protocols for producing, analyzing, and protecting sensitive genomic data and accompanying metadata; and [h] participating in training ARS and other scientists in big data management and analyses.
Project Methods
Pacific Biosciences (PacBio, long-read), Oxford Nanopore (long-read), and Illumina (short-read) sequencing technologies will be used to generate DNA and transcriptome sequence data. Proteomics data will be generated using the IGBB's LTQ Orbitrap Velos mass spectrometer. The IGBB will utilize its supercomputing capacity and expertise to conduct genome assembly, SNP calling, comparative genomics research, and proteome analyses. RT-qPCR will be used to validate RNA-Seq-based differential gene expression results. Computational tools developed for this project will be tested on existing and new data sets, with laboratory support data being generated as needed. Workshops will be developed by IGBB faculty/staff to train ARS and other scientists in advanced data management and analysis techniques.
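As an illustration of the RT-qPCR validation step, the sketch below shows one common way such a comparison can be made. It assumes the 2^-ddCt (Livak) method with roughly 100% primer efficiency and a single reference gene; the project record does not state which quantification method is used. The function names and all Ct values are hypothetical.

```python
from statistics import mean


def ddct_log2_fold_change(target_treated, ref_treated, target_control, ref_control):
    """Return the log2 fold change of a target gene (treated vs. control).

    Assumes the 2^-ddCt method: each argument is a list of replicate Ct
    values, and the reference gene is taken to be stably expressed.
    """
    dct_treated = mean(target_treated) - mean(ref_treated)   # normalize to reference gene
    dct_control = mean(target_control) - mean(ref_control)
    ddct = dct_treated - dct_control
    return -ddct  # log2(2^-ddCt) is simply -ddCt


def same_direction(rnaseq_log2fc, qpcr_log2fc):
    """True when both assays agree on up- vs. down-regulation."""
    return (rnaseq_log2fc > 0) == (qpcr_log2fc > 0)


# Hypothetical replicate Ct values for one gene.
qpcr_log2fc = ddct_log2_fold_change(
    target_treated=[22.0, 22.5, 21.5],
    ref_treated=[18.0, 18.5, 17.5],
    target_control=[24.0, 24.5, 23.5],
    ref_control=[18.0, 18.5, 17.5],
)
rnaseq_log2fc = 1.7  # hypothetical RNA-Seq estimate for the same gene
agrees = same_direction(rnaseq_log2fc, qpcr_log2fc)
```

A lower Ct means more starting template, so a negative ddCt corresponds to up-regulation in the treated sample. In practice, agreement is usually judged across a panel of genes rather than from a single comparison.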