Progress 01/01/08 to 12/31/11
Outputs OUTPUTS: The major outputs of this project include software tools for text mining the scientific literature for information about genes; incorporation of these tools into the AgBase curation and annotation pipeline; development of appropriate data models for incorporating livestock data into the Reactome pathway database; an analysis tool, Pathfigure, for evaluating transcriptome data in the context of the Reactome database; and the establishment of a web-based transcriptome database housing RNAseq expression data.

Arizona: Work during this past year focused on enhancing the capabilities of Birdbase and improving integration with avian genomics resources including GEISHA, AgBase, AGNC, and most recently the multi-genome browser CoGe. The Birdbase database was redesigned to provide a more robust platform for integrating the anticipated flood of genomic data for other bird species. This significant effort is now complete. Data for turkey and zebra finch are being incorporated into Birdbase. Data from 50 bird genomes recently sequenced by the Beijing Genomics Institute have been uploaded to CoGe. Integration of the various resources on the Birdbase user interface is ongoing.

NYU: Work at NYU focused on using the Reactome data model and web tools to annotate the functions of key Gallus proteins by representing them as participants in sequences of reactions. Protein annotations are linked to the canonical representations of these proteins in UniProt or Ensembl; small molecules are linked to their reference forms in the ChEBI database.

Pathfigure: Given differential expression data from an experiment, Pathfigure predicts which Gallus Reactome pathways are differentially active. Based on this pathway prediction, Pathfigure automatically generates graphs that show the relationships of the differentially expressed proteins on four levels: the pathway level, the expressed gene level, the reaction level, and the all-participants level.
PD-Explorer: PD-Explorer (Plan Domain-Explorer) integrates general domain-independent plan adaptation strategies and domain-specific formalized background knowledge (e.g., ontologies) to propose and evaluate hypothetical plans based on an incomplete planning domain model. BioPlanner is an instance of the PD-Explorer tool for biological domains that views a biological pathway (particularly a signal transduction pathway) as a plan, and information from sources such as Gallus Reactome as a currently incomplete planning domain model.

eGRAB: A tool for disambiguating gene names in text and retrieving all the scientific papers mentioning a particular gene.

eGIFT: A tool for mining gene-related information from text.

PARTICIPANTS: Carl J. Schmidt, Principal Investigator, University of Delaware; K. Vijay-Shanker, Co-PI, University of Delaware; Fiona McCarthy, Co-PI, Mississippi State University; Peter D'Eustachio, Co-PI, New York University; Parker Antin, Co-PI, University of Arizona; Keith Decker, Co-PI, University of Delaware; Li Jin, Doctoral Student, University of Delaware; Catalina Oana Tudor, Doctoral Student, University of Delaware; Anjana Saxena, Post-doctoral Fellow, New York University; Veronica Shamovsky, Post-doctoral Fellow, New York University; Lisa Matthews, Post-doctoral Fellow, New York University.

TARGET AUDIENCES: Life scientists.

PROJECT MODIFICATIONS: Not relevant to this project.
Impacts We have provided multiple web interfaces serving scientists interested in genomics and text mining, including:
- Birdbase: http://birdbase.arizona.edu/birdbase/
- Pathfigure: http://birdbase.udel.edu/pathfigure/
- eGIFT: http://biotm.cis.udel.edu/eGIFT/
- Gallus Reactome: http://gallus.reactome.org/
- Birdbase Transcriptome: http://birdbase.udel.edu/birdbase_atlas/

We are currently using the platform developed with support from this award to host a comparative genomics site devoted to 50 different bird species that have been sequenced by BGI.

Integration of Software Tools: eGIFT has been integrated with AgBase, developed by collaborators at Mississippi State University. eGRAB has been integrated with various other tools at the University of Delaware, such as RLIMS-P (for mining phosphorylation) and eFIP (for mining the impact of phosphorylation on the subsequent interactions involving the phospho-protein).

This award directly supported the completion of two Ph.D. degrees at the University of Delaware.
Publications
- Catalina O Tudor, Carl J Schmidt, K Vijay-Shanker. eGIFT: Mining Gene Information from the Literature, BMC Bioinformatics, 2010, 11:418
- Catalina O Tudor, Carl J Schmidt, K Vijay-Shanker. Mining for Gene-Related Key Terms. Third International Symposium on Semantic Mining in Biomedicine (SMBM), Turku, Finland, 2008, 157-160.
- Catalina O Tudor, K Vijay-Shanker, Carl J Schmidt. Mining the Biomedical Literature for Genic Information, BioNLP Workshop in conjunction with ACL, 2008, 288-290.
- Li Jin, Keith S. Decker, Carl J. Schmidt: BioPlanner: A Plan Adaptation Approach for the Discovery of Biological Pathways across Species. IAAI 2009
- L. Jin and K. Decker. Ontology Oriented Exploration of an HTN Planning Domain through Hypotheses and Diagnostic Execution. Proceedings of the Workshop on Knowledge Engineering for Planning and Scheduling, at ICAPS 2010.
- L. Jin. Stability Oriented Task-Structure Based Multi-Agent Re-Planning [extended abstract]. 2009.
- Li Jin, Decker, K.S., Stachnik, A.J., Schmidt, C.J. Prediction of biological pathways with integrated information. Bioinformatics and Biomedicine Workshop (BIBMW), 2009.
- Dalloul et al. 2010 Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis. PLoS Biol 8(9): e1000475. doi:10.1371/journal.pbio.1000475
- Burt DW, Carre W, Fell M, Law AS, Antin PB, Maglott DR, Weber JA, Schmidt CJ, Burgess SC, McCarthy FM. 2009 The Chicken Gene Nomenclature Committee report. BMC Genomics [10 Suppl 2:S5]
- Li Jin. Exploring Incomplete Planning Domain Knowledge through Hypothesis Generation and Diagnostic Execution, PhD Thesis, 2011.
- Catalina O Tudor. Using Text Mining Techniques to Gather Gene-Specific Information from the Biomedical Literature, PhD Thesis, 2011.