Progress 08/01/02 to 07/31/05
Outputs
The objectives of this grant were to (1) create web-based approaches for analyzing and viewing an FPC database, (2) parallelize the FPC assembly code, and (3) provide an FPC tutorial. For objective 1, we developed the following suite of tools, referred to as the WebAGCoL package: (a) WebFPC is an interactive Java display of the contigs, with complete support for zooming, scrolling, and coloring markers based on various attributes. (b) WebChrom is an interactive display of the contigs shown along the chromosome. It also includes a search tool that allows the user to define a set of marker attributes; the resulting distribution of markers is shown along the chromosome. (c) WebFP allows the user to compare the fingerprint of one or more clones with the rest of the database. (d) WebBSS allows the user to compare a sequence with the BAC end sequences and sequenced clones associated with the FPC project; hence, the input is located on the map if it matches a
sequence. The results from the last three tools all link to WebFPC. The software package comes with an installation script and demo files that make it easy for a novice user to install. The software is downloadable from www.agcol.arizona.edu/software/webagcol and has been published in Pampanwar et al. 2005. Objective 2 has been completed: the three most compute-intensive routines have all been parallelized for a multiprocessor machine. The code uses standard Unix and C parallel constructs, so no special software needs to be installed in order to use it, and it is executed simply by providing a command line argument. It runs on any Unix-based machine that has multiple processors and provides a 3.5x speedup on a four-processor machine. Part of the parallel implementation resulted in a master's thesis in the Department of Electrical and Computer Engineering at the University of Arizona (Gupta 2004). The parallel code is part of the FPC distributable, which is
downloadable from www.agcol.arizona.edu/software/fpc and has been published in a paper on High-Information-Content Fingerprinting (HICF) (Nelson et al. 2005), a large exploratory project that would have been difficult without the large speedup provided by this implementation. Objective 3 has been completed; the tutorial, along with the demo files, can be downloaded from the FPC web site and has been published (Engler and Soderlund 2002). We are frequently complimented on the tutorial and used it for a 2004 HICF/FPC workshop organized by Jan Dvorak at the University of California, Davis. Also under this grant, we have developed an FPC module for BioPerl, which is downloadable from www.agcol.arizona.edu/software/bioperl and is part of the BioPerl package (www.bioperl.org). Funded in part by this grant, we have developed a Java implementation of the GMOD genome browser (www.gmod.org) and provide the configuration and Perl script necessary to create a genome browser for FPC; this is downloadable
from www.agcol.arizona.edu/software/java_gbrowse.
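To put the reported 3.5x speedup on a four-processor machine in context, the short sketch below computes the parallel efficiency and, assuming Amdahl's law applies, the fraction of run time that would have to be parallel to produce that speedup. Amdahl's law is an assumption introduced here for illustration; the report does not state a serial fraction, and the function names are hypothetical. The model ignores synchronization and memory overhead, so its projections are rough at best.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


def amdahl_parallel_fraction(speedup, processors):
    """Solve Amdahl's law, S = 1 / ((1 - p) + p / N), for the parallel fraction p."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / processors)


def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law for a given parallel fraction p and N processors."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)


if __name__ == "__main__":
    # Figures taken from the report: 3.5x speedup on four processors.
    reported_speedup = 3.5
    reported_processors = 4

    efficiency = parallel_efficiency(reported_speedup, reported_processors)
    p = amdahl_parallel_fraction(reported_speedup, reported_processors)

    print(f"parallel efficiency: {efficiency:.3f}")
    print(f"implied parallel fraction (Amdahl assumption): {p:.3f}")

    # Hypothetical projections under the same assumption; not measured results.
    for n in (2, 8):
        print(f"projected speedup on {n} processors: {amdahl_speedup(p, n):.2f}x")
```

Under these assumptions, 3.5x on four processors corresponds to an efficiency of 87.5% and an implied parallel fraction of about 0.95. The projections for other processor counts are model estimates only, not measurements from the FPC code.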
Impacts
Physical maps are built with FPC, and the maps are generally used as a community resource for clone-based sequencing and locating regions of interest. Though the FPC software package is freely available and FPC databases are generally freely available, it takes too much time to get FPC working on a local machine just to look at a region of interest or ask a few questions of the map. Hence, the WebAGCoL web interface allows the user community to easily query and view the FPC map. Additionally, the availability of the WebAGCoL package means that each lab does not have to spend valuable resources on making its own web display of the data. It has been downloaded by 141 laboratories, and each site is potentially viewed by hundreds of people in the community. The parallel implementation for FPC allows it to run on multiprocessor Unix-based machines. As dual processors can now be purchased for under $6000 and quad processors can be purchased for under $1300, this allows
considerable speedup on machines that are affordable for a general biology laboratory. For laboratories that will use FPC extensively, the FPC tutorial is extremely helpful in learning how to use the software, which saves users much time and results in better overall maps for the community. In summary, the impact of this grant is that it saves scientists time when using FPC.
Publications
- Engler, F. and C. Soderlund (2002). Software for Physical Maps. In Ian Dunham (ed) Genomic Mapping and Sequencing, Horizon Press, Genome Technology series. Norfolk, UK, pp. 201-236.
- Gupta, G. (2004). Shared Memory Implementation for Building Physical Maps of Genomes. Master's Thesis. University of Arizona.
- Pampanwar, V., F. Engler, J. Hatfield, S. Blundy, G. Gupta, and C. Soderlund (2005). FPC web tools for rice, maize and distribution. Plant Physiology 138: 116-126.
- Nelson, W., A. Bharti, E. Butler, F. Wei, G. Fuks, H. Kim, R. Wing, J. Messing, and C. Soderlund (2005). Whole-Genome Validation of High-Information-Content Fingerprinting. Plant Physiology 139: 27-38.