Source: UNIV OF CONNECTICUT submitted to NRP
BIOINFORMATICS TOOLS FOR VIRAL QUASISPECIES RECONSTRUCTION FROM NEXT-GENERATION SEQUENCING DATA AND VACCINE OPTIMIZATION
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
0224815
Grant No.
2011-67016-30331
Cumulative Award Amt.
$419,388.00
Proposal No.
2010-04519
Multistate No.
(N/A)
Project Start Date
Apr 1, 2011
Project End Date
Mar 31, 2014
Grant Year
2011
Program Code
[A1201]- Animal Health and Production and Animal Products: Animal Breeding, Genetics, and Genomics
Recipient Organization
UNIV OF CONNECTICUT
438 WHITNEY RD EXTENSION UNIT 1133
STORRS,CT 06269
Performing Department
Computer Science and Engineering
Non Technical Summary
Viral infections cause a significant burden on animal health, reducing yields and increasing production costs due to expensive control programs. Vaccination is a vital part of such control programs; however, its effectiveness is reduced by the quick evolution of virus variants, called virus quasispecies, in animal hosts. Despite experimental evidence that viral quasispecies play a major role in disease progression and the emergence of drug- or vaccine-resistant variants, the practical implications of quasispecies evolution remain poorly understood because sequence variants and their frequencies in infected animals are difficult to characterize. The overarching goal of this project is to develop computational methods enabling comprehensive characterization of the genomic diversity of viral quasispecies infecting animal populations based on next-generation sequencing data. Recent advances in sequencing technologies have made it possible to generate millions of short sequence fragments from complex viral samples. However, identification of viral quasispecies from such data requires the development of novel computer algorithms that can accurately piece together the short fragments generated by next-generation sequencing machines. The specific aims of the project are to (1) develop and validate bioinformatics tools for accurate reconstruction of viral quasispecies sequences and their frequencies from next-generation sequencing data; (2) measure quasispecies persistence and evolution in commercial layer flocks following administration of modified live Infectious Bronchitis Virus (IBV) vaccine using; and (3) develop predictive models and algorithms for optimizing strategies of administration of modified live IBV vaccine to commercial layer flocks.
Expected outcomes of the project include the development of a comprehensive algorithmic toolkit for quasispecies sequence reconstruction and frequency estimation from next-generation sequencing data, and user-friendly web-based bioinformatics tools made available free of charge to the research community. We will also conduct four longitudinal sequencing studies of pooled tracheal swab samples collected from layer flocks that are administered attenuated live IBV vaccine. Sequencing will be performed using the 454 and SOLiD platforms to take advantage of their complementary strengths in terms of read length and coverage depth. Sequencing data will be made publicly available to enable further analysis and methods development by other groups. Anticipated benefits from the successful completion of the project include improved diagnosis and monitoring of viral outbreaks in animal populations, reduced costs of vaccination, improved animal health, and improved yield of animal production.
Animal Health Component
10%
Research Effort Categories
Basic
90%
Applied
10%
Developmental
(N/A)
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
304                    3210                              1101                      75%
311                    3210                              1101                      25%
Knowledge Area
311 - Animal Diseases; 304 - Animal Genome;

Subject Of Investigation
3210 - Egg-type chicken, live animal;

Field Of Science
1101 - Virology;
Goals / Objectives
The main goal of this project is to develop novel bioinformatics tools enabling the use of next-generation sequencing data to study quasispecies evolution in animal populations. The specific aims of the project are to (1) develop and validate bioinformatics tools for accurate reconstruction of viral quasispecies sequences and their frequencies from next-generation sequencing data; (2) measure quasispecies persistence and evolution in commercial layer flocks following administration of modified live Infectious Bronchitis Virus (IBV) vaccine using; and (3) develop predictive models and algorithms for optimizing strategies of administration of modified live IBV vaccine to commercial layer flocks. Direct outcomes of this project will include: (a) novel efficient computational methods for viral quasispecies reconstruction capable of handling and integrating sequencing data from major high-throughput sequencing technologies, (b) freely available user-friendly web-based bioinformatics tools for quasispecies sequence reconstruction and frequency estimation, (c) time series sequencing data obtained from 454 and SOLiD platforms registering evolutionary and epidemiological changes of IBV in commercial layer flocks following administration of modified live vaccine, (d) analysis of IBV quasispecies variants obtained from time series sequencing data, and (e) assessment of the effectiveness of IBV vaccines and predictive models for optimization of vaccination strategies. Specific activities for the first year of the project include collection of field samples for 2 studies, sample sequencing with 454 FLX Genome Analyzer, algorithm and software development for the quasispecies sequence assembly and frequency estimation problems, accuracy assessment of sequence assembly and frequency estimation tools using real and simulated benchmarks, validation experiments using RT-PCR and qRT-PCR, and preparation of manuscripts and conference presentations.
Deliverables include 454 sequencing data for 2 studies, public release of the quasispecies sequence assembly and frequency estimation tool, and a conference paper on quasispecies sequence assembly and frequency estimation methods. Specific activities for the second year of the project include collection of field samples for 2 additional studies, sample sequencing with 454 FLX Genome Analyzer and ABI SOLiD, further validation experiments using RT-PCR and qRT-PCR, analysis of sequencing data from the 4 studies, development of models of vaccine quasispecies evolution, algorithm and software development for vaccination strategy optimization, preparation of manuscripts and conference presentations, and organizing a workshop on computational methods for quasispecies sequence assembly and analysis. Deliverables include 454 sequencing data for 2 studies and SOLiD sequencing data for 4 studies, public release of the vaccination strategy optimization tool, and submission of a journal paper on IBV vaccine quasispecies evolution and vaccination strategy optimization.
Project Methods
We will perform 4 separate studies, two per year. In each study we will collect 100 pooled tracheal swab samples at 9 different time points from layer flocks that are administered attenuated live IBV vaccine. We will include in the time point array the vaccine itself as well as pooled samples collected prior to vaccination. The latter time point will identify any previously undetected infection in the chickens that may confound further data analysis. Each sequencing library will be prepared following RT-PCR amplification using the Superscript kit (Invitrogen). For 454 sequencing, we will amplify the S1 hypervariable region of IBV using 25 different degenerate primer pairs designed with our previously developed PrimerHunter tool, with all known IBV sequences of the relevant serotype as target sequences. Amplicons will overlap and will be staggered approximately every 50 base pairs. Amplicons for each time point will be pooled and sequenced on 1/8 of a gasket, providing us with over 100,000 reads per sample. This will ultimately give us approximately 25,000x coverage per pooled sample and a sensitivity to detect quasispecies represented at 1 in 10,000 sequences. For ABI SOLiD sequencing, we will amplify the entire 1.6 kb S1 region of IBV by RT-PCR using only 1 set of primers and the Superscript kit (Invitrogen). Following this, the amplicons will be fragmented and sequenced to 5-10 million read depth using the shotgun sequencing protocol and MID tags for each time point. Quasispecies sequence assembly will be performed using a multi-step algorithmic flow consisting of first aligning the reads onto the IBV reference sequence, then constructing the read overlap graph, and finally selecting a set of likely quasispecies sequences corresponding to maximum bandwidth paths in the read graph under appropriately defined edge costs.
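The path-selection step described above can be illustrated with a minimal sketch. It assumes the read graph is a directed acyclic graph whose nodes are reads numbered in alignment order, and that each edge carries a single numeric weight; the function name `widest_path` and the toy weights are hypothetical and are not taken from the project software. A maximum bandwidth path is one that maximizes the smallest edge weight along the path.

```python
from math import inf

def widest_path(num_nodes, edges, source, sink):
    """Return (bandwidth, path) for the source->sink path with the largest
    minimum edge weight, or None if the sink is unreachable.

    Assumption: nodes 0..num_nodes-1 are already in topological order
    (e.g. reads sorted by alignment start), so every edge (u, v, w) has u < v.
    """
    best = [-inf] * num_nodes   # best[v]: largest bottleneck of any source->v path
    prev = [None] * num_nodes   # predecessor of v on that path
    best[source] = inf
    # Sorting by u visits edges in topological order of their tail node.
    for u, v, w in sorted(edges):
        if best[u] == -inf:
            continue            # u is not reachable from the source
        cand = min(best[u], w)
        if cand > best[v]:
            best[v] = cand
            prev[v] = u
    if best[sink] == -inf:
        return None
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return best[sink], path[::-1]

# Toy read graph: 4 reads, edge weights stand in for overlap support.
toy_edges = [(0, 1, 5), (0, 2, 3), (1, 3, 2), (2, 3, 3), (1, 2, 4)]
result = widest_path(4, toy_edges, 0, 3)
```

In the toy graph the direct route through read 1 to read 3 is limited by a weight-2 edge, so the sketch prefers the longer route 0, 1, 2, 3 whose weakest edge has weight 3.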
Frequency estimation will be performed using an expectation-maximization algorithm that can utilize multi-platform sequencing data and takes into account base quality scores and read pairing information, when available. The developed software will first be validated on simulated data. We will also assess the robustness of inferred sequences and frequencies using a bootstrapping approach. Experimental validation will be performed by RT-PCR and quantitative RT-PCR. A subset of 10-20 sequence variants spanning the entire range of predicted frequencies will be selected for experimental validation from each pool (i.e., each time point). We will design variant-specific PCR primers, and, following PCR, quasispecies-specific PCR products will be subcloned and subjected to Sanger sequencing to validate sequence variants. Quantitative real-time RT-PCR (qRT-PCR) will be performed using SYBR green assays. Using variant-specific primers designed with PrimerHunter for each quasispecies and spanning only 150-250 nt of the S1 region, we will validate the proportion of the target variant in the total RNA quasispecies pool.
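As an illustration of the expectation-maximization idea mentioned above, the following sketch estimates candidate frequencies from per-read likelihoods. It is a simplified stand-in, not the project's implementation: the function names, the substitution-only error model (Phred quality Q gives error probability 10^(-Q/10), split evenly over the three other bases), and the toy inputs are all assumptions, and read pairing is ignored.

```python
def read_likelihood(read, quals, candidate, start):
    """P(read | candidate) under a substitution-only model using Phred qualities."""
    p = 1.0
    for i, (base, q) in enumerate(zip(read, quals)):
        err = 10 ** (-q / 10)
        p *= (1 - err) if candidate[start + i] == base else err / 3
    return p

def em_frequencies(likelihoods, iterations=200, tol=1e-9):
    """likelihoods[r][q] = P(read r | candidate q); returns candidate frequencies."""
    n_cand = len(likelihoods[0])
    freq = [1.0 / n_cand] * n_cand
    for _ in range(iterations):
        counts = [0.0] * n_cand
        for row in likelihoods:
            # E-step: fractional assignment of this read to each candidate.
            weights = [f * h for f, h in zip(freq, row)]
            total = sum(weights)
            if total == 0.0:
                continue  # read is incompatible with every candidate
            for q, w in enumerate(weights):
                counts[q] += w / total
        assigned = sum(counts)
        if assigned == 0.0:
            return freq
        # M-step: frequencies are the normalized expected read counts.
        new_freq = [c / assigned for c in counts]
        converged = max(abs(a - b) for a, b in zip(freq, new_freq)) < tol
        freq = new_freq
        if converged:
            break
    return freq

# Toy input: two reads match only candidate 0, one only candidate 1, one both.
toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
estimated = em_frequencies(toy)
```

For the toy input the ambiguous read is shared in proportion to the current estimates, and the iteration settles at frequencies of about 2/3 and 1/3.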

Progress 04/01/11 to 03/31/14

Outputs
Target Audience: The main target audience for our efforts has been the virologists and molecular epidemiologists who are increasingly using next-generation sequencing data to study viral evolution within host populations. The bioinformatics tools we have developed are assisting them in conducting cost-effective sequencing experiments, reconstructing viral quasispecies from both shotgun and amplicon sequencing data, and quantifying quasispecies abundance. The datasets we have generated will specifically benefit researchers studying the Infectious Bronchitis Virus in avian populations. They will also be useful to bioinformatics researchers for benchmarking new tools. The broader bioinformatics community is already benefiting from the algorithmic techniques we have developed with support from this grant, as they find applications well beyond quasispecies reconstruction (including but not limited to genome scaffolding, reference-guided transcriptome reconstruction, transcriptome quantification, and testing for differential gene expression). To maximize research impact, the PIs have given tutorials and presentations at both poultry science and bioinformatics conferences. They have also co-founded the workshop on Computational Advances in Molecular Epidemiology (CAME) to accelerate progress in the field by bringing together field practitioners with bioinformaticians, statisticians and computer scientists interested in the analysis of viral molecular data in an epidemiological context. The first edition of the workshop was held in Nov. 2011 in conjunction with the IEEE International Conference on Bioinformatics and Biomedicine; two other successful editions have been held in conjunction with the IEEE International Conference on Computational Advances in Bio and Medical Sciences in June 2013 and June 2014.
Changes/Problems: As originally outlined in the proposal, we have collected samples from commercial layer flocks at several time points following administration of modified live Infectious Bronchitis Virus (IBV) vaccine. However, most of the samples have tested negative for the virus, indicating that the vaccine used by our collaborators at Kofkoff Egg Farms does not persist in the vaccinated flocks. Since we have only been able to sequence field samples collected shortly after vaccination, it has been impossible to conduct the type of longitudinal analyses envisioned in the proposal. However, to enable the validation of the developed quasispecies reconstruction tools on real IBV sequencing data, we inoculated embryonating eggs with several archived IBV samples, purified the virus from the allantoic fluid, and then performed both shotgun 454 sequencing and Sanger sequencing of ~100 clones. We further constructed pools of clones mixed in predetermined proportions, and performed 454 shotgun sequencing of the clone pools.
What opportunities for training and professional development has the project provided? Two post-doctoral scholars (Hongjun Wang and Sahar Al Seesi), nine Ph.D. students (Adrian Caciula, Nicholas Mancuso, Bassam Tork, Serghei Mangul, Anas Al-Okaily, Mai Hamdalla, Amir Bayegan, Lu Li, and Qianlian Su), one M.S. student (Marius Nicolae), and one undergraduate student (Andrew Bligh) have participated in research activities related to this project. Their participation has given them the chance to develop a deep understanding of viral evolution, to acquire research skills in molecular biology, next-generation sequencing, mathematical modeling, statistical analysis, and algorithm design and implementation, and to enhance their capability of working effectively across disciplinary lines. Besides training graduate students, the PIs have also given tutorials at international conferences such as the 5th IEEE International Conference on Bioinformatics and Biomedicine and the 1st SelectBiosciences Next-Gen Sequencing Congress. Mandoiu supervised undergraduate independent study projects on IBV quasispecies analysis, and is currently supervising a year-long industry-sponsored Senior Design Project in which a team of five students is deploying an innovative RNA-Seq analysis pipeline on the Amazon EC2 cloud. At GSU, Zelikovsky has mentored 3 seniors with project topics in network flow methods and applications to bioinformatics. The PIs have also been active in curriculum development. At UConn, Mandoiu has designed and taught several new undergraduate and graduate courses in Bioinformatics, Computational Genomics, and Computational Molecular Biology. At GSU, Zelikovsky has developed advanced undergraduate and graduate level courses on Bioinformatics Algorithms and Systems Biology, offered annually. We have incorporated topics related to quasispecies reconstruction in all these courses.
How have the results been disseminated to communities of interest? The project has resulted in over 25 journal and refereed conference publications and 4 book chapters. The PIs have given numerous invited presentations, including at the 101st Annual Meeting of the Poultry Science Association and the 11th International Conference on Molecular Epidemiology and Evolutionary Genetics of Infectious Diseases. The PIs have co-founded the workshop on Computational Advances in Molecular Epidemiology (CAME) to accelerate progress in this field by bringing together field practitioners with bioinformaticians, statisticians and computer scientists interested in the analysis of viral molecular data in an epidemiological context. The CAME workshop has so far had three successful editions: in November 2011 in conjunction with the IEEE International Conference on Bioinformatics and Biomedicine, and in June 2013 and June 2014, both in conjunction with the IEEE International Conference on Computational Advances in Bio and Medical Sciences. By attracting a multidisciplinary audience, these workshops have already facilitated otherwise unlikely exchanges of ideas between academic and industry researchers working in different fields and have enabled several new collaborations (most notably with CDC). To further broaden the impact of the CAME workshops, we have edited special issues devoted to the presented works in the journals In Silico Biology, BMC Bioinformatics, and BMC Genomics.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? Viral infections cause a significant burden on animal health, reducing yields and increasing production costs due to expensive control programs. By generating millions of short reads per run, next-generation sequencing technologies have recently made it possible to comprehensively characterize viral variants infecting an animal. However, due to the sheer size of next-generation sequencing datasets, computational analysis is increasingly becoming a bottleneck. The main outcome of this project is a suite of efficient software tools that overcome this bottleneck for the study of viral evolution in animal populations. Besides drastically reducing analysis time, our tools enable large-scale experiments in which thousands of samples can be tested at a fraction of the cost of standard methods. These tools have been made available free of charge to the research community. Our research results have been broadly disseminated via journal publications and presentations at international conferences, including the Workshop on Computational Advances in Molecular Epidemiology organized by the PIs. Specific accomplishments include:
* We developed a new method, called Viral Spectrum Assembler (ViSpA), for quasispecies spectrum reconstruction. The ViSpA assembler takes into account sequencing errors at multiple steps, including mapping-based read preprocessing, path selection based on maximum bandwidth, and candidate sequence assembly using probability-weighted consensus techniques. Sequencing errors are also taken into account in ViSpA's EM-based estimation of quasispecies sequence frequencies. We compared ViSpA with state-of-the-art tools on both simulated and real 454 pyrosequencing shotgun reads from HCV and HIV quasispecies. Experimental results show that ViSpA outperforms ShoRAH on simulated error-free reads. While ShoRAH has a significant advantage over ViSpA on reads simulated with sequencing errors due to its advanced error correction algorithm, ViSpA is better at assembling the simulated reads after they have been corrected by ShoRAH. ViSpA also outperforms ShoRAH on real 454 reads.
* We developed multicommodity flow (MCF) based methods for viral quasispecies reconstruction and frequency estimation from overlapping amplicon and shotgun next-generation sequencing reads. Multicommodity flow methods were evaluated on simulated data in comparison to ViSpA and previously developed state-of-the-art algorithms for estimating quasispecies spectra from NGS amplicon and shotgun reads, respectively, and were found to reliably assemble viral quasispecies and accurately estimate their frequencies, especially from large datasets.
* We conducted a comprehensive study comparing complete quasispecies reconstruction flows from 454 shotgun reads, considering alternative software tools for each flow step (read error correction, read alignment, quasispecies reconstruction, and frequency estimation) and assessing the effect of parameter settings for each tool. Parameter tuning using datasets consisting of 454 reads generated from pools of Infectious Bronchitis Virus clones with known sequences and mixing proportions was shown to significantly improve reconstruction accuracy in terms of recall and precision.
* A novel method for reconstructing haplotypes from single amplicon NGS reads (kGEM) has been designed, implemented and validated. It applies expectation-maximization to find the most likely fractional haplotypes (i.e., haplotypes in which each position can be each nucleotide or a gap with certain probabilities). kGEM has been shown to significantly outperform the state-of-the-art QuasiRECOMB software.
* We developed a novel scaffolding algorithm based on a maximum likelihood model capturing read mapping uncertainty and/or non-uniformity of contig coverage. The algorithm achieves high scalability by combining integer linear programming with a non-serial dynamic programming paradigm. Experimental results show that our algorithm compares favorably to the previous methods OPERA and MIP in both scalability and accuracy for scaffolding single genomes of up to human size, and significantly outperforms them on scaffolding low-complexity metagenomic samples.
* A novel method for differential gene expression analysis based on bootstrapping, called IsoDE, has been developed. The bootstrapping experiment exploits the idea that subsampling RNA-Seq reads creates virtual replicates. We compared IsoDE against four existing methods (Fisher's exact test, GFOLD, edgeR and Cuffdiff) on RNA-Seq datasets generated using three different sequencing technologies (Illumina, Ion Torrent and 454), both with and without replicates. Experiments on RNA-Seq datasets without replicates show that IsoDE has consistently high accuracy relative to the qPCR ground truth, frequently higher than that of the compared methods, particularly for low-coverage data and at lower fold change thresholds. In experiments on RNA-Seq datasets with up to 7 replicates, IsoDE has also achieved high accuracy. Furthermore, unlike the other compared methods, IsoDE accuracy varies smoothly with the number of replicates, and is relatively uniform across the entire range of gene expression levels.
* We have developed a novel expectation-maximization algorithm, called IsoEM, for inference of alternative splicing isoform frequencies from high-throughput transcriptome sequencing (RNA-Seq) data. Empirical comparisons on both synthetic and real RNA-Seq datasets show that IsoEM significantly outperforms existing methods of isoform and gene expression level estimation from RNA-Seq data.
* We have developed and implemented a new maximum likelihood method called MaLTA for simultaneous transcriptome assembly and quantification from Ion Torrent RNA-Seq reads. A new version of the IsoEM algorithm suitable for Ion Torrent RNA-Seq reads was also developed to accurately estimate transcript expression levels. Experimental results on both synthetic and real datasets show that Ion Torrent RNA-Seq data can be successfully used for transcriptome analyses, and suggest increased transcriptome assembly and quantification accuracy of the MaLTA-IsoEM solution compared to existing state-of-the-art approaches.
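The virtual-replicate idea behind the bootstrapping-based differential expression item above can be sketched as follows. This is a deliberately simplified illustration and not the IsoDE implementation: reads are represented only by the gene they map to, expression is a pseudocount-smoothed read fraction, and the names `bootstrap_counts` and `de_support` as well as the default thresholds are hypothetical.

```python
import random
from collections import Counter

def bootstrap_counts(reads, n_boot, rng):
    """Resample the read list with replacement n_boot times.

    Each resample acts as one virtual replicate; returns per-gene read counts.
    """
    return [Counter(rng.choices(reads, k=len(reads))) for _ in range(n_boot)]

def de_support(reads_a, reads_b, gene, fold=2.0, n_boot=200, seed=0, pseudo=1.0):
    """Fraction of virtual replicate pairs in which `gene` changes by >= `fold`."""
    rng = random.Random(seed)
    boots_a = bootstrap_counts(reads_a, n_boot, rng)
    boots_b = bootstrap_counts(reads_b, n_boot, rng)
    hits = 0
    for count_a, count_b in zip(boots_a, boots_b):
        # The pseudocount keeps the ratio finite for genes with zero reads.
        expr_a = (count_a[gene] + pseudo) / len(reads_a)
        expr_b = (count_b[gene] + pseudo) / len(reads_b)
        ratio = expr_a / expr_b
        if ratio >= fold or ratio <= 1.0 / fold:
            hits += 1
    return hits / n_boot

# Toy data: gene g2 drops from 50% of reads in condition B to 10% in condition A.
cond_a = ["g1"] * 900 + ["g2"] * 100
cond_b = ["g1"] * 500 + ["g2"] * 500
support_g1 = de_support(cond_a, cond_b, "g1")
support_g2 = de_support(cond_a, cond_b, "g2")
```

A gene would then be called differentially expressed when its support exceeds a chosen level; in the toy data g2 receives high support at a two-fold threshold while g1, at roughly 1.8-fold, does not.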

Publications

  • Type: Other Status: Published Year Published: 2011 Citation: I.I. Mandoiu and S. Al Seesi, Bioinformatics Pipelines for RNA-Seq Data Analysis, Tutorial at 5th IEEE International Conference on Bioinformatics and Biomedicine, Atlanta GA, Nov. 12-15, 2011.
  • Type: Conference Papers and Presentations Status: Accepted Year Published: 2011 Citation: I.I. Mandoiu, Reconstruction of infectious bronchitis virus quasispecies from 454 pyrosequencing reads, Invited talk, 1st Workshop on Computational Advances in Molecular Epidemiology, Nov. 13, 2011.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2011 Citation: S. Mangul and I. Astrovskaya and B. Tork and I.I. Mandoiu and A. Zelikovsky, Viral Quasispecies Reconstruction Based on Unassembled Frequency Estimation, 7th International Symposium on Bioinformatics Research and Applications, Changsha, China, May 27-29, 2011.
  • Type: Other Status: Published Year Published: 2011 Citation: I.I. Mandoiu and A. Zelikovsky, Computational Advances for Next Generation Sequencing, Training course in conjunction with 1st SelectBiosciences Next-Gen Sequencing Congress, Boston, MA, April 28, 2011.
  • Type: Book Chapters Status: Awaiting Publication Year Published: 2015 Citation: Bassam Tork, Ekaterina Nenastyeva, Alexander Artyomenko, Nicholas Mancuso, Mazhar I.Khan, Rachel O'Neill, Ion Mandoiu, and Alex Zelikovsky, Reconstruction of Infectious Bronchitis Virus Quasispecies from NGS Data
  • Type: Books Status: Awaiting Publication Year Published: 2015 Citation: I.I. Mandoiu and A.Z. Zelikovsky (Eds.), Computational Methods for Next Generation Sequencing Data Analysis, John Wiley & Sons, 2015
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: B. Tork and A. Zelikovsky and I.I. Mandoiu and R.J. O'Neill and M. Khan and E. Nenastyeva and A. Artyomenko and N. Mancuso, Reconstruction of Infectious Bronchitis Virus Quasispecies from NGS Data, 9th International Symposium on Bioinformatics Research and Applications, May 20-22, 2013.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: A. Artyomenko and N. Mancuso and P. Skums and I.I. Mandoiu and A. Zelikovsky, kGEM: An Expectation Maximization Error Correction Algorithm for Next Generation Sequencing of Amplicon-based Data, 9th International Symposium on Bioinformatics Research and Applications, May 20-22, 2013.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: E. Hemphill and J. Lindsay and C. Lee and C.E. Nelson and I.I. Mandoiu, Biomarker and Classifier Selection in Diverse Genetic Datasets, 9th International Symposium on Bioinformatics Research and Applications, May 20-22, 2013.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: I.I. Mandoiu, Reconstruction of Haplotype Spectra from NGS Data, IPDPS Workshop on Future Computing Platforms to Accelerate Next-Gen Sequencing Applications, May 19, 2013.
  • Type: Other Status: Published Year Published: 2013 Citation: I.I. Mandoiu, Bioinformatics Tools for Viral Quasispecies Reconstruction from Next-Generation Sequencing Data and Vaccine Optimization, USDA NIFA Agriculture and Food Research Initiative - Animal Breeding, Genetics and Genomics Project Director Meeting, Jan. 11, 2013.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: I.I. Mandoiu, R.J. O'Neill, M. Khan, A. Zelikovsky, B. Tork, and N. Mancuso, Bioinformatics methods for reconstruction of infectious bronchitis virus quasispecies from next generation sequencing data, 11th International Conference on Molecular Epidemiology and Evolutionary Genetics of Infectious Diseases, New Orleans, LA, Oct. 30 - Nov. 2, 2012.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: N. Mancuso and B. Tork and P. Skums and I.I. Mandoiu and A. Zelikovsky, Multi-Commodity Flow Methods for Quasispecies Spectrum Reconstruction Given Amplicon Reads, 8th International Symposium on Bioinformatics Research and Applications, May 21-23, 2012.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: J. Lindsay and H. Salooti and A. Zelikovsky and I.I. Mandoiu, Scaffolding Large Genomes using Integer Linear Programming, 8th International Symposium on Bioinformatics Research and Applications, May 21-23, 2012.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: S. Al Seesi and I.I. Mandoiu, Inference of allele specific expression levels from RNA-Seq data, Poster at 8th International Symposium on Bioinformatics Research and Applications, May 21-23, 2012.
  • Type: Books Status: Published Year Published: 2011 Citation: B. Chen, ..., I.I. Mandoiu, et al. (Eds.), 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops Proceedings, IEEE, 2011.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: S. Mangul and A. Caciula and N. Mancuso and I.I. Mandoiu and A. Zelikovsky, An Integer Programming Approach to Novel Transcript Reconstruction from Paired-End RNA-Seq Reads, Poster at 16th Annual International Conference on Research in Computational Molecular Biology, April 21-24, 2012.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: S. Al Seesi and I.I. Mandoiu, Inference of allele specific expression levels from RNA-Seq data, Proc. 2nd IEEE International Conference on Computational Advances in Bio and Medical Sciences, Feb. 23-25, 2012.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: S. Mangul and A. Caciula and I.I. Mandoiu and A. Zelikovsky, Novel Transcript Reconstruction from Paired-End RNA-Seq Reads Using Fragment Length Distribution, Proc. 2nd IEEE International Conference on Computational Advances in Bio and Medical Sciences, Feb. 23-25, 2012.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: N. Mancuso and B. Tork and P. Skums and L. Ganova-Raeva and I.I. Mandoiu and A. Zelikovsky, A Maximum Likelihood Method For Quasispecies Spectrum Assembly, Proc. 2nd IEEE International Conference on Computational Advances in Bio and Medical Sciences, Feb. 23-25, 2012.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: R.J. O'Neill and I.I. Mandoiu and M. Khan and C. Obergfell and H. Wang and A. Bligh and B. Tork and N. Mancuso and A. Zelikovsky, Bioinformatics Methods For Reconstruction Of Infectious Bronchitis Virus Quasispecies From Next Generation Sequencing Data, Proc. 2nd IEEE International Conference on Computational Advances in Bio and Medical Sciences, Feb. 23-25, 2012.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2012 Citation: J. Lindsay and I.I. Mandoiu and H. Salooti and A. Zelikovsky, Accurate Scaffolding of Large Genomes using Integer Programming and Non-Serial Dynamic Programming, Poster at 2nd IEEE International Conference on Computational Advances in Bio and Medical Sciences, Feb. 23-25, 2012.
  • Type: Other Status: Published Year Published: 2012 Citation: I.I. Mandoiu, Inferring Viral Quasispecies Spectra from NGS Reads, Invited talk, Brown University Center for Computational Molecular Biology, Feb. 1, 2012.
  • Type: Other Status: Published Year Published: 2012 Citation: I.I. Mandoiu and M. Khan and R.J. O'Neill and A. Zelikovsky and C. Obergfell and H. Wang and A. Bligh and B. Tork and N. Mancuso, Bioinformatics Tools for Viral Quasispecies Reconstruction from Next-Generation Sequencing Data and Vaccine Optimization, Poster at the USDA NIFA Animal Genetics, Genomics and Breeding Program Annual Investigator Meeting, Jan. 13, 2012.
  • Type: Journal Articles Status: Published Year Published: 2011 Citation: I. Astrovskaya, B. Tork, S. Mangul, K. Westbrooks, I.I. Mandoiu, P. Balfe and A. Zelikovsky, "Inferring Viral Quasispecies Spectra from 454 Pyrosequencing Reads," BMC Bioinformatics 12(Suppl 6):S1, 2011
  • Type: Journal Articles Status: Published Year Published: 2012 Citation: N. Mancuso, B. Tork, P. Skums, L. Ganova-Raeva, I.I. Mandoiu and A. Zelikovsky, "Reconstructing viral quasispecies from NGS amplicon reads," In Silico Biology 11, pp. 237-249, 2012
  • Type: Journal Articles Status: Published Year Published: 2012 Citation: S. Mangul, A. Caciula, O. Glebova, I.I. Mandoiu and A. Zelikovsky, "Improved transcriptome quantification and reconstruction from RNA-Seq reads using partial annotations," In Silico Biology 11, pp. 251-261, 2012
  • Type: Journal Articles Status: Published Year Published: 2011 Citation: M.B. Renfree, A.T. Papenfuss, J.E. Deakin, J. Lindsay, ..., I.I. Mandoiu, et al., "Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development," Genome Biology 12:R81, 2011
  • Type: Journal Articles Status: Published Year Published: 2011 Citation: M. Nicolae, S. Mangul, I.I. Mandoiu and A. Zelikovsky, "Estimation of alternative splicing isoform frequencies from RNA-Seq data," Algorithms for Molecular Biology 6:9, 2011
  • Type: Journal Articles Status: Published Year Published: 2013 Citation: S. Mangul, A. Caciula, S. Al Seesi, D. Brinza, A. Rouf Banday, R. Kanadia, I.I. Mandoiu and A. Zelikovsky, "Flexible Approach for Novel Transcript Reconstruction from RNA-Seq Data using Maximum Likelihood Integer Programming," Proc. 5th International Conference on Bioinformatics and Computational Biology, pp. 25-34, 2013
  • Type: Journal Articles Status: Published Year Published: 2012 Citation: M. Hamdalla, D. Grant, I.I. Mandoiu, D. Hill, S. Rajasekaran and R. Ammar, "The Use of Graph Matching Algorithms to Identify Biochemical Substructures in Synthetic Chemical Compounds: Application to Metabolomics," Proc. 2nd IEEE International Conference on Computational Advances in Bio and Medical Sciences, pp. 1-6, 2012
  • Type: Journal Articles Status: Published Year Published: 2012 Citation: J. Lindsay, H. Salooti, A. Zelikovsky and I.I. Mandoiu, "Scalable Genome Scaffolding using Integer Linear Programming," Proc. ACM Conference on Bioinformatics, Computational Biology and Biomedicine, pp. 377-383, 2012
  • Type: Journal Articles Status: Published Year Published: 2012 Citation: S. Mangul, A. Caciula, S. Al Seesi, D. Brinza, A. Rouf Banday, R. Kanadia, I.I. Mandoiu and A. Zelikovsky, "An Integer Programming Approach to Novel Transcript Reconstruction from Paired-End RNA-Seq Reads," Proc. ACM Conference on Bioinformatics, Computational Biology and Biomedicine, pp. 369-376, 2012
  • Type: Journal Articles Status: Published Year Published: 2011 Citation: S. Mangul, A. Caciula, I.I. Mandoiu and A. Zelikovsky, "RNA-Seq based discovery and reconstruction of unannotated transcripts in partially annotated genomes," Proc. 1st Workshop on Computational Advances in Molecular Epidemiology, pp. 118-123, 2011
  • Type: Journal Articles Status: Published Year Published: 2011 Citation: N. Mancuso, B. Tork, I.I. Mandoiu, P. Skums and A. Zelikovsky, "Viral Quasispecies Reconstruction from Amplicon 454 Pyrosequencing Reads," Proc. 1st Workshop on Computational Advances in Molecular Epidemiology, pp. 94-101, 2011
  • Type: Journal Articles Status: Published Year Published: 2011 Citation: S. Mangul, I. Astrovskaya, M. Nicolae, B. Tork, I.I. Mandoiu and A. Zelikovsky, "Maximum Likelihood Estimation of Incomplete Genomic Spectrum from HTS Data," Proc. 11th Workshop on Algorithms in Bioinformatics, pp. 213-224, 2011
  • Type: Theses/Dissertations Status: Accepted Year Published: 2011 Citation: M. Nicolae, "Accurate Estimation of Isoform and Gene Expression Levels from Next Generation Sequencing Data," University of Connecticut, Storrs, CT, September, 2011
  • Type: Book Chapters Status: Published Year Published: 2014 Citation: S. Al Seesi, S. Mangul, A. Caciula, A. Zelikovsky and I.I. Mandoiu, "Transcriptome reconstruction and quantification from RNA sequencing data," In Maria Poptsova, Genome Analysis: Current Procedures and Applications, Caister Academic Press, pp. 39-60, 2014
  • Type: Book Chapters Status: Published Year Published: 2014 Citation: I. Astrovskaya, N. Mancuso, B. Tork, S. Mangul, A. Artyomenko, P. Skums, L. Ganova-Raeva, I.I. Mandoiu and A. Zelikovsky, "Inferring Viral Quasispecies Spectra from Shotgun and Amplicon Next-Generation Sequencing Reads," In Maria Poptsova, Genome Analysis: Current Procedures and Applications, Caister Academic Press, pp. 231-262, 2014
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: A. Artyomenko, N. Mancuso, I.I. Mandoiu, P. Skums and A. Zelikovsky, kGEM: An EM-based Algorithm for Correction of Amplicon Reads from Viral Quasispecies, Proc. 3rd IEEE International Conference on Computational Advances in Bio and Medical Sciences, June 12-14, 2013.
  • Type: Journal Articles Status: Accepted Year Published: 2014 Citation: X. Ding, J. Wang, A. Zelikovsky, X. Guo, M. Xie, and Y. Pan, "Searching high-order SNP combinations for complex diseases based on energy distribution difference," IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), (to appear) (2014)
  • Type: Journal Articles Status: Published Year Published: 2014 Citation: S. Mangul, N. C. Wu, N. Mancuso, A. Zelikovsky, R. Sun, E. Eskin, "Accurate viral population assembly from ultra-deep sequencing data." Bioinformatics 30(12): 329-337 (2014)
  • Type: Book Chapters Status: Published Year Published: 2013 Citation: Q. Cheng, P. Berman, R. Harrison, and A. Zelikovsky, Efficient Alignments of Metabolic Networks with Bounded Treewidth, in Algorithmic and Artificial Intelligence Methods for Protein Bioinformatics, Y. Pan, J. Wang, and M. Li, eds., Wiley Book Series on Bioinformatics, 2013, pp. 413-430.
  • Type: Theses/Dissertations Status: Accepted Year Published: 2013 Citation: B. Tork, Viral Quasispecies Reconstruction Using NGS Reads, Ph.D. Dissertation, Spring 2013.
  • Type: Theses/Dissertations Status: Accepted Year Published: 2014 Citation: N. Mancuso, Algorithms for Viral Population Analysis, Ph.D. Dissertation, Summer 2014.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: I.I. Mandoiu, Scalable Algorithms for Next-Generation Sequencing Data Analysis, JAX-UCONN/BECAT/UCHC Workshop on Bioinformatics and Computational Biology, Sept. 4, 2013.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: S. Al Seesi, Y.T. Tiagueu, A. Zelikovsky and I.I. Mandoiu, Accurate differential gene expression analysis for RNA-Seq data without replicates, 9th International Symposium on Bioinformatics Research and Applications, May 20-22, 2013.
  • Type: Journal Articles Status: Published Year Published: 2013 Citation: P. Skums, O. Glebova, A. Zelikovsky, Z. Dimitrova, D.S.C. Rendon, L. Ganova-Raeva, and Y. Khudyakov, Alignment of DNA Mass-Spectral Profiles Using Network Flows, Proc. of International Symposium on Bioinformatics Research & Applications (ISBRA), 2013, Lecture Notes in Bioinformatics 7875, pp. 149-160.
  • Type: Conference Papers and Presentations Status: Other Year Published: 2012 Citation: A. Zelikovsky, "Design and interpretation of high throughput genomic analysis." Presentation at Poultry Science Association 101st Annual Meeting, Athens, GA, 2012.
  • Type: Journal Articles Status: Published Year Published: 2014 Citation: P. Skums, A. Artyomenko, O. Glebova, S. Ramachandran, I.I. Mandoiu, D.S. Campo, Z. Dimitrova, A. Zelikovsky and Y. Khudyakov, "Computational Framework for Next-Generation Sequencing of Heterogeneous Viral Populations using Combinatorial Pooling," Bioinformatics (published online before print), 2014.
  • Type: Journal Articles Status: Published Year Published: 2014 Citation: F. Duan, J. Duitama, S. Al Seesi, C. Ayres, S. Corcelli, A. Pawashe, T. Blanchard, D. McMahon, J. Sidney, A. Sette, B. Baker, I.I. Mandoiu and P.K. Srivastava, "Genomic and bioinformatic profiling of mutational neo-epitopes reveals new rules to predict anti-cancer immunogenicity," Journal of Experimental Medicine 211, pp. 2231-2248, 2014
  • Type: Journal Articles Status: Published Year Published: 2014 Citation: E. Hemphill, J. Lindsay, C. Lee, I.I. Mandoiu and C.E. Nelson, "Feature selection and classifier performance on diverse biological datasets," BMC Bioinformatics 15(Suppl 13):S4, 2014
  • Type: Journal Articles Status: Published Year Published: 2014 Citation: S. Al Seesi, Y.T. Tiagueu, A. Zelikovsky and I.I. Mandoiu, "Bootstrap-based differential gene expression analysis for RNA-Seq data without replicates," BMC Genomics 15(Suppl 8):S2, 2014
  • Type: Journal Articles Status: Published Year Published: 2014 Citation: J. Lindsay, H. Salooti, I.I. Mandoiu and A. Zelikovsky, "ILP-based maximum likelihood genome scaffolding," BMC Bioinformatics 15(Suppl 9):S9, 2014
  • Type: Journal Articles Status: Published Year Published: 2014 Citation: S. Mangul, S. Al Seesi, A. Caciula, D. Brinza, I.I. Mandoiu and A. Zelikovsky, "Transcriptome Assembly and Quantification from Ion Torrent RNA-Seq Data," BMC Genomics 15(Suppl 5):S7, 2014
  • Type: Journal Articles Status: Published Year Published: 2013 Citation: Y. Huang, M. Khan, I.I. Mandoiu, "Neuraminidase subtyping of Avian influenza viruses with PrimerHunter-designed primers and quadruplicate primer pools," PLOS ONE Volume 8, Issue 11, e81842, 2013
  • Type: Journal Articles Status: Published Year Published: 2013 Citation: L. Menikarachchi, D. Hill, M. Hamdalla, I.I. Mandoiu and D. Grant, "In silico enzymatic synthesis of a 400,000 compound biochemical database for non-targeted metabolomics," Journal of Chemical Information and Modeling 53, pp. 2483-2492, 2013
  • Type: Journal Articles Status: Published Year Published: 2013 Citation: P. Skums, N. Mancuso, A. Artyomenko, B. Tork, I.I. Mandoiu, Y. Khudyakov and A. Zelikovsky, Reconstruction of Viral Population Structure from Next-Generation Sequencing Data Using Multicommodity Flows, BMC Bioinformatics 14(Suppl 9):S2, 2013
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: N. Mancuso, A. Artyomenko, P. Skums, I.I. Mandoiu and A. Zelikovsky, Estimation of Viral Population Structure from Amplicon-Based Reads, Proc. 3rd IEEE International Conference on Computational Advances in Bio and Medical Sciences, June 12-14, 2013.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: A. Caciula, A. Zelikovsky, S. Mangul, J. Lindsay and I.I. Mandoiu, Monte-Carlo Regression Algorithm for Isoform Frequency Estimation from RNA-Seq Data, Proc. 3rd IEEE International Conference on Computational Advances in Bio and Medical Sciences, June 12-14, 2013.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2013 Citation: P. Skums, O. Glebova, A. Zelikovsky, I.I. Mandoiu and Y. Khudyakov, Optimizing pooling strategies for the massive next-generation sequencing of viral samples, Proc. 3rd IEEE International Conference on Computational Advances in Bio and Medical Sciences, June 12-14, 2013.
  • Type: Journal Articles Status: Published Year Published: 2013 Citation: M. Hamdalla, I.I. Mandoiu, D. Hill, S. Rajasekaran and D. Grant, "BioSM: A metabolomics tool for identifying endogenous mammalian biochemical structures in chemical structure space," Journal of Chemical Information and Modeling 53, pp. 601-612, 2013