Progress 10/01/13 to 09/30/18
Outputs Target Audience: The target audiences are 1) researchers/scientists who investigate microbial communities with applications in biology, health and medical sciences, environmental sciences, and forensic science, and 2) statisticians/computational scientists who develop computational methods or algorithms for metagenomic studies. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? This research will have a significant long-term impact because of the growing number of microbiome projects. The project has integrated research and education and explicitly addresses cross-disciplinary training at multiple levels. It has provided training and professional development in cutting-edge research for more than 10 graduate students, who learned about the new sequencing technology and its associated computational challenges by participating in the research. Under Dr. An's supervision, her lab has offered two summer workshops on metagenomics to graduate and undergraduate students in Statistics, Mathematics, Biosystems Engineering, Pharmacy, Public Health, and Medicine. The topics included "Introduction to metagenomics", "Introduction to the current IHMP (Integrative Human Microbiome Projects) and their status", "Statistical and computational challenges in microbial research", and "Advanced quantitative methods in metagenomic data analysis", along with hands-on experience in metagenomic data analysis on the High Performance Computing systems at the University of Arizona, from sequence data alignment to quality control and from upstream analysis to downstream analysis (e.g., differential abundance analysis, cluster analysis, network analysis, and visualization). Through the workshops, the lab hoped to bring this new research area to the next generation of scientists and educators. How have the results been disseminated to communities of interest? The research findings/outcomes have been presented at various conferences, symposia, and seminars, including the ENAR (Eastern North American Region) and WNAR (Western North American Region) meetings of the International Biometric Society, the ICSA (International Chinese Statistical Association) symposium, JSM (Joint Statistical Meetings), the Annual Scientific Meeting of the American Academy of Forensic Sciences, the International Human Microbiome Consortium, and an NSF workshop.
We have published about 10 peer-reviewed journal papers directly or indirectly resulting from this project. All the software/packages we have developed for the project can be downloaded from http://cals.arizona.edu/~anling/software/software.htm. Researchers can use this software for free in their research. What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
Biological threats are associated with the deliberate or accidental release of a pathogen or biotoxin. They can cause illness, death, fear, societal disruption, and economic damage. One of the difficulties in detecting such threats efficiently is that our environment is already rich in microorganisms that are harmless or actively beneficial, and about which we know very little. Metagenomics is the study of genetic material recovered directly from natural (e.g., soil and water) or host-associated (e.g., human gut) environmental samples that contain microorganisms organized into communities, or microbiomes. With the development of new sequencing technology, the metagenomic approach enables us to detect and study biological threats. The new technology has also led to advances in agriculture, medicine, energy production, and bioremediation. For example, in agricultural studies, understanding the roles of beneficial and harmful microorganisms living in, on, and around domesticated plants and animals can help us detect diseases in crops and livestock and then develop improved farming practices. High-throughput next-generation sequencing technologies provide a powerful tool for metagenomic studies. However, because these technologies produce massive numbers of short DNA sequences, there is an urgent need for efficient statistical methods that rapidly and accurately detect the species and gene functions present in a metagenomic sample/community and that relate the signature species/functions to the status of the microbial communities in which they reside. The project PI, Dr. An, working with her students, has developed several statistical methods for 1) identifying all possible functional roles present in metagenomic samples/communities, 2) detecting functional "biomarker" patterns that are linked to the presence of a biological factor (e.g., a disease or biothreat agent), and 3) predicting the virulence/disease level for a new individual. They not only proposed novel methods and algorithms for analyzing metagenomic data in general but also developed software that implements the proposed methods. All of the software is publicly available to the research community. This will help scientists ask, answer, and evaluate complex, multi-disciplinary biological questions. For example, by applying our methods, scientists can detect harmful microbial species even without any prior microbiological knowledge. They can also predict the disease/virulence level for a new microbial sample. This, in turn, will help in treating disease and controlling biothreats.
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2018
Citation:
Zhu, L., An, L., Ran, D., Lizzarraga, R., Bondy, C., Zhou, X., Harper, R., Liao, S., Chen, Y. (2018). The Club Cell Marker SCGB1A Downstream of FOXA1 is Reduced in Asthma.
American Journal of Respiratory Cell and Molecular Biology. doi: 10.1165/rcmb.2018-0199OC
- Type:
Journal Articles
Status:
Published
Year Published:
2018
Citation:
Zhang, S., Wang, D., Zhang, H., Skaggs, M., Lloyd, A., Ran, D., An, L., & Yadegari, R. (2018). FERTILIZATION-INDEPENDENT SEED-Polycomb Repressive Complex 2 plays a dual role in regulating type I MADS-box genes in early endosperm development. Plant Physiology, 177(1), 285-299.
- Type:
Book Chapters
Status:
Awaiting Publication
Year Published:
2018
Citation:
Jiang H, An L, Ban Y. Introduction to metagenomics. No Boundary Thinking in Bioinformatics, edited by Huang and Moore.
- Type:
Journal Articles
Status:
Submitted
Year Published:
2018
Citation:
Klug, K. E., Jennings, C. M., Lytal, N., An, L., & Yoon, J.-Y. (2018). Mie Scattering and Microparticle Based Characterization of Heavy Metal Ions and Classification by Statistical Inference Methods. Royal Society Open Science.
- Type:
Book Chapters
Status:
Published
Year Published:
2018
Citation:
Du, R., An, L., & Fang, Z. (2018). Performance evaluation of normalization approaches for metagenomic compositional data on differential abundance analysis. New Frontiers of Biostatistics and Bioinformatics. Edited by Yichuan Zhao and DingGeng Chen. Springer.
Progress 10/01/16 to 09/30/17
Outputs Target Audience: Researchers/scientists in health and medical sciences, environmental sciences, and forensic science. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? It provided training and professional development in cutting-edge research for 3 graduate students. How have the results been disseminated to communities of interest? Our findings/outcomes have been presented at various conferences, symposia, and seminars, including the ENAR (Eastern North American Region of the International Biometric Society) meeting, the ICSA (International Chinese Statistical Association) symposium, and JSM (Joint Statistical Meetings). All the software/packages we have developed for the project can be downloaded from http://cals.arizona.edu/~anling/software/software.htm. Researchers can use this software for free in their research. What do you plan to do during the next reporting period to accomplish the goals? Attend several statistical conferences and one workshop to present our research, and publish peer-reviewed journal papers.
Impacts What was accomplished under these goals?
We have developed one method for trace evidence analysis and one method for time-of-death estimation, both based on microbial sequence data. Both can be applied in forensic science.
Publications
- Type:
Journal Articles
Status:
Under Review
Year Published:
2016
Citation:
Shanshan Zhang, Dongfang Wang, Huajian Zhang, Megan Skaggs, Alan Lloyd, Di Ran, Lingling An, Karen Schumaker, Gary Drews, and Ramin Yadegari. FIS-PRC2 plays a dual role in regulation of type I MADS-box genes in early endosperm of Arabidopsis. (2017) Plant Physiology
- Type:
Theses/Dissertations
Status:
Accepted
Year Published:
2017
Citation:
Wenchi Lu. A Novel Approach on Differential Abundance Analysis for Matched Metagenomic Samples.
- Type:
Journal Articles
Status:
Under Review
Year Published:
2017
Citation:
Fei Jia, Murat Kacira*, Lingling An, Caitlin C. Brown, Kimberly L. Ogden, and Judith K. Brown. Autonomous Detection of an Abiotic and Biotic Disturbance in a Microalgae Culture System Using a Multi-Wavelength Optical Density Sensor. (2017) Biosystems Engineering
- Type:
Journal Articles
Status:
Submitted
Year Published:
2017
Citation:
Kyle Carter, Meng Lu, Hongmei Jiang, Lingling An. Suspect Reduction for Culture Independent Microbial Source Tracking in Trace Evidence Analysis Using Community. Journal of Forensic Sciences
- Type:
Conference Papers and Presentations
Status:
Other
Year Published:
2017
Citation:
Dan Luo, Lingling An. Differential Abundance Analysis on Longitudinal Metagenomic Count Data. Joint Statistical Meetings.
- Type:
Journal Articles
Status:
Submitted
Year Published:
2017
Citation:
Meng Lu, Kyle Carter, Lingling An. Accurate Prediction of Postmortem Interval with Integrating Microbial Community Dynamics. Forensic Science International.
Progress 10/01/15 to 09/30/16
Outputs Target Audience: Researchers/scientists in health and medical sciences, environmental sciences, and forensic science. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? It provided training and professional development in cutting-edge research for 3 graduate students. How have the results been disseminated to communities of interest? We presented our findings/outcomes at various conferences, symposia, and seminars, including the ENAR (Eastern North American Region of the International Biometric Society) meeting, the ICSA (International Chinese Statistical Association) symposium, and JSM (Joint Statistical Meetings). All the software/packages we have developed for the project so far have been posted online and can be downloaded from http://cals.arizona.edu/~anling/software/software.htm. Researchers can use this software for free in their research. What do you plan to do during the next reporting period to accomplish the goals? Attend some statistical conferences to present our research and publish a few more peer-reviewed journal papers.
Impacts What was accomplished under these goals?
We have developed a new method, "metaDprof", which is implemented in R.
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2016
Citation:
Zhang, Y., Kacira, M., & An, L. (2016) A CFD study on improving air flow uniformity in indoor plant factory system. Biosystems Engineering, 147, 193-205.
- Type:
Journal Articles
Status:
Published
Year Published:
2016
Citation:
Pena, E., Wu, W., Piegorsch, W., West, R., & An, L. (2016) Model Selection and Estimation with Quantal-Response Data in Benchmark Risk Assessment. Risk Analysis, 37(4):716-732
- Type:
Journal Articles
Status:
Accepted
Year Published:
2016
Citation:
Luo D, Ziebell S, An L* (2016) An informative approach on differential abundance analysis for time-course metagenomic sequencing count data. Bioinformatics.
- Type:
Theses/Dissertations
Status:
Published
Year Published:
2015
Citation:
Sara Ziebell. A Powerful Correlation Method for Microbial Co-occurrence Networks
- Type:
Theses/Dissertations
Status:
Published
Year Published:
2016
Citation:
Dailu Chen. A metagenomic analysis of the microbiome in the colorectal cancer microenvironment
Progress 10/01/14 to 09/30/15
Outputs Target Audience: Researchers/scientists in health and medical sciences and environmental sciences. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? It provided training and professional development for a postdoc and 3 graduate students. How have the results been disseminated to communities of interest? We presented our findings/outcomes at various conferences, symposia, and seminars, including an NSF workshop, statistical or informatics conferences, and seminars at various universities. All the software/packages we have developed for the project so far have been posted online and can be downloaded from http://cals.arizona.edu/~anling/software/software.htm. Researchers can use this software for free in their research. What do you plan to do during the next reporting period to accomplish the goals? A few more manuscripts are in preparation to publish the research findings.
Impacts What was accomplished under these goals?
We have developed a new approach to accurately estimate the relative abundance of closely related species. We have developed a two-step statistical approach for accurately identifying the potential functions in a metagenomic sample and quantifying their abundance. We have also developed several R software packages: TAEA, metaFunction, ENNB, and RAIDA.
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2014
Citation:
Sohn M, An L*, Pookhao N, Li Q. Accurate Estimation of Genome Relative Abundance for Closely Related Species in a Metagenomic Sample. BMC Bioinformatics 2014, 15:242, PMID: 25027647 (*: corresponding author)
- Type:
Journal Articles
Status:
Published
Year Published:
2014
Citation:
An L*, Pookhao N, Jiang H, Xu J. Statistical approach of functional profiling for a microbial community. PLoS ONE 2014, 9(9): e106588, PMID: 25198674 (*: corresponding author)
- Type:
Journal Articles
Status:
Published
Year Published:
2015
Citation:
Pookhao N, Sohn M, Li Q, Jenkins I, Du R, Jiang H, An L*. A two-stage statistical procedure for feature selection and comparison in functional analysis of metagenomes. Bioinformatics, 2015, 31:158-165. PMID: 25256572 (*: corresponding author)
- Type:
Journal Articles
Status:
Published
Year Published:
2015
Citation:
Du R, Mercante D, An L*, Fang Z*. A statistical approach to correcting cross-annotations in a metagenomic functional profile generated by short reads. 2015, Journal of Biometrics & Biostatistics, 5:208 (*: co-corresponding author)
- Type:
Journal Articles
Status:
Published
Year Published:
2015
Citation:
Sohn M, Du R, An L*. (2015) A robust approach for identifying differentially abundant features in metagenomic samples. Bioinformatics. PMID: 25792553 (*: corresponding author)
- Type:
Journal Articles
Status:
Published
Year Published:
2015
Citation:
Ban Y, An L, Jiang H (2015) Investigating microbial co-occurrence patterns based on metagenomic compositional data. Bioinformatics 31(20): 3322-3329
- Type:
Journal Articles
Status:
Published
Year Published:
2015
Citation:
Drewry J, Choi C, An L, Gharagozloo P. (2015) A computational fluid dynamics model of algal growth: development and validation. Transactions of the American Society of Agricultural and Biological Engineers, 58:2, 203-213
- Type:
Journal Articles
Status:
Published
Year Published:
2015
Citation:
Yigiter A, Chen J, An L, Danacioglu N. (2015) An on-line CNV detection method for short sequencing reads. Journal of Applied Statistics, 42:7, 1556-1571
- Type:
Theses/Dissertations
Status:
Published
Year Published:
2015
Citation:
M.B. Sohn. Novel Computational And Statistical Approaches In Metagenomic Studies. PhD Dissertation in Statistics, University of Arizona
- Type:
Theses/Dissertations
Status:
Published
Year Published:
2015
Citation:
Ahmad Hakeem Abdul Wahab. Statistical Discovery of Biomarkers in Metagenomics. MS thesis in Statistics, University of Arizona
- Type:
Theses/Dissertations
Status:
Published
Year Published:
2014
Citation:
Naruekamol Pookhao. Statistical Methods For Functional Metagenomic Analysis Based On Next-Generation Sequencing Data. PhD Dissertation in ABE, University of Arizona
Progress 10/01/13 to 09/30/14
Outputs Target Audience: The target audiences are the biologists/medical scientists who investigate microbial communities and statisticians/computational scientists who develop computational methods or algorithms in metagenomic studies. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? It has provided training opportunities for graduate students in Biosystems Engineering, Statistics, and Computer Science. It has also brought internship opportunities to several undergraduate students in Biosystems Engineering. Through this project a postdoc has been mentored in his early career. How have the results been disseminated to communities of interest? The PI has delivered several presentations on the results of this project at various conferences/meetings/seminars to general audiences, including biologists, medical doctors, statisticians, mathematicians, and computer scientists. The results have also been presented to a general audience at an NSF workshop, including federal funding agents. The PI's lab has also provided a series of workshops on metagenomics to a group of undergraduate students to motivate their interest in related research areas. What do you plan to do during the next reporting period to accomplish the goals? Develop more statistical methods and computational algorithms (software) on this topic, including methods to 1) normalize microbial samples that are collected at different scales, 2) robustly and sensitively detect the biomarkers (species or genes) that are related to the changed biological factor(s), e.g., from a healthy condition to a diseased condition, and 3) construct (sub)networks among the detected biomarkers.
Impacts What was accomplished under these goals?
We have developed statistical approaches for more accurately profiling the functional/taxonomic composition of microbial communities. Accurate annotation results play a critical role in the downstream analysis. In particular, the paper by Sohn et al. from my lab was highlighted as one of the top 5 most-downloaded papers (out of about 500) in a top journal, BMC Bioinformatics. This paper addresses the accurate detection of very closely related species, which is a long-standing challenge in metagenomic/microbial studies under low sequence coverage.
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2013
Citation:
Piegorsch W*, An L*, Wickens A, West W, Peña E, Wu W. (2013) Information-theoretic model-averaged benchmark dose analysis in environmental risk assessment. Environmetrics 24:143-157 (*: co-first author)
- Type:
Journal Articles
Status:
Published
Year Published:
2014
Citation:
An L*, Pookhao N, Jiang H, Xu J. (2014) Statistical approach of functional profiling for a microbial community. PLoS ONE 9(9): e106588 (*: corresponding author)
- Type:
Journal Articles
Status:
Published
Year Published:
2014
Citation:
Sohn M, An L*, Pookhao N, Li Q. (2014) Accurate Estimation of Genome Relative Abundance for Closely Related Species in a Metagenomic Sample. BMC Bioinformatics 15:242 (*: corresponding author)
- Type:
Journal Articles
Status:
Published
Year Published:
2014
Citation:
Jiang H, An L, Baladandayuthapani V, Auer P. (2014) Classification, predictive modeling, and statistical analysis of cancer data. Cancer Informatics 13(Suppl 2):1-3. DOI: 10.4137/CIN.S19328
- Type:
Journal Articles
Status:
Accepted
Year Published:
2014
Citation:
Pookhao N, Sohn M, Li Q, Jenkins I, Du R, Jiang H, An L*. (2014) A two-stage statistical procedure for feature selection and comparison in functional analysis of metagenomes. Bioinformatics. (*: corresponding author)