Since the first metagenomics study was published 17 years ago, the ability to make ecological inferences has been limited by sequencing throughput and the ability to analyze the data. Yet if the potential of metagenomics to help us decipher the structure and function of microbial ecosystems is to be realized, it is essential that our bioinformatics tools motivate our sequencing efforts. This is analogous to the bioinformatics advances that were made to accelerate the sequencing of the human genome, except that metagenomic projects face perhaps even greater obstacles. Unfortunately, the bioinformatics capabilities for analyzing metagenomic sequencing projects lag behind our sequencing efforts. Our research group has begun to develop a suite of statistical tools to describe and compare microbial communities at the genomic level using DNA sequence reads obtained via the traditional Sanger sequencing method. These tools have revealed the dominance of protein families with no known function and the amount of functional overlap between disparate communities. To keep up with our sequencing capacity, it is essential that we continue to develop tools that will enable us to analyze genomic and transcript sequences generated by pyrosequencing technologies, as well as data from proteomics and microarrays. The end goal is to provide a suite of bioinformatics tools so that we are limited by our sequencing throughput and not by our ability to analyze the data.