Greengenes “Turns” A Million!
Source: Todd DeSantis, Dan Hawkes
The year 2011 has been a pivotal one in gene sequencing. This past March, Greengenes, an international database of bacterial biomarker 16S rRNA genes developed and sustained by an ESD team led by Todd DeSantis, accepted its 1,000,000th high-quality DNA sequence, making it one of the largest curated collections of such sequences in the world. One million is an arbitrary milestone, but the fact that it has been reached emphasizes that “microbiomics,” the elucidation of complex microbial communities, is still highly dependent on 16S rRNA gene sequences. Greengenes aids researchers in classifying the 16S genes they discover in their clinical or environmental samples, as demonstrated by over 100 citations per year, many in leading journals of the field. The diversity or “non-uniformity” of the 16S genes found in samples worldwide has influenced microbiologists’ view of the vast diversity of life at the microscopic scale.
The Berkeley Lab PhyloChip project drove the initial development of Greengenes. Gary Andersen, the PI of both projects, explains: “Since the aim of the PhyloChip assay was to track populations of the broadest bacterial and archaeal range possible, we had to assemble a high-quality, massive data set.” The outcome has been a publicly accessible database and a commercial PhyloChip technology available to LBNL’s licensee, Second Genome, Inc. Since its inception, the database has grown from 50,000 genes to 1 million, and searching it has become correspondingly slower. To address this, LBNL has teamed up with Niels Larsen at the Danish Genome Institute to use Simrank, a search acceleration tool.
Greengenes has been applied to a number of different scientific investigations:
(1) Biofuels: Since microbial community digestion in bioreactors is the most widespread bioenergy technology worldwide today, a study was performed to identify the microbes responsible for an efficient waste-to-energy process. Werner's PNAS study “opens the door to engineer” microbial communities to create energy (in the form of methane) from industrial wastewater. The study used Greengenes to compare DNA sequences found in digesters against known, high-quality reference genes.
(2) Medicine: Necrotizing enterocolitis in pre-term infants is associated with inappropriate bacteria colonizing the infant gut. A Danish team is testing for changes in the microbial communities after administering a blend of probiotics to affected pigs. Greengenes allowed them to determine which bacteria in the probiotic blend dominated the gut after treatment. Bifidobacterium animalis was the winner. It turned out that the strains chosen and the doses used can affect the outcome and that, in some cases, overuse of probiotics may have a detrimental effect.
(3) Diversity: This year, a novel microorganism was discovered in San Francisco Bay sediments: an ammonia-oxidizing archaeon thought to impact global nitrogen and carbon cycles. The research team could see it under the microscope but was unable to grow it in the lab, so they used “microscopic tweezers” to separate it from debris and other cells. The DNA extracted from this single organism was then sequenced and compared to known strains in the Greengenes database to confirm its dissimilarity to all other known organisms and to determine its phylogenetic placement.
Greengenes curation is greatly facilitated by Phil Hugenholtz of the University of Queensland in Brisbane, Australia, as well as by LBNL’s Morgan Price (Physical BioSciences Division).
In the Press