Mutant Phenotypes for Thousands of Bacterial Genes of Unknown Function

by Morgan N. Price, Kelly M. Wetmore, R. Jordan Waters, Mark Callaghan, Jayashree Ray, Hualan Liu, Jennifer V. Kuehl, Ryan A. Melnyk, Jacob S. Lamson, Yumi Suh, Hans K. Carlson, Zuelma Esquivel, Harini Sadeeshkumar, Romy Chakraborty, Grant M. Zane, Benjamin E. Rubin, Judy D. Wall, Axel Visel, James Bristow, Matthew J. Blow, Adam P. Arkin, and Adam M. Deutschbauer

Abstract

One third of all protein-coding genes from bacterial genomes cannot be annotated with a function. To investigate these genes’ functions, here we collected genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions each. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. 2,316 of these poorly-annotated genes had associations that are of high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins and we proposed specific functions for poorly-annotated enzymes and transporters and for uncharacterized protein families. Our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.

See the article (paywalled), the final author version (free), or view the data in the Fitness Browser.

Data Downloads

You can download the data for each organism here, or as a single tarball for all genomes here (large! 84 GB).

You can get information about the organisms and their genomes here:

Alternatively, you can download all of the data in the Fitness Browser (as of June 2017) from doi: 10.6084/m9.figshare.5134840

Also note that for some organisms, the Fitness Browser contains additional experiments beyond those described here. For these organisms, the cofitness values shown in the Fitness Browser will not match those computed from the data described here.
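To illustrate why extra experiments change cofitness, here is a minimal sketch in Python. It assumes cofitness is the Pearson correlation between two genes' fitness values across experiments. The gene names and fitness values are made up for illustration and do not come from the data set.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical fitness values for two genes across four experiments.
gene_a = [-2.0, -0.1, 0.0, -1.5]
gene_b = [-1.8, 0.1, -0.2, -1.2]
cofit_original = pearson(gene_a, gene_b)

# The same two genes after two additional (hypothetical) experiments
# in which their phenotypes differ.
gene_a_more = gene_a + [0.0, -2.0]
gene_b_more = gene_b + [-2.0, 0.1]
cofit_with_extra = pearson(gene_a_more, gene_b_more)

print(f"cofitness, original experiments: {cofit_original:.2f}")
print(f"cofitness, with extra experiments: {cofit_with_extra:.2f}")
```

Because the correlation is computed over every experiment available for an organism, adding experiments can raise or lower the value for any gene pair, so values from different experiment sets should not be compared directly.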

Other Downloads

An earlier version (with just 25 bacteria and a different title) was posted on bioRxiv.

Page by Morgan N. Price, Arkin group, June 2017