Genome-wide fitness data for Shewanella oneidensis MR-1

A paper about this data set: Evidence-Based Annotation of Gene Function in Shewanella oneidensis MR-1 Using Genome-Wide Fitness Profiling across 121 Conditions

Background

Shewanella oneidensis strain MR-1 (formerly known as S. putrefaciens) is a model organism for studying metal reduction, as MR-1 can utilize a wide range of metal ions and solid metals as electron acceptors and also grows aerobically. MR-1 is in the same division of bacteria as E. coli (the Gammaproteobacteria), but they are not closely related. Of the ~4,500 proteins in MR-1, only about a third have orthologs in E. coli. The MR-1 genome sequence was published in 2002 and the annotation has been curated since. A few hundred papers have been published on MR-1, and hundreds of gene expression experiments are publicly available.

The Adam Arkin Lab at UC Berkeley has created a large number of S. oneidensis MR-1 transposon insertion strains, each with a known insertion location and a known tag (barcode). These strains are combined into two pools, and each pool is grown under a given (stress) condition for ~6-8 generations. Typically, the stress experiments are performed in LB media in well-shaken (aerobic) flasks, with the stressor at a concentration that reduces the growth rate about 2-fold.

The abundance of each tagged strain is measured by microarray at the beginning and at the end of the experiment. The fitness of a strain is the log2 ratio of these abundances (end relative to beginning). (This is not the same scale as fitness in population genetics.) The data is normalized so that the median strain has a fitness of 0. The fitness value of a gene is computed as the average of the values for the insertions in that gene. This analysis assumes that an insertion within a gene inactivates that gene.
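The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the Arkin group's actual pipeline: the function names, the strain and gene identifiers, and the abundance numbers are all made up, and normalization is assumed to be a simple subtraction of the median log2 ratio.

```python
import math
from statistics import mean, median


def strain_fitness(start, end):
    """Return median-centered log2(end / start) for each strain.

    `start` and `end` map a strain id to its measured abundance.
    Centering by subtracting the median is an assumption of this sketch.
    """
    raw = {s: math.log2(end[s] / start[s]) for s in start}
    center = median(raw.values())
    return {s: value - center for s, value in raw.items()}


def gene_fitness(fitness, strain_to_gene):
    """Average the strain fitness values over the insertions in each gene."""
    by_gene = {}
    for strain, gene in strain_to_gene.items():
        by_gene.setdefault(gene, []).append(fitness[strain])
    return {gene: mean(values) for gene, values in by_gene.items()}


# Hypothetical abundances for five tagged strains (not real data).
start = {"s1": 100, "s2": 100, "s3": 100, "s4": 100, "s5": 100}
end = {"s1": 200, "s2": 200, "s3": 50, "s4": 800, "s5": 200}
strain_to_gene = {"s1": "geneA", "s2": "geneA", "s3": "geneB",
                  "s4": "geneC", "s5": "geneC"}

per_strain = strain_fitness(start, end)
per_gene = gene_fitness(per_strain, strain_to_gene)
print(per_gene)
```

In this toy example the typical strain doubles in abundance, so after centering it has a fitness of 0, while the strain that halves in abundance ends up at -2.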

The reliability of these per-gene fitness values is estimated by looking at consistency across different insertions in the same gene and at consistency across the two pools. In a typical experiment, some strains are very sick (fitness < -2 implies little or no growth), some strains are moderately but significantly sick (fitness ~ -1), most strains are neutral (fitness near 0), and a handful of strains are advantaged (fitness ~ 1).
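As a rough guide to reading fitness values, the categories above could be expressed as a small helper. Only the -2 cutoff comes from the description above; the -0.5 and 0.5 cutoffs, the function name, and the category labels are hypothetical choices made for this sketch, and it does not attempt the significance or consistency checks mentioned above.

```python
def classify_fitness(value, sick_cutoff=-2.0, mild_cutoff=-0.5, gain_cutoff=0.5):
    """Bin a fitness value (log2 scale) into a rough phenotype category.

    Only sick_cutoff (-2) is taken from the text; the other two cutoffs
    are illustrative assumptions.
    """
    if value < sick_cutoff:
        return "very sick"
    if value < mild_cutoff:
        return "moderately sick"
    if value > gain_cutoff:
        return "advantaged"
    return "neutral"


# Hypothetical fitness values, one per category.
labels = [classify_fitness(v) for v in (-3.1, -1.0, 0.1, 1.2)]
print(labels)
```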

Tab-delimited files for download:

Viewing the data in MeV:

  1. Download MR1_fitness.mev.
  2. Run MeV, use the File / Load Data command, use Browse to select MR1_fitness.mev, select two-color array, uncheck load annotation, and select the upper/left-most expression value (under Fe(III)).
  3. Check that the color scheme limits are reasonable (e.g., -3 to 0 to 3) using Display / Set Color Scale Limits.
  4. Use Display / Gene Row Labels to set "comb" as the gene row label (this will show the VIMSS/MicrobesOnline id, the SO number, the gene's name if any, and the gene's description).
  5. You might also want to use Display / Set Element Size and Analysis / Clustering / HCL.

R source code and image:

Other resources:

Page by Morgan Price in the Arkin group
Fitness data collected by Adam M. Deutschbauer and others in the Arkin group