Fitness Data for psRCH2

562 condition samples (303 successful), Fri Feb 19 13:19:10 2016, statistics version 1.0.3





Gene Fitness

Gene fitness is a log2 ratio. It is normalized so that genes with no phenotype should have values near zero. Ideally, genes that are very sick (incapable of growth in the condition) should have values around -6 if the experiment ran for 6 generations. In practice, values below -2 or -3 indicate that mutants in the gene are very sick, and values around -1 indicate a mildly deleterious phenotype for mutants in that gene. On the other hand, if a gene's activity is deleterious, then the fitness values will be positive.
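
The "-6 after 6 generations" figure follows from simple arithmetic, which the following minimal Python sketch illustrates. It is not part of the FEBA code. The function name ideal_log2_fitness and its arguments are hypothetical, and it assumes an idealized experiment with no noise and no normalization.

```python
import math

def ideal_log2_fitness(mutant_doublings, reference_doublings):
    """Idealized log2 ratio: change in a mutant's abundance relative to
    typical (no-phenotype) strains over the course of the experiment.
    Hypothetical helper for illustration only."""
    mutant_growth = 2.0 ** mutant_doublings
    reference_growth = 2.0 ** reference_doublings
    return math.log2(mutant_growth / reference_growth)

# A mutant that cannot grow while the rest of the pool doubles 6 times.
print(ideal_log2_fitness(0, 6))   # -6.0
# A mutant that grows like everything else has no phenotype.
print(ideal_log2_fitness(6, 6))   # 0.0
```

Measured values rarely reach this ideal, which is consistent with the note above that values below -2 or -3 already indicate very sick mutants.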

Gene fitness is calculated from strain fitness:

The method for averaging gene fitness across strains may change in the future.

Normalization for Chromosomal Bias

Depending on the growth phase of the sample, the copy number of the chromosome may be higher near the origin than near the terminus. If the treated and Time0 samples were growing at different rates, then there will be variable recovery of barcodes near the origin that does not relate to the fitness of the genes. This is plotted for each experiment in the chromosome bias plots (note that the y axis is the unnormalized fitness). To remove this effect, we subtract the running median of fitness (the line in those plots). Then, a constant is added so that the mode (the peak of the distribution of gene fitness values) is zero.
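
The two steps described above (subtract a running median along the chromosome, then shift so the mode is zero) can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation in FEBA.R. The window size, the histogram-based mode estimate, the treatment of the chromosome ends as non-circular, and the name normalize_chromosome_bias are all assumptions made for this example.

```python
import numpy as np

def normalize_chromosome_bias(fitness, window=251, bins=50):
    """fitness: unnormalized gene fitness values in chromosomal order.
    window and bins are illustrative choices, not values from FEBA.R."""
    fitness = np.asarray(fitness, dtype=float)
    n = fitness.size
    half = window // 2

    # Running median over neighboring genes. Windows are truncated at the
    # ends of the array; a circular chromosome could wrap around instead.
    running = np.empty(n)
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        running[i] = np.median(fitness[lo:hi])
    centered = fitness - running

    # Estimate the mode as the center of the fullest histogram bin
    # (an assumption; a kernel density estimate would also work).
    counts, edges = np.histogram(centered, bins=bins)
    k = int(np.argmax(counts))
    mode = 0.5 * (edges[k] + edges[k + 1])

    # Shift so the peak of the distribution sits at zero.
    return centered - mode
```

Applied to values that drift from the origin toward the terminus, the output should no longer trend with chromosomal position, and most genes should sit near zero.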

Also, in different preparations of genomic DNA, the efficiency of recovering plasmids can vary. So, for each plasmid (if this organism has any), the median fitness of the genes on the plasmid is set to zero. Plasmids with very few genes cannot be normalized and so their genes are excluded.
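
A minimal Python sketch of the per-plasmid step described above. It is not the FEBA.R code. The function name center_plasmids, the dictionary inputs, and the min_genes cutoff are hypothetical, since the text does not say how few genes count as "very few".

```python
from statistics import median

def center_plasmids(fitness, scaffold_of, plasmids, min_genes=10):
    """fitness: gene -> fitness value; scaffold_of: gene -> scaffold id;
    plasmids: set of scaffold ids treated as plasmids.
    min_genes is an assumed cutoff, not a value from the source."""
    # Group plasmid genes by the plasmid they sit on.
    by_plasmid = {}
    for gene, scaffold in scaffold_of.items():
        if scaffold in plasmids:
            by_plasmid.setdefault(scaffold, []).append(gene)

    # Genes that are not on a plasmid pass through unchanged.
    out = {g: f for g, f in fitness.items() if scaffold_of[g] not in plasmids}

    for scaffold, genes in by_plasmid.items():
        if len(genes) < min_genes:
            continue  # too few genes to normalize: exclude them
        med = median(fitness[g] for g in genes)
        for g in genes:
            out[g] = fitness[g] - med  # plasmid median becomes zero
    return out
```

Genes on plasmids below the cutoff are simply absent from the result, which mirrors the exclusion described above.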

If the genome sequence is not complete and is in many fragments, then it is not easy to tell if there is an effect of proximity to the terminus. Assuming that each scaffold is small, there will not be a significant variation in copy number across the scaffold. Since each scaffold is normalized separately, this should correct for the varying copy numbers of the scaffolds (but this has not been verified). Also, since it is difficult to distinguish a small scaffold from a plasmid, genes from small scaffolds are excluded.

The quality scores in the table of experiments

As of April 9, 2014, the requirements for a successful experiment are: (The rules are implemented in FEBA_Exp_Status() in FEBA.R)

The R image

The R image includes:

The pool of mutants

The pool file is used to assign barcodes to strains while assembling all.poolcount. It includes a separate row for each strain: (Mutant libraries made with transpososomes instead of a suicide plasmid do not have a delivery vector and will not have any "pastEnd" reads.)

Written by Morgan Price, Arkin group, Lawrence Berkeley Lab, January 2014