Downloads for "Magic pools: parallel assessment of transposon delivery vectors in bacteria"
by Hualan Liu, Morgan N. Price, Robert Jordan Waters, Jayashree Ray, Hans K. Carlson, Jacob S. Lamson, Romy Chakraborty, Adam P. Arkin, and Adam M. Deutschbauer
Abstract
Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful
approach for discovering the functions of bacterial genes. However, the development of a
suitable TnSeq strategy for a given bacterium can be costly and time-consuming. To meet this
challenge, we describe a parts-based strategy for constructing libraries of hundreds of
transposon delivery vectors, which we term "magic pools." Within a magic pool, each
transposon vector has a different combination of promoters and antibiotic resistance markers as
well as a random DNA barcode sequence, which allows the tracking of each vector during
mutagenesis experiments. To identify an efficient vector for a given bacterium, we mutagenize it
with a magic pool and sequence the resulting insertions; we then use the best vector to
generate a large mutant library. We used the magic pool strategy to construct transposon
mutant libraries in five genera of bacteria, including three genera of the phylum Bacteroidetes.
Kanamycin-based magic pools
The mapping from barcode to vector is described in kan_magic_barcodes.tab
Characterization of diverse erythromycin-based magic pools with PacBio and Illumina
This was done separately for magic pool pHLL254 (erythromycin + Tn5) and magic pool pHLL255 (erythromycin + mariner).
The approach was:
- Run "Reads for Insert" on SMRT portal to get a fastq file with the circular consensus reads.
- Use bwa mem to map these reads to the part sequences. (Do not commingle parts across magic pools.)
- Use parseAln.pl to parse the SAM file
- Use cutFastq.pl to cut out the barcodes upstream of seq2
- Use topCodesToFna.pl to make a database of barcodes covering the top 98% (and 2 flanking nt on each side), based on the Illumina data
- Use blastn to match to the expected barcodes
- Use identifyParts.pl to make the table
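The barcode-selection step can be illustrated with a minimal Python sketch. It is not the code in topCodesToFna.pl. It assumes that "the top 98%" means the most abundant barcodes whose reads together account for at least 98% of all barcode reads, and it leaves out the 2 flanking nt. The function names, the toy 4-nt barcodes, and the counts are hypothetical.

```python
def top_barcodes(counts, coverage=0.98):
    """Return the most abundant barcodes whose reads together reach
    at least `coverage` of all reads (an assumed reading of "top 98%")."""
    total = sum(counts.values())
    if total == 0:
        return []
    kept = []
    running = 0
    # Most abundant first; ties broken alphabetically for a stable order.
    for barcode, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if running >= coverage * total:
            break
        kept.append(barcode)
        running += n
    return kept


def to_fasta(barcodes):
    """Format barcodes as FASTA records, using each barcode as its own name."""
    return "".join(f">{bc}\n{bc}\n" for bc in barcodes)


# Toy counts (hypothetical); real barcodes are longer than 4 nt.
counts = {"AAAC": 60, "CCGT": 30, "GGTA": 9, "TTTT": 1}
kept = top_barcodes(counts)
fasta = to_fasta(kept)
print(fasta, end="")
```

In this toy example the rare barcode TTTT is dropped, which is the point of the cutoff: rare barcodes are more likely to be sequencing errors than real vectors.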
Download scripts and key tables (1 MB)
- The *.parts files include sequences of all the parts that went into these magic pools
- The *.codes files are the result of analyzing the barseq reads for each magic pool with MultiCodes.pl from the FEBA code base (these are the input to topCodesToFna.pl)
- The *.codedb files are the databases of reliable barcodes, to be formatted for searching with blastn (the output of topCodesToFna.pl)
- The *.bcparts files show how an individual PacBio read links a barcode to one or more parts (the output of identifyParts.pl)
- The *.tab files show which barcodes are confidently linked to which parts (generated with a few lines of R from the bcparts files)
- This tarball also includes kan_magic_barcodes.tab, which describes the simpler kanamycin-based magic pools
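The R code that turns the *.bcparts files into the *.tab files is not shown on this page. As a rough illustration only, the Python sketch below shows one way per-read links could be reduced to confident barcode-to-parts assignments. The input layout, the thresholds (`min_reads`, `min_fraction`), and the part names are all assumptions, not the criteria used for the published tables.

```python
from collections import Counter, defaultdict


def confident_links(read_links, min_reads=2, min_fraction=0.9):
    """Map each barcode to its dominant combination of parts.

    read_links: iterable of (barcode, parts) pairs, one per read.
    A barcode is kept only if it has at least `min_reads` reads and its
    most common combination accounts for at least `min_fraction` of them.
    Both thresholds are hypothetical.
    """
    by_barcode = defaultdict(Counter)
    for barcode, parts in read_links:
        by_barcode[barcode][tuple(parts)] += 1

    result = {}
    for barcode, combos in by_barcode.items():
        total = sum(combos.values())
        best, n = combos.most_common(1)[0]
        if total >= min_reads and n / total >= min_fraction:
            result[barcode] = best
    return result


# Toy reads with made-up barcode and part names.
reads = [
    ("BC1", ("promA", "markerX")),
    ("BC1", ("promA", "markerX")),
    ("BC2", ("promB", "markerX")),
    ("BC2", ("promA", "markerY")),
    ("BC3", ("promB", "markerY")),
]
links = confident_links(reads)
print(links)
```

Here BC2 is rejected because its two reads disagree, and BC3 is rejected because a single read is too little evidence under the assumed thresholds.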
The script to analyze TnSeq data from a magic pool and describe how well the different vectors work (MapMagicPool.pl) is available from the FEBA code base. The processing of the raw TnSeq reads was done using MapTnSeq.pl from that code base.
Genomes
The genome sequences, the protein sequences, and the gene information for each bacterium we studied are included in this tarball (59 MB). For bacteria that we sequenced, it also includes a GenBank file.
Preliminary pools of mutants from the magic pools
For each preliminary library that was built with a "magic pool" and mapped by TnSeq, we report the confidently mapped insertions. For each vector in the magic pool that had sufficient insertions, we also report the quality metrics from MapMagicPool.pl.
Final pools of barcoded mutants
As of June 2017, we have not tried to build a final pool for Sphingopyxis sp. GW247-27LB ("Sphingo4").
Fitness data
For the fitness assays performed with the final mutant libraries, the FEBA code base was used to infer fitness values and to build a mini-web site with the results and quality metrics.
Supplementary tables
The supplementary tables for the preprint are available here (Excel format).
Page by Morgan Price, Arkin lab, May 2017