Downloads for "Magic pools: parallel assessment of transposon delivery vectors in bacteria"

by Hualan Liu, Morgan N. Price, Robert Jordan Waters, Jayashree Ray, Hans K. Carlson, Jacob S. Lamson, Romy Chakraborty, Adam P. Arkin, and Adam M. Deutschbauer

Abstract

Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach for discovering the functions of bacterial genes. However, the development of a suitable TnSeq strategy for a given bacterium can be costly and time-consuming. To meet this challenge, we describe a parts-based strategy for constructing libraries of hundreds of transposon delivery vectors, which we term "magic pools." Within a magic pool, each transposon vector has a different combination of promoters and antibiotic resistance markers as well as a random DNA barcode sequence, which allows the tracking of each vector during mutagenesis experiments. To identify an efficient vector for a given bacterium, we mutagenize it with a magic pool and sequence the resulting insertions; we then use the best vector to generate a large mutant library. We used the magic pool strategy to construct transposon mutant libraries in five genera of bacteria, including three genera of the phylum Bacteroidetes.

Kanamycin-based magic pools

The mapping from barcode to vector is described in kan_magic_barcodes.tab.

Characterization of diverse erythromycin-based magic pools with PacBio and Illumina

This was done separately for magic pool pHLL254 (erythromycin + Tn5) and magic pool pHLL255 (erythromycin + mariner).

The approach was:

Run "Reads for Insert" on SMRT portal to get a fastq file with the circular consensus reads.
- See consensus reads (146 MB)
Use bwa mem to map this to the parts. (Do not commingle parts across magic pools.)
Use parseAln.pl to parse the SAM file.
Use cutFastq.pl to cut out the barcodes upstream of seq2.
Use topCodesToFna.pl to make a database of barcodes covering the top 98% (plus 2 flanking nt on each side), based on the Illumina data.
Use blastn to match the cut-out barcodes to the expected barcodes.
Use identifyParts.pl to make the table that links barcodes to parts.
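
Two of the steps above can be illustrated with a minimal Python sketch. This is not the code in cutFastq.pl or topCodesToFna.pl. The function names (cut_barcode, top_barcodes), the 20-nt default barcode length, and the reading of "top 98%" as "the most abundant barcodes that together account for 98% of barcode reads" are assumptions made for illustration only. The sketch also ignores reverse-strand reads and sequencing errors in the flanking sequence.

```python
from typing import Dict, List, Optional


def cut_barcode(read: str, flank: str, barcode_len: int = 20) -> Optional[str]:
    """Return the barcode_len bases immediately upstream of `flank`.

    Returns None if `flank` is absent or there is not enough sequence
    before it. barcode_len=20 is an assumed default, not from the page.
    """
    pos = read.find(flank)
    if pos < barcode_len:  # covers pos == -1 (not found) and short prefixes
        return None
    return read[pos - barcode_len:pos]


def top_barcodes(counts: Dict[str, int], fraction: float = 0.98) -> List[str]:
    """Return the most abundant barcodes that together reach `fraction`
    of all barcode reads (one possible reading of "top 98%")."""
    total = sum(counts.values())
    if total == 0:
        return []
    selected: List[str] = []
    running = 0
    # Descending count; ties broken by barcode so the order is deterministic.
    for barcode, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if running >= fraction * total:
            break
        selected.append(barcode)
        running += n
    return selected
```

For example, with counts of 90, 9, and 1 reads for three barcodes, top_barcodes keeps the first two, since they already cover 99% of the reads.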

Download scripts and key tables (1 MB)

The *.parts files include the sequences of all the parts that went into these magic pools.
The *.codes files are the result of analyzing the barseq reads for each magic pool with MultiCodes.pl from the FEBA code base (these are the input to topCodesToFna.pl).
The *.codedb files are the databases of reliable barcodes, ready to be formatted for blastn (the output of topCodesToFna.pl).
The *.bcparts files show how an individual PacBio read links a barcode to one or more parts (the output of identifyParts.pl).
The *.tab files show which barcodes are confidently linked to which parts (generated with a few lines of R from the bcparts files).
This tarball also includes kan_magic_barcodes.tab, which describes the simpler kanamycin-based magic pools.
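
To give a sense of what "confidently linked" could mean when going from per-read links (bcparts) to per-barcode assignments (tab), here is a hedged Python sketch. The actual R code and its criteria are not reproduced on this page, so the function name confident_links, the thresholds min_reads and min_fraction, and the part labels in the usage note are all hypothetical. The idea shown is simply to keep a barcode only when enough reads support one combination of parts and few reads disagree.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def confident_links(
    links: Iterable[Tuple[str, str]],
    min_reads: int = 2,         # hypothetical threshold
    min_fraction: float = 0.9,  # hypothetical threshold
) -> Dict[str, str]:
    """Map each barcode to its dominant parts combination.

    `links` yields one (barcode, parts) pair per read. A barcode is kept
    only if its most common parts combination is supported by at least
    `min_reads` reads and by at least `min_fraction` of that barcode's reads.
    """
    per_barcode: Dict[str, Counter] = defaultdict(Counter)
    for barcode, parts in links:
        per_barcode[barcode][parts] += 1

    confident: Dict[str, str] = {}
    for barcode, tally in per_barcode.items():
        parts, n = tally.most_common(1)[0]
        total = sum(tally.values())
        if n >= min_reads and n / total >= min_fraction:
            confident[barcode] = parts
    return confident
```

With three reads linking barcode "BC1" to the made-up combination "P1+K2", and barcode "BC2" seen once each with two different combinations, only "BC1" is retained.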

The script to analyze TnSeq data from a magic pool and describe how well the different vectors work (MapMagicPool.pl) is available from the FEBA code base. The processing of the raw TnSeq reads was done using MapTnSeq.pl from that code base.

Genomes

The genome sequences, the protein sequences, and the gene information for each bacterium we studied are included in this tarball (59 MB). For bacteria that we sequenced, it also includes a GenBank file.

Preliminary pools of mutants from the magic pools

For each preliminary library that was built with a "magic pool" and mapped by TnSeq, we report the confidently mapped insertions. For each vector in the magic pool that had sufficient insertions, we also report the quality metrics from MapMagicPool.pl.

Final pools of barcoded mutants

Brev2: pool and statistics
Cola: pool and statistics
Ponti: pool and statistics
Sphingo3: pool and statistics
Pedo557: pool and statistics

As of June 2017, we have not tried to build a final pool for Sphingopyxis sp. GW247-27LB ("Sphingo4").

Fitness data

For the fitness assays performed with the final mutant libraries, the FEBA code base was used to infer fitness values and to build a mini-web site with the results and quality metrics.

Brevundimonas sp. GW460-12-10-14-LB2 ("Brev2")
Echinicola vietnamensis KMM 6221, DSM 17526 ("Cola")
Pontibacter actiniarum KMM 6156, DSM 19842 ("Ponti")
Sphingobium sp. GW456-12-10-14-TSB1 ("Sphingo3")
Pedobacter sp. GW460-11-11-14-LB5 ("Pedo557")

Supplementary tables

The supplementary tables for the preprint are available here (Excel format).

Page by Morgan Price, Arkin lab, May 2017