Bowtie: an Ultrafast, Lightweight Short Read Aligner

Bowtie Getting Started Guide
============================

Download and extract the appropriate Bowtie binary release from
http://bowtie-bio.sf.net into a fresh directory. Change to that
directory.

Performing alignments
---------------------

The Bowtie source and binary packages come with a pre-built index of
the E. coli genome and a set of 1,000 35-bp reads simulated from that
genome. To use Bowtie to align those reads, issue the following
command. If you get a "command not found" error, try adding "./"
before "bowtie".

    bowtie e_coli reads/e_coli_1000.fq

The first argument to bowtie is the basename of the index for the
genome to be searched. The second argument is the name of a FASTQ file
containing the reads.

Depending on your computer, the run might take anywhere from a few
seconds to about a minute. You will see bowtie print many lines of
output. Each line is an alignment for a read. The name of the aligned
read appears in the leftmost column. The final line should say
"Reported 699 alignments to 1 output stream(s)" or something similar.

Next, issue this command:

    bowtie -t e_coli reads/e_coli_1000.fq e_coli.map

This run calculates the same alignments as the previous run, but the
alignments are written to e_coli.map (the final argument) rather than
to the screen. Also, the -t option instructs Bowtie to print timing
statistics. The output should look something like this:

    Time loading forward index: 00:00:00
    Time loading mirror index: 00:00:00
    Seeded quality full-index search: 00:00:00
    # reads processed: 1000
    # reads with at least one reported alignment: 699 (69.90%)
    # reads that failed to align: 301 (30.10%)
    Reported 699 alignments to 1 output stream(s)
    Time searching: 00:00:00
    Overall time: 00:00:00

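As a quick sanity check, the number of distinct read names in
e_coli.map can be compared with the "# reads with at least one
reported alignment" line. The following is a minimal sketch, not part
of Bowtie. It assumes the default output is tab-delimited with the
read name in the first column, and count_aligned_reads is a
hypothetical helper name.

```shell
# count_aligned_reads is a hypothetical helper, not a Bowtie command.
# It prints the number of distinct read names found in column 1 of an
# alignment file, assuming one tab-delimited alignment per line.
count_aligned_reads() {
    cut -f1 "$1" | sort -u | awk 'END { print NR }'
}

# Example (run after e_coli.map has been written):
#   count_aligned_reads e_coli.map
```

If the printed number differs from the reported count, the output file
may be left over from an earlier run with different options.
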
Installing a pre-built index
----------------------------

Download the pre-built S. cerevisiae genome package from the Bowtie
FTP site:

    ftp://ftp.cbcb.umd.edu/pub/data/bowtie_indexes/s_cerevisiae.ebwt.zip

All pre-built indexes are packaged as .zip archives, and the S.
cerevisiae archive is named s_cerevisiae.ebwt.zip. When it has
finished downloading, extract the archive into the Bowtie 'indexes'
subdirectory using your preferred unzip tool. The index is now
installed.

To test that the index is properly installed, issue this command from
the Bowtie install directory:

    bowtie -c s_cerevisiae ATTGTAGTTCGAGTAAGTAATGTGGGTTTG

This command searches the S. cerevisiae index with a single read. The
-c argument instructs Bowtie to obtain read sequences directly from
the command line rather than from a file. If the index is installed
properly, this command should print a single alignment and then exit.

If you would rather install pre-built indexes somewhere other than the
'indexes' subdirectory of the Bowtie install directory, simply set the
BOWTIE_INDEXES environment variable to point to your preferred
directory and extract indexes there instead.

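One way to set this up in a Bourne-style shell is sketched below. The
directory name in the example and the use_index_dir helper are
assumptions chosen for illustration; any writable directory should
work.

```shell
# use_index_dir is a hypothetical helper: it creates the given directory
# if needed and points BOWTIE_INDEXES at it for the current shell session.
use_index_dir() {
    mkdir -p "$1" || return 1
    BOWTIE_INDEXES="$1"
    export BOWTIE_INDEXES
}

# Example, assuming the archive was downloaded to the current directory:
#   use_index_dir "$HOME/bowtie_indexes"
#   unzip s_cerevisiae.ebwt.zip -d "$BOWTIE_INDEXES"
#   bowtie -c s_cerevisiae ATTGTAGTTCGAGTAAGTAATGTGGGTTTG
```

The variable only lasts for the current session unless the assignment
is also added to a shell startup file.
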
Building a new index
--------------------

The pre-built E. coli index included with Bowtie is built from the
sequence for strain 536, known to cause urinary tract infections. We
will create a new index from the sequence of E. coli strain O157:H7, a
strain known to cause food poisoning. Download the sequence file from:

    ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_O157H7/NC_002127.fna

When the sequence file has finished downloading, move it to the Bowtie
install directory and issue this command:

    bowtie-build NC_002127.fna e_coli_O157_H7

The command should finish quickly and print several lines of status
messages. When the command has completed, note that the current
directory contains four new files named e_coli_O157_H7.1.ebwt,
e_coli_O157_H7.2.ebwt, e_coli_O157_H7.rev.1.ebwt, and
e_coli_O157_H7.rev.2.ebwt. These files constitute the index. Move
these files to the 'indexes' subdirectory to install the index.

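Before testing the index with bowtie, it can help to confirm that all
four files arrived in the target directory. The sketch below checks
only for the four file names listed in this guide; index_is_complete
is a hypothetical helper, not a Bowtie command.

```shell
# index_is_complete is a hypothetical helper, not part of Bowtie.
# Usage: index_is_complete DIR BASENAME
# Succeeds only if all four index files named in this guide exist in DIR.
index_is_complete() {
    for suffix in 1.ebwt 2.ebwt rev.1.ebwt rev.2.ebwt; do
        if [ ! -f "$1/$2.$suffix" ]; then
            echo "missing: $1/$2.$suffix" >&2
            return 1
        fi
    done
}

# Example, run from the Bowtie install directory:
#   mv e_coli_O157_H7.*.ebwt indexes/
#   index_is_complete indexes e_coli_O157_H7 && echo "index installed"
```
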
To test that the index is properly installed, issue this command:

    bowtie -c e_coli_O157_H7 GCGTGAGCTATGAGAAAGCGCCACGCTTCC

If the index is installed properly, this command should print a single
alignment and then exit.

Finding variations with SAMtools
--------------------------------

SAMtools (http://samtools.sf.net) is a suite of tools for storing,
manipulating, and analyzing alignments such as those output by Bowtie.
SAMtools understands alignments in either of two complementary
formats: the human-readable SAM format, or the binary BAM format.
Because Bowtie can output SAM (using the -S/--sam option), and SAM can
be converted to BAM using SAMtools, Bowtie users can make full use
of the analyses implemented in SAMtools, or in any other tools
supporting SAM or BAM.

We will use SAMtools to find SNPs in a set of simulated reads included
with Bowtie. The reads cover the first 10,000 bases of the pre-built
E. coli genome and contain 10 SNPs throughout. First, we run 'bowtie'
to align the reads, being sure to specify the -S option. We also
specify an output file that we will use as input for the next step
(though pipes can be used to accomplish the same thing without the
intermediate file):

    bowtie -S e_coli reads/e_coli_10000snp.fq ec_snp.sam

Next, we convert the SAM file to BAM in preparation for sorting. We
assume that SAMtools is installed and that the samtools binary is
accessible in the PATH.

    samtools view -bS -o ec_snp.bam ec_snp.sam

Next, we sort the BAM file, in preparation for SNP calling:

    samtools sort ec_snp.bam ec_snp.sorted

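The alignment, conversion, and sorting steps can also be chained with
pipes, so that no intermediate SAM or unsorted BAM file is written.
The sketch below rests on three assumptions: that bowtie writes
alignments to standard output when no output file is given, that the
installed samtools accepts "-" to mean standard input, and that it
uses the same 'samtools sort <in> <out.prefix>' form as the command
shown in this guide. align_to_sorted_bam is a hypothetical wrapper
name.

```shell
# align_to_sorted_bam is a hypothetical wrapper, not part of Bowtie or
# SAMtools.
# Usage: align_to_sorted_bam INDEX READS OUT_PREFIX
# Intended to write OUT_PREFIX.bam without creating intermediate files.
align_to_sorted_bam() {
    bowtie -S "$1" "$2" |
        samtools view -bS - |
        samtools sort - "$3"
}

# Example, intended to match the three separate commands in this guide:
#   align_to_sorted_bam e_coli reads/e_coli_10000snp.fq ec_snp.sorted
```

Keeping the intermediate files, as in the step-by-step commands, makes
it easier to see which stage failed when something goes wrong.
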
We now have a sorted BAM file called ec_snp.sorted.bam. Sorted BAM is
a useful format because the alignments are both compressed, which is
convenient for long-term storage, and sorted, which is convenient for
variant discovery. Finally, we call variants from the sorted BAM:

    samtools pileup -cv -f genomes/NC_008253.fna ec_snp.sorted.bam

For this sample data, the 'samtools pileup' command should print
records for 10 distinct SNPs, the first at position 541 of the
reference sequence.

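To check that count mechanically, the pileup records can be reduced to
distinct reference positions. This is a minimal sketch, assuming that
the first two tab-delimited columns of the 'samtools pileup' output
hold the reference name and the position. count_variant_sites is a
hypothetical helper name.

```shell
# count_variant_sites is a hypothetical helper, not a SAMtools command.
# It reads pileup text on standard input and prints the number of
# distinct (reference name, position) pairs, taken from columns 1 and 2.
count_variant_sites() {
    cut -f1,2 | sort -u | awk 'END { print NR }'
}

# Example (requires a samtools release that provides 'pileup'):
#   samtools pileup -cv -f genomes/NC_008253.fna ec_snp.sorted.bam |
#       count_variant_sites
```
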
See the SAMtools web site for details on how to use these and other
tools in the SAMtools suite: http://samtools.sf.net/.