--- /dev/null
+Bowtie authors
+Primary contact <blangmea@jhsph.edu>
+
+Ben Langmead and Cole Trapnell wrote Bowtie. The SeqAn-1.1 library is
+used in Bowtie and some of its sources are included in Bowtie source
+releases; its authors are Andreas Doring, David Weese, Tobias Rausch,
+and Knut Reinert. A DLL from the pthreads for Win32 library is
+distributed with the Win32 version of Bowtie. The pthreads for Win32
+library and the GnuWin32 package have many contributors (see their
+respective web sites).
+
+Websites:
+
+ Bowtie: http://bowtie-bio.sf.net
+ SeqAn: http://www.seqan.de
+ pthreads for Win32: http://sourceware.org/pthreads-win32
+ GnuWin32: http://gnuwin32.sf.net
+
+December 2009
--- /dev/null
+The Artistic License
+
+Preamble
+
+The intent of this document is to state the conditions under which a
+Package may be copied, such that the Copyright Holder maintains some
+semblance of artistic control over the development of the package,
+while giving the users of the package the right to use and distribute
+the Package in a more-or-less customary fashion, plus the right to
+make reasonable modifications.
+
+Definitions:
+ * "Package" refers to the collection of files distributed by the
+ Copyright Holder, and derivatives of that collection of files
+ created through textual modification.
+ * "Standard Version" refers to such a Package if it has not been
+ modified, or has been modified in accordance with the wishes of
+ the Copyright Holder.
+ * "Copyright Holder" is whoever is named in the copyright or
+ copyrights for the package.
+ * "You" is you, if you're thinking about copying or distributing
+ this Package.
+ * "Reasonable copying fee" is whatever you can justify on the
+ basis of media cost, duplication charges, time of people
+ involved, and so on. (You will not be required to justify it to
+ the Copyright Holder, but only to the computing community at
+ large as a market that must bear the fee.)
+ * "Freely Available" means that no fee is charged for the item
+ itself, though there may be fees involved in handling the
+ item. It also means that recipients of the item may redistribute
+ it under the same conditions they received it.
+
+1. You may make and give away verbatim copies of the source form of
+ the Standard Version of this Package without restriction, provided
+ that you duplicate all of the original copyright notices and
+ associated disclaimers.
+
+2. You may apply bug fixes, portability fixes and other modifications
+ derived from the Public Domain or from the Copyright Holder. A
+ Package modified in such a way shall still be considered the
+ Standard Version.
+
+3. You may otherwise modify your copy of this Package in any way,
+ provided that you insert a prominent notice in each changed file
+ stating how and when you changed that file, and provided that you
+ do at least ONE of the following:
+
+ a) place your modifications in the Public Domain or otherwise make
+ them Freely Available, such as by posting said modifications to
+ Usenet or an equivalent medium, or placing the modifications on a
+ major archive site such as ftp.uu.net, or by allowing the
+ Copyright Holder to include your modifications in the Standard
+ Version of the Package.
+
+ b) use the modified Package only within your corporation or
+ organization.
+
+ c) rename any non-standard executables so the names do not
+ conflict with standard executables, which must also be provided,
+ and provide a separate manual page for each non-standard
+ executable that clearly documents how it differs from the Standard
+ Version.
+
+ d) make other distribution arrangements with the Copyright Holder.
+
+4. You may distribute the programs of this Package in object code or
+ executable form, provided that you do at least ONE of the
+ following:
+
+ a) distribute a Standard Version of the executables and library
+ files, together with instructions (in the manual page or
+ equivalent) on where to get the Standard Version.
+
+ b) accompany the distribution with the machine-readable source of
+ the Package with your modifications.
+
+ c) accompany any non-standard executables with their corresponding
+ Standard Version executables, giving the non-standard executables
+ non-standard names, and clearly documenting the differences in
+ manual pages (or equivalent), together with instructions on where
+ to get the Standard Version.
+
+ d) make other distribution arrangements with the Copyright Holder.
+
+5. You may charge a reasonable copying fee for any distribution of
+ this Package. You may charge any fee you choose for support of this
+ Package. You may not charge a fee for this Package itself. However,
+ you may distribute this Package in aggregate with other (possibly
+ commercial) programs as part of a larger (possibly commercial)
+ software distribution provided that you do not advertise this
+ Package as a product of your own.
+
+6. The scripts and library files supplied as input to or produced as
+ output from the programs of this Package do not automatically fall
+ under the copyright of this Package, but belong to whomever
+ generated them, and may be sold commercially, and may be aggregated
+ with this Package.
+
+7. C or perl subroutines supplied by you and linked into this Package
+ shall not be considered part of this Package.
+
+8. The name of the Copyright Holder may not be used to endorse or
+ promote products derived from this software without specific prior
+ written permission.
+
+9. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
+ WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES
+ OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
+
+The End
+This license is approved by the Open Source Initiative
+(www.opensource.org) for certifying software as OSI Certified Open
+Source.
+
--- /dev/null
+
+What is Bowtie?
+===============
+
+[Bowtie] is an ultrafast, memory-efficient short read aligner geared
+toward quickly aligning large sets of short DNA sequences (reads) to
+large genomes. It aligns 35-base-pair reads to the human genome at a
+rate of 25 million reads per hour on a typical workstation. Bowtie
+indexes the genome with a [Burrows-Wheeler] index to keep its memory
+footprint small: for the human genome, the index is typically about
+2.2 GB (for unpaired alignment) or 2.9 GB (for paired-end or colorspace
+alignment). Multiple processors can be used simultaneously to achieve
+greater alignment speed. Bowtie can also output alignments in the
+standard [SAM] format, allowing Bowtie to interoperate with other tools
+supporting SAM, including the [SAMtools] consensus, SNP, and indel
+callers. Bowtie runs on the command line under Windows, Mac OS X,
+Linux, and Solaris.
+
+[Bowtie] also forms the basis for other tools, including [TopHat]: a
+fast splice junction mapper for RNA-seq reads, [Cufflinks]: a tool for
+transcriptome assembly and isoform quantitation from RNA-seq reads,
+[Crossbow]: a cloud-computing software tool for large-scale
+resequencing data, and [Myrna]: a cloud-computing tool for calculating
+differential gene expression in large RNA-seq datasets.
+
+If you use [Bowtie] for your published research, please cite the
+[Bowtie paper].
+
+[Bowtie]: http://bowtie-bio.sf.net
+[Burrows-Wheeler]: http://en.wikipedia.org/wiki/Burrows-Wheeler_transform
+[SAM]: http://samtools.sourceforge.net/SAM1.pdf
+[SAMtools]: http://samtools.sourceforge.net/
+[TopHat]: http://tophat.cbcb.umd.edu/
+[Cufflinks]: http://cufflinks.cbcb.umd.edu/
+[Crossbow]: http://bowtie-bio.sf.net/crossbow
+[Myrna]: http://bowtie-bio.sf.net/myrna
+[Bowtie paper]: http://genomebiology.com/2009/10/3/R25
+
+What isn't Bowtie?
+==================
+
+Bowtie is not a general-purpose alignment tool like [MUMmer], [BLAST]
+or [Vmatch]. Bowtie works best when aligning short reads to large
+genomes, though it supports arbitrarily small reference sequences (e.g.
+amplicons) and reads as long as 1024 bases. Bowtie is designed to be
+extremely fast for sets of short reads where (a) many of the reads have
+at least one good, valid alignment, (b) many of the reads are
+relatively high-quality, and (c) the number of alignments reported per
+read is small (close to 1).
+
+Bowtie does not yet report gapped alignments; this is future work.
+
+[MUMmer]: http://mummer.sourceforge.net/
+[BLAST]: http://blast.ncbi.nlm.nih.gov/Blast.cgi
+[Vmatch]: http://www.vmatch.de/
+
+Obtaining Bowtie
+================
+
+You may download either Bowtie sources or binaries for your platform
+from the [Download] section of the Sourceforge project site. Binaries
+are currently available for Intel architectures (`i386` and `x86_64`)
+running Linux, Windows, and Mac OS X.
+
+Building from source
+--------------------
+
+Building Bowtie from source requires a GNU-like environment that
+includes GCC, GNU Make and other basics. It should be possible to
+build Bowtie on a vanilla Linux or Mac installation. Bowtie can also
+be built on Windows using [Cygwin] or [MinGW]. We recommend
+[TDM's MinGW Build]. If using [MinGW], you must also have [MSYS]
+installed.
+
+To build Bowtie, extract the sources, change to the extracted
+directory, and run GNU `make` (usually with the command `make`, but
+sometimes with `gmake`) with no arguments. If building with [MinGW],
+run `make` from the [MSYS] command line.
+
+To support the `-p` (multithreading) option, Bowtie needs the
+`pthreads` library. To compile Bowtie without `pthreads` (which
+disables `-p`), use `make BOWTIE_PTHREADS=0`.
+
+[Cygwin]: http://www.cygwin.com/
+[MinGW]: http://www.mingw.org/
+[TDM's MinGW Build]: http://www.tdragon.net/recentgcc/
+[MSYS]: http://www.mingw.org/wiki/msys
+[Download]: https://sourceforge.net/projects/bowtie-bio/files/bowtie/
+
+The `bowtie` aligner
+====================
+
+`bowtie` takes an index and a set of reads as input and outputs a list
+of alignments. Alignments are selected according to a combination of
+the `-v`/`-n`/`-e`/`-l` options (plus the `-I`/`-X`/`--fr`/`--rf`/
+`--ff` options for paired-end alignment), which define which alignments
+are legal, and the `-k`/`-a`/`-m`/`-M`/`--best`/`--strata` options
+which define which and how many legal alignments should be reported.
+
+By default, Bowtie enforces an alignment policy similar to [Maq]'s
+default quality-aware policy (`-n` 2 `-l` 28 `-e` 70). See [the -n
+alignment mode] section of the manual for details about this mode. But
+Bowtie can also enforce a simpler end-to-end k-difference policy (e.g.
+with `-v` 2). See [the -v alignment mode] section of the manual for
+details about that mode. [The -n alignment mode] and [the -v alignment
+mode] are mutually exclusive.
+
+Bowtie works best when aligning short reads to large genomes (e.g.
+human or mouse), though it supports arbitrarily small reference
+sequences and reads as long as 1024 bases. Bowtie is designed to be
+very fast for sets of short reads where a) many reads have at least one
+good, valid alignment, b) many reads are relatively high-quality, c)
+the number of alignments reported per read is small (close to 1).
+These criteria are generally satisfied in the context of modern
+short-read analyses such as RNA-seq, ChIP-seq, other types of -seq, and
+mammalian resequencing. You may observe longer running times in other
+research contexts.
+
+If `bowtie` is too slow for your application, try some of the
+performance-tuning hints described in the [Performance Tuning] section
+below.
+
+Alignments involving one or more ambiguous reference characters (`N`,
+`-`, `R`, `Y`, etc.) are considered invalid by Bowtie. This is true
+only for ambiguous characters in the reference; alignments involving
+ambiguous characters in the read are legal, subject to the alignment
+policy. Ambiguous characters in the read mismatch all other
+characters. Alignments that "fall off" the reference sequence are not
+considered valid.
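+
+The validity rules above can be sketched in a few lines of Python.
+This is only an illustrative model, not Bowtie's implementation; the
+function name `count_mismatches` and the 0-based `offset` convention
+are assumptions made for the example.
+
```python
def count_mismatches(read, ref, offset):
    """Return the mismatch count for an ungapped alignment of `read` at
    0-based `offset` in `ref`, or None if the alignment is invalid."""
    # Alignments that "fall off" either end of the reference are invalid.
    if offset < 0 or offset + len(read) > len(ref):
        return None
    window = ref[offset:offset + len(read)].upper()
    mismatches = 0
    for read_char, ref_char in zip(read.upper(), window):
        # Any ambiguous reference character invalidates the alignment.
        if ref_char not in "ACGT":
            return None
        # An ambiguous read character mismatches every reference character.
        if read_char not in "ACGT" or read_char != ref_char:
            mismatches += 1
    return mismatches


print(count_mismatches("ACGT", "TTACGTTT", 2))  # exact match: 0
print(count_mismatches("ANGT", "TTACGTTT", 2))  # N in the read: 1 mismatch
print(count_mismatches("ACGT", "TTACNTTT", 2))  # N in the reference: None
print(count_mismatches("ACGT", "TTACGTTT", 6))  # falls off the end: None
```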
+
+The process by which `bowtie` chooses an alignment to report is
+randomized in order to avoid "mapping bias" - the phenomenon whereby
+an aligner systematically fails to report a particular class of good
+alignments, causing spurious "holes" in the comparative assembly.
+Whenever `bowtie` reports a subset of the valid alignments that exist,
+it makes an effort to sample them randomly. This randomness flows
+from a simple seeded pseudo-random number generator and is
+deterministic in the sense that Bowtie will always produce the same
+results for the same read when run with the same initial "seed" value
+(see `--seed` option).
+
+In the default mode, `bowtie` can exhibit strand bias. Strand bias
+occurs when input reference and reads are such that (a) some reads
+align equally well to sites on the forward and reverse strands of the
+reference, and (b) the number of such sites on one strand is different
+from the number on the other strand. When this happens for a given
+read, `bowtie` effectively chooses one strand or the other with 50%
+probability, then reports a randomly-selected alignment for that read
+from among the sites on the selected strand. This tends to overassign
+alignments to the sites on the strand with fewer sites and underassign
+to sites on the strand with more sites. The effect is mitigated,
+though it may not be eliminated, when reads are longer or when
+paired-end reads are used. Running Bowtie in `--best` mode
+eliminates strand bias by forcing Bowtie to select one strand or the
+other with a probability that is proportional to the number of best
+sites on the strand.
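+
+As a worked example of the bias described above, the sketch below
+computes the chance that any single site is reported for a read that
+aligns equally well everywhere. It models only the two selection
+rules stated in this section; the function name and the use of exact
+fractions are choices made for the illustration.
+
```python
from fractions import Fraction


def per_site_probability(forward_sites, reverse_sites, best=False):
    """Return (P(a given forward site), P(a given reverse site)) for one
    read that aligns equally well to every site on both strands."""
    if forward_sites < 1 or reverse_sites < 1:
        raise ValueError("the model assumes at least one site on each strand")
    if best:
        # --best: the strand is chosen in proportion to its number of best
        # sites, so every site ends up equally likely.
        p = Fraction(1, forward_sites + reverse_sites)
        return p, p
    # Default: each strand is chosen with probability 1/2, then a site is
    # chosen uniformly on that strand.
    return Fraction(1, 2 * forward_sites), Fraction(1, 2 * reverse_sites)


print(per_site_probability(1, 4))             # (Fraction(1, 2), Fraction(1, 8))
print(per_site_probability(1, 4, best=True))  # (Fraction(1, 5), Fraction(1, 5))
```
+
+With 1 forward site and 4 reverse sites, the lone forward site is
+reported four times as often as each reverse site in the default mode,
+which is the overassignment described above.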
+
+Gapped alignments are not currently supported, but support is planned
+for a future release.
+
+[Maq]: http://maq.sf.net
+
+The `-n` alignment mode
+-----------------------
+
+When the `-n` option is specified (which is the default), `bowtie`
+determines which alignments are valid according to the following
+policy, which is similar to [Maq]'s default policy.
+
+ 1. Alignments may have no more than `N` mismatches (where `N` is a
+ number 0-3, set with `-n`) in the first `L` bases (where `L` is a
+ number 5 or greater, set with `-l`) on the high-quality (left) end
+ of the read. The first `L` bases are called the "seed".
+
+ 2. The sum of the [Phred quality] values at *all* mismatched positions
+ (not just in the seed) may not exceed `E` (set with `-e`). Where
+ qualities are unavailable (e.g. if the reads are from a FASTA
+ file), the [Phred quality] defaults to 40.
+
+The `-n` option is mutually exclusive with the `-v` option.
+
+If there are many possible alignments satisfying these criteria, Bowtie
+gives preference to alignments with fewer mismatches and where the sum
+from criterion 2 is smaller. When the `--best` option is specified,
+Bowtie guarantees the reported alignment(s) are "best" in terms of
+these criteria (criterion 1 has priority), and that the alignments are
+reported in best-to-worst order. Bowtie is somewhat slower when
+`--best` is specified.
+
+Note that [Maq] internally rounds base qualities to the nearest 10 and
+rounds qualities greater than 30 to 30. To maintain compatibility,
+Bowtie does the same. Rounding can be suppressed with the
+`--nomaqround` option.
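+
+The two criteria and the rounding rule can be sketched as follows.
+This is a simplified model rather than Bowtie's code: the function
+names are hypothetical, mismatch positions are assumed to be 0-based
+offsets from the left end of the read, and rounding ties upward
+(e.g. 25 to 30) is an assumption. The defaults mirror `-n` 2 `-l` 28
+`-e` 70.
+
```python
def maq_round(quality):
    """Round a Phred quality to the nearest 10 and cap it at 30.
    Rounding ties upward (e.g. 25 -> 30) is an assumption."""
    return min(30, ((quality + 5) // 10) * 10)


def valid_in_n_mode(mismatch_positions, qualities, n=2, seed_len=28, e=70,
                    maqround=True):
    """Apply the two -n criteria to one ungapped alignment.

    mismatch_positions: 0-based offsets from the left end of the read.
    qualities: Phred quality for every read position (40 if unavailable).
    """
    # Criterion 1: at most n mismatches within the first seed_len bases.
    seed_mismatches = sum(1 for pos in mismatch_positions if pos < seed_len)
    if seed_mismatches > n:
        return False
    # Criterion 2: the summed quality over all mismatches must not exceed e.
    penalty = sum(
        maq_round(qualities[pos]) if maqround else qualities[pos]
        for pos in mismatch_positions
    )
    return penalty <= e


fasta_quals = [40] * 36  # FASTA input: every quality defaults to 40
print(valid_in_n_mode([3, 30], fasta_quals))                  # True: 30 + 30 <= 70
print(valid_in_n_mode([3, 30], fasta_quals, maqround=False))  # False: 40 + 40 > 70
print(valid_in_n_mode([1, 2, 3], fasta_quals))                # False: 3 seed mismatches
```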
+
+Bowtie is not fully sensitive in `-n` 2 and `-n` 3 modes by default.
+In these modes Bowtie imposes a "backtracking limit" to limit effort
+spent trying to find valid alignments for low-quality reads unlikely to
+have any. This may cause bowtie to miss some legal 2- and 3-mismatch
+alignments. The limit is set to a reasonable default (125 without
+`--best`, 800 with `--best`), but the user may decrease or increase the
+limit using the `--maxbts` and/or `-y` options. `-y` mode is
+relatively slow but guarantees full sensitivity.
+
+[Maq]: http://maq.sf.net
+[Phred quality]: http://en.wikipedia.org/wiki/FASTQ_format#Variations
+
+The `-v` alignment mode
+-----------------------
+
+In `-v` mode, alignments may have no more than `V` mismatches, where
+`V` may be a number from 0 through 3 set using the `-v` option.
+Quality values are ignored. The `-v` option is mutually exclusive with
+the `-n` option.
+
+If there are many legal alignments, Bowtie gives preference to
+alignments with fewer mismatches. When the `--best` option is
+specified, Bowtie guarantees the reported alignment(s) are "best" in
+terms of the number of mismatches, and that the alignments are reported
+in best-to-worst order. Bowtie is somewhat slower when `--best` is
+specified.
+
+Strata
+------
+
+In [the -n alignment mode], an alignment's "stratum" is defined as the
+number of mismatches in the "seed" region, i.e. the leftmost `L` bases,
+where `L` is set with the `-l` option. In [the -v alignment mode], an
+alignment's stratum is defined as the total number of mismatches in the
+entire alignment. Some of Bowtie's options (e.g. `--strata` and `-m`)
+use the notion of "stratum" to limit or expand the scope of reportable
+alignments.
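+
+The two definitions differ only in which mismatches are counted. A
+minimal sketch, assuming 0-based mismatch positions measured from the
+left end of the read and a hypothetical `mode` argument of `"n"` or
+`"v"`:
+
```python
def stratum(mismatch_positions, mode, seed_len=28):
    """Return an alignment's stratum under the -n or -v alignment mode."""
    if mode == "n":
        # -n mode: only mismatches inside the seed (leftmost seed_len bases).
        return sum(1 for pos in mismatch_positions if pos < seed_len)
    if mode == "v":
        # -v mode: every mismatch in the alignment counts.
        return len(mismatch_positions)
    raise ValueError("mode must be 'n' or 'v'")


print(stratum([3, 30], "n"))  # 1: only position 3 lies in the 28-base seed
print(stratum([3, 30], "v"))  # 2
```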
+
+Reporting Modes
+---------------
+
+With the `-k`, `-a`, `-m`, `-M`, `--best` and `--strata` options, the
+user can flexibly select which alignments are reported. Below we
+demonstrate a few ways in which these options can be combined. All
+examples are using the `e_coli` index packaged with Bowtie. The
+`--suppress` option is used to keep the output concise and some
+output is elided for clarity.
+
+ Example 1: `-a`
+
+ $ ./bowtie -a -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+ - gi|110640213|ref|NC_008253.1| 905664 6:A>G,7:G>T
+ + gi|110640213|ref|NC_008253.1| 1093035 2:T>G,15:A>T
+
+Specifying `-a` instructs bowtie to report *all* valid alignments,
+subject to the alignment policy: `-v` 2. In this case, bowtie finds
+5 inexact hits in the E. coli genome; 1 hit (the 2nd one listed)
+has 1 mismatch, and the other 4 hits have 2 mismatches. Four are on
+the reverse reference strand and one is on the forward strand. Note
+that they are not listed in best-to-worst order.
+
+ Example 2: `-k 3`
+
+ $ ./bowtie -k 3 -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+
+Specifying `-k` 3 instructs bowtie to report up to 3 valid
+alignments. In this case, a total of 5 valid alignments exist (see
+[Example 1]); `bowtie` reports 3 out of those 5. `-k` can be set to
+any integer greater than 0.
+
+ Example 3: `-k 6`
+
+ $ ./bowtie -k 6 -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+ - gi|110640213|ref|NC_008253.1| 905664 6:A>G,7:G>T
+ + gi|110640213|ref|NC_008253.1| 1093035 2:T>G,15:A>T
+
+Specifying `-k` 6 instructs bowtie to report up to 6 valid
+alignments. In this case, a total of 5 valid alignments exist, so
+`bowtie` reports all 5.
+
+ Example 4: default (`-k 1`)
+
+ $ ./bowtie -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+
+Leaving the reporting options at their defaults causes `bowtie` to
+report the first valid alignment it encounters. Because `--best` was
+not specified, we are not guaranteed that bowtie will report the best
+alignment, and in this case it does not (the 1-mismatch alignment from
+the previous example would have been better). The default reporting
+mode is equivalent to `-k` 1.
+
+ Example 5: `-a --best`
+
+ $ ./bowtie -a --best -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ + gi|110640213|ref|NC_008253.1| 1093035 2:T>G,15:A>T
+ - gi|110640213|ref|NC_008253.1| 905664 6:A>G,7:G>T
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+
+Specifying `-a` `--best` results in the same alignments being printed
+as if just `-a` had been specified, but they are guaranteed to be
+reported in best-to-worst order.
+
+ Example 6: `-a --best --strata`
+
+ $ ./bowtie -a --best --strata -v 2 --suppress 1,5,6,7 e_coli -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+
+Specifying `--strata` in addition to `-a` and `--best` causes
+`bowtie` to report only those alignments in the best alignment
+"stratum". The alignments in the best stratum are those having the
+least number of mismatches (or mismatches just in the "seed" portion of
+the alignment in the case of `-n` mode). Note that if `--strata`
+is specified, `--best` must also be specified.
+
+ Example 7: `-a -m 3`
+
+ $ ./bowtie -a -m 3 -v 2 e_coli -c ATGCATCATGCGCCAT
+ No alignments
+
+Specifying `-m` 3 instructs bowtie to refrain from reporting any
+alignments for reads having more than 3 reportable alignments. The
+`-m` option is useful when the user would like to guarantee that
+reported alignments are "unique", for some definition of unique.
+
+Example 1 showed that the read has 5 reportable alignments when `-a`
+and `-v` 2 are specified, so the `-m` 3 limit causes bowtie to
+output no alignments.
+
+ Example 8: `-a -m 5`
+
+ $ ./bowtie -a -m 5 -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+ - gi|110640213|ref|NC_008253.1| 905664 6:A>G,7:G>T
+ + gi|110640213|ref|NC_008253.1| 1093035 2:T>G,15:A>T
+
+Specifying `-m` 5 instructs bowtie to refrain from reporting any
+alignments for reads having more than 5 reportable alignments. Since
+the read has exactly 5 reportable alignments, the `-m` 5 limit allows
+`bowtie` to print them as usual.
+
+ Example 9: `-a -m 3 --best --strata`
+
+ $ ./bowtie -a -m 3 --best --strata -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+
+Specifying `-m` 3 instructs bowtie to refrain from reporting any
+alignments for reads having more than 3 reportable alignments. As we
+saw in Example 6, the read has only 1 reportable alignment when `-a`,
+`--best` and `--strata` are specified, so the `-m` 3 limit allows
+`bowtie` to print that alignment as usual.
+
+Intuitively, the `-m` option, when combined with the `--best` and
+`--strata` options, guarantees a principled, though weaker form of
+"uniqueness." A stronger form of uniqueness is enforced when `-m` is
+specified but `--best` and `--strata` are not.
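+
+The way these options interact in Examples 1 through 9 can be
+summarized with a small model. It is a deliberate simplification and
+not Bowtie's algorithm: the function and parameter names are
+hypothetical, alignments are taken in the order they are listed in
+Example 1, and the random sampling and tie ordering that `bowtie`
+applies are ignored.
+
```python
def select_alignments(alignments, k=1, report_all=False, m=None,
                      best=False, strata=False):
    """Pick the alignments to report for one read.

    alignments: list of (offset, stratum) tuples in the order found.
    """
    if strata and not best:
        raise ValueError("--strata requires --best")
    candidates = list(alignments)
    if best:
        # --best: report in best-to-worst order (fewest mismatches first).
        candidates.sort(key=lambda aln: aln[1])
    if strata and candidates:
        # --strata: keep only alignments in the best stratum.
        top = candidates[0][1]
        candidates = [aln for aln in candidates if aln[1] == top]
    if m is not None and len(candidates) > m:
        # -m: suppress every alignment for reads with too many candidates.
        return []
    return candidates if report_all else candidates[:k]


# (offset, mismatches) for the five -v 2 hits listed in Example 1
hits = [(148810, 2), (2852852, 1), (4930433, 2), (905664, 2), (1093035, 2)]
print(len(select_alignments(hits, report_all=True)))       # 5  (Example 1)
print(len(select_alignments(hits, k=3)))                   # 3  (Example 2)
print(select_alignments(hits, report_all=True, m=3))       # [] (Example 7)
print(select_alignments(hits, report_all=True, m=3,
                        best=True, strata=True))           # [(2852852, 1)] (Example 9)
```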
+
+Paired-end Alignment
+--------------------
+
+`bowtie` can align paired-end reads when properly paired read files are
+specified using the `-1` and `-2` options (for pairs of raw, FASTA, or
+FASTQ read files), or using the `--12` option (for Tab-delimited read
+files). A valid paired-end alignment satisfies these criteria:
+
+1. Both mates have a valid alignment according to the alignment policy
+ defined by the `-v`/`-n`/`-e`/`-l` options.
+2. The relative orientation and position of the mates satisfy the
+ constraints defined by the `-I`/`-X`/`--fr`/`--rf`/`--ff`
+ options.
+
+Policies governing which paired-end alignments are reported for a
+given read are specified using the `-k`, `-a` and `-m` options as
+usual. The `--strata` and `--best` options do not apply in
+paired-end mode.
+
+A paired-end alignment is reported as a pair of mate alignments, each
+on a separate line, where the alignment for each mate is formatted the
+same as an unpaired (singleton) alignment. The alignment for the mate
+that occurs closest to the beginning of the reference sequence (the
+"upstream" mate) is always printed before the alignment for the
+downstream mate. Read files containing paired-end reads will
+sometimes name the reads according to whether they are the #1 or #2
+mates by appending a `/1` or `/2` suffix to the read name. If no such
+suffix is present in Bowtie's input, the suffix will be added when
+Bowtie prints read names in alignments (except in `-S` "SAM" mode,
+where mate information is encoded in the `FLAGS` field instead).
+
+Finding a valid paired-end alignment where both mates align to
+repetitive regions of the reference can be very time-consuming. By
+default, Bowtie avoids much of this cost by imposing a limit on the
+number of "tries" it makes to match an alignment for one mate with a
+nearby alignment for the other. The default limit is 100. This causes
+`bowtie` to miss some valid paired-end alignments where both mates lie
+in repetitive regions, but the user may use the `--pairtries` or
+`-y` options to increase Bowtie's sensitivity as desired.
+
+Paired-end alignments where one mate's alignment is entirely contained
+within the other's are considered invalid.
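+
+The containment rule and the upstream-first print order can be
+illustrated as follows. This sketch assumes both mates align to the
+same reference sequence and represents each mate as a hypothetical
+half-open `(start, end)` interval; treating two identical spans as
+contained is also an assumption.
+
```python
def contains(outer, inner):
    """True if the half-open interval `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def order_mates(mate1, mate2):
    """Return the mates upstream-first, or None if one contains the other."""
    if contains(mate1, mate2) or contains(mate2, mate1):
        return None  # invalid paired-end alignment
    # The mate starting closest to the beginning of the reference goes first.
    return tuple(sorted((mate1, mate2)))


print(order_mates((300, 335), (100, 135)))  # ((100, 135), (300, 335))
print(order_mates((100, 200), (120, 150)))  # None
```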
+
+When colorspace alignment is enabled via `-C`, the default setting for
+paired-end orientation is `--ff`. This is because most SOLiD datasets
+have that orientation. When colorspace alignment is not enabled
+(default), the default setting for orientation is `--fr`, since most
+Illumina datasets have this orientation. The default can be overridden
+in either case.
+
+Because Bowtie uses an in-memory representation of the original
+reference string when finding paired-end alignments, its memory
+footprint is larger when aligning paired-end reads. For example, the
+human index has a memory footprint of about 2.2 GB in single-end mode
+and 2.9 GB in paired-end mode. Note that paired-end and unpaired
+alignment incur the same memory footprint in colorspace (e.g. human
+incurs about 2.9 GB).
+
+Colorspace Alignment
+--------------------
+
+As of version 0.12.0, `bowtie` can align colorspace reads against a
+colorspace index when `-C` is specified. Colorspace is the
+characteristic output format of Applied Biosystems' SOLiD system. In a
+colorspace read, each character is a color rather than a nucleotide,
+where a color encodes a class of dinucleotides. E.g. the color blue
+encodes any of the dinucleotides: AA, CC, GG, TT. Colorspace has the
+advantage of (often) being able to distinguish sequencing errors from
+SNPs once the read has been aligned. See ABI's [Principles of Di-Base
+Sequencing] document for details.
+
+ Colorspace reads
+
+All input formats (FASTA `-f`, FASTQ `-q`, raw `-r`, tab-delimited
+`--12`, command-line `-c`) are compatible with colorspace (`-C`).
+When `-C` is specified, read sequences are treated as colors. Colors
+may be encoded either as numbers (`0`=blue, `1`=green, `2`=orange,
+`3`=red) or as characters `A/C/G/T` (`A`=blue, `C`=green, `G`=orange,
+`T`=red).
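+
+For readers new to colorspace, the sketch below converts a nucleotide
+string into colors and shows the two equivalent spellings. It uses the
+standard di-base code, in which a color is the XOR of the 2-bit codes
+of adjacent bases; that construction is not spelled out in this manual
+and the function names are hypothetical, but it reproduces the blue
+class (AA, CC, GG, TT) given above.
+
```python
NUCLEOTIDE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
COLOR_LETTERS = "ACGT"  # 0/A=blue, 1/C=green, 2/G=orange, 3/T=red


def to_colors(nucleotides):
    """Encode a nucleotide string as a string of color digits (0-3)."""
    bits = [NUCLEOTIDE_BITS[base] for base in nucleotides.upper()]
    # Each color describes one pair of adjacent bases.
    return "".join(str(a ^ b) for a, b in zip(bits, bits[1:]))


def colors_as_letters(colors):
    """Rewrite color digits using the A/C/G/T spelling."""
    return "".join(COLOR_LETTERS[int(color)] for color in colors)


print(to_colors("AACCGGTT"))         # 0103010
print(colors_as_letters("0103010"))  # ACATACA
```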
+
+Some reads include a primer base as the first character; e.g.:
+
+ >1_53_33_F3
+ T2213120002010301233221223311331
+ >1_53_70_F3
+ T2302111203131231130300111123220
+ ...
+
+Here, `T` is the primer base. `bowtie` detects and handles primer
+bases properly (i.e., the primer base and the adjacent color are both
+trimmed away prior to alignment) as long as the rest of the read is
+encoded as numbers.
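+
+The trimming rule can be sketched as below. The helper name is
+hypothetical, and the check that the remainder contains only the
+digits 0-3 is a simplification of whatever detection `bowtie` performs
+internally.
+
```python
def trim_primer(read):
    """Drop a leading primer base and the color next to it, if present."""
    has_primer = (
        len(read) >= 2
        and read[0] in "ACGTacgt"
        and all(ch in "0123" for ch in read[1:])
    )
    # Both the primer base and the adjacent color are removed.
    return read[2:] if has_primer else read


print(trim_primer("T2213120002010301233221223311331"))  # primer and first color dropped
print(trim_primer("2213120002"))                        # no primer: unchanged
```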
+
+`bowtie` also handles input in the form of parallel `.csfasta` and
+`_QV.qual` files. Use `-f` to specify the `.csfasta` files and `-Q`
+(for unpaired reads) or `--Q1`/`--Q2` (for paired-end reads) to
+specify the corresponding `_QV.qual` files. It is not necessary to
+first convert to FASTQ, though `bowtie` also handles FASTQ-formatted
+colorspace reads (with `-q`, the default).
+
+ Building a colorspace index
+
+A colorspace index is built in the same way as a normal index except
+that `-C` must be specified when running `bowtie-build`. If the user
+attempts to use `bowtie` without `-C` to align against an index that
+was built with `-C` (or vice versa), `bowtie` prints an error message
+and quits.
+
+ Decoding colorspace alignments
+
+Once a colorspace read is aligned, Bowtie decodes the alignment into
+nucleotides and reports the decoded nucleotide sequence. A principled
+decoding scheme is necessary because many different decodings are
+usually possible. Finding the true decoding with 100% certainty
+requires knowing all variants (e.g. SNPs) in the subject's genome
+beforehand, which is usually not possible. Instead, `bowtie` employs
+the approximate decoding scheme described in the [BWA paper]. This
+scheme attempts to distinguish variants from sequencing errors
+according to their relative likelihood under a model that considers the
+quality values of the colors and the (configurable) global likelihood
+of a SNP.
+
+Quality values are also "decoded" so that each reported quality value
+is a function of the two color qualities overlapping it. Bowtie again
+adopts the scheme described in the [BWA paper], i.e., the decoded
+nucleotide quality is either the sum of the overlapping color qualities
+(when both overlapping colors correspond to bases that match in the
+alignment), the quality of the matching color minus the quality of the
+mismatching color, or 0 (when both overlapping colors correspond to
+mismatches).
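+
+The three cases can be written out directly. This is a literal
+transcription of the rule as stated here, with hypothetical names;
+whether a negative difference is clamped to 0 is not stated in this
+section, so the sketch leaves it unclamped.
+
```python
def decoded_quality(left_quality, right_quality, left_matches, right_matches):
    """Quality of a decoded nucleotide from its two overlapping colors."""
    if left_matches and right_matches:
        # Both colors agree with the alignment: qualities add up.
        return left_quality + right_quality
    if left_matches:
        # Only the left color matches: matching minus mismatching quality.
        return left_quality - right_quality
    if right_matches:
        return right_quality - left_quality
    # Both overlapping colors are mismatches.
    return 0


print(decoded_quality(30, 20, True, True))    # 50
print(decoded_quality(30, 20, True, False))   # 10
print(decoded_quality(30, 20, False, False))  # 0
```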
+
+For accurate decoding, `--snpphred`/`--snpfrac` should be set according
+to the user's best guess of the SNP frequency in the subject. The
+`--snpphred` parameter sets the SNP penalty directly (on the [Phred
+quality] scale), whereas `--snpfrac` allows the user to specify the
+fraction of sites expected to be SNPs; the fraction is then converted
+to a [Phred quality] internally. For the purpose of decoding, the SNP
+fraction is defined in terms of SNPs per *haplotype* base. Thus, if
+the genome is diploid, heterozygous SNPs have half the weight of
+homozygous SNPs.
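+
+To see how the two parameters relate, the sketch below applies the
+usual Phred definition, Q = -10 * log10(p), to a SNP fraction. That
+`bowtie` uses exactly this formula, and that the result is rounded to
+an integer, are assumptions made for illustration; the function name
+is hypothetical.
+
```python
import math


def snpfrac_to_phred(snp_fraction):
    """Convert an expected SNP fraction into a Phred-scaled penalty."""
    if not 0 < snp_fraction <= 1:
        raise ValueError("snp_fraction must be in the interval (0, 1]")
    # Standard Phred scale; rounding to an integer is an assumption.
    return round(-10 * math.log10(snp_fraction))


print(snpfrac_to_phred(0.001))  # 30: one SNP per 1,000 haplotype bases
```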
+
+Note that in `-S`/`--sam` mode, the decoded nucleotide sequence is
+printed for alignments, but the original color sequence (with `A`=blue,
+`C`=green, `G`=orange, `T`=red) is printed for reads without
+any reported alignments. As always, the `--un`, `--max` and `--al`
+parameters print reads exactly as they appeared in the input file.
+
+ Paired-end colorspace alignment
+
+Like other platforms, SOLiD supports generation of paired-end reads.
+When colorspace alignment is enabled, the default paired-end
+orientation setting is `--ff`. This is because most SOLiD datasets
+have that orientation.
+
+Note that SOLiD-generated read files can have "orphaned" mates; i.e.
+mates without a correspondingly-named mate in the other file. To avoid
+problems due to orphaned mates, SOLiD paired-end output should first be
+converted to `.csfastq` files with unpaired mates omitted. This can be
+accomplished using, for example, [Galaxy]'s conversion tool (click
+"NGS: QC and manipulation", then "SOLiD-to-FASTQ" in the left-hand
+sidebar).
+
+[Principles of Di-Base Sequencing]: http://tinyurl.com/ygnb2gn
+[BWA paper]: http://bioinformatics.oxfordjournals.org/cgi/content/abstract/25/14/1754
+
+Performance Tuning
+------------------
+
+1. Use 64-bit bowtie if possible
+
+ The 64-bit version of Bowtie is substantially (usually more than
+ 50%) faster than the 32-bit version, owing to its use of 64-bit
+ arithmetic. If possible, download the 64-bit binaries for Bowtie
+ and run on a 64-bit computer. If you are building Bowtie from
+ sources, you may need to pass the `-m64` option to `g++` to compile
+ the 64-bit version; you can do this by including `BITS=64` in the
+ arguments to the `make` command; e.g.: `make BITS=64 bowtie`. To
+ determine whether your version of bowtie is 64-bit or 32-bit, run
+ `bowtie --version`.
+
+2. If your computer has multiple processors/cores, use `-p`
+
+ The `-p` option causes Bowtie to launch a specified number of
+ parallel search threads. Each thread runs on a different
+ processor/core and all threads find alignments in parallel,
+ increasing alignment throughput by approximately a multiple of the
+ number of threads (though in practice, speedup is somewhat worse
+ than linear).
+
+3. If reporting many alignments per read, try tweaking
+ `bowtie-build --offrate`
+
+ If you are using the `-k`, `-a` or `-m` options and Bowtie is
+ reporting many alignments per read (an average of more than about
+ 10 per read) and you have some memory to spare, using an index with
+ a denser SA sample can speed things up considerably.
+
+ To do this, specify a smaller-than-default `-o`/`--offrate` value
+ when running `bowtie-build`. A denser SA sample yields a larger
+ index, but is also particularly effective at speeding up alignment
+ when many alignments are reported per read. For example,
+ decreasing the index's `-o`/`--offrate` by 1 could as much as
+ double alignment performance, and decreasing by 2 could quadruple
+ alignment performance, etc.
+
+ On the other hand, decreasing `-o`/`--offrate` increases the size
+ of the Bowtie index, both on disk and in memory when aligning
+ reads. At the default `-o`/`--offrate` of 5, the SA sample for the
+ human genome occupies about 375 MB of memory when aligning reads.
+ Decreasing the `-o`/`--offrate` by 1 doubles the memory taken by
+ the SA sample, and decreasing by 2 quadruples the memory taken,
+ etc.
+
+4. If bowtie "thrashes", try increasing `bowtie --offrate`
+
+ If `bowtie` runs very slowly on a relatively low-memory machine
+ (having less than about 4 GB of memory), then try setting `bowtie`
+ `-o`/`--offrate` to a *larger* value than the value used to build
+ the index. For example, `bowtie-build`'s default `-o`/`--offrate`
+ is 5 and all pre-built indexes available from the Bowtie website
+ are built with `-o`/`--offrate` 5; so if `bowtie` thrashes when
+ querying such an index, try using `bowtie` `--offrate` 6. If
+ `bowtie` still thrashes, try `bowtie` `--offrate` 7, etc. A higher
+ `-o`/`--offrate` causes `bowtie` to use a sparser sample of the
+ suffix array than is stored in the index; this saves memory but
+ makes alignment reporting slower (which is especially slow when
+ using `-a` or large `-k` or `-m`).
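+
+The memory side of the `-o`/`--offrate` trade-off in items 3 and 4 can
+be sketched numerically. This is only an illustration: `sa_sample_mb`
+is a hypothetical helper, the 375 MB baseline is the human-genome
+figure quoted above, and the assumption that each *increase* of the
+offrate halves the sample (mirroring the doubling per decrease) is an
+extrapolation, not something `bowtie` reports.
+
```python
def sa_sample_mb(offrate, base_offrate=5, base_mb=375.0):
    """Estimate SA-sample memory (MB) at a given offrate.

    Scales the quoted baseline (about 375 MB at offrate 5 for the human
    genome) by a factor of 2 per offrate step. Halving for offrates
    above the baseline is an assumption.
    """
    return base_mb * 2.0 ** (base_offrate - offrate)


for offrate in (3, 4, 5, 6, 7):
    print(f"--offrate {offrate}: ~{sa_sample_mb(offrate):.0f} MB")
```
+
+Only the SA sample is modeled here; the rest of the index is not
+affected by the offrate.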
+
+Command Line
+------------
+
+Usage:
+
+ bowtie [options]* <ebwt> {-1 <m1> -2 <m2> | --12 <r> | <s>} [<hit>]
+
+ Main arguments
+
+ <ebwt>
+
+The basename of the index to be searched. The basename is the name of
+any of the index files up to but not including the final `.1.ebwt` /
+`.rev.1.ebwt` / etc. `bowtie` looks for the specified index first in
+the current directory, then in the `indexes` subdirectory under the
+directory where the `bowtie` executable is located, then looks in the
+directory specified in the `BOWTIE_INDEXES` environment variable.
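+
+The lookup order described above can be sketched as follows. This is a
+rough model rather than `bowtie`'s actual code: `find_index` is a
+hypothetical helper, and testing for the `.1.ebwt` file alone is an
+assumption made to keep the example short.
+
```python
import os


def find_index(basename, bowtie_dir, env=None):
    """Return the first directory that holds `basename`.1.ebwt, or None.

    Search order: current directory, then `indexes` under the directory
    of the bowtie executable, then the BOWTIE_INDEXES directory.
    """
    env = os.environ if env is None else env
    candidates = [os.getcwd(), os.path.join(bowtie_dir, "indexes")]
    if env.get("BOWTIE_INDEXES"):
        candidates.append(env["BOWTIE_INDEXES"])
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, basename + ".1.ebwt")):
            return directory
    return None
```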
+
+ <m1>
+
+Comma-separated list of files containing the #1 mates (filename usually
+includes `_1`), or, if `-c` is specified, the mate sequences
+themselves. E.g., this might be `flyA_1.fq,flyB_1.fq`, or, if `-c`
+is specified, this might be `GGTCATCCT,ACGGGTCGT`. Sequences specified
+with this option must correspond file-for-file and read-for-read with
+those specified in `<m2>`. Reads may be a mix of different lengths.
+If `-` is specified, `bowtie` will read the #1 mates from the "standard
+in" filehandle.
+
+ <m2>
+
+Comma-separated list of files containing the #2 mates (filename usually
+includes `_2`), or, if `-c` is specified, the mate sequences
+themselves. E.g., this might be `flyA_2.fq,flyB_2.fq`, or, if `-c`
+is specified, this might be `GGTCATCCT,ACGGGTCGT`. Sequences specified
+with this option must correspond file-for-file and read-for-read with
+those specified in `<m1>`. Reads may be a mix of different lengths.
+If `-` is specified, `bowtie` will read the #2 mates from the "standard
+in" filehandle.
+
+ <r>
+
+Comma-separated list of files containing a mix of unpaired and
+paired-end reads in Tab-delimited format. Tab-delimited format is a
+1-read-per-line format where unpaired reads consist of a read name,
+sequence and quality string each separated by tabs. A paired-end read
+consists of a read name, sequence of the #1 mate, quality values of the
+#1 mate, sequence of the #2 mate, and quality values of the #2 mate
+separated by tabs. Quality values can be expressed using any of the
+scales supported in FASTQ files. Reads may be a mix of different
+lengths and paired-end and unpaired reads may be intermingled in the
+same file. If `-` is specified, `bowtie` will read the Tab-delimited
+reads from the "standard in" filehandle.
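+
+As an illustration of the Tab-delimited layout, the sketch below splits
+one input line into its mates. `parse_tab_read` is a hypothetical
+helper, not part of Bowtie; it relies only on the 3-field (unpaired)
+and 5-field (paired-end) layouts described above.
+
```python
def parse_tab_read(line):
    """Parse one line of Tab-delimited read input.

    Returns a list of (name, sequence, qualities) tuples: one tuple for
    an unpaired read, two (mate #1, mate #2) for a paired-end read.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 3:
        name, seq, qual = fields
        return [(name, seq, qual)]
    if len(fields) == 5:
        name, seq1, qual1, seq2, qual2 = fields
        return [(name, seq1, qual1), (name, seq2, qual2)]
    raise ValueError(f"expected 3 or 5 tab-separated fields, got {len(fields)}")


print(parse_tab_read("read1\tGGTCATCCT\tIIIIIIIII\n"))
print(parse_tab_read("pair1\tGGTCATCCT\tIIIIIIIII\tACGGGTCGT\tIIIIIIIII\n"))
```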
+
+ <s>
+
+A comma-separated list of files containing unpaired reads to be
+aligned, or, if `-c` is specified, the unpaired read sequences
+themselves. E.g., this might be
+`lane1.fq,lane2.fq,lane3.fq,lane4.fq`, or, if `-c` is specified, this
+might be `GGTCATCCT,ACGGGTCGT`. Reads may be a mix of different
+lengths. If `-` is specified, Bowtie gets the reads from the "standard
+in" filehandle.
+
+ <hit>
+
+File to write alignments to. By default, alignments are written to the
+"standard out" filehandle (i.e. the console).
+
+ Options
+
+ Input
+
+ -q
+
+The query input files (specified either as `<m1>` and `<m2>`, or as
+`<s>`) are FASTQ files (usually having extension `.fq` or `.fastq`).
+This is the default. See also: `--solexa-quals` and
+`--integer-quals`.
+
+ -f
+
+The query input files (specified either as `<m1>` and `<m2>`, or as
+`<s>`) are FASTA files (usually having extension `.fa`, `.mfa`, `.fna`
+or similar). All quality values are assumed to be 40 on the [Phred
+quality] scale.
+
+ -r
+
+The query input files (specified either as `<m1>` and `<m2>`, or as
+`<s>`) are Raw files: one sequence per line, without quality values or
+names. All quality values are assumed to be 40 on the [Phred quality]
+scale.
+
+ -c
+
+The query sequences are given on the command line. I.e. `<m1>`, `<m2>`
+and `<s>` are comma-separated lists of reads rather than lists of read
+files.
+
+ -C/--color
+
+Align in colorspace. Read characters are interpreted as colors. The
+index specified must be a colorspace index (i.e. built with
+`bowtie-build` `-C`), or `bowtie` will print an error message and quit.
+See [Colorspace alignment] for more details.
+
+ -Q/--quals <files>
+
+Comma-separated list of files containing quality values for
+corresponding unpaired CSFASTA reads. Use in combination with `-C`
+and `-f`. `--integer-quals` is set automatically when `-Q`/`--quals`
+is specified.
+
+ --Q1 <files>
+
+Comma-separated list of files containing quality values for
+corresponding CSFASTA #1 mates. Use in combination with `-C`, `-f`,
+and `-1`. `--integer-quals` is set automatically when `--Q1`
+is specified.
+
+ --Q2 <files>
+
+Comma-separated list of files containing quality values for
+corresponding CSFASTA #2 mates. Use in combination with `-C`, `-f`,
+and `-2`. `--integer-quals` is set automatically when `--Q2`
+is specified.
+
+ -s/--skip <int>
+
+Skip (i.e. do not align) the first `<int>` reads or pairs in the input.
+
+ -u/--qupto <int>
+
+Only align the first `<int>` reads or read pairs from the input (after
+the `-s`/`--skip` reads or pairs have been skipped). Default: no
+limit.
+
+ -5/--trim5 <int>
+
+Trim `<int>` bases from high-quality (left) end of each read before
+alignment (default: 0).
+
+ -3/--trim3 <int>
+
+Trim `<int>` bases from low-quality (right) end of each read before
+alignment (default: 0).
+
+ --phred33-quals
+
+Input qualities are ASCII chars equal to the [Phred quality] plus 33.
+Default: on.
+
+ --phred64-quals
+
+Input qualities are ASCII chars equal to the [Phred quality] plus 64.
+Default: off.
+
+ --solexa-quals
+
+Convert input qualities from [Solexa][Phred quality] (which can be
+negative) to [Phred][Phred quality] (which can't). This is usually the
+right option for use with (unconverted) reads emitted by GA Pipeline
+versions prior to 1.3. Default: off.
+
+ --solexa1.3-quals
+
+Same as `--phred64-quals`. This is usually the right option for use
+with (unconverted) reads emitted by GA Pipeline version 1.3 or later.
+Default: off.
+
+ --integer-quals
+
+Quality values are represented in the read input file as
+space-separated ASCII integers, e.g., `40 40 30 40`..., rather than
+ASCII characters, e.g., `II?I`.... Integers are treated as being on
+the [Phred quality] scale unless `--solexa-quals` is also specified.
+Default: off.
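+
+The quality encodings above can be compared with a short sketch. The
+helper names are hypothetical. The Solexa-to-Phred formula is the
+standard conversion between the two scales; rounding its result to the
+nearest integer is an assumption, since the rounding `bowtie` applies
+internally is not described here.
+
```python
import math


def decode_ascii_quals(quals, offset=33):
    """Decode an ASCII quality string (offset 33 or 64) to integers."""
    return [ord(ch) - offset for ch in quals]


def decode_integer_quals(quals):
    """Decode space-separated integer qualities (--integer-quals)."""
    return [int(tok) for tok in quals.split()]


def solexa_to_phred(q_solexa):
    """Convert one Solexa-scaled quality to the nearest Phred quality."""
    return round(10.0 * math.log10(10.0 ** (q_solexa / 10.0) + 1.0))


print(decode_ascii_quals("II?I"))            # --phred33-quals
print(decode_ascii_quals("hh^h", offset=64))  # --phred64-quals
print(decode_integer_quals("40 40 30 40"))   # --integer-quals
print([solexa_to_phred(q) for q in (-5, 0, 40)])
```
+
+The first three calls decode to the same four quality values, which is
+the sense in which `II?I` and `40 40 30 40` are equivalent above.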
+
+ Alignment
+
+ -v <int>
+
+Report alignments with at most `<int>` mismatches. `-e` and `-l`
+options are ignored and quality values have no effect on what
+alignments are valid. `-v` is mutually exclusive with `-n`.
+
+ -n/--seedmms <int>
+
+Maximum number of mismatches permitted in the "seed", i.e. the first
+`L` base pairs of the read (where `L` is set with `-l`/`--seedlen`).
+This may be 0, 1, 2 or 3 and the default is 2. This option is mutually
+exclusive with the `-v` option.
+
+ -e/--maqerr <int>
+
+Maximum permitted total of quality values at *all* mismatched read
+positions throughout the entire alignment, not just in the "seed". The
+default is 70. Like [Maq], `bowtie` rounds quality values to the
+nearest 10 and saturates at 30; rounding can be disabled with
+`--nomaqround`.
+
+ -l/--seedlen <int>
+
+The "seed length"; i.e., the number of bases on the high-quality end of
+the read to which the `-n` ceiling applies. The lowest permitted
+setting is 5 and the default is 28. `bowtie` is faster for larger
+values of `-l`.
+
+ --nomaqround
+
+[Maq] accepts quality values in the [Phred quality] scale, but
+internally rounds values to the nearest 10, with a maximum of 30. By
+default, `bowtie` also rounds this way. `--nomaqround` prevents this
+rounding in `bowtie`.
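+
+The interaction between `-e` and the Maq-style rounding can be sketched
+as below. Both helpers are hypothetical, and rounding ties (e.g. a
+quality of 15) upward is an assumption; the text above says only that
+values are rounded to the nearest 10 and saturate at 30.
+
```python
def maq_round(q):
    """Round a Phred quality to the nearest 10, saturating at 30."""
    return min(30, 10 * ((q + 5) // 10))


def within_maqerr(mismatch_quals, maqerr=70, maqround=True):
    """Check the -e ceiling against qualities at mismatched positions."""
    if maqround:
        mismatch_quals = [maq_round(q) for q in mismatch_quals]
    return sum(mismatch_quals) <= maqerr


# Three mismatches with qualities 40, 12 and 27 round to 30, 10 and 30.
print(within_maqerr([40, 12, 27]))                  # rounded total: 70
print(within_maqerr([40, 12, 27], maqround=False))  # raw total: 79
```
+
+With rounding the total equals the default ceiling of 70; with
+`--nomaqround` the raw total of 79 exceeds it.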
+
+ -I/--minins <int>
+
+The minimum insert size for valid paired-end alignments. E.g. if `-I
+60` is specified and a paired-end alignment consists of two 20-bp
+alignments in the appropriate orientation with a 20-bp gap between
+them, that alignment is considered valid (as long as `-X` is also
+satisfied). A 19-bp gap would not be valid in that case. If trimming
+options `-3` or `-5` are also used, the `-I` constraint is
+applied with respect to the untrimmed mates. Default: 0.
+
+ -X/--maxins <int>
+
+The maximum insert size for valid paired-end alignments. E.g. if `-X
+100` is specified and a paired-end alignment consists of two 20-bp
+alignments in the proper orientation with a 60-bp gap between them,
+that alignment is considered valid (as long as `-I` is also
+satisfied). A 61-bp gap would not be valid in that case. If trimming
+options `-3` or `-5` are also used, the `-X` constraint is applied
+with respect to the untrimmed mates, not the trimmed mates. Default:
+250.
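+
+The `-I` and `-X` examples above amount to simple arithmetic on the
+mate lengths and the gap between them. The sketch below uses
+hypothetical helper names and takes the insert size to be the two mate
+lengths plus the gap, as in those examples.
+
```python
def insert_size(mate1_len, gap, mate2_len):
    """Insert size as used in the -I/-X examples: both mates plus the gap."""
    return mate1_len + gap + mate2_len


def insert_ok(mate1_len, gap, mate2_len, minins=0, maxins=250):
    """True if the insert size satisfies both -I (minins) and -X (maxins)."""
    return minins <= insert_size(mate1_len, gap, mate2_len) <= maxins


print(insert_ok(20, 20, 20, minins=60))   # 60 bp insert with -I 60
print(insert_ok(20, 19, 20, minins=60))   # 59 bp insert with -I 60
print(insert_ok(20, 60, 20, maxins=100))  # 100 bp insert with -X 100
print(insert_ok(20, 61, 20, maxins=100))  # 101 bp insert with -X 100
```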
+
+ --fr/--rf/--ff
+
+The upstream/downstream mate orientations for a valid paired-end
+alignment against the forward reference strand. E.g., if `--fr` is
+specified and there is a candidate paired-end alignment where mate1
+appears upstream of the reverse complement of mate2 and the insert
+length constraints are met, that alignment is valid. Also, if mate2
+appears upstream of the reverse complement of mate1 and all other
+constraints are met, that too is valid. `--rf` likewise requires that
+an upstream mate1 be reverse-complemented and a downstream mate2 be
+forward-oriented. `--ff` requires both an upstream mate1 and a
+downstream mate2 to be forward-oriented. Default: `--fr` when `-C`
+(colorspace alignment) is not specified, `--ff` when `-C` is specified.
+
+ --nofw/--norc
+
+If `--nofw` is specified, `bowtie` will not attempt to align against
+the forward reference strand. If `--norc` is specified, `bowtie` will
+not attempt to align against the reverse-complement reference strand.
+For paired-end reads using `--fr` or `--rf` modes, `--nofw` and
+`--norc` apply to the forward and reverse-complement pair orientations.
+I.e. specifying `--nofw` and `--fr` will only find reads in the R/F
+orientation where mate 2 occurs upstream of mate 1 with respect to the
+forward reference strand.
+
+ --maxbts
+
+The maximum number of backtracks permitted when aligning a read in
+`-n` 2 or `-n` 3 mode (default: 125 without `--best`, 800 with
+`--best`). A "backtrack" is the introduction of a speculative
+substitution into the alignment. Without this limit, the default
+parameters will sometimes require that `bowtie` try 100s or 1,000s of
+backtracks to align a read, especially if the read has many low-quality
+bases and/or has no valid alignments, slowing bowtie down
+significantly. However, this limit may cause some valid alignments to
+be missed. Higher limits yield greater sensitivity at the expense of
+longer running times. See also: `-y`/`--tryhard`.
+
+ --pairtries <int>
+
+For paired-end alignment, this is the maximum number of attempts
+`bowtie` will make to match an alignment for one mate up with an
+alignment for the opposite mate. Most paired-end alignments require
+only a few such attempts, but pairs where both mates occur in highly
+repetitive regions of the reference can require significantly more.
+Setting this to a higher number allows `bowtie` to find more paired-
+end alignments for repetitive pairs at the expense of speed. The
+default is 100. See also: `-y`/`--tryhard`.
+
+ -y/--tryhard
+
+Try as hard as possible to find valid alignments when they exist,
+including paired-end alignments. This is equivalent to specifying very
+high values for the `--maxbts` and `--pairtries` options. This
+mode is generally much slower than the default settings, but can be
+useful for certain problems. This mode is slower when (a) the
+reference is very repetitive, (b) the reads are low quality, or (c) not
+many reads have valid alignments.
+
+ --chunkmbs <int>
+
+The number of megabytes of memory a given thread is given to store path
+descriptors in `--best` mode. Best-first search must keep track of
+many paths at once to ensure it is always extending the path with the
+lowest cumulative cost. Bowtie tries to minimize the memory impact of
+the descriptors, but they can still grow very large in some cases. If
+you receive an error message saying that chunk memory has been
+exhausted in `--best` mode, try adjusting this parameter up to
+dedicate more memory to the descriptors. Default: 64.
+
+ Reporting
+
+ -k <int>
+
+Report up to `<int>` valid alignments per read or pair (default: 1).
+Validity of alignments is determined by the alignment policy (combined
+effects of `-n`, `-v`, `-l`, and `-e`). If more than one valid
+alignment exists and the `--best` and `--strata` options are
+specified, then only those alignments belonging to the best alignment
+"stratum" will be reported. Bowtie is designed to be very fast for
+small `-k` but bowtie can become significantly slower as `-k`
+increases. If you would like to use Bowtie for larger values of
+`-k`, consider building an index with a denser suffix-array sample,
+i.e. specify a smaller `-o`/`--offrate` when invoking `bowtie-build`
+for the relevant index (see the [Performance tuning] section for
+details).
+
+ -a/--all
+
+Report all valid alignments per read or pair (default: off). Validity
+of alignments is determined by the alignment policy (combined effects
+of `-n`, `-v`, `-l`, and `-e`). If more than one valid alignment
+exists and the `--best` and `--strata` options are specified, then only
+those alignments belonging to the best alignment "stratum" will be
+reported. Bowtie is designed to be very fast for small `-k` but bowtie
+can become significantly slower if `-a`/`--all` is specified. If you
+would like to use Bowtie with `-a`, consider building an index with a
+denser suffix-array sample, i.e. specify a smaller `-o`/`--offrate`
+when invoking `bowtie-build` for the relevant index (see the
+[Performance tuning] section for details).
+
+ -m <int>
+
+Suppress all alignments for a particular read or pair if more than
+`<int>` reportable alignments exist for it. Reportable alignments are
+those that would be reported given the `-n`, `-v`, `-l`, `-e`, `-k`,
+`-a`, `--best`, and `--strata` options. Default: no limit. Bowtie is
+designed to be very fast for small `-m` but bowtie can become
+significantly slower for larger values of `-m`. If you would like to
+use Bowtie for larger values of `-m`, consider building an index with a
+denser suffix-array sample, i.e. specify a smaller `-o`/`--offrate` when
+invoking `bowtie-build` for the relevant index (see the [Performance
+tuning] section for details).
+
+ -M <int>
+
+Behaves like `-m` except that if a read has more than `<int>`
+reportable alignments, one is reported at random. In [default
+output mode], the selected alignment's 7th column is set to `<int>`+1 to
+indicate the read has at least `<int>`+1 valid alignments. In
+`-S`/`--sam` mode, the selected alignment is given a `MAPQ` (mapping
+quality) of 0 and the `XM:I` field is set to `<int>`+1. This option
+requires `--best`; if specified without `--best`, `--best` is enabled
+automatically.
+
+ --best
+
+Make Bowtie guarantee that reported singleton alignments are "best" in
+terms of stratum (i.e. number of mismatches, or mismatches in the seed
+in the case of `-n` mode) and in terms of the quality values at the
+mismatched position(s). Stratum always trumps quality; e.g. a
+1-mismatch alignment where the mismatched position has [Phred quality]
+40 is preferred over a 2-mismatch alignment where the mismatched
+positions both have [Phred quality] 10. When `--best` is not
+specified, Bowtie may report alignments that are sub-optimal in terms
+of stratum and/or quality (though an effort is made to report the best
+alignment). `--best` mode also removes all strand bias. Note that
+`--best` does not affect which alignments are considered "valid" by
+`bowtie`, only which valid alignments are reported by `bowtie`. When
+`--best` is specified and multiple hits are allowed (via `-k` or
+`-a`), the alignments for a given read are guaranteed to appear in
+best-to-worst order in `bowtie`'s output. `bowtie` is somewhat slower
+when `--best` is specified.
+
+ --strata
+
+If many valid alignments exist and are reportable (e.g. are not
+disallowed via the `-k` option) and they fall into more than one
+alignment "stratum", report only those alignments that fall into the
+best stratum. By default, Bowtie reports all reportable alignments
+regardless of whether they fall into multiple strata. When
+`--strata` is specified, `--best` must also be specified.
+
+ Output
+
+ -t/--time
+
+Print the amount of wall-clock time taken by each phase.
+
+ -B/--offbase <int>
+
+When outputting alignments, number the first base of a reference
+sequence as `<int>`. Default: 0.
+
+ --quiet
+
+Print nothing besides alignments.
+
+ --refout
+
+Write alignments to a set of files named `refXXXXX.map`, where `XXXXX`
+is the 0-padded index of the reference sequence aligned to. This can
+be a useful way to break up work for downstream analyses when dealing
+with, for example, large numbers of reads aligned to the assembled
+human genome. If `<hit>` is also specified, it will be ignored.
+
+ --refidx
+
+When a reference sequence is referred to in a reported alignment, refer
+to it by 0-based index (its offset into the list of references that
+were indexed) rather than by name.
+
+ --al <filename>
+
+Write all reads for which at least one alignment was reported to a file
+with name `<filename>`. Written reads will appear as they did in the
+input, without any of the trimming or translation of quality values
+that may have taken place within `bowtie`. Paired-end reads will be
+written to two parallel files with `_1` and `_2` inserted in the
+filename, e.g., if `<filename>` is `aligned.fq`, the #1 and #2 mates
+that align will be written to `aligned_1.fq` and `aligned_2.fq`
+respectively.
+
+ --un <filename>
+
+Write all reads that could not be aligned to a file with name
+`<filename>`. Written reads will appear as they did in the input,
+without any of the trimming or translation of quality values that may
+have taken place within Bowtie. Paired-end reads will be written to
+two parallel files with `_1` and `_2` inserted in the filename, e.g.,
+if `<filename>` is `unaligned.fq`, the #1 and #2 mates that fail to
+align will be written to `unaligned_1.fq` and `unaligned_2.fq`
+respectively. Unless `--max` is also specified, reads with a number
+of valid alignments exceeding the limit set with the `-m` option are
+also written to `<filename>`.
+
+ --max <filename>
+
+Write all reads with a number of valid alignments exceeding the limit
+set with the `-m` option to a file with name `<filename>`. Written
+reads will appear as they did in the input, without any of the trimming
+or translation of quality values that may have taken place within
+`bowtie`. Paired-end reads will be written to two parallel files with
+`_1` and `_2` inserted in the filename, e.g., if `<filename>` is
+`max.fq`, the #1 and #2 mates that exceed the `-m` limit will be
+written to `max_1.fq` and `max_2.fq` respectively. These reads are not
+written to the file specified with `--un`.
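+
+The `_1`/`_2` naming shared by `--al`, `--un` and `--max` can be
+sketched as follows. `mate_filenames` is a hypothetical helper. It
+assumes the suffix is inserted before the final extension, as in the
+examples above; how `bowtie` treats names without an extension is not
+stated here.
+
```python
import os


def mate_filenames(filename):
    """Return the (#1, #2) mate file names derived from `filename`."""
    root, ext = os.path.splitext(filename)
    return root + "_1" + ext, root + "_2" + ext


for name in ("aligned.fq", "unaligned.fq", "max.fq"):
    print(name, "->", mate_filenames(name))
```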
+
+ --suppress <cols>
+
+Suppress columns of output in the [default output mode]. E.g. if
+`--suppress 1,5,6` is specified, the read name, read sequence, and read
+quality fields will be omitted. See [Default Bowtie output] for field
+descriptions. This option is ignored if the output mode is
+`-S`/`--sam`.
+
+ --fullref
+
+Print the full reference sequence name, including whitespace, in
+alignment output. By default `bowtie` prints everything up to but not
+including the first whitespace.
+
+ Colorspace
+
+ --snpphred <int>
+
+When decoding colorspace alignments, use `<int>` as the SNP penalty.
+This should be set to the user's best guess of the true ratio of SNPs
+per base in the subject genome, converted to the [Phred quality] scale.
+E.g., if the user expects about 1 SNP every 1,000 positions,
+`--snpphred` should be set to 30 (which is also the default). To
+specify the fraction directly, use `--snpfrac`.
+
+ --snpfrac <dec>
+
+When decoding colorspace alignments, use `<dec>` as the estimated ratio
+of SNPs per base. For best decoding results, this should be set to the
+user's best guess of the true ratio. `bowtie` internally converts the
+ratio to a [Phred quality], and behaves as if that quality had been set
+via the `--snpphred` option. Default: 0.001.
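+
+The relationship between `--snpfrac` and `--snpphred` is the usual
+Phred transform, sketched below with hypothetical helper names.
+Rounding the result to an integer is an assumption; the text above
+states only that the ratio is converted to a [Phred quality].
+
```python
import math


def snpfrac_to_phred(frac):
    """Convert a SNPs-per-base ratio to the nearest Phred-scaled penalty."""
    return round(-10.0 * math.log10(frac))


def phred_to_snpfrac(phred):
    """Convert a Phred-scaled SNP penalty back to a SNPs-per-base ratio."""
    return 10.0 ** (-phred / 10.0)


print(snpfrac_to_phred(0.001))  # about 1 SNP every 1,000 positions
print(phred_to_snpfrac(30))
```
+
+This reproduces the correspondence between the two defaults, 0.001
+and 30.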
+
+ --col-cseq
+
+If reads are in colorspace and the [default output mode] is active,
+`--col-cseq` causes the reads' color sequence to appear in the
+read-sequence column (column 5) instead of the decoded nucleotide
+sequence. See the [Decoding colorspace alignments] section for details
+about decoding. This option is ignored in `-S`/`--sam` mode.
+
+ --col-cqual
+
+If reads are in colorspace and the [default output mode] is active,
+`--col-cqual` causes the reads' original (color) quality sequence to
+appear in the quality column (column 6) instead of the decoded
+qualities. See the [Colorspace alignment] section for details about
+decoding. This option is ignored in `-S`/`--sam` mode.
+
+ --col-keepends
+
+When decoding colorspace alignments, `bowtie` trims off a nucleotide
+and quality from the left and right edges of the alignment. This is
+because those nucleotides are supported by only one color, in contrast
+to the middle nucleotides which are supported by two. Specify
+`--col-keepends` to keep the extreme-end nucleotides and qualities.
+
+ SAM
+
+ -S/--sam
+
+Print alignments in [SAM] format. See the [SAM output] section of the
+manual for details. To suppress all SAM headers, use `--sam-nohead`
+in addition to `-S/--sam`. To suppress just the `@SQ` headers (e.g. if
+the alignment is against a very large number of reference sequences),
+use `--sam-nosq` in addition to `-S/--sam`. `bowtie` does not write
+BAM files directly, but SAM output can be converted to BAM on the fly
+by piping `bowtie`'s output to `samtools view`. `-S`/`--sam` is not
+compatible with `--refout`.
+
+ --mapq <int>
+
+If an alignment is non-repetitive (according to `-m`, `--strata` and
+other options) set the `MAPQ` (mapping quality) field to this value.
+See the [SAM Spec][SAM] for details about the `MAPQ` field. Default: 255.
+
+ --sam-nohead
+
+Suppress header lines (starting with `@`) when output is `-S`/`--sam`.
+This must be specified *in addition to* `-S`/`--sam`. `--sam-nohead`
+is ignored unless `-S`/`--sam` is also specified.
+
+ --sam-nosq
+
+Suppress `@SQ` header lines when output is `-S`/`--sam`. This must be
+specified *in addition to* `-S`/`--sam`. `--sam-nosq` is ignored
+unless `-S`/`--sam` is also specified.
+
+ --sam-RG <text>
+
+Add `<text>` (usually of the form `TAG:VAL`, e.g. `ID:IL7LANE2`) as a
+field on the `@RG` header line. Specify `--sam-RG` multiple times to
+set multiple fields. See the [SAM Spec][SAM] for details about what fields
+are legal. Note that, if any `@RG` fields are set using this option,
+the `ID` and `SM` fields must both be among them to make the `@RG` line
+legal according to the [SAM Spec][SAM]. `--sam-RG` is ignored unless
+`-S`/`--sam` is also specified.
+
+ Performance
+
+ -o/--offrate <int>
+
+Override the offrate of the index with `<int>`. If `<int>` is greater
+than the offrate used to build the index, then some row markings are
+discarded when the index is read into memory. This reduces the memory
+footprint of the aligner but requires more time to calculate text
+offsets. `<int>` must be greater than the value used to build the
+index.
+
+ -p/--threads <int>
+
+Launch `<int>` parallel search threads (default: 1). Threads will run
+on separate processors/cores and synchronize when parsing reads and
+outputting alignments. Searching for alignments is highly parallel,
+and speedup is fairly close to linear. This option is only available
+if `bowtie` is linked with the `pthreads` library (i.e. if
+`BOWTIE_PTHREADS=0` is not specified at build time).
+
+ --mm
+
+Use memory-mapped I/O to load the index, rather than normal C file I/O.
+Memory-mapping the index allows many concurrent `bowtie` processes on
+the same computer to share the same memory image of the index (i.e. you
+pay the memory overhead just once). This facilitates memory-efficient
+parallelization of `bowtie` in situations where using `-p` is not
+possible.
+
+ --shmem
+
+Use shared memory to load the index, rather than normal C file I/O.
+Using shared memory allows many concurrent bowtie processes on the same
+computer to share the same memory image of the index (i.e. you pay the
+memory overhead just once). This facilitates memory-efficient
+parallelization of `bowtie` in situations where using `-p` is not
+desirable. Unlike `--mm`, `--shmem` installs the index into shared
+memory permanently, or until the user deletes the shared memory chunks
+manually. See your operating system documentation for details on how
+to manually list and remove shared memory chunks (on Linux and Mac OS
+X, these commands are `ipcs` and `ipcrm`). You may also need to
+increase your OS's maximum shared-memory chunk size to accommodate
+larger indexes; see your OS documentation.
+
+ Other
+
+ --seed <int>
+
+Use `<int>` as the seed for pseudo-random number generator.
+
+ --verbose
+
+Print verbose output (for debugging).
+
+ --version
+
+Print version information and quit.
+
+ -h/--help
+
+Print usage information and quit.
+
+Default `bowtie` output
+-----------------------
+
+`bowtie` outputs one alignment per line. Each line is a collection of
+8 fields separated by tabs; from left to right, the fields are:
+
+1. Name of read that aligned
+
+2. Reference strand aligned to, `+` for forward strand, `-` for
+ reverse
+
+3. Name of reference sequence where alignment occurs, or numeric ID if
+ no name was provided
+
+4. 0-based offset into the forward reference strand where leftmost
+ character of the alignment occurs
+
+5. Read sequence (reverse-complemented if orientation is `-`).
+
+ If the read was in colorspace, then the sequence shown in this
+ column is the sequence of *decoded nucleotides*, not the original
+ colors. See the [Colorspace alignment] section for details about
+ decoding. To display colors instead, use the `--col-cseq` option.
+
+6. ASCII-encoded read qualities (reversed if orientation is `-`). The
+ encoded quality values are on the Phred scale and the encoding is
+ ASCII-offset by 33 (ASCII char `!`).
+
+ If the read was in colorspace, then the qualities shown in this
+ column are the *decoded qualities*, not the original qualities.
+ See the [Colorspace alignment] section for details about decoding.
+ To display the original qualities instead, use the `--col-cqual`
+ option.
+
+7. If `-M` was specified and the prescribed ceiling was exceeded for
+ this read, this column contains the value of the ceiling,
+ indicating that at least that many valid alignments were found in
+ addition to the one reported.
+
+ Otherwise, this column contains the number of other instances where
+ the same sequence aligned against the same reference characters as
+ were aligned against in the reported alignment. This is *not* the
+ number of other places the read aligns with the same number of
+ mismatches. The number in this column is generally not a good
+ proxy for that number (e.g., the number in this column may be '0'
+ while the number of other alignments with the same number of
+ mismatches might be large).
+
+8. Comma-separated list of mismatch descriptors. If there are no
+ mismatches in the alignment, this field is empty. A single
+ descriptor has the format offset:reference-base>read-base. The
+ offset is expressed as a 0-based offset from the high-quality (5')
+ end of the read.
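+
+The eight fields listed above can be read back with a small parser.
+This is a sketch only: the function names and the sample line are
+hypothetical, and it assumes no columns were removed with `--suppress`.
+
```python
def parse_mismatches(field):
    """Parse column 8 into (offset, reference_base, read_base) tuples."""
    if not field:
        return []
    result = []
    for descriptor in field.split(","):
        offset, bases = descriptor.split(":")
        ref_base, read_base = bases.split(">")
        result.append((int(offset), ref_base, read_base))
    return result


def parse_default_line(line):
    """Parse one line of default (non-SAM) output into a dict."""
    cols = line.rstrip("\n").split("\t")
    cols += [""] * (8 - len(cols))  # tolerate a missing empty last field
    return {
        "read": cols[0],
        "strand": cols[1],
        "ref": cols[2],
        "offset": int(cols[3]),      # 0-based unless -B/--offbase was used
        "seq": cols[4],
        "quals": cols[5],
        "other": int(cols[6]),
        "mismatches": parse_mismatches(cols[7]),
    }


# Made-up example line, not real bowtie output.
print(parse_default_line("read1\t-\tchr2\t1042\tACGT\tIIII\t0\t2:G>A,3:C>T\n"))
```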
+
+SAM `bowtie` output
+-------------------
+
+Following is a brief description of the [SAM] format as output by
+`bowtie` when the `-S`/`--sam` option is specified. For more
+details, see the [SAM format specification][SAM].
+
+When `-S`/`--sam` is specified, `bowtie` prints a SAM header with
+`@HD`, `@SQ` and `@PG` lines. When one or more `--sam-RG` arguments
+are specified, `bowtie` will also print an `@RG` line that includes all
+user-specified `--sam-RG` tokens separated by tabs.
+
+Each subsequent line corresponds to a read or an alignment. Each line
+is a collection of at least 12 fields separated by tabs; from left to
+right, the fields are:
+
+1. Name of read that aligned
+
+2. Sum of all applicable flags. Flags relevant to Bowtie are:
+
+ 1
+
+ The read is one of a pair
+
+ 2
+
+ The alignment is one end of a proper paired-end alignment
+
+ 4
+
+ The read has no reported alignments
+
+ 8
+
+ The read is one of a pair and has no reported alignments
+
+ 16
+
+ The alignment is to the reverse reference strand
+
+ 32
+
+ The other mate in the paired-end alignment is aligned to the
+ reverse reference strand
+
+ 64
+
+ The read is the first mate in a pair
+
+ 128
+
+ The read is the second mate in a pair
+
+ Thus, an unpaired read that aligns to the reverse reference strand
+ will have flag 16. A paired-end read that aligns to the reverse
+ strand as part of a proper pair and is the first mate in the pair
+ will have flag 83 (= 64 + 16 + 2 + 1).
+
+3. Name of reference sequence where alignment occurs, or ordinal ID
+ if no name was provided
+
+4. 1-based offset into the forward reference strand where leftmost
+ character of the alignment occurs
+
+5. Mapping quality
+
+6. CIGAR string representation of alignment
+
+7. Name of reference sequence where mate's alignment occurs. Set to
+ `=` if the mate's reference sequence is the same as this
+ alignment's, or `*` if there is no mate.
+
+8. 1-based offset into the forward reference strand where leftmost
+ character of the mate's alignment occurs. Offset is 0 if there is
+ no mate.
+
+9. Inferred insert size. Size is negative if the mate's alignment
+ occurs upstream of this alignment. Size is 0 if there is no mate.
+
+10. Read sequence (reverse-complemented if aligned to the reverse
+ strand)
+
+11. ASCII-encoded read qualities (reversed if the read aligned to the
+ reverse strand). The encoded quality values are on
+ the [Phred quality] scale and the encoding is ASCII-offset by 33
+ (ASCII char `!`), similarly to a [FASTQ] file.
+
+12. Optional fields. Fields are tab-separated. For descriptions of
+ all possible optional fields, see the SAM format specification.
+ `bowtie` outputs some of these optional fields for each alignment,
+ depending on the type of the alignment:
+
+ NM:i:<N>
+
+ Aligned read has an edit distance of `<N>`.
+
+ CM:i:<N>
+
+ Aligned read has an edit distance of `<N>` in colorspace. This
+ field is present in addition to the `NM` field in `-C`/`--color`
+ mode, but is omitted otherwise.
+
+ MD:Z:<S>
+
+ For aligned reads, `<S>` is a string representation of the
+ mismatched reference bases in the alignment. See the [SAM format
+ specification] for details. For colorspace alignments, `<S>`
+ describes the decoded *nucleotide* alignment, not the colorspace
+ alignment.
+
+ XA:i:<N>
+
+ Aligned read belongs to stratum `<N>`. See [Strata] for definition.
+
+ XM:i:<N>
+
+ For a read with no reported alignments, `<N>` is 0 if the read had
+ no alignments. If `-m` was specified and the read's alignments
+ were suppressed because the `-m` ceiling was exceeded, `<N>` equals
+ the `-m` ceiling + 1, to indicate that there were at least that
+ many valid alignments (but all were suppressed). In `-M` mode, if
+ the alignment was randomly selected because the `-M` ceiling was
+ exceeded, `<N>` equals the `-M` ceiling + 1, to indicate that there
+ were at least that many valid alignments (of which one was reported
+ at random).
+
+[SAM format specification]: http://samtools.sf.net/SAM1.pdf
+[FASTQ]: http://en.wikipedia.org/wiki/FASTQ_format
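+
+The `FLAGS` arithmetic and the quality encoding described above can be
+sketched in a few lines of Python. This is an illustrative sketch, not
+part of Bowtie: the names `FLAG_BITS`, `decode_flags` and
+`decode_qualities` are hypothetical, the descriptions for bits 1 and 2
+are taken from the SAM format specification, and only the bits that
+appear in the worked examples (1, 2, 16, 32, 64, 128) are covered.
+

```python
# Bit meanings for the SAM FLAGS field. Bits 16-128 follow the list in
# the text; bits 1 and 2 are taken from the SAM specification.
# Assumption: other bits (e.g. 4 and 8) are deliberately left out.
FLAG_BITS = {
    1: "read is paired",
    2: "aligned as part of a proper pair",
    16: "aligned to reverse reference strand",
    32: "mate aligned to reverse reference strand",
    64: "first mate in pair",
    128: "second mate in pair",
}


def decode_flags(flags):
    """Return the descriptions of every known bit set in `flags`."""
    return [name for bit, name in sorted(FLAG_BITS.items()) if flags & bit]


def decode_qualities(qual):
    """Convert a Phred+33 ASCII quality string to integer qualities."""
    return [ord(ch) - 33 for ch in qual]


if __name__ == "__main__":
    for description in decode_flags(83):
        print(description)
    print(decode_qualities("!5I"))
```

+
+Calling `decode_flags(83)` lists the four properties that sum to 83,
+and `decode_qualities` undoes the offset-33 encoding of field 11.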
+
+The `bowtie-build` indexer
+==========================
+
+`bowtie-build` builds a Bowtie index from a set of DNA sequences.
+`bowtie-build` outputs a set of 6 files with suffixes
+`.1.ebwt`, `.2.ebwt`, `.3.ebwt`, `.4.ebwt`, `.rev.1.ebwt`, and
+`.rev.2.ebwt`. These files together constitute the index: they are all
+that is needed to align reads to that reference. The original sequence
+files are no longer used by Bowtie once the index is built.
+
+Use of Karkkainen's [blockwise algorithm] allows `bowtie-build` to
+trade off between running time and memory usage. `bowtie-build` has
+three options governing how it makes this trade: `-p`/`--packed`,
+`--bmax`/`--bmaxdivn`, and `--dcv`. By default, `bowtie-build` will
+automatically search for the settings that yield the best running
+time without exhausting memory. This behavior can be disabled using
+the `-a`/`--noauto` option.
+
+The indexer provides options pertaining to the "shape" of the index,
+e.g. `--offrate` governs the fraction of [Burrows-Wheeler] rows that
+are "marked" (i.e., the density of the suffix-array sample; see the
+original [FM Index] paper for details). All of these options are
+potentially profitable trade-offs depending on the application. They
+have been set to defaults that are reasonable for most cases according
+to our experiments. See [Performance Tuning] for details.
+
+Because `bowtie-build` uses 32-bit pointers internally, it can handle
+up to a theoretical maximum of 2^32-1 (somewhat more than 4 billion)
+characters in an index, though, with other constraints, the actual
+ceiling is somewhat less than that. If your reference exceeds 2^32-1
+characters, `bowtie-build` will print an error message and abort. To
+resolve this, divide your reference sequences into smaller batches
+and/or chunks and build a separate index for each.
+
+If your computer has more than 3-4 GB of memory and you would like to
+exploit that fact to make index building faster, use a 64-bit version
+of the `bowtie-build` binary. The 32-bit version of the binary is
+restricted to using less than 4 GB of memory. If a 64-bit pre-built
+binary does not yet exist for your platform on the sourceforge download
+site, you will need to build one from source.
+
+The Bowtie index is based on the [FM Index] of Ferragina and Manzini,
+which in turn is based on the [Burrows-Wheeler] transform. The
+algorithm used to build the index is based on the [blockwise algorithm]
+of Karkkainen.
+
+[Blockwise algorithm]: http://portal.acm.org/citation.cfm?id=1314852
+[FM Index]: http://portal.acm.org/citation.cfm?id=796543
+[Burrows-Wheeler]: http://en.wikipedia.org/wiki/Burrows-Wheeler_transform
+
+Command Line
+------------
+
+Usage:
+
+ bowtie-build [options]* <reference_in> <ebwt_base>
+
+ Main arguments
+
+ <reference_in>
+
+A comma-separated list of FASTA files containing the reference
+sequences to be aligned to, or, if `-c` is specified, the sequences
+themselves. E.g., `<reference_in>` might be
+`chr1.fa,chr2.fa,chrX.fa,chrY.fa`, or, if `-c` is specified, this might
+be `GGTCATCCT,ACGGGTCGT,CCGTTCTATGCGGCTTA`.
+
+ <ebwt_base>
+
+The basename of the index files to write. By default, `bowtie-build`
+writes files named `NAME.1.ebwt`, `NAME.2.ebwt`, `NAME.3.ebwt`,
+`NAME.4.ebwt`, `NAME.rev.1.ebwt`, and `NAME.rev.2.ebwt`, where `NAME`
+is `<ebwt_base>`.
+
+ Options
+
+ -f
+
+The reference input files (specified as `<reference_in>`) are FASTA
+files (usually having extension `.fa`, `.mfa`, `.fna` or similar).
+
+ -c
+
+The reference sequences are given on the command line. I.e.
+`<reference_in>` is a comma-separated list of sequences rather than a
+list of FASTA files.
+
+ -C/--color
+
+Build a colorspace index, to be queried using `bowtie` `-C`.
+
+ -a/--noauto
+
+Disable the default behavior whereby `bowtie-build` automatically
+selects values for the `--bmax`, `--dcv` and `--packed` parameters
+according to available memory. Instead, the user may specify values
+for those parameters. If memory is exhausted during indexing, an error
+message will be printed; it is up to the user to try new parameters.
+
+ -p/--packed
+
+Use a packed (2-bits-per-nucleotide) representation for DNA strings.
+This saves memory but makes indexing 2-3 times slower. Default: off.
+This is configured automatically by default; use `-a`/`--noauto` to
+configure manually.
+
+ --bmax <int>
+
+The maximum number of suffixes allowed in a block. Allowing more
+suffixes per block makes indexing faster, but increases peak memory
+usage. Setting this option overrides any previous setting for
+`--bmax`, or `--bmaxdivn`. Default (in terms of the `--bmaxdivn`
+parameter) is `--bmaxdivn` 4. This is configured automatically by
+default; use `-a`/`--noauto` to configure manually.
+
+ --bmaxdivn <int>
+
+The maximum number of suffixes allowed in a block, expressed as a
+fraction of the length of the reference. Setting this option overrides
+any previous setting for `--bmax`, or `--bmaxdivn`. Default:
+`--bmaxdivn` 4. This is configured automatically by default; use
+`-a`/`--noauto` to configure manually.
+
+ --dcv <int>
+
+Use `<int>` as the period for the difference-cover sample. A larger
+period yields less memory overhead, but may make suffix sorting slower,
+especially if repeats are present. Must be a power of 2 no greater
+than 4096. Default: 1024. This is configured automatically by
+default; use `-a`/`--noauto` to configure manually.
+
+ --nodc
+
+Disable use of the difference-cover sample. Suffix sorting becomes
+quadratic-time in the worst case (where the worst case is an extremely
+repetitive reference). Default: off.
+
+ -r/--noref
+
+Do not build the `NAME.3.ebwt` and `NAME.4.ebwt` portions of the index,
+which contain a bitpacked version of the reference sequences and are
+used for paired-end alignment.
+
+ -3/--justref
+
+Build *only* the `NAME.3.ebwt` and `NAME.4.ebwt` portions of the index,
+which contain a bitpacked version of the reference sequences and are
+used for paired-end alignment.
+
+ -o/--offrate <int>
+
+To map alignments back to positions on the reference sequences, it's
+necessary to annotate ("mark") some or all of the [Burrows-Wheeler]
+rows with their corresponding location on the genome. `-o`/`--offrate`
+governs how many rows get marked: the indexer will mark one row in
+every 2^`<int>`. Marking more rows makes reference-position lookups
+faster, but requires more memory to hold the annotations at runtime.
+The default is 5 (every 32nd row is marked; for the human genome,
+annotations occupy about 340 megabytes).
+
+ -t/--ftabchars <int>
+
+The ftab is the lookup table used to calculate an initial
+[Burrows-Wheeler] range with respect to the first `<int>` characters
+of the query. A larger `<int>` yields a larger lookup table but faster
+query times. The ftab has size 4^(`<int>`+1) bytes. The default
+setting is 10 (ftab is 4MB).
+
+ --ntoa
+
+Convert Ns in the reference sequence to As before building the index.
+By default, Ns are simply excluded from the index and `bowtie` will not
+report alignments that overlap them.
+
+ --big --little
+
+Endianness to use when serializing integers to the index file.
+Default: little-endian (recommended for Intel- and AMD-based
+architectures).
+
+ --seed <int>
+
+Use `<int>` as the seed for the pseudo-random number generator.
+
+ --cutoff <int>
+
+Index only the first `<int>` bases of the reference sequences
+(cumulative across sequences) and ignore the rest.
+
+ -q/--quiet
+
+`bowtie-build` is verbose by default. With this option `bowtie-build`
+will print only error messages.
+
+ -h/--help
+
+Print usage information and quit.
+
+ --version
+
+Print version information and quit.
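+
+The size formulas quoted for `-t`/`--ftabchars` and `-o`/`--offrate`
+above are easy to tabulate. The Python sketch below is illustrative
+only: the function names are hypothetical, and `marked_rows` assumes
+that rows 0, 2^offrate, 2*2^offrate, ... are the marked ones, which the
+text does not spell out.
+

```python
def ftab_bytes(ftabchars=10):
    """Size of the ftab lookup table in bytes: 4^(ftabchars + 1)."""
    return 4 ** (ftabchars + 1)


def marked_row_interval(offrate=5):
    """One Burrows-Wheeler row in every 2^offrate is marked."""
    return 2 ** offrate


def marked_rows(total_rows, offrate=5):
    """Approximate number of marked rows in an index of `total_rows` rows.

    Assumption: marking starts at row 0, so the count is a ceiling division.
    """
    interval = marked_row_interval(offrate)
    return (total_rows + interval - 1) // interval


if __name__ == "__main__":
    for chars in (8, 10, 12):
        print(chars, ftab_bytes(chars) / 2 ** 20, "MiB")
    for offrate in (3, 5, 7):
        print(offrate, "-> one row in", marked_row_interval(offrate))
```

+
+With the defaults this reproduces the figures in the text: a 4 MB ftab
+for `-t 10` and one marked row in 32 for `-o 5`.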
+
+The `bowtie-inspect` index inspector
+====================================
+
+`bowtie-inspect` extracts information from a Bowtie index about what
+kind of index it is and what reference sequences were used to build it.
+When run without any options, the tool will output a FASTA file
+containing the sequences of the original references (with all
+non-`A`/`C`/`G`/`T` characters converted to `N`s). It can also be used
+to extract just the reference sequence names using the `-n`/`--names`
+option or a more verbose summary using the `-s`/`--summary` option.
+
+Command Line
+------------
+
+Usage:
+
+ bowtie-inspect [options]* <ebwt_base>
+
+ Main arguments
+
+ <ebwt_base>
+
+The basename of the index to be inspected. The basename is the name of
+any of the index files but with the `.X.ebwt` or `.rev.X.ebwt` suffix
+omitted. `bowtie-inspect` first looks in the current directory for the
+index files, then looks in the `indexes` subdirectory under the
+directory where the currently-running `bowtie` executable is located,
+then looks in the directory specified in the `BOWTIE_INDEXES`
+environment variable.
+
+ Options
+
+ -a/--across <int>
+
+When printing FASTA output, output a newline character every `<int>`
+bases (default: 60).
+
+ -n/--names
+
+Print reference sequence names, one per line, and quit.
+
+ -s/--summary
+
+Print a summary that includes information about index settings, as well
+as the names and lengths of the input sequences. The summary has this
+format:
+
+ Colorspace <0 or 1>
+ SA-Sample 1 in <sample>
+ FTab-Chars <chars>
+ Sequence-1 <name> <len>
+ Sequence-2 <name> <len>
+ ...
+ Sequence-N <name> <len>
+
+Fields are separated by tabs.
+
+ -e/--ebwt-ref
+
+By default, when `bowtie-inspect` is run without `-s` or `-n`, it
+recreates the reference nucleotide sequences using the bit-encoded
+reference nucleotides kept in the `.3.ebwt` and `.4.ebwt` index files.
+When `-e/--ebwt-ref` is specified, `bowtie-inspect` recreates the
+reference sequences from the Burrows-Wheeler-transformed reference
+sequence in the `.1.ebwt` file instead. The reference recreation
+process is much slower when `-e/--ebwt-ref` is specified. Also, when
+`-e/--ebwt-ref` is specified and the index is in colorspace, the
+reference is printed in colors (A=blue, C=green, G=orange, T=red).
+
+ -v/--verbose
+
+Print verbose output (for debugging).
+
+ --version
+
+Print version information and quit.
+
+ -h/--help
+
+Print usage information and quit.
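+
+Because the `-s`/`--summary` output is tab-separated, it is simple to
+consume from a script. The Python sketch below is one possible parser,
+not something shipped with Bowtie: `parse_summary` and the `SAMPLE`
+text are hypothetical, and the exact placement of tabs within the
+`SA-Sample` line is an assumption (the parser accepts either
+`1 in <sample>` as one field or as separate fields).
+

```python
# Hypothetical summary text in the documented layout (tab-separated).
SAMPLE = (
    "Colorspace\t0\n"
    "SA-Sample\t1 in 32\n"
    "FTab-Chars\t10\n"
    "Sequence-1\tchr1\t1000\n"
    "Sequence-2\tchr2\t2500\n"
)


def parse_summary(text):
    """Parse `bowtie-inspect -s` style output into a dictionary."""
    info = {"sequences": []}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or unrecognized lines
        key = fields[0]
        if key == "Colorspace":
            info["colorspace"] = fields[1] == "1"
        elif key == "SA-Sample":
            # Value reads "1 in <sample>"; keep the trailing number.
            info["sa_sample"] = int(fields[-1].split()[-1])
        elif key == "FTab-Chars":
            info["ftab_chars"] = int(fields[1])
        elif key.startswith("Sequence-") and len(fields) >= 3:
            info["sequences"].append((fields[1], int(fields[2])))
    return info


if __name__ == "__main__":
    print(parse_summary(SAMPLE))
```

+
+In practice the text would come from the standard output of
+`bowtie-inspect -s <ebwt_base>` rather than from a literal string.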
+
--- /dev/null
+<!--
+ ! This manual is written in "markdown" format and thus contains some
+ ! distracting clutter encoding information about how to convert to
+ ! HTML. See 'MANUAL' for a clearer version of this document.
+ -->
+
+What is Bowtie?
+===============
+
+[Bowtie] is an ultrafast, memory-efficient short read aligner geared
+toward quickly aligning large sets of short DNA sequences (reads) to
+large genomes. It aligns 35-base-pair reads to the human genome at a
+rate of 25 million reads per hour on a typical workstation. Bowtie
+indexes the genome with a [Burrows-Wheeler] index to keep its memory
+footprint small: for the human genome, the index is typically about
+2.2 GB (for unpaired alignment) or 2.9 GB (for paired-end or colorspace
+alignment). Multiple processors can be used simultaneously to achieve
+greater alignment speed. Bowtie can also output alignments in the
+standard [SAM] format, allowing Bowtie to interoperate with other tools
+supporting SAM, including the [SAMtools] consensus, SNP, and indel
+callers. Bowtie runs on the command line under Windows, Mac OS X,
+Linux, and Solaris.
+
+[Bowtie] also forms the basis for other tools, including [TopHat], a
+fast splice junction mapper for RNA-seq reads; [Cufflinks], a tool for
+transcriptome assembly and isoform quantitation from RNA-seq reads;
+[Crossbow], a cloud-computing software tool for large-scale
+resequencing data; and [Myrna], a cloud-computing tool for calculating
+differential gene expression in large RNA-seq datasets.
+
+If you use [Bowtie] for your published research, please cite the
+[Bowtie paper].
+
+[Bowtie]: http://bowtie-bio.sf.net
+[Burrows-Wheeler]: http://en.wikipedia.org/wiki/Burrows-Wheeler_transform
+[SAM]: http://samtools.sourceforge.net/SAM1.pdf
+[SAMtools]: http://samtools.sourceforge.net/
+[TopHat]: http://tophat.cbcb.umd.edu/
+[Cufflinks]: http://cufflinks.cbcb.umd.edu/
+[Crossbow]: http://bowtie-bio.sf.net/crossbow
+[Myrna]: http://bowtie-bio.sf.net/myrna
+[Bowtie paper]: http://genomebiology.com/2009/10/3/R25
+
+What isn't Bowtie?
+==================
+
+Bowtie is not a general-purpose alignment tool like [MUMmer], [BLAST]
+or [Vmatch]. Bowtie works best when aligning short reads to large
+genomes, though it supports arbitrarily small reference sequences (e.g.
+amplicons) and reads as long as 1024 bases. Bowtie is designed to be
+extremely fast for sets of short reads where (a) many of the reads have
+at least one good, valid alignment, (b) many of the reads are
+relatively high-quality, and (c) the number of alignments reported per
+read is small (close to 1).
+
+Bowtie does not yet report gapped alignments; this is future work.
+
+[MUMmer]: http://mummer.sourceforge.net/
+[BLAST]: http://blast.ncbi.nlm.nih.gov/Blast.cgi
+[Vmatch]: http://www.vmatch.de/
+
+Obtaining Bowtie
+================
+
+You may download either Bowtie sources or binaries for your platform
+from the [Download] section of the Sourceforge project site. Binaries
+are currently available for Intel architectures (`i386` and `x86_64`)
+running Linux, Windows, and Mac OS X.
+
+Building from source
+--------------------
+
+Building Bowtie from source requires a GNU-like environment that
+includes GCC, GNU Make and other basics. It should be possible to
+build Bowtie on a vanilla Linux or Mac installation. Bowtie can also
+be built on Windows using [Cygwin] or [MinGW]. We recommend
+[TDM's MinGW Build]. If using [MinGW], you must also have [MSYS]
+installed.
+
+To build Bowtie, extract the sources, change to the extracted
+directory, and run GNU `make` (usually with the command `make`, but
+sometimes with `gmake`) with no arguments. If building with [MinGW],
+run `make` from the [MSYS] command line.
+
+To support the [`-p`] (multithreading) option, Bowtie needs the
+`pthreads` library. To compile Bowtie without `pthreads` (which
+disables [`-p`]), use `make BOWTIE_PTHREADS=0`.
+
+[Cygwin]: http://www.cygwin.com/
+[MinGW]: http://www.mingw.org/
+[TDM's MinGW Build]: http://www.tdragon.net/recentgcc/
+[MSYS]: http://www.mingw.org/wiki/msys
+[Download]: https://sourceforge.net/projects/bowtie-bio/files/bowtie/
+
+The `bowtie` aligner
+====================
+
+`bowtie` takes an index and a set of reads as input and outputs a list
+of alignments. Alignments are selected according to a combination of
+the [`-v`]/[`-n`]/[`-e`]/[`-l`] options (plus the [`-I`]/[`-X`]/[`--fr`]/[`--rf`]/
+[`--ff`] options for paired-end alignment), which define which alignments
+are legal, and the [`-k`]/[`-a`]/[`-m`]/[`-M`]/[`--best`]/[`--strata`] options
+which define which and how many legal alignments should be reported.
+
+By default, Bowtie enforces an alignment policy similar to [Maq]'s
+default quality-aware policy ([`-n`] 2 [`-l`] 28 [`-e`] 70). See [the -n
+alignment mode] section of the manual for details about this mode. But
+Bowtie can also enforce a simpler end-to-end k-difference policy (e.g.
+with [`-v`] 2). See [the -v alignment mode] section of the manual for
+details about that mode. [The -n alignment mode] and [the -v alignment
+mode] are mutually exclusive.
+
+Bowtie works best when aligning short reads to large genomes (e.g.
+human or mouse), though it supports arbitrarily small reference
+sequences and reads as long as 1024 bases. Bowtie is designed to be
+very fast for sets of short reads where (a) many reads have at least
+one good, valid alignment, (b) many reads are relatively
+high-quality, and (c) the number of alignments reported per read is
+small (close to 1).
+These criteria are generally satisfied in the context of modern
+short-read analyses such as RNA-seq, ChIP-seq, other types of -seq, and
+mammalian resequencing. You may observe longer running times in other
+research contexts.
+
+If `bowtie` is too slow for your application, try some of the
+performance-tuning hints described in the [Performance Tuning] section
+below.
+
+Alignments involving one or more ambiguous reference characters (`N`,
+`-`, `R`, `Y`, etc.) are considered invalid by Bowtie. This is true
+only for ambiguous characters in the reference; alignments involving
+ambiguous characters in the read are legal, subject to the alignment
+policy. Ambiguous characters in the read mismatch all other
+characters. Alignments that "fall off" the reference sequence are not
+considered valid.
+
+The process by which `bowtie` chooses an alignment to report is
+randomized in order to avoid "mapping bias" - the phenomenon whereby
+an aligner systematically fails to report a particular class of good
+alignments, causing spurious "holes" in the comparative assembly.
+Whenever `bowtie` reports a subset of the valid alignments that exist,
+it makes an effort to sample them randomly. This randomness flows
+from a simple seeded pseudo-random number generator and is
+deterministic in the sense that Bowtie will always produce the same
+results for the same read when run with the same initial "seed" value
+(see [`--seed`] option).
+
+In the default mode, `bowtie` can exhibit strand bias. Strand bias
+occurs when input reference and reads are such that (a) some reads
+align equally well to sites on the forward and reverse strands of the
+reference, and (b) the number of such sites on one strand is different
+from the number on the other strand. When this happens for a given
+read, `bowtie` effectively chooses one strand or the other with 50%
+probability, then reports a randomly-selected alignment for that read
+from among the sites on the selected strand. This tends to overassign
+alignments to the sites on the strand with fewer sites and underassign
+to sites on the strand with more sites. The effect is mitigated,
+though it may not be eliminated, when reads are longer or when
+paired-end reads are used. Running Bowtie in [`--best`] mode
+eliminates strand bias by forcing Bowtie to select one strand or the
+other with a probability that is proportional to the number of best
+sites on the strand.
+
+Gapped alignments are not currently supported, but support is planned
+for a future release.
+
+[the -n alignment mode]: #the--n-alignment-mode
+[the -v alignment mode]: #the--v-alignment-mode
+[High Performance Tips]: #high-performance-tips
+[Maq]: http://maq.sf.net
+
+The `-n` alignment mode
+-----------------------
+
+When the [`-n`] option is specified (which is the default), `bowtie`
+determines which alignments are valid according to the following
+policy, which is similar to [Maq]'s default policy.
+
+ 1. Alignments may have no more than `N` mismatches (where `N` is a
+ number 0-3, set with [`-n`]) in the first `L` bases (where `L` is a
+ number 5 or greater, set with [`-l`]) on the high-quality (left) end
+ of the read. The first `L` bases are called the "seed".
+
+ 2. The sum of the [Phred quality] values at *all* mismatched positions
+ (not just in the seed) may not exceed `E` (set with [`-e`]). Where
+ qualities are unavailable (e.g. if the reads are from a FASTA
+ file), the [Phred quality] defaults to 40.
+
+The [`-n`] option is mutually exclusive with the [`-v`] option.
+
+If there are many possible alignments satisfying these criteria, Bowtie
+gives preference to alignments with fewer mismatches and where the sum
+from criterion 2 is smaller. When the [`--best`] option is specified,
+Bowtie guarantees the reported alignment(s) are "best" in terms of
+these criteria (criterion 1 has priority), and that the alignments are
+reported in best-to-worst order. Bowtie is somewhat slower when
+[`--best`] is specified.
+
+Note that [Maq] internally rounds base qualities to the nearest 10 and
+rounds qualities greater than 30 to 30. To maintain compatibility,
+Bowtie does the same. Rounding can be suppressed with the
+[`--nomaqround`] option.
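+
+The two criteria and the rounding rule above can be expressed as a
+small check. The Python sketch below is illustrative and is not
+Bowtie's implementation: the function names are hypothetical, it
+compares a read against a same-length reference window on one strand
+only, and it assumes that qualities exactly halfway between two
+multiples of 10 round up. The defaults mirror [`-n`] 2 [`-l`] 28
+[`-e`] 70.
+

```python
def maq_round(q):
    """Round a Phred quality to the nearest 10, capped at 30.

    Assumption: ties (5, 15, 25) round up.
    """
    return min(30, ((q + 5) // 10) * 10)


def mutate(ref, positions, base="C"):
    """Helper for examples: copy `ref` with `base` at the given positions."""
    return "".join(base if i in positions else ch for i, ch in enumerate(ref))


def is_valid_n_mode(read, ref, quals, n=2, l=28, e=70, nomaqround=False):
    """Apply criteria 1 and 2 to an ungapped, same-length alignment."""
    mismatches = [i for i, (r, g) in enumerate(zip(read, ref)) if r != g]
    # Criterion 1: at most n mismatches in the first l bases (the seed).
    seed_mismatches = sum(1 for i in mismatches if i < l)
    # Criterion 2: quality sum over all mismatches is at most e.
    round_q = (lambda q: q) if nomaqround else maq_round
    quality_sum = sum(round_q(quals[i]) for i in mismatches)
    return seed_mismatches <= n and quality_sum <= e


if __name__ == "__main__":
    ref = "A" * 32
    quals = [40] * 32  # the default quality when none is available
    print(is_valid_n_mode(mutate(ref, {3, 10}), ref, quals))
    print(is_valid_n_mode(mutate(ref, {3, 10, 30}), ref, quals))
```

+
+With rounding enabled, two quality-40 mismatches contribute 30 each
+(60 in total) and pass the default `-e 70` limit, while a third
+mismatch anywhere in the read pushes the sum to 90 and fails it.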
+
+Bowtie is not fully sensitive in [`-n`] 2 and [`-n`] 3 modes by default.
+In these modes Bowtie imposes a "backtracking limit" to limit effort
+spent trying to find valid alignments for low-quality reads unlikely to
+have any. This may cause bowtie to miss some legal 2- and 3-mismatch
+alignments. The limit is set to a reasonable default (125 without
+[`--best`], 800 with [`--best`]), but the user may decrease or increase the
+limit using the [`--maxbts`] and/or [`-y`] options. [`-y`] mode is
+relatively slow but guarantees full sensitivity.
+
+[Maq]: http://maq.sf.net
+[Phred quality]: http://en.wikipedia.org/wiki/FASTQ_format#Variations
+
+The `-v` alignment mode
+-----------------------
+
+In [`-v`] mode, alignments may have no more than `V` mismatches, where
+`V` may be a number from 0 through 3 set using the [`-v`] option.
+Quality values are ignored. The [`-v`] option is mutually exclusive with
+the [`-n`] option.
+
+If there are many legal alignments, Bowtie gives preference to
+alignments with fewer mismatches. When the [`--best`] option is
+specified, Bowtie guarantees the reported alignment(s) are "best" in
+terms of the number of mismatches, and that the alignments are reported
+in best-to-worst order. Bowtie is somewhat slower when [`--best`] is
+specified.
+
+Strata
+------
+
+In [the -n alignment mode], an alignment's "stratum" is defined as the
+number of mismatches in the "seed" region, i.e. the leftmost `L` bases,
+where `L` is set with the [`-l`] option. In [the -v alignment mode], an
+alignment's stratum is defined as the total number of mismatches in the
+entire alignment. Some of Bowtie's options (e.g. [`--strata`] and
+[`-m`]) use the notion of "stratum" to limit or expand the scope of
+reportable alignments.
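+
+As a concrete illustration of the definition above, the Python sketch
+below computes a stratum for an ungapped alignment in either mode. The
+function name `stratum` and its parameters are hypothetical; `seed_len`
+stands in for the value given to [`-l`].
+

```python
def stratum(read, ref, mode="n", seed_len=28):
    """Return the stratum of an ungapped, same-length alignment."""
    mismatches = [i for i, (r, g) in enumerate(zip(read, ref)) if r != g]
    if mode == "n":
        # -n mode: count only mismatches inside the seed.
        return sum(1 for i in mismatches if i < seed_len)
    if mode == "v":
        # -v mode: count mismatches over the entire alignment.
        return len(mismatches)
    raise ValueError("mode must be 'n' or 'v'")


if __name__ == "__main__":
    ref = "A" * 32
    read = "AAC" + "A" * 27 + "C" + "A"  # mismatches at offsets 2 and 30
    print(stratum(read, ref, mode="n"), stratum(read, ref, mode="v"))
```

+
+The same alignment can therefore fall in stratum 1 under `-n` (one
+seed mismatch) and stratum 2 under `-v` (two mismatches overall).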
+
+Reporting Modes
+---------------
+
+With the [`-k`], [`-a`], [`-m`], [`-M`], [`--best`] and [`--strata`] options, the
+user can flexibly select which alignments are reported. Below we
+demonstrate a few ways in which these options can be combined. All
+examples are using the `e_coli` index packaged with Bowtie. The
+[`--suppress`] option is used to keep the output concise and some
+output is elided for clarity.
+
+### Example 1: `-a`
+
+ $ ./bowtie -a -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+ - gi|110640213|ref|NC_008253.1| 905664 6:A>G,7:G>T
+ + gi|110640213|ref|NC_008253.1| 1093035 2:T>G,15:A>T
+
+Specifying [`-a`] instructs bowtie to report *all* valid alignments,
+subject to the alignment policy: [`-v`] 2. In this case, bowtie finds
+5 inexact hits in the E. coli genome; 1 hit (the 2nd one listed)
+has 1 mismatch, and the other 4 hits have 2 mismatches. Four are on
+the reverse reference strand and one is on the forward strand. Note
+that they are not listed in best-to-worst order.
+
+### Example 2: `-k 3`
+
+ $ ./bowtie -k 3 -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+
+Specifying [`-k`] 3 instructs bowtie to report up to 3 valid
+alignments. In this case, a total of 5 valid alignments exist (see
+[Example 1]); `bowtie` reports 3 out of those 5. [`-k`] can be set to
+any integer greater than 0.
+
+[Example 1]: #example-1
+
+### Example 3: `-k 6`
+
+ $ ./bowtie -k 6 -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+ - gi|110640213|ref|NC_008253.1| 905664 6:A>G,7:G>T
+ + gi|110640213|ref|NC_008253.1| 1093035 2:T>G,15:A>T
+
+Specifying [`-k`] 6 instructs bowtie to report up to 6 valid
+alignments. In this case, a total of 5 valid alignments exist, so
+`bowtie` reports all 5.
+
+### Example 4: default (`-k 1`)
+
+ $ ./bowtie -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+
+Leaving the reporting options at their defaults causes `bowtie` to
+report the first valid alignment it encounters. Because [`--best`] was
+not specified, we are not guaranteed that bowtie will report the best
+alignment, and in this case it does not (the 1-mismatch alignment from
+the previous example would have been better). The default reporting
+mode is equivalent to [`-k`] 1.
+
+### Example 5: `-a --best`
+
+ $ ./bowtie -a --best -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ + gi|110640213|ref|NC_008253.1| 1093035 2:T>G,15:A>T
+ - gi|110640213|ref|NC_008253.1| 905664 6:A>G,7:G>T
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+
+Specifying [`-a`] [`--best`] results in the same alignments being printed
+as if just [`-a`] had been specified, but they are guaranteed to be
+reported in best-to-worst order.
+
+### Example 6: `-a --best --strata`
+
+ $ ./bowtie -a --best --strata -v 2 --suppress 1,5,6,7 e_coli -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+
+Specifying [`--strata`] in addition to [`-a`] and [`--best`] causes
+`bowtie` to report only those alignments in the best alignment
+"stratum". The alignments in the best stratum are those having the
+fewest mismatches (or the fewest mismatches in just the "seed" portion
+of the alignment in the case of [`-n`] mode). Note that if [`--strata`]
+is specified, [`--best`] must also be specified.
+
+### Example 7: `-a -m 3`
+
+ $ ./bowtie -a -m 3 -v 2 e_coli -c ATGCATCATGCGCCAT
+ No alignments
+
+Specifying [`-m`] 3 instructs bowtie to refrain from reporting any
+alignments for reads having more than 3 reportable alignments. The
+[`-m`] option is useful when the user would like to guarantee that
+reported alignments are "unique", for some definition of unique.
+
+Example 1 showed that the read has 5 reportable alignments when [`-a`]
+and [`-v`] 2 are specified, so the [`-m`] 3 limit causes bowtie to
+output no alignments.
+
+### Example 8: `-a -m 5`
+
+ $ ./bowtie -a -m 5 -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 148810 10:A>G,13:C>G
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+ - gi|110640213|ref|NC_008253.1| 4930433 4:G>T,6:C>G
+ - gi|110640213|ref|NC_008253.1| 905664 6:A>G,7:G>T
+ + gi|110640213|ref|NC_008253.1| 1093035 2:T>G,15:A>T
+
+Specifying [`-m`] 5 instructs bowtie to refrain from reporting any
+alignments for reads having more than 5 reportable alignments. Since
+the read has exactly 5 reportable alignments, the [`-m`] 5 limit allows
+`bowtie` to print them as usual.
+
+### Example 9: `-a -m 3 --best --strata`
+
+ $ ./bowtie -a -m 3 --best --strata -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
+ - gi|110640213|ref|NC_008253.1| 2852852 8:T>A
+
+Specifying [`-m`] 3 instructs bowtie to refrain from reporting any
+alignments for reads having more than 3 reportable alignments. As we
+saw in Example 6, the read has only 1 reportable alignment when [`-a`],
+[`--best`] and [`--strata`] are specified, so the [`-m`] 3 limit allows
+`bowtie` to print that alignment as usual.
+
+Intuitively, the [`-m`] option, when combined with the [`--best`] and
+[`--strata`] options, guarantees a principled, though weaker, form of
+"uniqueness." A stronger form of uniqueness is enforced when [`-m`] is
+specified but [`--best`] and [`--strata`] are not.
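+
+The interplay of the options in Examples 1 through 9 can be summarized
+with a toy model. The Python sketch below is a simplification written
+for this manual's examples, not Bowtie's search procedure: `report`
+and its parameter names are hypothetical, it starts from a complete
+list of valid alignments (Bowtie discovers them incrementally and
+breaks ties randomly), and it ignores [`-M`]. The list holds the five
+alignments from Example 1 with their mismatch counts.
+

```python
# (strand, offset, mismatches) in the order Example 1 lists them.
ALIGNMENTS = [
    ("-", 148810, 2),
    ("-", 2852852, 1),
    ("-", 4930433, 2),
    ("-", 905664, 2),
    ("+", 1093035, 2),
]


def report(alns, k=1, report_all=False, m=None, best=False, strata=False):
    """Toy model of -k/-a/-m/--best/--strata over a complete hit list."""
    if strata and not best:
        raise ValueError("--strata requires --best")
    hits = list(alns)
    if best:
        # Best-to-worst order; ties keep their input order here.
        hits.sort(key=lambda a: a[2])
    if strata and hits:
        top = hits[0][2]
        hits = [a for a in hits if a[2] == top]
    if m is not None and len(hits) > m:
        return []  # too many reportable alignments: suppress them all
    return hits if report_all else hits[:k]


if __name__ == "__main__":
    print(report(ALIGNMENTS, report_all=True, m=3))
    print(report(ALIGNMENTS, report_all=True, m=3, best=True, strata=True))
```

+
+The two calls correspond to Examples 7 and 9: the same `-m 3` ceiling
+suppresses everything when five alignments are reportable, but lets
+the single best-stratum alignment through once `--best --strata`
+narrows the reportable set.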
+
+Paired-end Alignment
+--------------------
+
+`bowtie` can align paired-end reads when properly paired read files are
+specified using the [`-1`](#command-line) and [`-2`](#command-line) options (for pairs of raw, FASTA, or
+FASTQ read files), or using the [`--12`](#command-line) option (for tab-delimited read
+files). A valid paired-end alignment satisfies these criteria:
+
+1. Both mates have a valid alignment according to the alignment policy
+ defined by the [`-v`]/[`-n`]/[`-e`]/[`-l`] options.
+2. The relative orientation and position of the mates satisfy the
+ constraints defined by the [`-I`]/[`-X`]/[`--fr`]/[`--rf`]/[`--ff`]
+ options.
+
+Policies governing which paired-end alignments are reported for a
+given read are specified using the [`-k`], [`-a`] and [`-m`] options as
+usual. The [`--strata`] and [`--best`] options do not apply in
+paired-end mode.
+
+A paired-end alignment is reported as a pair of mate alignments, each
+on a separate line, where the alignment for each mate is formatted the
+same as an unpaired (singleton) alignment. The alignment for the mate
+that occurs closest to the beginning of the reference sequence (the
+"upstream" mate) is always printed before the alignment for the
+downstream mate. Read files containing paired-end reads will
+sometimes name the reads according to whether they are the #1 or #2
+mates by appending a `/1` or `/2` suffix to the read name. If no such
+suffix is present in Bowtie's input, the suffix will be added when
+Bowtie prints read names in alignments (except in [`-S`] "SAM" mode,
+where mate information is encoded in the `FLAGS` field instead).
+
+Finding a valid paired-end alignment where both mates align to
+repetitive regions of the reference can be very time-consuming. By
+default, Bowtie avoids much of this cost by imposing a limit on the
+number of "tries" it makes to match an alignment for one mate with a
+nearby alignment for the other. The default limit is 100. This causes
+`bowtie` to miss some valid paired-end alignments where both mates lie
+in repetitive regions, but the user may use the [`--pairtries`] or
+[`-y`] options to increase Bowtie's sensitivity as desired.
+
+Paired-end alignments where one mate's alignment is entirely contained
+within the other's are considered invalid.
+
+When colorspace alignment is enabled via [`-C`], the default setting for
+paired-end orientation is [`--ff`]. This is because most SOLiD datasets
+have that orientation. When colorspace alignment is not enabled
+(default), the default setting for orientation is [`--fr`], since most
+Illumina datasets have this orientation. The default can be overridden
+in either case.
+
+Because Bowtie uses an in-memory representation of the original
+reference string when finding paired-end alignments, its memory
+footprint is larger when aligning paired-end reads. For example, the
+human index has a memory footprint of about 2.2 GB in single-end mode
+and 2.9 GB in paired-end mode. Note that paired-end and unpaired
+alignment incur the same memory footprint in colorspace (e.g. human
+incurs about 2.9 GB).
+
+Colorspace Alignment
+--------------------
+
+[Colorspace alignment]: #colorspace-alignment
+
+As of version 0.12.0, `bowtie` can align colorspace reads against a
+colorspace index when [`-C`] is specified. Colorspace is the
+characteristic output format of Applied Biosystems' SOLiD system. In a
+colorspace read, each character is a color rather than a nucleotide,
+where a color encodes a class of dinucleotides. E.g. the color blue
+encodes any of the dinucleotides: AA, CC, GG, TT. Colorspace has the
+advantage of (often) being able to distinguish sequencing errors from
+SNPs once the read has been aligned. See ABI's [Principles of Di-Base
+Sequencing] document for details.
+
+### Colorspace reads
+
+All input formats (FASTA [`-f`], FASTQ [`-q`], raw [`-r`], tab-delimited
+[`--12`](#command-line), command-line [`-c`]) are compatible with colorspace ([`-C`]).
+When [`-C`] is specified, read sequences are treated as colors. Colors
+may be encoded either as numbers (`0`=blue, `1`=green, `2`=orange,
+`3`=red) or as characters `A/C/G/T` (`A`=blue, `C`=green, `G`=orange,
+`T`=red).
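+
+The color encoding can be made concrete with a short sketch. It assumes
+the standard SOLiD di-base scheme, in which a color is the XOR of the
+2-bit codes of two adjacent nucleotides (`A`=0, `C`=1, `G`=2, `T`=3).
+The helper names are illustrative and are not part of Bowtie.
+
+```python
+# Illustrative only: these helpers are not part of Bowtie.
+BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
+COLOR_LETTER = "ACGT"  # 0=blue->A, 1=green->C, 2=orange->G, 3=red->T
+
+
+def encode_colors(nucleotides):
+    """Return the numeric color string for a nucleotide sequence."""
+    codes = [BASE_CODE[b] for b in nucleotides.upper()]
+    # Each color is determined by one pair of adjacent bases.
+    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))
+
+
+def colors_as_letters(colors):
+    """Rewrite numeric colors (0-3) using the A/C/G/T color alphabet."""
+    return "".join(COLOR_LETTER[int(c)] for c in colors)
+
+
+print(encode_colors("AACCGGTT"))   # AA, CC, GG and TT all map to 0 (blue)
+print(colors_as_letters("0103010"))
+```
+
+A read of N nucleotides yields N-1 colors, since each color spans two
+adjacent bases.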
+
+Some reads include a primer base as the first character; e.g.:
+
+ >1_53_33_F3
+ T2213120002010301233221223311331
+ >1_53_70_F3
+ T2302111203131231130300111123220
+ ...
+
+Here, `T` is the primer base. `bowtie` detects and handles primer
+bases properly (i.e., the primer base and the adjacent color are both
+trimmed away prior to alignment) as long as the rest of the read is
+encoded as numbers.
+
+`bowtie` also handles input in the form of parallel `.csfasta` and
+`_QV.qual` files. Use [`-f`] to specify the `.csfasta` files and [`-Q`]
+(for unpaired reads) or [`--Q1`]/[`--Q2`] (for paired-end reads) to
+specify the corresponding `_QV.qual` files. It is not necessary to
+first convert to FASTQ, though `bowtie` also handles FASTQ-formatted
+colorspace reads (with [`-q`], the default).
+
+### Building a colorspace index
+
+A colorspace index is built in the same way as a normal index except
+that [`-C`](#bowtie-build-options-C) must be specified when running `bowtie-build`. If the user
+attempts to use `bowtie` without [`-C`] to align against an index that
+was built with [`-C`] (or vice versa), `bowtie` prints an error message
+and quits.
+
+### Decoding colorspace alignments
+
+Once a colorspace read is aligned, Bowtie decodes the alignment into
+nucleotides and reports the decoded nucleotide sequence. A principled
+decoding scheme is necessary because many different decodings are
+usually possible. Finding the true decoding with 100% certainty
+requires knowing all variants (e.g. SNPs) in the subject's genome
+beforehand, which is usually not possible. Instead, `bowtie` employs
+the approximate decoding scheme described in the [BWA paper]. This
+scheme attempts to distinguish variants from sequencing errors
+according to their relative likelihood under a model that considers the
+quality values of the colors and the (configurable) global likelihood
+of a SNP.
+
+Quality values are also "decoded" so that each reported quality value
+is a function of the two color qualities overlapping it. Bowtie again
+adopts the scheme described in the [BWA paper], i.e., the decoded
+nucleotide quality is either the sum of the overlapping color qualities
+(when both overlapping colors correspond to bases that match in the
+alignment), the quality of the matching color minus the quality of the
+mismatching color (when exactly one overlapping color corresponds to a
+mismatch), or 0 (when both overlapping colors correspond to
+mismatches).
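+
+The three cases of the quality rule can be written as a small function.
+This is a sketch of the rule as stated here, not Bowtie's code; the
+function name and signature are hypothetical, and the handling of a
+negative difference is left open because the text does not specify it.
+
+```python
+def decoded_quality(q_left, q_right, left_mismatch, right_mismatch):
+    """Quality of a decoded nucleotide from its two overlapping colors.
+
+    q_left, q_right: Phred qualities of the two overlapping colors.
+    left_mismatch, right_mismatch: True if that color is a mismatch.
+    """
+    if not left_mismatch and not right_mismatch:
+        return q_left + q_right      # both colors match: sum
+    if left_mismatch and right_mismatch:
+        return 0                     # both colors mismatch
+    # Exactly one mismatch: matching quality minus mismatching quality.
+    # (A negative result is returned as-is; clamping is not specified.)
+    if left_mismatch:
+        return q_right - q_left
+    return q_left - q_right
+
+
+print(decoded_quality(20, 15, False, False))
+print(decoded_quality(30, 10, False, True))
+print(decoded_quality(30, 30, True, True))
+```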
+
+For accurate decoding, [`--snpphred`]/[`--snpfrac`] should be set according
+to the user's best guess of the SNP frequency in the subject. The
+[`--snpphred`] parameter sets the SNP penalty directly (on the [Phred
+quality] scale), whereas [`--snpfrac`] allows the user to specify the
+fraction of sites expected to be SNPs; the fraction is then converted
+to a [Phred quality] internally. For the purpose of decoding, the SNP
+fraction is defined in terms of SNPs per *haplotype* base. Thus, if
+the genome is diploid, heterozygous SNPs have half the weight of
+homozygous SNPs.
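+
+The relationship between the two parameters follows the usual Phred
+definition, Q = -10 * log10(p). The sketch below applies that
+definition; the function names are hypothetical, and how `bowtie`
+rounds the converted value internally is not stated here.
+
+```python
+import math
+
+
+def snpfrac_to_phred(frac):
+    """Convert an expected SNP fraction to the Phred scale."""
+    return -10.0 * math.log10(frac)
+
+
+def phred_to_snpfrac(phred):
+    """Inverse conversion: Phred-scaled penalty back to a fraction."""
+    return 10.0 ** (-phred / 10.0)
+
+
+# About 1 SNP per 1,000 haplotype bases.
+print(round(snpfrac_to_phred(0.001)))
+# About 1 SNP per 100 haplotype bases.
+print(round(snpfrac_to_phred(0.01)))
+```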
+
+Note that in [`-S`/`--sam`] mode, the decoded nucleotide sequence is
+printed for alignments, but the original color sequence (with `A`=blue,
+`C`=green, `G`=orange, `T`=red) is printed for reads without any
+reported alignments. As always, the [`--un`], [`--max`] and [`--al`]
+parameters print reads exactly as they appeared in the input file.
+
+### Paired-end colorspace alignment
+
+Like other platforms, SOLiD supports generation of paired-end reads.
+When colorspace alignment is enabled, the default paired-end
+orientation setting is [`--ff`]. This is because most SOLiD datasets
+have that orientation.
+
+Note that SOLiD-generated read files can have "orphaned" mates; i.e.
+mates without a correspondingly-named mate in the other file. To avoid
+problems due to orphaned mates, SOLiD paired-end output should first be
+converted to `.csfastq` files with unpaired mates omitted. This can be
+accomplished using, for example, [Galaxy]'s conversion tool (click
+"NGS: QC and manipulation", then "SOLiD-to-FASTQ" in the left-hand
+sidebar).
+
+[Principles of Di-Base Sequencing]: http://tinyurl.com/ygnb2gn
+[Decoding colorspace alignments]: #decoding-colorspace-alignments
+[BWA paper]: http://bioinformatics.oxfordjournals.org/cgi/content/abstract/25/14/1754
+
+Performance Tuning
+------------------
+
+[Performance tuning]: #performance-tuning
+
+1. Use 64-bit bowtie if possible
+
+ The 64-bit version of Bowtie is substantially (usually more than
+ 50%) faster than the 32-bit version, owing to its use of 64-bit
+ arithmetic. If possible, download the 64-bit binaries for Bowtie
+ and run on a 64-bit computer. If you are building Bowtie from
+ sources, you may need to pass the `-m64` option to `g++` to compile
+ the 64-bit version; you can do this by including `BITS=64` in the
+ arguments to the `make` command; e.g.: `make BITS=64 bowtie`. To
+ determine whether your version of bowtie is 64-bit or 32-bit, run
+ `bowtie --version`.
+
+2. If your computer has multiple processors/cores, use `-p`
+
+ The [`-p`] option causes Bowtie to launch a specified number of
+ parallel search threads. Each thread runs on a different
+ processor/core and all threads find alignments in parallel,
+ increasing alignment throughput by approximately a multiple of the
+ number of threads (though in practice, speedup is somewhat worse
+ than linear).
+
+3. If reporting many alignments per read, try tweaking
+ `bowtie-build --offrate`
+
+ If you are using the [`-k`], [`-a`] or [`-m`] options and Bowtie is
+ reporting many alignments per read (an average of more than about
+ 10 per read) and you have some memory to spare, using an index with
+ a denser SA sample can speed things up considerably.
+
+ To do this, specify a smaller-than-default [`-o`/`--offrate`](#bowtie-build-options-o) value
+ when running `bowtie-build`. A denser SA sample yields a larger
+ index, but is also particularly effective at speeding up alignment
+ when many alignments are reported per read. For example,
+ decreasing the index's [`-o`/`--offrate`](#bowtie-build-options-o) by 1 could as much as
+ double alignment performance, and decreasing by 2 could quadruple
+ alignment performance, etc.
+
+ On the other hand, decreasing [`-o`/`--offrate`](#bowtie-build-options-o) increases the size
+ of the Bowtie index, both on disk and in memory when aligning
+ reads. At the default [`-o`/`--offrate`](#bowtie-build-options-o) of 5, the SA sample for the
+ human genome occupies about 375 MB of memory when aligning reads.
+ Decreasing the [`-o`/`--offrate`](#bowtie-build-options-o) by 1 doubles the memory taken by
+ the SA sample, and decreasing by 2 quadruples the memory taken,
+ etc.
+
+4. If bowtie "thrashes", try increasing `bowtie --offrate`
+
+ If `bowtie` runs very slowly on a relatively low-memory machine
+ (having less than about 4 GB of memory), then try setting `bowtie`
+ [`-o`/`--offrate`] to a *larger* value than the value used to build
+ the index. For example, `bowtie-build`'s default [`-o`/`--offrate`](#bowtie-build-options-o)
+ is 5 and all pre-built indexes available from the Bowtie website
+ are built with [`-o`/`--offrate`](#bowtie-build-options-o) 5; so if `bowtie` thrashes when
+ querying such an index, try using `bowtie` [`--offrate`] 6. If
+ `bowtie` still thrashes, try `bowtie` [`--offrate`] 7, etc. A higher
+ [`-o`/`--offrate`] causes `bowtie` to use a sparser sample of the
+ suffix array than is stored in the index; this saves memory but
+ makes alignment reporting slower (which is especially slow when
+ using [`-a`] or large [`-k`] or [`-m`]).
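+
+The memory trade-off in tips 3 and 4 can be estimated with a rough
+sketch. It takes the 375 MB human-genome figure at the default offrate
+of 5 and assumes a factor of two per offrate step in both directions.
+The text states the doubling for decreases; the halving for increases
+is an extrapolation, and the function name is hypothetical.
+
+```python
+DEFAULT_OFFRATE = 5
+HUMAN_SA_SAMPLE_MB = 375.0   # figure quoted for the default offrate
+
+
+def sa_sample_mb(offrate):
+    """Estimated SA-sample memory (MB) for the human index."""
+    return HUMAN_SA_SAMPLE_MB * 2.0 ** (DEFAULT_OFFRATE - offrate)
+
+
+for offrate in (3, 4, 5, 6, 7):
+    print(offrate, sa_sample_mb(offrate))
+```
+
+These figures cover only the SA sample, not the whole index.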
+
+Command Line
+------------
+
+Usage:
+
+ bowtie [options]* <ebwt> {-1 <m1> -2 <m2> | --12 <r> | <s>} [<hit>]
+
+### Main arguments
+
+<table><tr><td>
+
+ <ebwt>
+
+</td><td>
+
+The basename of the index to be searched. The basename is the name of
+any of the index files up to but not including the final `.1.ebwt` /
+`.rev.1.ebwt` / etc. `bowtie` looks for the specified index first in
+the current directory, then in the `indexes` subdirectory under the
+directory where the `bowtie` executable is located, then looks in the
+directory specified in the `BOWTIE_INDEXES` environment variable.
+
+</td></tr><tr><td>
+
+ <m1>
+
+</td><td>
+
+Comma-separated list of files containing the #1 mates (filename usually
+includes `_1`), or, if [`-c`] is specified, the mate sequences
+themselves. E.g., this might be `flyA_1.fq,flyB_1.fq`, or, if [`-c`]
+is specified, this might be `GGTCATCCT,ACGGGTCGT`. Sequences specified
+with this option must correspond file-for-file and read-for-read with
+those specified in `<m2>`. Reads may be a mix of different lengths.
+If `-` is specified, `bowtie` will read the #1 mates from the "standard
+in" filehandle.
+
+</td></tr><tr><td>
+
+ <m2>
+
+</td><td>
+
+Comma-separated list of files containing the #2 mates (filename usually
+includes `_2`), or, if [`-c`] is specified, the mate sequences
+themselves. E.g., this might be `flyA_2.fq,flyB_2.fq`, or, if [`-c`]
+is specified, this might be `GGTCATCCT,ACGGGTCGT`. Sequences specified
+with this option must correspond file-for-file and read-for-read with
+those specified in `<m1>`. Reads may be a mix of different lengths.
+If `-` is specified, `bowtie` will read the #2 mates from the "standard
+in" filehandle.
+
+</td></tr><tr><td>
+
+ <r>
+
+</td><td>
+
+Comma-separated list of files containing a mix of unpaired and
+paired-end reads in Tab-delimited format. Tab-delimited format is a
+1-read-per-line format where unpaired reads consist of a read name,
+sequence and quality string each separated by tabs. A paired-end read
+consists of a read name, sequence of the #1 mate, quality values of the
+#1 mate, sequence of the #2 mate, and quality values of the #2 mate
+separated by tabs. Quality values can be expressed using any of the
+scales supported in FASTQ files. Reads may be a mix of different
+lengths and paired-end and unpaired reads may be intermingled in the
+same file. If `-` is specified, `bowtie` will read the Tab-delimited
+reads from the "standard in" filehandle.
+
+</td></tr><tr><td>
+
+ <s>
+
+</td><td>
+
+A comma-separated list of files containing unpaired reads to be
+aligned, or, if [`-c`] is specified, the unpaired read sequences
+themselves. E.g., this might be
+`lane1.fq,lane2.fq,lane3.fq,lane4.fq`, or, if [`-c`] is specified, this
+might be `GGTCATCCT,ACGGGTCGT`. Reads may be a mix of different
+lengths. If `-` is specified, Bowtie gets the reads from the "standard
+in" filehandle.
+
+</td></tr><tr><td>
+
+ <hit>
+
+</td><td>
+
+File to write alignments to. By default, alignments are written to the
+"standard out" filehandle (i.e. the console).
+
+</td></tr></table>
+
+### Options
+
+#### Input
+
+<table>
+<tr><td id="bowtie-options-q">
+
+[`-q`]: #bowtie-options-q
+
+ -q
+
+</td><td>
+
+The query input files (specified either as `<m1>` and `<m2>`, or as
+`<s>`) are FASTQ files (usually having extension `.fq` or `.fastq`).
+This is the default. See also: [`--solexa-quals`] and
+[`--integer-quals`].
+
+</td></tr><tr><td id="bowtie-options-f">
+
+[`-f`]: #bowtie-options-f
+
+ -f
+
+</td><td>
+
+The query input files (specified either as `<m1>` and `<m2>`, or as
+`<s>`) are FASTA files (usually having extension `.fa`, `.mfa`, `.fna`
+or similar). All quality values are assumed to be 40 on the [Phred
+quality] scale.
+
+</td></tr><tr><td id="bowtie-options-r">
+
+[`-r`]: #bowtie-options-r
+
+ -r
+
+</td><td>
+
+The query input files (specified either as `<m1>` and `<m2>`, or as
+`<s>`) are Raw files: one sequence per line, without quality values or
+names. All quality values are assumed to be 40 on the [Phred quality]
+scale.
+
+</td></tr><tr><td id="bowtie-options-c">
+
+[`-c`]: #bowtie-options-c
+
+ -c
+
+</td><td>
+
+The query sequences are given on the command line. I.e. `<m1>`, `<m2>` and
+`<s>` are comma-separated lists of reads rather than lists of
+read files.
+
+</td></tr><tr><td id="bowtie-options-C">
+
+[`-C`]: #bowtie-options-C
+[`-C`/`--color`]: #bowtie-options-C
+
+ -C/--color
+
+</td><td>
+
+Align in colorspace. Read characters are interpreted as colors. The
+index specified must be a colorspace index (i.e. built with
+`bowtie-build` [`-C`](#bowtie-build-options-C)), or `bowtie` will print an error message and quit.
+See [Colorspace alignment] for more details.
+
+</td></tr><tr><td id="bowtie-options-Q">
+
+[`-Q`]: #bowtie-options-Q
+[`-Q`/`--quals`]: #bowtie-options-Q
+
+ -Q/--quals <files>
+
+</td><td>
+
+Comma-separated list of files containing quality values for
+corresponding unpaired CSFASTA reads. Use in combination with [`-C`]
+and [`-f`]. [`--integer-quals`] is set automatically when `-Q`/`--quals`
+is specified.
+
+</td></tr><tr><td id="bowtie-options-Q1">
+
+[`--Q1`]: #bowtie-options-Q1
+
+ --Q1 <files>
+
+</td><td>
+
+Comma-separated list of files containing quality values for
+corresponding CSFASTA #1 mates. Use in combination with [`-C`], [`-f`],
+and [`-1`](#command-line). [`--integer-quals`] is set automatically when `--Q1`
+is specified.
+
+</td></tr><tr><td id="bowtie-options-Q2">
+
+[`--Q2`]: #bowtie-options-Q2
+
+ --Q2 <files>
+
+</td><td>
+
+Comma-separated list of files containing quality values for
+corresponding CSFASTA #2 mates. Use in combination with [`-C`], [`-f`],
+and [`-2`](#command-line). [`--integer-quals`] is set automatically when `--Q2`
+is specified.
+
+</td></tr><tr><td id="bowtie-options-s">
+
+[`-s`/`--skip`]: #bowtie-options-s
+[`-s`]: #bowtie-options-s
+
+ -s/--skip <int>
+
+</td><td>
+
+Skip (i.e. do not align) the first `<int>` reads or pairs in the input.
+
+</td></tr><tr><td id="bowtie-options-u">
+
+[`-u`/`--qupto`]: #bowtie-options-u
+[`-u`]: #bowtie-options-u
+
+ -u/--qupto <int>
+
+</td><td>
+
+Only align the first `<int>` reads or read pairs from the input (after
+the [`-s`/`--skip`] reads or pairs have been skipped). Default: no
+limit.
+
+</td></tr><tr><td id="bowtie-options-5">
+
+[`-5`/`--trim5`]: #bowtie-options-5
+[`-5`]: #bowtie-options-5
+
+ -5/--trim5 <int>
+
+</td><td>
+
+Trim `<int>` bases from the high-quality (left) end of each read before
+alignment (default: 0).
+
+</td></tr><tr><td id="bowtie-options-3">
+
+[`-3`/`--trim3`]: #bowtie-options-3
+[`-3`]: #bowtie-options-3
+
+ -3/--trim3 <int>
+
+</td><td>
+
+Trim `<int>` bases from the low-quality (right) end of each read before
+alignment (default: 0).
+
+</td></tr><tr><td id="bowtie-options-phred33-quals">
+
+[`--phred33-quals`]: #bowtie-options-phred33-quals
+
+ --phred33-quals
+
+</td><td>
+
+Input qualities are ASCII chars equal to the [Phred quality] plus 33.
+Default: on.
+
+</td></tr><tr><td id="bowtie-options-phred64-quals">
+
+[`--phred64-quals`]: #bowtie-options-phred64-quals
+
+ --phred64-quals
+
+</td><td>
+
+Input qualities are ASCII chars equal to the [Phred quality] plus 64.
+Default: off.
+
+</td></tr><tr><td id="bowtie-options-solexa-quals">
+
+[`--solexa-quals`]: #bowtie-options-solexa-quals
+
+ --solexa-quals
+
+</td><td>
+
+Convert input qualities from [Solexa][Phred quality] (which can be
+negative) to [Phred][Phred quality] (which can't). This is usually the
+right option for use with (unconverted) reads emitted by GA Pipeline
+versions prior to 1.3. Default: off.
+
+</td></tr><tr><td id="bowtie-options-solexa1.3-quals">
+
+[`--solexa1.3-quals`]: #bowtie-options-solexa1.3-quals
+
+ --solexa1.3-quals
+
+</td><td>
+
+Same as [`--phred64-quals`]. This is usually the right option for use
+with (unconverted) reads emitted by GA Pipeline version 1.3 or later.
+Default: off.
+
+</td></tr><tr><td id="bowtie-options-integer-quals">
+
+[`--integer-quals`]: #bowtie-options-integer-quals
+
+ --integer-quals
+
+</td><td>
+
+Quality values are represented in the read input file as
+space-separated ASCII integers, e.g., `40 40 30 40`..., rather than
+ASCII characters, e.g., `II?I`.... Integers are treated as being on
+the [Phred quality] scale unless [`--solexa-quals`] is also specified.
+Default: off.
+
+</td></tr></table>
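+
+The quality encodings in the table above can be illustrated with a
+short sketch. The ASCII offsets and the `II?I` / `40 40 30 40` example
+come from the option descriptions; the Solexa-to-Phred formula is the
+standard conversion, and whether `bowtie` rounds its result is not
+stated here. All function names are hypothetical.
+
+```python
+import math
+
+
+def decode_quals(qual_string, offset=33):
+    """Decode ASCII-encoded qualities; offset is 33 or 64."""
+    return [ord(ch) - offset for ch in qual_string]
+
+
+def parse_integer_quals(qual_string):
+    """Parse space-separated integer qualities (--integer-quals style)."""
+    return [int(tok) for tok in qual_string.split()]
+
+
+def solexa_to_phred(q_solexa):
+    """Standard Solexa-to-Phred conversion (result left unrounded)."""
+    return 10.0 * math.log10(10.0 ** (q_solexa / 10.0) + 1.0)
+
+
+print(decode_quals("II?I"))               # Phred+33 characters
+print(parse_integer_quals("40 40 30 40"))
+print(decode_quals("hh^h", offset=64))    # same values as Phred+64
+print(solexa_to_phred(-5))                # negative Solexa, positive Phred
+```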
+
+#### Alignment
+
+<table>
+
+<tr><td id="bowtie-options-v">
+
+[`-v`]: #bowtie-options-v
+
+ -v <int>
+
+</td><td>
+
+Report alignments with at most `<int>` mismatches. [`-e`] and [`-l`]
+options are ignored and quality values have no effect on what
+alignments are valid. [`-v`] is mutually exclusive with [`-n`].
+
+</td></tr><tr><td id="bowtie-options-n">
+
+[`-n`/`--seedmms`]: #bowtie-options-n
+[`-n`]: #bowtie-options-n
+
+ -n/--seedmms <int>
+
+</td><td>
+
+Maximum number of mismatches permitted in the "seed", i.e. the first
+`L` base pairs of the read (where `L` is set with [`-l`/`--seedlen`]).
+This may be 0, 1, 2 or 3 and the default is 2. This option is mutually
+exclusive with the [`-v`] option.
+
+</td></tr><tr><td id="bowtie-options-e">
+
+[`-e`/`--maqerr`]: #bowtie-options-e
+[`-e`]: #bowtie-options-e
+
+ -e/--maqerr <int>
+
+</td><td>
+
+Maximum permitted total of quality values at *all* mismatched read
+positions throughout the entire alignment, not just in the "seed". The
+default is 70. Like [Maq], `bowtie` rounds quality values to the
+nearest 10 and saturates at 30; rounding can be disabled with
+[`--nomaqround`].
+
+</td></tr><tr><td id="bowtie-options-l">
+
+[`-l`/`--seedlen`]: #bowtie-options-l
+[`-l`]: #bowtie-options-l
+
+ -l/--seedlen <int>
+
+</td><td>
+
+The "seed length"; i.e., the number of bases on the high-quality end of
+the read to which the [`-n`] ceiling applies. The lowest permitted
+setting is 5 and the default is 28. `bowtie` is faster for larger
+values of [`-l`].
+
+</td></tr><tr><td id="bowtie-options-nomaqround">
+
+[`--nomaqround`]: #bowtie-options-nomaqround
+
+ --nomaqround
+
+</td><td>
+
+[Maq] accepts quality values in the [Phred quality] scale, but
+internally rounds values to the nearest 10, with a maximum of 30. By
+default, `bowtie` also rounds this way. [`--nomaqround`] prevents this
+rounding in `bowtie`.
+
+</td></tr><tr><td id="bowtie-options-I">
+
+[`-I`/`--minins`]: #bowtie-options-I
+[`-I`]: #bowtie-options-I
+
+ -I/--minins <int>
+
+</td><td>
+
+The minimum insert size for valid paired-end alignments. E.g. if `-I
+60` is specified and a paired-end alignment consists of two 20-bp
+alignments in the appropriate orientation with a 20-bp gap between
+them, that alignment is considered valid (as long as [`-X`] is also
+satisfied). A 19-bp gap would not be valid in that case. If trimming
+options [`-3`] or [`-5`] are also used, the [`-I`] constraint is
+applied with respect to the untrimmed mates. Default: 0.
+
+</td></tr><tr><td id="bowtie-options-X">
+
+[`-X`/`--maxins`]: #bowtie-options-X
+[`-X`]: #bowtie-options-X
+
+ -X/--maxins <int>
+
+</td><td>
+
+The maximum insert size for valid paired-end alignments. E.g. if `-X
+100` is specified and a paired-end alignment consists of two 20-bp
+alignments in the proper orientation with a 60-bp gap between them,
+that alignment is considered valid (as long as [`-I`] is also
+satisfied). A 61-bp gap would not be valid in that case. If trimming
+options [`-3`] or [`-5`] are also used, the `-X` constraint is applied
+with respect to the untrimmed mates, not the trimmed mates. Default:
+250.
+
+</td></tr><tr><td id="bowtie-options-fr">
+
+[`--fr`/`--rf`/`--ff`]: #bowtie-options-fr
+[`--fr`]: #bowtie-options-fr
+[`--rf`]: #bowtie-options-fr
+[`--ff`]: #bowtie-options-fr
+
+ --fr/--rf/--ff
+
+</td><td>
+
+The upstream/downstream mate orientations for a valid paired-end
+alignment against the forward reference strand. E.g., if `--fr` is
+specified and there is a candidate paired-end alignment where mate1
+appears upstream of the reverse complement of mate2 and the insert
+length constraints are met, that alignment is valid. Also, if mate2
+appears upstream of the reverse complement of mate1 and all other
+constraints are met, that too is valid. `--rf` likewise requires that
+an upstream mate1 be reverse-complemented and a downstream mate2 be
+forward-oriented. `--ff` requires both an upstream mate1 and a
+downstream mate2 to be forward-oriented. Default: `--fr` when [`-C`]
+(colorspace alignment) is not specified, `--ff` when [`-C`] is specified.
+
+</td></tr><tr><td id="bowtie-options-nofw">
+
+[`--nofw`]: #bowtie-options-nofw
+
+ --nofw/--norc
+
+</td><td>
+
+If `--nofw` is specified, `bowtie` will not attempt to align against
+the forward reference strand. If `--norc` is specified, `bowtie` will
+not attempt to align against the reverse-complement reference strand.
+For paired-end reads using [`--fr`] or [`--rf`] modes, `--nofw` and
+`--norc` apply to the forward and reverse-complement pair orientations.
+I.e. specifying `--nofw` and [`--fr`] will only find reads in the R/F
+orientation where mate 2 occurs upstream of mate 1 with respect to the
+forward reference strand.
+
+</td></tr><tr><td id="bowtie-options-maxbts">
+
+[`--maxbts`]: #bowtie-options-maxbts
+
+ --maxbts
+
+</td><td>
+
+The maximum number of backtracks permitted when aligning a read in
+[`-n`] 2 or [`-n`] 3 mode (default: 125 without [`--best`], 800 with
+[`--best`]). A "backtrack" is the introduction of a speculative
+substitution into the alignment. Without this limit, the default
+parameters will sometimes require that `bowtie` try 100s or 1,000s of
+backtracks to align a read, especially if the read has many low-quality
+bases and/or has no valid alignments, slowing bowtie down
+significantly. However, this limit may cause some valid alignments to
+be missed. Higher limits yield greater sensitivity at the expense of
+longer running times. See also: [`-y`/`--tryhard`].
+
+</td></tr><tr><td id="bowtie-options-pairtries">
+
+[`--pairtries`]: #bowtie-options-pairtries
+
+ --pairtries <int>
+
+</td><td>
+
+For paired-end alignment, this is the maximum number of attempts
+`bowtie` will make to match an alignment for one mate up with an
+alignment for the opposite mate. Most paired-end alignments require
+only a few such attempts, but pairs where both mates occur in highly
+repetitive regions of the reference can require significantly more.
+Setting this to a higher number allows `bowtie` to find more
+paired-end alignments for repetitive pairs at the expense of speed. The
+default is 100. See also: [`-y`/`--tryhard`].
+
+</td></tr><tr><td id="bowtie-options-y">
+
+[`-y`/`--tryhard`]: #bowtie-options-y
+[`-y`]: #bowtie-options-y
+
+ -y/--tryhard
+
+</td><td>
+
+Try as hard as possible to find valid alignments when they exist,
+including paired-end alignments. This is equivalent to specifying very
+high values for the [`--maxbts`] and [`--pairtries`] options. This
+mode is generally much slower than the default settings, but can be
+useful for certain problems. This mode is slower when (a) the
+reference is very repetitive, (b) the reads are low quality, or (c) not
+many reads have valid alignments.
+
+</td></tr><tr><td id="bowtie-options-chunkmbs">
+
+[`--chunkmbs`]: #bowtie-options-chunkmbs
+
+ --chunkmbs <int>
+
+</td><td>
+
+The number of megabytes of memory a given thread is given to store path
+descriptors in [`--best`] mode. Best-first search must keep track of
+many paths at once to ensure it is always extending the path with the
+lowest cumulative cost. Bowtie tries to minimize the memory impact of
+the descriptors, but they can still grow very large in some cases. If
+you receive an error message saying that chunk memory has been
+exhausted in [`--best`] mode, try adjusting this parameter up to
+dedicate more memory to the descriptors. Default: 64.
+
+</td></tr></table>
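+
+Two of the numeric rules in the table above, the [`-e`] ceiling with
+Maq-style rounding and the [`-I`]/[`-X`] insert-size limits, can be
+worked through in a short sketch. The function names are hypothetical,
+and rounding ties such as 25 upward is an assumption; the text says
+only "nearest 10".
+
+```python
+def maq_round(q):
+    """Round a Phred quality to the nearest 10, saturating at 30.
+
+    Ties (e.g. 25) are rounded up here; that choice is an assumption.
+    """
+    return min(30, 10 * ((q + 5) // 10))
+
+
+def within_maqerr(mismatch_quals, maqerr=70, maqround=True):
+    """True if the mismatched-position qualities satisfy the -e ceiling."""
+    quals = [maq_round(q) if maqround else q for q in mismatch_quals]
+    return sum(quals) <= maqerr
+
+
+def insert_size_ok(mate1_len, gap, mate2_len, minins=0, maxins=250):
+    """True if mate1 + gap + mate2 lies within [minins, maxins]."""
+    size = mate1_len + gap + mate2_len
+    return minins <= size <= maxins
+
+
+print(within_maqerr([40, 40]))                  # rounds to 30 + 30 = 60
+print(within_maqerr([40, 40, 40]))              # rounds to 90, above 70
+print(insert_size_ok(20, 20, 20, minins=60))    # the -I 60 example
+print(insert_size_ok(20, 61, 20, maxins=100))   # the -X 100 counterexample
+```
+
+The mate lengths passed to `insert_size_ok` are untrimmed lengths,
+because both limits apply to the untrimmed mates.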
+
+#### Reporting
+
+<table><tr><td id="bowtie-options-k">
+
+[`-k`]: #bowtie-options-k
+
+ -k <int>
+
+</td><td>
+
+Report up to `<int>` valid alignments per read or pair (default: 1).
+Validity of alignments is determined by the alignment policy (combined
+effects of [`-n`], [`-v`], [`-l`], and [`-e`]). If more than one valid
+alignment exists and the [`--best`] and [`--strata`] options are
+specified, then only those alignments belonging to the best alignment
+"stratum" will be reported. Bowtie is designed to be very fast for
+small [`-k`] but bowtie can become significantly slower as [`-k`]
+increases. If you would like to use Bowtie for larger values of
+[`-k`], consider building an index with a denser suffix-array sample,
+i.e. specify a smaller [`-o`/`--offrate`](#bowtie-build-options-o) when invoking `bowtie-build`
+for the relevant index (see the [Performance tuning] section for
+details).
+
+</td></tr><tr><td id="bowtie-options-a">
+
+[`-a`/`--all`]: #bowtie-options-a
+[`-a`]: #bowtie-options-a
+
+ -a/--all
+
+</td><td>
+
+Report all valid alignments per read or pair (default: off). Validity
+of alignments is determined by the alignment policy (combined effects
+of [`-n`], [`-v`], [`-l`], and [`-e`]). If more than one valid alignment
+exists and the [`--best`] and [`--strata`] options are specified, then only
+those alignments belonging to the best alignment "stratum" will be
+reported. Bowtie is designed to be very fast for small [`-k`] but bowtie
+can become significantly slower if [`-a`/`--all`] is specified. If you
+would like to use Bowtie with [`-a`], consider building an index with a
+denser suffix-array sample, i.e. specify a smaller [`-o`/`--offrate`](#bowtie-build-options-o)
+when invoking `bowtie-build` for the relevant index (see the
+[Performance tuning] section for details).
+
+</td></tr><tr><td id="bowtie-options-m">
+
+[`-m`]: #bowtie-options-m
+
+ -m <int>
+
+</td><td>
+
+Suppress all alignments for a particular read or pair if more than
+`<int>` reportable alignments exist for it. Reportable alignments are
+those that would be reported given the [`-n`], [`-v`], [`-l`], [`-e`], [`-k`],
+[`-a`], [`--best`], and [`--strata`] options. Default: no limit. Bowtie is
+designed to be very fast for small [`-m`] but bowtie can become
+significantly slower for larger values of [`-m`]. If you would like to
+use Bowtie for larger values of [`-m`], consider building an index with a
+denser suffix-array sample, i.e. specify a smaller [`-o`/`--offrate`](#bowtie-build-options-o) when
+invoking `bowtie-build` for the relevant index (see the [Performance
+tuning] section for details).
+
+</td></tr><tr><td id="bowtie-options-M">
+
+[`-M`]: #bowtie-options-M
+
+ -M <int>
+
+</td><td>
+
+Behaves like [`-m`] except that if a read has more than `<int>`
+reportable alignments, one is reported at random. In [default
+output mode], the selected alignment's 7th column is set to `<int>`+1 to
+indicate the read has at least `<int>`+1 valid alignments. In
+[`-S`/`--sam`] mode, the selected alignment is given a `MAPQ` (mapping
+quality) of 0 and the `XM:I` field is set to `<int>`+1. This option
+requires [`--best`]; if specified without [`--best`], [`--best`] is enabled
+automatically.
+
+[default output mode]: #default-bowtie-output
+
+</td></tr><tr><td id="bowtie-options-best">
+
+[`--best`]: #bowtie-options-best
+
+ --best
+
+</td><td>
+
+Make Bowtie guarantee that reported singleton alignments are "best" in
+terms of stratum (i.e. number of mismatches, or mismatches in the seed
+in the case of [`-n`] mode) and in terms of the quality values at the
+mismatched position(s). Stratum always trumps quality; e.g. a
+1-mismatch alignment where the mismatched position has [Phred quality]
+40 is preferred over a 2-mismatch alignment where the mismatched
+positions both have [Phred quality] 10. When [`--best`] is not
+specified, Bowtie may report alignments that are sub-optimal in terms
+of stratum and/or quality (though an effort is made to report the best
+alignment). [`--best`] mode also removes all strand bias. Note that
+[`--best`] does not affect which alignments are considered "valid" by
+`bowtie`, only which valid alignments are reported by `bowtie`. When
+[`--best`] is specified and multiple hits are allowed (via [`-k`] or
+[`-a`]), the alignments for a given read are guaranteed to appear in
+best-to-worst order in `bowtie`'s output. `bowtie` is somewhat slower
+when [`--best`] is specified.
+
+</td></tr><tr><td id="bowtie-options-strata">
+
+[`--strata`]: #bowtie-options-strata
+
+ --strata
+
+</td><td>
+
+If many valid alignments exist and are reportable (e.g. are not
+disallowed via the [`-k`] option) and they fall into more than one
+alignment "stratum", report only those alignments that fall into the
+best stratum. By default, Bowtie reports all reportable alignments
+regardless of whether they fall into multiple strata. When
+[`--strata`] is specified, [`--best`] must also be specified.
+
+</td></tr>
+</table>
+
+#### Output
+
+<table>
+
+<tr><td id="bowtie-options-t">
+
+[`-t`/`--time`]: #bowtie-options-t
+[`-t`]: #bowtie-options-t
+
+ -t/--time
+
+</td><td>
+
+Print the amount of wall-clock time taken by each phase.
+
+</td></tr><tr><td id="bowtie-options-B">
+
+[`-B`/`--offbase`]: #bowtie-options-B
+[`-B`]: #bowtie-options-B
+
+ -B/--offbase <int>
+
+</td><td>
+
+When outputting alignments, number the first base of a reference
+sequence as `<int>`. Default: 0.
+
+</td></tr><tr><td id="bowtie-options-quiet">
+
+[`--quiet`]: #bowtie-options-quiet
+
+ --quiet
+
+</td><td>
+
+Print nothing besides alignments.
+
+</td></tr><tr><td id="bowtie-options-refout">
+
+[`--refout`]: #bowtie-options-refout
+
+ --refout
+
+</td><td>
+
+Write alignments to a set of files named `refXXXXX.map`, where `XXXXX`
+is the 0-padded index of the reference sequence aligned to. This can
+be a useful way to break up work for downstream analyses when dealing
+with, for example, large numbers of reads aligned to the assembled
+human genome. If `<hit>` is also specified, it will be ignored.
+
+</td></tr><tr><td id="bowtie-options-refidx">
+
+[`--refidx`]: #bowtie-options-refidx
+
+ --refidx
+
+</td><td>
+
+When a reference sequence is referred to in a reported alignment, refer
+to it by 0-based index (its offset into the list of references that
+were indexed) rather than by name.
+
+</td></tr><tr><td id="bowtie-options-al">
+
+[`--al`]: #bowtie-options-al
+
+ --al <filename>
+
+</td><td>
+
+Write all reads for which at least one alignment was reported to a file
+with name `<filename>`. Written reads will appear as they did in the
+input, without any of the trimming or translation of quality values
+that may have taken place within `bowtie`. Paired-end reads will be
+written to two parallel files with `_1` and `_2` inserted in the
+filename, e.g., if `<filename>` is `aligned.fq`, the #1 and #2 mates
+that align will be written to `aligned_1.fq` and `aligned_2.fq`
+respectively.
+
+</td></tr><tr><td id="bowtie-options-un">
+
+[`--un`]: #bowtie-options-un
+
+ --un <filename>
+
+</td><td>
+
+Write all reads that could not be aligned to a file with name
+`<filename>`. Written reads will appear as they did in the input,
+without any of the trimming or translation of quality values that may
+have taken place within Bowtie. Paired-end reads will be written to
+two parallel files with `_1` and `_2` inserted in the filename, e.g.,
+if `<filename>` is `unaligned.fq`, the #1 and #2 mates that fail to
+align will be written to `unaligned_1.fq` and `unaligned_2.fq`
+respectively. Unless [`--max`] is also specified, reads with a number
+of valid alignments exceeding the limit set with the [`-m`] option are
+also written to `<filename>`.
+
+</td></tr><tr><td id="bowtie-options-max">
+
+[`--max`]: #bowtie-options-max
+
+ --max <filename>
+
+</td><td>
+
+Write all reads with a number of valid alignments exceeding the limit
+set with the [`-m`] option to a file with name `<filename>`. Written
+reads will appear as they did in the input, without any of the trimming
+or translation of quality values that may have taken place within
+`bowtie`. Paired-end reads will be written to two parallel files with
+`_1` and `_2` inserted in the filename, e.g., if `<filename>` is
+`max.fq`, the #1 and #2 mates that exceed the [`-m`] limit will be
+written to `max_1.fq` and `max_2.fq` respectively. These reads are not
+written to the file specified with [`--un`].
+
+</td></tr><tr><td id="bowtie-options-suppress">
+
+[`--suppress`]: #bowtie-options-suppress
+
+ --suppress <cols>
+
+</td><td>
+
+Suppress columns of output in the [default output mode]. E.g. if
+`--suppress 1,5,6` is specified, the read name, read sequence, and read
+quality fields will be omitted. See [Default Bowtie output] for field
+descriptions. This option is ignored if the output mode is
+[`-S`/`--sam`].
+
+</td></tr>
+<tr><td id="bowtie-options-fullref">
+
+[`--fullref`]: #bowtie-options-fullref
+
+ --fullref
+
+</td><td>
+
+Print the full reference sequence name, including whitespace, in
+alignment output. By default `bowtie` prints everything up to but not
+including the first whitespace.
+
+</td></tr></table>
+
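To make the column numbering used by [`--suppress`] concrete, here is a minimal Python sketch that removes 1-based columns from one tab-separated line of default output. The helper name `suppress_columns` and the sample line are hypothetical; the code only mimics the documented effect and is not taken from `bowtie` itself.

```python
def suppress_columns(line, cols):
    """Return `line` with the given 1-based, tab-separated columns removed."""
    drop = {int(c) for c in cols.split(",") if c}
    fields = line.rstrip("\n").split("\t")
    kept = [f for i, f in enumerate(fields, start=1) if i not in drop]
    return "\t".join(kept)


# Hypothetical 8-field alignment line; the values are made up for illustration.
sample = "read1\t+\tchr1\t100\tACGT\tIIII\t0\t"
print(suppress_columns(sample, "1,5,6"))
```

With `1,5,6`, the fields that remain are the strand, the reference name, the offset, the column-7 count and the (here empty) mismatch field.
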
+#### Colorspace
+
+<table>
+<tr><td id="bowtie-options-snpphred">
+
+[`--snpphred`]: #bowtie-options-snpphred
+
+ --snpphred <int>
+
+</td><td>
+
+When decoding colorspace alignments, use `<int>` as the SNP penalty.
+This should be set to the user's best guess of the true ratio of SNPs
+per base in the subject genome, converted to the [Phred quality] scale.
+E.g., if the user expects about 1 SNP every 1,000 positions,
+`--snpphred` should be set to 30 (which is also the default). To
+specify the fraction directly, use [`--snpfrac`].
+
+</td></tr>
+<tr><td id="bowtie-options-snpfrac">
+
+[`--snpfrac`]: #bowtie-options-snpfrac
+
+ --snpfrac <dec>
+
+</td><td>
+
+When decoding colorspace alignments, use `<dec>` as the estimated ratio
+of SNPs per base. For best decoding results, this should be set to the
+user's best guess of the true ratio. `bowtie` internally converts the
+ratio to a [Phred quality], and behaves as if that quality had been set
+via the [`--snpphred`] option. Default: 0.001.
+
+</td></tr>
+<tr><td id="bowtie-options-col-cseq">
+
+[`--col-cseq`]: #bowtie-options-col-cseq
+
+ --col-cseq
+
+</td><td>
+
+If reads are in colorspace and the [default output mode] is active,
+`--col-cseq` causes the reads' color sequence to appear in the
+read-sequence column (column 5) instead of the decoded nucleotide
+sequence. See the [Decoding colorspace alignments] section for details
+about decoding. This option is ignored in [`-S`/`--sam`] mode.
+
+</td></tr>
+<tr><td id="bowtie-options-col-cqual">
+
+[`--col-cqual`]: #bowtie-options-col-cqual
+
+ --col-cqual
+
+</td><td>
+
+If reads are in colorspace and the [default output mode] is active,
+`--col-cqual` causes the reads' original (color) quality sequence to
+appear in the quality column (column 6) instead of the decoded
+qualities. See the [Colorspace alignment] section for details about
+decoding. This option is ignored in [`-S`/`--sam`] mode.
+
+</td></tr>
+<tr><td id="bowtie-options-col-keepends">
+
+[`--col-keepends`]: #bowtie-options-col-keepends
+
+ --col-keepends
+
+</td><td>
+
+When decoding colorspace alignments, `bowtie` trims off a nucleotide
+and quality from the left and right edges of the alignment. This is
+because those nucleotides are supported by only one color, in contrast
+to the middle nucleotides which are supported by two. Specify
+`--col-keepends` to keep the extreme-end nucleotides and qualities.
+
+</td></tr>
+</table>
+
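The relationship between [`--snpfrac`] and [`--snpphred`] is the usual Phred conversion, Q = -10 * log10(p). The Python sketch below illustrates it. Rounding to the nearest integer is an assumption, since the text only says that `bowtie` converts the ratio internally, and both function names are hypothetical.

```python
import math


def snpfrac_to_phred(frac):
    """Convert a SNPs-per-base ratio to a Phred-scaled integer penalty."""
    if not 0.0 < frac <= 1.0:
        raise ValueError("ratio must be in the interval (0, 1]")
    # Rounding to an integer is an assumption made for this sketch.
    return int(round(-10.0 * math.log10(frac)))


def phred_to_snpfrac(phred):
    """Convert a Phred-scaled penalty back to a SNPs-per-base ratio."""
    return 10.0 ** (-phred / 10.0)


print(snpfrac_to_phred(0.001))
print(phred_to_snpfrac(30))
```

One SNP every 1,000 positions is a ratio of 0.001, which corresponds to the default penalty of 30 described above.
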
+#### SAM
+
+<table>
+
+<tr><td id="bowtie-options-S">
+
+[`-S`/`--sam`]: #bowtie-options-S
+[`-S`]: #bowtie-options-S
+
+ -S/--sam
+
+</td><td>
+
+Print alignments in [SAM] format. See the [SAM output] section of the
+manual for details. To suppress all SAM headers, use [`--sam-nohead`]
+in addition to `-S/--sam`. To suppress just the `@SQ` headers (e.g. if
+the alignment is against a very large number of reference sequences),
+use [`--sam-nosq`] in addition to `-S/--sam`. `bowtie` does not write
+BAM files directly, but SAM output can be converted to BAM on the fly
+by piping `bowtie`'s output to `samtools view`. [`-S`/`--sam`] is not
+compatible with [`--refout`].
+
+[SAM output]: #sam-bowtie-output
+
+</td></tr><tr><td id="bowtie-options-mapq">
+
+[`--mapq`]: #bowtie-options-mapq
+
+ --mapq <int>
+
+</td><td>
+
+If an alignment is non-repetitive (according to [`-m`], [`--strata`] and
+other options) set the `MAPQ` (mapping quality) field to this value.
+See the [SAM Spec][SAM] for details about the `MAPQ` field. Default: 255.
+
+</td></tr><tr><td id="bowtie-options-sam-nohead">
+
+[`--sam-nohead`]: #bowtie-options-sam-nohead
+
+ --sam-nohead
+
+</td><td>
+
+Suppress header lines (starting with `@`) when output is [`-S`/`--sam`].
+This must be specified *in addition to* [`-S`/`--sam`]. `--sam-nohead`
+is ignored unless [`-S`/`--sam`] is also specified.
+
+</td></tr><tr><td id="bowtie-options-sam-nosq">
+
+[`--sam-nosq`]: #bowtie-options-sam-nosq
+
+ --sam-nosq
+
+</td><td>
+
+Suppress `@SQ` header lines when output is [`-S`/`--sam`]. This must be
+specified *in addition to* [`-S`/`--sam`]. `--sam-nosq` is ignored
+unless [`-S`/`--sam`] is also specified.
+
+</td></tr><tr><td id="bowtie-options-sam-RG">
+
+[`--sam-RG`]: #bowtie-options-sam-RG
+
+ --sam-RG <text>
+
+</td><td>
+
+Add `<text>` (usually of the form `TAG:VAL`, e.g. `ID:IL7LANE2`) as a
+field on the `@RG` header line. Specify `--sam-RG` multiple times to
+set multiple fields. See the [SAM Spec][SAM] for details about what fields
+are legal. Note that, if any `@RG` fields are set using this option,
+the `ID` and `SM` fields must both be among them to make the `@RG` line
+legal according to the [SAM Spec][SAM]. `--sam-RG` is ignored unless
+[`-S`/`--sam`] is also specified.
+
+</td></tr></table>
+
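The [`--sam-RG`] description implies a simple layout for the resulting header line: `@RG` followed by the user's tokens, separated by tabs. The following Python sketch builds such a line and checks for the `ID` and `SM` fields that the text says must be present. The helper `rg_header_line` and the token `SM:sample1` are hypothetical and serve only as an illustration.

```python
def rg_header_line(tokens):
    """Join TAG:VAL tokens into an @RG header line, requiring ID and SM."""
    tags = {token.split(":", 1)[0] for token in tokens}
    missing = {"ID", "SM"} - tags
    if missing:
        raise ValueError(
            "missing required @RG field(s): " + ", ".join(sorted(missing))
        )
    return "\t".join(["@RG"] + list(tokens))


# ID:IL7LANE2 is the example from the option text; SM:sample1 is made up.
print(rg_header_line(["ID:IL7LANE2", "SM:sample1"]))
```

Each token in the list corresponds to one `--sam-RG` argument on the command line.
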
+#### Performance
+
+<table><tr>
+
+<td id="bowtie-options-o">
+
+[`-o`/`--offrate`]: #bowtie-options-o
+[`-o`]: #bowtie-options-o
+[`--offrate`]: #bowtie-options-o
+
+ -o/--offrate <int>
+
+</td><td>
+
+Override the offrate of the index with `<int>`. If `<int>` is greater
+than the offrate used to build the index, then some row markings are
+discarded when the index is read into memory. This reduces the memory
+footprint of the aligner but requires more time to calculate text
+offsets. `<int>` must be greater than the value used to build the
+index.
+
+</td></tr><tr><td id="bowtie-options-p">
+
+[`-p`/`--threads`]: #bowtie-options-p
+[`-p`]: #bowtie-options-p
+
+ -p/--threads <int>
+
+</td><td>
+
+Launch `<int>` parallel search threads (default: 1). Threads will run
+on separate processors/cores and synchronize when parsing reads and
+outputting alignments. Searching for alignments is highly parallel,
+and speedup is fairly close to linear. This option is only available
+if `bowtie` is linked with the `pthreads` library (i.e. if
+`BOWTIE_PTHREADS=0` is not specified at build time).
+
+</td></tr><tr><td id="bowtie-options-mm">
+
+[`--mm`]: #bowtie-options-mm
+
+ --mm
+
+</td><td>
+
+Use memory-mapped I/O to load the index, rather than normal C file I/O.
+Memory-mapping the index allows many concurrent `bowtie` processes on
+the same computer to share the same memory image of the index (i.e. you
+pay the memory overhead just once). This facilitates memory-efficient
+parallelization of `bowtie` in situations where using [`-p`] is not
+possible.
+
+</td></tr><tr><td id="bowtie-options-shmem">
+
+[`--shmem`]: #bowtie-options-shmem
+
+ --shmem
+
+</td><td>
+
+Use shared memory to load the index, rather than normal C file I/O.
+Using shared memory allows many concurrent bowtie processes on the same
+computer to share the same memory image of the index (i.e. you pay the
+memory overhead just once). This facilitates memory-efficient
+parallelization of `bowtie` in situations where using [`-p`] is not
+desirable. Unlike [`--mm`], `--shmem` installs the index into shared
+memory permanently, or until the user deletes the shared memory chunks
+manually. See your operating system documentation for details on how
+to manually list and remove shared memory chunks (on Linux and Mac OS
+X, these commands are `ipcs` and `ipcrm`). You may also need to
+increase your OS's maximum shared-memory chunk size to accommodate
+larger indexes; see your OS documentation.
+
+</td></tr></table>
+
+#### Other
+
+<table><tr><td id="bowtie-options-seed">
+
+[`--seed`]: #bowtie-options-seed
+
+ --seed <int>
+
+</td><td>
+
+Use `<int>` as the seed for the pseudo-random number generator.
+
+</td></tr><tr><td id="bowtie-options-verbose">
+
+[`--verbose`]: #bowtie-options-verbose
+
+ --verbose
+
+</td><td>
+
+Print verbose output (for debugging).
+
+</td></tr><tr><td id="bowtie-options-version">
+
+[`--version`]: #bowtie-options-version
+
+ --version
+
+</td><td>
+
+Print version information and quit.
+
+</td></tr><tr><td id="bowtie-options-h">
+
+ -h/--help
+
+</td><td>
+
+Print usage information and quit.
+
+</td></tr></table>
+
+Default `bowtie` output
+-----------------------
+
+[Default Bowtie output]: #default-bowtie-output
+
+`bowtie` outputs one alignment per line. Each line is a collection of
+8 fields separated by tabs; from left to right, the fields are:
+
+1. Name of read that aligned
+
+2. Reference strand aligned to, `+` for forward strand, `-` for
+ reverse
+
+3. Name of reference sequence where alignment occurs, or numeric ID if
+ no name was provided
+
+4. 0-based offset into the forward reference strand where leftmost
+ character of the alignment occurs
+
+5. Read sequence (reverse-complemented if orientation is `-`).
+
+ If the read was in colorspace, then the sequence shown in this
+ column is the sequence of *decoded nucleotides*, not the original
+ colors. See the [Colorspace alignment] section for details about
+ decoding. To display colors instead, use the [`--col-cseq`] option.
+
+6. ASCII-encoded read qualities (reversed if orientation is `-`). The
+ encoded quality values are on the Phred scale and the encoding is
+ ASCII-offset by 33 (ASCII char `!`).
+
+ If the read was in colorspace, then the qualities shown in this
+ column are the *decoded qualities*, not the original qualities.
+ See the [Colorspace alignment] section for details about decoding.
+ To display the original qualities instead, use the [`--col-cqual`] option.
+
+7. If [`-M`] was specified and the prescribed ceiling was exceeded for
+ this read, this column contains the value of the ceiling,
+ indicating that at least that many valid alignments were found in
+ addition to the one reported.
+
+ Otherwise, this column contains the number of other instances where
+ the same sequence aligned against the same reference characters as
+ were aligned against in the reported alignment. This is *not* the
+ number of other places the read aligns with the same number of
+ mismatches. The number in this column is generally not a good
+ proxy for that number (e.g., the number in this column may be '0'
+ while the number of other alignments with the same number of
+ mismatches might be large).
+
+8. Comma-separated list of mismatch descriptors. If there are no
+ mismatches in the alignment, this field is empty. A single
+ descriptor has the format offset:reference-base>read-base. The
+ offset is expressed as a 0-based offset from the high-quality (5')
+ end of the read.
+
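The eight fields listed above can be read with a few lines of Python. This is a minimal sketch, not a reference parser: the names `Alignment`, `parse_alignment` and `parse_mismatches` are hypothetical, the sample line is made up, and the code assumes that no columns were removed with [`--suppress`].

```python
from collections import namedtuple

Alignment = namedtuple(
    "Alignment", "name strand ref offset seq quals other mismatches"
)


def parse_mismatches(field):
    """Parse 'offset:ref>read' descriptors into (offset, ref, read) tuples."""
    result = []
    for desc in filter(None, field.split(",")):
        offset, bases = desc.split(":")
        ref_base, read_base = bases.split(">")
        result.append((int(offset), ref_base, read_base))
    return result


def parse_alignment(line):
    """Split one line of default output into its 8 tab-separated fields."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 8:
        raise ValueError("expected 8 fields, found %d" % len(f))
    return Alignment(f[0], f[1], f[2], int(f[3]), f[4], f[5],
                     int(f[6]), parse_mismatches(f[7]))


# Hypothetical alignment line; the values are for illustration only.
sample = "read1\t-\tchr1\t100\tACGTACGT\tIIIIIIII\t0\t2:A>C,5:G>T"
aln = parse_alignment(sample)
print(aln.ref, aln.offset, aln.mismatches)
```

Note that `offset` is 0-based here, and that the mismatch offsets count from the 5' end of the read rather than from the left end of the printed sequence.
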
+SAM `bowtie` output
+-------------------
+
+Following is a brief description of the [SAM] format as output by
+`bowtie` when the [`-S`/`--sam`] option is specified. For more
+details, see the [SAM format specification][SAM].
+
+When [`-S`/`--sam`] is specified, `bowtie` prints a SAM header with
+`@HD`, `@SQ` and `@PG` lines. When one or more [`--sam-RG`] arguments
+are specified, `bowtie` will also print an `@RG` line that includes all
+user-specified [`--sam-RG`] tokens separated by tabs.
+
+Each subsequent line corresponds to a read or an alignment. Each line
+is a collection of at least 12 fields separated by tabs; from left to
+right, the fields are:
+
+1. Name of read that aligned
+
+2. Sum of all applicable flags. Flags relevant to Bowtie are:
+
+ <table><tr><td>
+
+ 1
+
+ </td><td>
+
+ The read is one of a pair
+
+ </td></tr><tr><td>
+
+ 2
+
+ </td><td>
+
+ The alignment is one end of a proper paired-end alignment
+
+ </td></tr><tr><td>
+
+ 4
+
+ </td><td>
+
+ The read has no reported alignments
+
+ </td></tr><tr><td>
+
+ 8
+
+ </td><td>
+
+ The read is one of a pair and has no reported alignments
+
+ </td></tr><tr><td>
+
+ 16
+
+ </td><td>
+
+ The alignment is to the reverse reference strand
+
+ </td></tr><tr><td>
+
+ 32
+
+ </td><td>
+
+ The other mate in the paired-end alignment is aligned to the
+ reverse reference strand
+
+ </td></tr><tr><td>
+
+ 64
+
+ </td><td>
+
+ The read is the first (#1) mate in a pair
+
+ </td></tr><tr><td>
+
+ 128
+
+ </td><td>
+
+ The read is the second (#2) mate in a pair
+
+ </td></tr></table>
+
+ Thus, an unpaired read that aligns to the reverse reference strand
+ will have flag 16. A paired-end read that is the first mate in a
+ properly aligned pair and aligns to the reverse reference strand
+ will have flag 83 (= 64 + 16 + 2 + 1).
+
+3. Name of reference sequence where alignment occurs, or ordinal ID
+ if no name was provided
+
+4. 1-based offset into the forward reference strand where leftmost
+ character of the alignment occurs
+
+5. Mapping quality
+
+6. CIGAR string representation of alignment
+
+7. Name of reference sequence where mate's alignment occurs. Set to
+ `=` if the mate's reference sequence is the same as this
+ alignment's, or `*` if there is no mate.
+
+8. 1-based offset into the forward reference strand where leftmost
+ character of the mate's alignment occurs. Offset is 0 if there is
+ no mate.
+
+9. Inferred insert size. Size is negative if the mate's alignment
+ occurs upstream of this alignment. Size is 0 if there is no mate.
+
+10. Read sequence (reverse-complemented if aligned to the reverse
+ strand)
+
+11. ASCII-encoded read qualities (reversed if the read aligned to
+ the reverse strand). The encoded quality values are on
+ the [Phred quality] scale and the encoding is ASCII-offset by 33
+ (ASCII char `!`), similarly to a [FASTQ] file.
+
+12. Optional fields. Fields are tab-separated. For descriptions of
+ all possible optional fields, see the SAM format specification.
+ `bowtie` outputs some of these optional fields for each alignment,
+ depending on the type of the alignment:
+
+ <table><tr><td>
+
+ NM:i:<N>
+
+ </td><td>
+
+ Aligned read has an edit distance of `<N>`.
+
+ </td></tr><tr><td>
+
+ CM:i:<N>
+
+ </td><td>
+
+ Aligned read has an edit distance of `<N>` in colorspace. This
+ field is present in addition to the `NM` field in [`-C`/`--color`]
+ mode, but is omitted otherwise.
+
+ </td></tr><tr><td>
+
+ MD:Z:<S>
+
+ </td><td>
+
+ For aligned reads, `<S>` is a string representation of the
+ mismatched reference bases in the alignment. See the [SAM] format
+ specification for details. For colorspace alignments, `<S>`
+ describes the decoded *nucleotide* alignment, not the colorspace
+ alignment.
+
+ </td></tr><tr><td>
+
+ XA:i:<N>
+
+ </td><td>
+
+ Aligned read belongs to stratum `<N>`. See [Strata] for definition.
+
+[Strata]: #strata
+
+ </td></tr><tr><td>
+
+ XM:i:<N>
+
+ </td><td>
+
+ For a read with no reported alignments, `<N>` is 0 if the read had
+ no alignments. If [`-m`] was specified and the read's alignments
+ were suppressed because the [`-m`] ceiling was exceeded, `<N>` equals
+ the [`-m`] ceiling + 1, to indicate that there were at least that
+ many valid alignments (but all were suppressed). In [`-M`] mode, if
+ the alignment was randomly selected because the [`-M`] ceiling was
+ exceeded, `<N>` equals the [`-M`] ceiling + 1, to indicate that there
+ were at least that many valid alignments (of which one was reported
+ at random).
+
+ </td></tr></table>
+
+[SAM format specification]: http://samtools.sf.net/SAM1.pdf
+[FASTQ]: http://en.wikipedia.org/wiki/FASTQ_format
+[`-S`/`--sam`]: #bowtie-options-S
+[`-m`]: #bowtie-options-m
+
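The flag sum in field 2 and the optional fields in field 12 can both be unpacked mechanically. The Python sketch below uses only the flag bits and tag layout listed in this section. The helper names `decode_flag` and `parse_optional` are hypothetical, and the sample tag values are made up.

```python
# Flag bits relevant to Bowtie, as listed in the table for field 2.
FLAGS = {
    1: "read is one of a pair",
    2: "alignment is one end of a proper paired-end alignment",
    4: "read has no reported alignments",
    8: "read is one of a pair and has no reported alignments",
    16: "alignment is to the reverse reference strand",
    32: "other mate is aligned to the reverse reference strand",
    64: "read is the first (#1) mate in a pair",
    128: "read is the second (#2) mate in a pair",
}


def decode_flag(flag):
    """Return the descriptions of all bits set in a SAM flag value."""
    return [desc for bit, desc in sorted(FLAGS.items()) if flag & bit]


def parse_optional(fields):
    """Parse TAG:TYPE:VALUE strings; type 'i' values become integers."""
    tags = {}
    for field in fields:
        tag, typ, value = field.split(":", 2)
        tags[tag] = int(value) if typ == "i" else value
    return tags


for description in decode_flag(83):
    print(description)
print(parse_optional(["NM:i:1", "MD:Z:10A20", "XA:i:1"]))
```

Decoding 83 yields the bits 1, 2, 16 and 64, which matches the worked example given for field 2.
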
+The `bowtie-build` indexer
+==========================
+
+`bowtie-build` builds a Bowtie index from a set of DNA sequences.
+`bowtie-build` outputs a set of 6 files with suffixes
+`.1.ebwt`, `.2.ebwt`, `.3.ebwt`, `.4.ebwt`, `.rev.1.ebwt`, and
+`.rev.2.ebwt`. These files together constitute the index: they are all
+that is needed to align reads to that reference. The original sequence
+files are no longer used by Bowtie once the index is built.
+
+Use of Karkkainen's [blockwise algorithm] allows `bowtie-build` to
+trade off between running time and memory usage. `bowtie-build` has
+three options governing how it makes this trade: [`-p`/`--packed`],
+[`--bmax`]/[`--bmaxdivn`], and [`--dcv`]. By default, `bowtie-build` will
+automatically search for the settings that yield the best
+running time without exhausting memory. This behavior can be disabled
+using the [`-a`/`--noauto`] option.
+
+The indexer provides options pertaining to the "shape" of the index,
+e.g. [`--offrate`](#bowtie-build-options-o) governs the fraction of [Burrows-Wheeler] rows that
+are "marked" (i.e., the density of the suffix-array sample; see the
+original [FM Index] paper for details). All of these options are
+potentially profitable trade-offs depending on the application. They
+have been set to defaults that are reasonable for most cases according
+to our experiments. See [Performance Tuning] for details.
+
+Because `bowtie-build` uses 32-bit pointers internally, it can handle
+up to a theoretical maximum of 2^32-1 (somewhat more than 4 billion)
+characters in an index, though, with other constraints, the actual
+ceiling is somewhat less than that. If your reference exceeds 2^32-1
+characters, `bowtie-build` will print an error message and abort. To
+resolve this, divide your reference sequences into smaller batches
+and/or chunks and build a separate index for each.
+
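One way to plan the batches described above is a greedy pass over the sequence lengths. This is only a sketch in Python: `batch_references` is a hypothetical helper, the default limit is the theoretical 2^32-1 ceiling (the practical ceiling is somewhat lower, so a smaller `limit` may be needed), and a single sequence longer than the limit would still have to be split into chunks by other means.

```python
MAX_CHARS = 2 ** 32 - 1  # theoretical ceiling stated in the text


def batch_references(lengths, limit=MAX_CHARS):
    """Group (name, length) pairs into batches whose totals stay within limit."""
    batches, current, total = [], [], 0
    for name, length in lengths:
        if length > limit:
            raise ValueError("%s alone exceeds the limit" % name)
        if current and total + length > limit:
            # Close the current batch and start a new one.
            batches.append(current)
            current, total = [], 0
        current.append(name)
        total += length
    if current:
        batches.append(current)
    return batches


# Tiny made-up lengths and a small limit, just to show the grouping.
print(batch_references([("a", 5), ("b", 4), ("c", 3)], limit=9))
```

Each resulting batch would then be passed to its own `bowtie-build` run, producing one index per batch.
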
+If your computer has more than 3-4 GB of memory and you would like to
+exploit that fact to make index building faster, use a 64-bit version
+of the `bowtie-build` binary. The 32-bit version of the binary is
+restricted to using less than 4 GB of memory. If a 64-bit pre-built
+binary does not yet exist for your platform on the sourceforge download
+site, you will need to build one from source.
+
+The Bowtie index is based on the [FM Index] of Ferragina and Manzini,
+which in turn is based on the [Burrows-Wheeler] transform. The
+algorithm used to build the index is based on the [blockwise algorithm]
+of Karkkainen.
+
+[Blockwise algorithm]: http://portal.acm.org/citation.cfm?id=1314852
+[FM Index]: http://portal.acm.org/citation.cfm?id=796543
+[Burrows-Wheeler]: http://en.wikipedia.org/wiki/Burrows-Wheeler_transform
+
+Command Line
+------------
+
+Usage:
+
+ bowtie-build [options]* <reference_in> <ebwt_base>
+
+### Main arguments
+
+<table><tr><td>
+
+ <reference_in>
+
+</td><td>
+
+A comma-separated list of FASTA files containing the reference
+sequences to be aligned to, or, if [`-c`](#bowtie-build-options-c) is specified, the sequences
+themselves. E.g., `<reference_in>` might be
+`chr1.fa,chr2.fa,chrX.fa,chrY.fa`, or, if [`-c`](#bowtie-build-options-c) is specified, this might
+be `GGTCATCCT,ACGGGTCGT,CCGTTCTATGCGGCTTA`.
+
+</td></tr><tr><td>
+
+ <ebwt_base>
+
+</td><td>
+
+The basename of the index files to write. By default, `bowtie-build`
+writes files named `NAME.1.ebwt`, `NAME.2.ebwt`, `NAME.3.ebwt`,
+`NAME.4.ebwt`, `NAME.rev.1.ebwt`, and `NAME.rev.2.ebwt`, where `NAME`
+is `<ebwt_base>`.
+
+</td></tr></table>
+
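Because the index is just a set of six files that share a basename, a script can check for a complete index before launching an alignment. The Python sketch below derives the expected names from `<ebwt_base>`. The helper names are hypothetical, and the check looks only at the given path, not at the other locations that the tools may search.

```python
import os

# Suffixes of the six index files named in the text.
SUFFIXES = (".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt",
            ".rev.1.ebwt", ".rev.2.ebwt")


def index_files(ebwt_base):
    """Return the six file names that make up an index with this basename."""
    return [ebwt_base + suffix for suffix in SUFFIXES]


def missing_index_files(ebwt_base):
    """Return the expected index files that do not exist on disk."""
    return [path for path in index_files(ebwt_base) if not os.path.exists(path)]


print(index_files("e_coli"))
```

An index built with `-r/--noref` will legitimately lack the `.3.ebwt` and `.4.ebwt` files, so a missing-file report should be read with that option in mind.
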
+### Options
+
+<table><tr><td>
+
+ -f
+
+</td><td>
+
+The reference input files (specified as `<reference_in>`) are FASTA
+files (usually having extension `.fa`, `.mfa`, `.fna` or similar).
+
+</td></tr><tr><td id="bowtie-build-options-c">
+
+ -c
+
+</td><td>
+
+The reference sequences are given on the command line. I.e.
+`<reference_in>` is a comma-separated list of sequences rather than a
+list of FASTA files.
+
+</td></tr><tr><td id="bowtie-build-options-C">
+
+ -C/--color
+
+</td><td>
+
+Build a colorspace index, to be queried using `bowtie` [`-C`].
+
+</td></tr><tr><td id="bowtie-build-options-a">
+
+[`-a`/`--noauto`]: #bowtie-build-options-a
+
+ -a/--noauto
+
+</td><td>
+
+Disable the default behavior whereby `bowtie-build` automatically
+selects values for the [`--bmax`], [`--dcv`] and [`--packed`] parameters
+according to available memory. Instead, the user may specify values for
+those parameters. If memory is exhausted during indexing, an error
+message will be printed; it is up to the user to try new parameters.
+
+</td></tr><tr><td id="bowtie-build-options-p">
+
+[`--packed`]: #bowtie-build-options-p
+[`-p`/`--packed`]: #bowtie-build-options-p
+
+ -p/--packed
+
+</td><td>
+
+Use a packed (2-bits-per-nucleotide) representation for DNA strings.
+This saves memory but makes indexing 2-3 times slower. Default: off.
+This is configured automatically by default; use [`-a`/`--noauto`] to
+configure manually.
+
+</td></tr><tr><td id="bowtie-build-options-bmax">
+
+[`--bmax`]: #bowtie-build-options-bmax
+
+ --bmax <int>
+
+</td><td>
+
+The maximum number of suffixes allowed in a block. Allowing more
+suffixes per block makes indexing faster, but increases peak memory
+usage. Setting this option overrides any previous setting for
+[`--bmax`], or [`--bmaxdivn`]. Default (in terms of the [`--bmaxdivn`]
+parameter) is [`--bmaxdivn`] 4. This is configured automatically by
+default; use [`-a`/`--noauto`] to configure manually.
+
+</td></tr><tr><td id="bowtie-build-options-bmaxdivn">
+
+[`--bmaxdivn`]: #bowtie-build-options-bmaxdivn
+
+ --bmaxdivn <int>
+
+</td><td>
+
+The maximum number of suffixes allowed in a block, expressed as a
+fraction of the length of the reference. Setting this option overrides
+any previous setting for [`--bmax`], or [`--bmaxdivn`]. Default:
+[`--bmaxdivn`] 4. This is configured automatically by default; use
+[`-a`/`--noauto`] to configure manually.
+
+</td></tr><tr><td id="bowtie-build-options-dcv">
+
+[`--dcv`]: #bowtie-build-options-dcv
+
+ --dcv <int>
+
+</td><td>
+
+Use `<int>` as the period for the difference-cover sample. A larger
+period yields less memory overhead, but may make suffix sorting slower,
+especially if repeats are present. Must be a power of 2 no greater
+than 4096. Default: 1024. This is configured automatically by
+default; use [`-a`/`--noauto`] to configure manually.
+
+</td></tr><tr><td id="bowtie-build-options-nodc">
+
+[`--nodc`]: #bowtie-build-options-nodc
+
+ --nodc
+
+</td><td>
+
+Disable use of the difference-cover sample. Suffix sorting becomes
+quadratic-time in the worst case (where the worst case is an extremely
+repetitive reference). Default: off.
+
+</td></tr><tr><td>
+
+ -r/--noref
+
+</td><td>
+
+Do not build the `NAME.3.ebwt` and `NAME.4.ebwt` portions of the index,
+which contain a bitpacked version of the reference sequences and are
+used for paired-end alignment.
+
+</td></tr><tr><td>
+
+ -3/--justref
+
+</td><td>
+
+Build *only* the `NAME.3.ebwt` and `NAME.4.ebwt` portions of the index,
+which contain a bitpacked version of the reference sequences and are
+used for paired-end alignment.
+
+</td></tr><tr><td id="bowtie-build-options-o">
+
+ -o/--offrate <int>
+
+</td><td>
+
+To map alignments back to positions on the reference sequences, it's
+necessary to annotate ("mark") some or all of the [Burrows-Wheeler]
+rows with their corresponding location on the genome. [`-o`/`--offrate`](#bowtie-build-options-o)
+governs how many rows get marked: the indexer will mark every 2^`<int>`
+rows. Marking more rows makes reference-position lookups faster, but
+requires more memory to hold the annotations at runtime. The default
+is 5 (every 32nd row is marked; for human genome, annotations occupy
+about 340 megabytes).
+
+</td></tr><tr><td>
+
+ -t/--ftabchars <int>
+
+</td><td>
+
+The ftab is the lookup table used to calculate an initial
+[Burrows-Wheeler] range with respect to the first `<int>` characters
+of the query. A larger `<int>` yields a larger lookup table but faster
+query times. The ftab has size 4^(`<int>`+1) bytes. The default
+setting is 10 (ftab is 4MB).
+
+</td></tr><tr><td id="bowtie-build-options-ntoa">
+
+ --ntoa
+
+</td><td>
+
+Convert Ns in the reference sequence to As before building the index.
+By default, Ns are simply excluded from the index and `bowtie` will not
+report alignments that overlap them.
+
+</td></tr><tr><td id="bowtie-build-options-big-little">
+
+ --big --little
+
+</td><td>
+
+Endianness to use when serializing integers to the index file.
+Default: little-endian (recommended for Intel- and AMD-based
+architectures).
+
+</td></tr><tr><td id="bowtie-build-options-seed">
+
+ --seed <int>
+
+</td><td>
+
+Use `<int>` as the seed for the pseudo-random number generator.
+
+</td></tr><tr><td>
+
+ --cutoff <int>
+
+</td><td>
+
+Index only the first `<int>` bases of the reference sequences
+(cumulative across sequences) and ignore the rest.
+
+</td></tr><tr><td>
+
+ -q/--quiet
+
+</td><td>
+
+`bowtie-build` is verbose by default. With this option `bowtie-build`
+will print only error messages.
+
+</td></tr><tr><td>
+
+ -h/--help
+
+</td><td>
+
+Print usage information and quit.
+
+</td></tr><tr><td>
+
+ --version
+
+</td><td>
+
+Print version information and quit.
+
+</td></tr></table>
+
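The size trade-offs behind `-o/--offrate` and `-t/--ftabchars` follow directly from the formulas in the option descriptions. The Python sketch below works through that arithmetic. The function names are hypothetical, and the 4 bytes per marked row is an assumption based on the 32-bit pointers mentioned earlier, not a documented figure.

```python
def ftab_bytes(ftabchars):
    """Size of the ftab in bytes, using the manual's formula 4^(n+1)."""
    return 4 ** (ftabchars + 1)


def marked_rows(num_rows, offrate):
    """Number of rows marked when every 2^offrate-th row is marked."""
    step = 2 ** offrate
    return (num_rows + step - 1) // step  # ceiling division


def annotation_bytes(num_rows, offrate, bytes_per_mark=4):
    """Rough memory needed for the row annotations.

    bytes_per_mark=4 is an assumption (one 32-bit offset per marked row).
    """
    return marked_rows(num_rows, offrate) * bytes_per_mark


print(ftab_bytes(10))
print(marked_rows(3200, 5), annotation_bytes(3200, 5))
```

Raising the offrate by one halves the number of marked rows, while raising `--ftabchars` by one quadruples the size of the lookup table.
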
+The `bowtie-inspect` index inspector
+====================================
+
+`bowtie-inspect` extracts information from a Bowtie index about what
+kind of index it is and what reference sequences were used to build it.
+When run without any options, the tool will output a FASTA file
+containing the sequences of the original references (with all
+non-`A`/`C`/`G`/`T` characters converted to `N`s). It can also be used
+to extract just the reference sequence names using the [`-n`/`--names`]
+option or a more verbose summary using the [`-s`/`--summary`] option.
+
+Command Line
+------------
+
+Usage:
+
+ bowtie-inspect [options]* <ebwt_base>
+
+### Main arguments
+
+<table><tr><td>
+
+ <ebwt_base>
+
+</td><td>
+
+The basename of the index to be inspected. The basename is the name of any
+of the index files but with the `.X.ebwt` or `.rev.X.ebwt` suffix
+omitted. `bowtie-inspect` first looks in the current directory for the
+index files, then looks in the `indexes` subdirectory under the
+directory where the currently-running `bowtie` executable is located,
+then looks in the directory specified in the `BOWTIE_INDEXES`
+environment variable.
+
+</td></tr></table>
+
+### Options
+
+<table><tr><td>
+
+ -a/--across <int>
+
+</td><td>
+
+When printing FASTA output, output a newline character every `<int>`
+bases (default: 60).
+
+</td></tr><tr><td id="bowtie-build-options-n">
+
+[`-n`/`--names`]: #bowtie-build-options-n
+
+ -n/--names
+
+</td><td>
+
+Print reference sequence names, one per line, and quit.
+
+</td></tr><tr><td id="bowtie-inspect-options-s">
+
+[`-s`/`--summary`]: #bowtie-inspect-options-s
+
+ -s/--summary
+
+</td><td>
+
+Print a summary that includes information about index settings, as well
+as the names and lengths of the input sequences. The summary has this
+format:
+
+ Colorspace <0 or 1>
+ SA-Sample 1 in <sample>
+ FTab-Chars <chars>
+ Sequence-1 <name> <len>
+ Sequence-2 <name> <len>
+ ...
+ Sequence-N <name> <len>
+
+Fields are separated by tabs.
+
+</td></tr><tr><td id="bowtie-inspect-options-e">
+
+[`-e`/`--ebwt-ref`]: #bowtie-inspect-options-e
+
+ -e/--ebwt-ref
+
+</td><td>
+
+By default, when `bowtie-inspect` is run without [`-s`] or [`-n`], it
+recreates the reference nucleotide sequences using the bit-encoded
+reference nucleotides kept in the `.3.ebwt` and `.4.ebwt` index files.
+When `-e/--ebwt-ref` is specified, `bowtie-inspect` recreates the
+reference sequences from the Burrows-Wheeler-transformed reference
+sequence in the `.1.ebwt` file instead. The reference recreation
+process is much slower when `-e/--ebwt-ref` is specified. Also, when
+`-e/--ebwt-ref` is specified and the index is in colorspace, the
+reference is printed in colors (A=blue, C=green, G=orange, T=red).
+
+</td></tr><tr><td>
+
+ -v/--verbose
+
+</td><td>
+
+Print verbose output (for debugging).
+
+</td></tr><tr><td>
+
+ --version
+
+</td><td>
+
+Print version information and quit.
+
+</td></tr><tr><td>
+
+ -h/--help
+
+</td><td>
+
+Print usage information and quit.
+
+</td></tr></table>
+
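The tab-separated summary printed by `-s/--summary` is easy to load into a script. The Python sketch below assumes that each line is a label followed by its value or values, exactly as laid out in the format shown for that option. The helper `parse_summary` and the sample text are hypothetical.

```python
def parse_summary(text):
    """Parse `bowtie-inspect -s` style output into a dictionary."""
    info = {"sequences": []}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        key = fields[0]
        if key.startswith("Sequence-"):
            # Sequence lines carry a name and a length.
            info["sequences"].append((fields[1], int(fields[2])))
        else:
            info[key] = fields[1]
    return info


# Made-up summary text that follows the documented layout.
sample = (
    "Colorspace\t0\n"
    "SA-Sample\t1 in 32\n"
    "FTab-Chars\t10\n"
    "Sequence-1\tchr1\t1000\n"
    "Sequence-2\tchr2\t2000\n"
)
summary = parse_summary(sample)
print(summary["SA-Sample"], summary["sequences"])
```

The settings are kept as strings because the sketch makes no assumption about their exact form beyond what the format listing shows.
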
--- /dev/null
+#
+# Makefile for bowtie, bowtie-build, bowtie-inspect
+#
+
+SEQAN_DIR = SeqAn-1.1
+SEQAN_INC = -I $(SEQAN_DIR)
+INC = $(SEQAN_INC)
+GCC_PREFIX = $(shell dirname `which gcc`)
+GCC_SUFFIX =
+CC = $(GCC_PREFIX)/gcc$(GCC_SUFFIX)
+CPP = $(GCC_PREFIX)/g++$(GCC_SUFFIX)
+CXX = $(CPP)
+HEADERS = $(wildcard *.h)
+BOWTIE_PTHREADS = 1
+BOWTIE_MM = 1
+BOWTIE_SHARED_MEM = 1
+EXTRA_FLAGS =
+EXTRA_CFLAGS =
+EXTRA_CXXFLAGS =
+CFLAGS += $(EXTRA_CFLAGS)
+CXXFLAGS += $(EXTRA_CXXFLAGS)
+
+# Detect Cygwin or MinGW
+WINDOWS = 0
+ifneq (,$(findstring CYGWIN,$(shell uname)))
+WINDOWS = 1
+# POSIX memory-mapped files not currently supported on Windows
+BOWTIE_MM = 0
+BOWTIE_SHARED_MEM = 0
+else
+ifneq (,$(findstring MINGW,$(shell uname)))
+WINDOWS = 1
+# POSIX memory-mapped files not currently supported on Windows
+BOWTIE_MM = 0
+BOWTIE_SHARED_MEM = 0
+endif
+endif
+
+MACOS = 0
+ifneq (,$(findstring Darwin,$(shell uname)))
+MACOS = 1
+endif
+
+LINUX = 0
+ifneq (,$(findstring Linux,$(shell uname)))
+LINUX = 1
+EXTRA_FLAGS += -Wl,--hash-style=both
+endif
+
+MM_DEF =
+ifeq (1,$(BOWTIE_MM))
+MM_DEF = -DBOWTIE_MM
+endif
+SHMEM_DEF =
+ifeq (1,$(BOWTIE_SHARED_MEM))
+SHMEM_DEF = -DBOWTIE_SHARED_MEM
+endif
+PTHREAD_PKG =
+PTHREAD_LIB =
+PTHREAD_DEF =
+ifeq (1,$(BOWTIE_PTHREADS))
+PTHREAD_DEF = -DBOWTIE_PTHREADS
+ifeq (1,$(WINDOWS))
+# pthreads for windows forces us to be specific about the library
+PTHREAD_LIB = -L . -lpthreadGC2
+PTHREAD_PKG = pthreadGC2.dll
+else
+# There's also -pthread, but that only seems to work on Linux
+PTHREAD_LIB = -lpthread
+endif
+endif
+
+PREFETCH_LOCALITY = 2
+PREF_DEF = -DPREFETCH_LOCALITY=$(PREFETCH_LOCALITY)
+
+LIBS =
+SEARCH_LIBS = $(PTHREAD_LIB)
+BUILD_LIBS =
+
+OTHER_CPPS = ccnt_lut.cpp ref_read.cpp alphabet.c shmem.cpp \
+ edit.cpp ebwt.cpp
+SEARCH_CPPS = qual.cpp pat.cpp ebwt_search_util.cpp ref_aligner.cpp \
+ log.cpp hit_set.cpp refmap.cpp annot.cpp sam.cpp \
+ color.cpp color_dec.cpp hit.cpp
+SEARCH_CPPS_MAIN = $(SEARCH_CPPS) bowtie_main.cpp
+
+BUILD_CPPS =
+BUILD_CPPS_MAIN = $(BUILD_CPPS) bowtie_build_main.cpp
+
+SEARCH_FRAGMENTS = $(wildcard search_*_phase*.c)
+VERSION = $(shell cat VERSION)
+
+# Convert BITS=?? to a -m flag
+BITS_FLAG =
+ifeq (32,$(BITS))
+BITS_FLAG = -m32
+endif
+ifeq (64,$(BITS))
+BITS_FLAG = -m64
+endif
+
+# Convert CHUD=1 to CHUD-related flags
+CHUD=0
+CHUD_DEF =
+ifeq (1,$(CHUD))
+EXTRA_FLAGS += -g3
+ifeq (1,$(MACOS))
+CHUD_DEF = -F/System/Library/PrivateFrameworks -weak_framework CHUD -DCHUD_PROFILING
+endif
+endif
+
+DEBUG_FLAGS = -O0 -g3 $(BITS_FLAG)
+RELEASE_FLAGS = -O3 $(BITS_FLAG)
+NOASSERT_FLAGS = -DNDEBUG
+FILE_FLAGS = -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE
+
+BIN_LIST = bowtie-build \
+ bowtie \
+ bowtie-inspect
+BIN_LIST_AUX = bowtie-build-debug \
+ bowtie-debug \
+ bowtie-inspect-debug
+
+GENERAL_LIST = $(wildcard scripts/*.sh) \
+ $(wildcard scripts/*.pl) \
+ $(wildcard indexes/e_coli*) \
+ $(wildcard genomes/NC_008253.fna) \
+ $(wildcard reads/e_coli_1000.*) \
+ $(wildcard reads/e_coli_1000_*) \
+ doc/manual.html \
+ doc/README \
+ doc/style.css \
+ reads/e_coli_10000snp.fa \
+ reads/e_coli_10000snp.fq \
+ $(PTHREAD_PKG) \
+ AUTHORS \
+ COPYING \
+ NEWS \
+ MANUAL \
+ MANUAL.markdown \
+ TUTORIAL \
+ VERSION
+
+# This is helpful on Windows under MinGW/MSYS, where Make might go for
+# the Windows FIND tool instead.
+FIND=$(shell which find)
+
+SRC_PKG_LIST = $(wildcard *.h) \
+ $(wildcard *.hh) \
+ $(wildcard *.c) \
+ $(wildcard *.cpp) \
+ $(shell $(FIND) SeqAn-1.1 -name "*.h") \
+ $(shell $(FIND) SeqAn-1.1 -name "*.txt") \
+ doc/strip_markdown.pl \
+ Makefile \
+ $(GENERAL_LIST)
+
+BIN_PKG_LIST = $(GENERAL_LIST)
+
+all: $(BIN_LIST)
+
+allall: $(BIN_LIST) $(BIN_LIST_AUX)
+
+DEFS=-fno-strict-aliasing \
+ -DBOWTIE_VERSION="\"`cat VERSION`\"" \
+ -DBUILD_HOST="\"`hostname`\"" \
+ -DBUILD_TIME="\"`date`\"" \
+ -DCOMPILER_VERSION="\"`$(CXX) -v 2>&1 | tail -1`\"" \
+ $(FILE_FLAGS) \
+ $(PTHREAD_DEF) \
+ $(PREF_DEF) \
+ $(MM_DEF) \
+ $(SHMEM_DEF) \
+ $(CHUD_DEF)
+
+define checksum
+ cat $^ | md5sum | awk '{print $$1}' > .$@.md5
+endef
+
+ALL_FLAGS=$(EXTRA_FLAGS) $(CFLAGS) $(CXXFLAGS)
+DEBUG_DEFS = -DCOMPILER_OPTIONS="\"$(DEBUG_FLAGS) $(ALL_FLAGS)\""
+RELEASE_DEFS = -DCOMPILER_OPTIONS="\"$(RELEASE_FLAGS) $(ALL_FLAGS)\""
+
+#
+# bowtie-build targets
+#
+
+bowtie-build: ebwt_build.cpp $(OTHER_CPPS) $(HEADERS)
+ $(checksum)
+ $(CXX) $(RELEASE_FLAGS) $(RELEASE_DEFS) $(ALL_FLAGS) \
+ -DEBWT_BUILD_HASH=`cat .$@.md5` \
+ $(DEFS) $(NOASSERT_FLAGS) -Wall \
+ $(INC) \
+ -o $@ $< \
+ $(OTHER_CPPS) $(BUILD_CPPS_MAIN) \
+ $(LIBS) $(BUILD_LIBS)
+
+bowtie-build_prof: ebwt_build.cpp $(OTHER_CPPS) $(HEADERS)
+ $(checksum)
+ $(CXX) $(RELEASE_FLAGS) -pg -p -g3 $(RELEASE_DEFS) $(ALL_FLAGS) \
+ -DEBWT_BUILD_HASH=`cat .$@.md5` \
+ $(DEFS) $(NOASSERT_FLAGS) -Wall \
+ $(INC) \
+ -o $@ $< \
+ $(OTHER_CPPS) $(BUILD_CPPS_MAIN) \
+ $(LIBS) $(BUILD_LIBS)
+
+bowtie-build-debug: ebwt_build.cpp $(OTHER_CPPS) $(HEADERS)
+ $(checksum)
+ $(CXX) $(DEBUG_FLAGS) $(DEBUG_DEFS) $(ALL_FLAGS) \
+ -DEBWT_BUILD_HASH=`cat .$@.md5` \
+ $(DEFS) -Wall \
+ $(INC) \
+ -o $@ $< \
+ $(OTHER_CPPS) $(BUILD_CPPS_MAIN) \
+ $(LIBS) $(BUILD_LIBS)
+
+#
+# bowtie targets
+#
+
+bowtie: ebwt_search.cpp $(SEARCH_CPPS) $(OTHER_CPPS) $(HEADERS) $(SEARCH_FRAGMENTS)
+ $(checksum)
+ $(CXX) $(RELEASE_FLAGS) $(RELEASE_DEFS) $(ALL_FLAGS) \
+ -DEBWT_SEARCH_HASH=`cat .$@.md5` \
+ $(DEFS) $(NOASSERT_FLAGS) -Wall \
+ $(INC) \
+ -o $@ $< \
+ $(OTHER_CPPS) $(SEARCH_CPPS_MAIN) \
+ $(LIBS) $(SEARCH_LIBS)
+
+bowtie_prof: ebwt_search.cpp $(SEARCH_CPPS) $(OTHER_CPPS) $(HEADERS) $(SEARCH_FRAGMENTS)
+ $(checksum)
+ $(CXX) $(RELEASE_FLAGS) \
+ $(RELEASE_DEFS) -pg -p -g3 $(ALL_FLAGS) \
+ -DEBWT_SEARCH_HASH=`cat .$@.md5` \
+ $(DEFS) $(NOASSERT_FLAGS) -Wall \
+ $(INC) \
+ -o $@ $< \
+ $(OTHER_CPPS) $(SEARCH_CPPS_MAIN) \
+ $(LIBS) $(SEARCH_LIBS)
+
+bowtie-debug: ebwt_search.cpp $(SEARCH_CPPS) $(OTHER_CPPS) $(HEADERS) $(SEARCH_FRAGMENTS)
+ $(checksum)
+ $(CXX) $(DEBUG_FLAGS) \
+ $(DEBUG_DEFS) $(ALL_FLAGS) \
+ -DEBWT_SEARCH_HASH=`cat .$@.md5` \
+ $(DEFS) -Wall \
+ $(INC) \
+ -o $@ $< \
+ $(OTHER_CPPS) $(SEARCH_CPPS_MAIN) \
+ $(LIBS) $(SEARCH_LIBS)
+
+#
+# bowtie-inspect targets
+#
+
+bowtie-inspect: bowtie_inspect.cpp $(HEADERS) $(OTHER_CPPS)
+ $(checksum)
+ $(CXX) $(RELEASE_FLAGS) \
+ $(RELEASE_DEFS) $(ALL_FLAGS) \
+ -DEBWT_INSPECT_HASH=`cat .$@.md5` \
+ $(DEFS) -Wall \
+ $(INC) -I . \
+ -o $@ $< \
+ $(OTHER_CPPS) \
+ $(LIBS)
+
+bowtie-inspect-debug: bowtie_inspect.cpp $(HEADERS) $(OTHER_CPPS)
+ $(checksum)
+ $(CXX) $(DEBUG_FLAGS) \
+ $(DEBUG_DEFS) $(ALL_FLAGS) \
+ -DEBWT_INSPECT_HASH=`cat .$@.md5` \
+ $(DEFS) -Wall \
+ $(INC) -I . \
+ -o $@ $< \
+ $(OTHER_CPPS) \
+ $(LIBS)
+
+chaincat: chaincat.cpp hit_set.h filebuf.h hit_set.cpp alphabet.h alphabet.c
+ $(CXX) $(DEBUG_FLAGS) $(DEBUG_DEFS) $(ALL_FLAGS) -Wall $(INC) -I . -o $@ $< hit_set.cpp alphabet.c
+
+bowtie-src.zip: $(SRC_PKG_LIST)
+ chmod a+x scripts/*.sh scripts/*.pl
+ rm -rf .src.tmp
+ mkdir .src.tmp
+ mkdir .src.tmp/bowtie-$(VERSION)
+ zip tmp.zip $(SRC_PKG_LIST)
+ mv tmp.zip .src.tmp/bowtie-$(VERSION)
+ cd .src.tmp/bowtie-$(VERSION) ; unzip tmp.zip ; rm -f tmp.zip
+ cd .src.tmp ; zip -r $@ bowtie-$(VERSION)
+ cp .src.tmp/$@ .
+ rm -rf .src.tmp
+
+bowtie-bin.zip: $(BIN_PKG_LIST) $(BIN_LIST) $(BIN_LIST_AUX)
+ chmod a+x scripts/*.sh scripts/*.pl
+ rm -rf .bin.tmp
+ mkdir .bin.tmp
+ mkdir .bin.tmp/bowtie-$(VERSION)
+ if [ -f bowtie.exe ] ; then \
+ zip tmp.zip $(BIN_PKG_LIST) $(addsuffix .exe,$(BIN_LIST) $(BIN_LIST_AUX)) ; \
+ else \
+ zip tmp.zip $(BIN_PKG_LIST) $(BIN_LIST) $(BIN_LIST_AUX) ; \
+ fi
+ mv tmp.zip .bin.tmp/bowtie-$(VERSION)
+ cd .bin.tmp/bowtie-$(VERSION) ; unzip tmp.zip ; rm -f tmp.zip
+ cd .bin.tmp ; zip -r $@ bowtie-$(VERSION)
+ cp .bin.tmp/$@ .
+ rm -rf .bin.tmp
+
+.PHONY: doc
+doc: doc/manual.html MANUAL
+
+doc/manual.html: MANUAL.markdown
+ echo "<h1>Table of Contents</h1>" > .tmp.head
+ pandoc -T "Bowtie Manual" -B .tmp.head \
+ --css style.css -o $@ \
+ --from markdown --to HTML \
+ --table-of-contents $^
+
+MANUAL: MANUAL.markdown
+ perl doc/strip_markdown.pl < $^ > $@
+
+.PHONY: clean
+clean:
+ rm -f $(BIN_LIST) $(BIN_LIST_AUX) \
+ bowtie_prof \
+ $(addsuffix .exe,$(BIN_LIST) $(BIN_LIST_AUX) bowtie_prof) \
+ bowtie-src.zip bowtie-bin.zip
+ rm -f core.*
--- /dev/null
+Bowtie: an Ultrafast, Lightweight Short Read Aligner
+
+Bowtie NEWS
+===========
+
+ Bowtie is now available for download. 0.9.0 is the first version to
+be released under the OSI Artistic License (see `COPYING') and freely
+available to the public for download. The current version is 0.12.7.
+
+Reporting Issues
+================
+
+Please report any issues using the Sourceforge bug tracker:
+
+ https://sourceforge.net/tracker/?group_id=236897&atid=1101606
+
+Announcements
+=============
+
+To receive announcements (including release announcements) about Bowtie
+and related tools (including Crossbow, TopHat, Cufflinks, Myrna)
+subscribe to our mailing list:
+
+ https://lists.sourceforge.net/lists/listinfo/bowtie-bio-announce
+
+Version Release History
+=======================
+
+Version 0.12.7 - September 7, 2010
+ * Fixes the all-gap reference sequence issue that was present in
+ Bowtie 0.12.6. Index files produced by bowtie-build 0.12.5 and
+ earlier (back to 0.10.*) are compatible with bowtie 0.12.7. Index
+ files produced by bowtie-build 0.12.7 are backward compatible with
+ bowtie 0.12.5 and earlier as long as the first reference sequence
+ is not all-gaps (or, for colorspace indexes, as long as the first
+ reference sequence has two consecutive ACGT characters in it
+ somewhere).
+ * Indexes where the first sequence consists of all gaps or other
+ non-ACGT characters are not handled properly by Bowtie versions
+ 0.12.5 and older, but are handled properly by Bowtie 0.12.7.
+ * REMOVED: bowtie-build's --old-reverse option; the old reverse-
+ index scheme is again the default. The new scheme is disabled
+ pending further refinement.
+
+September 3, 2010
+ * The version of bowtie-build distributed in Bowtie 0.12.6 IS
+ BROKEN. It could not handle stretches of ambiguous reference
+ characters compatibly with prior versions of Bowtie. Please use
+ Bowtie 0.12.5's bowtie-build instead. Bowtie 0.12.7, due out
+ soon, will contain a complete set of working tools. Bowtie 0.12.7
+ will revert to the old way of building the reverse index. Sorry for
+ the inconvenience.
+
+Version 0.12.6 - August 29, 2010
+ * Modified bowtie-inspect's default mode to use the bit-encoded
+ reference portion of the index to reconstruct the reference
+ sequence, rather than the ebwt portion. This makes bowtie-inspect
+ much faster and reduces its memory use, and the output for a colorspace
+ index will now be in nucleotide space. To get the behavior of the
+ old default, use the new -e/--ebwt-ref option.
+ * Fixed bug whereby SOLiD QV strings would fail to parse.
+ * Moved to a new default way of building the reverse index. Revert
+ to the old behavior with bowtie-build's new --old-reverse option.
+ The new reverse index format is forward and backward compatible
+ with `bowtie`, unless otherwise noted in a future version.
+ * Fixed issue that would sometimes cause bowtie-build to crash when
+ building a large index with a low --offrate.
+ * Fixed build issue that would cause bowtie-build built on one
+ version of Linux to die with a "floating point error" on other
+ versions.
+ * Fixed a bug whereby alignment cost could sometimes be
+ miscalculated. Stratum was unaffected.
+ * bowtie now simply skips reads with 0 characters. Previously it
+ would print an error and exit.
+
+Version 0.12.5 - April 10, 2010
+ * Fixed spurious "Error while writing string output; not all
+ characters written" errors in -S/--sam mode.
+
+Version 0.12.4 - April 5, 2010
+ * Periods in read sequences are now treated as Ns instead of
+ ignored. This should help with some problems where Bowtie
+ erroneously reports "Reads file contained a pattern with more than
+ 1024 quality values..." for data from recent versions of the
+ Illumina GA pipeline.
+ * Fixed a bug whereby some error and warning messages would be
+ printed on top of each other in -p mode.
+ * Chunk-exhaustion warning messages are now suppressed when --quiet
+ is specified.
+ * Fixed small issue in quality decoding whereby no-confidence colors
+ would incorrectly influence decoded quality of adjacent bases.
+
+Version 0.12.3 - February 17, 2010
+ * Fixed a significant bug in -C/--color mode whereby quality values
+ for SNP nucleotide positions were erroneously penalized.
+ * Fixed a bug in -S/--sam mode whereby if whitespace occurred in the
+ original read name, it would be printed in the QNAME field in
+ violation of the SAM spec. Bowtie now truncates read names at the
+ first whitespace character before printing.
+ * When input is FASTQ and -C/--color is enabled, Bowtie is now
+ tolerant of output from FASTQ converters that include the primer
+ base and where the quality string is one character shorter than
+ the sequence string (due to the primer). This fixes issues users
+ have reported using output from BFAST's solid2fastq.
+ * Added support for SOLiD-style _QV files, via the -Q/--quals, --Q1
+ and --Q2 options. These options are used in combination with
+ -C/--color and -f to align parallel colorspace read/quality files
+ without having to convert to FASTQ.
+
+Version 0.12.2 - February 2, 2010
+ * When -C/--color is enabled, the default paired-end orientation is
+ now --ff, not --fr. The new default fits typical SOLiD output.
+ * Fixed a bug whereby very large reads could cause bowtie to crash
+ in --best mode.
+ * After 0.12.1, some issues remained whereby bowtie would fail to
+ trim the primer in colorspace (e.g. in -c mode). All input modes
+ should now have fully-functioning primer trimming.
+ * Fixed a bug whereby decoded colorspace qualities could overflow
+ and erroneously become low qualities.
+ * Fixed a bug that could produce incorrect paired-end alignment
+ results in -n mode when using paired-end orientation modes other
+ than --fr. Even with the bug, reported results are reasonable;
+ but the seed edit constraint (-n) may have been applied to the
+ wrong end of one of the mates.
+ * Raised the --chunkmbs default from 32 to 64.
+ * Better error checking and reporting for some bowtie options.
+ * Some basic testing scripts are now bundled with Bowtie (in
+ scripts/test), which should make it easier to regression-test.
+
+Version 0.12.1 - January 8, 2010
+ * IMPORTANT: Fixed bug whereby bowtie would fail to remove both the
+ primer base and the first color when parsing .csfasta files with
+ primer bases in -C -f mode. A workaround for users of version
+ 0.12.0 is to use "-5 1" in that situation.
+ * Fixed bug whereby, when -M limit was exceeded for an unpaired
+ read, the number printed in the 7th column for the random
+ alignment was too low by 1.
+ * Added documentation discussing a pitfall regarding SOLiD paired-
+ end input, i.e., not all entries necessarily have corresponding
+ mates in the other file.
+
+Version 0.12.0 - December 23, 2009
+ * Added missing README.markdown file
+ * Minor documentation additions
+
+Version 0.12.0-beta1 - December 12, 2009
+ * Added SOLiD colorspace support
+ * Colorspace indexes are distinct from standard letterspace
+ indexes and must be built with a separate invocation of
+ bowtie-build (with -C option)
+ * Running bowtie with -C causes Bowtie to align in colorspace;
+ both index and reads must be in colorspace
+ * Colorspace memory requirement is the same as paired-end
+ alignment in nucleotide-space (normal) mode. Paired-end
+ alignment does not increase the memory requirement further in
+ colorspace.
+ * csfasta, csfastq, and "raw" read formats are all supported with
+ -C; '0' means "blue" and is interchangeable with 'A', likewise
+ '1' ('C') means "green", '2' ('G') means "orange" and '3' ('T')
+ means "red"
+ * Colorspace versions of pre-built indexes added (see Bowtie web
+ site)
+ * New manual section discussing colorspace features
+ * Fixed a few SAM output issues
+ * @PG line now properly uses colons instead of equals signs
+ * Removed /1, /2 suffixes for paired-end reads in SAM mode
+ * Added --sam-RG option that permits the user to insert set values
+ for flags that appear on the @RG line
+ * Fixed lingering pthreads bugs that would cause Bowtie to hang or
+ crash toward the end of execution with -p > 1.
+ * Fixed performance-related bug that would cause paired-end
+ alignment to be artificially slow in many situations.
+ * Fixed issue with random number generation that would result in
+ non-random selection of alignments in some situations.
+ * bowtie -f now supports fasta files with reads split across
+ multiple lines
+ * New --suppress option suppresses unwanted columns of output
+ * The MANUAL file was converted to markdown format, facilitating
+ conversion to various other formats using tools like pandoc
+ * DEPRECATED: bowtie: --concise, bowtie-build: --big, --little
+ * REMOVED: -z/--phased, -b/--binout, bowtie-maptool,
+ bowtie-maqconvert
+
+Version 0.11.3 - October 12, 2009
+ * Fixed crashing bug in -S/--sam mode when the number of reference
+ sequences in the index is very large.
+ * Added --sam-nohead option to suppress output of SAM headers in
+ -S/--sam mode.
+ * Added --sam-nosq option to suppress output of @SQ SAM headers in
+ -S/--sam mode. These can become a nuisance when the reference
+ index contains a very large number of sequences.
+ * Fixed a bug in bowtie-build's auto-configure mode that would cause
+ it to underestimate the amount of memory required by a set of
+ parameters. This in turn would cause the index to be corrupted.
+
+Version 0.11.2 - October 7, 2009
+ * Fixed issue whereby --max option was disabled.
+
+Version 0.11.1 - October 5, 2009
+ * SAM output: changed XS:i optional field to be named XA:i to avoid
+ a conflict with TopHat's XS:i field.
+
+Version 0.11.0 - October 5, 2009
+ * Initial SAM output support with -S/--sam option. Bowtie sets all
+ fields according to the SAM spec (Version 0.1.2-draft, 20090820).
+ See the new "SAM Output" section of the manual for details.
+ * Added --shmem option: --shmem is similar to --mm in that it allows
+ concurrent 'bowtie' processes querying the same index to share a
+ single memory image of the index. Unlike --mm, shared memory
+ allocated by --shmem is permanent.
+ * The alignment summary printed to stderr at the end of an alignment
+ run is now more friendly and includes data about the number and
+ proportion of reads that aligned, failed to align, or were
+ suppressed via the -m option.
+ * When too-short reads are encountered, Bowtie now always prints
+ warnings, not errors. --quiet now suppresses those warnings.
+ * By default, when bowtie prints a reference sequence name it now
+ stops at the first whitespace. In 0.10.1, the default was to
+ print the entire name, which could cause confusion when parsing
+ Bowtie output. To revert to printing the full name, use the new
+ --fullref option.
+ * Bowtie now prints the command-line before exiting with an error.
+ * Fixed mistake in the manual's "Default output" section: the offset
+ in field 4 is 0-based, not 1-based. To obtain a 1-based offset
+ instead, use the -B 1 option.
+ * Various minor bug fixes.
+ * DEPRECATED: -z/--phased, -b/--binout, bowtie-maptool,
+ bowtie-maqconvert. These features will be removed in a future
+ version of Bowtie. Note that -b/--binout, bowtie-maptool, and
+ bowtie-maqconvert are largely superseded by the SAM output format
+ (-S/--sam), BAM, and SAMtools (http://samtools.sf.net). Contact
+ the authors if this is a problem.
+ * REMOVED: --unfq/--unfa/--maxfq/--maxfa/--alfq/--alfa. Please use
+ --un/--max/--al instead. Contact the authors if this is a
+ problem.
+
+Version 0.10.1 - July 19, 2009
+ * Now when -3/-5 are used in combination with -I/-X, the -I/-X
+ constraints are interpreted as applying to the original insert,
+ not the trimmed insert.
+ * Fixed issue whereby -I option was ignored; -I option works now.
+ * Fixed a bug whereby some large indexes were incorrectly reported
+ as corrupt by bowtie-build.
+ * Fixed issue whereby negative quality values were wrongly rejected
+ when both --integer-quals and --solexa-quals were specified.
+ * The -l/--seedlen parameter can now be adjusted down to 5
+ (previously had to be >= 20).
+ * Fixed several minor memory leaks and out-of-bounds issues. The
+ Linux version of the bowtie aligner now receives a clean bill of
+ health from valgrind's memcheck.
+ * Other minor bugfixes.
+
+Version 0.10.0.2 - June 28, 2009
+ * Second bugfix for Windows version. src and bin-win32 packages
+ updated. Linux and Mac users are not affected. Thanks for your
+ bug reports and patience.
+
+Version 0.10.0.1 - June 23, 2009
+ * Fix for crashing bug in Windows version. src and bin-win32
+ packages updated. Linux and Mac users are not affected.
+
+Version 0.10.0 - June 12, 2009
+ * Major change: All alignment modes are now unstratified by default.
+ The --nostrata option has been removed, since it is now the
+ default. A --strata option has been added to override the default
+ and force stratified reporting. Reporting is stratified if and
+ only if --strata is specified. --strata now cannot be specified
+ without also specifying --best. Please note that, because of this
+ change, specifying the same arguments to this version of Bowtie
+ may yield different reported results.
+ * Replaced the --unfa/--unfq options with a single --un option,
+ which writes unaligned reads to an output file (or pair of output
+ files) but keeps reads in their original form. This is in
+ contrast to the old --unfa/--unfq options, which only supported
+ FASTA or FASTQ formats, and which would print a post-trimming and
+ post-quality-value translation version of the read. Likewise, the
+ --alfa/--alfq and --maxfa/--maxfq options have been replaced with
+ --al and --max options. The old options are still present, but
+ are deprecated and will be removed in a future version.
+ * Added --nofw and --norc options, allowing alignment to just one
+ reference strand or the other.
+ * Added --mm option that causes bowtie to use memory-mapped files
+ instead of traditional file I/O to access the reference index.
+ This allows multiple bowtie processes running on the same computer
+ to share a single in-memory image of a given index. This is a
+ useful feature for parallelizing bowtie in situations where memory
+ is limited and where -p is inappropriate or insufficient. This
+ feature is not available in the Windows version of Bowtie.
+ * Added a section to the manual ("Reporting Modes") clarifying and
+ giving examples of how to use Bowtie's reporting options.
+ * The --al and -z/--phased options previously interacted in such a
+ way that the --al file could contain multiple entries for the same
+ aligned read. --al and -z/--phased are now incompatible.
+ * The --oldpmap option, deprecated in version 0.9.8, has been
+ removed.
+
+Version 0.9.9.3 - May 12, 2009
+ * Fixed an issue where bowtie --best would sometimes use excessive
+ amounts of memory to store path descriptors. There is now a per-
+ thread 32-MB ceiling (configurable with new option --chunkmbs
+ <int>) on the memory taken by path descriptors. If the ceiling is
+ exceeded Bowtie will skip the offending read, print a warning
+ message identifying the read, and continue.
+ * More options are available for defining the quality-value format,
+ including new --phred64-quals/--solexa1.3-quals options
+ appropriate for the 64-based-Phred output of Illumina's GA
+ Pipeline 1.3. Added option --phred33-quals to handle the more
+ typical 33-based-Phred scale (the default). The --solexa-quals
+ option still handles the 64-based-Solexa scale output by GA
+ Pipeline versions prior to 1.3.
+ * bowtie-build now checks output files for obvious corruption due,
+ for example, to disk exhaustion.
+ * Specifying "-" (meaning stdin) as an input to bowtie is now
+ supported and documented.
+ * Fixed a bug whereby bowtie-maqconvert could fail to notice that it
+ had exhausted memory and output a corrupt Maq map file.
+ * Fixed a bug whereby bowtie would crash when trying to use an index
+ built on a machine with different endianness.
+ * Fixed several issues that prevented Bowtie from compiling on
+ Solaris. I confirm that Bowtie builds and runs on Solaris.
+ * Added _LARGEFILE_SOURCE _FILE_OFFSET_BITS=64 _GNU_SOURCE to the
+ default build options in an attempt to resolve some of the large-
+ file issues users are having.
+ * Clarified column 7 in the manual. We received many queries from
+ users curious about this number.
+ * Moderate speed improvements in --best mode.
+
+Version 0.9.9.2 - April 6, 2009
+ * Paired-end alignment is now available in all alignment modes,
+ including all -n modes.
+ * --best now provides better guarantees. Reported alignments are
+ now guaranteed to be "best" both in terms of stratum (i.e. number
+ of mismatches, or mismatches in the seed in the case of -n mode),
+ and in terms of the quality values at the mismatched position(s).
+ Stratum always trumps quality when determining best alignments.
+ Also, --best mode resolves the strand bias issue (see manual for a
+ discussion of the issue).
+ * Speed improvements for --best mode in most alignment modes.
+ * Major speed improvement for the -v 3 alignment mode (except when
+ -z is also used).
+ * The "Reported X alignments..." message is now printed to stderr
+ rather than stdout. Only alignments are written to stdout.
+ * In bowtie-maqconvert, read names longer than Maq's limit (36) are
+ now truncated to a suffix of the original name, rather than a
+ prefix. This mimics Maq's behavior and prevents "/1" and "/2"
+ suffixes for paired-end reads from being destroyed.
+ * Added --alfq/--alfa options to dump aligned reads to FASTQ and/or
+ FASTA files.
+ * Removed many extraneous source files.
+
+Version 0.9.9.1 - March 10, 2009
+ * Added paired-end alignment for -v 2 and -v 3 alignment modes (-n
+ modes coming soon).
+ * Minor bug fixes and speed improvements for all paired-end modes.
+ * Added -s/--skip <int> option to skip over the first <int> reads or
+ pairs in the input.
+ * --unfq/--unfa/--maxfq/--maxfa modes no longer create empty output
+ files.
+ * All Bowtie tools now compile under GCC 4.3.3.
+ * Fixed bug whereby bowtie -b would sometimes write garbage into the
+ reference offset field.
+ * Paired-end info is now persisted in the -b format, allowing
+ bowtie-maptool output to add "/1" and "/2" suffixes as
+ appropriate.
+
+Version 0.9.9 - February 19, 2009
+ * Added some preliminary support for paired-end alignment in -v 0
+ and -v 1 modes. Use the -1/-2 options to specify the paired-end
+ files, -I/-X to specify min and max insert sizes, and --fr/--rf/--ff
+ to specify the relative orientation of upstream and downstream
+ mates.
+ bowtie-build now builds two additional files: NAME.3.ebwt and
+ NAME.4.ebwt. Together, these files store a bitpacked version of
+ the reference and they are required for paired-end alignment. If
+ your index does not include these files and you would like to
+ perform a paired-end alignment, you will have to rebuild the index
+ with bowtie-build version 0.9.9 or later. Paired-end alignment is
+ not compatible with -z mode, and it incurs about a 30% greater
+ memory overhead than single-end mode.
+ * Pre-built indexes available from Bowtie website have been updated
+ to include .3/.4.ebwt index files. These new pre-built indexes
+ are no longer compatible with bowtie versions prior to 0.9.8.
+ * New -B/--offbase option allows user to specify how bowtie numbers
+ reference positions in its output. E.g. -B 1 causes bowtie to
+ number leftmost char as 1. -B 0 is the default, but -B 1 will
+ likely become the default in the 1.0 release.
+ * Fixed a bug that caused trimming options -3 and -5 not to work
+ properly in -r (raw input) mode.
+ * bowtie-build now prints a friendly error message and exits if an
+ input file doesn't exist.
+ * Fixed a bug that caused the Win32 version of bowtie to hang just
+ before it would normally have exited.
+ * Fixed bug that could prevent successful read-in of very large
+ (>1GB) .2.ebwt index files.
+ * Removed --maxns option since it's mostly redundant with what -v
+ and -n already do.
+ * Removed --ntoa option.
+ * bowtie usage message is now divided into sections for clarity.
+
+Version 0.9.8.1 - January 7, 2009
+ * Fixed all known problems with the --unfa/--unfq options:
+ * They now work properly with multiple threads.
+ * Fixed issue where sequence and quals were sometimes reversed.
+ * Fixed other issues causing spurious omission of unaligned reads.
+ * Added --maxfa/--maxfq options so that reads that don't align due
+ to the -m limit can be dumped separately from reads that don't
+ align at all.
+ * Alignment output is now guaranteed to be "deterministic" even when
+ multiple threads are used. I.e., given the same input reads (in
+ any order) and the same --seed, bowtie will produce the same
+ alignments every time it is run, though not necessarily in the
+ same order. This does not hold across different versions of
+ Bowtie.
+ * Multiple other bug fixes.
+
+Version 0.9.8 - November 25, 2008
+ * --unfa/--unfq <filename> options cause bowtie to dump unaligned
+ reads to FASTA and/or FASTQ files.
+ * bowtie-build now selects its memory-efficiency parameters (--bmax,
+ --dcv, --packed) automatically by default; this makes it far
+ easier to build an index under memory constraints by eliminating
+ tedious trial-and-error. New -a option disables this, yielding
+ old behavior.
+ * bowtie-build-packed is no longer a separate binary. Supplying the
+ new -p/--packed argument to bowtie-build is the new equivalent.
+ * New tool bowtie-maptool converts between Bowtie's output formats.
+ * New tool bowtie-inspect recreates reference strings from Bowtie
+ index.
+ * Renamed bowtie-convert to bowtie-maqconvert for clarity.
+ * New universal Mac binary combines i386 & x86_64 binaries. PowerPC
+ still not supported.
+ * Added --nomaqround option to bowtie.
+ * Fixed memory leaks in bowtie.
+ * Switched to a new scheme for mapping positions in "joined"
+ reference string to positions in original strings. This changes
+ the index format. bowtie-build's --oldpmap parameter reverts to
+ the old format. Versions of bowtie prior to 0.9.8 cannot search
+ indexes produced by bowtie-build 0.9.8 unless bowtie-build is run
+ with --oldpmap. bowtie 0.9.8 can search either index format.
+ Pre-built indexes are still in the old format, but will switch to
+ new format when Bowtie 1.0 is released.
+
+Version 0.9.7.1 - November 11, 2008
+ * Fixed an issue that caused a spurious loss of sensitivity between
+ Bowtie versions 0.9.6 and 0.9.7 in certain modes. Many thanks to
+ Ali Mortazavi for bringing this to our attention.
+
+Version 0.9.7 - November 8, 2008
+ * Added new reporting option -m <int> which suppresses all
+ alignments for a particular read if more than <int> reportable
+ alignments exist for it.
+ * Threads now buffer all alignments for a particular read/phase then
+ output all alignments in one critical section. This guarantees
+ that all alignments for a given read/phase appear in one
+ consecutive block of the output, even when multiple threads are
+ operating in parallel.
+ * Separated the quality-conversion and parsing aspects of the old
+ --solexa-quals argument into separate arguments: --solexa-quals
+ (quality conversion) and --integer-quals (parsing).
+ * bowtie-convert now handles the new (post-0.7.0) Maq alignment
+ format. The new format allows Maq tools to handle reads up to
+ 127 bases, whereas the old format was limited to 63 bases. Added
+ a -o option to opt for the old Maq format.
+ * New --refout argument sends alignments to a set of files named
+ refXXXXX.map, where XXXXX is the 0-padded index of the reference
+ sequence aligned to. Useful for dealing with large datasets
+ aligned to, e.g., the assembled human genome.
+ * Improved tutorial to use a simple simulated read set (included)
+ to do SNP calls with Maq.
+ * Added --nota option to bowtie-build
+ * Fixed make_h_sapiens_asm.sh script to include mitochondrial DNA.
+
+Version 0.9.6 - October 10, 2008
+ * 'bowtie' now supports a host of options that allow the user to
+ specify which and how many valid alignments to report per read.
+ The default is still to report 1 "good" alignment, which is by far
+ the fastest mode. See -k/-a/--best/--nostrata options described
+ in the manual for details.
+ * 'bowtie' now supports reads up to 1024 bases long. Note that for
+ reads much longer than, say, 35 bases, the user must be careful to
+ set alignment policy parameters (especially -e) appropriately.
+ * --fast flag eliminated; double-index mode is now the default.
+ Added the -z/--phased flag to revert to phased, half-index mode.
+ * --concise output mode now officially supported. Now outputs one
+ alignment per line.
+ * Changed 'bowtie-build' default back to --bmaxdivn 4.
+ * -h/--help now prints much more verbose help for 'bowtie' and
+ 'bowtie-build' (verbatim from MANUAL file)
+ * BWT-searching code streamlined; much old code eliminated
+
+Version 0.9.5 - September 27, 2008
+ * Last column of output now additionally reports the reference and
+ query bases (in that order) for mismatches. E.g., old: "30,32",
+ new: "30:C>A,32:C>T".
+ * Eliminated spurious trailing space in first column of output.
+ * Minor performance and sensitivity improvements.
+ * New option '-p' spawns a user-specified number of pthreads for
+ parallel processing of reads. For example, use '-p 4' to run
+ 'bowtie' on 4 processor cores simultaneously.
+ * Due to the new '-p' option, 'bowtie' needs pthreads to compile and
+ run. To compile 'bowtie' without pthreads support (which disables
+ the '-p' option), use 'make BOWTIE_PTHREADS=0'.
+ * Also due to '-p' option, the Windows version of Bowtie now comes
+ with the pthreadGC2.dll file from the pthreads for Win32 project
+ (http://sourceware.org/pthreads-win32). This library is released
+ under the LGPL license.
+ * New option '--fast' causes Bowtie to load both the "forward" and
+ "mirror" halves of the index at once, which eliminates the need
+ for multiple phases and speeds up matching at the cost of using
+ about twice as much memory. '--fast' also causes 'bowtie' to
+ scale better when used in combination with '-p'.
+ * Fixed crashing bug with -o/--offrate in 'bowtie'.
+ * Improved error reporting.
+
+Version 0.9.4 - September 16, 2008
+ * New method for handling gaps and ambiguity codes in the reference.
+ New 'bowtie-build' method handles long stretches of gaps
+ gracefully. New 'bowtie' rejects alignments that overlap a gap or
+ ambiguous character in the reference.
+ * Due to above change, index file format has been changed. All
+ pre-built indexes available on this site have been updated to the
+ new format. To obtain indexes with the old format, contact us.
+ * In 'bowtie' unnamed reads are now given ordinal names (rather than
+ "default") in the alignment output. Works for all input modes.
+ * New 'bowtie' input mode: Raw, activated with -r. Expects one read
+ sequence per line; no quality values or names.
+ * Fixed 'bowtie' bug whereby trimming did not work in -c mode.
+ * Changed 'bowtie-build' default to not use blockwise mode.
+ * Changed 'bowtie-build' to avoid certain infinite-loop and
+ very-long-runtime scenarios.
+ * Packaging improvements: archives now unpack into subdirectories
+ and scripts are executable.
+
+Version 0.9.3 - September 6, 2008
+ * Major reference-name bug fixes to bowtie-convert
+
+Version 0.9.2 - September 4, 2008
+ * Now allows 3 mismatches: the -n and -v options accept 3
+ * Output format prints reference name instead of id in third column
+ * Pre-built indexes updated to encode reference names
+ * Ns in reads now match nothing (previously, they matched As/Ts)
+ * Dropped -l/--linerate and -i/--linesperside arguments to
+ bowtie-build
+ * Fixed bug in Maq-like mode that allowed some poor alignments
+ * Minor speed improvements
+
+Version 0.9.1 - August 25, 2008
+ * Integrated relevant SeqAn-1.1 sources into Bowtie source release
+ * Now builds on Windows under MinGW (needs pthreads and zlib)
+ * Binary releases for Linux (i386, x86_64), Windows (i386) and
+ Mac OS X (i386)
+
+Version 0.9.0 - August 18, 2008
+ * First stable release of Bowtie.
+ * Includes the three core Bowtie tools: the indexer 'bowtie-build',
+ the read aligner 'bowtie' and the converter from Bowtie's to Maq's
+ mapping output format, 'bowtie-convert'.
+ * Compatible pre-built indexes for many model organisms are
+ available from http://bowtie-bio.sf.net.
+ * FASTA, FASTQ inputs supported; tested with Solexa FASTQ
+ * Supports Maq alignment policy (-n and -e behave as in Maq)
+ * Supports X-mismatch policy (-v option behaves as in SOAP)
+ * -n and -v options accept 0, 1, or 2
--- /dev/null
+ GNU GENERAL PUBLIC LICENSE
+ Version 3, 29 June 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The GNU General Public License is a free, copyleft license for
+software and other kinds of works.
+
+ The licenses for most software and other practical works are designed
+to take away your freedom to share and change the works. By contrast,
+the GNU General Public License is intended to guarantee your freedom to
+share and change all versions of a program--to make sure it remains free
+software for all its users. We, the Free Software Foundation, use the
+GNU General Public License for most of our software; it applies also to
+any other work released this way by its authors. You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+them if you wish), that you receive source code or can get it if you
+want it, that you can change the software or use pieces of it in new
+free programs, and that you know you can do these things.
+
+ To protect your rights, we need to prevent others from denying you
+these rights or asking you to surrender the rights. Therefore, you have
+certain responsibilities if you distribute copies of the software, or if
+you modify it: responsibilities to respect the freedom of others.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must pass on to the recipients the same
+freedoms that you received. You must make sure that they, too, receive
+or can get the source code. And you must show them these terms so they
+know their rights.
+
+ Developers that use the GNU GPL protect your rights with two steps:
+(1) assert copyright on the software, and (2) offer you this License
+giving you legal permission to copy, distribute and/or modify it.
+
+ For the developers' and authors' protection, the GPL clearly explains
+that there is no warranty for this free software. For both users' and
+authors' sake, the GPL requires that modified versions be marked as
+changed, so that their problems will not be attributed erroneously to
+authors of previous versions.
+
+ Some devices are designed to deny users access to install or run
+modified versions of the software inside them, although the manufacturer
+can do so. This is fundamentally incompatible with the aim of
+protecting users' freedom to change the software. The systematic
+pattern of such abuse occurs in the area of products for individuals to
+use, which is precisely where it is most unacceptable. Therefore, we
+have designed this version of the GPL to prohibit the practice for those
+products. If such problems arise substantially in other domains, we
+stand ready to extend this provision to those domains in future versions
+of the GPL, as needed to protect the freedom of users.
+
+ Finally, every program is threatened constantly by software patents.
+States should not allow patents to restrict development and use of
+software on general-purpose computers, but in those that do, we wish to
+avoid the special danger that patents applied to a free program could
+make it effectively proprietary. To prevent this, the GPL assures that
+patents cannot be used to render the program non-free.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ TERMS AND CONDITIONS
+
+ 0. Definitions.
+
+ "This License" refers to version 3 of the GNU General Public License.
+
+ "Copyright" also means copyright-like laws that apply to other kinds of
+works, such as semiconductor masks.
+
+ "The Program" refers to any copyrightable work licensed under this
+License. Each licensee is addressed as "you". "Licensees" and
+"recipients" may be individuals or organizations.
+
+ To "modify" a work means to copy from or adapt all or part of the work
+in a fashion requiring copyright permission, other than the making of an
+exact copy. The resulting work is called a "modified version" of the
+earlier work or a work "based on" the earlier work.
+
+ A "covered work" means either the unmodified Program or a work based
+on the Program.
+
+ To "propagate" a work means to do anything with it that, without
+permission, would make you directly or secondarily liable for
+infringement under applicable copyright law, except executing it on a
+computer or modifying a private copy. Propagation includes copying,
+distribution (with or without modification), making available to the
+public, and in some countries other activities as well.
+
+ To "convey" a work means any kind of propagation that enables other
+parties to make or receive copies. Mere interaction with a user through
+a computer network, with no transfer of a copy, is not conveying.
+
+ An interactive user interface displays "Appropriate Legal Notices"
+to the extent that it includes a convenient and prominently visible
+feature that (1) displays an appropriate copyright notice, and (2)
+tells the user that there is no warranty for the work (except to the
+extent that warranties are provided), that licensees may convey the
+work under this License, and how to view a copy of this License. If
+the interface presents a list of user commands or options, such as a
+menu, a prominent item in the list meets this criterion.
+
+ 1. Source Code.
+
+ The "source code" for a work means the preferred form of the work
+for making modifications to it. "Object code" means any non-source
+form of a work.
+
+ A "Standard Interface" means an interface that either is an official
+standard defined by a recognized standards body, or, in the case of
+interfaces specified for a particular programming language, one that
+is widely used among developers working in that language.
+
+ The "System Libraries" of an executable work include anything, other
+than the work as a whole, that (a) is included in the normal form of
+packaging a Major Component, but which is not part of that Major
+Component, and (b) serves only to enable use of the work with that
+Major Component, or to implement a Standard Interface for which an
+implementation is available to the public in source code form. A
+"Major Component", in this context, means a major essential component
+(kernel, window system, and so on) of the specific operating system
+(if any) on which the executable work runs, or a compiler used to
+produce the work, or an object code interpreter used to run it.
+
+ The "Corresponding Source" for a work in object code form means all
+the source code needed to generate, install, and (for an executable
+work) run the object code and to modify the work, including scripts to
+control those activities. However, it does not include the work's
+System Libraries, or general-purpose tools or generally available free
+programs which are used unmodified in performing those activities but
+which are not part of the work. For example, Corresponding Source
+includes interface definition files associated with source files for
+the work, and the source code for shared libraries and dynamically
+linked subprograms that the work is specifically designed to require,
+such as by intimate data communication or control flow between those
+subprograms and other parts of the work.
+
+ The Corresponding Source need not include anything that users
+can regenerate automatically from other parts of the Corresponding
+Source.
+
+ The Corresponding Source for a work in source code form is that
+same work.
+
+ 2. Basic Permissions.
+
+ All rights granted under this License are granted for the term of
+copyright on the Program, and are irrevocable provided the stated
+conditions are met. This License explicitly affirms your unlimited
+permission to run the unmodified Program. The output from running a
+covered work is covered by this License only if the output, given its
+content, constitutes a covered work. This License acknowledges your
+rights of fair use or other equivalent, as provided by copyright law.
+
+ You may make, run and propagate covered works that you do not
+convey, without conditions so long as your license otherwise remains
+in force. You may convey covered works to others for the sole purpose
+of having them make modifications exclusively for you, or provide you
+with facilities for running those works, provided that you comply with
+the terms of this License in conveying all material for which you do
+not control copyright. Those thus making or running the covered works
+for you must do so exclusively on your behalf, under your direction
+and control, on terms that prohibit them from making any copies of
+your copyrighted material outside their relationship with you.
+
+ Conveying under any other circumstances is permitted solely under
+the conditions stated below. Sublicensing is not allowed; section 10
+makes it unnecessary.
+
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
+
+ No covered work shall be deemed part of an effective technological
+measure under any applicable law fulfilling obligations under article
+11 of the WIPO copyright treaty adopted on 20 December 1996, or
+similar laws prohibiting or restricting circumvention of such
+measures.
+
+ When you convey a covered work, you waive any legal power to forbid
+circumvention of technological measures to the extent such circumvention
+is effected by exercising rights under this License with respect to
+the covered work, and you disclaim any intention to limit operation or
+modification of the work as a means of enforcing, against the work's
+users, your or third parties' legal rights to forbid circumvention of
+technological measures.
+
+ 4. Conveying Verbatim Copies.
+
+ You may convey verbatim copies of the Program's source code as you
+receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy an appropriate copyright notice;
+keep intact all notices stating that this License and any
+non-permissive terms added in accord with section 7 apply to the code;
+keep intact all notices of the absence of any warranty; and give all
+recipients a copy of this License along with the Program.
+
+ You may charge any price or no price for each copy that you convey,
+and you may offer support or warranty protection for a fee.
+
+ 5. Conveying Modified Source Versions.
+
+ You may convey a work based on the Program, or the modifications to
+produce it from the Program, in the form of source code under the
+terms of section 4, provided that you also meet all of these conditions:
+
+ a) The work must carry prominent notices stating that you modified
+ it, and giving a relevant date.
+
+ b) The work must carry prominent notices stating that it is
+ released under this License and any conditions added under section
+ 7. This requirement modifies the requirement in section 4 to
+ "keep intact all notices".
+
+ c) You must license the entire work, as a whole, under this
+ License to anyone who comes into possession of a copy. This
+ License will therefore apply, along with any applicable section 7
+ additional terms, to the whole of the work, and all its parts,
+ regardless of how they are packaged. This License gives no
+ permission to license the work in any other way, but it does not
+ invalidate such permission if you have separately received it.
+
+ d) If the work has interactive user interfaces, each must display
+ Appropriate Legal Notices; however, if the Program has interactive
+ interfaces that do not display Appropriate Legal Notices, your
+ work need not make them do so.
+
+ A compilation of a covered work with other separate and independent
+works, which are not by their nature extensions of the covered work,
+and which are not combined with it such as to form a larger program,
+in or on a volume of a storage or distribution medium, is called an
+"aggregate" if the compilation and its resulting copyright are not
+used to limit the access or legal rights of the compilation's users
+beyond what the individual works permit. Inclusion of a covered work
+in an aggregate does not cause this License to apply to the other
+parts of the aggregate.
+
+ 6. Conveying Non-Source Forms.
+
+ You may convey a covered work in object code form under the terms
+of sections 4 and 5, provided that you also convey the
+machine-readable Corresponding Source under the terms of this License,
+in one of these ways:
+
+ a) Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by the
+ Corresponding Source fixed on a durable physical medium
+ customarily used for software interchange.
+
+ b) Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by a
+ written offer, valid for at least three years and valid for as
+ long as you offer spare parts or customer support for that product
+ model, to give anyone who possesses the object code either (1) a
+ copy of the Corresponding Source for all the software in the
+ product that is covered by this License, on a durable physical
+ medium customarily used for software interchange, for a price no
+ more than your reasonable cost of physically performing this
+ conveying of source, or (2) access to copy the
+ Corresponding Source from a network server at no charge.
+
+ c) Convey individual copies of the object code with a copy of the
+ written offer to provide the Corresponding Source. This
+ alternative is allowed only occasionally and noncommercially, and
+ only if you received the object code with such an offer, in accord
+ with subsection 6b.
+
+ d) Convey the object code by offering access from a designated
+ place (gratis or for a charge), and offer equivalent access to the
+ Corresponding Source in the same way through the same place at no
+ further charge. You need not require recipients to copy the
+ Corresponding Source along with the object code. If the place to
+ copy the object code is a network server, the Corresponding Source
+ may be on a different server (operated by you or a third party)
+ that supports equivalent copying facilities, provided you maintain
+ clear directions next to the object code saying where to find the
+ Corresponding Source. Regardless of what server hosts the
+ Corresponding Source, you remain obligated to ensure that it is
+ available for as long as needed to satisfy these requirements.
+
+ e) Convey the object code using peer-to-peer transmission, provided
+ you inform other peers where the object code and Corresponding
+ Source of the work are being offered to the general public at no
+ charge under subsection 6d.
+
+ A separable portion of the object code, whose source code is excluded
+from the Corresponding Source as a System Library, need not be
+included in conveying the object code work.
+
+ A "User Product" is either (1) a "consumer product", which means any
+tangible personal property which is normally used for personal, family,
+or household purposes, or (2) anything designed or sold for incorporation
+into a dwelling. In determining whether a product is a consumer product,
+doubtful cases shall be resolved in favor of coverage. For a particular
+product received by a particular user, "normally used" refers to a
+typical or common use of that class of product, regardless of the status
+of the particular user or of the way in which the particular user
+actually uses, or expects or is expected to use, the product. A product
+is a consumer product regardless of whether the product has substantial
+commercial, industrial or non-consumer uses, unless such uses represent
+the only significant mode of use of the product.
+
+ "Installation Information" for a User Product means any methods,
+procedures, authorization keys, or other information required to install
+and execute modified versions of a covered work in that User Product from
+a modified version of its Corresponding Source. The information must
+suffice to ensure that the continued functioning of the modified object
+code is in no case prevented or interfered with solely because
+modification has been made.
+
+ If you convey an object code work under this section in, or with, or
+specifically for use in, a User Product, and the conveying occurs as
+part of a transaction in which the right of possession and use of the
+User Product is transferred to the recipient in perpetuity or for a
+fixed term (regardless of how the transaction is characterized), the
+Corresponding Source conveyed under this section must be accompanied
+by the Installation Information. But this requirement does not apply
+if neither you nor any third party retains the ability to install
+modified object code on the User Product (for example, the work has
+been installed in ROM).
+
+ The requirement to provide Installation Information does not include a
+requirement to continue to provide support service, warranty, or updates
+for a work that has been modified or installed by the recipient, or for
+the User Product in which it has been modified or installed. Access to a
+network may be denied when the modification itself materially and
+adversely affects the operation of the network or violates the rules and
+protocols for communication across the network.
+
+ Corresponding Source conveyed, and Installation Information provided,
+in accord with this section must be in a format that is publicly
+documented (and with an implementation available to the public in
+source code form), and must require no special password or key for
+unpacking, reading or copying.
+
+ 7. Additional Terms.
+
+ "Additional permissions" are terms that supplement the terms of this
+License by making exceptions from one or more of its conditions.
+Additional permissions that are applicable to the entire Program shall
+be treated as though they were included in this License, to the extent
+that they are valid under applicable law. If additional permissions
+apply only to part of the Program, that part may be used separately
+under those permissions, but the entire Program remains governed by
+this License without regard to the additional permissions.
+
+ When you convey a copy of a covered work, you may at your option
+remove any additional permissions from that copy, or from any part of
+it. (Additional permissions may be written to require their own
+removal in certain cases when you modify the work.) You may place
+additional permissions on material, added by you to a covered work,
+for which you have or can give appropriate copyright permission.
+
+ Notwithstanding any other provision of this License, for material you
+add to a covered work, you may (if authorized by the copyright holders of
+that material) supplement the terms of this License with terms:
+
+ a) Disclaiming warranty or limiting liability differently from the
+ terms of sections 15 and 16 of this License; or
+
+ b) Requiring preservation of specified reasonable legal notices or
+ author attributions in that material or in the Appropriate Legal
+ Notices displayed by works containing it; or
+
+ c) Prohibiting misrepresentation of the origin of that material, or
+ requiring that modified versions of such material be marked in
+ reasonable ways as different from the original version; or
+
+ d) Limiting the use for publicity purposes of names of licensors or
+ authors of the material; or
+
+ e) Declining to grant rights under trademark law for use of some
+ trade names, trademarks, or service marks; or
+
+ f) Requiring indemnification of licensors and authors of that
+ material by anyone who conveys the material (or modified versions of
+ it) with contractual assumptions of liability to the recipient, for
+ any liability that these contractual assumptions directly impose on
+ those licensors and authors.
+
+ All other non-permissive additional terms are considered "further
+restrictions" within the meaning of section 10. If the Program as you
+received it, or any part of it, contains a notice stating that it is
+governed by this License along with a term that is a further
+restriction, you may remove that term. If a license document contains
+a further restriction but permits relicensing or conveying under this
+License, you may add to a covered work material governed by the terms
+of that license document, provided that the further restriction does
+not survive such relicensing or conveying.
+
+ If you add terms to a covered work in accord with this section, you
+must place, in the relevant source files, a statement of the
+additional terms that apply to those files, or a notice indicating
+where to find the applicable terms.
+
+ Additional terms, permissive or non-permissive, may be stated in the
+form of a separately written license, or stated as exceptions;
+the above requirements apply either way.
+
+ 8. Termination.
+
+ You may not propagate or modify a covered work except as expressly
+provided under this License. Any attempt otherwise to propagate or
+modify it is void, and will automatically terminate your rights under
+this License (including any patent licenses granted under the third
+paragraph of section 11).
+
+ However, if you cease all violation of this License, then your
+license from a particular copyright holder is reinstated (a)
+provisionally, unless and until the copyright holder explicitly and
+finally terminates your license, and (b) permanently, if the copyright
+holder fails to notify you of the violation by some reasonable means
+prior to 60 days after the cessation.
+
+ Moreover, your license from a particular copyright holder is
+reinstated permanently if the copyright holder notifies you of the
+violation by some reasonable means, this is the first time you have
+received notice of violation of this License (for any work) from that
+copyright holder, and you cure the violation prior to 30 days after
+your receipt of the notice.
+
+ Termination of your rights under this section does not terminate the
+licenses of parties who have received copies or rights from you under
+this License. If your rights have been terminated and not permanently
+reinstated, you do not qualify to receive new licenses for the same
+material under section 10.
+
+ 9. Acceptance Not Required for Having Copies.
+
+ You are not required to accept this License in order to receive or
+run a copy of the Program. Ancillary propagation of a covered work
+occurring solely as a consequence of using peer-to-peer transmission
+to receive a copy likewise does not require acceptance. However,
+nothing other than this License grants you permission to propagate or
+modify any covered work. These actions infringe copyright if you do
+not accept this License. Therefore, by modifying or propagating a
+covered work, you indicate your acceptance of this License to do so.
+
+ 10. Automatic Licensing of Downstream Recipients.
+
+ Each time you convey a covered work, the recipient automatically
+receives a license from the original licensors, to run, modify and
+propagate that work, subject to this License. You are not responsible
+for enforcing compliance by third parties with this License.
+
+ An "entity transaction" is a transaction transferring control of an
+organization, or substantially all assets of one, or subdividing an
+organization, or merging organizations. If propagation of a covered
+work results from an entity transaction, each party to that
+transaction who receives a copy of the work also receives whatever
+licenses to the work the party's predecessor in interest had or could
+give under the previous paragraph, plus a right to possession of the
+Corresponding Source of the work from the predecessor in interest, if
+the predecessor has it or can get it with reasonable efforts.
+
+ You may not impose any further restrictions on the exercise of the
+rights granted or affirmed under this License. For example, you may
+not impose a license fee, royalty, or other charge for exercise of
+rights granted under this License, and you may not initiate litigation
+(including a cross-claim or counterclaim in a lawsuit) alleging that
+any patent claim is infringed by making, using, selling, offering for
+sale, or importing the Program or any portion of it.
+
+ 11. Patents.
+
+ A "contributor" is a copyright holder who authorizes use under this
+License of the Program or a work on which the Program is based. The
+work thus licensed is called the contributor's "contributor version".
+
+ A contributor's "essential patent claims" are all patent claims
+owned or controlled by the contributor, whether already acquired or
+hereafter acquired, that would be infringed by some manner, permitted
+by this License, of making, using, or selling its contributor version,
+but do not include claims that would be infringed only as a
+consequence of further modification of the contributor version. For
+purposes of this definition, "control" includes the right to grant
+patent sublicenses in a manner consistent with the requirements of
+this License.
+
+ Each contributor grants you a non-exclusive, worldwide, royalty-free
+patent license under the contributor's essential patent claims, to
+make, use, sell, offer for sale, import and otherwise run, modify and
+propagate the contents of its contributor version.
+
+ In the following three paragraphs, a "patent license" is any express
+agreement or commitment, however denominated, not to enforce a patent
+(such as an express permission to practice a patent or covenant not to
+sue for patent infringement). To "grant" such a patent license to a
+party means to make such an agreement or commitment not to enforce a
+patent against the party.
+
+ If you convey a covered work, knowingly relying on a patent license,
+and the Corresponding Source of the work is not available for anyone
+to copy, free of charge and under the terms of this License, through a
+publicly available network server or other readily accessible means,
+then you must either (1) cause the Corresponding Source to be so
+available, or (2) arrange to deprive yourself of the benefit of the
+patent license for this particular work, or (3) arrange, in a manner
+consistent with the requirements of this License, to extend the patent
+license to downstream recipients. "Knowingly relying" means you have
+actual knowledge that, but for the patent license, your conveying the
+covered work in a country, or your recipient's use of the covered work
+in a country, would infringe one or more identifiable patents in that
+country that you have reason to believe are valid.
+
+ If, pursuant to or in connection with a single transaction or
+arrangement, you convey, or propagate by procuring conveyance of, a
+covered work, and grant a patent license to some of the parties
+receiving the covered work authorizing them to use, propagate, modify
+or convey a specific copy of the covered work, then the patent license
+you grant is automatically extended to all recipients of the covered
+work and works based on it.
+
+ A patent license is "discriminatory" if it does not include within
+the scope of its coverage, prohibits the exercise of, or is
+conditioned on the non-exercise of one or more of the rights that are
+specifically granted under this License. You may not convey a covered
+work if you are a party to an arrangement with a third party that is
+in the business of distributing software, under which you make payment
+to the third party based on the extent of your activity of conveying
+the work, and under which the third party grants, to any of the
+parties who would receive the covered work from you, a discriminatory
+patent license (a) in connection with copies of the covered work
+conveyed by you (or copies made from those copies), or (b) primarily
+for and in connection with specific products or compilations that
+contain the covered work, unless you entered into that arrangement,
+or that patent license was granted, prior to 28 March 2007.
+
+ Nothing in this License shall be construed as excluding or limiting
+any implied license or other defenses to infringement that may
+otherwise be available to you under applicable patent law.
+
+ 12. No Surrender of Others' Freedom.
+
+ If conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot convey a
+covered work so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you may
+not convey it at all. For example, if you agree to terms that obligate you
+to collect a royalty for further conveying from those to whom you convey
+the Program, the only way you could satisfy both those terms and this
+License would be to refrain entirely from conveying the Program.
+
+ 13. Use with the GNU Affero General Public License.
+
+ Notwithstanding any other provision of this License, you have
+permission to link or combine any covered work with a work licensed
+under version 3 of the GNU Affero General Public License into a single
+combined work, and to convey the resulting work. The terms of this
+License will continue to apply to the part which is the covered work,
+but the special requirements of the GNU Affero General Public License,
+section 13, concerning interaction through a network will apply to the
+combination as such.
+
+ 14. Revised Versions of this License.
+
+ The Free Software Foundation may publish revised and/or new versions of
+the GNU General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+ Each version is given a distinguishing version number. If the
+Program specifies that a certain numbered version of the GNU General
+Public License "or any later version" applies to it, you have the
+option of following the terms and conditions either of that numbered
+version or of any later version published by the Free Software
+Foundation. If the Program does not specify a version number of the
+GNU General Public License, you may choose any version ever published
+by the Free Software Foundation.
+
+ If the Program specifies that a proxy can decide which future
+versions of the GNU General Public License can be used, that proxy's
+public statement of acceptance of a version permanently authorizes you
+to choose that version for the Program.
+
+ Later license versions may give you additional or different
+permissions. However, no additional obligations are imposed on any
+author or copyright holder as a result of your choosing to follow a
+later version.
+
+ 15. Disclaimer of Warranty.
+
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
+OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
+THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
+IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
+ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+ 16. Limitation of Liability.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
+THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
+GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
+EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
+SUCH DAMAGES.
+
+ 17. Interpretation of Sections 15 and 16.
+
+ If the disclaimer of warranty and limitation of liability provided
+above cannot be given local legal effect according to their terms,
+reviewing courts shall apply local law that most closely approximates
+an absolute waiver of all civil liability in connection with the
+Program, unless a warranty or assumption of liability accompanies a
+copy of the Program in return for a fee.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+state the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+ If the program does terminal interaction, make it output a short
+notice like this when it starts in an interactive mode:
+
+ <program> Copyright (C) <year> <name of author>
+ This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, your program's commands
+might be different; for a GUI interface, you would use an "about box".
+
+ You should also get your employer (if you work as a programmer) or school,
+if any, to sign a "copyright disclaimer" for the program, if necessary.
+For more information on this, and how to apply and follow the GNU GPL, see
+<http://www.gnu.org/licenses/>.
+
+ The GNU General Public License does not permit incorporating your program
+into proprietary programs. If your program is a subroutine library, you
+may consider it more useful to permit linking proprietary applications with
+the library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License. But first, please read
+<http://www.gnu.org/philosophy/why-not-lgpl.html>.
--- /dev/null
+ GNU LESSER GENERAL PUBLIC LICENSE
+ Version 3, 29 June 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+
+ This version of the GNU Lesser General Public License incorporates
+the terms and conditions of version 3 of the GNU General Public
+License, supplemented by the additional permissions listed below.
+
+ 0. Additional Definitions.
+
+ As used herein, "this License" refers to version 3 of the GNU Lesser
+General Public License, and the "GNU GPL" refers to version 3 of the GNU
+General Public License.
+
+ "The Library" refers to a covered work governed by this License,
+other than an Application or a Combined Work as defined below.
+
+ An "Application" is any work that makes use of an interface provided
+by the Library, but which is not otherwise based on the Library.
+Defining a subclass of a class defined by the Library is deemed a mode
+of using an interface provided by the Library.
+
+ A "Combined Work" is a work produced by combining or linking an
+Application with the Library. The particular version of the Library
+with which the Combined Work was made is also called the "Linked
+Version".
+
+ The "Minimal Corresponding Source" for a Combined Work means the
+Corresponding Source for the Combined Work, excluding any source code
+for portions of the Combined Work that, considered in isolation, are
+based on the Application, and not on the Linked Version.
+
+ The "Corresponding Application Code" for a Combined Work means the
+object code and/or source code for the Application, including any data
+and utility programs needed for reproducing the Combined Work from the
+Application, but excluding the System Libraries of the Combined Work.
+
+ 1. Exception to Section 3 of the GNU GPL.
+
+ You may convey a covered work under sections 3 and 4 of this License
+without being bound by section 3 of the GNU GPL.
+
+ 2. Conveying Modified Versions.
+
+ If you modify a copy of the Library, and, in your modifications, a
+facility refers to a function or data to be supplied by an Application
+that uses the facility (other than as an argument passed when the
+facility is invoked), then you may convey a copy of the modified
+version:
+
+ a) under this License, provided that you make a good faith effort to
+ ensure that, in the event an Application does not supply the
+ function or data, the facility still operates, and performs
+ whatever part of its purpose remains meaningful, or
+
+ b) under the GNU GPL, with none of the additional permissions of
+ this License applicable to that copy.
+
+ 3. Object Code Incorporating Material from Library Header Files.
+
+ The object code form of an Application may incorporate material from
+a header file that is part of the Library. You may convey such object
+code under terms of your choice, provided that, if the incorporated
+material is not limited to numerical parameters, data structure
+layouts and accessors, or small macros, inline functions and templates
+(ten or fewer lines in length), you do both of the following:
+
+ a) Give prominent notice with each copy of the object code that the
+ Library is used in it and that the Library and its use are
+ covered by this License.
+
+ b) Accompany the object code with a copy of the GNU GPL and this license
+ document.
+
+ 4. Combined Works.
+
+ You may convey a Combined Work under terms of your choice that,
+taken together, effectively do not restrict modification of the
+portions of the Library contained in the Combined Work and reverse
+engineering for debugging such modifications, if you also do each of
+the following:
+
+ a) Give prominent notice with each copy of the Combined Work that
+ the Library is used in it and that the Library and its use are
+ covered by this License.
+
+ b) Accompany the Combined Work with a copy of the GNU GPL and this license
+ document.
+
+ c) For a Combined Work that displays copyright notices during
+ execution, include the copyright notice for the Library among
+ these notices, as well as a reference directing the user to the
+ copies of the GNU GPL and this license document.
+
+ d) Do one of the following:
+
+ 0) Convey the Minimal Corresponding Source under the terms of this
+ License, and the Corresponding Application Code in a form
+ suitable for, and under terms that permit, the user to
+ recombine or relink the Application with a modified version of
+ the Linked Version to produce a modified Combined Work, in the
+ manner specified by section 6 of the GNU GPL for conveying
+ Corresponding Source.
+
+ 1) Use a suitable shared library mechanism for linking with the
+ Library. A suitable mechanism is one that (a) uses at run time
+ a copy of the Library already present on the user's computer
+ system, and (b) will operate properly with a modified version
+ of the Library that is interface-compatible with the Linked
+ Version.
+
+ e) Provide Installation Information, but only if you would otherwise
+ be required to provide such information under section 6 of the
+ GNU GPL, and only to the extent that such information is
+ necessary to install and execute a modified version of the
+ Combined Work produced by recombining or relinking the
+ Application with a modified version of the Linked Version. (If
+ you use option 4d0, the Installation Information must accompany
+ the Minimal Corresponding Source and Corresponding Application
+ Code. If you use option 4d1, you must provide the Installation
+ Information in the manner specified by section 6 of the GNU GPL
+ for conveying Corresponding Source.)
+
+ 5. Combined Libraries.
+
+ You may place library facilities that are a work based on the
+Library side by side in a single library together with other library
+facilities that are not Applications and are not covered by this
+License, and convey such a combined library under terms of your
+choice, if you do both of the following:
+
+ a) Accompany the combined library with a copy of the same work based
+ on the Library, uncombined with any other library facilities,
+ conveyed under the terms of this License.
+
+ b) Give prominent notice with the combined library that part of it
+ is a work based on the Library, and explaining where to find the
+ accompanying uncombined form of the same work.
+
+ 6. Revised Versions of the GNU Lesser General Public License.
+
+ The Free Software Foundation may publish revised and/or new versions
+of the GNU Lesser General Public License from time to time. Such new
+versions will be similar in spirit to the present version, but may
+differ in detail to address new problems or concerns.
+
+ Each version is given a distinguishing version number. If the
+Library as you received it specifies that a certain numbered version
+of the GNU Lesser General Public License "or any later version"
+applies to it, you have the option of following the terms and
+conditions either of that published version or of any later version
+published by the Free Software Foundation. If the Library as you
+received it does not specify a version number of the GNU Lesser
+General Public License, you may choose any version of the GNU Lesser
+General Public License ever published by the Free Software Foundation.
+
+ If the Library as you received it specifies that a proxy can decide
+whether future versions of the GNU Lesser General Public License shall
+apply, that proxy's public statement of acceptance of any version is
+permanent authorization for you to choose that version for the
+Library.
--- /dev/null
+This is SeqAn, the C++ template library for sequence analysis
+
+See http://www.seqan.de for more information
+
+Read "docs/Page_Installation.html" for detailed installation instructions.
+
+
+Folders:
+========
+
+"seqan": SeqAn library (add the folder that contains "seqan" to
+ your include path)
+"demos": SeqAn demos, see also "docs/INDEXPAGE_Demo.html"
+"apps": SeqAn applications
+"docs": HTML-Documentation
+
+
+Files:
+======
+
+"Makefile": make file for Linux/Darwin/Solaris
+"*_7.sln", "*_7.vcproj": Visual Studio .net 2003 solution and project files
+"*_8.sln", "*_8.vcproj": Visual Studio .net 2005 solution and project files
+"*_9.sln", "*_9.vcproj": Visual Studio .net 2008 solution and project files
+
+
+Have fun!
+
+Your SeqAn Team
--- /dev/null
+#define PLATFORM "gcc"
+
+#ifndef PLATFORM_GCC
+ #define PLATFORM_GCC
+#endif
+
+// should be set before including anything
+#ifndef _FILE_OFFSET_BITS
+ #define _FILE_OFFSET_BITS 64
+#endif
+
+#ifndef _LARGEFILE_SOURCE
+ #define _LARGEFILE_SOURCE
+#endif
+
+//#include <unistd.h>
+#include <inttypes.h>
+
+#define finline __inline__
+
+// default 64bit type
+typedef int64_t __int64;
+
+
+//define SEQAN_SWITCH_USE_FORWARDS to use generated forwards
+#define SEQAN_SWITCH_USE_FORWARDS
--- /dev/null
+#define PLATFORM "windows"
+
+#ifndef PLATFORM_WINDOWS
+ #define PLATFORM_WINDOWS
+#endif
+
+#define finline __inline__
+
+//define SEQAN_SWITCH_USE_FORWARDS to use generated forwards
+//#define SEQAN_SWITCH_USE_FORWARDS
--- /dev/null
+#define PLATFORM "windows"
+
+#ifndef PLATFORM_WINDOWS
+ #define PLATFORM_WINDOWS
+#endif
+
+#pragma warning( disable : 4675 )
+#pragma warning( disable : 4503 )
+
+#define finline __forceinline
+
+//define SEQAN_SWITCH_USE_FORWARDS to use generated forwards
+//#define SEQAN_SWITCH_USE_FORWARDS
--- /dev/null
+ /*==========================================================================
+ SeqAn - The Library for Sequence Analysis
+ http://www.seqan.de
+ ============================================================================
+ Copyright (C) 2007
+
+ This library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 3 of the License, or (at your option) any later version.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ ============================================================================
+ $Id: basic.h,v 1.2 2009/05/06 20:32:59 langmead Exp $
+ ==========================================================================*/
+
+#ifndef SEQAN_HEADER_BASIC_H
+#define SEQAN_HEADER_BASIC_H
+
+//____________________________________________________________________________
+// prerequisites
+
+#include <seqan/platform.h>
+
+//#include <cstring>
+#ifdef PLATFORM_WINDOWS
+#include <limits> // limits include file exists only for g++ >= 3.0
+#endif
+
+#include <cstddef> // size_t
+#include <cstdio> // FILE, basic_debug
+#include <ctime>
+#include <iterator>
+#include <algorithm>
+#include <memory.h> // memset
+#include <string> // basic_profile
+#include <ostream> // std::ostream (operator<< in basic_aggregates.h)
+
+#define SEQAN_NAMESPACE_MAIN seqan
+
+//____________________________________________________________________________
+
+#include <seqan/basic/basic_forwards.h>
+#ifdef SEQAN_SWITCH_USE_FORWARDS
+#include <seqan/basic/basic_generated_forwards.h>
+#endif
+
+#include <seqan/basic/basic_debug.h>
+#include <seqan/basic/basic_profile.h>
+#include <seqan/basic/basic_definition.h>
+#include <seqan/basic/basic_metaprogramming.h>
+#include <seqan/basic/basic_type.h>
+#include <seqan/basic/basic_tag.h>
+
+//____________________________________________________________________________
+// allocators
+
+#include <seqan/basic/basic_allocator_interface.h>
+#include <seqan/basic/basic_allocator_to_std.h>
+
+#include <seqan/basic/basic_holder.h>
+
+#include <seqan/basic/basic_allocator_simple.h>
+#include <seqan/basic/basic_allocator_singlepool.h>
+#include <seqan/basic/basic_allocator_multipool.h>
+//#include <seqan/basic/basic_allocator_chunkpool.h>
+
+//____________________________________________________________________________
+
+#include <seqan/basic/basic_converter.h>
+#include <seqan/basic/basic_compare.h>
+#include <seqan/basic/basic_operator.h>
+
+#include <seqan/basic/basic_host.h>
+
+//____________________________________________________________________________
+// iterators
+
+#include <seqan/basic/basic_iterator.h>
+#include <seqan/basic/basic_iterator_base.h>
+
+#include <seqan/basic/basic_transport.h>
+
+#include <seqan/basic/basic_iterator_simple.h>
+#include <seqan/basic/basic_iterator_adaptor.h>
+#include <seqan/basic/basic_iterator_position.h>
+#include <seqan/basic/basic_iterator_adapt_std.h>
+//#include <seqan/basic_identifier.h>
+
+#include <seqan/basic/basic_proxy.h>
+
+#include <seqan/basic/basic_pointer.h>
+
+//____________________________________________________________________________
+// alphabets
+
+#include <seqan/basic/basic_alphabet_interface.h>
+#include <seqan/basic/basic_alphabet_trait_basic.h>
+
+#include <seqan/basic/basic_alphabet_interface2.h>
+
+#include <seqan/basic/basic_alphabet_simple_tabs.h>
+#include <seqan/basic/basic_alphabet_simple.h>
+
+//____________________________________________________________________________
+
+//#include <seqan/basic/basic_counted_ptr>
+#include <seqan/basic/basic_volatile_ptr.h>
+
+#include <seqan/basic/basic_aggregates.h>
+
+#endif //#ifndef SEQAN_HEADER_...
--- /dev/null
+ /*==========================================================================
+ SeqAn - The Library for Sequence Analysis
+ http://www.seqan.de
+ ============================================================================
+ Copyright (C) 2007
+
+ This library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 3 of the License, or (at your option) any later version.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ ============================================================================
+ $Id: basic_aggregates.h,v 1.1 2008/08/25 16:20:01 langmead Exp $
+ ==========================================================================*/
+
+#ifndef SEQAN_HEADER_BASIC_AGGREGATES_H
+#define SEQAN_HEADER_BASIC_AGGREGATES_H
+
+namespace SEQAN_NAMESPACE_MAIN
+{
+
+//____________________________________________________________________________
+
+ struct _Compressed;
+ typedef Tag<_Compressed> Compressed;
+
+ // for Pairs with small i1-values
+ // store i1 and i2 in one word of type i2
+ // use the upper bits for i1 and the lower bits for i2
+ template <unsigned valueSizeI1 = 16>
+ struct CutCompressed {
+ enum { bitSizeI1 = Log2<valueSizeI1>::VALUE };
+ };
+
+/**
+.Class.Pair:
+..cat:Aggregates
+..summary:Stores two arbitrary objects.
+..signature:Pair<T1, T2[, Compression]>
+..param.T1:The type of the first object.
+..param.T2:The type of the second object.
+..param.Compression:If $Compressed$, the pair is stored in a more space efficient way (useful for external storage).
+...note:When compression is enabled, referring to members is not allowed.
+...default:$void$, no compression (faster access).
+.Memfunc.Pair#Pair:
+..class:Class.Pair
+..summary:Constructor
+..signature:Pair<T1, T2> ()
+..signature:Pair<T1, T2> (pair)
+..signature:Pair<T1, T2> (i1, i2)
+..param.pair:Other Pair object. (copy constructor)
+..param.i1:T1 object.
+..param.i2:T2 object.
+.Memvar.Pair#i1:
+..class:Class.Pair
+..summary:T1 object
+.Memvar.Pair#i2:
+..class:Class.Pair
+..summary:T2 object
+*/
+
+ // standard storage
+ template <typename _T1, typename _T2 = _T1, typename TCompression = void>
+ struct Pair {
+ typedef _T1 T1;
+ typedef _T2 T2;
+ _T1 i1;
+ _T2 i2;
+ inline Pair() {}
+ inline Pair(Pair const &_p): i1(_p.i1), i2(_p.i2) {}
+ inline Pair(_T1 const &_i1, _T2 const &_i2): i1(_i1), i2(_i2) {}
+
+ template <typename __T1, typename __T2, typename __TCompression>
+ inline Pair(Pair<__T1, __T2, __TCompression> const &_p):
+ i1(getValueI1(_p)), i2(getValueI2(_p)) {}
+ };
+
+
+
+ // unaligned and unpadded storage (space efficient)
+#ifdef PLATFORM_WINDOWS
+ #pragma pack(push,1)
+#endif
+ template <typename _T1, typename _T2>
+ struct Pair<_T1, _T2, Compressed> {
+ typedef _T1 T1;
+ typedef _T2 T2;
+ _T1 i1;
+ _T2 i2;
+ inline Pair() {}
+ inline Pair(Pair const &_p): i1(_p.i1), i2(_p.i2) {}
+ inline Pair(_T1 const &_i1, _T2 const &_i2): i1(_i1), i2(_i2) {}
+
+ template <typename __T1, typename __T2, typename __TCompression>
+ inline Pair(Pair<__T1, __T2, __TCompression> const &_p):
+ i1(getValueI1(_p)), i2(getValueI2(_p)) {}
+ }
+#ifndef PLATFORM_WINDOWS
+ __attribute__((packed))
+#endif
+ ;
+#ifdef PLATFORM_WINDOWS
+ #pragma pack(pop)
+#endif
+
+
+
+#ifdef PLATFORM_WINDOWS
+ #pragma pack(push,1)
+#endif
+ template <typename _T1, typename _T2, unsigned valueSizeI1>
+ struct Pair<_T1, _T2, CutCompressed<valueSizeI1> > {
+ typedef _T1 T1;
+ typedef _T2 T2;
+
+ typedef _T2 T12;
+
+ T12 i12;
+
+ enum { bitSizeI1 = CutCompressed<valueSizeI1>::bitSizeI1 };
+ enum { bitShiftI1 = BitsPerValue<T12>::VALUE - bitSizeI1 };
+
+ inline Pair() {}
+ inline Pair(Pair const &_p): i12(_p.i12) {}
+ inline Pair(_T1 const &_i1, _T2 const &_i2):
+ i12(((T12)_i1 << bitShiftI1) + (T12)_i2) {}
+
+ template <typename __T1, typename __T2, typename __TCompression>
+ inline Pair(Pair<__T1, __T2, __TCompression> const &_p):
+ i12(((T12)getValueI1(_p) << bitShiftI1) + (T12)getValueI2(_p)) {}
+ }
+#ifndef PLATFORM_WINDOWS
+ __attribute__((packed))
+#endif
+ ;
+#ifdef PLATFORM_WINDOWS
+ #pragma pack(pop)
+#endif
+
+
+
+ template <typename _T1, typename _T2, typename TCompression>
+ std::ostream& operator<<(std::ostream &out, Pair<_T1,_T2,TCompression> const &p) {
+ out << "< " << getValueI1(p) << " , " << getValueI2(p) << " >";
+ return out;
+ }
+
+ template <typename T1, typename T2, typename TCompression>
+ struct Value< Pair<T1, T2, TCompression>, 1 > {
+ typedef T1 Type;
+ };
+
+ template <typename T1, typename T2, typename TCompression>
+ struct Value< Pair<T1, T2, TCompression>, 2 > {
+ typedef T2 Type;
+ };
+
+ template <typename T1, typename T2, typename TCompression>
+ struct Spec< Pair<T1, T2, TCompression> > {
+ typedef TCompression Type;
+ };
+
+
+//____________________________________________________________________________
+
+ template <typename TKey, typename TObject, typename TSpec>
+ struct Key< Pair<TKey, TObject, TSpec> >
+ {
+ typedef TKey Type;
+ };
+
+ template <typename TKey, typename TCargo, typename TSpec>
+ struct Cargo< Pair<TKey, TCargo, TSpec> >
+ {
+ typedef TCargo Type;
+ };
+//____________________________________________________________________________
+
+/**
+.Class.Triple:
+..cat:Aggregates
+..summary:Stores three arbitrary objects.
+..signature:Triple<T1, T2, T3[, Compression]>
+..param.T1:The type of the first object.
+..param.T2:The type of the second object.
+..param.T3:The type of the third object.
+..param.Compression:If $Compressed$, the triple is stored in a more space efficient way (useful for external storage).
+...note:When compression is enabled, referring to members is not allowed.
+...default:$void$, no compression (faster access).
+.Memfunc.Triple#Triple:
+..class:Class.Triple
+..summary:Constructor
+..signature:Triple<T1, T2, T3> ()
+..signature:Triple<T1, T2, T3> (triple)
+..signature:Triple<T1, T2, T3> (i1, i2, i3)
+..param.triple:Other Triple object. (copy constructor)
+..param.i1:T1 object.
+..param.i2:T2 object.
+..param.i3:T3 object.
+.Memvar.Triple#i1:
+..class:Class.Triple
+..summary:T1 object
+.Memvar.Triple#i2:
+..class:Class.Triple
+..summary:T2 object
+.Memvar.Triple#i3:
+..class:Class.Triple
+..summary:T3 object
+*/
+
+ // standard storage
+ template <typename _T1, typename _T2 = _T1, typename _T3 = _T1, typename TCompression = void>
+ struct Triple {
+ typedef _T1 T1;
+ typedef _T2 T2;
+ typedef _T3 T3;
+ _T1 i1;
+ _T2 i2;
+ _T3 i3;
+ inline Triple() {}
+ inline Triple(Triple const &_p):
+ i1(_p.i1), i2(_p.i2), i3(_p.i3) {}
+ inline Triple(_T1 const &_i1, _T2 const &_i2, _T3 const &_i3):
+ i1(_i1), i2(_i2), i3(_i3) {}
+
+ template <typename __T1, typename __T2, typename __T3, typename __TCompression>
+ inline Triple(Triple<__T1, __T2, __T3, __TCompression> const &_p):
+ i1(getValueI1(_p)), i2(getValueI2(_p)), i3(getValueI3(_p)) {}
+ };
+
+ // unaligned and unpadded storage (space efficient)
+#ifdef PLATFORM_WINDOWS
+ #pragma pack(push,1)
+#endif
+ template <typename _T1, typename _T2, typename _T3>
+ struct Triple<_T1, _T2, _T3, Compressed> {
+ typedef _T1 T1;
+ typedef _T2 T2;
+ typedef _T3 T3;
+ _T1 i1;
+ _T2 i2;
+ _T3 i3;
+ inline Triple() {}
+ inline Triple(Triple const &_p):
+ i1(_p.i1), i2(_p.i2), i3(_p.i3) {}
+ inline Triple(_T1 const &_i1, _T2 const &_i2, _T3 const &_i3):
+ i1(_i1), i2(_i2), i3(_i3) {}
+
+ template <typename __T1, typename __T2, typename __T3, typename __TCompression>
+ inline Triple(Triple<__T1, __T2, __T3, __TCompression> const &_p):
+ i1(getValueI1(_p)), i2(getValueI2(_p)), i3(getValueI3(_p)) {}
+ }
+#ifndef PLATFORM_WINDOWS
+ __attribute__((packed))
+#endif
+ ;
+#ifdef PLATFORM_WINDOWS
+ #pragma pack(pop)
+#endif
+
+ template <typename _T1, typename _T2, typename _T3, typename TCompression>
+ std::ostream& operator<<(std::ostream &out, Triple<_T1,_T2,_T3,TCompression> const &t) {
+ out << "< " << getValueI1(t) << " , " << getValueI2(t) << " , " << getValueI3(t) << " >";
+ return out;
+ }
+
+ template <typename T1, typename T2, typename T3, typename TCompression>
+ struct Value< Triple<T1, T2, T3, TCompression>, 1 > {
+ typedef T1 Type;
+ };
+
+ template <typename T1, typename T2, typename T3, typename TCompression>
+ struct Value< Triple<T1, T2, T3, TCompression>, 2 > {
+ typedef T2 Type;
+ };
+
+ template <typename T1, typename T2, typename T3, typename TCompression>
+ struct Value< Triple<T1, T2, T3, TCompression>, 3 > {
+ typedef T3 Type;
+ };
+
+ template <typename T1, typename T2, typename T3, typename TCompression>
+ struct Spec< Triple<T1, T2, T3, TCompression> > {
+ typedef TCompression Type;
+ };
+
+
+//____________________________________________________________________________
+
+/**
+.Class.Tuple:
+..cat:Aggregates
+..summary:A plain fixed-length string.
+..signature:Tuple<T, size[, compress]>
+..param.T:The value type, that is the type of characters stored in the tuple.
+..param.size:The size/length of the tuple.
+...remarks:In contrast to @Class.String@ the length of Tuple is fixed.
+..param.compress:Enable/Disable compression.
+..param.compress:If $void$, no compression is used.
+..param.compress:If $Compressed$, the characters are stored as a bit sequence in an ordinal type (char, ..., __int64)
+...remarks:Only useful for small alphabets and small tuple sizes (|Sigma|^size <= 2^64), as for DNA or protein m-grams.
+...default:void.
+..see:Spec.Sampler
+*/
+
+ // standard storage
+ template <typename _T, unsigned _size, typename TCompression = void>
+ struct Tuple {
+ typedef _T T;
+ enum { size = _size };
+ _T i[_size];
+
+ template <typename TPos>
+ inline _T& operator[](TPos k) {
+ SEQAN_ASSERT(k >= 0 && k < size);
+ return i[k];
+ }
+ template <typename TPos>
+ inline const _T& operator[](TPos k) const {
+ SEQAN_ASSERT(k >= 0 && k < size);
+ return i[k];
+ }
+ inline _T* operator&() { return i; }
+ inline const _T* operator&() const { return i; }
+
+ // has to be inline because elements (like this tuple) of packed structs can't be arguments
+ template <typename TPos, typename SSS>
+ inline SSS const assignValueAt(TPos k, SSS const source) {
+ return i[k] = source;
+ }
+ };
+
+
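+ // Maps a bit count to an integral type wide enough to hold it: the generic
+ // case recurses upward to the next specialization (8, 16, 32, 64, 255).
+ // Counts of 65..255 also resolve to __int64 and therefore do not fit.
+ template < unsigned char _size >
+ struct _BitVector {
+ typedef typename _BitVector<_size + 1>::Type Type;
+ };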
+
+ template <> struct _BitVector<8> { typedef unsigned char Type; };
+ template <> struct _BitVector<16> { typedef unsigned short Type; };
+ template <> struct _BitVector<32> { typedef unsigned int Type; };
+ template <> struct _BitVector<64> { typedef __int64 Type; };
+ template <> struct _BitVector<255> { typedef __int64 Type; };
+
+ // bit-compressed storage (space efficient)
+#ifdef PLATFORM_WINDOWS
+ #pragma pack(push,1)
+#endif
+ template <typename _T, unsigned _size>
+ struct Tuple<_T, _size, Compressed> {
+ typedef _T T;
+ enum { size = _size };
+ enum { bitSize = BitsPerValue<_T>::VALUE };
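+ // NOTE: mask is computed in int arithmetic, so it is only reliable
+ // while size * bitSize < 31 (assuming a 32-bit int).
+ enum { bitMask = (1 << bitSize) - 1 };
+ enum { mask = (1 << (size * bitSize)) - 1 };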
+ typedef typename _BitVector< bitSize * size >::Type CT;
+
+ CT i;
+/*
+ inline Tuple() {
+ SEQAN_ASSERT(bitSize * size <= sizeof(CT) * 8);
+ }
+*/
+ template <typename TPos>
+ inline const _T operator[](TPos k) const {
+ SEQAN_ASSERT(k >= 0 && k < size);
+ return (i >> (size - 1 - k) * bitSize) & bitMask;
+ }
+ template <unsigned __size>
+ inline Tuple operator=(Tuple<_T, __size, Compressed> const &_right) {
+ i = _right.i;
+ return *this;
+ }
+ template <typename TShiftSize>
+ inline CT operator<<=(TShiftSize shift) {
+ return i = (i << (shift * bitSize)) & mask;
+ }
+ template <typename TShiftSize>
+ inline CT operator<<(TShiftSize shift) const {
+ return (i << (shift * bitSize)) & mask;
+ }
+ template <typename TShiftSize>
+ inline CT operator>>=(TShiftSize shift) {
+ return i = (i >> (shift * bitSize));
+ }
+ template <typename TShiftSize>
+ inline CT operator>>(TShiftSize shift) const {
+ return i >> (shift * bitSize);
+ }
+ template <typename T>
+ inline void operator|=(T const &t) {
+ i |= t;
+ }
+ template <typename T, typename TSpec>
+ inline void operator|=(SimpleType<T, TSpec> const &t) {
+ i |= t.value;
+ }
+ inline CT* operator&() { return &i; }
+ inline const CT* operator&() const { return &i; }
+
+ // has to be inline because elements (like this tuple) of packed structs can't be arguments
+ template <typename TPos, typename SSS>
+ inline SSS const assignValueAt(TPos k, SSS const source) {
+ typedef Tuple<_T, _size, Compressed> Tup;
+ typename Tup::CT mask = (CT)Tup::bitMask << ((_size - 1 - k) * bitSize);
+ i = (i & ~mask) | ((CT)source << ((_size - 1 - k) * bitSize));
+ return source;
+ }
+ }
+#ifndef PLATFORM_WINDOWS
+ __attribute__((packed))
+#endif
+ ;
+#ifdef PLATFORM_WINDOWS
+ #pragma pack(pop)
+#endif
+
+
+//////////////////////////////////////////////////////////////////////////////
+// length
+
+ template <typename _T, unsigned _size, typename TCompression>
+ inline unsigned length(Tuple<_T, _size, TCompression> const &) { return _size; }
+
+ ///.Metafunction.LENGTH.param.T.type:Class.Tuple
+ template <typename _T, unsigned _size, typename TCompression>
+ struct LENGTH< Tuple<_T, _size, TCompression> >
+ {
+ enum { VALUE = _size };
+ };
+
+//////////////////////////////////////////////////////////////////////////////
+// assignValueAt
+
+ template <typename TObject, typename TPos, typename TSource>
+ inline TSource &
+ assignValueAt(TObject &me, TPos k, TSource &source) {
+ assign(value(me, k), source);
+ return source;
+ }
+
+ template <typename TObject, typename TPos, typename TSource>
+ inline TSource const &
+ assignValueAt(TObject &me, TPos k, TSource const &source) {
+ assign(value(me, k), source);
+ return source;
+ }
+
+ template <typename TTT, unsigned _size, typename SSS, typename TPos>
+ inline SSS const assignValueAt(Tuple<TTT, _size, void> &me, TPos k, SSS const source) {
+ return me.i[k] = source;
+ }
+
+ template <typename TTT, unsigned _size, typename SSS, typename TPos>
+ inline SSS const assignValueAt(Tuple<TTT, _size, Compressed> &me, TPos k, SSS const source) {
+ typedef Tuple<TTT, _size, Compressed> Tup;
+ typename Tup::CT mask = Tup::bitMask << ((_size - 1 - k) * me.bitSize);
+ me.i = (me.i & ~mask) | ((typename Tup::CT)source << ((_size - 1 - k) * me.bitSize));
+ return source;
+ }
+
+ template <typename TTT, typename SSS, typename SSSpec, unsigned _size, typename TPos>
+ inline SimpleType<SSS, SSSpec> const & assignValueAt(Tuple<TTT, _size, Compressed> &me, TPos k, SimpleType<SSS, SSSpec> const &source) {
+ typedef Tuple<TTT, _size, Compressed> Tup;
+ typename Tup::CT mask = Tup::bitMask << ((_size - 1 - k) * me.bitSize);
+ me.i = (me.i & ~mask) | ((typename Tup::CT)source.value << ((_size - 1 - k) * me.bitSize));
+ return source;
+ }
+
+//////////////////////////////////////////////////////////////////////////////
+// clear
+
+ template <typename TTT, unsigned _size, typename TCompression>
+ inline void clear(Tuple<TTT, _size, TCompression> &me) {
+ memset<sizeof(me.i), 0>(&(me.i));
+ }
+ template <typename TTT, unsigned _size>
+ inline void clear(Tuple<TTT, _size, Compressed> &me) {
+ me.i = 0;
+ }
+
+//////////////////////////////////////////////////////////////////////////////
+// optimized compares
+
+ template <typename TTT, unsigned _sizeL, unsigned _sizeR>
+ inline bool operator<(Tuple<TTT, _sizeL, Compressed> const &_left, Tuple<TTT, _sizeR, Compressed> const &_right) {
+ return _left.i < _right.i;
+ }
+ template <typename TTT, unsigned _sizeL, unsigned _sizeR>
+ inline bool operator>(Tuple<TTT, _sizeL, Compressed> const &_left, Tuple<TTT, _sizeR, Compressed> const &_right) {
+ return _left.i > _right.i;
+ }
+ template <typename TTT, unsigned _sizeL, unsigned _sizeR>
+ inline bool operator==(Tuple<TTT, _sizeL, Compressed> const &_left, Tuple<TTT, _sizeR, Compressed> const &_right) {
+ return _left.i == _right.i;
+ }
+ template <typename TTT, unsigned _sizeL, unsigned _sizeR>
+ inline bool operator!=(Tuple<TTT, _sizeL, Compressed> const &_left, Tuple<TTT, _sizeR, Compressed> const &_right) {
+ return _left.i != _right.i;
+ }
+
+//////////////////////////////////////////////////////////////////////////////
+// optimized shifts
+
+ struct _TupleShiftLeftWorker {
+ template <typename Arg>
+ static inline void body(Arg &arg, unsigned I) {
+ arg[I-1] = arg[I];
+ }
+ };
+
+ struct _TupleShiftRightWorker {
+ template <typename Arg>
+ static inline void body(Arg &arg, unsigned I) {
+ arg[I] = arg[I-1];
+ }
+ };
+
+ template <typename _T, unsigned _size, typename TCompression>
+ inline void shiftLeft(Tuple<_T, _size, TCompression> &me) {
+ LOOP<_TupleShiftLeftWorker, _size - 1>::run(me);
+ }
+
+ template <typename _T, unsigned _size, typename TCompression>
+ inline void shiftRight(Tuple<_T, _size, TCompression> &me) {
+ LOOP_REVERSE<_TupleShiftRightWorker, _size - 1>::run(me);
+ }
+
+ template <typename _T, unsigned _size>
+ inline void shiftLeft(Tuple<_T, _size, Compressed> &me) {
+ me<<=1;
+ }
+
+ template <typename _T, unsigned _size>
+ inline void shiftRight(Tuple<_T, _size, Compressed> &me) {
+ me>>=1;
+ }
+
+//////////////////////////////////////////////////////////////////////////////
+// standard output
+
+ template <typename _T, unsigned _size, typename TCompression>
+ std::ostream& operator<<(std::ostream& out, Tuple<_T,_size,TCompression> const &a) {
+ out << "[";
+ if (a.size > 0)
+ out << a[0];
+ for(unsigned j = 1; j < a.size; ++j)
+ out << " " << a[j];
+ out << "]";
+ return out;
+ }
+
+ template <typename _T, unsigned _size, typename TCompression>
+ struct Value< Tuple<_T, _size, TCompression> > {
+ typedef _T Type;
+ };
+
+ template <typename _T, unsigned _size, typename TCompression>
+ struct Spec< Tuple<_T, _size, TCompression> > {
+ typedef TCompression Type;
+ };
+
+//////////////////////////////////////////////////////////////////////////////
+// getValueIx
+
+ template <typename T1, typename T2, typename TCompression>
+ inline T1 getValueI1(Pair<T1, T2, TCompression> const &pair) {
+ return pair.i1;
+ }
+
+ template <typename T1, typename T2, typename TCompression>
+ inline T2 getValueI2(Pair<T1, T2, TCompression> const &pair) {
+ return pair.i2;
+ }
+
+ template <typename T1, typename T2, unsigned valueSizeI1>
+ inline T1 getValueI1(Pair<T1, T2, CutCompressed<valueSizeI1> > const &pair) {
+ typedef Pair<T1, T2, CutCompressed<valueSizeI1> > TPair;
+ return pair.i12 >> TPair::bitShiftI1;
+ }
+
+ template <typename T1, typename T2, unsigned valueSizeI1>
+ inline T2 getValueI2(Pair<T1, T2, CutCompressed<valueSizeI1> > const &pair) {
+ typedef Pair<T1, T2, CutCompressed<valueSizeI1> > TPair;
+ return pair.i12 & (((typename TPair::T12)1 << TPair::bitShiftI1) - 1);
+ }
+//____________________________________________________________________________
+
+ template <typename T1, typename T2, typename T3, typename TCompression>
+ inline T1 getValueI1(Triple<T1, T2, T3, TCompression> const &triple) {
+ return triple.i1;
+ }
+
+ template <typename T1, typename T2, typename T3, typename TCompression>
+ inline T2 getValueI2(Triple<T1, T2, T3, TCompression> const &triple) {
+ return triple.i2;
+ }
+
+ template <typename T1, typename T2, typename T3, typename TCompression>
+ inline T3 getValueI3(Triple<T1, T2, T3, TCompression> const &triple) {
+ return triple.i3;
+ }
+
+//////////////////////////////////////////////////////////////////////////////
+// assignValueIx
+
+ template <typename T1, typename T2, typename TCompression, typename T>
+ inline void assignValueI1(Pair<T1, T2, TCompression> &pair, T const &_i) {
+ pair.i1 = _i;
+ }
+
+ template <typename T1, typename T2, typename TCompression, typename T>
+ inline void assignValueI2(Pair<T1, T2, TCompression> &pair, T const &_i) {
+ pair.i2 = _i;
+ }
+
+ template <typename T1, typename T2, unsigned valueSizeI1, typename T>
+ inline void assignValueI1(Pair<T1, T2, CutCompressed<valueSizeI1> > &pair, T const &_i)
+ {
+ typedef Pair<T1, T2, CutCompressed<valueSizeI1> > TPair;
+ pair.i12 = ((typename TPair::T12)_i << TPair::bitShiftI1) |
+ (pair.i12 & (((typename TPair::T12)1 << TPair::bitShiftI1) - 1));
+ }
+
+ template <typename T1, typename T2, unsigned valueSizeI1, typename T>
+ inline void assignValueI2(Pair<T1, T2, CutCompressed<valueSizeI1> > &pair, T const &_i) {
+ typedef Pair<T1, T2, CutCompressed<valueSizeI1> > TPair;
+ pair.i12 = (pair.i12 & ~(((typename TPair::T12)1 << TPair::bitShiftI1) - 1)) | _i;
+ }
+//____________________________________________________________________________
+
+ template <typename T1, typename T2, typename T3, typename TCompression, typename T>
+ inline T const assignValueI1(Triple<T1, T2, T3, TCompression> &triple, T const &_i) {
+ return triple.i1 = _i;
+ }
+
+ template <typename T1, typename T2, typename T3, typename TCompression, typename T>
+ inline T const assignValueI2(Triple<T1, T2, T3, TCompression> &triple, T const &_i) {
+ return triple.i2 = _i;
+ }
+
+ template <typename T1, typename T2, typename T3, typename TCompression, typename T>
+ inline T const assignValueI3(Triple<T1, T2, T3, TCompression> &triple, T const &_i) {
+ return triple.i3 = _i;
+ }
+
+//////////////////////////////////////////////////////////////////////////////
+// operator ==/!= for pairs and triples
+
+ template <typename L1, typename L2, typename LCompression, typename R1, typename R2, typename RCompression>
+ inline bool operator==(Pair<L1, L2, LCompression> const &_left, Pair<R1, R2, RCompression> const &_right) {
+ return _left.i1 == _right.i1 && _left.i2 == _right.i2;
+ }
+ template <typename L1, typename L2, typename LCompression, typename R1, typename R2, typename RCompression>
+ inline bool operator!=(Pair<L1, L2, LCompression> const &_left, Pair<R1, R2, RCompression> const &_right) {
+ return _left.i1 != _right.i1 || _left.i2 != _right.i2;
+ }
+
+ template <typename L1, typename L2, unsigned LSizeI1, typename R1, typename R2, unsigned RSizeI1>
+ inline bool operator==(Pair<L1, L2, CutCompressed<LSizeI1> > const &_left, Pair<R1, R2, CutCompressed<RSizeI1> > const &_right) {
+ return _left.i12 == _right.i12;
+ }
+ template <typename L1, typename L2, unsigned LSizeI1, typename R1, typename R2, unsigned RSizeI1>
+ inline bool operator!=(Pair<L1, L2, CutCompressed<LSizeI1> > const &_left, Pair<R1, R2, CutCompressed<RSizeI1> > const &_right) {
+ return _left.i12 != _right.i12;
+ }
+//____________________________________________________________________________
+
+ template <
+ typename L1, typename L2, typename L3, typename LCompression,
+ typename R1, typename R2, typename R3, typename RCompression>
+ inline bool operator==(Triple<L1, L2, L3, LCompression> const &_left, Triple<R1, R2, R3, RCompression> const &_right) {
+ return _left.i1 == _right.i1 && _left.i2 == _right.i2 && _left.i3 == _right.i3;
+ }
+ template <
+ typename L1, typename L2, typename L3, typename LCompression,
+ typename R1, typename R2, typename R3, typename RCompression>
+ inline bool operator!=(Triple<L1, L2, L3, LCompression> const &_left, Triple<R1, R2, R3, RCompression> const &_right) {
+ return _left.i1 != _right.i1 || _left.i2 != _right.i2 || _left.i3 != _right.i3;
+ }
+
+}// namespace SEQAN_NAMESPACE_MAIN
+
+#endif //#ifndef SEQAN_HEADER_...
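Note on the compressed tuple above: the `assignValueAt` overloads for `Tuple<..., Compressed>` update one slot of a bit-packed word by clearing it with an inverted mask and OR-ing in the shifted value. The following is a minimal standalone sketch of that technique, not SeqAn code. `PackedTuple` and `valueAt` are hypothetical names, the storage is fixed to a 32-bit word instead of `Tup::CT`, and the sketch additionally masks the incoming value, which the original does not do.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a compressed tuple: SIZE values of BITS bits each,
// packed into one 32-bit word with element 0 in the most significant slot.
template <unsigned SIZE, unsigned BITS>
struct PackedTuple {
    static_assert(BITS > 0 && BITS < 32 && SIZE * BITS <= 32,
                  "tuple must fit into a 32-bit word");
    enum : std::uint32_t { bitMask = (1u << BITS) - 1u };

    std::uint32_t i = 0;   // packed storage, same role as the member `i` above

    // Overwrite slot k: clear its bits with the inverted mask, then OR in the value.
    void assignValueAt(unsigned k, std::uint32_t source) {
        assert(k < SIZE);
        const unsigned shift = (SIZE - 1 - k) * BITS;
        const std::uint32_t mask = static_cast<std::uint32_t>(bitMask) << shift;
        i = (i & ~mask) | ((source & bitMask) << shift);
    }

    // Read slot k by shifting it down to the low bits and masking.
    std::uint32_t valueAt(unsigned k) const {
        assert(k < SIZE);
        return (i >> ((SIZE - 1 - k) * BITS)) & bitMask;
    }
};
```

With four 2-bit slots, writing 3 to slot 0 sets the two highest used bits, so a later write to the same slot replaces them without disturbing the other three slots.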
--- /dev/null
+ /*==========================================================================
+ SeqAn - The Library for Sequence Analysis
+ http://www.seqan.de
+ ============================================================================
+ Copyright (C) 2007
+
+ This library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 3 of the License, or (at your option) any later version.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ ============================================================================
+ $Id: basic_allocator_interface.h,v 1.1 2008/08/25 16:20:02 langmead Exp $
+ ==========================================================================*/
+
+#ifndef SEQAN_HEADER_BASIC_ALLOCATOR_INTERFACE_H
+#define SEQAN_HEADER_BASIC_ALLOCATOR_INTERFACE_H
+
+namespace SEQAN_NAMESPACE_MAIN
+{
+//////////////////////////////////////////////////////////////////////////////
+//Allocator
+//////////////////////////////////////////////////////////////////////////////
+
+
+/**
+.Class.Allocator:
+..cat:Basic
+..summary:Manager for allocated memory.
+..signature:Allocator<TSpec>
+..param.TSpec:The specializing type.
+...metafunction:Metafunction.Spec
+..implements:Concept.Allocator
+..include:basic.h
+..remarks:There are two reasons for using non-trivial allocators:
+...text:1. Allocators support the function @Function.Allocator#clear@ for a fast deallocation of all
+allocated memory blocks.
+...text:2. Some allocators are faster at allocating and deallocating memory.
+Pool allocators such as @Spec.Single Pool Allocator@ or @Spec.Multi Pool Allocator@
+speed up @Function.allocate@, @Function.deallocate@, and @Function.Allocator#clear@ for
+pooled memory blocks.
+*/
+
+template <typename TSpec>
+struct Allocator;
+
+///.Function.allocate.param.object.type:Class.Allocator
+///.Function.deallocate.param.object.type:Class.Allocator
+
+
+//////////////////////////////////////////////////////////////////////////////
+// Metafunctions
+//////////////////////////////////////////////////////////////////////////////
+
+//.Metafunction.Spec.param.T.type:Class.Allocator
+
+template <typename TSpec>
+struct Spec<Allocator<TSpec> >
+{
+ typedef TSpec Type;
+};
+
+
+//////////////////////////////////////////////////////////////////////////////
+/**
+.Tag.Allocator Usage:
+..summary:The purpose of an allocated memory block.
+..tag.TagAllocateTemp:Temporary memory.
+..tag.TagAllocateStorage:Memory for storing container content.
+..see:Function.allocate
+..see:Function.deallocate
+*/
+struct TagAllocateUnspecified_; //< usage not specified
+typedef Tag<TagAllocateUnspecified_> const TagAllocateUnspecified;
+
+struct TagAllocateTemp_; //< allocate temporary memory
+typedef Tag<TagAllocateTemp_> const TagAllocateTemp;
+
+struct TagAllocateStorage_; //< allocate memory for storing member data
+typedef Tag<TagAllocateStorage_> const TagAllocateStorage;
+
+
+//////////////////////////////////////////////////////////////////////////////
+//allocates memory on heap. No c'tors are called.
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.allocate:
+..cat:Memory
+..summary:Allocates memory from heap.
+..signature:allocate(object, data, count [, usage_tag])
+..param.object:Allocator object.
+...remarks:$object$ is conceptually the "owner" of the allocated memory.
+ Objects of all types can be used as allocators. If no special behavior is implemented,
+ the default allocation/deallocation functions are applied, which use the standard
+ $new$ and $delete$ operators.
+..param.count:Number of items that could be stored in the allocated memory.
+...text:The type of the allocated items is given by the type of $data$.
+..param.usage_tag:A tag that specifies the purpose of the allocated memory.
+...value:@Tag.Allocator Usage@
+..returns.param.data:Pointer to allocated memory.
+...remarks:The value of this pointer is overwritten by the function.
+..remarks:
+...text:The function allocates at least $count*sizeof(T)$ bytes, where $T *$ is the type of $data$.
+ The allocated memory is large enough
+ to hold $count$ objects of type $T$.
+...note:These objects are not constructed by $allocate$.
+...text:Use e.g. one of the functions @Function.valueConstruct@, @Function.arrayConstruct@, @Function.arrayConstructCopy@ or @Function.arrayFill@
+to construct the objects.
+A $new$ operator which is part of the C++ standard (defined in $<new>$)
+ can also be used to construct objects at a given memory address.
+..note:All allocated memory blocks should be deallocated by the corresponding function @Function.deallocate@.
+..see:Function.deallocate
+..see:Function.valueConstruct
+..see:Function.arrayFill
+..see:Function.arrayConstruct
+..see:Function.arrayConstructCopy
+*/
+template <typename T, typename TValue, typename TSize>
+inline void
+allocate(T const & me,
+ TValue * & data,
+ TSize count)
+{
+ allocate(me, data, count, TagAllocateUnspecified());
+}
+template <typename T, typename TValue, typename TSize>
+inline void
+allocate(T & me,
+ TValue * & data,
+ TSize count)
+{
+ allocate(me, data, count, TagAllocateUnspecified());
+}
+
+template <typename T, typename TValue, typename TSize, typename TUsage>
+inline void
+allocate(T const &,
+ TValue * & data,
+ TSize count,
+ Tag<TUsage> const)
+{
+ data = (TValue *) operator new(count * sizeof(TValue));
+ if (data)
+ SEQAN_PROADD(SEQAN_PROMEMORY, count * sizeof(TValue));
+}
+template <typename T, typename TValue, typename TSize, typename TUsage>
+inline void
+allocate(T &,
+ TValue * & data,
+ TSize count,
+ Tag<TUsage> const)
+{
+ data = (TValue *) operator new(count * sizeof(TValue));
+ if (data)
+ SEQAN_PROADD(SEQAN_PROMEMORY, count * sizeof(TValue));
+}
+
+
+//////////////////////////////////////////////////////////////////////////////
+//deallocates memory that was allocated using allocate(.)
+
+/**
+.Function.deallocate:
+..cat:Memory
+..summary:Deallocates memory.
+..signature:deallocate(object, data, count [, usage_tag])
+..param.object:Allocator object.
+...remarks:$object$ is conceptually the "owner" of the allocated memory.
+ Objects of all types can be used as allocators. If no special behavior is implemented,
+ the default allocation/deallocation functions are applied, which use the standard
+ $new$ and $delete$ operators.
+..param.data:Pointer to memory that was allocated by $allocate$.
+..param.count:Number of items that could be stored in the allocated memory.
+..param.usage_tag:A tag that specifies the purpose of the allocated memory.
+...value:@Tag.Allocator Usage@
+..remarks:
+...text:The values for $object$, $count$ and $usage_tag$ should be the same as those
+used when $allocate$ was called. The value of $data$ should be the same as the one
+returned by $allocate$.
+...note:$deallocate$ does not destruct objects.
+...text:Use e.g. one of the functions @Function.valueDestruct@ or @Function.arrayDestruct@ to destruct the objects.
+$delete$ and $delete []$ operators which are part of the C++ standard (defined in $<new>$)
+ can also be used to destruct objects at a given memory address.
+..see:Function.valueDestruct
+..see:Function.arrayDestruct
+*/
+template <typename T, typename TValue, typename TSize>
+inline void
+deallocate(T const & me,
+ TValue * data,
+ TSize const count)
+{
+ deallocate(me, data, count, TagAllocateUnspecified());
+}
+template <typename T, typename TValue, typename TSize>
+inline void
+deallocate(T & me,
+ TValue * data,
+ TSize const count)
+{
+ deallocate(me, data, count, TagAllocateUnspecified());
+}
+
+template <typename T, typename TValue, typename TSize, typename TUsage>
+inline void
+deallocate(T const & /*me*/,
+ TValue * data,
+ TSize count,
+ Tag<TUsage> const)
+{
+ if (data && count) // .. to use count if SEQAN_PROFILE is not defined
+ SEQAN_PROSUB(SEQAN_PROMEMORY, count * sizeof(TValue));
+ operator delete ((void *) data);
+}
+template <typename T, typename TValue, typename TSize, typename TUsage>
+inline void
+deallocate(T & /*me*/,
+ TValue * data,
+ TSize count,
+ Tag<TUsage> const)
+{
+ if (data && count) // .. to use count if SEQAN_PROFILE is not defined
+ SEQAN_PROSUB(SEQAN_PROMEMORY, count * sizeof(TValue));
+ operator delete ((void *) data);
+}
+//////////////////////////////////////////////////////////////////////////////
+
+} //namespace SEQAN_NAMESPACE_MAIN
+
+#endif //#ifndef SEQAN_HEADER_...
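Note on the default `allocate`/`deallocate` above: they only obtain and release raw bytes through `operator new` and `operator delete`, so the caller is responsible for constructing and destructing the objects. The following standalone sketch illustrates that lifecycle under stated assumptions. `rawAllocate`, `rawDeallocate` and `demoLifecycle` are hypothetical helpers that only mirror the default behavior; they are not part of SeqAn, and the profiling macros are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper mirroring the default allocate(): raw bytes, no constructors run.
template <typename TValue>
inline void rawAllocate(TValue *& data, std::size_t count) {
    data = static_cast<TValue *>(::operator new(count * sizeof(TValue)));
}

// Hypothetical helper mirroring the default deallocate(): no destructors run.
template <typename TValue>
inline void rawDeallocate(TValue * data) {
    ::operator delete(static_cast<void *>(data));
}

// Full lifecycle: allocate, construct, use, destruct, deallocate.
inline std::size_t demoLifecycle() {
    typedef std::string TString;
    const std::size_t count = 3;

    TString * items = 0;
    rawAllocate(items, count);

    // Placement new constructs each object in the raw memory.
    for (std::size_t k = 0; k < count; ++k)
        new (items + k) TString(k + 1, 'a');   // "a", "aa", "aaa"

    std::size_t total = 0;
    for (std::size_t k = 0; k < count; ++k)
        total += items[k].size();

    // Explicit destructor calls undo the placement new before the bytes are released.
    for (std::size_t k = 0; k < count; ++k)
        items[k].~TString();

    rawDeallocate(items);
    return total;
}
```

Skipping either the construction or the destruction step would leave `std::string` objects uninitialized or leak their internal buffers, which is why the documentation points to separate construct and destruct functions.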
--- /dev/null
+ /*==========================================================================
+ SeqAn - The Library for Sequence Analysis
+ http://www.seqan.de
+ ============================================================================
+ Copyright (C) 2007
+
+ This library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 3 of the License, or (at your option) any later version.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ ============================================================================
+ $Id: basic_allocator_multipool.h,v 1.2 2009/02/19 01:51:23 langmead Exp $
+ ==========================================================================*/
+
+#ifndef SEQAN_HEADER_BASIC_ALLOCATOR_MULTIPOOL_H
+#define SEQAN_HEADER_BASIC_ALLOCATOR_MULTIPOOL_H
+
+namespace SEQAN_NAMESPACE_MAIN
+{
+//////////////////////////////////////////////////////////////////////////////
+// MultiPool Allocator
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Spec.Multi Pool Allocator:
+..cat:Allocators
+..general:Class.Allocator
+..summary:Allocator that pools memory blocks.
+..signature:Allocator< MultiPool<ParentAllocator, BLOCKING_LIMIT> >
+..param.ParentAllocator:An allocator that is used by the pool allocator to allocate memory.
+...default:@Spec.Simple Allocator@
+...note:The multi pool allocator only supports @Function.clear@ if this function is also implemented for $ParentAllocator$.
+..param.BLOCKING_LIMIT:The size limit in bytes for pooled memory blocks; only blocks smaller than this limit are pooled.
+...default:256
+..remarks:A pool allocator allocates several memory blocks at once.
+Freed blocks are not immediately deallocated but recycled in subsequent allocations.
+This way, the number of calls to the heap manager is reduced, and that speeds up memory management.
+...text:Note that memory blocks of $BLOCKING_LIMIT$ bytes or more are not pooled
+but immediately allocated and deallocated using $ParentAllocator$.
+*/
+
+
+template <typename TParentAllocator = Allocator<SimpleAlloc<Default> >, unsigned int BLOCKING_LIMIT = 0x100>
+struct MultiPool;
+
+//////////////////////////////////////////////////////////////////////////////
+
+typedef Allocator<MultiPool<Allocator<SimpleAlloc<Default> >, 0x100> > PoolAllocator;
+
+template <typename TParentAllocator, unsigned int BLOCKING_LIMIT_>
+struct Allocator<MultiPool<TParentAllocator, BLOCKING_LIMIT_> >
+{
+ enum
+ {
+ BLOCKING_LIMIT = BLOCKING_LIMIT_,
+ GRANULARITY_BITS = 2,
+ BLOCKING_COUNT = BLOCKING_LIMIT >> GRANULARITY_BITS,
+ STORAGE_SIZE = 0xf80
+ };
+
+ char * data_recycled_blocks [BLOCKING_COUNT];
+ char * data_current_begin [BLOCKING_COUNT];
+ char * data_current_free [BLOCKING_COUNT];
+ Holder<TParentAllocator> data_parent_allocator;
+
+ Allocator()
+ {
+SEQAN_CHECKPOINT
+ memset(data_recycled_blocks, 0, sizeof(data_recycled_blocks));
+ memset(data_current_begin, 0, sizeof(data_current_begin));
+ memset(data_current_free, 0, sizeof(data_current_free));
+ }
+
+ Allocator(TParentAllocator & parent_alloc)
+ {
+SEQAN_CHECKPOINT
+ memset(data_recycled_blocks, 0, sizeof(data_recycled_blocks));
+ memset(data_current_begin, 0, sizeof(data_current_begin));
+ memset(data_current_free, 0, sizeof(data_current_free));
+
+ setValue(data_parent_allocator, parent_alloc);
+ }
+
+ //Dummy copy
+ Allocator(Allocator const &)
+ {
+ memset(data_recycled_blocks, 0, sizeof(data_recycled_blocks));
+ memset(data_current_begin, 0, sizeof(data_current_begin));
+ memset(data_current_free, 0, sizeof(data_current_free));
+ }
+ inline Allocator &
+ operator = (Allocator const &)
+ {
+ clear(*this);
+ return *this;
+ }
+
+ ~Allocator()
+ {
+SEQAN_CHECKPOINT
+ clear(*this);
+ }
+};
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TParentAllocator, unsigned int BLOCKING_LIMIT>
+inline TParentAllocator &
+parentAllocator(Allocator<MultiPool<TParentAllocator, BLOCKING_LIMIT> > & me)
+{
+SEQAN_CHECKPOINT
+ return value(me.data_parent_allocator);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TParentAllocator, unsigned int BLOCKING_LIMIT>
+void
+clear(Allocator<MultiPool<TParentAllocator, BLOCKING_LIMIT> > & me)
+{
+SEQAN_CHECKPOINT
+ memset(me.data_recycled_blocks, 0, sizeof(me.data_recycled_blocks));
+ memset(me.data_current_begin, 0, sizeof(me.data_current_begin));
+ memset(me.data_current_free, 0, sizeof(me.data_current_free));
+
+ clear(parentAllocator(me));
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TParentAllocator, unsigned int BLOCKING_LIMIT>
+inline unsigned int
+_allocatorBlockNumber(Allocator<MultiPool<TParentAllocator, BLOCKING_LIMIT> > &,
+ size_t size_)
+{
+SEQAN_CHECKPOINT
+ typedef Allocator<MultiPool<TParentAllocator, BLOCKING_LIMIT> > TAllocator;
+
+ SEQAN_ASSERT(size_)
+
+ if (size_ < BLOCKING_LIMIT)
+ {//blocks
+ return size_ >> TAllocator::GRANULARITY_BITS;
+ }
+ else
+ {//no blocking
+ return TAllocator::BLOCKING_COUNT;
+ }
+}
+
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TParentAllocator, unsigned int BLOCKING_LIMIT, typename TValue, typename TSize, typename TUsage>
+inline void
+allocate(Allocator<MultiPool<TParentAllocator, BLOCKING_LIMIT> > & me,
+ TValue * & data,
+ TSize count,
+ Tag<TUsage> const tag_)
+{
+SEQAN_CHECKPOINT
+ typedef Allocator<MultiPool<TParentAllocator, BLOCKING_LIMIT> > TAllocator;
+
+ size_t bytes_needed = count * sizeof(TValue);
+ char * ptr;
+
+ unsigned int block_number = _allocatorBlockNumber(me, bytes_needed);
+ if (block_number == TAllocator::BLOCKING_COUNT)
+ {//no blocking
+ return allocate(parentAllocator(me), data, count, tag_);
+ }
+
+ bytes_needed = (block_number + 1) << TAllocator::GRANULARITY_BITS;
+
+ if (me.data_recycled_blocks[block_number])
+ {//use recycled
+ ptr = me.data_recycled_blocks[block_number];
+ me.data_recycled_blocks[block_number] = * reinterpret_cast<char **>(ptr);
+ }
+ else
+ {//use new
+ ptr = me.data_current_free[block_number];
+ if (!ptr || (ptr + bytes_needed > me.data_current_begin[block_number] + TAllocator::STORAGE_SIZE))
+ {//not enough free space in current storage: allocate new
+ allocate(parentAllocator(me), ptr, (size_t) TAllocator::STORAGE_SIZE, tag_);
+ me.data_current_begin[block_number] = ptr;
+ }
+ me.data_current_free[block_number] = ptr + bytes_needed;
+ }
+
+ data = reinterpret_cast<TValue *>(ptr);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TParentAllocator, unsigned int BLOCKING_LIMIT, typename TValue, typename TSize, typename TUsage>
+inline void
+deallocate(Allocator<MultiPool<TParentAllocator, BLOCKING_LIMIT> > & me,
+ TValue * data,
+ TSize count,
+ Tag<TUsage> const tag_)
+{
+SEQAN_CHECKPOINT
+ typedef Allocator<MultiPool<TParentAllocator, BLOCKING_LIMIT> > TAllocator;
+
+ size_t bytes_needed = count * sizeof(TValue);
+
+ unsigned int block_number = _allocatorBlockNumber(me, bytes_needed);
+ if (block_number == TAllocator::BLOCKING_COUNT)
+ {//no blocking
+ return deallocate(parentAllocator(me), data, count, tag_);
+ }
+
+ bytes_needed = (block_number + 1) << TAllocator::GRANULARITY_BITS;
+
+ //link in recycling list
+ *reinterpret_cast<char **>(data) = me.data_recycled_blocks[block_number];
+ me.data_recycled_blocks[block_number] = reinterpret_cast<char *>(data);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+} //namespace SEQAN_NAMESPACE_MAIN
+
+#endif //#ifndef SEQAN_HEADER_...
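Note on the multi pool allocator above: freed blocks are kept in one intrusive free list per size class, where the first bytes of each freed block store the pointer to the next free block. The following standalone sketch shows only that recycling idea. `TinyPool` is a hypothetical type, not SeqAn API. It assumes an 8-byte granularity and a 64-byte limit so that every pooled block can hold the list pointer (the original uses `GRANULARITY_BITS = 2` and a limit of 256), and it fetches each new block directly from `operator new` instead of carving blocks out of `STORAGE_SIZE` chunks from a parent allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical size-class pool: one intrusive free list per size class.
struct TinyPool {
    enum {
        GRANULARITY_BITS = 3,                                  // classes are multiples of 8 bytes
        BLOCKING_LIMIT = 64,                                   // requests of 64 bytes or more bypass the pool
        BLOCKING_COUNT = BLOCKING_LIMIT >> GRANULARITY_BITS
    };
    static_assert(sizeof(void *) <= (1u << GRANULARITY_BITS),
                  "smallest block must be able to hold the free-list pointer");

    void * recycled[BLOCKING_COUNT];   // heads of the per-class free lists

    TinyPool() {
        for (unsigned k = 0; k < unsigned(BLOCKING_COUNT); ++k) recycled[k] = 0;
    }
    TinyPool(TinyPool const &) = delete;
    TinyPool & operator=(TinyPool const &) = delete;

    // Release everything that is currently parked in the free lists.
    ~TinyPool() {
        for (unsigned k = 0; k < unsigned(BLOCKING_COUNT); ++k)
            while (recycled[k]) {
                void * next = *static_cast<void **>(recycled[k]);
                ::operator delete(recycled[k]);
                recycled[k] = next;
            }
    }

    // Same mapping as _allocatorBlockNumber: small sizes map to a class, large ones to BLOCKING_COUNT.
    static unsigned blockNumber(std::size_t bytes) {
        if (bytes < std::size_t(BLOCKING_LIMIT))
            return unsigned(bytes >> GRANULARITY_BITS);
        return unsigned(BLOCKING_COUNT);
    }

    void * allocate(std::size_t bytes) {
        const unsigned b = blockNumber(bytes);
        if (b == unsigned(BLOCKING_COUNT)) return ::operator new(bytes);   // not pooled
        if (recycled[b]) {                                                 // pop a recycled block
            void * ptr = recycled[b];
            recycled[b] = *static_cast<void **>(ptr);
            return ptr;
        }
        return ::operator new(std::size_t(b + 1) << GRANULARITY_BITS);     // fresh block of class size
    }

    void deallocate(void * ptr, std::size_t bytes) {
        const unsigned b = blockNumber(bytes);
        if (b == unsigned(BLOCKING_COUNT)) { ::operator delete(ptr); return; }
        *static_cast<void **>(ptr) = recycled[b];                          // push onto the free list
        recycled[b] = ptr;
    }
};
```

Because the size class is recomputed from the byte count on deallocation, the caller has to pass the same size that was used for the allocation, which matches the requirement stated for `deallocate` in the allocator interface.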
--- /dev/null
+ /*==========================================================================
+ SeqAn - The Library for Sequence Analysis
+ http://www.seqan.de
+ ============================================================================
+ Copyright (C) 2007
+
+ This library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 3 of the License, or (at your option) any later version.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ ============================================================================
+ $Id: basic_allocator_simple.h,v 1.1 2008/08/25 16:20:02 langmead Exp $
+ ==========================================================================*/
+
+#ifndef SEQAN_HEADER_BASIC_ALLOCATOR_SIMPLE_H
+#define SEQAN_HEADER_BASIC_ALLOCATOR_SIMPLE_H
+
+namespace SEQAN_NAMESPACE_MAIN
+{
+//////////////////////////////////////////////////////////////////////////////
+
+
+//////////////////////////////////////////////////////////////////////////////
+// SimpleAlloc Allocator
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Spec.Simple Allocator:
+..cat:Allocators
+..general:Class.Allocator
+..summary:General purpose allocator.
+..signature:Allocator< SimpleAlloc<ParentAllocator> >
+..param.ParentAllocator:An allocator that is used by the simple allocator to allocate memory.
+...default:@Tag.Default@
+...remarks:@Tag.Default@ used as allocator means that the default implementations
+of @Function.allocate@ and @Function.deallocate@ are used.
+*/
+
+template <typename TParentAllocator = Default>
+struct SimpleAlloc;
+
+//////////////////////////////////////////////////////////////////////////////
+
+
+typedef Allocator<SimpleAlloc<Default> > SimpleAllocator;
+
+template <typename TParentAllocator>
+struct Allocator<SimpleAlloc<TParentAllocator> >
+{
+ struct Header
+ {
+ Header * left;
+ Header * right;
+ size_t size;
+ };
+
+ Header * data_storages;
+ Holder<TParentAllocator> data_parent_allocator;
+
+ Allocator():
+ data_storages(0)
+ {
+SEQAN_CHECKPOINT
+ }
+
+ Allocator(TParentAllocator & parent_alloc):
+ data_storages(0)
+ {
+SEQAN_CHECKPOINT
+ setValue(data_parent_allocator, parent_alloc);
+ }
+
+ //Dummy copy
+ Allocator(Allocator const &):
+ data_storages(0)
+ {
+ }
+ inline Allocator &
+ operator = (Allocator const &)
+ {
+ clear(*this);
+ return *this;
+ }
+
+ ~Allocator()
+ {
+SEQAN_CHECKPOINT
+ clear(*this);
+ }
+};
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TParentAllocator>
+inline TParentAllocator &
+parentAllocator(Allocator<SimpleAlloc<TParentAllocator> > & me)
+{
+SEQAN_CHECKPOINT
+ return value(me.data_parent_allocator);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+/**
+.Function.Allocator#clear:
+..cat:Memory
+..summary:Deallocates all memory blocks.
+..signature:clear(allocator)
+..param.allocator:Allocator object.
+...type:Class.Allocator
+...concept:Concept.Allocator
+..remarks:This function deallocates all memory blocks
+that were allocated using @Function.allocate@ for $allocator$.
+The memory is not pooled but directly passed back to the heap manager.
+..see:Function.allocate
+..see:Function.deallocate
+*/
+template <typename TParentAllocator>
+void
+clear(Allocator<SimpleAlloc<TParentAllocator> > & me)
+{
+SEQAN_CHECKPOINT
+ typedef Allocator<SimpleAlloc<TParentAllocator> > TAllocator;
+
+ while (me.data_storages)
+ {
+ typename TAllocator::Header * next_storage = me.data_storages->right;
+ deallocate(parentAllocator(me), reinterpret_cast<char *>(me.data_storages), me.data_storages->size);
+ me.data_storages = next_storage;
+ }
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TParentAllocator, typename TValue, typename TSize, typename TUsage>
+inline void
+allocate(Allocator<SimpleAlloc<TParentAllocator> > & me,
+ TValue * & data,
+ TSize count,
+ Tag<TUsage> const)
+{
+SEQAN_CHECKPOINT
+ typedef Allocator<SimpleAlloc<TParentAllocator> > TAllocator;
+ typedef typename TAllocator::Header THeader;
+
+ //compute needed bytes
+ size_t bytes_needed = count * sizeof(TValue) + sizeof(THeader);
+
+ //allocate storage from parent
+ char * ptr;
+ allocate(parentAllocator(me), ptr, bytes_needed, TagAllocateStorage());
+
+ THeader * new_block = reinterpret_cast<THeader *>(ptr);
+ new_block->left = 0;
+ new_block->right = me.data_storages;
+ new_block->size = bytes_needed;
+
+ if (me.data_storages)
+ {
+ me.data_storages->left = new_block;
+ }
+ me.data_storages = new_block;
+
+ //return data
+ data = reinterpret_cast<TValue *>(ptr + sizeof(THeader));
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TParentAllocator, typename TValue, typename TSize, typename TUsage>
+inline void
+deallocate(Allocator<SimpleAlloc<TParentAllocator> > & me,
+ TValue * data,
+ TSize,
+ Tag<TUsage> const)
+{
+SEQAN_CHECKPOINT
+ typedef Allocator<SimpleAlloc<TParentAllocator> > TAllocator;
+ typedef typename TAllocator::Header THeader;
+
+ //update links
+ THeader & header = *(reinterpret_cast<THeader *>(data) - 1);
+ if (header.left)
+ {
+ header.left->right = header.right;
+ }
+ else
+ {
+ me.data_storages = header.right;
+ }
+ if (header.right)
+ {
+ header.right->left = header.left;
+ }
+
+ //deallocate storage using parent
+ char * ptr = reinterpret_cast<char *>(& header);
+ deallocate(parentAllocator(me), ptr, header.size);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+} //namespace SEQAN_NAMESPACE_MAIN
+
+#endif //#ifndef SEQAN_HEADER_...
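Note on the simple allocator above: each block is prefixed with a header that links it into a doubly linked list, which is what lets `clear` release every outstanding block in one pass. The following standalone sketch illustrates that bookkeeping under stated assumptions. `TrackingAllocator` and its `liveBlocks` counter are hypothetical and exist only to make the behavior observable; the sketch calls `operator new` and `operator delete` directly instead of going through a parent allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical allocator that prefixes every block with a list header.
struct TrackingAllocator {
    struct Header {
        Header * left;
        Header * right;
        std::size_t size;   // total bytes, header included
    };

    Header * storages;        // head of the list of live blocks
    std::size_t liveBlocks;   // illustration only: number of blocks in the list

    TrackingAllocator() : storages(0), liveBlocks(0) {}
    TrackingAllocator(TrackingAllocator const &) = delete;
    TrackingAllocator & operator=(TrackingAllocator const &) = delete;
    ~TrackingAllocator() { clear(); }

    void * allocate(std::size_t bytes) {
        const std::size_t total = bytes + sizeof(Header);
        char * raw = static_cast<char *>(::operator new(total));

        // Construct the header in front of the user data and link it in at the head.
        Header * block = new (raw) Header();
        block->left = 0;
        block->right = storages;
        block->size = total;
        if (storages) storages->left = block;
        storages = block;
        ++liveBlocks;

        return raw + sizeof(Header);
    }

    void deallocate(void * data) {
        // The header sits immediately before the pointer handed to the user.
        Header * block = static_cast<Header *>(data) - 1;
        if (block->left) block->left->right = block->right;
        else             storages = block->right;
        if (block->right) block->right->left = block->left;
        --liveBlocks;
        ::operator delete(static_cast<void *>(block));
    }

    // Release every block that is still linked, without the caller tracking them.
    void clear() {
        while (storages) {
            Header * next = storages->right;
            ::operator delete(static_cast<void *>(storages));
            storages = next;
            --liveBlocks;
        }
    }
};
```

The cost of this design is one header per allocation; the benefit is that a container can drop all of its memory with a single `clear` call, as described in the `Allocator` class remarks.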
--- /dev/null
+ /*==========================================================================
+ SeqAn - The Library for Sequence Analysis
+ http://www.seqan.de
+ ============================================================================
+ Copyright (C) 2007
+
+ This library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 3 of the License, or (at your option) any later version.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ ============================================================================
+ $Id: basic_allocator_singlepool.h,v 1.1 2008/08/25 16:20:01 langmead Exp $
+ ==========================================================================*/
+
+#ifndef SEQAN_HEADER_BASIC_ALLOCATOR_SINGLE_POOL_H
+#define SEQAN_HEADER_BASIC_ALLOCATOR_SINGLE_POOL_H
+
+namespace SEQAN_NAMESPACE_MAIN
+{
+//////////////////////////////////////////////////////////////////////////////
+// SinglePool Allocator
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Spec.Single Pool Allocator:
+..cat:Allocators
+..general:Class.Allocator
+..summary:Allocator that pools memory blocks of specific size.
+..signature:Allocator< SinglePool<SIZE, ParentAllocator> >
+..param.SIZE:Size of memory blocks that are pooled.
+...value:An unsigned integer with $SIZE >= sizeof(void *)$.
+..param.ParentAllocator:An allocator that is used by the pool allocator to allocate memory.
+...default:@Spec.Simple Allocator@
+...note:The single pool allocator only supports @Function.clear@ if this function is also implemented for $ParentAllocator$.
+..remarks:A pool allocator allocates several memory blocks at once.
+Freed blocks are not immediately deallocated but recycled in subsequent allocations.
+This way, the number of calls to the heap manager is reduced, and that speeds up memory management.
+...text:The single pool allocator only pools memory blocks of size $SIZE$.
+Blocks of other sizes are allocated and deallocated using an allocator of type $ParentAllocator$.
+...text:Using the single pool allocator for block sizes larger than a few KB is not advised.
+*/
+
+template <size_t SIZE, typename TParentAllocator = SimpleAllocator>
+struct SinglePool;
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <size_t SIZE, typename TParentAllocator>
+struct Allocator<SinglePool<SIZE, TParentAllocator> >
+{
+ enum
+ {
+ SIZE_PER_ITEM = SIZE,
+ ITEMS_PER_BLOCK = (SIZE_PER_ITEM < 0x0100) ? 0x01000 / SIZE_PER_ITEM : 16,
+ STORAGE_SIZE = SIZE * ITEMS_PER_BLOCK,
+
+ STORAGE_SIZE_MIN = SIZE
+ };
+
+ char * data_recycled_blocks;
+ char * data_current_begin;
+ char * data_current_end;
+ char * data_current_free;
+ Holder<TParentAllocator> data_parent_allocator;
+
+ Allocator()
+ {
+SEQAN_CHECKPOINT
+ data_recycled_blocks = data_current_end = data_current_free = 0;
+ //don't need to initialize data_current_begin
+ }
+
+ Allocator(size_t reserve_item_count)
+ {
+SEQAN_CHECKPOINT
+ data_recycled_blocks = 0;
+
+ size_t storage_size = (reserve_item_count * SIZE > STORAGE_SIZE_MIN) ? reserve_item_count * SIZE : STORAGE_SIZE_MIN;
+ allocate( parentAllocator( *this ), data_current_begin, storage_size );
+ data_current_end = data_current_begin + storage_size;
+ data_current_free = data_current_begin;
+ }
+
+ Allocator(TParentAllocator & parent_alloc)
+ {
+SEQAN_CHECKPOINT
+ setValue(data_parent_allocator, parent_alloc);
+
+ data_recycled_blocks = data_current_end = data_current_free = 0;
+ //don't need to initialize data_current_begin
+ }
+
+ Allocator(size_t reserve_item_count, TParentAllocator & parent_alloc)
+ {
+SEQAN_CHECKPOINT
+ data_recycled_blocks = 0;
+
+ setValue(data_parent_allocator, parent_alloc);
+
+ size_t storage_size = (reserve_item_count * SIZE > STORAGE_SIZE_MIN) ? reserve_item_count * SIZE : STORAGE_SIZE_MIN;
+ allocate( parentAllocator( *this ), data_current_begin, storage_size );
+ data_current_end = data_current_begin + storage_size;
+ data_current_free = data_current_begin;
+ }
+
+ //Dummy copy
+ Allocator(Allocator const &)
+ {
+ data_recycled_blocks = data_current_end = data_current_free = 0;
+ //don't need to initialize data_current_begin
+ }
+ inline Allocator &
+ operator = (Allocator const &)
+ {
+ clear(*this);
+ return *this;
+ }
+
+ ~Allocator()
+ {
+SEQAN_CHECKPOINT
+ clear(*this);
+ }
+};
+//////////////////////////////////////////////////////////////////////////////
+
+template <size_t SIZE, typename TParentAllocator>
+inline TParentAllocator &
+parentAllocator(Allocator<SinglePool<SIZE, TParentAllocator> > & me)
+{
+SEQAN_CHECKPOINT
+ return value(me.data_parent_allocator);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <size_t SIZE, typename TParentAllocator>
+void
+clear(Allocator<SinglePool<SIZE, TParentAllocator> > & me)
+{
+SEQAN_CHECKPOINT
+
+ me.data_recycled_blocks = me.data_current_end = me.data_current_free = 0;
+
+ clear(parentAllocator(me));
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <size_t SIZE, typename TParentAllocator, typename TValue, typename TSize, typename TUsage>
+inline void
+allocate(Allocator<SinglePool<SIZE, TParentAllocator> > & me,
+ TValue * & data,
+ TSize count,
+ Tag<TUsage> const tag_)
+{
+SEQAN_CHECKPOINT
+ typedef Allocator<SinglePool<SIZE, TParentAllocator> > TAllocator;
+ size_t bytes_needed = count * sizeof(TValue);
+
+ if (bytes_needed != TAllocator::SIZE_PER_ITEM)
+ {//no blocking
+ allocate(parentAllocator(me), data, count, tag_);
+ return;
+ }
+
+ char * ptr;
+ if (me.data_recycled_blocks)
+ {//use recycled
+ ptr = me.data_recycled_blocks;
+ me.data_recycled_blocks = * reinterpret_cast<char **>(ptr);
+ }
+ else
+ {//use new
+ ptr = me.data_current_free;
+ if (ptr + bytes_needed > me.data_current_end)
+ {//not enough free space in current storage: allocate new
+ allocate(parentAllocator(me), ptr, (size_t) TAllocator::STORAGE_SIZE, tag_);
+ me.data_current_begin = ptr;
+ me.data_current_end = ptr + TAllocator::STORAGE_SIZE;
+ }
+ me.data_current_free = ptr + bytes_needed;
+ }
+
+ data = reinterpret_cast<TValue *>(ptr);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <size_t SIZE, typename TParentAllocator, typename TValue, typename TSize, typename TUsage>
+inline void
+deallocate(Allocator<SinglePool<SIZE, TParentAllocator> > & me,
+ TValue * data,
+ TSize count,
+ Tag<TUsage> const tag_)
+{
+SEQAN_CHECKPOINT
+ typedef Allocator<SinglePool<SIZE, TParentAllocator> > TAllocator;
+
+ size_t bytes_needed = count * sizeof(TValue);
+
+ if (bytes_needed != TAllocator::SIZE_PER_ITEM)
+ {//no blocking
+ deallocate(parentAllocator(me), data, count, tag_);
+ return;
+ }
+
+ //link in recycling list
+ *reinterpret_cast<char **>(data) = me.data_recycled_blocks;
+ me.data_recycled_blocks = reinterpret_cast<char *>(data);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+//////////////////////////////////////////////////////////////////////////////
+// alternative Interface that takes a Type instead of a SIZE
+//////////////////////////////////////////////////////////////////////////////
+
+
+template <typename TValue, typename TParentAllocator = SimpleAllocator>
+struct SinglePool2;
+
+template <typename TValue, typename TParentAllocator>
+struct Allocator<SinglePool2<TValue, TParentAllocator> >
+{
+ Allocator<SinglePool<sizeof(TValue), TParentAllocator> > data_alloc;
+
+
+ Allocator(size_t reserve_item_count)
+ : data_alloc(reserve_item_count)
+ {
+ }
+
+ Allocator(TParentAllocator & parent_alloc)
+ : data_alloc(parent_alloc)
+ {
+ }
+
+ Allocator(size_t reserve_item_count, TParentAllocator & parent_alloc)
+ : data_alloc(reserve_item_count, parent_alloc)
+
+ {
+ }
+};
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TValue, typename TParentAllocator>
+inline TParentAllocator &
+parentAllocator(Allocator<SinglePool2<TValue, TParentAllocator> > & me)
+{
+SEQAN_CHECKPOINT
+ return parentAllocator(me.data_alloc);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TValue, typename TParentAllocator>
+void
+clear(Allocator<SinglePool2<TValue, TParentAllocator> > & me)
+{
+SEQAN_CHECKPOINT
+ clear(me.data_alloc);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+template <typename TValue, typename TParentAllocator, typename TValue2, typename TSize, typename TUsage>
+inline void
+allocate(Allocator<SinglePool2<TValue, TParentAllocator> > & me,
+ TValue2 * & data,
+ TSize count,
+ Tag<TUsage> const tag_)
+{
+SEQAN_CHECKPOINT
+ allocate(me.data_alloc, data, count, tag_);
+}
+
+template <typename TValue, typename TParentAllocator, typename TValue2, typename TSize, typename TUsage>
+inline void
+deallocate(Allocator<SinglePool2<TValue, TParentAllocator> > & me,
+ TValue2 * data,
+ TSize count,
+ Tag<TUsage> const tag_)
+{
+SEQAN_CHECKPOINT
+ deallocate(me.data_alloc, data, count, tag_);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+} //namespace SEQAN_NAMESPACE_MAIN
+
+#endif //#ifndef SEQAN_HEADER_...
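The recycling scheme described in the Single Pool Allocator remarks (carve fixed-size blocks out of a larger storage, and thread freed blocks into a singly linked list stored inside the freed blocks themselves) can be sketched in standalone C++. This is a sketch under stated assumptions rather than SeqAn's code: `FixedPool`, `ItemsPerStorage`, and `storageCount()` are hypothetical names, and the sketch keeps its own list of storages and uses `::operator new` where the original delegates to a parent allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical fixed-size pool; names are illustrative only.
template <std::size_t Size, std::size_t ItemsPerStorage = 16>
class FixedPool {
    static_assert(Size >= sizeof(void*),
                  "a free block must be able to hold a link pointer");
public:
    FixedPool() = default;
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;
    ~FixedPool() { for (char* s : storages_) ::operator delete(s); }

    void* allocate() {
        if (recycled_) {
            // Pop the head of the recycle list; the link to the next
            // free block is stored in the first bytes of the block.
            char* block = recycled_;
            std::memcpy(&recycled_, block, sizeof(char*));
            return block;
        }
        if (free_ == end_) {
            // Current storage is exhausted (or none exists yet).
            char* storage =
                static_cast<char*>(::operator new(Size * ItemsPerStorage));
            storages_.push_back(storage);
            free_ = storage;
            end_ = storage + Size * ItemsPerStorage;
        }
        char* block = free_;
        free_ += Size;
        return block;
    }

    void deallocate(void* p) {
        // Push the block onto the recycle list instead of freeing it.
        char* block = static_cast<char*>(p);
        std::memcpy(block, &recycled_, sizeof(char*));
        recycled_ = block;
    }

    std::size_t storageCount() const { return storages_.size(); }

private:
    char* recycled_ = nullptr;  // head of the list of freed blocks
    char* free_ = nullptr;      // next unused byte in the current storage
    char* end_ = nullptr;       // end of the current storage
    std::vector<char*> storages_;
};
```

The `static_assert` mirrors the documented requirement that `SIZE` be at least `sizeof(void *)`: a freed block has to be large enough to hold the link to the next free block.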
--- /dev/null
+ /*==========================================================================
+ SeqAn - The Library for Sequence Analysis
+ http://www.seqan.de
+ ============================================================================
+ Copyright (C) 2007
+
+ This library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 3 of the License, or (at your option) any later version.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ ============================================================================
+ $Id: basic_allocator_to_std.h,v 1.1 2008/08/25 16:20:01 langmead Exp $
+ ==========================================================================*/
+
+#ifndef SEQAN_HEADER_BASIC_ALLOCATOR_TO_STD_H
+#define SEQAN_HEADER_BASIC_ALLOCATOR_TO_STD_H
+
+
+namespace SEQAN_NAMESPACE_MAIN
+{
+
+//////////////////////////////////////////////////////////////////////////////
+//helper caller for calling functions that have same name as member functions
+
+template <typename TMe, typename TValue, typename TSize>
+inline void call_allocate(TMe & me, TValue * & data, TSize const count)
+{
+ allocate(me, data, count);
+}
+template <typename TMe, typename TValue, typename TSize>
+inline void call_deallocate(TMe & me, TValue * data, TSize const count)
+{
+ deallocate(me, data, count);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+//Filter that adapts seqan allocator to std allocator
+/**
+.Class.ToStdAllocator:
+..summary:Emulates a standard-conforming allocator.
+..signature:ToStdAllocator<THost, TValue>
+..param.THost:Type of the host allocator object.
+...text:This object is used to call @Function.allocate@ and @Function.deallocate@.
+..param.TValue:Type of allocated items.
+..remarks:The member functions $allocate$ and $deallocate$ of $ToStdAllocator$ call
+the (global) functions @Function.allocate@ and @Function.deallocate@, respectively. The global functions
+get an allocator object as their first argument. This allocator object is not the $ToStdAllocator$ object itself,
+but the host object that was given to the constructor.
+..remarks:
+..see:Function.allocate
+..see:Function.deallocate
+*/
+template <typename THost, typename TValue>
+struct ToStdAllocator
+{
+ typedef TValue value_type;
+ typedef value_type * pointer;
+ typedef value_type & reference;
+ typedef value_type const * const_pointer;
+ typedef value_type const & const_reference;
+
+// typedef typename THost::Size size_type;
+// typedef typename THost::Difference difference_type;
+ typedef size_t size_type;
+ typedef ptrdiff_t difference_type;
+
+/**
+.Memfunc.ToStdAllocator:
+..summary:Constructor
+..signature:ToStdAllocator(host)
+..class:Class.ToStdAllocator
+..param.host:The host object that is used as allocator for @Function.allocate@ and @Function.deallocate@.
+*/
+ ToStdAllocator(THost & host): m_host(& host)
+ {
+ }
+ ToStdAllocator(ToStdAllocator const & alloc): m_host(alloc.m_host)
+ {
+ }
+ ToStdAllocator & operator= (ToStdAllocator const & alloc)
+ {
+ m_host = alloc.m_host;
+ return *this;
+ }
+ ~ToStdAllocator()
+ {
+ }
+
+/**
+.Function.host:
+..summary:The object a given object depends on.
+..cat:Dependent Objects
+..signature:host(object)
+..param.object:An object.
+...type:Class.ToStdAllocator
+..returns:The host object.
+*/
+ friend THost & host(ToStdAllocator & me)
+ {
+ return *me.m_host;
+ }
+
+ pointer allocate(size_type count)
+ {
+ value_type * ptr;
+ call_allocate(*m_host, ptr, count);
+ return pointer(ptr);
+ }
+ pointer allocate(size_type count, const void *)
+ {
+ value_type * ptr;
+ call_allocate(*m_host, ptr, count);
+ return pointer(ptr);
+ }
+
+ void deallocate(pointer data, size_type count)
+ {
+ call_deallocate(*m_host, data, count);
+ }
+
+ void construct(pointer ptr, const_reference data)
+ {
+ new(ptr) TValue(data);
+ }
+
+ void destroy(pointer ptr)
+ {
+ ptr->~TValue();
+ }
+
+ pointer address(reference value) const
+ {
+ return (&value);
+ }
+ const_pointer address(const_reference value) const
+ {
+ return (&value);
+ }
+
+ size_type max_size() const
+ {
+ return ~size_type(0) / sizeof(value_type);
+ }
+
+ template<class TValue2>
+ struct rebind
+ {
+ typedef ToStdAllocator<THost, TValue2> other;
+ };
+
+ private:
+ THost * m_host;
+};
+//////////////////////////////////////////////////////////////////////////////
+
+
+
+//returns std-allocator type (for allocators)
+template <typename T, typename TData>
+struct StdAllocator
+{
+ typedef ToStdAllocator<T, TData> Type;
+};
+
+
+//////////////////////////////////////////////////////////////////////////////
+
+} //namespace SEQAN_NAMESPACE_MAIN
+
+
+//////////////////////////////////////////////////////////////////////////////
+
+#endif //#ifndef SEQAN_HEADER_...
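The adapter idea behind `ToStdAllocator` (a small, copyable object that holds a pointer to a host and forwards `allocate`/`deallocate` to it, so that standard containers can draw memory from the host) can be sketched against the C++11 allocator model. This is a sketch under stated assumptions, not the SeqAn class: `CountingHost` and `HostAdapter` are hypothetical names, and the host here exposes member functions instead of SeqAn's global `allocate`/`deallocate`.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical host that counts how often it is asked for memory.
struct CountingHost {
    std::size_t allocations = 0;
    std::size_t deallocations = 0;
    void* allocate(std::size_t bytes) {
        ++allocations;
        return ::operator new(bytes);
    }
    void deallocate(void* p) {
        ++deallocations;
        ::operator delete(p);
    }
};

// Hypothetical adapter: forwards to a host it does not own.
template <typename Host, typename T>
struct HostAdapter {
    typedef T value_type;

    // The value type is not the first template argument, so the
    // default rebinding of std::allocator_traits would not apply.
    template <typename U>
    struct rebind { typedef HostAdapter<Host, U> other; };

    explicit HostAdapter(Host& host) : host_(&host) {}
    template <typename U>
    HostAdapter(const HostAdapter<Host, U>& other) : host_(other.host()) {}

    T* allocate(std::size_t count) {
        return static_cast<T*>(host_->allocate(count * sizeof(T)));
    }
    void deallocate(T* data, std::size_t) { host_->deallocate(data); }

    Host* host() const { return host_; }

private:
    Host* host_;
};

// Two adapters are interchangeable when they share the same host.
template <typename Host, typename T, typename U>
bool operator==(const HostAdapter<Host, T>& a, const HostAdapter<Host, U>& b) {
    return a.host() == b.host();
}
template <typename Host, typename T, typename U>
bool operator!=(const HostAdapter<Host, T>& a, const HostAdapter<Host, U>& b) {
    return !(a == b);
}
```

A container is then constructed with an adapter instance, for example `std::vector<int, HostAdapter<CountingHost, int> > v(alloc);`, and every reallocation of `v` goes through the host. The host must outlive all containers that use it, since the adapter only stores a pointer.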
--- /dev/null
+ /*==========================================================================
+ SeqAn - The Library for Sequence Analysis
+ http://www.seqan.de
+ ============================================================================
+ Copyright (C) 2007
+
+ This library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 3 of the License, or (at your option) any later version.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ ============================================================================
+ $Id: basic_alphabet_interface.h,v 1.1 2008/08/25 16:20:01 langmead Exp $
+ ==========================================================================*/
+
+#ifndef SEQAN_HEADER_BASIC_ALPHABET_INTERFACE_H
+#define SEQAN_HEADER_BASIC_ALPHABET_INTERFACE_H
+
+#include <new>
+
+namespace SEQAN_NAMESPACE_MAIN
+{
+//////////////////////////////////////////////////////////////////////////////
+//IsSimple
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Metafunction.IsSimple:
+..summary:Tests whether a type is simple.
+..signature:IsSimple<T>::Type
+..param.T:Type that is tested.
+..returns.param.Type:@Tag.Logical Values.True@, if $T$ is a simple type, @Tag.Logical Values.False@ otherwise.
+...default:@Tag.Logical Values.False@
+..remarks:A simple type is a type that does not need constructors to be created,
+a destructor to be destroyed, and copy assignment operators or copy constructors
+to be copied. All POD ("plain old data") types are simple, but some
+non-POD types could be simple too, e.g. some specializations of @Class.SimpleType@.
+..see:Class.SimpleType
+*/
+
+template <typename T>
+struct _IsSimple {
+ typedef False Type;
+};
+
+template <typename T>
+struct IsSimple:
+ public _IsSimple<T> {};
+template <typename T>
+struct IsSimple<T const>:
+ public IsSimple<T> {};
+
+//////////////////////////////////////////////////////////////////////////////
+//very basic Alphabets
+
+typedef char Ascii;
+typedef unsigned char Byte;
+typedef wchar_t Unicode;
+
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.valueConstruct:
+..cat:Content Manipulation
+..summary:Constructs an object at specified position.
+..signature:valueConstruct(iterator [, param [, move_tag] ])
+..param.iterator:Pointer or iterator to position where the object should be constructed.
+..param.param:Parameter that is forwarded to constructor. (optional)
+..param.move_tag:Instance of the @Tag.Move Switch.move switch tag@. (optional)
+...remarks:If the @Tag.Move Switch.move switch tag@ is specified, it is forwarded to the constructor,
+so the constructed object must support move construction.
+..remarks:The type of the constructed object is the @Metafunction.Value.value type@ of $iterator$.
+*/
+
+struct _ValueConstructor
+{
+ template <typename TIterator>
+ static inline void
+ construct(TIterator it)
+ {
+ typedef typename Value<TIterator>::Type TValue;
+ new( & value(it) ) TValue;
+ }
+
+ template <typename TIterator, typename TParam>
+ static inline void
+ construct(TIterator it,
+ TParam const & param_)
+ {
+ typedef typename Value<TIterator>::Type TValue;
+ new( & value(it) ) TValue(param_);
+ }
+
+ template <typename TIterator, typename TParam>
+ static inline void
+ construct(TIterator it,
+ TParam const & param_,
+ Move tag)
+ {
+ typedef typename Value<TIterator>::Type TValue;
+ new( & value(it) ) TValue(param_, tag);
+ }
+};
+
+struct _ValueConstructorProxy
+{
+ template <typename TIterator>
+ static inline void construct(TIterator) {}
+
+ template <typename TIterator, typename TParam>
+ static inline void construct(TIterator, TParam const &) {}
+
+ template <typename TIterator, typename TParam>
+ static inline void construct(TIterator, TParam const &, Move) {}
+};
+
+//____________________________________________________________________________
+
+struct _ValueDestructor
+{
+ template <typename TIterator>
+ static inline void
+ destruct(TIterator it)
+ {
+ typedef typename Value<TIterator>::Type TValue;
+ value(it).~TValue();
+ }
+};
+struct _ValueDestructorProxy
+{
+ template <typename TIterator>
+ static inline void destruct(TIterator) {}
+};
+
+//____________________________________________________________________________
+
+template <typename TIterator>
+inline void
+valueConstruct(TIterator it)
+{
+SEQAN_CHECKPOINT
+ typedef typename IF<
+ TYPECMP<
+ typename Value<TIterator>::Type &,
+ typename Reference<TIterator>::Type
+ >::VALUE,
+ // THEN
+ _ValueConstructor, // true, types are equal
+ // ELSE
+ _ValueConstructorProxy // false, types differ -> value() returns a proxy
+ >::Type TConstructor;
+
+ TConstructor::construct(it);
+}
+
+template <typename TIterator, typename TParam>
+inline void
+valueConstruct(TIterator it,
+ TParam const & param_)
+{
+SEQAN_CHECKPOINT
+ typedef typename IF<
+ TYPECMP<
+ typename Value<TIterator>::Type &,
+ typename Reference<TIterator>::Type
+ >::VALUE,
+ // THEN
+ _ValueConstructor, // true, types are equal
+ // ELSE
+ _ValueConstructorProxy // false, types differ -> value() returns a proxy
+ >::Type TConstructor;
+
+ TConstructor::construct(it, param_);
+}
+
+template <typename TIterator, typename TParam>
+inline void
+valueConstruct(TIterator it,
+ TParam const & param_,
+ Move tag)
+{
+SEQAN_CHECKPOINT
+ typedef typename IF<
+ TYPECMP<
+ typename Value<TIterator>::Type &,
+ typename Reference<TIterator>::Type
+ >::VALUE,
+ // THEN
+ _ValueConstructor, // true, types are equal
+ // ELSE
+ _ValueConstructorProxy // false, types differ -> value() returns a proxy
+ >::Type TConstructor;
+
+ TConstructor::construct(it, param_, tag);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.valueDestruct:
+..cat:Content Manipulation
+..summary:Destroys an object at specified position.
+..signature:valueDestruct(iterator)
+..param.iterator:Pointer or iterator to position where the object should be destructed.
+..remarks:The type of the destructed object is the @Metafunction.Value.value type@ of $iterator$.
+..see:Function.valueConstruct
+*/
+template <typename TIterator>
+inline void
+valueDestruct(TIterator it)
+{
+SEQAN_CHECKPOINT
+ typedef typename IF<
+ TYPECMP<
+ typename Value<TIterator>::Type &,
+ typename Reference<TIterator>::Type
+ >::VALUE,
+ // THEN
+ _ValueDestructor, // true, types are equal
+ // ELSE
+ _ValueDestructorProxy // false, types differ -> value() returns a proxy
+ >::Type TDestructor;
+
+ TDestructor::destruct(it);
+}
+
+
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.valueConstructMove:
+..cat:Content Manipulation
+..summary:Move constructs an object at specified position.
+..signature:valueConstructMove(iterator, param)
+..param.iterator:Pointer or iterator to position where the object should be constructed.
+..param.param:Parameter that is moved to the new constructed object.
+..remarks:The type of the constructed object is the @Metafunction.Value.value type@ of $iterator$.
+..remarks:The default implementation just calls @Function.valueConstruct@.
+*/
+template <typename TIterator, typename TValue>
+inline void
+valueConstructMove(TIterator it, TValue const & value)
+{
+ valueConstruct(it, value);
+}
+
+
+//////////////////////////////////////////////////////////////////////////////
+//////////////////////////////////////////////////////////////////////////////
+//////////////////////////////////////////////////////////////////////////////
+//arrayConstruct
+//////////////////////////////////////////////////////////////////////////////
+/**
+.Function.arrayConstruct:
+..cat:Array Handling
+..summary:Construct objects in a given memory buffer.
+..signature:arrayConstruct(begin, end [, value])
+..param.begin:Iterator to the begin of the range that is to be constructed.
+..param.end:Iterator behind the end of the range.
+..param.value:Argument that is forwarded to the constructor. (optional)
+...text:An appropriate constructor is required.
+If $value$ is not specified, the default constructor is used.
+..remarks:The type of the constructed objects is the @Metafunction.Value.value type@
+of $begin$ and $end$.
+..see:Function.arrayDestruct
+..see:Function.arrayConstructCopy
+..see:Function.arrayFill
+..see:Class.SimpleType
+..see:Function.valueConstruct
+*/
+template<typename TIterator1, typename TIterator2>
+inline void
+_arrayConstruct_Default(TIterator1 begin_,
+ TIterator2 end_)
+{
+SEQAN_CHECKPOINT
+ while (begin_ != end_)
+ {
+ valueConstruct(begin_);
+ ++begin_;
+ }
+}
+template<typename TIterator1, typename TIterator2>
+inline void
+arrayConstruct(TIterator1 begin_,
+ TIterator2 end_)
+{
+SEQAN_CHECKPOINT
+ _arrayConstruct_Default(begin_, end_);
+}
+
+//____________________________________________________________________________
+
+template<typename TIterator1, typename TIterator2, typename TParam>
+inline void
+_arrayConstruct_Default(TIterator1 begin_,
+ TIterator2 end_,
+ TParam const & param_)
+{
+SEQAN_CHECKPOINT
+ while (begin_ != end_)
+ {
+ valueConstruct(begin_, param_);
+ ++begin_;
+ }
+}
+template<typename TIterator1, typename TIterator2, typename TParam>
+inline void
+arrayConstruct(TIterator1 begin_,
+ TIterator2 end_,
+ TParam const & param_)
+{
+SEQAN_CHECKPOINT
+ _arrayConstruct_Default(begin_, end_, param_);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+//arrayConstructCopy
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.arrayConstructCopy:
+..cat:Array Handling
+..summary:Copy constructs an array of objects into a given memory buffer.
+..signature:arrayConstructCopy(source_begin, source_end, target)
+..param.source_begin:Iterator to the first element of the source range.
+..param.source_end:Iterator behind the last element of the source range.
+...text:$source_end$ should have the same type as $source_begin$.
+..param.target:Pointer to the memory block the new objects will be constructed in.
+...text:The type of $target$ specifies the type of the constructed objects:
+If $T*$ is the type of $target$, then the function constructs objects of type $T$.
+...text:The memory buffer should be large enough to store $source_end$ - $source_begin$ objects.
+An appropriate (copy-) constructor that constructs a target object given a source object is required.
+..see:Function.arrayDestruct
+..see:Function.arrayCopyForward
+..see:Function.arrayCopy
+..see:Function.valueConstruct
+*/
+template<typename TTarget, typename TSource1, typename TSource2>
+inline void
+_arrayConstructCopy_Default(TSource1 source_begin,
+ TSource2 source_end,
+ TTarget target_begin)
+{
+SEQAN_CHECKPOINT
+ while (source_begin != source_end)
+ {
+ valueConstruct(target_begin, *source_begin);
+ ++source_begin;
+ ++target_begin;
+ }
+}
+
+template<typename TTarget, typename TSource1, typename TSource2>
+inline void
+arrayConstructCopy(TSource1 source_begin,
+ TSource2 source_end,
+ TTarget target_begin)
+{
+SEQAN_CHECKPOINT
+ _arrayConstructCopy_Default(source_begin, source_end, target_begin);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+//arrayConstructMove
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.arrayConstructMove:
+..cat:Array Handling
+..summary:Move constructs an array of objects into a given memory buffer.
+..signature:arrayConstructMove(source_begin, source_end, target)
+..param.source_begin:Iterator to the first element of the source range.
+..param.source_end:Iterator behind the last element of the source range.
+...text:$source_end$ should have the same type as $source_begin$.
+..param.target:Pointer to the memory block the new objects will be constructed in.
+...text:The type of $target$ specifies the type of the constructed objects:
+If $T*$ is the type of $target$, then the function constructs objects of type $T$.
+...text:The memory buffer should be large enough to store $source_end$ - $source_begin$ objects.
+An appropriate move constructor that constructs a target object given a source object is required.
+..see:Function.arrayDestruct
+..see:Function.arrayConstructCopy
+..see:Function.arrayMoveForward
+..see:Function.arrayMove
+..see:Function.valueConstruct
+*/
+template<typename TTarget, typename TSource1, typename TSource2>
+inline void
+_arrayConstructMove_Default(TSource1 source_begin,
+ TSource2 source_end,
+ TTarget target_begin)
+{
+SEQAN_CHECKPOINT
+ while (source_begin < source_end)
+ {
+ valueConstructMove(target_begin, *source_begin);
+ ++source_begin;
+ ++target_begin;
+ }
+}
+
+template<typename TTarget, typename TSource1, typename TSource2>
+inline void
+arrayConstructMove(TSource1 source_begin,
+ TSource2 source_end,
+ TTarget target_begin)
+{
+SEQAN_CHECKPOINT
+ _arrayConstructMove_Default(source_begin, source_end, target_begin);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+//arrayDestruct
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.arrayDestruct:
+..cat:Array Handling
+..summary:Destroys an array of objects.
+..signature:arrayDestruct(begin, end)
+..param.begin:Iterator to the begin of the range that is to be destructed.
+..param.end:Iterator behind the end of the range.
+..remarks:This function does not deallocate the memory.
+..see:Class.SimpleType
+..see:Function.valueDestruct
+*/
+template<typename TIterator1, typename TIterator2>
+inline void
+_arrayDestruct_Default(TIterator1 begin_,
+ TIterator2 end_)
+{
+SEQAN_CHECKPOINT
+ while (begin_ != end_)
+ {
+ valueDestruct(begin_);
+ ++begin_;
+ }
+}
+template<typename TIterator1, typename TIterator2>
+inline void
+arrayDestruct(TIterator1 begin_,
+ TIterator2 end_)
+{
+SEQAN_CHECKPOINT
+ _arrayDestruct_Default(begin_, end_);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+//arrayFill
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.arrayFill:
+..cat:Array Handling
+..summary:Assigns one object to each element of a range.
+..signature:arrayFill(begin, end, value)
+..param.begin:Iterator to the begin of the range that is to be filled.
+..param.end:Iterator behind the end of the range.
+..param.value:Argument that is assigned to all objects in the range.
+..remarks:All objects in the range from $begin$ up to, but not including, $end$ are set to $value$.
+..see:Function.arrayCopy
+..see:Function.arrayCopyForward
+*/
+template<typename TIterator1, typename TIterator2, typename TValue>
+inline void
+arrayFill(TIterator1 begin_,
+ TIterator2 end_,
+ TValue const & value)
+{
+SEQAN_CHECKPOINT
+ ::std::fill_n(begin_, end_ - begin_, value);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+//arrayCopyForward
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.arrayCopyForward:
+..cat:Array Handling
+..summary:Copies a range of objects into another range of objects starting from the first element.
+..signature:arrayCopyForward(source_begin, source_end, target)
+..param.source_begin:Iterator to the first element of the source array.
+..param.source_end:Iterator behind the last element of the source array.
+...text:$source_end$ must have the same type as $source_begin$.
+..param.target:Iterator to the first element of the target array.
+...text:The target capacity should be at least as long as the source range.
+..remarks.note:Be careful if the source and target ranges overlap, because in this case
+ some source elements could be accidentally overwritten before they are copied.
+..remarks:If there is no need for the source elements to persist, consider using
+@Function.arrayMoveForward@ instead to improve performance.
+..see:Class.SimpleType
+*/
+template<typename TTarget, typename TSource1, typename TSource2>
+inline void
+_arrayCopyForward_Default(TSource1 source_begin,
+ TSource2 source_end,
+ TTarget target_begin)
+{
+SEQAN_CHECKPOINT
+ ::std::copy(source_begin, source_end, target_begin);
+}
+template<typename TTarget, typename TSource1, typename TSource2>
+inline void
+arrayCopyForward(TSource1 source_begin,
+ TSource2 source_end,
+ TTarget target_begin)
+{
+SEQAN_CHECKPOINT
+ _arrayCopyForward_Default(source_begin, source_end, target_begin);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+//arrayCopyBackward
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+.Function.arrayCopyBackward:
+..cat:Array Handling
+..summary:Copies a range of objects into another range of objects starting from the last element.
+..signature:arrayCopyBackward(source_begin, source_end, target)
+..param.source_begin:Iterator to the first element of the source array.
+..param.source_end:Iterator behind the last element of the source array.
+...text:$source_end$ must have the same type as $source_begin$.
+..param.target:Iterator to the first element of the target array.
+...text:The target capacity should be at least as long as the source range.
+..remarks.note:Be careful if the source and target ranges overlap, because in this case
+ some source elements could be accidentally overwritten before they are copied.
+..remarks.text:If source and target do not overlap, consider using the function
+@Function.arrayCopyForward@ instead, which is faster in some cases.
+..remarks:If there is no need for the source elements to persist, consider using
+@Function.arrayMoveBackward@ instead to improve performance.
+..remarks.note:The semantics of this function's argument $target$ differ from those of the arguments of $::std::copy_backward$.
+..see:Function.arrayCopyForward
+..see:C