X-Git-Url: http://woldlab.caltech.edu/gitweb/?p=samtools.git;a=blobdiff_plain;f=samtools.1;h=c71fc879d8c5f590f71a90cd27f09573b1520e62;hp=e0595607196e577f15227c5cef8af0c9f51de2d6;hb=9f4bebab2e0917c676ae739b2d05cb22ad6c4ed5;hpb=78c7059ff49802b06e7318fb6ad9a0908e80903b diff --git a/samtools.1 b/samtools.1 index e059560..c71fc87 100644 --- a/samtools.1 +++ b/samtools.1 @@ -1,4 +1,4 @@ -.TH samtools 1 "21 November 2010" "samtools-0.1.11" "Bioinformatics tools" +.TH samtools 1 "21 April 2011" "samtools-0.1.16" "Bioinformatics tools" .SH NAME .PP samtools - Utilities for the Sequence Alignment/Map (SAM) format @@ -137,7 +137,7 @@ viewing the same reference sequence. .TP .B mpileup -samtools mpileup [-Bug] [-C capQcoef] [-r reg] [-f in.fa] [-l list] [-M capMapQ] [-Q minBaseQ] [-q minMapQ] in.bam [in2.bam [...]] +samtools mpileup [-EBug] [-C capQcoef] [-r reg] [-f in.fa] [-l list] [-M capMapQ] [-Q minBaseQ] [-q minMapQ] in.bam [in2.bam [...]] Generate BCF or pileup for one or multiple BAM files. Alignment records are grouped by sample identifiers in @RG header lines. If sample @@ -145,7 +145,10 @@ identifiers are absent, each input file is regarded as one sample. .B OPTIONS: .RS -.TP 8 +.TP 10 +.B -A +Do not skip anomalous read pairs in variant calling. +.TP .B -B Disable probabilistic realignment for the computation of base alignment quality (BAQ). BAQ is the Phred-scaled probability of a read base being @@ -159,11 +162,23 @@ being generated from the mapped position, the new mapping quality is about sqrt((INT-q)/INT)*INT. A zero value disables this functionality; if enabled, the recommended value for BWA is 50. [0] .TP +.BI -d \ INT +At a position, read maximally +.I INT +reads per input BAM. [250] +.TP +.B -D +Output per-sample read depth +.TP .BI -e \ INT Phred-scaled gap extension sequencing error probability. Reducing .I INT leads to longer indels. [20] .TP +.B -E +Extended BAQ computation. 
This option helps sensitivity especially for MNPs, but may hurt +specificity a little bit. +.TP .BI -f \ FILE The reference file [null] .TP @@ -180,9 +195,17 @@ is modeled as .IR INT * s / l . [100] .TP +.B -I +Do not perform INDEL calling +.TP .BI -l \ FILE File containing a list of sites where pileup or BCF is outputted [null] .TP +.BI -L \ INT +Skip INDEL calling if the average per-sample depth is above +.IR INT . +[250] +.TP .BI -o \ INT Phred-scaled gap open sequencing error probability. Reducing .I INT @@ -206,6 +229,9 @@ Only generate pileup in region .I STR [all sites] .TP +.B -S +Output per-sample Phred-scaled strand bias P-value +.TP .B -u Similar to .B -g @@ -223,6 +249,16 @@ with the header in This command is much faster than replacing the header with a BAM->SAM->BAM conversion. +.TP +.B cat +samtools cat [-h header.sam] [-o out.bam] [ ... ] + +Concatenate BAMs. The sequence dictionary of each input BAM must be identical, +although this command does not check this. This command uses a similar trick +to +.B reheader +which enables fast BAM concatenation. + .TP .B sort samtools sort [-no] [-m maxMem] @@ -249,7 +285,7 @@ Approximately the maximum required memory. [500000000] .TP .B merge -samtools merge [-nur] [-h inh.sam] [-R reg] [...] +samtools merge [-nur1f] [-h inh.sam] [-R reg] [...] Merge multiple sorted alignments. The header reference lists of all the input BAM files, and the @SQ headers of @@ -266,6 +302,12 @@ and the headers of other files will be ignored. .B OPTIONS: .RS .TP 8 +.B -1 +Use zlib compression level 1 to compress the output +.TP +.B -f +Force to overwrite the output file if present. +.TP 8 .BI -h \ FILE Use the lines of .I FILE @@ -277,17 +319,18 @@ replacing any header lines that would otherwise be copied from is actually in SAM format, though any alignment records it may contain are ignored.) 
.TP +.B -n +The input alignments are sorted by read names rather than by chromosomal +coordinates +.TP .BI -R \ STR Merge files in the specified region indicated by .I STR +[null] .TP .B -r Attach an RG tag to each alignment. The tag value is inferred from file names. .TP -.B -n -The input alignments are sorted by read names rather than by chromosomal -coordinates -.TP .B -u Uncompressed BAM output .RE @@ -355,7 +398,7 @@ Treat paired-end reads and single-end reads. .TP .B calmd -samtools calmd [-eubSr] [-C capQcoef] +samtools calmd [-EeubSr] [-C capQcoef] Generate the MD tag. If the MD tag is already present, this command will give a warning if the MD tag generated is different from the existing @@ -388,7 +431,58 @@ Coefficient to cap mapping quality of poorly mapped reads. See the command for details. [0] .TP .B -r -Compute the BQ tag without changing the base quality. +Compute the BQ tag (without -A) or cap base quality by BAQ (with -A). +.TP +.B -E +Extended BAQ calculation. This option trades specificity for sensitivity, though the +effect is minor. +.RE + +.TP +.B targetcut +samtools targetcut [-Q minBaseQ] [-i inPenalty] [-0 em0] [-1 em1] [-2 em2] [-f ref] + +This command identifies target regions by examining the continuity of read depth, computes +haploid consensus sequences of targets and outputs a SAM with each sequence corresponding +to a target. When option +.B -f +is in use, BAQ will be applied. This command is +.B only +designed for cutting fosmid clones from fosmid pool sequencing [Ref. Kitzman et al. (2010)]. +.RE + +.TP +.B phase +samtools phase [-AF] [-k len] [-b prefix] [-q minLOD] [-Q minBaseQ] + +Call and phase heterozygous SNPs. +.B OPTIONS: +.RS +.TP 8 +.B -A +Drop reads with ambiguous phase. +.TP 8 +.BI -b \ STR +Prefix of BAM output. When this option is in use, phase-0 reads will be saved in file +.BR STR .0.bam +and phase-1 reads in +.BR STR .1.bam. +Phase unknown reads will be randomly allocated to one of the two files. 
Chimeric reads +with switch errors will be saved in +.BR STR .chimeric.bam. +[null] +.TP +.B -F +Do not attempt to fix chimeric reads. +.TP +.BI -k \ INT +Maximum length for local phasing. [13] +.TP +.BI -q \ INT +Minimum Phred-scaled LOD to call a heterozygote. [40] +.TP +.BI -Q \ INT +Minimum base quality to be used in het calling. [13] .RE .TP @@ -629,6 +723,20 @@ mismatches. Applying this option usually helps .B BWA-short but may not other mappers. +.IP o 2 +Generate the consensus sequence for one diploid individual: + + samtools mpileup -uf ref.fa aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq + +.IP o 2 +Phase one individual: + + samtools calmd -AEur aln.bam ref.fa | samtools phase -b prefix - > phase.out + +The +.B calmd +command is used to reduce false heterozygotes around INDELs. + .IP o 2 Call SNPs and short indels for multiple diploid individuals: