(a) is the default in the current release of ERANGE.
Simply proceed to RUNNING THE PEAK FINDER for (a) and
-(a). You can ignore multireads (b) by using the -nomulti
+(a). You can ignore multireads (b) by using the --nomulti
flag with findall.py. For (c), use weighMultireads.py
to weigh multireads based on the unique reads in the
respective radius of each potential location. Once run,
To run the peak finder without read shifting, use the
following command:
-python $ERANGEPATH/findall.py label chip.rds chip.regions.txt -control control.rds -listPeak -revbackground
+python $ERANGEPATH/findall.py label chip.rds chip.regions.txt --control control.rds --listPeak --revbackground
which will run the peak finder on chip.rds / control.rds,
store the enriched region coordinates in chip.regions.txt,
You will *NEED* to change some of the default parameters
if working in smaller genomes (e.g. use smaller -spacing),
if working with certain types of IPs such as histones and
-polymerases (test with and without -notrim and
--nodirectionality), if working with rather weak IPs
-(e.g. -minimum and -ratio), or if working with larger
+polymerases (test with and without --notrim and
+--nodirectionality), if working with rather weak IPs
+(e.g. --minimum and --ratio), or if working with larger
fragment sizes (see the paragraph below discussing read
shifting).
findall.py returns a per-peak p-value. By default, this
is calculated using a Poisson distribution of peak RPMs
-(or counts, if using -raw) for each chromosome in the IP.
+(or counts, if using --raw) for each chromosome in the IP.
P-value calculations can be turned off using
-'-pvalue none '. Alternatively, the p-value can be
+'--pvalue none '. Alternatively, the p-value can be
calculated from the background using the option
-'-pvalue back ', which must be combined with the option
--revbackground.
+'--pvalue back ', which must be combined with the option
+--revbackground.
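
To make the per-peak p-value idea concrete, here is a minimal sketch of a
Poisson upper-tail calculation. It is not the code used by findall.py: the
function name poisson_upper_tail and the example numbers are hypothetical,
and it assumes an integer read count (as with --raw) and a mean that has
already been estimated for the chromosome.

```python
import math


def poisson_upper_tail(observed, mean):
    """Return P(X >= observed) for X ~ Poisson(mean).

    Hypothetical helper for illustration only; `observed` must be a
    non-negative integer count and `mean` the expected count.
    """
    if observed <= 0:
        return 1.0
    # Accumulate P(X = k) for k = 0 .. observed-1, then take the complement.
    term = math.exp(-mean)  # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= mean / k  # P(X = k) derived from P(X = k-1)
        cdf += term
    return max(0.0, 1.0 - cdf)


# Hypothetical numbers: a peak with 12 reads where the expected count is 4.
p_value = poisson_upper_tail(12, 4.0)
print("p-value: %.3g" % p_value)
```

A small p-value here means that a peak this tall would rarely arise if peak
heights simply followed a Poisson distribution with the given mean.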
By default, findall.py does not try to adjust the location
of the reads based on half the size of the expected fragment
length (the "shift"). If you believe that you need to shift
your peaks, findall.py can try to pick the best shift based
on the best shift for strong sites using the parameter
-'-shift learn '. You can also either manually specify a
-shift value using '-shift #bp ' or ou can calculate a
-"best shift" for each region using '-autoshift'. If you
+'--shift learn '. You can also either manually specify a
+shift value using '--shift #bp ' or you can calculate a
+"best shift" for each region using '--autoshift'. If you
need to use the shift options, the recommended usage is:
-(i) first run findall.py with '-shift learn ', which will
+(i) first run findall.py with '--shift learn ', which will
pick a shift if there are at least 30 regions that meet
its training criteria.
(ii) if (i) couldn't pick a shift, run findall.py with
--autoshift and -reportshift
+--autoshift and --reportshift
(iii) look at the mode (most common #) for the shift
-(iv) rerun findall.py with -shift #bp where #bp is the mode
+(iv) rerun findall.py with --shift #bp where #bp is the mode
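
Step (iii) can be sketched in a few lines of Python. This is only an
illustration: it assumes the per-region shift values have already been
collected into a list by hand or by your own parsing (the output layout of
the shift report is not described here), and the name most_common_shift and
the sample values are hypothetical.

```python
from collections import Counter


def most_common_shift(shifts):
    """Return the most frequent shift value (the mode), in bp.

    Hypothetical helper for illustration; `shifts` is a list of integer
    per-region shift values gathered from a previous run.
    """
    if not shifts:
        raise ValueError("no shift values supplied")
    counts = Counter(shifts)
    best = max(counts.values())
    # On a tie, choose the smallest shift so the result is deterministic.
    return min(value for value, n in counts.items() if n == best)


# Hypothetical per-region shift values.
region_shifts = [35, 40, 40, 45, 40, 35, 50]
shift_bp = most_common_shift(region_shifts)
print("--shift %d" % shift_bp)
```

The printed value is the argument to pass in step (iv).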
If you are storing the RDS files on a network-mounted
-directory, make sure to use '-cache XXXXX' to enable
+directory, make sure to use '--cache XXXXX' to enable
local caching, where XXXXX is as large as appropriate as
described in section 9 of README.build-rds .
RELEASE HISTORY
+version 3.2 November 2010 - updated command line options
version 3.1 February 2009 - support for read shifting
version 3.0 February 2009 - support for UCSC narrowPeak format in regiontobed.py
version 3.0rc1 December 2008 - added parameter to control peak-trimming