X-Git-Url: http://woldlab.caltech.edu/gitweb/?p=erange.git;a=blobdiff_plain;f=docs%2FREADME.chip-seq;fp=docs%2FREADME.chip-seq;h=846a441a434f8e3cff4506108f27126855a2fb47;hp=6529a6fa1b9404c8d1d0b42484f83bc48c0f048c;hb=0d3e3112fd04c2e6b44a25cacef1d591658ad181;hpb=5e4ae21098dba3d1edcf11e7279da0d84c3422e4

diff --git a/docs/README.chip-seq b/docs/README.chip-seq
index 6529a6f..846a441 100644
--- a/docs/README.chip-seq
+++ b/docs/README.chip-seq
@@ -86,7 +86,7 @@ given radius
 (a) is the default in the current release of ERANGE. Simply
 proceed to RUNNING THE PEAK FINDER for (a) and
-(a). You can ignore multireads (b) by using the -nomulti
+(a). You can ignore multireads (b) by using the --nomulti
 flag with findall.py. For (c), use weighMultireads.py to
 weigh multireads based on a unique reads in the respective
 radius of each potential location. Once run,
@@ -98,7 +98,7 @@ proceed to the section below.
 To run the peak finder without read shifting, use the
 following command:
-python $ERANGEPATH/findall.py label chip.rds chip.regions.txt -control control.rds -listPeak -revbackground
+python $ERANGEPATH/findall.py label chip.rds chip.regions.txt --control control.rds --listPeak --revbackground
 which will run the peak finder on chip.rds / control.rds ,
 store the enriched region coordinates in chip.regions.txt,
@@ -119,40 +119,40 @@ fragment sizes, on the order of 40-60 bp.
 You will *NEED* to change some of the default parameters if
 working in smaller genomes (e.g. use smaller -spacing), if
 working with certain types of IPs such as histones and
-polymerases (test with and without -notrim and
--nodirectionality), if working with rather weak IPs
-(e.g. -minimum and -ratio), or if working with larger
+polymerases (test with and without --notrim and
+--nodirectionality), if working with rather weak IPs
+(e.g. --minimum and --ratio), or if working with larger
 fragment sizes (see the paragraph below discussing read
 shifting).
 findall.py returns a per-peak p-value.
 By default, this is calculated using a Poisson distribution of peak RPMs
-(or counts, if using -raw) for each chromosome in the IP.
+(or counts, if using --raw) for each chromosome in the IP.
 P-value calculations can be turned off using
-'-pvalue none '. Alternatively, the p-value can be
+'--pvalue none '. Alternatively, the p-value can be
 calculated from the background using the option
-'-pvalue back ', which must be combined with the option
--revbackground.
+'--pvalue back ', which must be combined with the option
+--revbackground.
 By default, findall.py does not try to adjust the location
 of the reads based on half the size of the expected
 fragment length (the "shift"). If you believe that you need
 to shift your peaks, findall.py can try to pick the best
 shift based on the best shift for strong sites using the
 parameter
-'-shift learn '. You can also either manually specify a
-shift value using '-shift #bp ' or ou can calculate a
-"best shift" for each region using '-autoshift'. If you
+'--shift learn '. You can also either manually specify a
+shift value using '--shift #bp ' or ou can calculate a
+"best shift" for each region using '--autoshift'. If you
 need to using the shift options, the recommended usage is:
-(i) first run findall.py with '-shift learn ', which will
+(i) first run findall.py with '--shift learn ', which will
 peak a shift if there are at least 30 regions that meet its
 training criteria.
 (ii) if (i) couldn't pick a shift, run findall.py with
--autoshift and -reportshift
+--autoshift and --reportshift
 (iii) look at the mode (most common #) for the shift
-(iv) rerun findall.py with -shift #bp where #bp is the mode
+(iv) rerun findall.py with --shift #bp where #bp is the mode
 If you are storing the RDS files on an network-mounted
-directory, make sure to use '-cache XXXXX' to enable
+directory, make sure to use '--cache XXXXX' to enable
 local caching, where is as large as appropriate as
 described in section 9 of README.build-rds .
@@ -223,6 +223,7 @@ for example.
 RELEASE HISTORY
+version 3.2 November 2010 - updated command line options
 version 3.1 February 2009 - support for read shifting
 version 3.0 February 2009 - support for UCSC narrowPeak format in regiontobed.py
 version 3.0rc1 December 2008 - added parameter to control peak-trimming