MACS 1.4 --tisize error

142 views
Skip to first unread message

Alberto Termanini

unread,
Sep 22, 2010, 12:45:21 PM9/22/10
to macs-ann...@googlegroups.com
Hi to all!

Using MACS 1.4 , I've found an error that doesn't happen with previous versions of MACS.

It seems that MACS 1.4 is not able to parse the --tsize option (or s option) and/or to auto-set
the tsize parameter.

Below I report the output with MACS 1.4 or with MACS previous version.

Anyone should help me?

Many thank

Best,

Alberto Termanini
IFOM-IEO Campus
Italy







macs14 --format=SAM  -t /scratch/freearea/Alberto_res/TESTPROG/s6.sam -c /scratch/freearea/Alberto_res/TESTPROG/s0.sam --tsize=35 --name=s6_s0

INFO  @ Wed, 22 Sep 2010 18:16:24: 
# ARGUMENTS LIST:
# name = s6_s0
# format = SAM
# ChIP-seq file = /scratch/freearea/Alberto_res/TESTPROG/s6.sam
# control file = /scratch/freearea/Alberto_res/TESTPROG/s0.sam
# effective genome size = 2.70e+09
# band width = 300
# model fold = 10,30
# pvalue cutoff = 1.00e-05
# Range for calculating regional lambda is: 1000 bps and 10000 bps

INFO  @ Wed, 22 Sep 2010 18:16:24: #1 read tag files... 
INFO  @ Wed, 22 Sep 2010 18:16:24: #1 read treatment tags... 
Traceback (most recent call last):
 File "/usr/bin/macs14", line 336, in <module>
   main()
 File "/usr/bin/macs14", line 58, in main
   (treat, control) = load_tag_files_options (options)
 File "/usr/bin/macs14", line 305, in load_tag_files_options
   ttsize = tp.tsize()
 File "/usr/lib/python2.6/site-packages/MACS14/IO/Parser.py", line 655, in tsize
   return int(s/n)
ZeroDivisionError: integer division or modulo by zero
-clustershell-3.2$ 









macs --format=SAM  -t /scratch/freearea/Alberto_res/TESTPROG/s6.sam -c /scratch/freearea/Alberto_res/TESTPROG/s0.sam --tsize=35 --name=s6_s0

INFO  @ Wed, 22 Sep 2010 18:14:42: 
# ARGUMENTS LIST:
# name = s6_s0
# format = SAM
# ChIP-seq file = /scratch/freearea/Alberto_res/TESTPROG/s6.sam
# control file = /scratch/freearea/Alberto_res/TESTPROG/s0.sam
# effective genome size = 2.70e+09
# tag size = 35
# band width = 300
# model fold = 32
# pvalue cutoff = 1.00e-05
# Ranges for calculating regional lambda are : peak_region,1000,5000,10000 

Davide Cittaro

unread,
Sep 23, 2010, 10:34:32 AM9/23/10
to macs-ann...@googlegroups.com
Hi MACS users, 

On Sep 22, 2010, at 6:45 PM, Alberto Termanini wrote:

Hi to all!

Using MACS 1.4 , I've found an error that doesn't happen with previous versions of MACS.


Indeed there's an issue with the tsize estimate and the SAM parser. The original implementation:

   640      def tsize (self):
   641          s = 0
   642          n = 0
   643          m = 0
   644          while n<10 and m<1000:
   645              m += 1
   646              thisline = self.fhd.readline()
   647              (chromosome,fpos,strand) = self.__fw_parse_line(thisline)
   648              if not fpos or not chromosome:
   649                  continue
   650              thisline = thisline.rstrip()
   651              thisfields = thisline.split("\t")
   652              s += len(thisfields[9])
   653              n += 1
   654          self.fhd.seek(0)
   655          return int(s/n)

if the read is not aligned, line 647 doesn't return a chromosome, the loop at line 648 skips to the next iteration. If one has more than 1000 unaligned reads at the beginning of the file, n remain 0 and the division raises an error. The same could happen if there are more than 1000 headers (as in zebrafish). I've applied this:

$ diff -u !$
diff -u lib/IO/Parser.py*
--- lib/IO/Parser.py 2010-09-23 16:31:59.000000000 +0200
+++ lib/IO/Parser.py.old 2010-09-23 16:31:15.000000000 +0200
@@ -641,13 +641,11 @@
         s = 0
         n = 0
         m = 0
-        c = 1000
-        while n<10 and m<c:
+        while n<10 and m<1000:
             m += 1
             thisline = self.fhd.readline()
             (chromosome,fpos,strand) = self.__fw_parse_line(thisline)
             if not fpos or not chromosome:
-                if c - m < 10: c += 1000
                 continue
             thisline = thisline.rstrip()
             thisfields = thisline.split("\t")
 
When the checked lines is close to the limit (c = 1000) but there are less than required reads (n), extends the limit to other 1000... 
Besides, it may be useful to re-enable options.tsize parsing.

d


--
You received this message because you are subscribed to the Google Groups "MACS announcement" group.
To post to this group, send email to macs-ann...@googlegroups.com.
To unsubscribe from this group, send email to macs-announcem...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/macs-announcement?hl=en.

/*
Davide Cittaro

Cogentech - Consortium for Genomic Technologies
via adamello, 16
20139 Milano
Italy

*/



Reply all
Reply to author
Forward
0 new messages