MACS 1.4 --tisize error

143 views

Skip to first unread message

Alberto Termanini

unread,

Sep 22, 2010, 12:45:21 PM9/22/10

to macs-ann...@googlegroups.com

Hi to all!

Using MACS 1.4 , I've found an error that doesn't happen with previous versions of MACS.

It seems that MACS 1.4 is not able to parse the --tsize option (or s option) and/or to auto-set
the tsize parameter.

Below I report the output with MACS 1.4 or with MACS previous version.

Anyone should help me?

Many thank

Best,

Alberto Termanini
IFOM-IEO Campus
Italy

macs14 --format=SAM -t /scratch/freearea/Alberto_res/TESTPROG/s6.sam -c /scratch/freearea/Alberto_res/TESTPROG/s0.sam --tsize=35 --name=s6_s0

INFO @ Wed, 22 Sep 2010 18:16:24:
# ARGUMENTS LIST:
# name = s6_s0
# format = SAM
# ChIP-seq file = /scratch/freearea/Alberto_res/TESTPROG/s6.sam
# control file = /scratch/freearea/Alberto_res/TESTPROG/s0.sam
# effective genome size = 2.70e+09
# band width = 300
# model fold = 10,30
# pvalue cutoff = 1.00e-05
# Range for calculating regional lambda is: 1000 bps and 10000 bps

INFO @ Wed, 22 Sep 2010 18:16:24: #1 read tag files...
INFO @ Wed, 22 Sep 2010 18:16:24: #1 read treatment tags...
Traceback (most recent call last):
File "/usr/bin/macs14", line 336, in <module>
   main()
File "/usr/bin/macs14", line 58, in main
   (treat, control) = load_tag_files_options (options)
File "/usr/bin/macs14", line 305, in load_tag_files_options
   ttsize = tp.tsize()
File "/usr/lib/python2.6/site-packages/MACS14/IO/Parser.py", line 655, in tsize
   return int(s/n)
ZeroDivisionError: integer division or modulo by zero
-clustershell-3.2$

macs --format=SAM -t /scratch/freearea/Alberto_res/TESTPROG/s6.sam -c /scratch/freearea/Alberto_res/TESTPROG/s0.sam --tsize=35 --name=s6_s0

INFO @ Wed, 22 Sep 2010 18:14:42:
# ARGUMENTS LIST:
# name = s6_s0
# format = SAM
# ChIP-seq file = /scratch/freearea/Alberto_res/TESTPROG/s6.sam
# control file = /scratch/freearea/Alberto_res/TESTPROG/s0.sam
# effective genome size = 2.70e+09
# tag size = 35
# band width = 300
# model fold = 32
# pvalue cutoff = 1.00e-05
# Ranges for calculating regional lambda are : peak_region,1000,5000,10000

Davide Cittaro

unread,

Sep 23, 2010, 10:34:32 AM9/23/10

to macs-ann...@googlegroups.com

Hi MACS users,

On Sep 22, 2010, at 6:45 PM, Alberto Termanini wrote:

Hi to all!

Using MACS 1.4 , I've found an error that doesn't happen with previous versions of MACS.

Indeed there's an issue with the tsize estimate and the SAM parser. The original implementation:

640 def tsize (self):

641 s = 0

642 n = 0

643 m = 0

644 while n<10 and m<1000:

645 m += 1

646 thisline = self.fhd.readline()

647 (chromosome,fpos,strand) = self.__fw_parse_line(thisline)

648 if not fpos or not chromosome:

649 continue

650 thisline = thisline.rstrip()

651 thisfields = thisline.split("\t")

652 s += len(thisfields[9])

653 n += 1

654 self.fhd.seek(0)

655 return int(s/n)

if the read is not aligned, line 647 doesn't return a chromosome, the loop at line 648 skips to the next iteration. If one has more than 1000 unaligned reads at the beginning of the file, n remain 0 and the division raises an error. The same could happen if there are more than 1000 headers (as in zebrafish). I've applied this:

$ diff -u !$

diff -u lib/IO/Parser.py*

--- lib/IO/Parser.py 2010-09-23 16:31:59.000000000 +0200

+++ lib/IO/Parser.py.old 2010-09-23 16:31:15.000000000 +0200

@@ -641,13 +641,11 @@

s = 0

n = 0

m = 0

- c = 1000

- while n<10 and m<c:

+ while n<10 and m<1000:

m += 1

thisline = self.fhd.readline()

(chromosome,fpos,strand) = self.__fw_parse_line(thisline)

if not fpos or not chromosome:

- if c - m < 10: c += 1000

continue

thisline = thisline.rstrip()

thisfields = thisline.split("\t")

When the checked lines is close to the limit (c = 1000) but there are less than required reads (n), extends the limit to other 1000...

Besides, it may be useful to re-enable options.tsize parsing.

--
You received this message because you are subscribed to the Google Groups "MACS announcement" group.
To post to this group, send email to macs-ann...@googlegroups.com.
To unsubscribe from this group, send email to macs-announcem...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/macs-announcement?hl=en.

Davide Cittaro

Cogentech - Consortium for Genomic Technologies

via adamello, 16

20139 Milano

Italy

tel.: +39(02)574303007

e-mail: davide....@ifom-ieo-campus.it

Reply all

Reply to author

Forward

0 new messages