Error in 'htslibWrapper.ReadIterator.get'


Tim Jeske

Feb 18, 2016, 9:40:17 AM
to Platypus Users
Hello,

I tried Platypus because it is faster than the HaplotypeCaller and better at indel calling.
However, today I got the following error message:

Exception OverflowError: 'value too large to convert to short' in 'htslibWrapper.ReadIterator.get' ignored

I was able to track down the problem in the source code. It is due to CIGAR strings with H operations longer than the maximum value of short.
I think the problem can easily be fixed by using int instead of short to store the length of CIGAR string operations.
Or would it be better to remove such reads from the BAM file as Platypus is not able to deal with them anyway?
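
To illustrate the kind of overflow I mean, here is a minimal sketch (not the Platypus source, just an analogy): Python's struct module refuses to pack a value into a 16-bit signed short when it is out of range, much like the conversion that fails in the error above. The helper name fits_in_short is made up for this example.

```python
import struct

# Range of a 16-bit signed short, shown for reference only.
SHORT_MIN, SHORT_MAX = -(2 ** 15), 2 ** 15 - 1


def fits_in_short(length):
    """Return True if `length` can be stored in a 16-bit signed short."""
    try:
        struct.pack("<h", length)  # "h" = signed short, 2 bytes
        return True
    except struct.error:
        # Raised when the value is outside SHORT_MIN..SHORT_MAX.
        return False


print(fits_in_short(49))     # a length on the order of one short read
print(fits_in_short(40000))  # larger than SHORT_MAX
```

Any CIGAR operation length above SHORT_MAX would fail such a check.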

Best regards
Tim

Daniel Cooke

Feb 18, 2016, 9:56:16 AM
to Tim Jeske, Platypus Users
Hi Tim,

Is this Nanopore or PacBio data? The range of a short is approximately ±32,000 (-32,768 to 32,767), which is always sufficient for Illumina reads. I could probably make this an unsigned short to double the maximum size, but I’d be a little hesitant to increase it to an int; Platypus was only really intended for short-read Illumina data, so a substantial redesign would be needed to work with long-read data.
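
For reference, a small sketch of the ranges involved, assuming the usual 16-bit short and 32-bit int (the actual widths are platform-dependent in C). The helper int_range is hypothetical and only does the arithmetic:

```python
def int_range(bits, signed=True):
    """Return (min, max) for an integer type with the given bit width."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


# Assumed widths: short = 16 bits, int = 32 bits.
for name, bits, signed in [
    ("short", 16, True),
    ("unsigned short", 16, False),
    ("int", 32, True),
]:
    low, high = int_range(bits, signed)
    print(f"{name:>14}: {low} .. {high}")
```

Under these assumptions an unsigned short tops out at 65,535, while an int reaches about 2.1 billion.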

Best
Dan


Tim Jeske

Feb 18, 2016, 11:12:01 AM
to Platypus Users, tim.p...@gmail.com, dco...@well.ox.ac.uk
Hi Dan,

Thank you for your fast answer.

Actually, it is RNA-Seq data generated on an Illumina platform. The reads have an average length of 49 bases.
They were mapped using the STAR aligner. As I didn't generate the BAM files myself, I'm not sure why STAR produced such mappings or whether they are biologically meaningful.
I even found reads with the following CIGAR string: 13M542361H
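
A minimal sketch of how such reads can be spotted, assuming the CIGAR string has already been extracted as text (for example from column 6 of a SAM record). The names CIGAR_OP and oversized_ops are made up for this example and are not part of Platypus or htslib:

```python
import re

SHORT_MAX = 2 ** 15 - 1  # largest value a 16-bit signed short can hold

# One CIGAR operation: a length followed by an operation code from the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def oversized_ops(cigar, limit=SHORT_MAX):
    """Return (length, op) pairs whose length exceeds `limit`."""
    return [
        (int(length), op)
        for length, op in CIGAR_OP.findall(cigar)
        if int(length) > limit
    ]


print(oversized_ops("13M542361H"))  # → [(542361, 'H')]
print(oversized_ops("49M"))         # → []
```

Note that 542361 is also larger than 65,535, so this particular operation would not fit in an unsigned 16-bit value either.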

Do you have any idea how to deal with those mappings without removing the reads from the bam file?

Best
Tim

James Turton

Nov 8, 2017, 6:52:33 AM
to Platypus Users
Hi Tim, 

I have had exactly the same issue. Did you find a resolution to this problem?
