tandem2xml gives "invalid pointer" error on linux


Gautam Saxena

Aug 24, 2012, 10:39:21 AM
to spctools...@googlegroups.com
I have TPP v4.5 RAPTURE rev 2, Build 201208061232 (linux) installed on CentOS 6.3 (all up to date, 64 bit) and pretty much working. For one project, though, we ran tandem2xml on 632 X!Tandem files. 597 of those files converted fine. For the other 35, though, we got the "invalid pointer" error as follows:

*** glibc detected *** /usr/ia_working_dir/DASH/DASH_Server/IA_Common/programs/linux_programs/tpp/bin/Tandem2XML: free(): invalid pointer: 0x00000000016fba40 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3e316753c6]
/usr/lib64/libstdc++.so.6(_ZNSsD1Ev+0x39)[0x3e34a9d4a9]
/usr/ia_working_dir/DASH/DASH_Server/IA_Common/programs/linux_programs/tpp/bin/Tandem2XML[0x413821]
/usr/ia_working_dir/DASH/DASH_Server/IA_Common/programs/linux_programs/tpp/bin/Tandem2XML[0x405f4e]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3e3161ecdd]
/usr/ia_working_dir/DASH/DASH_Server/IA_Common/programs/linux_programs/tpp/bin/Tandem2XML[0x405c89]
======= Memory map: ========
00400000-00469000 r-xp 00000000 00:18 32638052                           /usr/ia_working_dir/DASH/DASH_Server/IA_Common/programs/linux_programs/tpp/bin/Tandem2XML
00669000-0066b000 rw-p 00069000 00:18 32638052                           /usr/ia_working_dir/DASH/DASH_Server/IA_Common/programs/linux_programs/tpp/bin/Tandem2XML
0066b000-0066d000 rw-p 00000000 00:00 0
016f8000-01719000 rw-p 00000000 00:00 0                                  [heap]
3e31200000-3e31220000 r-xp 00000000 fd:00 6809                           /lib64/ld-2.12.so
3e3141f000-3e31420000 r--p 0001f000 fd:00 6809                           /lib64/ld-2.12.so
3e31420000-3e31421000 rw-p 00020000 fd:00 6809                           /lib64/ld-2.12.so
3e31421000-3e31422000 rw-p 00000000 00:00 0
3e31600000-3e31789000 r-xp 00000000 fd:00 6810                           /lib64/libc-2.12.so
3e31789000-3e31988000 ---p 00189000 fd:00 6810                           /lib64/libc-2.12.so
3e31988000-3e3198c000 r--p 00188000 fd:00 6810                           /lib64/libc-2.12.so
3e3198c000-3e3198d000 rw-p 0018c000 fd:00 6810                           /lib64/libc-2.12.so
3e3198d000-3e31992000 rw-p 00000000 00:00 0
3e31e00000-3e31e17000 r-xp 00000000 fd:00 6811                           /lib64/libpthread-2.12.so
3e31e17000-3e32017000 ---p 00017000 fd:00 6811                           /lib64/libpthread-2.12.so
3e32017000-3e32018000 r--p 00017000 fd:00 6811                           /lib64/libpthread-2.12.so
3e32018000-3e32019000 rw-p 00018000 fd:00 6811                           /lib64/libpthread-2.12.so
3e32019000-3e3201d000 rw-p 00000000 00:00 0
3e32200000-3e32215000 r-xp 00000000 fd:00 48060                          /lib64/libz.so.1.2.3
3e32215000-3e32414000 ---p 00015000 fd:00 48060                          /lib64/libz.so.1.2.3
3e32414000-3e32415000 r--p 00014000 fd:00 48060                          /lib64/libz.so.1.2.3
3e32415000-3e32416000 rw-p 00015000 fd:00 48060                          /lib64/libz.so.1.2.3
3e32600000-3e32683000 r-xp 00000000 fd:00 4968                           /lib64/libm-2.12.so
3e32683000-3e32882000 ---p 00083000 fd:00 4968                           /lib64/libm-2.12.so
3e32882000-3e32883000 r--p 00082000 fd:00 4968                           /lib64/libm-2.12.so
3e32883000-3e32884000 rw-p 00083000 fd:00 4968                           /lib64/libm-2.12.so
3e33a00000-3e33a10000 r-xp 00000000 fd:00 6655                           /lib64/libbz2.so.1.0.4
3e33a10000-3e33c0f000 ---p 00010000 fd:00 6655                           /lib64/libbz2.so.1.0.4
3e33c0f000-3e33c11000 rw-p 0000f000 fd:00 6655                           /lib64/libbz2.so.1.0.4
3e34200000-3e34216000 r-xp 00000000 fd:00 6814                           /lib64/libgcc_s-4.4.6-20120305.so.1
3e34216000-3e34415000 ---p 00016000 fd:00 6814                           /lib64/libgcc_s-4.4.6-20120305.so.1
3e34415000-3e34416000 rw-p 00015000 fd:00 6814                           /lib64/libgcc_s-4.4.6-20120305.so.1
3e34a00000-3e34ae8000 r-xp 00000000 fd:00 6815                           /usr/lib64/libstdc++.so.6.0.13
3e34ae8000-3e34ce8000 ---p 000e8000 fd:00 6815                           /usr/lib64/libstdc++.so.6.0.13
3e34ce8000-3e34cef000 r--p 000e8000 fd:00 6815                           /usr/lib64/libstdc++.so.6.0.13
3e34cef000-3e34cf1000 rw-p 000ef000 fd:00 6815                           /usr/lib64/libstdc++.so.6.0.13
3e34cf1000-3e34d06000 rw-p 00000000 00:00 0
2adaee375000-2adaee376000 rw-p 00000000 00:00 0
2adaee382000-2adaee389000 rw-p 00000000 00:00 0
7fffc87d1000-7fffc87e6000 rw-p 00000000 00:00 0                          [stack]
7fffc87ff000-7fffc8800000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]


The program was run as follows:

/usr/ia_working_dir/DASH/DASH_Server/IA_Common/programs/linux_programs/tpp/bin/Tandem2XML zz85-control-8-003-9 0Cw6-zz85-control-8-003-9.pep.xml

 I have attached a zipped version of the input file, zz85-control-8-003-9. (Note, you'll need to unzip and then rename the file to get rid of the ".temp" suffix; then, you should be able to run the above command exactly as is.)

Any clues?

Thanks in advance.

Regards,
Gautam


zz85-control-8-003-9.temp.gz

Dave Trudgian

Aug 24, 2012, 10:56:38 AM
to spctools...@googlegroups.com
Gautam,

Maybe this is related to the problem I observed that seems filename dependent:


If you rename the files do they process correctly? Is there anything common about the filenames of the 35 that don't work? If you run Tandem2XML against the gzipped file does it work?

DT

If you try to run the command with the

Jimmy Eng

Aug 24, 2012, 12:11:10 PM
to spctools...@googlegroups.com
FWIW, just doing something as simple as changing the name from
"zz85-control-8-003-9" to "zz85-control-8-003-9y" allows the
conversion to go through fine. Hopefully someone with time will look
at the convoluted input handling in Tandem2XML as the logic being used
to parse input file name and recognize file extensions is horribly
broken.

David Trudgian

Aug 24, 2012, 12:16:52 PM
to spctools...@googlegroups.com
That's the same problem I saw then. I'll try and have a look at this properly on the weekend, or failing that on Monday / Tuesday. I started to look through it before, but the input handling is so complex an hour wasn't enough to understand it.

We come across this issue a lot now, due to the way our mass-spectrometrists commonly name files.

DT

Gautam Saxena

Aug 25, 2012, 1:10:50 PM
to spctools...@googlegroups.com
Gzipping the file seems to have solved the problem for those 35 files. But 1) I don't know if it always solves the problem, and 2) it causes other problems. Specifically, the resultant pep.xml file now references the X!Tandem file with the "gz" extension, so at least one downstream program, namely LIBRA, now errors out because it's looking for an mzXML file named with a ".gz.mzXML" extension. Here's a sample error from Libra:

Failed to open input file '/usr/ia_working_dir/DASH/DASH_Server/JHU_NHLBI/Users/gsaxena_i-a-inc.com/Workflow_Outputs/Centroided_-_Cys_TMT_Zero_0824_50ppm/zTvL-120625_1CPr_CysTMT0.gz.mzXML'.
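The path in that error looks like the gzipped name with ".mzXML" appended directly. As a minimal sketch of the kind of name handling that would avoid it (`mzXMLNameFor` is a hypothetical helper written for illustration, not a function from TPP or Libra), one could strip a trailing ".gz" before adding the mzXML extension:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of TPP: derive the mzXML file name that
// corresponds to a (possibly gzipped) X!Tandem output file name.
inline std::string mzXMLNameFor(std::string tandemName) {
    const std::string gz = ".gz";
    // Drop a trailing ".gz" so it does not end up inside the mzXML name.
    if (tandemName.size() >= gz.size() &&
        tandemName.compare(tandemName.size() - gz.size(), gz.size(), gz) == 0) {
        tandemName.erase(tandemName.size() - gz.size());
    }
    return tandemName + ".mzXML";
}
```

With this, both "zTvL-120625_1CPr_CysTMT0.gz" and "zTvL-120625_1CPr_CysTMT0" map to the same mzXML name.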

And, less importantly, gzipping consumes a little CPU, which adds up if we have 1000 or so MS files for a given project.

So, it would be very beneficial if this bug were to be resolved. In case it helps, here's something else I noticed:

  1. In one project in which we had 1216 MS files, 1201 of those files ran through tandem2xml correctly. However, 15 failed with the same error message as indicated previously. The input file to tandem2xml was in a different directory, which I referenced with a fully qualified path name, as in:
cd /usr/my_working_dir
/usr/ia_working_dir/DASH/DASH_Server/IA_Common/programs/linux_programs/tpp/bin/Tandem2XML /usr/some_dir/another_dir/zz85-control-8-003-9 0Cw6-zz85-control-8-003-9.pep.xml 

I solved the problem by creating a symbolic link called "zz85-control-8-003-9" in "/usr/my_working_dir" that pointed to /usr/some_dir/another_dir/zz85-control-8-003-9, and then tandem2xml magically worked.

Also, I'm not sure if this is an additional wrinkle, but my X!Tandem files have NO extension. (I needed to do this to support LIBRA properly in my scenario.)

Gautam Saxena

Sep 2, 2012, 12:56:49 PM
to spctools...@googlegroups.com
I have bad news to report: gzipping the files sometimes fixes the problem, and sometimes not. In fact, we just ran some 1216 files through X!Tandem, and 585 failed with this same "invalid pointer" error, so it looks like one would truly need to figure out what's going on with the program and its parsing of the name. Will provide more of an update tonight if I get a chance.

Gautam Saxena

Sep 3, 2012, 2:05:02 PM
to spctools...@googlegroups.com
Jimmy et al: Is there a naming pattern/trick that will always work? For example, if I just suffix the X!Tandem filename with say the characters "yyy" or "xml" or if I prefix it in some similar fashion, will that be a workaround for now?

David Trudgian

Sep 3, 2012, 8:36:03 PM
to spctools...@googlegroups.com
Gautam,

I've started to look through the code, but it is very convoluted. I can't yet say what's causing the crash, so I don't know what filename change would work as a workaround.

I still haven't seen any crashes at all when files are gzipped - can you supply exact names of your files which cause crashes even when gzipped?

DT

--
David Trudgian Ph.D.

Instructor, Biochemistry Dept and Proteomics Core
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd, Dallas, TX 75390-9038

Room:ND6.214 Tel:(214)648-7025 Fax:(214)645-6298





Dave Trudgian

Sep 4, 2012, 4:59:16 PM
to spctools...@googlegroups.com
Here's a patch that fixes the problem (for me).

The problem occurs where the TandemResultsParser::writePepXML() code passes a filename and a string buffer to rampConstructInputFileName in the RAMPFace code. That function tries adding prefixes and extensions to the tandem filename in order to find the corresponding mzXML/mzML file. Unfortunately the buffer is declared too short. The longest prefix/extension combination tried needs 21 characters including the terminating null, but the buffer is declared in TandemResultsParser as only the original filename length + 20 chars.

This was difficult to hunt down, as the prefixes are specified in TandemResultsParser and the suffixes in RAMPFace. Also, the crash occurs where the RAMPFace code tries adding an additional '/' at the start of the string, over and above the prefixes that are passed in by the TandemResultsParser code. This seems pointless, as it results in search paths above the root directory, e.g.

/../../EColi_07_10_12_1microlitre_01.tandem.mzXML.gz

... but I've left this behavior and just modified the buffer size.

Whether or not the problem code causes a complete crash will depend on what else is around in the memory at the relevant location at the time. I guess in my case I saw no problems with gzipped files due to where things were being allocated in memory from my particular build of TPP on this machine.
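Since an overflow like this only crashes some of the time, a bounded write makes the failure deterministic instead. The following is a minimal sketch of that idea, assuming a hypothetical helper `buildCandidate` written for illustration; it is not what rampConstructInputFileName actually does. It relies on `std::snprintf` never writing more than the given size and returning the length the full string would have had:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper, not part of TPP or RAMP: writes prefix + base + suffix
// into buf. Returns false (rather than overflowing) if bufSize is too small
// to hold the whole string plus its terminating '\0'.
inline bool buildCandidate(char* buf, std::size_t bufSize,
                           const char* prefix, const char* base,
                           const char* suffix) {
    // snprintf writes at most bufSize bytes and reports the length needed.
    int needed = std::snprintf(buf, bufSize, "%s%s%s", prefix, base, suffix);
    return needed >= 0 && static_cast<std::size_t>(needed) < bufSize;
}
```

For a 20-character name such as "zz85-control-8-003-9", a buffer of length + 20 is rejected for the "/../../xml/" and ".mzXML.gz" combination, while length + 21 succeeds.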

DT



Apply to src/Parsers/Algorithm2XML/Tandem2XML/TandemResultsParser.cxx
NB: line numbers might be slightly off. This is against previous TPP release.

@@ -678,7 +678,10 @@ bool TandemResultsParser::writePepXML()
                if (strMzXMLFile.length() >= 5 && strMzXMLFile.compare(strMzXMLFile.length() - 5, 5, ".xtan") == 0)
                        strMzXMLFile.erase(strMzXMLFile.length() - 5);
                int len;
-               char *fname=new char[len=strMzXMLFile.length()+20];
+               // DCT fix: buffer needs to have additional 21 chars, not 20:
+               //          longest prefix tried is /../../xml/         11 chars
+               //          longest suffix tried is .mzXML.gz           9 chars 
+               char *fname=new char[len=strMzXMLFile.length()+21];
 
                rampConstructInputFileName(fname,len,strMzXMLFile.data());
                 pf = rampOpenFile(fname);




Dave Trudgian

Sep 4, 2012, 5:12:00 PM
to spctools...@googlegroups.com
Sorry,

There should be an extra comment line in the patch to make it more obvious: it's 11 chars + 9 chars + 1 char for the null terminator = 21 chars!
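That arithmetic can be written down so the compiler checks it. This is only a sketch: the constant names and `requiredBufferSize` are made up for illustration and do not appear in the TPP source, and the two string literals are the longest prefix and suffix quoted in the patch comments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Longest prefix and suffix, as quoted in the patch comments.
constexpr char kLongestPrefix[] = "/../../xml/";
constexpr char kLongestSuffix[] = ".mzXML.gz";

// sizeof on a char array counts the trailing '\0', so subtract 1 for length.
constexpr std::size_t kPrefixLen = sizeof(kLongestPrefix) - 1;
constexpr std::size_t kSuffixLen = sizeof(kLongestSuffix) - 1;

// Extra bytes needed beyond the base filename: prefix + suffix + '\0'.
constexpr std::size_t kExtraBytes = kPrefixLen + kSuffixLen + 1;

static_assert(kPrefixLen == 11, "prefix is 11 chars");
static_assert(kSuffixLen == 9, "suffix is 9 chars");
static_assert(kExtraBytes == 21, "11 + 9 + 1 for the null terminator");

// Hypothetical helper: buffer size needed for a base name of the given length.
inline std::size_t requiredBufferSize(std::size_t baseNameLen) {
    return baseNameLen + kExtraBytes;
}
```

Deriving the size from the strings themselves means the allocation would follow along if a longer prefix or suffix were ever added.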

DT

Joseph Slagel

Sep 4, 2012, 8:50:17 PM
to spctools...@googlegroups.com
Thanks, Dave, for the patch. I've applied it to the 4.6 branch and it should go out in 4.6.1.

