segmentation fault (core dumped)

Abhinav

Feb 25, 2014, 3:40:58 AM
to meth...@googlegroups.com
Hi,

I was running methpipe on some methylome analyses and I am now getting a new error: segmentation fault (core dumped).

What exactly does it mean? Is it a bug?

I was running the to-mr script to convert a Bismark file to an mr mapped-reads file for downstream analysis.
It ran smoothly earlier, but this time it gives me this error.


Thank you

Abhinav
Bioinformatics Analyst
University of Washington

Song, Qiang

Feb 25, 2014, 3:50:31 AM
to Methpipe Users
Hi Abhinav,

I may have missed your previous email. Can you be more specific about the error?
Are there any error messages?

Best,
Song Qiang




--
You received this message because you are subscribed to the Google Groups "MethPipe and MethBase Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to methpipe+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Abhinav

Feb 25, 2014, 3:55:28 AM
to meth...@googlegroups.com
Hi,

I am getting a segmentation fault (core dumped) error when I run the to-mr script.

Song, Qiang

Feb 25, 2014, 4:00:51 AM
to Methpipe Users, abhi...@gmail.com

Song, Qiang

Feb 25, 2014, 2:02:24 PM
to Methpipe Users
Hi Abhinav,

It seems there may be some tricky issue with the to-mr program. Is it possible for you to send us the input file that produced the error?
Being able to reproduce the error will greatly help us figure out what is wrong.

Best,
Song Qiang





Abhinav rao

Feb 25, 2014, 4:36:41 PM
to meth...@googlegroups.com
Hi Song,

I think there is a problem with the SAM headers in the Bismark output, since I am using the Ensembl genome.

Do you need the SAM headers to be perfect?

Abhinav

Song, Qiang

Feb 25, 2014, 4:42:53 PM
to Methpipe Users
It is hard to say with the information I have now. The sam files in methpipe are downloaded from the samtools
website (version 0.1.16), so we can assume they are valid. What is the error message you get?
 

Abhinav rao

Feb 25, 2014, 5:11:49 PM
to meth...@googlegroups.com
Hi,

I am sending you the first few lines of the mapping file.

If I run this:

$ to-mr -o 2.mr -m bismark -v 1.sam
Input file: 1.sam
Output file: 2.mr
Segmentation fault (core dumped)

Is there any problem with the header?

Abhinav
1.sam

Abhinav rao

Feb 25, 2014, 5:23:57 PM
to meth...@googlegroups.com
Initially I mapped with the Ensembl genome and then added "chr" to the header and to the mapped read lines in the file.

Abhinav

Meng Zhou

Feb 27, 2014, 1:15:07 PM
to meth...@googlegroups.com
Hi Abhinav,

I did find some format problems in the file you sent us. In line 26, the last line of the header (the one starting with @PG), the command-line string contains a tab character instead of a space, which causes the segmentation fault in samtools before the file is loaded into to-mr.

There are some additional format errors in the reads that follow: the column starting with "XR" is somehow concatenated with the previous column. After fixing these errors, I was able to convert the sample you sent. Maybe you made some mistakes when changing the file contents. I recommend you check your file format. I have also attached the fixed file for your reference.
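
Both problems described above can be caught with a quick format check before running to-mr. The following is a minimal Python sketch, not part of methpipe or samtools: `check_sam_line` and the `REQUIRED_TAGS` default are hypothetical names, and the assumption that every Bismark alignment carries XM, XR and XG tags should be checked against your Bismark version. It flags header fields that are not in TAG:VALUE form (which is what a stray tab inside the @PG command line produces) and alignment lines where an expected tag is missing (which is what happens when XR is fused onto the previous column).

```python
import re

# Header fields (except on @CO comment lines) look like "ID:Bismark".
HEADER_FIELD = re.compile(r"^[A-Za-z][A-Za-z0-9]:.+$")
# Optional alignment fields look like "XR:Z:CT" (TAG:TYPE:VALUE).
OPTIONAL_FIELD = re.compile(r"^[A-Za-z][A-Za-z0-9]:[AifZHB]:.*$")
# Assumption: tags expected on every Bismark alignment line.
REQUIRED_TAGS = ("XM", "XR", "XG")


def check_sam_line(line, required_tags=REQUIRED_TAGS):
    """Return a list of format problems found in one SAM line."""
    problems = []
    fields = line.rstrip("\n").split("\t")

    if line.startswith("@"):
        if fields[0] != "@CO":  # comment lines are free text
            for field in fields[1:]:
                if not HEADER_FIELD.match(field):
                    problems.append(f"malformed header field: {field!r}")
        return problems

    if len(fields) < 11:
        problems.append(f"only {len(fields)} columns, expected at least 11")
        return problems

    tags_seen = set()
    for field in fields[11:]:
        if OPTIONAL_FIELD.match(field):
            tags_seen.add(field[:2])
        else:
            problems.append(f"malformed optional field: {field!r}")
    for tag in required_tags:
        if tag not in tags_seen:
            problems.append(f"missing expected tag: {tag}")
    return problems
```

Running it over every line of the file and printing the line number next to any reported problem gives a quick list of places to inspect by hand.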


Best regards,
Meng
1.sam

Abhinav rao

Feb 27, 2014, 1:22:27 PM
to meth...@googlegroups.com
Thank you

Yes, I solved it. I found the same thing (it was a mistake in converting the files).

I have another question about duplicate-remover:

What exactly does the -D option do?


Abhinav





Song, Qiang

Feb 27, 2014, 1:46:38 PM
to Methpipe Users
The -D option, if given, disables the step that verifies the input reads are properly sorted.
If you are confident your input file is properly sorted, you may specify that option and it will slightly speed up the program.

Otherwise, just let the program run in the default mode.
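
To make the check that -D skips more concrete, here is a minimal Python sketch of such a verification. It is not duplicate-remover's actual code. It assumes the expected order is the one produced by the sort command used elsewhere in this thread (chromosome, start, end, strand, with text compared byte-wise as in the C locale), and `mr_sort_key` and `first_unsorted` are hypothetical names.

```python
def mr_sort_key(line):
    """Sort key for one mr line: chrom, start, end, strand (column 6)."""
    fields = line.split()
    # Encoding to bytes gives the byte-wise comparison of the C locale.
    return (fields[0].encode(), int(fields[1]), int(fields[2]),
            fields[5].encode())


def first_unsorted(lines):
    """Return the 1-based number of the first out-of-order line, or None."""
    previous = None
    for number, line in enumerate(lines, 1):
        key = mr_sort_key(line)
        if previous is not None and key < previous:
            return number
        previous = key
    return None
```

Under this ordering, a "+" read sorts before a "-" read at the same coordinates, because "+" has the smaller byte value.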


Abhinav rao

Feb 27, 2014, 3:31:26 PM
to meth...@googlegroups.com
Hi,

I have an issue with sorting files and removing duplicates.

For example:

I have an mr file (attached) from which I want to remove duplicates, so I followed these steps.

I am getting an error here:

LC_ALL=C; sort -k 1,1 -k 2,2n -k 3,3n -k 6,6 -o example.mr.sorted example.mr
duplicate-remover -o example.uniq -A example.mr.sorted 
input not properly sorted:
chr1 10003 10144 FRAG:131213_SN711_0410_BC3AWTACXX2:1:1114:18712:90039:0#0/1/ 33 - AGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTGGGGCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAACCTAAACCTAACCCTAACCCTAACCCTAACCCTAACACTAACCATAATCCTAACCCTAACCCAA :;;;+:B:AB:AD>>9C@+<+++2?EA@ECADDDB#####@@<DDFFFBDHHDHE?GBFG@?EHFB8:CFIE)C?D*????G*0)9?DBB(8CF(.6CCE#########################################
chr1 10003 10144 FRAG:131213_SN711_0410_BC3AWTACXX2:1:1308:7500:8754:1#0/1/ 1 + TTTTAATTTTAATTTTAATTTTAATTTTAATTTTAATTTTAAAATTAAAATTAAAATTAAAATTAAAATTAAAATTAAATTAAAATTAAAATTAAAATTAAAATTAAAATTAAAATTAAAATTAAAATTAAAATTAAAATT CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJCCCFFFFFHGHHHJJIJJJJJJJJIIJJJJJIJJJJJIIJJIJJJJJJJJHIJJJJJJJIJJJJJIJJJJJJJJJJJJJJJJJHHHHHHFFFFFFFEEEE3

But when I run it with the -D option:

duplicate-remover -o examplewithDoption.uniq -A -S example.stats -D example.mr.sorted

example.stats:
TOTAL READS IN: 99
GOOD BASES IN: 15351
TOTAL READS OUT: 79
GOOD BASES OUT: 12627
DUPLICATES REMOVED: 20
READS WITH DUPLICATES: 13

With this option, the positive-strand read was kept.


Thank you

Abhinav

example.mr

Song, Qiang

Feb 27, 2014, 8:27:34 PM
to Methpipe Users
This error is related to the locale settings on your machine.

Besides setting LC_ALL to C, you need to export the newly defined LC_ALL, i.e.,

export LC_ALL=C; sort -k 1,1 -k 2,2n -k 3,3n -k 6,6 -o example.mr.sorted example.mr

duplicate-remover -o example.uniq -A example.mr.sorted

Please give it a try and see if it works.
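
The reason the export matters: `LC_ALL=C;` followed by a separate command only sets a shell variable, and unless LC_ALL was already exported, the sort process started afterwards does not inherit it. When the sort is launched from a script, the locale can instead be passed explicitly in the child's environment. A minimal Python sketch, assuming a `sort` binary on the PATH; the function names are hypothetical.

```python
import os
import subprocess


def c_locale_sort_command(src, dst):
    """Build the sort command from this thread plus an environment
    in which LC_ALL=C is guaranteed to be visible to the child."""
    argv = ["sort", "-k", "1,1", "-k", "2,2n", "-k", "3,3n", "-k", "6,6",
            "-o", dst, src]
    env = dict(os.environ, LC_ALL="C")
    return argv, env


def sort_mr(src, dst):
    """Sort an mr file with the C locale; raises if sort fails."""
    argv, env = c_locale_sort_command(src, dst)
    subprocess.run(argv, env=env, check=True)
```

For example, `sort_mr("example.mr", "example.mr.sorted")` would stand in for the sort step above, after which duplicate-remover can be run as before.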

Abhinav rao

Feb 27, 2014, 9:03:09 PM
to meth...@googlegroups.com
Yes, it worked.

Thanks

Abhinav

Abhinav rao

Mar 3, 2014, 5:02:27 PM
to meth...@googlegroups.com
Hi,

I am getting a new error when I run hmr on the sample:

 hmr -o 1.hmr -i 15 -v -p 1.out 1_all.meth
[READING CPGS AND METH PROPS]
ERROR: could not allocate memory

What is this? Is there something I need to specify to run it?


Abhinav

Song, Qiang

Mar 3, 2014, 5:18:25 PM
to Methpipe Users

I think if it's 28M CpG sites (human), then you only need a few GB of RAM.

Can you run "head" on the file and send the output?

Andrew

Abhinav rao

Mar 3, 2014, 5:42:15 PM
to meth...@googlegroups.com
The meth file has all positions (including CHH and CHG).

The CpG file alone has 56434018 positions.

The whole methylome I am analyzing is about 270 million reads.

Head of the file:
chr1 10003 + CHH 0 4
chr1 10004 + CHH 0 6
chr1 10005 + CHH 0 7
chr1 10009 + CHH 0 7
chr1 10010 + CHH 0 7
chr1 10011 + CHH 0 7
chr1 10015 + CHH 0 7
chr1 10016 + CHH 0 7
chr1 10017 + CHH 0 7
chr1 10021 + CHH 0 7



Song, Qiang

Mar 3, 2014, 6:21:57 PM
to Methpipe Users
The memory usage of the program grows linearly with the number of cytosine sites in your input file.
Running hmr with CpG sites only (28M) uses about 3 GB. You can use this to estimate the memory required for all cytosines.
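
As a rough worked example of that estimate, the sketch below scales the quoted figure (about 3 GB for 28M CpG sites) linearly. The function name and the strictly proportional model are assumptions; actual usage may include some fixed overhead.

```python
def estimate_hmr_memory_gb(n_sites, ref_sites=28_000_000, ref_gb=3.0):
    """Linearly extrapolate memory use from one reference data point."""
    return n_sites * ref_gb / ref_sites


# The CpG-only file mentioned above has 56,434,018 positions,
# which comes out at roughly 6 GB under this model.
print(round(estimate_hmr_memory_gb(56_434_018), 1))
```

For an all-cytosine file, pass its line count (for example from `wc -l`) as `n_sites`.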

However, I am a little curious what you expect from finding HMRs with all cytosines.
As you may know, most non-CpG cytosines are unmethylated in mammalian genomes.
Even in stem cells, only about 1-2% of non-CpG cytosines are found to be methylated.

If you really want to call HMRs with all cytosines, I think you may try running one chromosome at a time.
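
One way to run one chromosome at a time is to stream the methcounts file and keep only the lines for a single chromosome, optionally restricted to certain contexts. A minimal Python sketch, assuming the six-column layout shown in the head output above (chromosome, position, strand, context, then two numeric columns); `select_sites` is a hypothetical helper, not a methpipe tool.

```python
def select_sites(lines, chrom=None, contexts=None):
    """Yield methcounts lines matching a chromosome and/or a set of contexts.

    chrom    -- chromosome name to keep, or None to keep all
    contexts -- set of context labels to keep (e.g. {"CHH", "CHG"}), or None
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        if chrom is not None and fields[0] != chrom:
            continue
        if contexts is not None and fields[3] not in contexts:
            continue
        yield line


# Example use (file names are placeholders):
# with open("1_all.meth") as src, open("chr1.meth", "w") as dst:
#     dst.writelines(select_sites(src, chrom="chr1"))
```

Because the function is a generator, the input is processed line by line and never held in memory as a whole.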

Abhinav rao

Mar 3, 2014, 6:39:22 PM
to meth...@googlegroups.com
Based on the paper Lister et al. 2009, Nature:

25% of mC in stem cells is in non-CpG context, and most of the interesting changes in methylation, both in stem cells and in adult brain, occur in non-CpG context.

OK, I will try it on a per-chromosome basis.

Abhinav

Song, Qiang

Mar 3, 2014, 7:03:26 PM
to Methpipe Users
I see your point.

However, my concern is as follows.

The Lister paper also said:
"We detected approximately 62 million and 45 million methylcytosines in H1 and IMR90 cells, respectively, ...., comprising 5.83% and 4.25% of the cytosines with sequence coverage"

Given the figure you mentioned (25% of mC in stem cells is in non-CpG context), roughly 5.83% x 25% = 1.5% of all covered cytosines are methylated in non-CpG context,
which means the large majority of non-CpG cytosines are unmethylated. So if you run the hmr program with all cytosines, you will probably find mostly regions
without non-CpG methylation.

Actually, if you want to find regions with non-CpG methylation, you may try the hmr_plant program. It is designed to find hyper-methylated regions.

Please keep us updated on how it works.

Abhinav rao

Mar 4, 2014, 6:50:45 PM
to meth...@googlegroups.com
Hi,
I was combining the methcounts files and after 2 hours I got an error saying:

merge-methcounts -o a_combined.meth -S  a_combined.stats -v a.1_uniq_All.meth a_2_uniq_All.meth
error reading methcount file: a_2_uniq_All.meth

No stats file was generated, so what could the error be?

Abhinav

Song, Qiang

Mar 4, 2014, 6:58:24 PM
to Methpipe Users
Make sure a_2_uniq_All.meth is a valid file name.
Could the file name be a.2_uniq_All.meth, since the first input file is a.1_uniq_All.meth?

In addition, please run:
wc -l a.1_uniq_All.meth a_2_uniq_All.meth
head a.1_uniq_All.meth
head a_2_uniq_All.meth
and check that the output is valid.


Abhinav rao

Mar 4, 2014, 8:01:19 PM
to meth...@googlegroups.com
Sorry, the file names do follow the same pattern; I pasted them wrongly.

wc -l:

1170371008  a_1_uniq_All.meth
 1170371008 a_2_uniq_All.meth

head a_1_uniq_All.meth

chr1 10003 + CHH 0 1
chr1 10004 + CHH 0 1
chr1 10005 + CHH 0 1
chr1 10009 + CHH 0 1
chr1 10010 + CHH 0 1
chr1 10011 + CHH 0 1
chr1 10015 + CHH 0 2

head a_2_uniq_All.meth

chr1 10003 + CHH 0 4
chr1 10004 + CHH 0 6
chr1 10005 + CHH 0 7
chr1 10009 + CHH 0 7
chr1 10010 + CHH 0 7
chr1 10011 + CHH 0 7


Song, Qiang

Mar 4, 2014, 8:42:56 PM
to Methpipe Users
So do you still have the same error when you run the command with the right file names?
merge-methcounts -o a_combined.meth -S  a_combined.stats -v a_1_uniq_All.meth a_2_uniq_All.meth

What is the output if you run the commands below?
tail a_combined.meth
tail a_1_uniq_All.meth
tail a_2_uniq_All.meth


Abhinav rao

Mar 4, 2014, 8:57:57 PM
to meth...@googlegroups.com

So do replicates need to have matching file names to run the methcounts program,

like R1.meth and R2.meth as in the PDF?


Here is the output of the tail commands:

a_1_uniq_All.meth
chrY 59363550 + CHH 0 0
chrY 59363551 + CHH 0 0
chrY 59363553 + CHH 0 0
chrY 59363555 + CHH 0 0
chrY 59363557 + CHH 0 0
chrY 59363558 + CHH 0 0
chrY 59363559 + CHH 0 0
chrY 59363561 + CHH 0 0
chrY 59363563 + CHH 0 0
chrY 59363564 + CHH 0 0

==> a_2_uniq_All.meth <==
chrY 59363550 + CHH 0 0
chrY 59363551 + CHH 0 0
chrY 59363553 + CHH 0 0
chrY 59363555 + CHH 0 0
chrY 59363557 + CHH 0 0
chrY 59363558 + CHH 0 0
chrY 59363559 + CHH 0 0
chrY 59363561 + CHH 0 0
chrY 59363563 + CHH 0 0
chrY 59363564 + CHH 0 0
 tail  a_combined.meth
chrY 59033372 + CHH 1 1
chrY 59033375 + CHH 0 0
chrY 59033376 + CHH 0 1
chrY 59033378 + CHH 0 0
chrY 59033382 + CHH 0 0
chrY 59033384 + CHH 0 0
chrY 59033387 + CHH 0 1
chrY 59033390 + CHH 0 0
chrY 59033393 - CHH 0 0
chrY 59033394 + CHH 0
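
Note that the last line of a_combined.meth shown above has only five columns (chrY 59033394 + CHH 0), and the file stops at position 59033394 while both inputs continue to 59363564, so the merged output appears to have been cut off mid-line. Truncated lines like this can be located with a small check. A minimal Python sketch, assuming the six-column layout seen in the samples in this thread; `is_valid_meth_line` and `first_malformed` are hypothetical names.

```python
def is_valid_meth_line(line):
    """Check one methcounts line: chrom, integer position, +/- strand,
    context label, then two numeric columns."""
    fields = line.split()
    if len(fields) != 6:
        return False
    if not fields[1].isdigit() or fields[2] not in ("+", "-"):
        return False
    try:
        float(fields[4])
        float(fields[5])
    except ValueError:
        return False
    return True


def first_malformed(lines):
    """Return the 1-based number of the first malformed line, or None."""
    for number, line in enumerate(lines, 1):
        if not is_valid_meth_line(line):
            return number
    return None
```

Running `first_malformed` over each input and over the merged output shows whether the problem is already present in an input file or only appears in the output.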


Abhinav rao

Mar 5, 2014, 2:38:50 PM
to meth...@googlegroups.com
Hi,

What does the -i (max iterations) option do in the hmr program?
What is the best value to give?


Abhinav
 

Song, Qiang

Mar 5, 2014, 8:57:44 PM
to Methpipe Users
The -i option specifies the maximum number of iterations when using the Baum-Welch method to train the HMM.
Empirically, the HMM usually converges within 10-20 iterations. Setting the option to 20 should be fine.
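
To illustrate how an iteration cap like -i interacts with convergence, here is a minimal Python sketch of a generic training loop. It is not hmr's implementation: `train_with_cap`, the `step` callback and the `tolerance` parameter are hypothetical, and the stopping rule (change in log-likelihood below a tolerance) is an assumption about how convergence is typically judged.

```python
def train_with_cap(step, max_iterations, tolerance=1e-10):
    """Run re-estimation passes until the log-likelihood stops changing
    or the iteration cap is reached.

    step -- callable performing one pass and returning the log-likelihood
    Returns (iterations_run, converged).
    """
    previous = None
    for iteration in range(1, max_iterations + 1):
        log_likelihood = step()
        if previous is not None and abs(log_likelihood - previous) < tolerance:
            return iteration, True
        previous = log_likelihood
    return max_iterations, False
```

With a small cap, the loop can return before the log-likelihood has settled, in which case the parameters used for decoding are those of an unconverged model.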

Abhinav rao

Mar 5, 2014, 9:31:52 PM
to meth...@googlegroups.com
I used -i 3 for my analysis and generated the bigBed files.
Should I redo it?

Abhinav

Song, Qiang

Mar 5, 2014, 9:34:06 PM
to Methpipe Users

Yes. I would recommend that.

Abhinav rao

Mar 5, 2014, 9:49:11 PM
to meth...@googlegroups.com
Since I have already generated the files, I will post them and see how they look,
and in the meantime I will redo the run with more iterations.

Abhinav

Abhinav rao

Mar 5, 2014, 10:43:16 PM
to meth...@googlegroups.com
A question about bsrate for the samples:

I see a bsrate of around 0.860771 in my samples.
I also mapped the samples against lambda and saw that the conversion worked.

Abhinav

Abhinav rao

Mar 12, 2014, 2:09:34 PM
to meth...@googlegroups.com
Hi,

I have a question regarding coverage in the processed samples from MethBase.

I recently got the H1-NPC data from MethBase and was comparing it with my read file (I filtered for 8x coverage) and saw a difference in coverage.

A few lines:

 awk '$4==0' Human_H1-NPC.read.bedGraph | head
chr1 10640 10641 0
chr1 10643 10644 0
chr1 10649 10650 0
chr1 10659 10660 0
chr1 10661 10662 0
chr1 10664 10665 0
chr1 10666 10667 0
chr1 10669 10670 0
chr1 10672 10673 0
chr1 10678 10679 0


Why are there positions with coverage 0? What coverage threshold do you use in your samples?

Thank you

Abhinav




Song, Qiang

Mar 12, 2014, 3:27:20 PM
to Methpipe Users
We keep all CpG positions even if there is zero coverage.
This ensures that output files from different datasets have the same number of lines, which simplifies many downstream analyses.
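
Since zero-coverage positions are kept, a comparison against a coverage-filtered file needs the same filter applied to the MethBase file. A minimal Python sketch, assuming the coverage is the fourth bedGraph column as in the awk command above; `covered_sites` is a hypothetical helper.

```python
def covered_sites(lines, min_coverage=1):
    """Yield bedGraph lines whose fourth column is at least min_coverage."""
    for line in lines:
        # Skip track/browser definition lines and comments, if present.
        if line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        if len(fields) >= 4 and float(fields[3]) >= min_coverage:
            yield line
```

Calling it with `min_coverage=8` would apply the same 8x threshold used for the read file described above.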



Abhinav rao

Mar 25, 2014, 1:57:09 PM
to meth...@googlegroups.com


---------- Forwarded message ----------
From: Abhinav rao <abhi...@gmail.com>
Date: Tue, Mar 18, 2014 at 12:53 PM
Subject: Re: [methpipe] Re: segmentation fault (core dumped)
To: meth...@googlegroups.com


Hi,

I am running hmr_plant on non-CpG context (by chromosome, as suggested) and I am getting this error:

$ hmr_plant -o chr1.bed -i 3 -v chr1.txt
[READING CPGS AND METH PROPS]
TOTAL CPGS: 89472034
MEAN COVERAGE: 9.65897

[SEPARATING BY CPG DESERT]
CPGS RETAINED: 85503987
DESERTS REMOVED: 564

Baum-Welch Training
hmr_plant: ../common/ThreeStateHMM.cpp:364: void ThreeStateHMM::estimate_state_posterior(size_t, size_t): Assertion `fabs(hypo_posteriors[i] + HYPER_posteriors[i] + HYPO_posteriors[i] - 1.0) < 1e-3' failed.
Aborted (core dumped)

I have given it a lot of memory for this run.

Abhinav
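
The abort comes from an internal consistency check rather than from a memory limit: as the message shows, the assertion requires the three state posteriors at each site to sum to 1 within 1e-3. The sketch below restates that condition in Python to show which inputs fail it. It only illustrates the quoted assertion; it is not hmr_plant's code, the function name is hypothetical, and it does not identify why the sum went wrong in this run.

```python
import math


def posteriors_sum_to_one(p_first, p_second, p_third, tol=1e-3):
    """Mirror of the quoted assertion: the posterior probabilities of the
    three HMM states at one site must sum to 1 within tol."""
    return math.fabs(p_first + p_second + p_third - 1.0) < tol


print(posteriors_sum_to_one(0.2, 0.5, 0.3))           # a valid distribution
print(posteriors_sum_to_one(float("nan"), 0.5, 0.5))  # NaN never passes
```

A NaN in any of the three values makes the comparison false, so a numerical problem during training is one possible way for this check to trip.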
