Loading .bam files generated with nf-core/methylseq bwa-mem option

178 views
Skip to first unread message

Alejandra Rodríguez

unread,
Jul 21, 2022, 8:38:55 AM7/21/22
to methylkit_discussion
Hello,

I obtained my .bam files from the nf-core/methylseq using the bwa-mem workflow instead of the Bismark one as stated in the documentation. I have copied the description as an image below.
bwamem.PNG
My .bam files have been aligned and sorted, so as far as I understand, I should be able to produce methylation call files for each sample and then use methRead to create the methylrawList object.

My code reads as follows:

> my.methRaw=processBismarkAln(location = "PN0341_0001_S1_L001_R1_001.sorted.markDups.bam",sample.id="test1",assembly="hg19",read.context="CpG",save.folder=getwd())

But I keep getting this as an output:

Trying to process:
    PN0341_0001_S1_L001_R1_001.sorted.markDups.bam
Using htslib.

Error in methCall(read1 = location, type = "bam", nolap = nolap, minqual = minqual,  :
  no methylation tag found for bam file PN0341_0001_S1_L001_R1_001.sorted.markDups.bam

If I understand correctly, the problem is with the .bam file, but as I am not too experienced in this, I would like to know if there is any way to overcome this problem or if methylKit can only read .bam files that come from Bismark instead.

I appreciate any help that you could provide in this matter.
Alejandra

Alejandra Rodríguez

unread,
Jul 26, 2022, 3:49:29 AM7/26/22
to methylkit_discussion
I just wanted to follow up to see if anyone had any thoughts about this error or if anyone knows if I can in fact load .bam files that are not generated by Bismark. Thanks.

Altuna Akalin

unread,
Jul 26, 2022, 4:36:04 AM7/26/22
to methylkit_...@googlegroups.com
You can not. You have to call basepair resolution methylation from those bam files and use that as input to methylkit 

--
You received this message because you are subscribed to the Google Groups "methylkit_discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to methylkit_discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/methylkit_discussion/0af7489b-6e48-4f0a-a8fa-7fe2ca9714e2n%40googlegroups.com.
--
Sent from mobile, excuse the brevity

Alejandra Rodríguez

unread,
Jul 26, 2022, 5:31:38 AM7/26/22
to methylkit_discussion
Hi, thanks for the response - is there anywhere where I could read more about how to do that? I don't really know where to start in this particular case.

Thank you,
Alejandra

Alexander Blume

unread,
Jul 27, 2022, 2:55:55 AM7/27/22
to methylkit_discussion
HI Alejandra, 

You can use methylDackel to extract the methylation calls from bwa, 
but you have to pass the - - methylkit flag to produce compatible output. 
In methylkit, you should be able to directly load the generated file using methRead. 

Best
Alex 

Alejandra Rodríguez

unread,
Jul 27, 2022, 4:29:04 AM7/27/22
to methylkit_discussion
Hi Alex,

Thanks for your input, this is very helpful. I do have however another question in this regard - I used the nf-core/methylseq pipeline and the workflow option I chose uses methylDackel to extract the calls, but when I passed the --methylkit flag to produce the output it didn't generate any files, so I have methylDackel output files - four per sample: .bedGraph, .txt, and two .svg

I cannot re-run my samples as it takes too long and we are a bit in a rush. How could I go from these output files to generate a methylkit compatible output?

Thanks,
Alejandra

Alexander Blume

unread,
Jul 27, 2022, 4:44:03 AM7/27/22
to methylkit_...@googlegroups.com
Hi Alejandra,

If you use methylDackel directly, please recheck the correct spelling since this flag is in snake case (--methylKit) (https://github.com/dpryan79/MethylDackel#methylkit-compatible-output).
Otherwise, there seems to be a methylKit flag in the bwa-meth options (—methyl_kit) in the parameter docs of the nf-core/methylseq pipeline (https://nf-co.re/methylseq/1.6.1/parameters#bwa-meth-options).  

Best
Alex

You received this message because you are subscribed to a topic in the Google Groups "methylkit_discussion" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/methylkit_discussion/kyrqBXUv4zQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to methylkit_discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/methylkit_discussion/53e73f86-5fce-484f-8716-86f8a229dab7n%40googlegroups.com.

Alejandra Rodríguez

unread,
Aug 2, 2022, 7:20:42 AM8/2/22
to methylkit_discussion
Hi again,

I am happy to say that I have my methylKit files and they look as they should - thanks for your help!

I was wondering if you could perhaps help me with another issue I'm getting. I have a total of 12 files averaging around 750.000 KB and I want to read them using methRead() but I am getting the error "cannot allocate vector of size 128.0 Mb". 

To troubleshoot, I tried to load just one of the files instead of all 12 and I got the same error, as you can see in the attached image below. The window on the left shows the Memory that is being used at the time that R aborts the session and I am puzzled as I was expecting it to be completely full or something like that, but it seems like it has enough capacity, so I would appreciate some advice about what should my next troubleshooting step be.

Thanks,
Alejandra

memory_error.PNG

Alexander Blume

unread,
Aug 2, 2022, 7:56:20 AM8/2/22
to methylkit_...@googlegroups.com
Hi Alejandra,

This is truly weird behaviour, given your large amount of memory. 
Could you please share the code that created this issue? 

Otherwise, you might want to switch to a tabix-based analysis which would be easier on the memory. 
See here for more info: 

Best
Alex

On 2. Aug 2022, at 13:20, Alejandra Rodríguez <aros...@gmail.com> wrote:

Hi again,

I am happy to say that I have my methylKit files and they look as they should - thanks for your help!

I was wondering if you could perhaps help me with another issue I'm getting. I have a total of 12 files averaging around 750.000 KB and I want to read them using methRead() but I am getting the error "cannot allocate vector of size 128.0 Mb". 

To troubleshoot, I tried to load just one of the files instead of all 12 and I got the same error, as you can see in the attached image below. The window on the left shows the Memory that is being used at the time that R aborts the session and I am puzzled as I was expecting it to be completely full or something like that, but it seems like it has enough capacity, so I would appreciate some advice about what should my next troubleshooting step be.

Thanks,
Alejandra

<memory_error.PNG>

Alejandra Rodríguez

unread,
Aug 2, 2022, 8:35:12 AM8/2/22
to methylkit_discussion
No problem, here is the code I used for just 1 file ( I got the same error trying to load the 12 files as well)

library(methylKit)

wd<-c("D:/projects/NucleusServer/PN0341/")
setwd(wd)

filesIn<-dir(pattern=".methylKit")
condition<-c(rep=0, times=1)
samples <-c("CB1")

#Read in data
myObj<-methRead(location=as.list(filesIn),
                sample.id=as.list(samples),assembly="hg19",
                header=F, treatment=condition)


I will take a look at the links you provided as well but if you find anything that could've been causing this error in my code I would be very grateful to hear about it.

Thanks,
Alejandra

Alexander Blume

unread,
Aug 2, 2022, 2:34:20 PM8/2/22
to methylkit_...@googlegroups.com
Hi Alejandra,

I do not see anything obvious which could cause the memory leakage. 
However, R under Windows is known for having issues with memory (see https://stackoverflow.com/questions/5171593/r-memory-management-cannot-allocate-vector-of-size-n-mb).
Maybe you could try some of the mentioned approaches and see if they resolve your issues.

Concerning the code, there are only a few things wrong in the chunk that you sent, but I guess those would have triggered different errors anyways: 

```
library(methylKit)

wd<-c("~/projects/pigx/pigx_bsseq/test_real_data/out_subsampled100000/06_methyl_calls/methylDackel/")
setwd(wd)

filesIn<-dir(pattern=".methylKit")[1]
condition<-c(rep(0, times=1))  # <<--- changed this to proper rep() call 

samples <-c("CB1")

#Read in data
myObj<-methRead(location=filesIn,
                sample.id=samples,
                assembly="hg19", 
                header=T, 
# <<— set this true, since files should contain header (you should check your files again)                
treatment=condition)
```

Best
Alex

Reply all
Reply to author
Forward
0 new messages