Summary of methylKit objects

Ryan Daniels

Jun 12, 2023, 6:51:27 AM
to methylkit_discussion
Hello All

What is the best way to get a summary of a methylKit object?

In my case I am trying to get the list of chromosomes/linkage groups, the number of sites per chromosome, and the start and end positions for each chromosome.
I want to subset a really big methylKit object by chromosome for custom plots etc.
At the moment, if I call methylKit::select() on the object with a vector of row indices to extract, I get a segfault.
The segfault only happens with really big datasets (billions of sites).

Any assistance will be greatly appreciated!
Ryan

Ryan Daniels

Jun 12, 2023, 7:18:12 AM
to methylkit_discussion

Something like this, for example, produces a segfault, so I cannot subset the data inside R.
```r
# Partial string match on chromosome names; getData() pulls the whole table into memory first.
tmp.loci <- which(grepl(chr.vec, getData(meth.obj)$chr))
```

Ryan Daniels

Jun 26, 2023, 6:06:53 AM
to methylkit_discussion
I see now that the manual says select() does not return a DB object, so I guess the segfault is a result of the data being read into memory.

For anyone interested, I have resorted to doing the summary outside of methylKit/R to avoid segfaults.
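
A minimal sketch of what such an outside-R summary could look like, in Python with only the standard library. This is not necessarily the approach used here. It assumes the DB file is a bgzip-compressed, tab-separated tabix file whose first three columns are chr, start and end, and it skips any lines starting with "#". The function name `summarize_by_chr` and the demo rows are made up for illustration.

```python
import gzip
import os
import tempfile


def summarize_by_chr(path):
    """Stream a bgzip/gzip-compressed, tab-separated file line by line and
    return {chr: [n_sites, min_start, max_end]} without loading it all."""
    summary = {}
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip comment/header lines and blank lines
            # Assumed layout: chr, start, end, then further columns.
            chrom, start, end = line.split("\t", 3)[:3]
            start, end = int(start), int(end)
            stats = summary.get(chrom)
            if stats is None:
                summary[chrom] = [1, start, end]
            else:
                stats[0] += 1
                stats[1] = min(stats[1], start)
                stats[2] = max(stats[2], end)
    return summary


# Tiny made-up file, only to show the call.
rows = [
    "chr1\t10\t10\t+\t12\t6\t6",
    "chr1\t25\t25\t+\t8\t2\t6",
    "chr2\t5\t5\t-\t20\t20\t0",
]
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.txt.bgz")
    with gzip.open(demo, "wt") as out:
        out.write("\n".join(rows) + "\n")
    result = summarize_by_chr(demo)

for chrom, (n_sites, min_start, max_end) in result.items():
    print(chrom, n_sites, min_start, max_end, sep="\t")
```

Because the file is read one line at a time, memory use depends on the number of chromosomes rather than the number of sites.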

Lucy R Barnard

Sep 17, 2025, 8:26:16 AM
to methylkit_discussion
Hi Ryan,

I am having similar large-data issues with select() and getData(), and also when converting to GRanges to get around the segmentation faults. I'd be interested to know what alternatives outside of R you found. I'm aiming to get the proportion methylated for each sample (>1000) at each CpG (~50 million) from a methylKit DB object.

Cheers,
Lucy

Alexander Blume

Sep 17, 2025, 3:34:45 PM
to methylkit_...@googlegroups.com
Hi Ryan and Lucy,

I usually use the internal applyTbxByChunk, applyTbxByChr, and applyTbxByOverlap functions to deal with tabix files. You can call them with the ::: accessor, e.g. methylKit:::applyTbxByChr. They take the path to a tabix file (see getDBPath) and a function to apply to a data frame per chunk, chromosome, or overlapping region. You should find some inspiration in the https://github.com/al2na/methylKit/blob/master/R%2FmethylDBFunctions.R file.
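
For the outside-R question, the same chunk-wise idea can be sketched in Python. This is only a rough illustration, not methylKit's own implementation. It assumes a united (methylBase-style) DB file that is bgzip-compressed and tab-separated, with columns chr, start, end, strand followed by coverage, numCs, numTs repeated once per sample, and with missing values written as NA. The helper names `iter_chunks` and `prop_methylated` are hypothetical, and the column layout should be checked against your own file first.

```python
import gzip
import itertools
import os
import tempfile


def iter_chunks(path, chunk_size=100_000):
    """Yield lists of up to chunk_size data lines, so only one chunk is in memory."""
    with gzip.open(path, "rt") as handle:
        data = (ln for ln in handle if ln.strip() and not ln.startswith("#"))
        while True:
            chunk = list(itertools.islice(data, chunk_size))
            if not chunk:
                return
            yield chunk


def prop_methylated(line):
    """Return (chr, start, [numCs / coverage per sample]).
    Samples with missing or zero coverage get None."""
    fields = line.rstrip("\n").split("\t")
    props = []
    # Assumed layout: sample columns start at index 4, three columns per sample.
    for i in range(4, len(fields) - 2, 3):
        cov, num_cs = fields[i], fields[i + 1]
        if cov in ("", "NA") or num_cs in ("", "NA") or float(cov) == 0:
            props.append(None)
        else:
            props.append(float(num_cs) / float(cov))
    return fields[0], int(fields[1]), props


# Tiny made-up two-sample file, only to show the call.
rows = [
    "chr1\t10\t10\t+\t10\t5\t5\t4\t1\t3",
    "chr1\t25\t25\t+\t8\t2\t6\tNA\tNA\tNA",
]
results = []
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.txt.bgz")
    with gzip.open(demo, "wt") as out:
        out.write("\n".join(rows) + "\n")
    for chunk in iter_chunks(demo, chunk_size=1):
        results.extend(prop_methylated(line) for line in chunk)

for chrom, start, props in results:
    print(chrom, start, props)
```

With more than 1000 samples and around 50 million CpGs, the results would normally be written out per chunk rather than collected in a list as in this demo.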

Best,
Alex
