On 26. Jan 2021, at 15:37, Johan Zicola <johan....@gmail.com> wrote:
Dear Alex,
Thank you for your quick answer!
So I tried `methylKit:::fread.gzipped` and it works properly: I get 3 data.table objects without NA issues, and the size of the objects is realistic.
<image2.PNG><image3.PNG>
I also tried with `skipDecompress=TRUE` and it also works (no NAs), but I get exactly the same object when I run the function a second time, as if R were stuck on the same cache file:
<image1.PNG>
I had this problem before and I now understand why. R seems to read what is already in the cache instead of extracting the new file given to the function. Maybe it is because the file names are identical, although the paths are different. I tested this hypothesis by renaming one of my methylBaseDB files and its associated tabix index file, and that is indeed the case:
<image5.PNG>
I had a whole pipeline running smoothly with an earlier methylKit version, where I was using files with the same name in different folders. It does not work anymore, and I assume it is because of a new version of `R.utils::decompressFile`, which you mention at line 22 in https://github.com/al2na/methylKit/blob/cfbc32ed9dd490e9e8070d33cacaa6ebecb56d76/R/backbone.R#L9
Now, I restart R to empty the cache and I start by loading the mCHH methylBase object first (with the original name methylBase_.txt.bgz), and it works. The object is big and without NAs, so the argument `skipDecompress=TRUE` is definitely the culprit for the misreading of cache files with the same names. It makes sense if they are located in the same "temporary folder or RAM slot", but this problem was not there in earlier versions of methylKit.
<image4.PNG>
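The name-collision hypothesis above can be reproduced in base R without methylKit. The sketch below is only an illustration under stated assumptions: `decompress_to_temp()` and `make_gz()` are hypothetical helpers, not methylKit or R.utils code, and the assumed behaviour is that the decompressed copy lands in the session temp folder under the file's base name and is reused when `skip = TRUE`.

```r
# Hypothetical stand-in for a "decompress into the temp folder" helper.
# This is NOT methylKit or R.utils code; it only mimics the assumed behaviour.
decompress_to_temp <- function(path, skip = TRUE) {
  # Destination depends only on the base name, not on the folder.
  dest <- file.path(tempdir(), sub("\\.(gz|bgz)$", "", basename(path)))
  if (skip && file.exists(dest)) {
    return(dest)  # reuse whatever is already there, even from another folder
  }
  con <- gzfile(path, open = "rt")
  on.exit(close(con))
  writeLines(readLines(con), dest)
  dest
}

# Hypothetical helper: write a small gzip-compressed file into `dir`.
make_gz <- function(dir, lines) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(dir, "methylBase_.txt.bgz")
  con <- gzfile(path, open = "wt")
  writeLines(lines, con)
  close(con)
  path
}

# Two compressed files with the same name in different folders.
root <- tempfile("demo_")
cpg <- make_gz(file.path(root, "CpG"), c("chr1\t10", "chr1\t20"))
chh <- make_gz(file.path(root, "CHH"), c("chr2\t30", "chr2\t40", "chr2\t50"))

first_read <- readLines(decompress_to_temp(cpg, skip = FALSE))
stale_read <- readLines(decompress_to_temp(chh, skip = TRUE))   # reuses the CpG copy
fresh_read <- readLines(decompress_to_temp(chh, skip = FALSE))  # decompresses again

identical(first_read, stale_read)  # TRUE: the CHH request returned CpG content
length(fresh_read)                 # 3: the real CHH content
```

If the real decompression step behaves like this, either forcing a fresh decompression or giving each file a unique name would avoid the stale read, which matches the renaming test described above.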
That leads me to think that these cache files are also responsible for the NA lines. I first thought the issue came from the RAM, as my first object was always CpG, which was also the smallest in size. But it was also the first object to go into the uncompressed cache file, and I guess CHG and CHH were overwriting part of this file but not all of it. It makes sense since all consecutively loaded files have the same beginning (see below) but a different end (see below). It seems that for mCHH, where we see NAs, the object is extended (more lines, as shown by the `dim()` function) but not filled with the correct values. However, the data.table objects seem to have a correct ending, even with `skipDecompress=TRUE`. It seems to me that these NAs then occur when the data.table is converted into a methylKit object. I guess R extends the object based on the tabix file, which is probably not uncompressed, but uses the wrong uncompressed file to fill it in. Somehow data.table objects deal with it but methylKit objects do not.
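One way to check where the NAs appear is to compare the row count and the number of incomplete rows before and after conversion. The sketch below uses toy data frames as stand-ins; `summarise_table()` is a hypothetical helper, and `raw` and `converted` are made-up tables, not real methylKit output.

```r
# Hypothetical diagnostic helper; works on any data.frame-like table.
summarise_table <- function(df) {
  c(rows = nrow(df), na_rows = sum(!stats::complete.cases(df)))
}

# Toy stand-ins: `raw` mimics the table read from disk, `converted` mimics
# an object that was extended with one unfilled row.
raw <- data.frame(chr = c("chr1", "chr1"),
                  start = c(10L, 20L),
                  coverage1 = c(5L, 8L))
converted <- rbind(raw, data.frame(chr = NA, start = NA, coverage1 = NA))

summarise_table(raw)        # rows = 2, na_rows = 0
summarise_table(converted)  # rows = 3, na_rows = 1
```

With real objects, the same helper could be applied to the data.table returned by the reading step and to the table extracted from the converted object, for example with `methylKit::getData()`.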
--
You received this message because you are subscribed to a topic in the Google Groups "methylkit_discussion" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/methylkit_discussion/UruFjvX89B4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to methylkit_discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/methylkit_discussion/e95f651c-e44a-44da-9f53-6f1332294bd3n%40googlegroups.com.
On 27. Jan 2021, at 15:26, Johan Zicola <johan....@gmail.com> wrote:
Sorry, I had a typo in my last sentence:
Also, I could not find the role of the ellipsis (three dots) that you use in the function `new()` to retrieve some variables from the methylBaseDB object, either in the R documentation or on the web. Why do you need these instead of directly using `x...@sample.ids`, for instance?
On Wednesday, 27 January 2021 at 15:22:39 UTC+1 Johan Zicola wrote:
Hi Alex,
It seems there is an issue with your function. What does the index `i` represent? This variable is not initialized anywhere.
Also, I could not find the role of the ellipsis (three dots) that you use in the function `new()` to retrieve some variables from the methylBaseDB object in the R documentation or the web. Why do you need these instead of using directly `x...@sample.ids` for instance?
Thanks,
Best,
Johan
On Tuesday, 26 January 2021 at 17:15:19 UTC+1 Alexander Blume wrote:
Hi Johan,
Great you figured this out!
To circumvent the as() function you could implement your own version. Basically all you are missing is to rename the columns and create a new object of class “methylBase”:
```
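convert2MethylBase <- function(x) {
  # Read the full flat file behind the methylBaseDB object, forcing decompression.
  df <- methylKit:::fread.gzipped(x@dbpath, stringsAsFactors = FALSE,
                                  data.table = FALSE, skipDecompress = FALSE)
  # Rename the columns to the methylBase layout.
  methylKit:::.setMethylDBNames(df, "methylBaseDB")
  # The archived message is cut off after `new("methylBase",df[i,],`.
  # `i` is never defined (as noted in the reply above), so the full table is
  # passed here. The slot arguments below are an assumption based on the
  # methylBase class definition, not text recovered from the message.
  new("methylBase", df,
      sample.ids = x@sample.ids, assembly = x@assembly, context = x@context,
      treatment = x@treatment, coverage.index = x@coverage.index,
      numCs.index = x@numCs.index, numTs.index = x@numTs.index,
      destranded = x@destranded, resolution = x@resolution)
}
```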
<methylKitDB_to_methylKit_conversion.R>