
Hello,

I am analyzing whole-genome bisulfite sequencing data from a plant species using Bismark v0.21.0 and methylKit in R. My extraction command includes the --CX and --comprehensive flags, which extract methylation information for all cytosine contexts (CpG, CHG, and CHH) into a single .bismark.cov.gz file:

    bismark_methylation_extractor \
      --multicore 3 \
      --comprehensive \
      --bedGraph \
      --CX \
      --cytosine_report \
      --report \
      --genome_folder $GENOME \
      --output $OUTDIR \
      sample.bam

I then read the files into methylKit using:

    myobj <- methRead(file.list,
                      sample.id = as.list(sample.id),
                      assembly = "my_assembly",
                      treatment = treatment,
                      context = "CpG",
                      mincov = 7,
                      pipeline = "bismarkCoverage")

After tiling (1 kb windows) and running getMethylationStats(), the percent methylation histogram shows a flat/uniform distribution across 0-100%, rather than the expected bimodal pattern described in the methylKit documentation.

My questions are:

1. Is this flat distribution expected when all three cytosine contexts (CpG + CHG + CHH) are mixed in a single file and read with pipeline = "bismarkCoverage"? In plants, CHH methylation is predominantly low (~0-5%) and CHG is intermediate, which I suspect dilutes the bimodal CpG signal.

2. Is context = "CpG" the correct parameter when reading a file containing all three contexts, or should a different approach be used?

3. Is it valid to perform differential methylation analysis across all contexts simultaneously using this approach, or should each context be analyzed separately?

For reference, samples extracted without --CX (CpG only) show the expected bimodal distribution.

Thank you.
Thank you very much for the detailed explanation.
That makes perfect sense. I will regenerate the analyses using CpG, CHG, and CHH separately as recommended. Thank you again for your help and for pointing me in the right direction.
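
For anyone following the same route, here is a minimal sketch of one possible way to separate the contexts before reading them into methylKit. It assumes the genome-wide cytosine report written by --cytosine_report --CX is available as a tab-separated file with the context (CG, CHG, or CHH) in column 6; the file name sample.CX_report.txt and the output names are placeholders, and the few demo rows are made up only so the snippet runs on its own.

```shell
# Assumed input: Bismark cytosine report produced with --cytosine_report --CX.
# Columns: chr, position, strand, count methylated, count unmethylated,
#          context (CG/CHG/CHH), trinucleotide context.
report="sample.CX_report.txt"   # placeholder name; adjust to your own output

# Made-up demo rows, written only if the real report is not present.
if [ ! -f "$report" ]; then
  printf 'chr1\t10\t+\t8\t2\tCG\tCGA\n'   >  "$report"
  printf 'chr1\t11\t-\t9\t1\tCG\tCGT\n'   >> "$report"
  printf 'chr1\t25\t+\t3\t7\tCHG\tCAG\n'  >> "$report"
  printf 'chr1\t40\t+\t0\t12\tCHH\tCAA\n' >> "$report"
fi

# Write one report per context: sample.CG.CX_report.txt, sample.CHG.CX_report.txt,
# and sample.CHH.CX_report.txt. Rows are copied unchanged.
awk -F'\t' -v OFS='\t' '
  $6 == "CG" || $6 == "CHG" || $6 == "CHH" {
    print > ("sample." $6 ".CX_report.txt")
  }
' "$report"
```

Each per-context file could then be read in its own methRead() call with pipeline = "bismarkCytosineReport" and the matching context label, so that CpG, CHG, and CHH are tiled and tested separately.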