Dear Karl,
after over a year I have a follow-up question regarding these sharply defined intervals!
My intervals are indeed super sharp (e.g. 2e-4 cM). However, to look up genes I need to translate each interval into a physical location. I therefore repeated the BCI calculation without pseudomarkers, but this always gives me start and stop markers that are switched, i.e. the stop marker lies before the start marker in the genome - hence the original negative intervals.
For reference, this is specifically about a hotspot on one chromosome where I get multiple different significant QTLs that are 200+ kbp apart. When I calculate the BCI with pseudomarkers, the interval size in cM varies quite a bit but is small overall. However, when I repeat the BCI calculation without pseudomarkers, I get the same or similar physical markers delineating the intervals for the different significant QTLs.
What's more, I always get the same stop marker - which makes me wonder whether this could have any relevance?
Now my question is whether there is a rule of thumb with which I could convert a cM distance to a bp distance?
I am absolutely aware that these are different units, and the linkage varies strongly by region.
But do you think it’s okay to do a rough estimate, e.g. given that my reference genome has an approximate size of 495 Mb and we have 5386 markers with a median distance of 28.96±22.25 cM, to use e.g. 3000 bp per cM?
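To make the alternative to a single genome-wide factor concrete: one option might be to interpolate linearly between the two markers flanking each interval endpoint, since those markers have both a genetic and a physical position. Below is a minimal Python sketch of that idea. The marker positions are made up, `cm_to_bp` is a hypothetical helper (not an R/qtl function), and it assumes a constant recombination rate between adjacent markers and a marker order that agrees between the genetic and the physical map.

```python
from bisect import bisect_right


def cm_to_bp(pos_cm, markers):
    """Interpolate a physical position (bp) for a genetic position (cM).

    markers: list of (cM, bp) tuples for ONE chromosome, sorted by cM.
    Assumes recombination is uniform between adjacent markers.
    """
    cms = [cm for cm, _ in markers]
    if not cms[0] <= pos_cm <= cms[-1]:
        raise ValueError("position lies outside the marker map")

    # Index of the first marker strictly to the right of pos_cm.
    i = bisect_right(cms, pos_cm)
    if i == len(cms):
        # pos_cm sits exactly on the last marker.
        return float(markers[-1][1])

    (cm_lo, bp_lo), (cm_hi, bp_hi) = markers[i - 1], markers[i]
    frac = (pos_cm - cm_lo) / (cm_hi - cm_lo)
    return bp_lo + frac * (bp_hi - bp_lo)


# Made-up example map: (cM, bp) pairs, not real data.
markers = [(10.0, 1_000_000), (10.5, 1_200_000), (11.0, 1_450_000)]

# A 2e-4 cM interval around 10.25 cM, converted endpoint by endpoint.
start_bp = cm_to_bp(10.2499, markers)
stop_bp = cm_to_bp(10.2501, markers)
print(round(start_bp), round(stop_bp))
```

If the genetic and physical marker orders disagree locally, as they seem to in my hotspot, the interpolated positions would come out reversed as well, so this would only be a rough guide there.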
Alternatively, I thought of just using the markers flanking a significant QTL and searching for genes between them - but as the BCIs vary across the different significant QTLs, this does not seem reliable either.
I also tried simply switching the start and stop markers of the support interval whenever one lies before the other - however, the resulting intervals, and therefore the number of genes, are unnecessarily large, particularly since my BCI actually seems so sharply defined.
Thank you so much!
I really appreciate any insight!
Kind regards,
Lena