Mapping of a qualitative trait

Helene Grindeland

Nov 19, 2024, 10:40:48 AM

to R/qtl discussion

Hello all,

I am currently working with data from an F2 cross and I am trying to map a qualitative trait, but I have issues with getting any clear candidate regions.

Initially, when I loaded the data, I had just over 17000 markers.
I then filtered with the following steps:

- 10% missingness (which excluded about 10 000 of the total markers)
- excluded markers with the worst segregation distortion
- excluded duplicated markers
- only kept markers with a minimum distance of 5 cM

which resulted in a cross object with a bit more than 7261 markers in total.
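For readers who want to see what a minimum-spacing filter does, here is a minimal sketch in Python. It is not the code used above; `thin_markers` and the example positions are hypothetical. It assumes the positions belong to a single chromosome and are in the same units as the threshold. If positions were in basepairs while the threshold was meant as 5 cM, almost no markers would be dropped.

```python
def thin_markers(positions, min_gap):
    """Greedily keep markers so that kept neighbours are at least min_gap apart.

    positions: marker positions on ONE chromosome, in the same units as min_gap.
    Returns the kept positions in ascending order.
    """
    kept = []
    for pos in sorted(positions):
        # Keep the first marker, then any marker far enough from the last kept one.
        if not kept or pos - kept[-1] >= min_gap:
            kept.append(pos)
    return kept


# Hypothetical positions in cM on a single chromosome.
example = [0.0, 1.2, 4.9, 5.0, 7.5, 10.1, 10.2, 18.0]
thinned = thin_markers(example, min_gap=5.0)
print(thinned)
```

A greedy pass like this is the simplest choice; it does not try to prefer markers with less missing data, which a real pipeline might want to do.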

The issue is that when I perform the QTL analysis (I used the scanone function with the binary model) and plot the results, it is very noisy and I see peaks all over. This is somewhat surprising to me since the parents are expected to have a very similar genetic background (although I may be wrong here, as this is my first QTL analysis).

When I look at the genetic map after filtering (attached), I see that the marker density is very high, which I suspect might lead to the noise in the QTL.

As a side note, the distribution of the phenotype is highly asymmetrical: 116 of the individuals have the dominant phenotype, while only 7 have the other phenotype.

My question is whether a high density of markers typically leads to noisy QTL results, and if so, what I could do to reduce the number of markers. Or could it be that the very skewed distribution of the phenotype makes it difficult to infer an association between genotype and phenotype? Or maybe there is something else I can change about my approach to improve the analysis?

I would appreciate any input on this.

Best regards,
Helene

genetic_map.png

Karl Broman

Nov 19, 2024, 10:52:33 AM

to R/qtl discussion

Generally, there's no inherent problem, in QTL mapping, with using very dense markers, as the increased density is accompanied by increased association among markers. LOD curves will be a bit more wiggly, but not very much so. As you increase the density of markers from 20 cM to 10 cM to 5 cM, you do increase your power a bit, but beyond that, not so much.
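To illustrate the point about association among markers, here is a small sketch in Python. It assumes the Haldane map function (no crossover interference), which is an assumption of this example and not something stated above. The recombination fraction between adjacent markers shrinks as spacing decreases, so neighbouring markers carry increasingly redundant information.

```python
import math


def haldane_recomb_fraction(distance_cm):
    """Recombination fraction between two loci under the Haldane map function."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


# Compare the marker spacings mentioned above, plus a very dense 1 cM map.
for spacing in (20, 10, 5, 1):
    r = haldane_recomb_fraction(spacing)
    print(f"{spacing:>2} cM spacing: recombination fraction ~ {r:.3f}")
```

At small spacings adjacent markers almost always share the same genotype, which is why adding more of them changes the LOD curve very little.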

But it is important, in the R/qtl software, to provide cM locations for markers rather than basepair locations. On your map, the range of marker positions should be on the order of hundreds, not millions. Having the markers in basepairs may lead to noisy results, but I wouldn't think so, unless maybe your marker genotypes have a high rate of errors.
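A quick sanity check on map units might look like the sketch below (Python, for illustration only). The function names, the 1000 cM cutoff, and the uniform 1 cM per Mb rate are all assumptions of this example. A uniform rate is only a crude stand-in; estimating the genetic map from the cross data itself (for example with R/qtl's `est.map`) would generally be preferable.

```python
def looks_like_basepairs(positions, max_plausible_cm=1000.0):
    """Heuristic: one chromosome spanning more than max_plausible_cm
    is probably in basepairs, not cM. The cutoff is an assumption."""
    return (max(positions) - min(positions)) > max_plausible_cm


def bp_to_approx_cm(positions_bp, cm_per_mb=1.0):
    """Crude conversion assuming a uniform recombination rate (cM per Mb)."""
    return [p / 1e6 * cm_per_mb for p in positions_bp]


# Hypothetical marker positions on one chromosome, in basepairs.
bp_positions = [1_250_000, 8_400_000, 52_300_000, 97_000_000]
cm_positions = bp_to_approx_cm(bp_positions)

print(looks_like_basepairs(bp_positions))  # span is in the tens of millions
print(looks_like_basepairs(cm_positions))  # span is under 100 after rescaling
```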

The skewed distribution of the phenotype *does* make it harder to identify QTL, but I'd expect lower power and low LOD scores, rather than particularly noisy LOD curves with lots of peaks. It will be very hard to map a binary trait in an intercross with 7 1's and 116 0's.
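To get a feel for how little information 7 cases carry, here is a rough Python sketch of a single-marker LOD score for a binary trait. It assumes fully observed genotypes and compares genotype-specific case proportions against one pooled proportion. It is not R/qtl's `scanone` implementation, and the genotype counts are hypothetical, chosen to be close to 1:2:1 for 123 individuals.

```python
import math


def _loglik(cases, total):
    """Binomial log-likelihood at the MLE p = cases / total (0 * log(0) treated as 0)."""
    ll = 0.0
    if cases > 0:
        ll += cases * math.log(cases / total)
    if total - cases > 0:
        ll += (total - cases) * math.log((total - cases) / total)
    return ll


def binary_lod(groups):
    """LOD score for a binary trait at one fully genotyped marker.

    groups: list of (cases, total) per genotype class, e.g. AA, AB, BB.
    """
    # Alternative: each genotype class has its own probability of being a case.
    ll_alt = sum(_loglik(k, n) for k, n in groups if n > 0)
    # Null: one probability shared by all genotype classes.
    all_cases = sum(k for k, _ in groups)
    all_total = sum(n for _, n in groups)
    ll_null = _loglik(all_cases, all_total)
    return (ll_alt - ll_null) / math.log(10)


# Hypothetical F2 genotype counts for 123 individuals, 7 of them cases.
best_case = [(7, 31), (0, 61), (0, 31)]  # every case in one homozygous class
no_signal = [(2, 31), (3, 61), (2, 31)]  # cases spread roughly evenly

print(f"best case LOD: {binary_lod(best_case):.2f}")
print(f"no signal LOD: {binary_lod(no_signal):.2f}")
```

Even the most favourable split of the 7 cases puts a ceiling on the LOD score, so moving one or two individuals between genotype classes can change the score noticeably.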