Re: [Rqtl-disc] Coding phase for outcross using 4way cross format?

Karl Broman

Feb 28, 2013, 9:57:43 AM

to rqtl...@googlegroups.com

> Given this outcross design, there can be up to 4 alleles at a given marker, and segregation types include "intercross", "backcross", 3-allele and 4-allele. I would like to include all of the segregation types in one analysis, so I am trying to work out how this can be done within the 4way format. In particular, my question centers on how phase is encoded in the AB x CD form. Perhaps this is described somewhere, but I couldn't find it in the help for read.cross or in the book.

The only documentation is in the help file for read.cross, but it's just one line on the numeric codes for the possible genotypes.

> In the absence of any real knowledge, I've worked out the following scheme-- the next 3 paragraphs will reveal whether I've got it or not. If not, could you steer me to the correct way to encode phase, and let me know if I can code all the segregation types in the 4way format?
>
> Starting with the help file for read.cross:
> "For a 4-way cross, the mother and father are assumed to have genotypes AB and CD, respectively"
> Do I understand correctly that the “mother and father” are the two F1 genotypes, making their progeny the F2? This would mean the original parentals are AC and BD?

It's easiest to think of two crosses involving four inbred lines, AA x BB and CC x DD.

The F1 would then be AB and CD.

You can use this for a cross between two outbred individuals, provided that you can reconstruct phase in the two F1 individuals (the parents of the four-way cross progeny).

> The genotype data for the progeny is assumed to be phase-known,
> Is phase encoded by the position within the cross AB x CD? If we always keep the F1 genotypes in this order, then alleles A and C are from P1 (genotype AC) while B and D are from P2 (BD)?

No, really I'm thinking of the F1s as two individuals with completely different parents.

> If this is true, then I think that any segregation pattern (backcross, intercross, 3 alleles) could be coded with the appropriate phase (which I already know) by arranging alleles from the two parents in the AB x CD format. In this scheme whatever alleles we put in the A and C position came from P1, while alleles in the B and D positions came from P2. If this is dead wrong, no need to read further! If it has potential, I've worked out several examples.

You need to work out phase, which means determining the relationship between alleles at different markers. For example, if you have two markers that are GT in one parent and TT in the other, you need to figure out whether the G's go together or the T's.

karl

Sara Via

Feb 28, 2013, 8:26:22 PM

to rqtl...@googlegroups.com

Hi Karl,

Thank you for your quick reply— I realize that outbred crosses aren’t your core interest, but may I continue the conversation with a few additional questions?

> It's easiest to think of two crosses involving four inbred lines, AA x BB and CC x DD.

Am I correct that in this scheme AA and CC have one extreme phenotype (say, large), while BB and DD have the other phenotype (small)? That would lead to the recombination between divergent chromosomes in the F1 that you need to map QTL for the trait.

> You can use this for a cross between two outbred individuals, provided that you can reconstruct phase in the two F1 individuals (the parents of the four-way cross progeny).

Right. So imagine going into an outbred population of large individuals and picking up an individual that is AC, then going to a population of small individuals and picking up one that is BD. Then use these as the parents of the mapping cross.

Parentals: AC x BD

F1: AB, AD, BC, CD. Then cross only AB x CD (since they have all 4 parental alleles).

For any progeny of AB x CD, we know that

A or C came from the large parent, and

B or D came from the small parent

This is essentially the same thing you would get from the AA x CC, BB x DD crosses you described, so I think we are OK so far.
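
The crossing scheme above can be enumerated in a few lines. This is only a bookkeeping sketch in Python, not R/qtl code, and the function name `offspring` is made up for illustration. It assumes each genotype is written as a two-letter string and that every offspring takes one allele from each parent.

```python
from itertools import product

def offspring(first_parent, second_parent):
    """Ordered offspring genotypes: one allele from the first parent,
    followed by one allele from the second parent (hypothetical helper)."""
    return [x + y for x, y in product(first_parent, second_parent)]

# Outbred parentals: AC from the large population, BD from the small one.
f1 = offspring("AC", "BD")
print(f1)       # ['AB', 'AD', 'CB', 'CD']

# Cross the two F1 individuals that together carry all four alleles.
progeny = offspring("AB", "CD")
print(progeny)  # ['AC', 'AD', 'BC', 'BD']
```

Every progeny genotype carries exactly one of A or B from the first F1 and one of C or D from the second, which gives the four phase-known classes of the 4way format.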

> You need to work out phase, which means determining the relationship between alleles at different markers.

This is the same thing as determining which alleles in the F2 came from the large parent(s) of the F1 and which from the small parent(s), correct?

I worked this out for each of my markers from the genotypes of the parentals and the F1, so I’m all set there. But how is the information about phase encoded in the 4way format? It’s easy for a marker with 4 alleles in the F1, because they can immediately be coded as AB and CD. But what about a marker with 3 alleles, one with 2 alleles that segregates like a backcross, or one with two alleles that segregates like an intercross?

Again, forgive my ignorance of inbred lines, but couldn't these segregation types happen in the 4way cross if some markers in one or both sets of parents are monomorphic? For example: (AA x BB and AA x DD: F1 cross is AB x AD), (AA x AA and CC x DD: F2 look like a backcross), or (AA x AA and DD x DD: F2 look like an intercross). Are these possibilities and others (plus dominance) why you have genotypes 5-14 in the 4way coding? How would you deal with these possibilities in a 4way cross of inbred lines?

Hopefully, my scheme does the same thing, but coming at it from a different angle. Once we know the phase, it seems we should be able to use the 4way coding for any segregation type in an outcross if, at each marker, we write the genotypes of our two F1 in a stereotyped way that encodes the phase: always put the F1 in the same order in the "cross" (i.e., F1(1) x F1(2)), and write their genotypes so that the alleles in the A or C position came from the large parental genotype, while the alleles in the B or D position came from the small parental genotype.

Thus, in the examples I gave in the original post, a cross is written differently for each possible phase, so the same F2 genotype gets a different code depending on the phase. For a marker with 3 alleles, the F2 genotype eg is AD when the cross is written ef x eg, but BD when the phase is different and the cross is written fe x eg. If you maintain this convention for all markers, then you know which set of alleles came from the large parent and which set came from the small parent.

By the way, I made a typo in one of my examples in the first post, so here is the corrected version:

Example 2: a backcross segregation type: lm x ll (2 alleles, the first F1 is heterozygous, the second is homozygous)

There are two possible phases for this cross, depending on which parental contributed the m to that first F1:

a. lm x ll (m came from the BD parent of the F1), so F2 are

ll = 5 (AC or AD)

lm = 6 (BC or BD)

b. ml x ll (m came from the AC parent), so F2 are

ml = 5 (AC or AD)

ll = 6 (BC or BD)
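
To check this kind of coding mechanically, here is a minimal Python sketch (again not R/qtl code). The function `fourway_codes` and the lookup table are hypothetical. The numeric codes are my reading of the one-line list in the read.cross help file and should be verified there. The sketch assumes the two F1 genotypes are written in phased order, with the first F1 giving the A and B positions and the second giving the C and D positions.

```python
# Assumed 4way codes, taken from my reading of the read.cross help file;
# verify against ?read.cross before relying on them.
SET_CODES = {
    frozenset({"AC"}): 1, frozenset({"BC"}): 2,
    frozenset({"AD"}): 3, frozenset({"BD"}): 4,
    frozenset({"AC", "AD"}): 5,         # A
    frozenset({"BC", "BD"}): 6,         # B
    frozenset({"AC", "BC"}): 7,         # C
    frozenset({"AD", "BD"}): 8,         # D
    frozenset({"AC", "BD"}): 9,
    frozenset({"AD", "BC"}): 10,
    frozenset({"BC", "AD", "BD"}): 11,  # not AC
    frozenset({"AC", "AD", "BD"}): 12,  # not BC
    frozenset({"AC", "BC", "BD"}): 13,  # not AD
    frozenset({"AC", "BC", "AD"}): 14,  # not BD
}

def fourway_codes(f1_first, f1_second):
    """Map each observed (unordered) progeny genotype to a 4way code.

    f1_first  -- phased genotype of the first F1, alleles in positions A, B
    f1_second -- phased genotype of the second F1, alleles in positions C, D
    Returns None for a genotype compatible with all four classes.
    """
    a, b = f1_first
    c, d = f1_second
    classes = {"AC": a + c, "BC": b + c, "AD": a + d, "BD": b + d}
    compatible = {}
    for cls, geno in classes.items():
        observed = "".join(sorted(geno))  # lm and ml look the same on a gel
        compatible.setdefault(observed, set()).add(cls)
    return {obs: SET_CODES.get(frozenset(s)) for obs, s in compatible.items()}

print(fourway_codes("lm", "ll"))  # {'ll': 5, 'lm': 6}
print(fourway_codes("ml", "ll"))  # {'lm': 5, 'll': 6}
```

With the three-allele example, `fourway_codes("ef", "eg")` gives eg the code 3 (AD), while `fourway_codes("fe", "eg")` gives it 4 (BD), so the way the F1 genotypes are written does carry the phase.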

Anyway—let me know if you think this scheme will work.

Thanks again,

Sara Via
