New markers addition


pmoncada...@gmail.com

Jul 11, 2013, 11:04:33 AM
to rqtl...@googlegroups.com
Hello.
Is there a way to add new markers to an existing map?

Thanks.

Pilar

Karl Broman

Jul 11, 2013, 2:33:15 PM
to rqtl...@googlegroups.com
It would be helpful to have some context for your question.

Do you have genotype data on a cross, with a known genetic map for most of the markers, but a few new markers with unknown position?

In that case, I would assign the new markers to some arbitrary linkage group (e.g., "un") in the data file, and then use the function tryallpositions() for each such marker, one at a time, to estimate their position relative to the other markers.

karl

pmoncada...@gmail.com

Jul 11, 2013, 3:32:42 PM
to rqtl...@googlegroups.com
Thanks for your answer. We have an F2 population and an established map for it. We now have genotype data for around 300 new markers to add to the map. What would be the best way to place these new markers on the map?

Pilar

Karl Broman

Jul 12, 2013, 12:25:30 AM
to rqtl...@googlegroups.com

Here's approximately what I would do:

(1) Use est.rf() to calculate pairwise recombination fractions and LOD scores and, for each new marker, look at which chromosome has the largest LOD.

(2) Then use tryallpositions() with those markers, one at a time, trying each interval on the inferred chromosome.

(3) Use movemarker() to move the marker to that interval.
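
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses `badorder`, an example cross bundled with R/qtl that (as far as I recall) contains a couple of markers assigned to the wrong chromosome, as a stand-in for a cross with new markers. The names `cross`, `chr_of`, `misplaced`, `target` and `tap` are made up for the example.

```r
library(qtl)

data(badorder)                        # example cross bundled with R/qtl (stand-in data)
cross <- est.rf(badorder)             # step 1: pairwise rec. fractions and LOD scores
lod <- pull.rf(cross, what = "lod")   # marker-by-marker LOD matrix (diagonal is NA)

# chromosome of every marker, in the same order as the rows/columns of lod
chr_of <- rep(chrnames(cross), nmar(cross))

# chromosome of the most tightly linked marker, for each marker
best_chr <- chr_of[apply(lod, 1, which.max)]
misplaced <- rownames(lod)[best_chr != chr_of]

m <- misplaced[1]                           # one marker to relocate
target <- best_chr[rownames(lod) == m]      # chromosome with the largest LOD

# step 2: try the marker in each interval of the inferred chromosome
tap <- tryallpositions(cross, m, target, error.prob = 0, verbose = FALSE)
best <- tap[which.max(tap$lod), ]

# step 3: move the marker to the best-supported position
cross <- movemarker(cross, m, target, best$pos)
```

With real data, `m` would loop over the new markers, and it is worth checking the size of the maximum LOD before trusting the inferred chromosome.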

karl

Pilar Moncada

Jul 12, 2013, 11:02:35 AM
to rqtl...@googlegroups.com
Thanks a lot.  We will try that way.

Pilar.



Jen

Jul 22, 2013, 3:08:33 PM
to rqtl...@googlegroups.com
Hi,
I have a similar question. I also want to add just under 200 markers to a framework genetic map. However, I had to assign the parental alleles at random due to a lack of data, so I expect that about half of these 200 markers will need to have their alleles switched. Do you consider the following variation on the above to be a valid method of identifying which markers should have their alleles switched?

1. Use est.rf() to calculate pairwise recombination fractions and LOD scores, and for the new markers, look at which chromosome has the largest LOD.
2. Determine whether, for the chromosome identified in step 1, a majority of its markers have pairwise recombination fractions well above 0.5 with the marker of interest. If so, switch the alleles for the marker of interest.
3. Use tryallpositions() on the marker of interest, trying each interval on the inferred chromosome.
4. Use movemarker() to move the marker to that interval.

Thanks very much,
Jen

Karl Broman

Jul 23, 2013, 11:17:48 AM
to rqtl...@googlegroups.com
Another approach would be to create two versions of each of these markers, with the two parental allele assignments. Toss the one from each pair that doesn't map.
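
A minimal sketch of the "two versions of each marker" idea, working on a plain genotype matrix rather than a cross object. It assumes the usual R/qtl coding for an F2 (1 = AA, 2 = AB, 3 = BB, 4 = not BB, 5 = not AA, NA = missing). The matrix `geno` and the helper `swap_alleles()` are hypothetical. R/qtl also has a switchAlleles() function that operates directly on a cross.

```r
# Swap the parental allele assignment for F2 genotype codes
# (assumed coding: 1 = AA, 2 = AB, 3 = BB, 4 = not BB, 5 = not AA).
swap_alleles <- function(g) {
  swapped <- c(3L, 2L, 1L, 5L, 4L)  # new code for old codes 1..5
  swapped[g]                        # NA (missing) stays NA
}

# hypothetical genotypes: 5 individuals x 2 new markers
geno <- cbind(m1 = c(1, 2, 3, NA, 1),
              m2 = c(3, 3, 2, 1, NA))

# second copy of every marker with the alleles switched
flipped <- apply(geno, 2, swap_alleles)
colnames(flipped) <- paste0(colnames(geno), "_sw")

both <- cbind(geno, flipped)  # each marker now appears in both orientations
```

Both copies would then go into the data file. After est.rf(), the copy with the wrong orientation should show recombination fractions well above 0.5 with its true neighbours and can be dropped.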

karl

Elisa Zhang

Jul 23, 2013, 3:29:22 PM
to rqtl...@googlegroups.com
Hi Karl,

What would you recommend doing if I have a map established for an F2 population (~7000 markers) and wanted to add genotype data for ~4000 markers to this map?

Do you think that the approach you gave Pilar would be feasible for this too?

Thanks!

Elisa

Karl Broman

Jul 23, 2013, 3:33:21 PM
to rqtl...@googlegroups.com
If the initial map is of good quality, I think it would be feasible, though tryallpositions() would be quite slow. It might be best to focus on the estimated recombination fractions from est.rf(), to find initial positions for all of those markers.

karl

Robert Mattera

Feb 8, 2021, 6:21:46 PM
to R/qtl discussion
Hey Dr. Broman,

I am also trying to do something similar to everyone in this thread. I have a map with around 8,000 markers and am adding around 6,000 more. I added the 6,000 markers with unknown positions to chr=un and imported my data into R/qtl. I used est.rf to estimate recombination fractions. I went through the steps in your tutorial to quality-control the data. At this point I have about 3,000 markers remaining in chr=un. I can look at each marker one by one using the following code:

lod <- pull.rf(CombinedMap_reduced, what="lod")
mnun <- markernames(CombinedMap_reduced, chr="un")
plot(lod, mnun[1], bandcol="gray70", alternate.chrid=TRUE)

When I look at a marker, I can clearly see which chromosome has the highest LOD. Is there a way to move all the markers at once, each to the chromosome with the highest LOD for that marker? Going one by one would be time-consuming and is daunting. I have little to no experience with writing scripts or for loops, but I imagine a script could identify which chromosome has the highest LOD for marker i, run tryallpositions() for that marker and chromosome combination, and place the marker in the right position with movemarker() before going on to the next marker.

 Thanks for any help or guidance,
Rob

Karl Broman

Feb 9, 2021, 12:36:37 PM
to R/qtl discussion


This is maybe the point where you need to invest some time in learning R.
There are a number of good intro R books, many free online. For example, Hands on Programming with R: https://rstudio-education.github.io/hopr/

For your situation, you might proceed like this:

(a) use that lod matrix, output from pull.rf with what="lod", to identify the location with highest lod for each marker

which_max <- apply(lod, 1, which.max)

(b) turn that into marker names

which_marker <- colnames(lod)[which_max]

(c) figure out the chromosome for each of those

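which_chr <- which_marker
for(i in seq_along(which_marker)) {
    # find.markerpos() returns a data frame with columns chr and pos; keep only chr
    which_chr[i] <- as.character(find.markerpos(CombinedMap_reduced, which_marker[i])$chr)
}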

(d) Use a for() loop to move the markers to those chromosomes.

CombinedMap_rearranged <- CombinedMap_reduced
for(i in seq_along(which_marker)) {
    CombinedMap_rearranged <- movemarker(CombinedMap_rearranged, rownames(lod)[i], which_chr[i])
}
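
One caveat with steps (a) to (d): taking the maximum over the whole LOD matrix lets an unplaced marker pick another unplaced marker as its best partner. A possible variation, sketched below on a toy matrix, is to search only the rows for "un" markers and the columns for already placed markers. Everything here (`markers`, `chr_of`, the LOD values) is made up for illustration. With a real cross, `chr_of` could presumably be built as `rep(chrnames(cross), nmar(cross))` with names from `markernames(cross)`.

```r
# Toy symmetric LOD matrix and a marker -> chromosome lookup (hypothetical)
markers <- c("a1", "a2", "b1", "b2", "u1", "u2")
chr_of  <- c(a1 = "1", a2 = "1", b1 = "2", b2 = "2", u1 = "un", u2 = "un")

lod <- matrix(0, 6, 6, dimnames = list(markers, markers))
lod["u1", "a2"] <- lod["a2", "u1"] <- 12
lod["u2", "b1"] <- lod["b1", "u2"] <- 9
lod["u1", "u2"] <- lod["u2", "u1"] <- 15   # two unplaced markers linked to each other
diag(lod) <- NA

# rows: unplaced markers; columns: markers that already have a chromosome
is_un <- chr_of[rownames(lod)] == "un"
sub <- lod[is_un, !is_un, drop = FALSE]

best_marker <- colnames(sub)[apply(sub, 1, which.max)]
best_chr <- chr_of[best_marker]
names(best_chr) <- rownames(sub)

# size of the best LOD, useful for skipping weakly linked markers
max_lod <- apply(sub, 1, max, na.rm = TRUE)
```

Here `best_chr` assigns u1 to chromosome 1 and u2 to chromosome 2, even though their strongest link overall is to each other.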

karl

Robert Mattera

Feb 16, 2021, 10:58:11 PM
to R/qtl discussion
Hey Dr. Broman,

Thanks so much for your help! Sorry for the delay. I've been trying to figure out how to move only the markers from chr=un to their respective chromosomes that have the highest LOD while ignoring their LOD to chr=un.


Below is my code 

> lodcolnames <- colnames(lod2)[apply(lod2,1,which.max)] ## this pulled out the names of each col that had the highest value for each row (and only one for each - the first one)
> lodmarkernames <- rownames(lod2)
> markers2move <- cbind(lodmarkernames,lodcolnames) ## created a matrix with the marker names and the respective chromosomes they pulle
> for (i in 1:nrow(markers2move)) {
+     markers2move[i,2] <- substr(markers2move[i,2],(nchar(markers2move[i,2])+1)-2,nchar(markers2move[i,2]))
+ }

## The code above dropped the name of the marker with the highest LOD and kept only the chromosome number that I had appended, after a decimal point, to the end of each marker name. However, it left the decimal point in for chromosomes with fewer than two digits.


        lodmarkernames    lodcolnames
   [1,] "33137289.26"     "26"
   [2,] "snp12588"        "12"
   [3,] "snp18311"        ".6"
   [4,] "snp18465"        ".6"
   [5,] "snp32804"        ".9"
   [6,] "snp65640"        ".6"
   [7,] "snp87688"        ".5"
   [8,] "snp88606"        ".5"


> for (i in 1:nrow(markers2move)) {
+     if (markers2move[i,2] < 1) markers2move[i,2] <- substr(markers2move[i,2],(nchar(markers2move[i,2])+1)-1,nchar(markers2move[i,2]))
+ }

## This for loop ignored the two-digit chromosome numbers and focused only on the ones "less than 1", keeping the single rightmost character (so only the number, not the number and the decimal point).


        lodmarkernames    lodcolnames
   [1,] "33137289.26"     "26"
   [2,] "snp12588"        "12"
   [3,] "snp18311"        "6"
   [4,] "snp18465"        "6"
   [5,] "snp32804"        "9"
   [6,] "snp65640"        "6"
   [7,] "snp87688"        "5"
   [8,] "snp88606"        "5"
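
The two substr() passes above could probably be replaced by a single regular expression that keeps whatever follows the last decimal point, regardless of how many digits the chromosome number has. A small sketch, assuming (as described above) that the chromosome was appended to each marker name after a "."; the example names are hypothetical:

```r
# hypothetical column names of the form <marker>.<chromosome>
lodcolnames <- c("33137289.26", "snp70112.12", "snp40001.6")

# drop everything up to and including the last "."
chr <- sub("^.*\\.", "", lodcolnames)
chr
```

This avoids the string comparison `markers2move[i,2] < 1`, which compares text rather than numbers. Note that a name without any "." would be returned unchanged.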


I am now trying to use the command movemarker to move each of these markers to their new respective chromosome (lodcolnames) using this for loop

> for (i in nrow(markers2move)) {
+     CombinedMap_rearranged <- movemarker(CombinedMap_rearranged, markers2move[i,1], markers2move[i,2])
+ }


However, I am getting the output below, which shows that chr=un still contains all of the markers:


> summaryMap(CombinedMap_rearranged)
        n.mar length ave.spacing max.spacing
1          17    145         9.1          20
2          78   2160        28.1         230
3          89   1815        20.6         155
4          68   1975        29.5         475
5          36   1735        49.6         425
6          42    840        20.5         130
7          69   3645        53.6         255
8          39   1755        46.2         365
9          72   2345        33.0         255
10         32    310        10.0          20
11         66    725        11.2          60
12         64   1195        19.0         115
13         58   2680        47.0         360
14         39    975        25.7         275
15         58   1610        28.2         185
16         82   2695        33.3         220
17         48   2090        44.5         680
18         35   1200        35.3         400
19         48   1510        32.1         235
20         47   1000        21.7         230
21         64   1735        27.5         175
22         54    945        17.8         105
23         92   3285        36.1         210
24         41   1455        36.4         545
25         51   1680        33.6         190
26         51   1680        33.6         605
un       2946  33030        11.2         100
overall  4386  76215        17.5         680


Thanks for your help
Rob

Karl Broman

Feb 18, 2021, 12:51:39 AM
to R/qtl discussion
I can't see what might be going wrong; sorry.
Maybe try calling the movemarker() line for different values of i and then using find.markerpos() to see if the marker moved.
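
A check of this kind might look as follows. This is a sketch on `fake.f2`, an example F2 cross bundled with R/qtl, not on the data from this thread. The marker and the target chromosome are arbitrary choices for illustration.

```r
library(qtl)

data(fake.f2)                                # example F2 cross bundled with R/qtl
marker <- markernames(fake.f2, chr = 1)[1]   # arbitrary marker for the test

before <- find.markerpos(fake.f2, marker)    # data frame with columns chr and pos
moved  <- movemarker(fake.f2, marker, 2)     # same three-argument call as in the loop
after  <- find.markerpos(moved, marker)

before
after    # chr should now be 2 if the move worked
```

If a single call like this moves the marker but the loop does not, the problem is more likely in the loop than in movemarker().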

karl

Robert Mattera

Feb 24, 2021, 2:26:19 AM
to R/qtl discussion
Thanks Karl! After many attempts and your help I figured it out. I noticed that the problem wasn't in the command or in what was being fed to the command, but in the construction of my for loop. Below is the code that finally worked:

> for (i in 1:nrow(m2)) {
+      CombinedMap_rearranged <- movemarker(CombinedMap_rearranged,m2[i,1],m2[i,2])
+      }
> summaryMap(CombinedMap_rearranged)
        n.mar length ave.spacing max.spacing
1         151   1485         9.9          20
2         193   3310        17.2         230
3         185   2775        15.1         155
4         185   3145        17.1         475
5         140   2775        20.0         425
6         122   1640        13.6         130
7         246   5415        22.1         255
8         152   2885        19.1         365
9         211   3735        17.8         255
10        133   1330        10.1          20
11        139   1455        10.5          60
12        176   2315        13.2         115
13        140   3500        25.2         360
14        142   2005        14.2         275
15        154   2570        16.8         185
16        212   3995        18.9         220
17        140   3010        21.7         680
18        111   1960        17.8         400
19        141   2440        17.4         235
20        127   1800        14.3         230
21        237   3465        14.7         175
22        189   2295        12.2         105
23        312   5485        17.6         210
24        131   2355        18.1         545
25        179   2960        16.6         190
26        138   2550        18.6         605
overall  4386  72655        16.7         680
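
For anyone hitting the same problem: the earlier loop in this thread used `for (i in nrow(markers2move))`, which runs once, with i equal to the number of rows, whereas `1:nrow(...)` visits every row. A tiny base-R illustration with a made-up three-row matrix (`m2` here is hypothetical and not the matrix from the thread):

```r
# hypothetical marker/chromosome table with three rows
m2 <- cbind(marker = c("m1", "m2", "m3"), chr = c("6", "9", "5"))

visited_wrong <- c()
for (i in nrow(m2)) {            # nrow(m2) is the single number 3
  visited_wrong <- c(visited_wrong, i)
}

visited_right <- c()
for (i in seq_len(nrow(m2))) {   # 1, 2, 3; also safe when there are zero rows
  visited_right <- c(visited_right, i)
}

visited_wrong
visited_right
```

seq_len(nrow(m2)) behaves like 1:nrow(m2) here, but it yields an empty sequence for a zero-row matrix instead of counting down from 1 to 0.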


My next question is about how to identify the best location for each newly added marker on its chromosome. I know you go through the use of the ripple command. Ideally, I would like to keep the positions of the markers whose physical locations I know and insert the markers with unknown locations into their appropriate intervals. Is there a better way to do this than tryallpositions(), or do you recommend sticking with ripple?


Thanks again for your advice!

