update chr: position to rs ID


Jongyun Jung

Jul 22, 2019, 4:20:06 PM
to plink2-users
Hi,

I'm trying to update the variant IDs in my imputed data from chr:position to rs ID numbers.

I found a similar solution on Biostars.


For example, my .map file looks like this:

1       1:748878        0       748878
1       1:751756        0       751756
1       1:752566        0       752566
1       1:752721        0       752721
1       1:752894        0       752894
1       1:753405        0       753405

I downloaded a reference rs ID file from the UCSC Table Browser (https://genome.ucsc.edu/cgi-bin/hgTables) and saved it as All-SNPs.txt.

This file looks like this:

#chrom  chromEnd        name
chr1    100663297       rs1235665665
chr1    1048577 rs1346354302
chr1    62914560        rs538775156
chr1    196083715       rs1467218511
chr1    224395264       rs1488102548
chr1    246415379       rs1177496554
chr1    248512513       rs1485351626
chr1    1966085 rs1434739345
chr1    20316161        rs1364715928

I ran this command on my computer:

 plink --bfile HIPFXchr1 --update-map All-SNPs.txt --make-bed --out HIPFXchr1-updated

And then I got this error message:

Ambiguous sex IDs written to HIPFXchr1-updated.nosex .
Error: Line 195333666 of --update-map file has fewer tokens than expected.

Could you help me work out how to change chr:position to rs ID?

Thank you for your help!



Christopher Chang

Jul 23, 2019, 12:21:37 AM
to plink2-users
1. You need to specify that variant IDs should be read from column 3, not column 1.  "--update-map All-SNPs.txt 2 3"
2. You may also need to remove the last line(s) of All-SNPs.txt if they don't adhere to the same format as the rest of the lines.  "head -n 195333665 All-SNPs.txt > truncated.txt"
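Before truncating, it may help to see which lines are actually short. A minimal sketch, assuming the file is whitespace-delimited with three columns as in the excerpt above; `sample-snps.txt` and its deliberately incomplete last line are made up for illustration:

```shell
# Hypothetical stand-in for All-SNPs.txt; the last line is missing its rsID.
printf 'chr1\t1048577\trs1346354302\nchr1\t1966085\trs1434739345\nchr1\t2031\n' > sample-snps.txt

# List the line number and content of every line with fewer than 3 fields.
awk 'NF < 3 { print NR ": " $0 }' sample-snps.txt > short-lines.txt

# Keep only the well-formed lines, wherever the short ones occur.
awk 'NF >= 3' sample-snps.txt > sample-snps.clean.txt
```

Unlike `head -n`, the second awk call also drops short lines that are not at the very end of the file.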

Jongyun Jung

Jul 23, 2019, 11:34:57 AM
to plink2-users
Hi Chris,

Thank you for your help.

After I truncated All-SNPs.txt with "head -n 195333665 All-SNPs.txt > All-SNPs-updated.txt", I was able to create new PLINK binary files with the command below.

plink --bfile HIPFXchr1 --update-map All-SNPs-updated.txt 2 3 --make-bed --out HIPFXchr1-updated

However, when I check the output .bim file for rs ID numbers, it still shows chr:position IDs.

 less -S HIPFXchr1-updated.bim

1       1:748878        0       748878  T       G
1       1:751756        0       751756  T       C
1       1:752566        0       752566  A       G
1       1:752721        0       752721  G       A
1       1:752894        0       752894  C       T
1       1:753405        0       753405  A       C

Could you help with this?

Thank you.

Christopher Chang

Jul 23, 2019, 12:02:17 PM
to plink2-users
Oh, that's because --update-map isn't the command you wanted in the first place.  You want to run --update-name instead, after postprocessing the truncated All-SNPs.txt file to contain both the chr:pos and the new rsIDs.
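The postprocessing step could look something like this. It is only a sketch, assuming the three-column chrom/chromEnd/name layout shown earlier in the thread and that the dataset's IDs are exactly `chr:pos` with no `chr` prefix; the file names are placeholders:

```shell
# Hypothetical stand-in for the truncated All-SNPs.txt (rows taken from the excerpt above).
printf '#chrom\tchromEnd\tname\nchr1\t1048577\trs1346354302\nchr1\t1966085\trs1434739345\n' > sample-snps.txt

# Skip the '#' header, strip the "chr" prefix, and write "chr:pos<TAB>rsID".
awk '!/^#/ && NF >= 3 { sub(/^chr/, "", $1); print $1 ":" $2 "\t" $3 }' sample-snps.txt > chrpos-to-rsid.txt
```

The resulting two-column file (old ID, new ID) is the shape that would then be passed to --update-name in place of --update-map.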

Jongyun Jung

Jul 23, 2019, 12:15:55 PM
to plink2-users
I tried --update-name, but the updated .bim file still shows chr:position.

When I truncate my All-SNPs.txt file, should I extract only chr:pos and the new rsIDs?

Is it because my All-SNPs-updated.txt is not formatted as chr:position?

Here is the head of All-SNPs-updated.txt. 


chr1    100663297       rs1235665665
chr1    1048577 rs1346354302
chr1    62914560        rs538775156
chr1    196083715       rs1467218511
chr1    224395264       rs1488102548
chr1    246415379       rs1177496554
chr1    248512513       rs1485351626
chr1    1966085 rs1434739345


Thank you.

Christopher Chang

Jul 23, 2019, 12:17:12 PM
to plink2-users
Yes, it is because you did not reformat your All-SNPs-updated.txt to contain a chr:pos column.  Please reread my previous post.

Jongyun Jung

Jul 23, 2019, 7:50:11 PM
to plink2-users
Hi Christopher,

I converted the .txt file to the same chr:position format, as shown below.

less -S All-SNPs-updated3.txt

1:100000002 rs1462733795
1:100000004 rs373074226
1:100000012 rs10875231
1:10000002 rs1323965099
1:100000048 rs898988451
1:100000049 rs1419136389
1:100000050 rs1410081010
1:100000056 rs575744361
1:10000005 rs1478843528
1:100000067 rs772996438

When I run this command,

 plink --bfile HIPFXchr1 --update-name All-SNPs-updated3.txt --make-bed --out HIPFXchr1-updated

I get a duplicate variant ID error.

PLINK v1.90b4.1 64-bit (30 Mar 2017)           www.cog-genomics.org/plink/1.9/
(C) 2005-2017 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to HIPFXchr1-updated.log.
Options in effect:
  --bfile HIPFXchr1
  --make-bed
  --out HIPFXchr1-updated
  --update-name All-SNPs-updated3.txt

1031771 MB RAM detected; reserving 515885 MB for main workspace.
1874897 variants loaded from .bim file.
847 people (0 males, 0 females, 847 ambiguous) loaded from .fam.
Ambiguous sex IDs written to HIPFXchr1-updated.nosex .
Error: Duplicate variant ID '1:100002443' in --update-name file.

Is there any way to remove the duplicate variant IDs in PLINK?

Thank you.

Christopher Chang

Jul 23, 2019, 7:54:36 PM
to plink2-users
If you want to remove all duplicated IDs,

  plink --bfile HIPFXchr1 --write-snplist
  uniq -d plink.snplist duplicated.snplist
  plink --bfile HIPFXchr1 --exclude duplicated.snplist --out ...

should get the job done.
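Since the error quoted above names the --update-name file rather than the .bim, the same kind of filtering may be needed on that file as well. A rough sketch, assuming a two-column file of old ID and new ID; it drops every old ID that occurs more than once, because it is ambiguous which rsID such a position should receive. The file names and the `rsA`/`rsB` rows are placeholders, not real data:

```shell
# Hypothetical stand-in for All-SNPs-updated3.txt with one duplicated chr:pos.
printf '1:100000002 rs1462733795\n1:100002443 rsA\n1:100002443 rsB\n1:100000012 rs10875231\n' > sample-update.txt

# Pass 1 counts each old ID; pass 2 prints only lines whose old ID was seen exactly once.
awk 'NR == FNR { n[$1]++; next } n[$1] == 1' sample-update.txt sample-update.txt > sample-update.dedup.txt
```

On a file with hundreds of millions of lines this keeps every ID in memory, so restricting the file to the positions present in the dataset first is probably worthwhile.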

Hannah Mandle

Apr 24, 2023, 9:58:20 AM
to plink2-users
Hi,

I followed the instructions in this thread to rename chr:pos IDs to rsIDs, and it worked for a subset of my files. However, some runs stop with the error below:

 ./plink --bfile abi2_aac --update-map abi2b.txt  --recode A --out abi2_aa_rs
PLINK v1.90b7 64-bit (16 Jan 2023)             www.cog-genomics.org/plink/1.9/
(C) 2005-2023 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to abi2_aa_rs.log.
Options in effect:
  --bfile abi2_aac
  --out abi2_aa_rs
  --recode A
  --update-map abi2b.txt

15883 MB RAM detected; reserving 7941 MB for main workspace.
1727 variants loaded from .bim file.
2257 people (0 males, 0 females, 2257 ambiguous) loaded from .fam.
Ambiguous sex IDs written to abi2_aa_rs.nosex .
Error: Invalid bp coordinate on line 2 of --update-map file.

The .txt file looks like this: 

2:204183120 rs934711772
2:204183122 rs539421053
2:204183124 rs917373398
2:204183125 rs948908696
2:204183128 rs1044995944
2:204183129 rs888678182
2:204183139 rs1005785433
2:204183146 rs1471716359
2:204183154 rs1037297815
2:204183155 rs1261544649
....

If I delete the SNP at line 2, I then get an error at line 12, 21, etc. I have 208 genes I have filtered and need to rename to rsIDs and I am wondering if there is an easier workaround to this problem.

Thanks!
Hannah
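One thing that may be worth checking here: as far as I understand, --update-map reads the second column as a base-pair coordinate by default, while the file shown has rsIDs there, which is the old-ID/new-ID layout that Christopher's earlier reply pairs with --update-name. A small sketch that counts lines whose second column is not a plain integer; the sample file name is a placeholder and the rows are copied from the excerpt above:

```shell
# Hypothetical stand-in for abi2b.txt.
printf '2:204183120 rs934711772\n2:204183122 rs539421053\n' > sample-abi2b.txt

# Count lines whose second column is not made up solely of digits.
awk '$2 !~ /^[0-9]+$/ { bad++ } END { print bad + 0 }' sample-abi2b.txt > nonnumeric-count.txt
```

A nonzero count would be consistent with the "Invalid bp coordinate" message, in which case the file is probably meant for --update-name instead.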

Hannah Mandle

Apr 24, 2023, 2:41:57 PM
to plink2-users
Never mind! Solved using plink2.