--update-ids with long IID names

Thomas Ueland

Sep 16, 2023, 4:38:03 PM
to plink2-users

 Hello!

I have a plink2 pgen/pvar/psam file set where the FID corresponds to a unique user ID and the IID is a long name (typically over 50 characters) that starts with the FID and has other information separated by underscores. An example IID is R123456789_chipname_plate_platenumber_A01. Each R123456789 identifier is unique.

I want to change the IID so that it is just the R123456789 text (i.e., what is currently in the FID column, and also what is at the beginning of the long text string in the IID column). I made a tab-separated .txt file without headers where the first column is the .psam IID and the second column is the .psam FID (i.e., what I want the new IID to be). Unfortunately, when I try the --update-ids flag, it does not recognize the IDs. The log is below.

If I run the code with a different pgen/pvar/psam file set (1KG) and random numbers for new IDs, I am able to update names just fine.

My guess is that the length of the IID names has something to do with why it won’t recognize the names to update, but I am not sure and would appreciate any direction on next steps of troubleshooting! My ultimate goal is to add phenotypes to this file set, but without an ability to specify IIDs I am unable to do this.

Thanks so much for your time!

 

PLINK v2.00a5LM 64-bit Intel (29 Aug 2023)
Options in effect:
  --make-just-psam
  --out /home/jupyter/my_dataset_names_updated
  --pfile /home/jupyter/my_dataset
  --update-ids /home/jupyter/names_to_update.txt

Start time: Sat Sep 16 13:46:23 2023
Random number seed: 1694871983
60283 MiB RAM detected, ~58152 available; reserving 30141 MiB for main
workspace.
Using up to 16 threads (change this with --threads).
91449 samples (52322 females, 39127 males; 91449 founders) loaded from
/home/jupyter/my_dataset.psam.
Note: No phenotype data present.
--update-ids: 0 samples updated, 91449 IDs not present.
Writing /home/jupyter/my_dataset_names_updated... done.

End time: Sat Sep 16 13:46:23 2023

Christopher Chang

Sep 16, 2023, 6:28:51 PM
to plink2-users
I've added a comment to the --update-ids documentation clarifying that when you use the two-column input format, old FIDs are 0.

Thomas Ueland

Sep 17, 2023, 6:11:22 AM
to plink2-users
Thank you for clarifying this! If I use the four-column input format, are the old FIDs allowed to be something other than 0, or does the requirement of old FIDs = 0 still apply? Just trying to figure out the best way to go about changing the IIDs for my file set.

Really appreciate your time!
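
For anyone wanting to try the four-column route, here is a minimal Python sketch of how such a file could be built from the .psam. It assumes the .psam has a header line starting with #FID and IID, and that the four columns are old FID, old IID, new FID, new IID (the order documented for PLINK 1.9; please check the plink2 --update-ids documentation before relying on it). Whether plink2 matches non-zero old FIDs in this format is the open question in this thread, so this only shows how the file could be generated. The function names are made up for illustration.

```python
def build_update_rows(psam_text):
    """Return (old FID, old IID, new FID, new IID) tuples from .psam text.

    Assumption: the first line is a header beginning with '#FID' and 'IID'.
    The new IID is set to the existing FID; the FID itself is left unchanged.
    """
    lines = psam_text.splitlines()
    header = lines[0].split()
    if header[:2] != ["#FID", "IID"]:
        raise ValueError("expected a .psam header starting with #FID and IID")
    rows = []
    for line in lines[1:]:
        if not line.strip():
            continue  # skip blank lines
        fid, iid = line.split()[:2]
        rows.append((fid, iid, fid, fid))
    return rows


def format_rows(rows):
    """Format rows as tab-separated text with no header line."""
    return "".join("\t".join(row) + "\n" for row in rows)


# Small in-memory example using the ID pattern from the first post.
example_psam = (
    "#FID\tIID\tSEX\n"
    "R123456789\tR123456789_chipname_plate_platenumber_A01\t2\n"
)
print(format_rows(build_update_rows(example_psam)), end="")
```

On a real data set, the .psam would be read from disk and the formatted text written to the file passed to --update-ids.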
