How do I merge several binary files from separate samples into a single binary file using Plink2?

sbir...@hotmail.com

May 27, 2024, 9:54:59 AM
to plink2-users

Good day, everyone

I am trying to combine a total of 5 aDNA samples (full genome) into a single file for the purpose of further analysis in ADMIXTOOLS2. At the moment, I have separate .vcf and separate .bed, .bim and .fam files for each DNA sample.

I tried using the following code:

plink2 --pmerge-list ~/plink2/EAS_merge_file.txt --bfile --make-bed --out EAS_Kent_all

Here is the text file of plink1 files to be merged:

~/data/plink2/EAS001uc
~/data/plink2/EAS002uc
~/data/plink2/EAS004uc
~/data/plink2/EAS005uc
~/data/plink2/EAS006uc

This was the output:

"Logging to EAS_Kent_all.log.

Options in effect:
--bfile
--make-bed
--out EAS_Kent_all
--pmerge-list /home/steve/plink2/EAS_merge_file.txt

Start time: Sun May 26 16:04:57 2024
Error: Missing --bfile argument."

Then I tried the same run but without --bfile. Here is the result:

plink2 --pmerge-list ~/plink2/EAS_merge_file.txt --make-bed --out EAS_Kent_all

PLINK v2.00a3 SSE4.2 (18 Feb 2022) www.cog-genomics.org/plink/2.0/
(C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3

Logging to EAS_Kent_all.log.
Options in effect:
--make-bed
--out EAS_Kent_all
--pmerge-list /home/steve/plink2/EAS_merge_file.txt

Start time: Sun May 26 16:16:04 2024
9663 MiB RAM detected; reserving 4831 MiB for main workspace.
Using up to 6 compute threads.
--pmerge-list: 5 filesets specified.
Error: Failed to open ~/data/plink2/EAS001uc.psam : No such file or directory.
End time: Sun May 26 16:16:04 2024

So, I am asking plink2 to generate a single file using the Plink1 format, but plink2 is looking for a .psam file, which is not there. Also, I am not clear on what the file for --bfile should be called since the prefix names for all five aDNA files are different.

How can I get these five sample files into one bed/bim/fam file set?

Thank you for your help!

EAS_Kent_all.log

Christopher Chang

May 27, 2024, 1:17:33 PM
to plink2-users
0. The most recent --pmerge[-list] bugfix was in August 2023, so I recommend that you update your plink2 build.

1. https://www.cog-genomics.org/plink/2.0/data#pmerge says that the correct syntax is "--pmerge-list ~/plink2/EAS_merge_file.txt bfile", not "--pmerge-list ~/plink2/EAS_merge_file.txt --bfile".  You aren't supposed to add dashes in front of flag-modifiers.

sbir...@hotmail.com

May 27, 2024, 5:41:29 PM
to plink2-users
Thank you!  I updated the version as you suggested.  Now I seem to be having a problem with the pmerge-list text file.  I am attaching both the log file and the text file. 

It appears to me that plink2 is finding the txt file, but then when it begins to read it, plink2 can't seem to find the .fam file associated with the first sample "EAS001uc".  I tried to run it using both the full path and just the file name.  Neither worked.  I must be doing something wrong with the txt file format.

I appreciate the help!

sbir...@hotmail.com

May 27, 2024, 5:42:18 PM
to plink2-users
Sorry, forgot the attachments.
EASpop1.log
EAS_merge_file.txt

Christopher Chang

May 27, 2024, 7:59:22 PM
to plink2-users
"~/" in the --pmerge-list file will not be expanded properly by plink2; sorry about the inconvenience.
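
A minimal sketch of one workaround, assuming the list file holds one fileset prefix per line: expand a leading "~/" yourself before handing the file to plink2. The function name expand_tilde and the output file name in the usage line are hypothetical.

```shell
# Print the list file with a leading "~/" on each line replaced by the
# absolute home directory, since plink2 reads the entries literally.
# Assumes $HOME contains no "|" or "&" characters (they are special to sed here).
expand_tilde() {
  # $1: list file with one fileset prefix per line
  sed "s|^~/|$HOME/|" "$1"
}
```

Usage would be something like: expand_tilde EAS_merge_file.txt > EAS_merge_file.abs.txt, then pass the new file to --pmerge-list. Lines that are already absolute or relative paths are left untouched.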

sbir...@hotmail.com

May 28, 2024, 1:12:13 PM
to plink2-users
Fair enough.  How do I get the --pmerge-list command to recognize the lines in the text file?  Offhand, it appears that everything else is working.  I tried to expand the path to /home/steve/<prefix name> (see attached text file) but got the same result as before.  The .txt file resides in the working directory and plink2 seems to recognize it, but then can't find the prefix file in the same directory.

What is the correct format to get plink2 to recognize the entries in the txt file?
EASpop1.log
EAS_merge_file.txt

sbir...@hotmail.com

May 28, 2024, 1:15:10 PM
to plink2-users
P.S.:  By prefix file, I mean the list of files to be merged into one binary plink file.  The .bed, .bim and .fam (e.g., EAS001.bed, EAS002.bim, EAS001.fam)  are all in the working directory.

Christopher Chang

May 31, 2024, 5:33:22 PM
to plink2-users
The .log says your working directory is /home/steve/plink2 , not /home/steve .
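
Relative prefixes in the list file are opened relative to the directory plink2 is run from, so a quick pre-flight check can help. The following is only a sketch, assuming one PLINK 1 binary fileset prefix per line; check_filesets is a hypothetical helper, not part of plink2.

```shell
# Report every .bed/.bim/.fam file that a list entry refers to but that
# cannot be found from the current working directory.
check_filesets() {
  # $1: list file, one fileset prefix per line
  missing=0
  while IFS= read -r prefix || [ -n "$prefix" ]; do
    [ -z "$prefix" ] && continue          # skip blank lines
    for ext in bed bim fam; do
      if [ ! -f "$prefix.$ext" ]; then
        echo "missing: $prefix.$ext"
        missing=1
      fi
    done
  done < "$1"
  return "$missing"                        # 0 if everything was found
}
```

Run it from the same directory you intend to run plink2 from, e.g. check_filesets EAS_merge_file.txt. Any "missing:" line points at a prefix that does not match a file on disk, whether because of the directory or because of the spelling of the prefix.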

sbir...@hotmail.com

Jun 4, 2024, 8:35:36 PM
to plink2-users
I tried moving the --pmerge-list text file to different directories just to see what would happen.  The most encouraging combination seemed to be this:

./plink2 --pmerge-list ~/plink2/EAS_merge_file1.txt bfile --out ~/plink2/EASpop1

This produced a different error message than before:

Start time: Tue Jun  4 19:27:11 2024
9663 MiB RAM detected, ~7392 available; reserving 4831 MiB for main workspace.

Using up to 6 compute threads.
--pmerge-list: 5 filesets specified.
Error: Failed to open EAS002uc.fam : No such file or directory.

Now it appears to be able to locate the .txt file and even the first file in the pmerge-list text file (EAS001uc), but now seems to be getting hung up on finding the second file in the list (EAS002uc).  I am attaching the log file and the txt file for the merge.  There is in fact a EAS002.fam file in the ~/plink2/ directory, so I'm not sure why the pmerge-list command can see the first file in the text list but not the second.  (I'm hoping that makes sense!)

Thank you for your help!
EAS_merge_file1.txt
EASpop1.log

Christopher Chang

Jun 5, 2024, 12:30:02 AM
to plink2-users
You told plink2 to look for EAS002uc.fam, not EAS002.fam .

sbir...@hotmail.com

Jun 5, 2024, 10:38:07 AM
to plink2-users
Pardon me, that was a typo.  The plink2 folder contains the file EAS002uc.fam but the --pmerge-list command can't seem to locate it.

Chris Chang

Jun 5, 2024, 10:46:45 AM
to sbir...@hotmail.com, plink2-users
Well, this isn’t going to end up working anyway because plink2 --pmerge-list currently only concatenates chromosomes and the like, not samples.  You should use plink 1.x for this step.
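
A rough sketch of the plink 1.9 route, assuming the five filesets named in the first post sit in the current directory: pass the first fileset with --bfile and list the remaining four in a text file for --merge-list. The list file name merge_rest.txt and the --out prefix are placeholders.

```shell
# List every fileset except the first; the first one goes to --bfile.
printf '%s\n' EAS002uc EAS004uc EAS005uc EAS006uc > merge_rest.txt

# Only attempt the merge if a plink 1.9 binary is on PATH.
if command -v plink >/dev/null 2>&1; then
  plink --bfile EAS001uc --merge-list merge_rest.txt --make-bed --out EAS_Kent_all \
    || echo "merge failed; see EAS_Kent_all.log" >&2
fi
```

The merge can still fail for other reasons, for example variants that disagree between filesets, in which case the .log file is the place to look.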

sbir...@hotmail.com

Jun 6, 2024, 9:02:25 PM
to plink2-users
I tried to run the merge in Plink 1.9; this is what I got:

steve@steve-h8-1534:~$ plink --merge-list EAS_merge_file1.txt --out KENT
PLINK v1.90p 64-bit (13 Feb 2023)            www.cog-genomics.org/plink/1.9/
(C) 2005-2023 Shaun Purcell, Christopher Chang   GNU General Public License v3
Error: Failed to open KENT.log.  Try changing the --out parameter.

I tried again but with a somewhat different format to the .txt file:

$ plink --merge-list EAS_merge_file2.txt --out KENT
PLINK v1.90p 64-bit (13 Feb 2023)            www.cog-genomics.org/plink/1.9/
(C) 2005-2023 Shaun Purcell, Christopher Chang   GNU General Public License v3
Error: Failed to open KENT.log.  Try changing the --out parameter.


I can't attach a .log file because there isn't one.  I am attaching the two text files.  I'm also attaching one of the binary file sets for informational purposes to the next post.

It was suggested on May 27, when I first started trying to do this merge, that I use Plink2 instead.  Apparently, that can't be done.  What should I try next?  Why does plink 1.9 fail to create or open the KENT.log file?

Thank you.
EAS_merge_file1.txt
EAS_merge_file2.txt
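
One thing that may be worth ruling out for the KENT.log error is whether the log file can be written at all. This is only a guess at the cause, not a confirmed diagnosis; can_write_out is a hypothetical helper that checks the directory implied by the --out prefix and any existing log file.

```shell
# Check that the directory implied by an --out prefix exists and is
# writable, and that an existing <prefix>.log is not read-only.
can_write_out() {
  # $1: the value passed to --out, e.g. KENT or /some/dir/KENT
  dir=$(dirname -- "$1")
  if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
    echo "not writable: $dir"
    return 1
  fi
  if [ -e "$1.log" ] && [ ! -w "$1.log" ]; then
    echo "existing log is not writable: $1.log"
    return 1
  fi
  echo "writable: $dir"
}
```

For example, can_write_out KENT checks the current directory. If it reports a problem, pointing --out at a directory you own would be the next thing to try.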

sbir...@hotmail.com

Jun 6, 2024, 9:07:30 PM
to plink2-users
Sorry, it won't let me upload the binary files; says they're too big.
