Thanks! Right now I'm prepping my own YAML/JSON based on that meta.yaml you listed, and providing that with a separate cache-parameters call. That works, but only if my sequence IDs are modified to follow the sort of formatting that file has. So for example if my FASTA looks like:
>1-igh
...
>1-igk
...
and my metadata file looks like:
{"1-igh": {"paired-uids": ["1-igk"], "locus": "igh"}, "1-igk": {"paired-uids": ["1-igh"], "locus": "igk"},...
Then I can run something like this, and all is well:
partis cache-parameters --paired-loci --infname input.fasta --input-metafnames input.meta.yaml --parameter-dir params --paired-outdir data
partis partition --paired-loci --paired-indir data --parameter-dir params --paired-outdir output
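For reference, a metadata file like the one above could be generated with something along these lines. This is only a rough Python sketch, not anything from partis itself: the function name `build_pair_metadata` is made up, and it assumes every ID has the form `<number>-<locus>` and that IDs sharing a number are paired with each other.

```python
import json
from collections import defaultdict


def build_pair_metadata(seq_ids):
    """Build {uid: {"paired-uids": [...], "locus": ...}} from IDs like '1-igh'.

    Assumption: the part before the last '-' identifies the cell/droplet,
    and the part after it is the locus (igh, igk, or igl).
    """
    by_droplet = defaultdict(list)
    for uid in seq_ids:
        droplet, locus = uid.rsplit("-", 1)
        by_droplet[droplet].append((uid, locus))

    meta = {}
    for members in by_droplet.values():
        for uid, locus in members:
            meta[uid] = {
                # Every other sequence with the same prefix is a partner.
                "paired-uids": [other for other, _ in members if other != uid],
                "locus": locus,
            }
    return meta


meta = build_pair_metadata(["1-igh", "1-igk", "2-igh", "2-igl"])
# JSON is also valid YAML, so this text can be saved as input.meta.yaml.
print(json.dumps(meta))
```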
But if my sequence IDs are different (my originals used a naming scheme like "H1" paired with "K1", and then "H2" paired with, say, "L2", etc.) then I get this in my output from cache-parameters, even though the FASTA and metadata match up:
writing to paired subdirs
0/363 igh seqs pair with igk (warning)
0/363 igh seqs pair with igl (warning)
The first pair of entries in my FASTA and YAML in this case look like:
>H1
...
>K1
...
{"H1": {"paired-uids": ["K1"], "locus": "igh"}, "K1": {"paired-uids": ["H1"], "locus": "igk"}, ...
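To be concrete about what "match up" means here, this is the kind of check I have in mind, as a small standalone Python sketch. The helper names `read_fasta_ids` and `check_pairing` are hypothetical and are not part of partis; the check only confirms that every FASTA ID has a metadata entry and that each pairing is reciprocal.

```python
def read_fasta_ids(lines):
    """Return the sequence IDs (first token after '>') from FASTA lines."""
    return [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    ]


def check_pairing(fasta_ids, meta):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    known = set(fasta_ids)
    for uid in fasta_ids:
        if uid not in meta:
            problems.append(f"{uid}: in FASTA but missing from metadata")
    for uid, info in meta.items():
        if uid not in known:
            problems.append(f"{uid}: in metadata but missing from FASTA")
        for partner in info.get("paired-uids", []):
            if partner not in meta:
                problems.append(f"{uid}: partner {partner} has no metadata entry")
            elif uid not in meta[partner].get("paired-uids", []):
                problems.append(f"{uid}: pairing with {partner} is not reciprocal")
    return problems


fasta_ids = read_fasta_ids([">H1", "ACGT", ">K1", "ACGT"])
meta = {
    "H1": {"paired-uids": ["K1"], "locus": "igh"},
    "K1": {"paired-uids": ["H1"], "locus": "igk"},
}
problems = check_pairing(fasta_ids, meta)
print(problems)
```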
So it looks as though it's still insisting that my sequence IDs are formatted like the "guess" option for parsing droplet IDs, but wouldn't giving the paired-uids via the meta.yaml skip the guessing? Or does it still need to run the extract pairing info step even with metadata supplied? (Am I just conflating droplet IDs and UIDs? I wouldn't expect any parsing to be needed since I supply the metadata, but I'm not totally clear on the droplet ID versus UID terminology.) It does work when I've reformatted my IDs, so I'm not stuck at the moment, at least. I just want to make sure I'm not screwing up something basic.
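In case it helps anyone else, the ID reformatting workaround can be sketched roughly like this in Python. The function name `rename_to_droplet_style` is made up, and the `<number>-<locus>` target format is just the one that happened to work for me, not something I know partis requires. It also assumes every partner has its own metadata entry and that pairings are reciprocal.

```python
def rename_to_droplet_style(meta):
    """Map original IDs to '<index>-<locus>' names, one index per paired group.

    Returns (new_name, renamed_meta). Assumes each partner listed in
    "paired-uids" has its own entry in meta and that pairings are reciprocal.
    """
    new_name = {}
    next_index = 1
    for uid, info in meta.items():
        if uid in new_name:
            continue  # already named as part of an earlier group
        group = [uid] + list(info.get("paired-uids", []))
        for member in group:
            new_name[member] = f"{next_index}-{meta[member]['locus']}"
        next_index += 1

    renamed_meta = {
        new_name[uid]: {
            "paired-uids": [new_name[p] for p in info.get("paired-uids", [])],
            "locus": info["locus"],
        }
        for uid, info in meta.items()
    }
    return new_name, renamed_meta


original = {
    "H1": {"paired-uids": ["K1"], "locus": "igh"},
    "K1": {"paired-uids": ["H1"], "locus": "igk"},
    "H2": {"paired-uids": ["L2"], "locus": "igh"},
    "L2": {"paired-uids": ["H2"], "locus": "igl"},
}
new_name, renamed_meta = rename_to_droplet_style(original)
print(new_name)
```

The same `new_name` mapping would then be applied to the FASTA headers so that the sequence file and the metadata stay in sync.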
Jesse