1) The samples file looks correct. Double check against the example
here.
2) I am not sure I understand what you mean by the second point, but here are a few things to note:
- PyClone will report the cellular prevalence of SNVs in each sample.
- I assume you mean somatic SNVs not germline SNPs. PyClone doesn't handle the latter.
- Each sample should have the same set of SNVs. So even if a mutation is not predicted in a sample, you will still need to include the count data for that mutation from that sample in the file.
- Make sure the `mutation_id` is consistent for SNVs across samples.
  - sample_1:chr2:12345, sample_2:chr2:12345 would be the wrong way to name things.
  - chr2:12345 is the correct way.
  - You do not need to use the chromosome and coordinates as above; just make sure the name is consistent across samples.
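
If it helps, here is a rough Python sketch of how you could check the two points above (same set of SNVs in every sample, same `mutation_id` naming) before running anything. It is not part of PyClone. The helper names `read_mutation_ids` and `find_missing_mutations` are made up for this example, and it assumes tab-separated input with a header row that has a `mutation_id` column. The other columns and all the values are placeholder data.

```python
import csv
import io


def read_mutation_ids(tsv_text):
    """Return the set of mutation_id values from tab-separated text with a header row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["mutation_id"] for row in reader}


def find_missing_mutations(ids_by_sample):
    """Map each sample name to the mutation IDs seen in other samples but absent from it."""
    all_ids = set().union(*ids_by_sample.values())
    return {sample: sorted(all_ids - ids) for sample, ids in ids_by_sample.items()}


# Placeholder data: sample_2 lacks a row for chr3:67890.
sample_1 = "mutation_id\tref_counts\tvar_counts\nchr2:12345\t90\t10\nchr3:67890\t50\t50\n"
sample_2 = "mutation_id\tref_counts\tvar_counts\nchr2:12345\t100\t0\n"

missing = find_missing_mutations({
    "sample_1": read_mutation_ids(sample_1),
    "sample_2": read_mutation_ids(sample_2),
})
print(missing)
```

Any sample that comes back with a non-empty list is either missing rows for those mutations or is using a different name for them (for example with a sample prefix).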
Cheers,
Andy