Matt
May 11, 2012, 11:04:11 AM
to Isolation with Migration
Hello,
I have two questions regarding formatting the input file for HapSTR
data. Sorry if these were posted previously; I did a search and could
not find anything, and the instruction manual doesn't quite detail this
enough (or maybe I am just misreading it; I apologize in advance if
that's the case!).
1) From the manual it looks like I cut the STR sequence out of the
full sequence in the input file, but if I have sequence on both sides
of the STR do I just concatenate these two flanking sequences to make
the sequence portion of the HapSTR? For example, if my HapSTR
sequence was AATCCACACACACAGTTC I would code the input file as 5
AATCGTTC?
2) If I have a HapSTR that actually has two linked STRs and three sets
of flanking sequence (i.e. upstream of the first STR, between them, and
downstream of the second STR), would I apply the same format but
concatenate the three sequence sections? For example,
AATCCACACACACAGTTCTTTCTTCTTCTTCTTCTTCCTAG I would code the input file
as 5 5 AATCGTTCCTAG?
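To make the encoding I am proposing in 1) and 2) concrete, here is a
minimal Python sketch. It only illustrates my guess at the layout
(repeat counts first, then the flanking sequences joined together); it
is not taken from the manual, and the function name hapstr_line and its
arguments are made up for this example.

```python
def hapstr_line(flanks, repeat_counts):
    """Build one data line in the layout proposed in this post.

    flanks        -- flanking sequences in 5'-to-3' order, STRs removed
    repeat_counts -- number of repeat units for each STR, in the same order

    Assumption: repeat counts come first, separated by spaces, followed by
    the concatenated flanking sequence. Whether the program actually expects
    this layout is exactly what is being asked.
    """
    # n STRs are surrounded by n + 1 flanking segments
    if len(flanks) != len(repeat_counts) + 1:
        raise ValueError("expected one more flank than STR repeat counts")
    counts = " ".join(str(count) for count in repeat_counts)
    return counts + " " + "".join(flanks)


# Question 1: one STR with sequence on both sides
print(hapstr_line(["AATC", "GTTC"], [5]))

# Question 2: two linked STRs with three flanking segments
print(hapstr_line(["AATC", "GTTC", "CTAG"], [5, 5]))
```

The two calls reproduce the lines I suggested above, 5 AATCGTTC and
5 5 AATCGTTCCTAG.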
While I have your attention, I just thought of another question
regarding input for mtDNA. Since the analysis assumes that all loci
are unlinked, if we have multiple mtDNA loci should they all be
entered as a single locus? The reason I ask is that, even though they
are all linked and entering them as separate loci may violate an
assumption of the model, we don't expect them to have identical
mutation rates. Also, if they have different sample sizes, then any
missing individuals in a concatenated dataset would effectively cause
the program to ignore the entire dataset. I think the solution is to
enter them all individually, but I just wanted to be sure.
Thanks for your help!
Matt