Hi Paul,
As you can imagine, a short DNA fragment may map to more than one CDR3 contig. Therefore, when estimating the frequency, we took an EM approach that splits each ambiguous short read among the contigs it maps to, in proportion to their estimated frequencies (this procedure was repeated until the EM converged). The FPKM notation was only an analogy, not an accurate description. In reality, we used the expected read coverage (from EM) divided by the total number of TCR reads.
However, I need to emphasize that unless you believe your data are deep enough, the frequency estimate is not accurate. This is why we never reported any analysis of this value in our paper. Our simulation results suggested that, at low coverage, the estimate has almost no correlation with the real values, due to high sampling error.
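For illustration only, here is a minimal Python sketch of the kind of EM read splitting described above. It is not the actual implementation: the function name, the input format (each read given as a list of distinct contig indices it maps to), the uniform starting point, and the convergence tolerance are all assumptions, and contig length is ignored.

```python
def em_contig_frequencies(read_hits, n_contigs, max_iter=1000, tol=1e-9):
    """Estimate contig frequencies from reads that may map to several contigs.

    read_hits: list of lists; read_hits[i] holds the distinct contig indices
               that read i maps to (hypothetical input format).
    Returns a list of frequencies: expected read count per contig divided by
    the total number of reads.
    """
    if not read_hits:
        raise ValueError("need at least one read")

    # Assumption: start from uniform frequencies.
    freqs = [1.0 / n_contigs] * n_contigs

    for _ in range(max_iter):
        # E-step: split each read among its contigs in proportion
        # to the current frequency estimates.
        expected = [0.0] * n_contigs
        for hits in read_hits:
            total = sum(freqs[c] for c in hits)
            for c in hits:
                if total > 0:
                    expected[c] += freqs[c] / total
                else:
                    # Degenerate case: fall back to an equal split.
                    expected[c] += 1.0 / len(hits)

        # M-step: expected read count divided by the total number of reads.
        new_freqs = [e / len(read_hits) for e in expected]

        # Stop once the estimates no longer change appreciably.
        delta = max(abs(a - b) for a, b in zip(new_freqs, freqs))
        freqs = new_freqs
        if delta < tol:
            break

    return freqs


# Toy example: two reads unique to contig 0, one unique to contig 1,
# and one ambiguous read that maps to both.
reads = [[0], [0], [1], [0, 1]]
freqs = em_contig_frequencies(reads, n_contigs=2)
```

In this toy case the ambiguous read ends up split roughly 2:1 in favour of contig 0, because contig 0 has more uniquely mapped reads. With only a handful of reads, as here, the estimate is dominated by sampling error, which is the caveat above.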
Thanks,
Bo