Number of input and output rows does not match


Gary Chan

Aug 8, 2022, 8:41:34 PM
to HLAthena
Hi all,

I have tried to use the HLAthena "predict" function to score a list of ~24,000 peptides (pep), with different ctex_up and ctex_dn, for 5 different HLA alleles. As the website only works with up to 10,000 peptides, I split my file into files of 9,000 rows each and used the settings below.

Assign peptides to alleles by: scores
Threshold: 0.1
Peptide column name: pep
Log-transform expr? no
Context available? yes
Aggregate by peptide? no
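
For anyone who needs to do the same split, here is a minimal sketch of how the file could be cut into chunks that each keep the header row. The function name `split_table`, the tab delimiter, and the output file names are assumptions made for illustration; they are not part of HLAthena.

```python
import csv
from pathlib import Path


def split_table(src, out_dir, chunk_size=9000, delimiter="\t"):
    """Write src as numbered chunk files, each holding the header
    plus at most chunk_size data rows. Returns the chunk paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Read the whole table; ~24,000 rows fit comfortably in memory.
    with open(src, newline="") as fh:
        reader = csv.reader(fh, delimiter=delimiter)
        header = next(reader)
        rows = list(reader)

    paths = []
    for start in range(0, len(rows), chunk_size):
        # Hypothetical naming scheme: chunk_01.tsv, chunk_02.tsv, ...
        path = out_dir / f"chunk_{start // chunk_size + 1:02d}.tsv"
        with open(path, "w", newline="") as out:
            writer = csv.writer(out, delimiter=delimiter)
            writer.writerow(header)
            writer.writerows(rows[start:start + chunk_size])
        paths.append(path)
    return paths
```

Calling `split_table("peptides.tsv", "chunks")` on a 24,000-row file would produce three files (9,000, 9,000 and 6,000 rows), each under the 10,000-peptide limit.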

However, the output files consistently contain fewer rows than the input, suggesting that some of the rows were discarded.
I have tried changing the "Aggregate by peptide?" option from no to yes, reducing the number of rows to 5,000 per file, and deleting all columns except "pep", "ctex_up" and "ctex_dn", but the problem persists.
Some of the pep sequences are duplicated, but the duplicates have different ctex_up or ctex_dn values.

I have gone through the "How to" page and cannot figure out what I did wrong, so I would greatly appreciate any suggestions.

Thanks a lot!
Gary


Gary Chan

Aug 8, 2022, 10:31:01 PM
to HLAthena
oops, nvm, sorry!
I found that some of the pep or ctex sequences contained "?", so HLAthena dropped those rows. The problem is fixed now =)

Gary
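
In case it helps someone with the same symptom, below is a rough sketch of a check that can be run on the input before uploading, to list rows whose sequences contain characters such as "?". It assumes that only the 20 standard amino acid letters are acceptable in the pep, ctex_up and ctex_dn columns; that alphabet is a guess rather than something taken from the HLAthena documentation, so adjust it if the tool accepts other symbols. The name `find_bad_rows` is made up for this example.

```python
import csv
import re

# Assumed alphabet: the 20 standard amino acid one-letter codes.
# Anything else (e.g. "?", "X", or an empty cell) gets flagged.
VALID = re.compile(r"^[ACDEFGHIKLMNPQRSTVWY]+$")


def find_bad_rows(rows, columns=("pep", "ctex_up", "ctex_dn")):
    """Return (row_number, column, value) for every checked cell that
    is not a non-empty string of standard amino acid letters.
    row_number counts data rows from 1 (the header is not counted)."""
    bad = []
    for n, row in enumerate(rows, start=1):
        for col in columns:
            value = (row.get(col) or "").strip()
            if not VALID.match(value):
                bad.append((n, col, value))
    return bad


def check_file(path, delimiter="\t"):
    """Convenience wrapper: run find_bad_rows on a delimited file."""
    with open(path, newline="") as fh:
        return find_bad_rows(csv.DictReader(fh, delimiter=delimiter))
```

If the number of distinct flagged rows equals the difference between the input and output row counts, the dropped rows are most likely explained by these characters.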