unique identifiers for each splicing event

Miriam Llorian

Aug 29, 2025, 10:21:36 AM

to Biociphers

Dear developers,

What is the recommended way of obtaining unique identifiers for splicing events? I'm interested in looking at events that are altered across multiple comparisons. I'm using the modulize output, and I've noticed that event_id is not unique there: multiple rows share the same identifier. For now I've been building a name by pasting together event_id, spliced_with and junction_name, but I was wondering whether I'm overcomplicating this step and how you would recommend going about it.

Thank you,

Miriam

San Jewell

Sep 8, 2025, 1:26:02 PM

to Biociphers

Hi Miriam,

event_id is not unique per row in the output files because each row is a junction (or intron), and an event has multiple junctions/introns. event_id is, however, unique per event. (If you find the same event_id used for two different events, please report it here as a bug.)

However, you also state that you are trying to compare across multiple separate runs. In general, event_id is only unique to an event within a single run. If you want a truly unique event id across multiple unrelated runs, I think you'd need to consider everything that makes an event unique for your purposes and perhaps take a hash function of it. For example, if you assume all cross comparisons are human, you could hash chromosome+junction1start_junction1end+junction2start_junction2end, etc. for all junctions (where the plus just means string concatenation). If multiple species are being compared, you'd need to add species into the mix too. If your runs are also grouped by sets of experiments, you might want to add something about the samples into the uniqueness hash as well.
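A minimal sketch of that hashing idea in Python, under a few assumptions: the function name `cross_run_event_key` and its parameters are hypothetical and not part of the tool's output or API, the junction coordinates are whatever start/end values you read from your own output rows, and SHA-256 plus sorting the junctions (so row order does not change the key) are illustrative choices rather than a recommendation from the developers.

```python
import hashlib


def cross_run_event_key(chromosome, junctions, species="", sample_group=""):
    """Build a run-independent key for one splicing event.

    Hypothetical helper: `junctions` is an iterable of (start, end) pairs
    covering every junction/intron that belongs to the event.
    """
    # Sort so the key does not depend on the order rows were read in.
    coords = sorted((int(start), int(end)) for start, end in junctions)

    # Label each field so an empty species cannot be confused with
    # an empty sample group, then concatenate everything into one string.
    parts = [
        f"species={species}",
        f"group={sample_group}",
        f"chrom={chromosome}",
    ]
    parts.extend(f"{start}_{end}" for start, end in coords)
    raw_key = "+".join(parts)

    # Hash the concatenated string into a fixed-length identifier.
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


# Example with made-up coordinates: same junctions, different row order.
key_a = cross_run_event_key("chr1", [(100, 200), (100, 350)])
key_b = cross_run_event_key("chr1", [(100, 350), (100, 200)])
```

Here `key_a` and `key_b` are identical because the junctions are sorted before hashing, while changing the chromosome, any coordinate, the species, or the sample group gives a different key. Whether to include species or sample grouping depends on what "the same event" should mean across your runs.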

Let me know if that makes sense or if I'm not understanding something properly.

Thanks,

-San
