Majiq

majkena Isufaj

Mar 3, 2026, 5:49:22 AM

to Biociphers

Hi, I am using MAJIQ v3. We aligned our FASTQ files with STAR in two-pass mode, without the flag --outFilterIntronMotifs RemoveNoncanonical. We first ran the pipeline on the BAM file that includes multimapping reads, and the corresponding results are reported in the attached presentation.

We then repeated the analysis after removing multimapping reads (unique-only BAM), and these results are also included in the presentation for comparison.

We could not find anything in the documentation about how the BAM input is handled in this respect, and the forum threads we found seem to conflict, in particular: https://groups.google.com/u/2/g/majiq_voila/c/tIKiKYpOyjc?pli=1

In this thread you say that MAJIQ discards multimapped reads.

The other thread, however, says the following: https://groups.google.com/u/2/g/majiq_voila/c/d8wzrhN_X_w/m/jzI48ootAAAJ MAJIQ requires de novo junctions to be represented by multiple reads at multiple positions.

Thanks

Majkena

Majiq bam unique e Multimap .key

San Jewell

Mar 3, 2026, 10:47:10 AM

to Biociphers

Hi Majkena,

Thank you for your questions; I will try to help with them in this thread.

In general, it sounds like you are trying to determine where a difference in majiq output comes from when running STAR with and without --outFilterIntronMotifs RemoveNoncanonical. I would generally expect that switch to remove de novo introns. If that is not the behavior you see, I may need the full list of STAR and majiq switches used, to determine whether other factors are influencing the result; if I cannot see anything relevant there, I may need to look at the BAM files themselves to dig deeper into the cause.

Unfortunately, I am not able to make good use of the presentation you attached to verify that this is indeed your concern, as 1) it does not seem to be written in English, and 2) it requires Keynote, which I do not have access to, so I only see a converted version that is difficult to read. When you have a chance, please re-send the attachment in English, as a PDF or another open format.

Thank you,

-San

majkena Isufaj

Mar 5, 2026, 6:02:53 AM

to Biociphers

Hi San,

Thank you for your prompt reply.

Let me clarify the question we are trying to answer.

Should we restrict the analysis to uniquely aligned reads, or should we also include multimapping reads for MAJIQ quantification?

Right now, we are using MAJIQ v3 on BAM files produced by STAR in two-pass mode, without the flag --outFilterIntronMotifs RemoveNoncanonical, to increase the sensitivity of the analysis.

We then ran the MAJIQ pipeline in two configurations:
1. Using the full BAM file, thus including multimapping reads.
2. Using a BAM where multimapping reads were removed (NH:i:1 only)

We compared the number of de novo events with coverage > 0 detected by MAJIQ.

Our results are the following:

FULL BAM (multimapping included)
Single sample: 53,966
Cohort build: 141,575

UNIQUE BAM (NH:i:1 only)
Single sample: 27,197
Cohort build: 82,376

This corresponds roughly to:

Single: −49% events when removing multimapping reads
Cohort: −42% events

When comparing the sets of detected events:

Total de novo events (coverage > 0):
Multimap: 141,575 events
Unique: 82,376 events
Overlap: 73,480 events

Events only detected with multimapping reads: 68,095
Events only detected with unique reads: 8,896
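
As a quick consistency check on the counts above, here is a minimal sketch that recomputes the set differences and the percentage changes. The variable and function names are illustrative only; the numbers are copied from this post.

```shell
#!/bin/sh
# Counts copied from the post (cohort build, de novo events with coverage > 0).
multimap=141575
unique=82376
overlap=73480

# Events seen in only one of the two analyses.
only_multimap=$((multimap - overlap))
only_unique=$((unique - overlap))

# Percentage change in event count going from the full BAM to the
# filtered BAM, printed with one decimal place.
pct_change() {
    awk -v full="$1" -v filt="$2" \
        'BEGIN { printf "%.1f", 100 * (filt - full) / full }'
}

echo "only with multimapping reads: $only_multimap"
echo "only with unique reads:       $only_unique"
echo "cohort change: $(pct_change "$multimap" "$unique")%"
echo "single change: $(pct_change 53966 27197)%"
```

The two differences reproduce the 68,095 and 8,896 reported above, and the percentage changes come out at about -41.8% (cohort) and -49.6% (single sample).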

This result is puzzling on multiple levels. However, all our questions lead to the same point:
how does MAJIQ use reads with multiple alignments? And, as a consequence, what do you suggest we do in order to obtain reliable LSVs?

Thank you very much for your help.

Best regards
Majkena

San Jewell

Mar 6, 2026, 1:38:50 PM

to Biociphers

Hi Majkena,

I've just checked the internal code of majiq v3 to verify my answer before getting back to you. Majiq ignores multimappers by making sure that the read, as defined by the samtools library, is not flagged as BAM_FUNMAP or BAM_FSECONDARY (Alignments.hpp line 56); in other words, it must be a mapped read with only a primary mapping. Therefore, I would surmise that whatever process you are using to change the BAM file is affecting reads beyond simply removing those that fit this criterion. Can you describe the process or software that you are using to change the files, so that I can check what exactly it does and try to reproduce this change?
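
For readers who want to see what that flag condition selects, here is a minimal sketch that applies the same two-bit test to SAM text. It only mirrors the condition as described in this post (FLAG bits 0x4 and 0x100 both unset); it is not MAJIQ's code, and the function name is made up for illustration.

```shell
#!/bin/sh
# Keep header lines and alignments whose FLAG (column 2) has neither
# 0x4 (unmapped, BAM_FUNMAP) nor 0x100 (secondary, BAM_FSECONDARY) set.
# Reads SAM text on stdin and writes SAM text to stdout.
keep_mapped_primary() {
    awk -F '\t' '
        /^@/ { print; next }
        {
            unmapped  = int($2 / 4) % 2     # bit 0x4
            secondary = int($2 / 256) % 2   # bit 0x100
            if (!unmapped && !secondary) print
        }'
}

# With samtools installed, the equivalent selection on a real file would be
# (file names are placeholders):
#   samtools view -b -F 0x104 -o output.primary.bam input.bam
```

Note that this test looks only at the FLAG field, not at the NH tag, so the primary alignment of a read that was also reported at other loci still passes it.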

Thanks,

-San

majkena Isufaj

Mar 16, 2026, 5:32:48 AM

to Biociphers

Hi San,

Thank you again for checking the MAJIQ v3 code.

I went back to my command history and confirmed how the “unique-only” BAM files were generated. I filtered the original STAR BAM files by retaining only alignments with the NH:i:1 tag (reads with a single reported alignment). The command was essentially:

samtools view -h input.bam |
awk 'BEGIN{OFS="\t"} /^@/ {print; next} $0 ~ /NH:i:1/ {print}' |
samtools view -b -o output.unique.bam
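
One detail worth checking in this command: /NH:i:1/ is a substring match, so it would also retain alignments tagged NH:i:10 through NH:i:19 if any are present (whether they are depends on the STAR settings). A minimal sketch of an exact match is shown below; the function name is hypothetical, and it assumes the NH tag appears as its own tab-separated field among the optional SAM fields (column 12 onward).

```shell
#!/bin/sh
# Keep header lines and alignments whose NH tag is exactly 1.
# The tag must equal a whole tab-separated field, so NH:i:10 is not matched.
keep_nh1() {
    awk -F '\t' '
        /^@/ { print; next }
        {
            # Optional SAM fields start at column 12.
            for (i = 12; i <= NF; i++)
                if ($i == "NH:i:1") { print; next }
        }'
}

# Hypothetical usage on a real file (file names are placeholders):
#   samtools view -h input.bam | keep_nh1 | samtools view -b -o output.unique.bam -
```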

The filtered BAMs were then sorted and indexed.

From these BAMs I generated the splice junction files using:

majiq-v3 sj BAM annotation.sg.zarr sample.sj

Then I built a single cohort splicegraph from all SJ files:

majiq-v3 build annotation.sg.zarr sg.zarr --sj *.sj

After that, I ran psi-coverage for each sample using the same splicegraph:

majiq-v3 psi-coverage sg.zarr sample.psicov sample.sj

Finally, I generated the TSV outputs with majiq-v3 quantify.

So the difference between the two analyses is only in the BAM filtering step (original BAMs vs BAMs filtered with NH:i:1), while the rest of the MAJIQ workflow was the same.

From your explanation, I understand that this filtering is stricter than MAJIQ’s internal handling, since MAJIQ excludes unmapped and secondary alignments but may still retain the primary alignment of reads that have multiple reported mappings.

This may explain the differences observed between the analyses.
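
To make the distinction concrete, here is a minimal sketch that reports, for each SAM record, whether it passes the flag test described earlier in the thread and whether it passes the NH:i:1 test used for the "unique-only" BAMs. The function name and the synthetic record are illustrative assumptions, not real data.

```shell
#!/bin/sh
# For each SAM record on stdin, report whether it passes
#   flag_ok: FLAG has neither 0x4 (unmapped) nor 0x100 (secondary) set
#   nh_ok:   an optional field is exactly NH:i:1
compare_filters() {
    awk -F '\t' '
        /^@/ { next }
        {
            flag_ok = (int($2 / 4) % 2 == 0 && int($2 / 256) % 2 == 0)
            nh_ok = 0
            for (i = 12; i <= NF; i++) if ($i == "NH:i:1") nh_ok = 1
            printf "%s\tflag_ok=%d\tnh_ok=%d\n", $1, flag_ok, nh_ok
        }'
}

# Synthetic example: the primary alignment of a read reported at 3 loci.
printf 'readX\t0\tchr1\t100\t1\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNH:i:3\n' |
    compare_filters
```

For this record the sketch prints flag_ok=1 and nh_ok=0: the alignment would survive the flag-based test but is removed by the NH:i:1 filter.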
