Error using importRdata

Laura López Hernández

Jun 13, 2023, 10:49:51 AM
to IsoformSwitchAnalyzeR
Hi,

I'm using IsoformSwitchAnalyzeR version 2.1.2 and trying to use importRdata to load count and TPM tables that were generated with IsoQuant. The GTF and FASTA files are the same ones I used to run IsoQuant. Does anyone have a clue what might be happening?

Thanks a lot!!

> transcript.SwitchList <- importRdata(
+     isoformCountMatrix = transcripts$transcript_counts,
+     isoformRepExpression = transcripts$transcript.tpm,
+     designMatrix = design.table,
+     isoformExonAnnoation = "data/9-22/genome_annotation/gencode.v43.annotation.gtf",
+     isoformNtFasta = "data/9-22/genome_annotation/GRCh38.primary_assembly.genome.fa",
+     showProgress = FALSE,
+     ignoreAfterBar = TRUE,
+     ignoreAfterSpace = TRUE,
+     removeNonConvensionalChr = TRUE
+ )
Step 1 of 10: Checking data...
Step 2 of 10: Obtaining annotation...
    importing GTF (this may take a while)...
Error in importRdata(isoformCountMatrix = transcripts$transcript_counts, :
  The annotation and quantification (count/abundance matrix and isoform annotation) seems to be different (Jaccard similarity < 0.925).
  Either isforoms found in the annotation are not quantifed or vise versa.
  Specifically:
    79812 isoforms were quantified.
    159957 isoforms are annotated.
    Only 79812 overlap.
    0 isoforms quantifed had no corresponding annoation

  This combination cannot be analyzed since it will cause discrepencies between quantification and annotation thereby skewing all analysis.

  If there is no overlap (as in zero or close) there are two options:
    1) The files do not fit together (e.g. different databases, versions, etc) (no fix except using propperly paired files).
    2) It is somthing to do with how the isoform ids are stored in the different files. This problem might be solvable using some of the 'ignoreAfterBar', 'ignoreAfterSpace' or 'ignoreAfterPeriod' arguments.
       Examples from expression matrix are : ENST00000570076.5, ENST00000529196.5, ENST00000569561.5
       Examples of annoation are : ENST00000514211.1, ENST00000439236.6, ENST00000515835.2
       Examples of isoforms which were only found im the quantification are :

  If there is a large overlap but still far from complete there are 3 possibilites:
    1) The files do not fit together (e.g different databases versions etc.) (no fix except using propperly paired files).
    2) If you are using Ensembl data you have supplied the GTF without phaplotyps. You need to supply the <Ensembl_version>.chr_patch_hapl_scaff.gtf file - NOT the <Ensembl_version>.chr.gtf
    3) One file could contain non-chanonical chromosomes while the other do not (might be solved using the 'removeNonConvensionalChr' argument.)
    4) It is somthing to do with how a subset of the isoform ids are stored in the different files. This problem might be solvable using some of the 'ignoreAfterBar', 'ignoreAfterSpace' or 'ignoreAfterPeriod' arguments.

  For more info see the FAQ in the vignette.
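
For reference, the counts in the message above can be plugged into the usual Jaccard definition (size of the intersection divided by size of the union) to see why the 0.925 threshold is missed. This is only a sketch in base R: it assumes the package uses the standard Jaccard definition, and the `jaccard()` helper and the toy IDs (`tx1` etc.) are made up for illustration and are not part of IsoformSwitchAnalyzeR.

```r
# Jaccard similarity between two sets of isoform IDs:
# size of the intersection divided by size of the union.
# (Hypothetical helper, not a function from IsoformSwitchAnalyzeR.)
jaccard <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  length(intersect(a, b)) / length(union(a, b))
}

# Counts reported in the error message above
n_quantified <- 79812
n_annotated  <- 159957
n_overlap    <- 79812

# Same quantity computed from counts: |union| = |A| + |B| - |intersection|
jaccard_reported <- n_overlap / (n_quantified + n_annotated - n_overlap)
round(jaccard_reported, 3)  # 0.499, well below the 0.925 threshold

# Toy illustration with made-up IDs
quant_ids <- c("tx1", "tx2")
anno_ids  <- c("tx1", "tx2", "tx3", "tx4")
jaccard(quant_ids, anno_ids)  # 0.5
setdiff(anno_ids, quant_ids)  # IDs that are annotated but not quantified
```

With these numbers every quantified isoform is present in the annotation, but roughly half of the annotated isoforms have no row in the quantification, which would be enough to pull the similarity down to about 0.5.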

Eric Katagirya

Dec 19, 2023, 3:38:19 AM
to IsoformSwitchAnalyzeR

Were you able to solve this? I have the same error with my analysis. I even repeated the Salmon analysis but still got the same error.
Thanks.