Hi everyone,
I've been trying to run a HISAT2-StringTie analysis on my bacterial genome. I obtained my .gff file from Prokka and then converted it to .gtf format using gffread. However, when I try to run IsoformSwitchAnalyzeR (v2.0.1) to get gene names for all the unknown gene IDs (MSTRG) generated by StringTie, I end up with an error, and I can't work out where I'm going wrong, as I'm new to all of this. The errors appear to be caused by the GTF file format. I generated two GFF files from the same input data using different Prokka parameters, and I get a different error with each file. I would really appreciate some advice on how to address this problem. Thank you.
mega_1.gtf gives me this error:
switchAnalyzeRlist <- importRdata(
isoformCountMatrix = stringTieQuant$counts,
isoformRepExpression = stringTieQuant$abundance,
designMatrix = myDesign,
isoformExonAnnoation = "/plant-bacteria/B.mega/mega_1.gtf",
)
Step 1 of 3: Identifying which algorithm was used...
The quantification algorithm used was: StringTie
Found 6 quantification file(s) of interest
Step 2 of 3: Reading data...
reading in files with read_tsv
1 2 3 4 5 6
Step 3 of 3: Normalizing abundance values (not counts) via edgeR...
Done
Step 1 of 10: Checking data...
Step 2 of 10: Obtaining annotation...
importing GTF (this may take a while)...
Error in importGTF(pathToGTF = isoformExonAnnoation, addAnnotatedORFs = addAnnotatedORFs, :
The GTF file must contain the folliwing collumns 'transcript_id' and 'gene_id'. gene_id is missing.
mega_2.gtf gives me this error:
switchAnalyzeRlist <- importRdata(
isoformCountMatrix = stringTieQuant$counts,
isoformRepExpression = stringTieQuant$abundance,
designMatrix = myDesign,
isoformExonAnnoation = "plant-bacteria/mega_2.gtf",
)
Step 1 of 3: Identifying which algorithm was used...
The quantification algorithm used was: StringTie
Found 6 quantification file(s) of interest
Step 2 of 3: Reading data...
reading in files with read_tsv
1 2 3 4 5 6
Step 3 of 3: Normalizing abundance values (not counts) via edgeR...
Done
Step 1 of 10: Checking data...
Step 2 of 10: Obtaining annotation...
importing GTF (this may take a while)...
Error in `[[<-`(`*tmp*`, name, value = c("exon_1", "exon_0")) :
2 elements in value to replace 0 elements
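Since the first error says gene_id is missing, one way to narrow things down is to count which records in each GTF lack gene_id or transcript_id in the attribute column. The snippet below is only a minimal sketch in Python, not part of IsoformSwitchAnalyzeR or gffread. The function name `missing_ids` and the two sample records are made up for illustration, and it assumes the usual 9-column, tab-separated GTF layout with `key "value";` attributes.

```python
import re
from collections import Counter

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"\s*;?')

def missing_ids(lines, required=("gene_id", "transcript_id")):
    """Count records lacking a required attribute, keyed by (feature, attribute)."""
    missing = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        if len(fields) != 9:
            # Not a standard 9-column GTF record
            missing[("malformed", "columns")] += 1
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        for key in required:
            if key not in attrs:
                missing[(fields[2], key)] += 1
    return missing

# Hypothetical records for illustration only (not taken from mega_1.gtf)
sample = [
    'contig_1\tProkka\ttranscript\t1\t900\t.\t+\t.\ttranscript_id "ABC_00001"; gene_id "ABC_00001";',
    'contig_1\tProkka\texon\t1\t900\t.\t+\t.\ttranscript_id "ABC_00001";',
]
report = missing_ids(sample)
for (feature, key), n in sorted(report.items()):
    print(f"{feature}: {n} record(s) without {key}")
```

To run it on a real file, pass an open file handle instead of the sample list, for example `with open("mega_1.gtf") as fh: report = missing_ids(fh)`. The counts show whether gene_id is absent from every record or only from certain feature types.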