Hi Alex,
Do you know what could make STAR run abnormally slowly? After almost two days, STAR is still aligning a single paired-end sample (30 million reads per sample, 100 bp) to a 17 Gb reference genome. I used --runThreadN 24 and --limitBAMsortRAM 16000000000. These are the output files so far:
-rw-r--r-- 1 0 Feb 5 18:19 Aligned.sortedByCoord.out.bam
-rw-r--r-- 1 4904174 Feb 5 19:06 Aligned.toTranscriptome.out.bam
-rwxr-xr-- 1 255152721 Feb 6 07:59 Log.out
-rw-r--r-- 1 5310 Feb 7 07:31 Log.progress.out
Any suggestions, please?
Thanks,
ERA
Hi Alex,
Is it possible to index a genome without an annotation.gtf file? I lost the chromosome names corresponding to each sequence in the super-contig reference genome, so I cannot index it with the annotation.gtf file.
Error message:
…
Feb 14 06:57:59 ... finished generating suffix array
Feb 14 06:57:59 ... generating Suffix Array index
Feb 14 07:04:24 ... completed Suffix Array index
Feb 14 07:04:24 ..... processing annotations GTF
Fatal INPUT FILE error, no valid exon lines in the GTF file: /work/annotation.gtf
Solution: check the formatting of the GTF file. Most likely cause is the difference in chromosome naming between GTF and FASTA file.
The super-contig is composed of all the short sequences (<1 kb), separated from one another by a run of 30 Ns.
Cheers,
ERA
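The super-contig layout described above (short sequences joined by runs of 30 Ns) can be illustrated with a minimal Python sketch. It assumes the original contig names and sequences are still available in order; the function name `build_supercontig` and the toy contigs are hypothetical, not part of STAR or of the original data. Recording each contig's start offset while joining is what makes a later coordinate conversion possible.

```python
def build_supercontig(contigs, spacer_len=30):
    """Join (name, sequence) pairs with runs of N.

    Returns the joined sequence and a dict mapping each contig name
    to its 0-based start offset within the joined sequence.
    """
    spacer = "N" * spacer_len
    offsets = {}
    pieces = []
    position = 0
    for i, (name, seq) in enumerate(contigs):
        if i > 0:
            # A spacer goes between contigs, never before the first one.
            pieces.append(spacer)
            position += spacer_len
        offsets[name] = position
        pieces.append(seq)
        position += len(seq)
    return "".join(pieces), offsets


# Toy input (hypothetical names and sequences, for illustration only).
contigs = [("contig_1", "ACGTACGT"), ("contig_2", "GGCC"), ("contig_3", "TTTAAA")]
supercontig, offsets = build_supercontig(contigs)
for name, start in offsets.items():
    print(name, start)
```

If the offsets were not saved when the super-contig was built, they can only be recomputed from the original contig FASTA in the same order, which is why the sketch returns them alongside the sequence.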
Hi Alex,
I’d like to use the second option, but I am not sure I understand the order of the steps. Do you mean that I have to: (1) concatenate the short contigs into one super-contig; (2) extract from the GTF file the annotations for the contigs that make up the super-contig; (3) convert the coordinates of those annotations into super-contig coordinates (I do not know the coordinates of the contigs within the super-contig); (4) index the super-contig; and (5) map the samples against the STAR-indexed super-contig? Sorry if I sound like a novice; I am not yet very familiar with high-throughput sequencing data. I’d really appreciate it if you could also suggest an R package (or a simple script) for this coordinate conversion.
Thanks,
Era
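The coordinate conversion asked about in step 3 can be sketched in Python rather than R. This is only an illustration under stated assumptions: `lift_gtf_line` is a hypothetical helper, the `offsets` dict (contig name to 0-based start within the super-contig) is assumed to have been recorded when the contigs were concatenated, and `super_name` must match the sequence name in the super-contig FASTA header, since STAR reports mismatched names between GTF and FASTA as the likely cause of the "no valid exon lines" error.

```python
def lift_gtf_line(line, offsets, super_name="supercontig"):
    """Shift one GTF record from contig to super-contig coordinates.

    Returns the rewritten line, or None when the record is malformed
    or sits on a contig that is not part of the super-contig.
    """
    line = line.rstrip("\n")
    if line.startswith("#"):
        return line  # comment and header lines pass through unchanged
    fields = line.split("\t")
    if len(fields) != 9 or fields[0] not in offsets:
        return None
    shift = offsets[fields[0]]
    # GTF start/end are 1-based and inclusive, so adding the contig's
    # 0-based offset gives 1-based positions in the super-contig.
    fields[3] = str(int(fields[3]) + shift)
    fields[4] = str(int(fields[4]) + shift)
    fields[0] = super_name
    return "\t".join(fields)


# Hypothetical offsets and record, for illustration only.
offsets = {"contig_1": 0, "contig_2": 38}
record = 'contig_2\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1"; transcript_id "t1";'
lifted = lift_gtf_line(record, offsets)
skipped = lift_gtf_line(record.replace("contig_2", "contig_9"), offsets)
print(lifted)
```

Applying the function to every line of the GTF and dropping the `None` results covers steps 2 and 3 together: records on contigs outside the super-contig are filtered out, and the rest are shifted.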