Hello all,
I have done all the troubleshooting that I can think of, but can't seem to get STAR going. Genome indexing with a GTF seems to go as expected, and all necessary files are created. The process breaks down when I begin alignment of paired-end data.
I am running this on an HPC cluster on my campus, where I have 100 threads available to me, each with 8 GB of RAM. The cluster uses OpenMP to parallelize, but I haven't been able to figure that out yet and so haven't used it (I'm still learning a lot about this system and how to use HPC, and I would love it if anyone could also explain how to parallelize these commands). Anyway, I submit a PBS script asking for 1 compute node and 8 processors, the same parameters that I used for indexing, which worked perfectly. Here's the index command, for reference, as well as the first few lines of the GTF file:
STAR --runThreadN 8 --runMode genomeGenerate --genomeDir ... --genomeFastaFiles ... --sjdbGTFfile ... --sjdbOverhang 75
[user]$ head -5 smansoni_annotations.gtf
Smp.Chr_1 WormBase_imported exon 11159 11220 . + . transcript_id "transcript:Smp_186980.1"; gene_id "gene:Smp_186980"; gene_name "Smp_186980";
Smp.Chr_1 WormBase_imported exon 12411 12750 . + . transcript_id "transcript:Smp_186980.1"; gene_id "gene:Smp_186980"; gene_name "Smp_186980";
Smp.Chr_1 WormBase_imported CDS 11159 11220 . + 0 transcript_id "transcript:Smp_186980.1"; gene_id "gene:Smp_186980"; gene_name "Smp_186980";
Smp.Chr_1 WormBase_imported CDS 12411 12750 . + 1 transcript_id "transcript:Smp_186980.1"; gene_id "gene:Smp_186980"; gene_name "Smp_186980";
Smp.Chr_1 WormBase_imported exon 16927 17082 . + . transcript_id "transcript:Smp_197050.1"; gene_id
However, when I use the command below to align, STAR consistently hangs after it creates the second-to-last thread. That is, it loads the genome correctly and begins creating the threads to be used for aligning, but it never creates the last thread (even if left overnight). My alignment command:
STAR --runThreadN 8 --genomeDir ... --readFilesIn ... ... --outSAMtype BAM SortedByCoordinate
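For context, here is a minimal sketch of the kind of PBS submission described above. It assumes Torque-style directives (`nodes=1:ppn=8`); other PBS variants use a different resource syntax. The job name, walltime, and all paths are hypothetical placeholders, since the real ones are elided above. `PBS_NUM_PPN` and `PBS_O_WORKDIR` are set by Torque inside a job; outside a job the script falls back to 8 threads and the current directory. As written it only prints the command (a dry run).

```shell
#!/bin/bash
#PBS -N star_align            # hypothetical job name
#PBS -l nodes=1:ppn=8         # 1 compute node, 8 processors (Torque syntax, assumed)
#PBS -l walltime=04:00:00     # hypothetical walltime

# Run from the directory the job was submitted from (current dir if not in a job).
cd "${PBS_O_WORKDIR:-.}"

# Match STAR's thread count to what the scheduler granted; fall back to 8.
THREADS="${PBS_NUM_PPN:-8}"

# Hypothetical placeholders: replace with the real index directory and FASTQ files.
GENOME_DIR="/path/to/genomeDir"
READ1="/path/to/reads_1.fastq"
READ2="/path/to/reads_2.fastq"

# Build the alignment command as positional parameters so paths stay quoted.
set -- STAR --runThreadN "$THREADS" \
    --genomeDir "$GENOME_DIR" \
    --readFilesIn "$READ1" "$READ2" \
    --outSAMtype BAM SortedByCoordinate

# Dry run: print the command. Replace this line with "$@" to actually run STAR.
echo "$@"
```

Tying `--runThreadN` to the scheduler's value keeps the thread count from exceeding the processors actually allocated to the job.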
I've attached the Log.out. The Log.progress.out is empty (other than the column headings). It creates the Aligned.sortedByCoord.out.bam file and _STARtemp directory and subdirectories, but everything is empty.
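One way to confirm which output files are empty is sketched below. `check_star_outputs` is a hypothetical helper name, not part of STAR; it assumes the run directory contains the files STAR wrote (including `_STARtemp`) and simply flags every regular file as empty or non-empty.

```shell
# Hypothetical helper: list every regular file under a STAR run directory
# (including _STARtemp and its subdirectories) and flag the empty ones.
check_star_outputs() {
    run_dir="${1:-.}"
    find "$run_dir" -type f | sort | while IFS= read -r f; do
        if [ -s "$f" ]; then
            # -s is true when the file exists and has a size greater than zero
            printf 'non-empty\t%s\n' "$f"
        else
            printf 'EMPTY\t%s\n' "$f"
        fi
    done
}
```

Calling `check_star_outputs .` from the run directory prints one line per file, which makes it easy to paste the state of a hung run into a post like this one.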
It's possible that this is a question for my system admin, but I haven't gone that route because everything has been performed on a single compute node. Even if I run it on the HOME node and don't submit it to the PBS scheduler (using only 4 threads this time), the same thing happens - it hangs after creating 3 threads, so it doesn't seem to be a problem with the system. I am using the STAR_2.5.0 static executables.
Best,
Nic