STARsolo 2.7.9a segmentation fault (single-end reads, Smartseq)


Raquel F.

Dec 18, 2021, 10:28:58 PM
to rna-star
Hello,
I'm getting a segmentation fault (core dumped) when running STARsolo on Smart-seq single-end reads. The problem only appears when I use a bash for loop that writes the alignments from each FASTQ file to a separate BAM file. Aligning the same FASTQ files with --readFilesManifest works, but splitting the resulting huge BAM file into per-sample BAM files takes too long, and I need the reads of each sample in a separate BAM for my downstream analysis.

This is the script that was successful:

#!/bin/bash
# qsub star.sh
#$ -cwd
#$ -N STAR
#$ -pe smp 30
module load star/2.7.9a
STAR --readFilesCommand zcat \
     --genomeDir /storage/mnv/indexes/STAR_genome \
     --soloType SmartSeq \
     --readFilesManifest /storage/mnv/E-MTAB-3929_Petropoulos_2016/manifest.txt \
     --soloUMIdedup Exact \
     --soloStrand Unstranded \
     --limitOutSJcollapsed 1000000000 \
     --runThreadN 20 \
     --outSAMtype BAM Unsorted \
     --outFileNamePrefix Petropoulos

However, when I run the alignment in a for loop, STAR crashes with a segmentation fault right after creating the threads:

#!/bin/bash
# qsub star_forloop.sh accessions_split.txt names.txt
#$ -cwd
#$ -N STAR
#$ -pe smp 30
#$ -t 1-2:1

module load star/2.7.9a

ACCESSIONS=$1
echo $ACCESSIONS
acc=$(awk -v i=$SGE_TASK_ID 'BEGIN{FS="\t"}NR==i{print $1}' $ACCESSIONS)

SAMPLES=$2
echo $SAMPLES
sample=$(awk -v i=$SGE_TASK_ID 'BEGIN{FS="\t"}NR==i{print $1}' $SAMPLES)

for line in $acc; do
    STAR --soloType SmartSeq \
         --soloUMIdedup Exact \
         --soloStrand Unstranded \
         --limitOutSJcollapsed 1000000000 \
         --runThreadN 8 \
         --outSAMtype BAM Unsorted \
         --genomeDir /storage/mnv/indexes/STAR_genome \
         --readFilesIn ${line}.fastq \
         --outFileNamePrefix $sample
done

I attach the accessions_split.txt file, the names.txt file, the Log.out file, and the stdout file from the scheduler.
I tried several different values for the number of threads and the number of assigned cores, and I still get a segmentation fault. I also tried passing a dash as R2 in the --readFilesIn argument, but that did not work either. Both compressed (fastq.gz) and uncompressed FASTQ files throw the same error, and I suspect the FASTQ files themselves are fine because the --readFilesManifest run worked well.
The server has 112 cores and 1008 GB of RAM, and I have unlimited access to these resources.
Thanks in advance for your time and help!
Raquel
accessions_split.txt
names.txt
E3.1.443Log.out.txt
STAR.e7201.1_stdout.txt

Alexander Dobin

Dec 21, 2021, 1:57:47 PM
to rna-star
Hi Raquel,

Please create a one-line manifest.txt file for each of the FASTQ files and pass it with --readFilesManifest. STAR needs the manifest to get the "barcode" for each cell (3rd column).

Cheers
Alex
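
A minimal sketch of the one-line-manifest suggestion above, not taken from the thread. It assumes the single-end manifest layout described in the STARsolo documentation (FASTQ path, a dash in place of read 2, then the cell ID, tab-separated) and uncompressed FASTQ input. The function names, the .manifest.txt suffix, and the underscore in the output prefix are hypothetical choices for illustration.

```shell
#!/bin/bash
# Sketch only: one-line manifest per FASTQ file, then one STARsolo run per file.
# Hypothetical names: write_manifest, align_one, the .manifest.txt suffix.
set -eu

GENOME_DIR=/storage/mnv/indexes/STAR_genome   # path taken from the thread

# Write a single-end manifest line: <FASTQ path> <TAB> - <TAB> <cell ID>.
# The cell ID must not contain spaces.
write_manifest() {
    local fastq=$1 cell_id=$2 manifest=$3
    printf '%s\t-\t%s\n' "$fastq" "$cell_id" > "$manifest"
}

# Align one FASTQ file into its own BAM, using its own one-line manifest.
align_one() {
    local fastq=$1 cell_id=$2
    local manifest="${cell_id}.manifest.txt"
    write_manifest "$fastq" "$cell_id" "$manifest"
    STAR --genomeDir "$GENOME_DIR" \
         --soloType SmartSeq \
         --readFilesManifest "$manifest" \
         --soloUMIdedup Exact \
         --soloStrand Unstranded \
         --runThreadN 8 \
         --outSAMtype BAM Unsorted \
         --outFileNamePrefix "${cell_id}_"
}

# Run only when called with two arguments: <FASTQ file> <cell ID>.
if [ "$#" -eq 2 ]; then
    align_one "$1" "$2"
fi
```

Inside the array job from the original post, the loop body would presumably become a call such as align_one "${line}.fastq" "$sample", so that each task still writes its own BAM.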

Raquel F.

Dec 27, 2021, 2:20:39 PM
to rna-star
Hi Alex,
I finally worked around the problem by using the --outSAMattrRGline argument, following your response in this thread:
https://github.com/alexdobin/STAR/issues/1107#issuecomment-792323526
In any case, thank you so much for your help and for developing this great tool.
Raquel
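
A rough sketch of what the --outSAMattrRGline workaround might look like for one file; the exact command used is not shown in the thread. It assumes that the read-group ID stands in for the cell ID that STAR would otherwise read from the manifest's third column. The function names and the underscore in the output prefix are hypothetical.

```shell
#!/bin/bash
# Sketch only: per-file STARsolo run with the cell ID supplied as a read group.
# Assumption: the RG ID replaces the cell ID normally taken from the manifest.
set -eu

GENOME_DIR=/storage/mnv/indexes/STAR_genome   # path taken from the thread

# Build the read-group value; STAR expects it to start with "ID:".
rg_line() {
    printf 'ID:%s' "$1"
}

# Align one FASTQ file into its own BAM, tagging reads with the cell's RG.
align_with_rg() {
    local fastq=$1 cell_id=$2
    STAR --genomeDir "$GENOME_DIR" \
         --soloType SmartSeq \
         --readFilesIn "$fastq" \
         --outSAMattrRGline "$(rg_line "$cell_id")" \
         --soloUMIdedup Exact \
         --soloStrand Unstranded \
         --runThreadN 8 \
         --outSAMtype BAM Unsorted \
         --outFileNamePrefix "${cell_id}_"
}

# Run only when called with two arguments: <FASTQ file> <cell ID>.
if [ "$#" -eq 2 ]; then
    align_with_rg "$1" "$2"
fi
```

Compared with a one-line manifest per file, this avoids writing extra files, at the cost of keeping the cell ID free of spaces so it survives as a single read-group field.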

Alexander Dobin

Dec 27, 2021, 3:44:49 PM
to rna-star
Hi Raquel,

I forgot about this workaround... :)

Thanks!
Alex
