Hi Alex,
I am trying to align wheat reads against the Triticum aestivum genome (version TGACv1) from Ensembl Plants (
http://plants.ensembl.org/Triticum_aestivum/Info/Index).
I tried STAR versions 2.4.2a and 2.5.2b; both hang at a certain stage for a few days and never complete (please see the Log.out:
https://drive.google.com/open?id=0B9WQGMiXLe7GeWh6MlJvSXVaT1k ). There is no error message. BTW, there was no problem at all when generating the index.
Version 2.4.2a worked very well when I aligned the reads to a smaller wheat genome reference; it took about 2-3 hours to complete one sample.
In the new study, I used the "masked genomic DNA" as the reference. The total length of the genome is 13427354022 bases, about 80% of which are N's, and there are a lot of scaffolds. I am wondering if the hang is caused by the genome size.
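In case it helps to double-check those numbers, here is a minimal sketch of how the scaffold count, total length, and N fraction could be computed from the reference FASTA. The function name `fasta_stats` and the toy sequences are made up for illustration; they are not part of STAR or Ensembl.

```python
import io


def fasta_stats(handle):
    """Return (n_sequences, total_length, n_count) for a FASTA text stream.

    Hypothetical helper: counts header lines, sequence characters,
    and N/n characters (masked or unknown bases).
    """
    n_sequences = 0
    total_length = 0
    n_count = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            n_sequences += 1  # each header starts a new scaffold
        else:
            total_length += len(line)
            n_count += line.count("N") + line.count("n")
    return n_sequences, total_length, n_count


# Toy input standing in for the real reference (made-up sequences).
example = io.StringIO(">scaffold_1\nACGTNNNN\nNNAC\n>scaffold_2\nNNNNNNNN\n")
n_seq, total, n_bases = fasta_stats(example)
print(n_seq, total, n_bases, round(100 * n_bases / total, 1))
```

For the real reference, the same function could be given an open file handle for the downloaded FASTA instead of the toy string. If the 80% figure is right, only about 2.7 billion of the 13427354022 bases would be non-N.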
The parameters used for alignment are the following:
STAR-2.5.2b/bin/Linux_x86_64_static/STAR
--runMode alignReads
--runThreadN 16
--limitBAMsortRAM 470000000000
--limitIObufferSize 500000000
--limitSjdbInsertNsj 5000000
--outReadsUnmapped Fastx
--outSAMtype BAM SortedByCoordinate
--outSAMmode Full
--outSAMstrandField intronMotif
--outFilterIntronMotifs RemoveNoncanonical
--chimSegmentMin 20
--quantMode TranscriptomeSAM GeneCounts
--outBAMsortingThreadN 0
--outSAMattributes All
--genomeDir ${ref_index}
--readFilesIn reads1.fa reads2.fa
--outFileNamePrefix myResults/
Could you please point out what the problem is and give me some suggestions in terms of parameters?
Thanks in advance!
Zoe
BTW, the Log.out file is too big to attach.