Hi Corset Authors,
I am struggling to run Corset and would appreciate your help.
I am working with a non-model frog, using 2 groups with 2 replicates each (4 RNA samples in total). The samples were sequenced at 2 x 150 bp, trimmed, assembled de novo with Trinity from the 4 paired FASTQ files, and evaluated with BUSCO, resulting in a reliable set of approximately 200,000 transcripts. Previously I used CD-HIT-EST to reduce redundancy, but now I would like to use Corset to get better clustering. To run Corset, I first ran salmon in the Mac terminal:
*Indexing the assembled transcriptome
salmon index -p 2 -t /Users/naoki/AF_1_2_liver/trinity_contig_AF_1_2/Trinity_AF_1_2.fasta -i Salmon_ref_index_AF_1_2
*Quantifying the reads of the 4 samples (run individually)
salmon quant -i Salmon_ref_index_AF_1_2 -l ISF --dumpEq -1 pair1.fastq.gz -2 pair2.fastq.gz -p 8 -o Salmon_quant_AF_1_2_output
Four eq_classes.txt.gz files (2.1-3 MB) were generated in the aux_info subdirectory of the main output directories, named "A1", "A2", "B1" and "B2". Then I ran Corset using the following command:
corset -g 1,1,2,2 -n A1,A2,B1,B2 -i salmon_eq_classes A1/aux_info/eq_classes.txt.gz A2/aux_info/eq_classes.txt.gz B1/aux_info/eq_classes.txt.gz B2/aux_info/eq_classes.txt.gz
Corset produced counts.txt and clusters.txt, but both files are empty (counts.txt contains only the header "A1 A2 B1 B2"). The Mac terminal showed the following:
Running Corset Version 1.09
Setting sample groups:1,1,2,2, 2 groups in total
Setting sample names to:A1,A2,B1,B2
Reading salmon eq_classes file : A1/aux_info/eq_classes.txt.gz
Reading data on 0 transcripts in 0 equivalence classes
0 reads counted, 0 reads filtered, 0 reads redistributed.
Reading salmon eq_classes file : A2/aux_info/eq_classes.txt.gz
Reading data on 0 transcripts in 0 equivalence classes
0 reads counted, 0 reads filtered, 0 reads redistributed.
Reading salmon eq_classes file : B1/aux_info/eq_classes.txt.gz
Reading data on 0 transcripts in 0 equivalence classes
0 reads counted, 0 reads filtered, 0 reads redistributed.
Reading salmon eq_classes file : B2/aux_info/eq_classes.txt.gz
Reading data on 0 transcripts in 0 equivalence classes
0 reads counted, 0 reads filtered, 0 reads redistributed.
Done reading all files.
Start to cluster the reads
Starting hierarchial clustering...
Finished
Were the eq_classes.txt.gz files perhaps not generated correctly? Or is something wrong with my Corset command or my directory setup?
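A quick way to narrow this down is to look at the header of each file. As far as the salmon documentation describes the `--dumpEq` output, the first line holds the number of transcripts and the second line the number of equivalence classes, so both should be non-zero if quantification worked. The sketch below is only an illustration: `eq_header` is a hypothetical helper name, and the paths assume the A1, A2, B1 and B2 directories from the Corset command above.

```shell
# Hypothetical helper: print the first two lines of a salmon eq_classes file.
# Assumed layout: line 1 = number of transcripts,
#                 line 2 = number of equivalence classes.
eq_header() {
  case "$1" in
    *.gz) gzip -dc "$1" | head -n 2 ;;   # decompress to stdout, keep the file
    *)    head -n 2 "$1" ;;              # already plain text
  esac
}

# Report the two header counts per sample; absent files are skipped.
for s in A1 A2 B1 B2; do
  f="$s/aux_info/eq_classes.txt.gz"
  if [ -f "$f" ]; then
    echo "$s: $(eq_header "$f" | tr '\n' ' ')"
  fi
done
```

If the counts are non-zero, the files themselves are probably fine. One possibility to rule out in that case (an assumption, not something the log confirms) is that Corset is parsing the compressed bytes directly. Decompressing a copy with `gzip -dc A1/aux_info/eq_classes.txt.gz > A1/aux_info/eq_classes.txt` and passing the plain-text files to Corset would test that.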
Thanks and Best Regards,
Naoki