I'm very excited by the --samples_file option in the newest version of Trinity, but I'm running into problems getting Trinity to read or use any entry after the top one. My samples.txt file (tab-delimited) lists 4 conditions: 3 have 3 biological replicates each and 1 has only 2:
Feed Feed_rep1 CR-ZYW-0_R1.fastq.gz CR-ZYW-0_R2.fastq.gz
Feed Feed_rep2 CR-ZYW-A_S3_L007_R1_001.fastq.gz CR-ZYW-A_S3_L007_R2_001.fastq.gz
Feed Feed_rep3 CR-ZYW-B_S4_L007_R1_001.fastq.gz CR-ZYW-2_S4_L007_R2_001.fastq.gz
Fast Fast_rep1 CR-ZYW-C_S5_L007_R1_001.fastq.gz CR-ZYW-C_S5_L007_R2_001.fastq.gz
Fast Fast_rep2 CWR-ZYW-E_S1_L002_R1_001.fastq.gz CWR-ZYW-E_S1_L002_R2_001.fastq.gz
Fast Fast_rep3 CWR-ZYW-F_S2_L002_R1_001.fastq.gz CWR-ZYW-F_S2_L002_R2_001.fastq.gz
Dec Dec_rep1 CR-ZYW-D_S6_L007_R1_001.fastq.gz CR-ZYW-D_S6_L007_R2_001.fastq.gz
Dec Dec_rep2 CWR-ZYW-G_S3_L002_R1_001.fastq.gz CWR-ZYW-G_S3_L002_R2_001.fastq.gz
Juv Juv_rep1 CWR-ZYW-H_S4_L002_R1_001.fastq.gz CWR-ZYW-H_S4_L002_R2_001.fastq.gz
Juv Juv_rep2 CWR-ZYW-I_S5_L002_R1_001.fastq.gz CWR-ZYW-I_S5_L002_R2_001.fastq.gz
Juv Juv_rep3 CWR-ZYW-J_S6_L002_R1_001.fastq.gz CWR-ZYW-J_S6_L002_R2_001.fastq.gz
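Since the listing above shows up with spaces once pasted, here is a minimal Python sketch for confirming that every line of a samples file really has four tab-separated fields (condition, replicate, R1, R2). The function name `parse_samples` is made up for illustration and is not part of Trinity, and the check only assumes the four-column layout shown above; it says nothing about how Trinity itself parses the file.

```python
import io

def parse_samples(handle):
    """Split each non-blank line on tabs; return (rows, problems).

    rows     -- list of (condition, replicate, left_fq, right_fq) tuples
    problems -- list of (line_number, field_count) for lines that do not
                have exactly four tab-separated fields
    """
    rows, problems = [], []
    for lineno, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 4:
            problems.append((lineno, len(fields)))
            continue
        rows.append(tuple(field.strip() for field in fields))
    return rows, problems

# Demo: the first line is tab-delimited, the second uses spaces by mistake.
demo = (
    "Feed\tFeed_rep1\tCR-ZYW-0_R1.fastq.gz\tCR-ZYW-0_R2.fastq.gz\n"
    "Fast Fast_rep1 CR-ZYW-C_S5_L007_R1_001.fastq.gz CR-ZYW-C_S5_L007_R2_001.fastq.gz\n"
)
rows, problems = parse_samples(io.StringIO(demo))
print(len(rows), problems)  # prints: 1 [(2, 1)]
```

To check a real file, pass an open handle instead, e.g. `parse_samples(open("samples.txt"))`; an empty `problems` list means every line split into four fields.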
I'm running Trinity v2.3.2 with --trimmomatic and using the --samples_file option. The issue first appears with Trimmomatic: Trimmomatic only trims the top entry, then Trinity proceeds with in silico normalization:
----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------
----------------------------------------------------------------------------------
---------------------------------------------------------------
------ Quality Trimming Via Trimmomatic ---------------------
<< ILLUMINACLIP:/apps/software/trinityrnaseq/2.3.2/trinity-plugins/Trimmomatic/adapters/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25 >>
---------------------------------------------------------------
Friday, December 2, 2016: 16:36:09 CMD: java -jar /apps/software/trinityrnaseq/2.3.2/trinity-plugins/Trimmomatic/trimmomatic.jar PE -threads 8 -phred33 /group/rags-lab/Mom_Sequencing/CR-ZYW-0_R1.fastq.gz /group/rags-lab/Mom_Sequencing/CR-ZYW-0_R2.fastq.gz CR-ZYW-0_R1.fastq.gz.P.qtrim CR-ZYW-0_R1.fastq.gz.U.qtrim CR-ZYW-0_R2.fastq.gz.P.qtrim CR-ZYW-0_R2.fastq.gz.U.qtrim ILLUMINACLIP:/apps/software/trinityrnaseq/2.3.2/trinity-plugins/Trimmomatic/adapters/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25
TrimmomaticPE: Started with arguments: -threads 8 -phred33 /group/rags-lab/Mom_Sequencing/CR-ZYW-0_R1.fastq.gz /group/rags-lab/Mom_Sequencing/CR-ZYW-0_R2.fastq.gz CR-ZYW-0_R1.fastq.gz.P.qtrim CR-ZYW-0_R1.fastq.gz.U.qtrim CR-ZYW-0_R2.fastq.gz.P.qtrim CR-ZYW-0_R2.fastq.gz.U.qtrim ILLUMINACLIP:/apps/software/trinityrnaseq/2.3.2/trinity-plugins/Trimmomatic/adapters/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 129462733 Both Surviving: 127080559 (98.16%) Forward Only Surviving: 2381584 (1.84%) Reverse Only Surviving: 0 (0.00%) Dropped: 590 (0.00%)
TrimmomaticPE: Completed successfully
Friday, December 2, 2016: 16:53:14 CMD: cp CR-ZYW-0_R1.fastq.gz.P.qtrim CR-ZYW-0_R1.fastq.gz.PwU.qtrim.fq
Friday, December 2, 2016: 16:57:09 CMD: cp CR-ZYW-0_R2.fastq.gz.P.qtrim CR-ZYW-0_R2.fastq.gz.PwU.qtrim.fq
Friday, December 2, 2016: 17:01:47 CMD: touch trimmomatic.ok
Friday, December 2, 2016: 17:01:47 CMD: gzip CR-ZYW-0_R1.fastq.gz.P.qtrim CR-ZYW-0_R1.fastq.gz.U.qtrim CR-ZYW-0_R2.fastq.gz.P.qtrim CR-ZYW-0_R2.fastq.gz.U.qtrim &
---------------------------------------------------------------
------------ In silico Read Normalization ---------------------
-- (Removing Excess Reads Beyond 50 Coverage --
-- /scratch/zwang3/OG_replicate_trinity/insilico_read_normalization --
---------------------------------------------------------------
Friday, December 2, 2016: 17:01:47
CMD: /apps/software/trinityrnaseq/2.3.2/util/insilico_read_normalization.pl --seqType fq --JM 15G --max_cov 50 --CPU 8 --output /scratch/zwang3/OG_replicate_trinity/insilico_read_normalization --max_pct_stdev 10000 --SS_lib_type RF --left CR-ZYW-0_R1.fastq.gz.PwU.qtrim.fq --right CR-ZYW-0_R2.fastq.gz.PwU.qtrim.fq --pairs_together --PARALLEL_STATS
Converting input files. (both directions in parallel)CMD: /apps/software/trinityrnaseq/2.3.2/util/..//trinity-plugins/fastool/fastool --rev --illumina-trinity --to-fasta /scratch/zwang3/OG_replicate_trinity/CR-ZYW-0_R1.fastq.gz.PwU.qtrim.fq >> left.fa
CMD: /apps/software/trinityrnaseq/2.3.2/util/..//trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /scratch/zwang3/OG_replicate_trinity/CR-ZYW-0_R2.fastq.gz.PwU.qtrim.fq >> right.fa
Sequences parsed: 127080559
CMD finished (605 seconds)
Sequences parsed: 127080559
CMD finished (898 seconds)
CMD: touch left.fa.ok
CMD finished (0 seconds)
CMD: touch right.fa.ok
CMD finished (1 seconds)
Done converting input files.CMD: cat left.fa right.fa > both.fa
CMD finished (272 seconds)
CMD: touch both.fa.ok
(etc., etc. This run of Trinity completed without any errors logged, but the resulting assembly was built only from the files in the top entry.)
My files are in the working directory and the file names are correct. If I change the order in which the replicates are listed, the same thing happens: only the top entry gets trimmed and passed along for further analysis. Any insight into this? In the meantime I'm trimming each library individually and will try running Trinity again with the --samples_file option and without --trimmomatic.
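Another fallback would be to skip --samples_file and hand every library to Trinity through --left and --right, which accept comma-separated lists of files. A minimal Python sketch for building those lists, assuming the samples file is tab-delimited with the R1 and R2 paths in columns 3 and 4 (the function name `left_right_args` is hypothetical, not a Trinity utility):

```python
def left_right_args(samples_text):
    """Return (left, right) comma-separated file lists from samples-file text.

    Assumes four tab-separated columns per line:
    condition, replicate, R1 file, R2 file.
    """
    left, right = [], []
    for lineno, line in enumerate(samples_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 4:
            # Fail loudly rather than silently dropping a library.
            raise ValueError(
                f"line {lineno}: expected 4 tab-separated fields, got {len(fields)}"
            )
        _condition, _replicate, r1, r2 = fields
        left.append(r1.strip())
        right.append(r2.strip())
    return ",".join(left), ",".join(right)

# Demo with two of the entries from the samples file in this post.
demo = (
    "Feed\tFeed_rep1\tCR-ZYW-0_R1.fastq.gz\tCR-ZYW-0_R2.fastq.gz\n"
    "Dec\tDec_rep1\tCR-ZYW-D_S6_L007_R1_001.fastq.gz\tCR-ZYW-D_S6_L007_R2_001.fastq.gz\n"
)
left, right = left_right_args(demo)
print(f"--left {left} --right {right}")
```

The printed fragment can be pasted into the Trinity command in place of --samples_file. This loses the condition and replicate labels, so it only helps with getting all reads into the assembly, not with any downstream steps that use the samples file.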