Hi Marc,
I'm not sure, but as I understand it, you need to use 'list files' only if you run the normalization separately using the "insilico_read_normalization.pl" script. If you run the normalization step as part of the pipeline, it is enough to join the file names with commas and pass them as arguments:
"If you have multiple sets of fastq files, such as corresponding to
multiple tissue types or conditions, etc., you can indicate them to
Trinity like so:
Trinity --seqType fq --max_memory 50G \
--left condA_1.fq.gz,condB_1.fq.gz,condC_1.fq.gz \
--right condA_2.fq.gz,condB_2.fq.gz,condC_2.fq.gz \
--CPU 6 "
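For what it's worth, the comma-separated lists do not have to be typed by hand. Below is a minimal sketch, assuming the file names follow the condX_1/condX_2 pattern from the quoted example and contain no spaces. The helper name join_by_comma is my own and is not part of Trinity, and the script only prints the command so it can be checked before running:

```shell
# Hypothetical file names, following the pattern in the quoted example.
left_files="condA_1.fq.gz condB_1.fq.gz condC_1.fq.gz"
right_files="condA_2.fq.gz condB_2.fq.gz condC_2.fq.gz"

# Join the arguments with commas (assumes no spaces inside file names).
join_by_comma() {
    echo "$*" | tr ' ' ','
}

left=$(join_by_comma $left_files)
right=$(join_by_comma $right_files)

# Print the command instead of running it, so the lists can be inspected first.
echo "Trinity --seqType fq --max_memory 50G --left $left --right $right --CPU 6"
```

The left and right lists have to stay in the same order so that each pair of files lines up.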
Nevertheless, I will try your suggestion. Maybe it will help.
Since Trinity can't run across multiple nodes, I'm limited to 1 node with 16 cores and 512G of memory. I think this will be enough if I split the job. The trimming (using --trimmomatic) worked quite well.
As you suggested, I have already tried running the normalization step for each pair of files individually, followed by a final normalization step across all normalized pairs. For this I used the options "--normalize_reads --normalize_max_read_cov 30 --normalize_by_read_set".
Best,
Mihaela