Issue with processing multiple paired end samples using master files

John G

Oct 14, 2016, 2:59:12 PM

to CLARK Users

Hi,

I am attempting to start a job that takes advantage of the ability to input multiple paired end files into CLARK using the master files, samples.L.txt and samples.R.txt. I keep receiving the following error:

Loading database [//scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/dbDir/bacteria_viruses_0/db_central_k31_t5721_s1610612741_m0.tsk.*] ...

Loading done (database size: 34497 MB read, with sampling factor 1)

Error: paired-end reads must be FASTQ files!: No such file or directory

Here is the command I am running:

./classify_metagenome.sh -P /scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/samples.L.txt /scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/samples.R.txt -R /scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/resultsAll -m 0 -n 16

The contents of samples.L.txt and samples.R.txt are:

samples.L.txt

/scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/WMGS-001-V1-P_S1_Concat_R1_001.fastq

/scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/WMGS-001-V4-P_S9_Concat_R1_001.fastq

/scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/WMGS-002-V1-P_S2_Concat_R1_001.fastq

samples.R.txt

/scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/WMGS-001-V1-P_S1_Concat_R2_001.fastq

/scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/WMGS-001-V4-P_S9_Concat_R2_001.fastq

/scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/WMGS-002-V1-P_S2_Concat_R2_001.fastq

It ran fine on one sample, with the same parameters, when I did not use the master files. I also tried removing the full path and leaving just the file name. Same error.

Thanks for any help you might be able to provide.

John

Rachid

Oct 14, 2016, 3:35:08 PM

to CLARK Users

Hello John!

Thank you for your interest. I see you are passing multiple paired-end files at the same time.

Just as a reminder: if you decide to pass multiple files at the same time (paired-end reads or single-end reads), you also need to pass multiple result files.

I wonder if the issue you are experiencing comes from your file "/scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/resultsAll"

In your case, since you have 3 sets of paired-end reads, does the file "resultsAll" contain 3 lines as well, each line indicating where to store the results for the corresponding pair of read files?
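One quick way to check this is to compare the number of entries in the three master files. The following is only a minimal sketch: `count_entries` and `check_master_files` are hypothetical helper names, not part of CLARK, and the check assumes each master file lists one path per line.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CLARK): compare the number of
# non-empty lines in the left, right, and results master files.

count_entries() {
    # Count non-empty lines so trailing blank lines do not skew the result.
    grep -c . "$1"
}

check_master_files() {
    # $1 = left master file, $2 = right master file, $3 = results master file
    for f in "$1" "$2" "$3"; do
        if [ ! -f "$f" ]; then
            echo "missing or not a regular file: $f" >&2
            return 1
        fi
    done
    n_left=$(count_entries "$1")
    n_right=$(count_entries "$2")
    n_results=$(count_entries "$3")
    echo "left=$n_left right=$n_right results=$n_results"
    # Succeed only when all three counts agree.
    [ "$n_left" -eq "$n_right" ] && [ "$n_left" -eq "$n_results" ]
}

# Example call (paths shortened here; use the full paths from the command above):
#   check_master_files samples.L.txt samples.R.txt resultsAll
```

If the function reports a missing file or mismatched counts, the results master file is the first thing to fix.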

How about you run the following command line, and let us know what happens?

./classify_metagenome.sh -P /scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/samples.L.txt /scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/samples.R.txt -R /scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/samples.L.txt -m 0 -n 16

Cheers,

Rachid

On Friday, October 14, 2016 at 11:59:12 AM UTC-7, John G wrote:

CLARK version 1.2.3 (UCR CS&E. Copyright 2013-2016 Rachid Ounit, rouni001 at cs.ucr.edu)

John G

Oct 14, 2016, 4:42:29 PM

to CLARK Users

Hi Rachid,

I changed the results file so that it contains one entry per sample, and CLARK now seems to be loading the FASTQ files. I had assumed the results file was something the program created as output, but on re-reading the documentation after your response, I see that I overlooked the fact that it needs to be created up front. I will come back to the forum and mark this as complete if it finishes successfully.
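For anyone landing here with the same error, here is a rough sketch of how the results master file could be generated from samples.L.txt so that the two always have the same number of lines. `make_results_list` is a hypothetical helper, not part of CLARK. The `_R1_001.fastq` suffix is taken from the file names above, and the output directory is an assumption to adapt.

```shell
#!/bin/sh
# Hypothetical helper (not part of CLARK): write one result path per
# non-empty line of the left master file, in the same order.

make_results_list() {
    # $1 = left master file, $2 = output directory, $3 = results master file to create
    : > "$3"                                  # start with an empty results list
    while IFS= read -r fastq || [ -n "$fastq" ]; do
        [ -n "$fastq" ] || continue           # skip blank lines
        sample=$(basename "$fastq" .fastq)    # drop directory and .fastq
        sample=${sample%_R1_001}              # drop the read-1 suffix, if present
        printf '%s/%s\n' "$2" "$sample" >> "$3"
    done < "$1"
}

# Example call (the output directory name is an assumption):
#   make_results_list samples.L.txt /scratch/jdg/WholeGenomeSequenceAnalysis/CLARK/results resultsAll
```

The resulting file can then be passed to -R in place of a single results name.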

John
