Error Running CellRanger Terra Workflow

Alexander Young

Jul 12, 2021, 11:35:37 AM
to cumulus...@googlegroups.com
Hello,
I have FASTQ files that I'm trying to run cumulus/cellranger_workflow's cellranger count on. I've set up the directory structure as follows: gs://fc-e0000000-0000-0000-0000-000000000000/donor_1_fastq has 6 FASTQs: donor_1_S1_L001_l1_001, donor_1_S1_L001_R1_001, donor_1_S1_L001_R2_001, donor_1_S1_L002_l1_001, donor_1_S1_L002_R1_001, and donor_1_S1_L002_R2_001. I believe this means that the donor's data is split between two samples(?), [donor_1_S1_L001_l1_001, donor_1_S1_L001_R1_001, donor_1_S1_L001_R2_001] and [donor_1_S1_L002_l1_001, donor_1_S1_L002_R1_001, donor_1_S1_L002_R2_001], so I have placed each of these sets into its own subfolder, donor_1_S1_L001 and donor_1_S1_L002, respectively. The sample sheet, located alongside donor_1_fastq, then looks something like this:

Sample,Reference,Flowcell,DataType
donor_1_S1_L001,GRCh38-2020-A,gs://fc-e0000000-0000-0000-0000-000000000000/donor_1_fastq,rna
donor_1_S1_L002,GRCh38-2020-A,gs://fc-e0000000-0000-0000-0000-000000000000/donor_1_fastq,rna

I believe I have set up the directory structure, named the files, and written the sample sheet as per the docs, yet I'm encountering the failure "Job cellranger_count.run_cellranger_count:NA:1 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details." for each of the two samples. (Not sure where to find that attribute.)
I was hoping you could help me figure out what I'm setting up wrong. Maybe it's messing up because it's a single donor that for some reason is split in two, or something? Thanks,
Alexander Young


Yiming Yang

Jul 12, 2021, 11:53:18 AM
to Alexander Young, Cumulus Support
Hello Alexander,

Your case is actually still one sample, just sequenced across two lanes, so there is no need to split the files into two subfolders.
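To illustrate the point about lanes, here is a minimal Python sketch (not part of Cumulus or Cell Ranger) that groups FASTQ file names by sample and lane. It assumes the files follow the bcl2fastq-style pattern `<sample>_S<n>_L<lane>_<read>_001.fastq[.gz]`, that index reads are named `I1`, and that the files end in `.fastq.gz`; the helper name `group_by_sample` is hypothetical.

```python
import re
from collections import defaultdict

# Assumed bcl2fastq-style name: <sample>_S<n>_L<lane>_<read>_001.fastq[.gz]
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_(?P<read>[RI]\d)_001\.fastq(?:\.gz)?$"
)


def group_by_sample(file_names):
    """Map each sample name to the sorted lane numbers found for it."""
    lanes = defaultdict(set)
    for name in file_names:
        match = FASTQ_NAME.match(name)
        if match is None:
            raise ValueError(f"not a bcl2fastq-style FASTQ name: {name}")
        lanes[match["sample"]].add(int(match["lane"]))
    return {sample: sorted(found) for sample, found in lanes.items()}


# Hypothetical file list modeled on the names in this thread.
files = [
    "donor_1_S1_L001_I1_001.fastq.gz",
    "donor_1_S1_L001_R1_001.fastq.gz",
    "donor_1_S1_L001_R2_001.fastq.gz",
    "donor_1_S1_L002_I1_001.fastq.gz",
    "donor_1_S1_L002_R1_001.fastq.gz",
    "donor_1_S1_L002_R2_001.fastq.gz",
]

print(group_by_sample(files))  # {'donor_1': [1, 2]}
```

All six files resolve to the single sample `donor_1` with lanes 1 and 2, which is why they can stay together in one folder.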

Moreover, since you are running the cellranger workflow with only the count step, you should be careful about the folder name. We follow the 10x naming convention, so the folder name must match the prefix of the FASTQ file names in that folder. See the following page in our documentation for details:


Otherwise, if you still have errors, please send us your error message (usually in stderr, stdout, or .log files in the execution folder) so that we can better understand your situation.


Sincerely,
Yiming


Yiming Yang

Jul 12, 2021, 11:49:03 PM
to Alexander Young, Cumulus Support
If you are running only cellranger count, you don't need to include the "Lane" or "DataType" columns, as those are for the mkfastq step.

The documentation states 2 requirements: (1) the subfolder name must be the same as the sample name in your sample sheet, and (2) the FASTQ file names must follow the bcl2fastq file naming convention (see https://kb.10xgenomics.com/hc/en-us/articles/115003802691-How-do-I-prepare-Sequence-Read-Archive-SRA-data-from-NCBI-for-Cell-Ranger-).

Because of this file naming convention, you cannot use "donor_1_S1_L001" as the sample name; it has to be "donor_1". Otherwise, you would need to rename the files to "donor_1_S1_L001_S1_L001_R1_001.fastq", etc., so that cellranger count recognizes "donor_1_S1_L001" as your sample name.
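The two requirements can be checked before launching the workflow. The following is a rough Python sketch under stated assumptions, not something Cumulus provides: it assumes file names end in `_S<n>_L<lane>_<read>_001.fastq[.gz]`, and the helpers `sample_prefix`, `mismatched_files`, and `count_sample_sheet` are hypothetical names introduced here. The sample sheet it builds uses only the Sample, Reference, and Flowcell columns, following the note above that Lane and DataType are not needed for count.

```python
import csv
import io
import re

# Assumed trailing part of a bcl2fastq-style name:
# _S<n>_L<lane>_<read>_001.fastq[.gz]
SUFFIX = re.compile(r"_S\d+_L\d{3}_[RI]\d_001\.fastq(?:\.gz)?$")


def sample_prefix(file_name):
    """Return the part of the file name that precedes the bcl2fastq suffix."""
    match = SUFFIX.search(file_name)
    if match is None:
        raise ValueError(f"not a bcl2fastq-style FASTQ name: {file_name}")
    return file_name[: match.start()]


def mismatched_files(folder_name, file_names):
    """List the files whose prefix differs from the folder (sample) name."""
    return [name for name in file_names if sample_prefix(name) != folder_name]


def count_sample_sheet(rows):
    """Build count-only sample sheet text (no Lane or DataType columns)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Sample", "Reference", "Flowcell"])
    writer.writerows(rows)
    return buffer.getvalue()


original = ["donor_1_S1_L001_R1_001.fastq", "donor_1_S1_L002_R1_001.fastq"]
renamed = ["donor_1_S1_L001_S1_L001_R1_001.fastq"]

print(mismatched_files("donor_1", original))          # [] -> names are consistent
print(mismatched_files("donor_1_S1_L001", original))  # both files are reported
print(mismatched_files("donor_1_S1_L001", renamed))   # [] -> renaming also works

flowcell = "gs://fc-e0000000-0000-0000-0000-000000000000/donor_1_fastq"
print(count_sample_sheet([["donor_1", "GRCh38-2020-A", flowcell]]))
```

With the original file names, only the folder and sample name "donor_1" passes the check; "donor_1_S1_L001" passes only after the files are renamed as described.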


Sincerely,
Yiming

On Mon, Jul 12, 2021 at 12:24 PM Alexander Young <aly...@broadinstitute.org> wrote:
Hi Yiming,
Thanks for your help. Quick question: so when we have two lanes, would the sample sheet then be

Sample,Reference,Flowcell,Lane,DataType
donor_1,GRCh38-2020-A,gs://fc-e0000000-0000-0000-0000-000000000000/donor_1_fastq,1-2,rna

? And re: folder name, the doc says "the subfolder name and the sample name in your sample sheet MUST be the same", I don't see anything about the folder name and fastq prefixes? Thanks,
Alexander