CLI Upload: FASTQ naming restrictions don't allow dual index reads (10x data)

Gordon Beattie

Mar 8, 2022, 9:42:01 AM

to basespace-developers

Hey,

I'm hoping to upload data from the 10x single-cell pipeline, so I have paired reads (R1 and R3) and dual index reads (R2 and I1). Unsurprisingly, given the strict naming restrictions, I have no way of uploading the I1 and R3 files. I've put my command and the first two errors below to demonstrate this (the command successfully uploads the R1 and R2 files). Any suggestions would be greatly appreciated!

All the best,
Gordon

bs dataset upload -p 340302969 -c europe --skip-invalid-filenames --allow-invalid-readnames --recursive .

WARNING: Invalid filename: 'Control-ATAC_S1_L001_I1_001.fastq.gz', rename to match the expected format (SampleName_S1_L001_R1_001.fastq.gz) or use '--skip-invalid-filenames' to ignore this file !

WARNING: Invalid read number: 3 in input Control_ATAC/Control-ATAC_S1_L001_R3_001.fastq.gz !
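
For reference, the naming rule implied by these two warnings can be sketched in Python. This is only a guess inferred from the warning text (the expected format `SampleName_S1_L001_R1_001.fastq.gz`, plus the rejection of read number 3); the pattern `FASTQ_NAME` and the helper `classify` are hypothetical names for illustration and are not part of the BaseSpace CLI, whose actual validation may differ.

```python
import re

# Assumed pattern, reconstructed only from the warning messages above.
# The CLI's real validation rules are not known and may differ.
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<snum>\d+)_L(?P<lane>\d{3})"
    r"_(?P<kind>[RI])(?P<read>\d)_(?P<chunk>\d{3})\.fastq\.gz$"
)


def classify(filename):
    """Guess how the uploader would treat a FASTQ filename (hypothetical helper)."""
    m = FASTQ_NAME.match(filename)
    if m is None:
        return "unrecognised"
    if m["kind"] == "I":
        # Index reads (I1, I2) were reported as "Invalid filename".
        return "invalid filename"
    if m["read"] not in ("1", "2"):
        # R3 was reported as "Invalid read number: 3".
        return "invalid read number"
    return "accepted"


if __name__ == "__main__":
    for name in (
        "Control-ATAC_S1_L001_R1_001.fastq.gz",
        "Control-ATAC_S1_L001_R2_001.fastq.gz",
        "Control-ATAC_S1_L001_R3_001.fastq.gz",
        "Control-ATAC_S1_L001_I1_001.fastq.gz",
    ):
        print(name, "->", classify(name))
```

Running a check like this over a directory before uploading would at least show in advance which files are likely to be skipped.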

Gordon Beattie

Mar 9, 2022, 8:48:23 AM

to basespace-developers

Sorry, I just realised I forgot to mention that this is single-cell ATAC data from the 10x pipeline, which is why there's an "R3" read. I realise that for gene expression-only runs you don't need the extra FASTQ files (I1 and I2) that mkfastq generates for running through cellranger, but for the ATAC pipeline R1, R2 and R3 are all needed.

As a general point, it would be useful to have flexibility in the read naming for this reason (among many others), perhaps by allowing users to opt out of BaseSpace analysis pipelines in these cases.
