CLI Upload: FASTQ naming restrictions don't allow dual index reads (10x data)


Gordon Beattie

Mar 8, 2022, 9:42:01 AM
to basespace-developers
Hey,

I'm hoping to upload data from the 10x single-cell pipeline, which gives me paired reads (R1 and R3) and dual index reads (R2 and I1). With the strict naming restrictions, I have no way of uploading the I1 and R3 files. My command and the first two warnings it produces are below (the command successfully uploads the R1 and R2 files). Any suggestions would be greatly appreciated!

All the best,
Gordon

bs dataset upload -p 340302969 -c europe --skip-invalid-filenames --allow-invalid-readnames --recursive .

WARNING: Invalid filename: 'Control-ATAC_S1_L001_I1_001.fastq.gz', rename to match the expected format (SampleName_S1_L001_R1_001.fastq.gz) or use '--skip-invalid-filenames' to ignore this file !

WARNING: Invalid read number: 3 in input Control_ATAC/Control-ATAC_S1_L001_R3_001.fastq.gz !
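
For anyone wanting to check a directory before uploading, here is a minimal Python sketch of the two checks that the warnings above seem to imply. It is not the CLI's actual validation logic: the regular expression is an assumption inferred from the expected format quoted in the first warning (SampleName_S1_L001_R1_001.fastq.gz), the restriction to read numbers 1 and 2 is inferred from the second warning, and `check_fastq_name` is a hypothetical helper name.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern, inferred from the format quoted in the warning
# (SampleName_S1_L001_R1_001.fastq.gz). The CLI's real check may differ.
EXPECTED_NAME = re.compile(
    r"^.+_S\d+_L\d{3}_R(?P<read>\d)_001\.fastq\.gz$"
)

# Assumption: only read numbers 1 and 2 are accepted, because R3 was
# rejected with "Invalid read number: 3".
ACCEPTED_READ_NUMBERS = {1, 2}


def check_fastq_name(path):
    """Return 'ok', 'invalid filename' or 'invalid read number' for a path."""
    name = PurePosixPath(path).name  # ignore any leading directories
    match = EXPECTED_NAME.match(name)
    if match is None:
        # Index reads such as I1 end up here: they have no _R<n>_ field.
        return "invalid filename"
    if int(match.group("read")) not in ACCEPTED_READ_NUMBERS:
        return "invalid read number"
    return "ok"


if __name__ == "__main__":
    for candidate in [
        "Control_ATAC/Control-ATAC_S1_L001_R1_001.fastq.gz",
        "Control_ATAC/Control-ATAC_S1_L001_R2_001.fastq.gz",
        "Control_ATAC/Control-ATAC_S1_L001_R3_001.fastq.gz",
        "Control_ATAC/Control-ATAC_S1_L001_I1_001.fastq.gz",
    ]:
        print(candidate, "->", check_fastq_name(candidate))
```

Under these assumptions, the R1 and R2 files pass, the R3 file fails the read-number check, and the I1 file fails the filename check, which matches the behaviour reported above.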

Gordon Beattie

Mar 9, 2022, 8:48:23 AM
to basespace-developers
Sorry, I just realised I forgot to mention that this is single-cell ATAC data from the 10x pipeline, which is why there's an "R3" read. I realise that for gene expression-only runs you don't need the extra FASTQ files (I1 and I2) that mkfastq generates for running through cellranger, but for the ATAC pipeline R1, R2 and R3 are all needed.

As a general point, it would be useful to have flexibility in the read names for this reason (among many others), perhaps by allowing an opt-out of BaseSpace analysis pipelines in these cases.
