I have a biosample with three paired-end FASTQ datasets, six files in total. These FASTQ datasets were all generated from the same library.
When I try to upload these FASTQ datasets into a single biosample using BaseSpaceCLI 1.4.0, I get an error about incorrect ReadNums. However, if I upload the same datasets via the BaseSpace Sequence Hub web UI, the upload completes without errors (and 100.00% of reads pass filter).
Is this a known bug? If so, where can I subscribe to updates on a fix? If not, is the BaseSpaceCLI team able to reproduce it? I can share more information via private email.
---
The BaseSpaceCLI commands I tried:
- bs upload dataset --project <project id> --biosample-name SAMPLE \
SAMPLE_S1_L002_R1_003.fastq.gz \
SAMPLE_S1_L002_R2_003.fastq.gz \
SAMPLE_S1_L003_R1_002.fastq.gz \
SAMPLE_S1_L003_R2_002.fastq.gz \
SAMPLE_S1_L004_R1_001.fastq.gz \
SAMPLE_S1_L004_R2_001.fastq.gz
- bs upload dataset --project <project id> --biosample-name SAMPLE \
--recursive .
- for r1 in $(ls *R1*.fastq.gz); do
r2=$(echo $r1 | sed "s/R1/R2/g")
bs upload dataset \
--project <project id> \
--biosample-name=SAMPLE $r1 $r2
done
The error message is:
ERROR: *** error in validation: Read(R1): Incorrect ReadNum ***
followed by:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0xc8 pc=0x72b926]
The last command, the one with the bash for loop, does not trigger the segmentation fault, but it still fails for two of the three FASTQ datasets. The other two commands fail for all three datasets.
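For anyone trying to reproduce this, the read number stored in each file's first header line can be listed with a small sketch like the one below. It assumes Casava 1.8+ style headers, where the read number is the first colon-separated field after the space, and `read_num` is a hypothetical helper, not a BaseSpaceCLI command. Whether this header field is what the ReadNum validation actually inspects is also an assumption.

```shell
# Hypothetical helper (not part of BaseSpaceCLI): print the read-number field
# from the first header line of a gzipped FASTQ file.
# Assumes Casava 1.8+ headers of the form:
#   @<instrument>:<run>:<flowcell>:<lane>:<tile>:<x>:<y> <read>:<filtered>:<control>:<index>
read_num() {
    gzip -cd "$1" | head -n 1 | awk '{ split($2, f, ":"); print f[1] }'
}

# List each FASTQ file in the current directory next to its header read number.
for fq in *.fastq.gz; do
    [ -e "$fq" ] || continue   # skip when the glob matches nothing
    printf '%s\t%s\n' "$fq" "$(read_num "$fq")"
done
```

Comparing this value with the R1/R2 token in each file name may show whether the three datasets differ in a way the validator could be reacting to.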