Hi everyone,
I’m having trouble matching my STARsolo results to those from CellRanger-arc-2.0.0 for the 3’ Gene Expression portion of samples run on 10X Single Cell Multiome ATAC + Gene Expression v1, with introns included. The command I am running is:
STAR --soloType CB_UMI_Simple \
  --soloCBwhitelist /home/m7/software/737K-arc-v1.txt \
  --readFilesCommand zcat \
  --genomeDir /home/m7/software/STAR_arcgenome \
  --readFilesIn /expanse/projects/qstore/csd686/raw_data/HuBMAP_single_cell_nucleus/SC2100044_SC2100048_Fetal-B-3076-pB-Snf-F1/fastq/SC2100044_GT21-13145_TGATGATTCA-CGACTCCTAC_S1_L001_R2_001.fastq.gz /expanse/projects/qstore/csd686/raw_data/HuBMAP_single_cell_nucleus/SC2100044_SC2100048_Fetal-B-3076-pB-Snf-F1/fastq/SC2100044_GT21-13145_TGATGATTCA-CGACTCCTAC_S1_L001_R1_001.fastq.gz \
  --soloCBstart 1 --soloCBlen 16 --soloUMIstart 17 --soloUMIlen 12 \
  --clipAdapterType CellRanger4 \
  --outFilterScoreMin 30 \
  --soloCBmatchWLtype 1MM_multi_Nbase_pseudocounts \
  --soloUMIfiltering MultiGeneUMI_CR \
  --soloUMIdedup 1MM_CR \
  --outFileNamePrefix /expanse/lustre/projects/csd686/m7/SC2100044_GT21/ \
  --soloFeatures GeneFull \
  --outSAMattributes NH HI nM AS CR UR CB UB GX GN sS sQ sM \
  --outSAMtype BAM SortedByCoordinate \
  --limitOutSJcollapsed 2000000 \
  --soloCellFilter EmptyDrops_CR
While the sequencing-level estimates are comparable, the cell-level metrics (number of cells, UMIs per cell, genes per cell) differ substantially between CellRanger and STARsolo. In CellRanger, I see:
Estimated number of cells: 5132, Mean raw reads per cell: 69,208.51, Fraction of transcriptomic reads in cells: 89.4%, Median UMI counts per cell: 1078, Median genes per cell: 686, Total genes detected: 28,550.
However, in STARsolo, I see:
Estimated Number of Cells: 2352, Unique Reads in Cells Mapped to GeneFull: 147773780, Fraction of Unique Reads in Cells: 0.71642, Mean Reads per Cell: 62828, Median Reads per Cell: 42592, UMIs in Cells: 6764737, Mean UMI per Cell: 2876, Median UMI per Cell: 1973, Mean GeneFull per Cell: 1410, Median GeneFull per Cell: 1146, Total GeneFull Detected: 27409.
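To make the gap concrete, here is a small sketch that puts the two summaries side by side. The numbers are copied from the summaries above; the dictionary keys and the `ratios` helper are just names I made up for illustration, and I am assuming the two tools' "median UMI" and "median genes" metrics are roughly comparable.

```python
# Metrics copied by hand from the two summaries above (same sample, same FASTQs).
# Key names are illustrative only; they are not fields from either tool's output.
cellranger = {"cells": 5132, "median_umi": 1078, "median_genes": 686, "total_genes": 28550}
starsolo = {"cells": 2352, "median_umi": 1973, "median_genes": 1146, "total_genes": 27409}


def ratios(reference, other):
    """Return other/reference for every metric, rounded to 2 decimals."""
    return {key: round(other[key] / reference[key], 2) for key in reference}


ratio = ratios(cellranger, starsolo)
for key, value in ratio.items():
    print(f"{key}: STARsolo / CellRanger = {value}")

# Sanity check on the STARsolo summary itself:
# "UMIs in Cells" divided by "Estimated Number of Cells" should give "Mean UMI per Cell".
starsolo_mean_umi = 6764737 / 2352
print(f"STARsolo mean UMI per cell (recomputed): {starsolo_mean_umi:.2f}")
```

STARsolo calls fewer than half as many cells as CellRanger, while its per-cell medians are noticeably higher and the total number of detected genes is nearly the same. That pattern would be consistent with the difference coming mainly from cell calling rather than from counting, but I have not confirmed this.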
Thank you in advance!