error on check_markers for 'SYMBOL'

Jinhong Kim

May 9, 2023, 11:40:17 AM
to garnett-users
Hello,

I've been getting the error shown below when calling the 'check_markers' function with the gene ID types set to 'SYMBOL'. Strangely, it works fine when I set 'ENSEMBL' for both the cds and the marker file, as shown beneath the error message. Could you please let me know what I need to correct?

Best,

Jinhong

* Error message

> marker_file_path_symbol <- system.file("extdata", "celltype_symbol", package = "garnett")
> marker_file_path_symbol
[1] "~/R/x86_64-pc-linux-gnu-library/4.3/garnett/extdata/celltype_symbol"
> cds <- load_cellranger_data("~/01_cellranger_7.1_output/", umi_cutoff=0)
> cds
class: cell_data_set
dim: 32285 2242
metadata(1): cds_version
assays(1): counts
rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ... ENSMUSG00000095019 ENSMUSG00000095041
rowData names(2): id gene_short_name
colnames(2242): AAACCCAAGAACGTGC-1 AAACCCAAGGCCCAAA-1 ... TTTGTTGCACTCTAGA-1 TTTGTTGGTATCGTAC-1
colData names(2): barcode Size_Factor
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
> marker_check_symbol <- check_markers(cds, marker_file_path_symbol, db=org.Mm.eg.db, cds_gene_id_type = "SYMBOL", marker_file_gene_id_type = "SYMBOL")
Error in value[[3L]](cond) :
  Garnett cannot convert the gene IDs using the db and types provided. Please check that your db, cds_gene_id_type and marker_file_gene_id_type parameters are correct. Please note that the cds_gene_id_type refers to the type of the row.names of the feature (gene) table in your cds. Conversion error: Error in .testForValidKeys(x, keys, keytype, fks): None of the keys entered are valid keys for 'SYMBOL'. Please use the keys method to see a listing of valid arguments.

* check_markers with 'ENSEMBL'

> marker_file_path_gene_id <- system.file("extdata", "celltype_geneid", package = "garnett")
> marker_file_path_gene_id
[1] "~/R/x86_64-pc-linux-gnu-library/4.3/garnett/extdata/celltype_geneid"
> cds <- load_cellranger_data("~/01_cellranger_7.1_output/", umi_cutoff=0)
> cds
class: cell_data_set
dim: 32285 2242
metadata(1): cds_version
assays(1): counts
rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ... ENSMUSG00000095019 ENSMUSG00000095041
rowData names(2): id gene_short_name
colnames(2242): AAACCCAAGAACGTGC-1 AAACCCAAGGCCCAAA-1 ... TTTGTTGCACTCTAGA-1 TTTGTTGGTATCGTAC-1
colData names(2): barcode Size_Factor
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
> marker_check_gene_id <- check_markers(cds, marker_file_path_gene_id, db=org.Mm.eg.db, cds_gene_id_type = "ENSEMBL", marker_file_gene_id_type = "ENSEMBL")
There are 7 cell type definitions


hpl...@gmail.com

Aug 3, 2023, 1:56:01 PM
to garnett-users
Hi Jinhong, sorry for the delay. The cds_gene_id_type refers to the gene ID type of the row names of your CDS object, which look like Ensembl IDs in your case. The marker_file_gene_id_type refers to the gene ID type used in your marker file. So I'm guessing what you want is to set cds_gene_id_type to "ENSEMBL" and marker_file_gene_id_type to "SYMBOL" (assuming you're using gene symbols in your marker file).
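
As a minimal sketch of that suggestion, reusing the object names from the original post (cds, marker_file_path_symbol) and assuming the marker file really does list gene symbols, the call would look something like this. It has not been run against this dataset:

```r
library(garnett)
library(org.Mm.eg.db)

# Assumes `cds` and `marker_file_path_symbol` were created as in the
# original post. The cds row names are Ensembl IDs (ENSMUSG...), while
# the marker file is assumed to use gene symbols.
marker_check_symbol <- check_markers(
  cds,
  marker_file_path_symbol,
  db = org.Mm.eg.db,
  cds_gene_id_type = "ENSEMBL",        # type of the cds row names
  marker_file_gene_id_type = "SYMBOL"  # type used in the marker file
)
```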

The second version runs without error because both types are ENSEMBL, so Garnett can skip the conversion step.
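
The "None of the keys entered are valid keys for 'SYMBOL'" part of the first error is raised when the IDs passed for conversion are not found under the requested key type. One rough way to confirm which key type the cds row names match is sketched below; it assumes `cds` is the object from the original post and uses only the standard AnnotationDbi `keys` accessor:

```r
library(AnnotationDbi)
library(org.Mm.eg.db)

# Assumes `cds` is the cell_data_set loaded in the original post.
ids <- rownames(cds)
head(ids)

# Fraction of row names that are valid keys under each key type.
# A value near 1 for ENSEMBL and near 0 for SYMBOL would suggest that
# cds_gene_id_type should be "ENSEMBL".
mean(ids %in% keys(org.Mm.eg.db, keytype = "ENSEMBL"))
mean(ids %in% keys(org.Mm.eg.db, keytype = "SYMBOL"))
```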

Best,
Hannah
