build_gene_activity_matrix error, does chromosome name matter?


Isaac Diaz

Aug 30, 2023, 11:42:43 PM
to cicero-users
Hello, I am running into an issue when running the following command:

unnorm_ga <- build_gene_activity_matrix(s.cds, a.sub, dist_thresh = 500000, coaccess_cutoff = 0)

I receive this error:

Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : 
  no 'dimnames[[.]]': cannot use character indexing
Calls: build_gene_activity_matrix ... build_composite_gene_activity_matrix -> [ -> [ -> subCsp_rows -> intI


My CDS looks just like the one in the tutorial:

                                  site_name         chr  bp1  bp2 num_cells_expressed overlap gene
Scaffold-1A_1_1000       Scaffold-1A_1_1000 Scaffold-1A    1 1000                 211      NA <NA>
Scaffold-1A_1001_2000 Scaffold-1A_1001_2000 Scaffold-1A 1001 2000                 225      NA <NA>
Scaffold-1A_2001_3000 Scaffold-1A_2001_3000 Scaffold-1A 2001 3000                 303      NA <NA>
Scaffold-1A_3001_4000 Scaffold-1A_3001_4000 Scaffold-1A 3001 4000                  68      NA <NA>
Scaffold-1A_4001_5000 Scaffold-1A_4001_5000 Scaffold-1A 4001 5000                  66      NA <NA>
Scaffold-1A_5001_6000 Scaffold-1A_5001_6000 Scaffold-1A 5001 6000                  51      NA <NA>


My conns object looks like this:

                               Peak1                       Peak2     coaccess
147000 Scaffold-1A_10000001_10001000 Scaffold-1A_9796001_9797000 -0.106314969
24000  Scaffold-1A_10000001_10001000 Scaffold-1A_9797001_9798000 -0.070820862
324770 Scaffold-1A_10000001_10001000 Scaffold-1A_9804001_9805000  0.003835977
47400  Scaffold-1A_10000001_10001000 Scaffold-1A_9809001_9810000 -0.012461730
54079  Scaffold-1A_10000001_10001000 Scaffold-1A_9810001_9811000  0.001366555
69300  Scaffold-1A_10000001_10001000 Scaffold-1A_9812001_9813000 -0.018624440




The only major difference I see is the chromosome naming convention. Could this be the issue? I appreciate any help.

Isaac Diaz

Aug 30, 2023, 11:44:58 PM
to cicero-users
Also, here are summaries of the overlap and num_cells_expressed columns:

print(summary(fData(s.cds)$overlap))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
      0     269     526     517     770    1000  245901 

print(summary(fData(s.cds)$num_cells_expressed))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    1.0   140.0   286.0   341.8   482.0  2221.0 

hpl...@gmail.com

Oct 26, 2023, 12:11:46 PM
to cicero-users
Hello,

Apologies for the delay. Yes, I think the issue here is the chromosome name. To get genomic positions, the function splits each peak name on any combination of colon, dash, and underscore, and expects three parts (chromosome, start, end). In your case, it finds four parts because of the dash in "Scaffold-1A". I suspect that removing the dash, using something like gsub, will solve the issue.
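
A minimal sketch of that idea, using peak names copied from the post. The splitting regex and the helper count_parts are assumptions made for illustration, not cicero's actual code, and the small conns data frame is a stand-in for the real object. Presumably the same renaming would need to be applied everywhere the peak names appear (the CDS row names, the site_name and chr columns, and Peak1/Peak2 in conns) so that they still match each other.

```r
# Peak names copied from the original post.
peaks <- c("Scaffold-1A_1_1000", "Scaffold-1A_1001_2000")

# Hypothetical helper: split on any run of colon, dash, or underscore,
# as described above, and count the resulting parts.
# (This regex is an assumption, not taken from the cicero source.)
count_parts <- function(x) lengths(strsplit(x, "[_:-]+"))

parts_before <- count_parts(peaks)   # four parts: "Scaffold", "1A", start, end

# Remove the dash. This is safe here only because the coordinates are
# separated by underscores, so the only dash is inside the scaffold name.
fixed_peaks <- gsub("-", "", peaks, fixed = TRUE)
parts_after <- count_parts(fixed_peaks)   # three parts: chromosome, start, end

# The same substitution on a stand-in for the conns object.
conns <- data.frame(
  Peak1 = "Scaffold-1A_10000001_10001000",
  Peak2 = "Scaffold-1A_9796001_9797000",
  coaccess = -0.106314969,
  stringsAsFactors = FALSE
)
# as.character() guards against Peak1/Peak2 being stored as factors.
conns$Peak1 <- gsub("-", "", as.character(conns$Peak1), fixed = TRUE)
conns$Peak2 <- gsub("-", "", as.character(conns$Peak2), fixed = TRUE)
```

It is probably simplest to rename the peaks in the input data before building the CDS and running cicero, so that every object is created with consistent names.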

Best,
Hannah