Cicero on Very Large Dataset

Shahroze Abbas

Apr 1, 2024, 8:53:04 AM
to cicero-users
Hello!

I would like to use Cicero as a prerequisite for CellOracle. My problem is that the multiome dataset I am working with has ~1.6 million cells, and I am not sure how to run the Cicero step for CellOracle because I don't think R will be able to handle a dataset of that size.

I have a merged peak set. Would I be able to run Cicero on a per-sample basis and then merge the results downstream in Python to build the CellOracle inputs? Are there specific considerations to keep in mind when merging Cicero outputs? Is this even possible?

Thanks in advance for your advice.

Shahroze

hpl...@gmail.com

Sep 2, 2024, 8:26:54 PM
to cicero-users
Apologies for the very late reply. I haven't used CellOracle, but from a quick scan it seems they use Cicero only to identify enhancer-promoter pairs. If that's the case, you could probably just concatenate the Cicero connection matrices from the different samples and combine/summarize the duplicate pairs (probably with the max or mean score) to get a consensus set. Another way to make the problem smaller is to divide the Cicero run by chromosome: each chromosome is considered separately, so you can run them one at a time and concatenate the results. Hope this helps.
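
For anyone doing the merge step in Python, here is a minimal sketch of the concatenate-and-summarize idea using pandas. It assumes each per-sample (or per-chromosome) result has already been loaded as a DataFrame with `Peak1`, `Peak2`, and `coaccess` columns, and that all samples share the same merged peak set so peak names match across tables. The function name `merge_cicero_connections` and the toy peak names are made up for illustration; this is not part of Cicero or CellOracle.

```python
import pandas as pd


def merge_cicero_connections(frames, how="max"):
    """Combine connection tables into one consensus table.

    frames: iterable of DataFrames with columns Peak1, Peak2, coaccess
            (assumed layout; adjust the column names if yours differ).
    how:    aggregation applied to duplicate peak pairs, e.g. "max" or "mean".
    """
    combined = pd.concat(frames, ignore_index=True)
    # Pairs without a score cannot contribute to the consensus.
    combined = combined.dropna(subset=["coaccess"])
    # One row per (Peak1, Peak2) pair, summarizing scores across inputs.
    return combined.groupby(["Peak1", "Peak2"], as_index=False)["coaccess"].agg(how)


# Toy example: two samples that share one peak pair.
sample_a = pd.DataFrame({
    "Peak1": ["chr1_100_200", "chr1_100_200"],
    "Peak2": ["chr1_500_600", "chr1_900_1000"],
    "coaccess": [0.25, 0.5],
})
sample_b = pd.DataFrame({
    "Peak1": ["chr1_100_200"],
    "Peak2": ["chr1_500_600"],
    "coaccess": [0.75],
})

consensus_max = merge_cicero_connections([sample_a, sample_b], how="max")
consensus_mean = merge_cicero_connections([sample_a, sample_b], how="mean")
```

The same function should also work for per-chromosome results: since each chromosome is handled separately, those tables would not share any peak pairs, and the aggregation reduces to a plain concatenation. Whether max or mean is the better summary likely depends on how different the samples are, so it may be worth comparing both.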

Best,
Hannah
