Do overlapping groups of cells violate statistical assumptions?

Matt R

Nov 3, 2022, 3:00:04 PM
to cicero-users
Hi Hannah!

I am reaching out to improve my understanding of how Cicero works.

For the graphical LASSO, the observations are formed by groups of similar cells (metacells). There may be some overlap between these metacells based on the overlap filtering parameter (90% from the manuscript methods). Therefore, there are likely some metacells that still share cells (in other words, one scATAC-seq cell could be present in multiple metacells), meaning that the observations are not technically independent of one another.

To my understanding, one of the assumptions of regression-based methods is independence of observations, meaning that observations can only be counted once. 

Since we have overlapping metacells, would this technically violate one of the assumptions of the graphical LASSO? Or does the 90% overlap filtering step address/dampen this concern?

Please let me know if I misunderstand something and/or feel free to share your "rebuttal statement" for a hypothetical reviewer. 

I ask so that I can be more prepared for the scientific review process and for my thesis committee meetings. Thank you!

hpl...@gmail.com

Dec 3, 2022, 10:50:47 AM
to cicero-users
Hi Matt,

Yes, the metacells likely share some cells, though in practice the 90% exclusion means that the number of overlaps is usually pretty low. Yes, typically you want observations to be independent, but in the case of ATAC data we found that a bagging method was a decent compromise between non-independent observations and not having enough metacells of a sufficient density to leave the binary regime.
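
For anyone who wants to check how much overlap is present in their own data, here is a rough sketch. It is not Cicero's code: it assumes each metacell is available as a set of cell indices, and the names shared_fraction and overlap_summary are hypothetical helpers made up for illustration. It reports how many pairs of metacells share any cells and how large the shared fraction gets.

```python
from itertools import combinations


def shared_fraction(a, b):
    """Fraction of the smaller metacell's cells that also appear in the other."""
    return len(a & b) / min(len(a), len(b))


def overlap_summary(metacells):
    """Summarize pairwise overlap across a list of metacells (sets of cell indices)."""
    fractions = [shared_fraction(a, b) for a, b in combinations(metacells, 2)]
    return {
        "n_pairs": len(fractions),
        "n_overlapping_pairs": sum(f > 0 for f in fractions),
        "max_shared_fraction": max(fractions, default=0.0),
        "mean_shared_fraction": sum(fractions) / len(fractions) if fractions else 0.0,
    }


# Toy data (made up): three metacells of four cells each; the first two share two cells.
metacells = [{0, 1, 2, 3}, {2, 3, 4, 5}, {6, 7, 8, 9}]
summary = overlap_summary(metacells)
print(summary)
```

If most pairs share no cells and the maximum shared fraction is well under the filtering threshold, that is a concrete number to show a reviewer alongside the argument above.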

Hope this helps,
Hannah