cicero inverse correlations

Daniel Gingerich

Feb 17, 2021, 3:58:34 PM

to cicero-users

I really appreciate cicero, the great resource you and your lab have provided to the world of bioinformatics. I am wondering if there is a way to tell from the run_cicero() output whether the cicero peak-peak correlations are negative?

We have a set of differentially accessible peaks from control and disease that we are analyzing. My PI instructed me to input only upregulated (more accessible) differentially accessible peaks into cicero, as we are interested in upregulation. I would like to ensure that the cicero connections agree with the differential accessibility calculations. How do I know that all correlations are positive?

Best,

Dan Gingerich

hpl...@gmail.com

Mar 2, 2021, 2:38:21 PM

to cicero-users

Hi Dan,

Sorry for the delay. I'm not entirely sure if I understand your question, but the coaccessibility values in the "coaccess" column of the run_cicero output will be negative in the case of an inverse correlation. Does that answer the question?
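As a rough illustration (a sketch, not part of Cicero): assuming the run_cicero() output has been exported from R to a CSV file, for example with write.csv(), and that it has the columns Peak1, Peak2 and coaccess, the inverse correlations could be pulled out in Python along these lines. The helper name negative_connections and the example peak coordinates are made up for illustration.

```python
import csv
import io


def negative_connections(csv_text):
    """Return (Peak1, Peak2, coaccess) rows whose coaccess score is below zero."""
    negatives = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = row["coaccess"]
        if value in ("", "NA"):  # R writes missing scores as NA
            continue
        score = float(value)
        if score < 0:
            negatives.append((row["Peak1"], row["Peak2"], score))
    return negatives


# Made-up example rows, not real Cicero output.
example = """Peak1,Peak2,coaccess
chr1_100_200,chr1_5000_5200,0.42
chr1_100_200,chr1_9000_9300,-0.17
chr1_5000_5200,chr1_9000_9300,NA
"""

for peak1, peak2, score in negative_connections(example):
    print(peak1, peak2, score)
```

If the returned list is empty, none of the exported connections has a negative coaccess score.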

Best,

Hannah

Nick

Jul 9, 2021, 12:50:46 PM

to cicero-users

Hi,

Jumping in on this, since I'm interested in the negative (anti) correlations as well. It seems that positive correlations far outnumber negative correlations (~100000:1). My expectation is that there would be a roughly 50:50 split. Is this to be expected, i.e. does the cicero model place greater emphasis on discovering positive correlations?

Thanks,

Nick

Daniel Gingerich

Jul 25, 2021, 6:32:22 PM

to cicero-users

Nick,

My new understanding is that any cis interaction will be positively correlated. Peaks are accessible when bound by proteins (other than histones). I think this is why it is not explained very much in the paper. The reason I am interested in negative correlations has to do with negative correlations between differentially accessible peaks and differentially expressed genes. This would suggest a gene silencer interaction. If a silencer sequence is bound by a transcription regulatory factor (TRF), this peak will be open because it is bound by a protein. If the TRF or TRF complex is then bound to the silencer target, that peak will also be accessible. The end result will show positive correlation between the two interacting peaks, even though the correlation with gene expression is negative.

Best,

Dan

Nick

Sep 1, 2021, 2:26:02 PM

to cicero-users

Hi Dan,

Thanks for the input. I'm interested in these negative correlations for the same reason: accessible peaks correlated with inaccessible promoter regions could indicate potential silencer elements. We can't say this definitively without corresponding RNA-seq data, since there are genes with accessible promoters that show minimal expression.

However, I'm still confused as to why the positive correlations far outnumber the negative correlations. I don't have a great understanding of the cicero model. Is it just more difficult to identify inverse correlations as true negative correlations due to the sparse nature of ATAC-seq?

hpl...@gmail.com

Sep 2, 2021, 1:23:01 PM

to cicero-users

Hi Nick,

In my experience the negative correlations don't outnumber the positive by much. Looking at the HSMM dataset from the original Cicero paper, approx 34% of the coaccess scores are positive, 31% are negative, 33% are zero and the rest are NA. Maybe this is something particular to your dataset?
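A breakdown like this can be tallied for any set of scores. Here is a minimal sketch in Python, assuming the coaccess values have already been loaded into a list, with missing (NA) scores represented as None or NaN. The function name sign_breakdown and the example values are hypothetical and are not taken from the HSMM dataset.

```python
import math
from collections import Counter


def sign_breakdown(scores):
    """Return the percentage of scores that are positive, negative, zero, or NA."""
    counts = Counter()
    for score in scores:
        if score is None or math.isnan(score):
            counts["NA"] += 1
        elif score > 0:
            counts["positive"] += 1
        elif score < 0:
            counts["negative"] += 1
        else:
            counts["zero"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    categories = ("positive", "negative", "zero", "NA")
    return {name: 100.0 * counts[name] / total for name in categories}


# Made-up scores for illustration only.
example_scores = [0.3, -0.2, 0.0, None, 0.1]
print(sign_breakdown(example_scores))
```

A heavily skewed result on your own data would point to something specific to that dataset or to how the peaks were selected.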

Hannah

hpl...@gmail.com

Sep 2, 2021, 1:32:28 PM

to cicero-users

*positive don't outnumber negative*
