mal-formed RSAT matrix clusters for JASPAR 2022 Invertebrates !?! (NOT)


Malcolm Cook

Aug 31, 2022, 12:29:00 PM
to JASPAR Q&A Forum
Hello,

Now that I have your attention ;)... 

Looking at JASPAR_2022_matrix_clustering_vertebrates_CORE_tables, I was initially surprised to see things like:

 (a) BACH2 is a member of both cluster_2 and cluster_18.
 (b) Atoh1 appears twice as a member of cluster_6.

This can easily be confirmed looking at that file in the browser and searching for the motif.

Upon consideration, this apparent violation of well-formed clustering must be due to the use of non-unique TF names in that file. A TF sometimes has multiple matrices, which may cluster together (as in (b)) or apart (as in (a)).

Can you provide a version of the cluster membership table which identifies the members by their long-form id (e.g. JASPAR_2022_vertebrates_CORE_m1_MA0004_1) or, even better, by their motif name incorporating the version (e.g. "MA0004.1")?
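
To make the two id styles concrete, here is a minimal Python sketch of the conversion from the long-form id to the versioned motif name. The function name `long_id_to_matrix_id` is hypothetical, and the regular expression assumes that every long-form id ends in an accession followed by an underscore and a version number, as in the single example above; other collections may not follow that pattern.

```python
import re

# Assumption: a long-form id ends with "<ACCESSION>_<VERSION>", e.g. "..._MA0004_1".
_TAIL = re.compile(r"([A-Z]+\d+)_(\d+)$")


def long_id_to_matrix_id(long_id):
    """Turn a long-form id into a versioned motif name such as 'MA0004.1'."""
    match = _TAIL.search(long_id)
    if match is None:
        raise ValueError(f"unrecognised long-form id: {long_id!r}")
    accession, version = match.groups()
    return f"{accession}.{version}"
```

Calling it on the id quoted above, `long_id_to_matrix_id("JASPAR_2022_vertebrates_CORE_m1_MA0004_1")`, gives `"MA0004.1"`.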

Or, even better, could you provide a SQLite version of all this?

Or perhaps I am missing a file you already provide?

Thanks, Cheers (and expect a citation!),

Malcolm (aka mec!at!stowers.org)


Malcolm Cook

Sep 1, 2022, 4:28:51 PM
to JASPAR Q&A Forum
re: my request "Can you provide a version of the cluster membership table which identifies them by their long-form id ..."

I now see by browsing this directory that there is already a file in this format - namely the cluster.tab file.

So, good, my needs are already met in this regard.  Hooray.
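
For anyone who wants to confirm that the clusters are well-formed once members are identified by unique id, here is a rough Python sketch. It assumes (I have not verified this against the actual file) that each line holds a cluster name, a tab, and then a comma-separated list of member ids, and that lines starting with "#" or ";" are comments. The function name `membership_problems` is made up for illustration.

```python
import sys
from collections import defaultdict


def membership_problems(lines):
    """Return {member_id: [cluster, ...]} for every id listed more than once."""
    seen = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: blank lines and lines starting with '#' or ';' carry no data.
        if not line or line.startswith(("#", ";")):
            continue
        # Assumed layout: <cluster name> TAB <comma-separated member ids>
        cluster, _, members = line.partition("\t")
        for member in members.split(","):
            member = member.strip()
            if member:
                seen[member].append(cluster)
    # A well-formed clustering lists each id exactly once.
    return {m: c for m, c in seen.items() if len(c) > 1}


if __name__ == "__main__":
    # Usage: python check_clusters.py path/to/cluster-table
    with open(sys.argv[1]) as handle:
        problems = membership_problems(handle)
    for member, clusters in sorted(problems.items()):
        print(member, ",".join(clusters), sep="\t")
    sys.exit(1 if problems else 0)
```

An empty result means no id is listed twice, whether in different clusters (as BACH2 appeared to be by name) or within one cluster (as Atoh1 appeared to be).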

I now suggest that you advertise the availability of this file under Additional Files, or it might be missed by other hopeful users of this fine resource.

Thanks & Cheers

Anthony Mathelier

Sep 5, 2022, 2:49:38 PM
to JASPAR Q&A Forum
Thanks very much for pointing that out. We'll address this as you suggest to make it clearer.