Profile Inference Tool


Jessica McAnulty

Sep 12, 2022, 1:27:07 PM
to JASPAR Q&A Forum
I'm trying to find out which other transcription factors share the same motif as c-Myc. I struggled to find a database that allows searching by motif (please share if you know of one!), but I came across JASPAR's profile inference tool. From my understanding, this may still help me accomplish my goal.

Does profile inference find other transcription factors that bind a motif similar to that of the protein sequence you submit? Please clarify.

Oriol Fornés

Sep 12, 2022, 1:55:16 PM
to Jessica McAnulty, JASPAR Q&A Forum
Hi Jessica,

Not exactly. Given a TF, the profile inference tool finds the TFs in JASPAR that are predicted to share the same DNA-binding motif(s).
You would like to do the opposite: given a JASPAR TF, find all TFs in a given database/organism (e.g. UniProt, human.fa) that share the same DNA-binding motif(s).
I think I should be able to create a script that does exactly what you want.

I will add it to my TODO list for this week.

--
Oriol Fornés Crespo, PhD
Research Associate / Wasserman Lab
or...@cmmt.ubc.ca

Centre for Molecular Medicine and Therapeutics (CMMT)
Dpt. of Medical Genetics / University of British Columbia
BC Children's Hospital Research Institute
950 W 28th Ave, Room 3109, Vancouver, BC V5Z 4H4, Canada



Jessica

Sep 13, 2022, 10:08:14 AM
to Oriol Fornés, JASPAR Q&A Forum
Wonderful, thank you, Oriol! Regarding the profile inference tool as it is now: if I receive no results when I submit my protein of interest, does that suggest there are no other proteins that share that binding motif?
--
Jessica McAnulty

Oriol Fornés

Sep 13, 2022, 10:09:35 AM
to Jessica, JASPAR Q&A Forum
Correct. No output = no hits.


Oriol Fornés

Sep 22, 2022, 12:06:09 PM
to Jessica, JASPAR Q&A Forum
Hi Jessica,

I have to update the documentation on GitHub, but I have implemented what you requested.

The profile inference tool repository now contains the script infer_homolog.py. Given one or more transcription factor sequences in FASTA format (e.g. JUN_HUMAN.fa), it searches a database of sequences in FASTA format (e.g. the full human proteome) for homologs that are predicted to share the same DNA-binding specificities.
As in the original inference, it first searches for homologs of the transcription factor(s) using BLAST+, and then compares the predicted DNA-binding domains (DBDs) of the transcription factor(s) and the homologs. Finally, the script returns the transcription factor-homolog pairs whose pairwise DBD percentage of sequence identity (i.e., DBD %ID) is above a certain threshold (from this manuscript). To skip the BLAST+ search for homologs, thereby performing an inference as in the previous manuscript, use the option "--no-blast".
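
To make the DBD %ID idea concrete, here is a minimal Python sketch of a percentage of sequence identity between two already-aligned sequences. It is only an illustration, not the code used by infer_homolog.py: the function name percent_identity, the choice to ignore gapped columns, and the toy sequences are all assumptions, and the tool's own definition (alignment method, denominator) may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of identical residues over gap-free aligned columns.

    Assumption: both inputs come from the same pairwise alignment,
    have equal length, and use "-" for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where neither sequence has a gap
    matches = 0   # compared columns with identical residues
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    return matches / compared if compared else 0.0


# Toy aligned sequences (made up, not real DBDs).
identity = percent_identity("ACDE-F", "ACDQGF")
print(round(identity, 3))
```

A pair would then be kept only if this value is at or above the chosen threshold.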

(JASPAR-profile-inference) oriol@gpurtx-2:~/JASPAR-inference-tool$ ./infer_homolog.py --threads 32 ./examples/human/JUN_HUMAN.fa ./examples/human/human.fa
100%|████████████████████| 1/1 [00:00<00:00, 32.28it/s]
100%|████████████████████| 20601/20601 [00:19<00:00, 1080.40it/s]
100%|████████████████████| 1/1 [00:00<00:00,  1.53it/s]
Query   Target  E-value Query Start-End Target Start-End        DBD %ID
sp|P05412|JUN_HUMAN     sp|P05412|JUN_HUMAN     0.0     1-331   1-331   1.0
sp|P05412|JUN_HUMAN     sp|P17535|JUND_HUMAN    2.37e-81        60-331  87-347  0.891
sp|P05412|JUN_HUMAN     sp|P17275|JUNB_HUMAN    6.47e-70        1-331   1-347   0.828
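
If you want to post-process a table like the one above, a short Python sketch such as the following may help. It assumes the output was captured as text and that each data row has six whitespace-separated fields, as in the example run; the names Hit and parse_hits are hypothetical helpers, not part of the repository. Note that the "DBD %ID" column is printed as a fraction (e.g. 0.891), not as a percentage.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    target: str
    evalue: float
    query_range: str     # e.g. "60-331"
    target_range: str    # e.g. "87-347"
    dbd_identity: float  # fraction, as printed in the "DBD %ID" column


def parse_hits(text: str) -> list[Hit]:
    """Parse data rows; skip the header and progress-bar lines."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:  # header and progress bars do not have 6 fields
            continue
        query, target, evalue, q_range, t_range, identity = fields
        try:
            hits.append(Hit(query, target, float(evalue),
                            q_range, t_range, float(identity)))
        except ValueError:
            continue  # not a data row after all
    return hits


# Rows copied from the example run above.
SAMPLE = """\
Query\tTarget\tE-value\tQuery Start-End\tTarget Start-End\tDBD %ID
sp|P05412|JUN_HUMAN\tsp|P05412|JUN_HUMAN\t0.0\t1-331\t1-331\t1.0
sp|P05412|JUN_HUMAN\tsp|P17535|JUND_HUMAN\t2.37e-81\t60-331\t87-347\t0.891
sp|P05412|JUN_HUMAN\tsp|P17275|JUNB_HUMAN\t6.47e-70\t1-331\t1-347\t0.828
"""

hits = parse_hits(SAMPLE)
# Drop the self-hit so only other TFs remain.
others = [h for h in hits if h.target != h.query]
for h in others:
    print(h.target, h.dbd_identity)
```

Dropping the self-hit leaves only the other TFs predicted to share the query's DNA-binding specificity.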

Let me know if you have any questions.

