Issue when modeling proteins


Alexandre Pereiras

Dec 16, 2024, 8:36:59 AM
to Phyre
Hello all,

I'm seeing some odd behavior... I'm trying to model a few proteins but getting no results. The proteins are



I also tried with the well-known hemoglobin


But I'm getting no results and no report... I have been running in Normal mode. 

Any ideas what may be going on? 

Thanks

Powell, Harold

Dec 16, 2024, 1:06:51 PM
to Phyre
Hi

It looks like we're having some hard disk issues. We're doing a reboot of the webserver over the next few hours, and this might resolve the situation.

I recommend that you don't try to submit any jobs until 12:00 UTC tomorrow, Tuesday 17th December, so that we can be sure that any further problems are not caused by the current issues with the webserver.

Harry

Kamogelo Mokgwela

Dec 16, 2024, 1:11:59 PM
to harold...@imperial.ac.uk, Phyre

The issue you're facing might be due to a combination of input parameters, tool settings, or the protein sequences/models you are working with. Here's a list of possible reasons and troubleshooting steps to help resolve the problem:

1. Input Sequence Errors

Check Protein Sequences: Verify that the sequences from UniProtKB are correct, complete, and properly formatted for the modeling tool you're using.

FASTA Format: Ensure that the input sequence is in the correct FASTA format.

2. Modeling Tool Settings

Normal Mode Limitations: If you're running the tool in "Normal mode," there may be constraints on sequence size, complexity, or alignment sensitivity. Check if there's a "High Sensitivity" or "Expert Mode" for better handling of complex or less-characterized sequences.

Error Logs: Some tools may suppress detailed error reports in normal mode. Check if there's an option to enable detailed logs or verbose mode.

3. Protein Characteristics

Unstructured or Uncharacterized Proteins: Proteins like TupB may lack enough structural or sequence homology data for accurate modeling. Consider checking for similar proteins in structural databases (like PDB) to provide templates.

Multimeric or Complex Nature: Transporter proteins often function as part of complexes, which might make them harder to model individually.

4. Database and Templates

Template Availability: Ensure that homologous structures are available in the structural database (e.g., PDB). If templates are missing or insufficient, tools like homology modeling might fail.

Custom Templates: If possible, provide a known template structure close to your sequence as a starting point.

5. Tool-Specific Issues

Service Overload or Bugs: Modeling servers can experience issues like high traffic or internal errors. Try running the same sequences on another modeling platform (e.g., AlphaFold, I-TASSER, SWISS-MODEL) to rule out server-specific problems.

Version or Compatibility: Ensure that you're using the most recent version of the tool.

6. Hemoglobin (HBB) Exception

HBB is well-characterized, so issues here are likely not related to sequence complexity. If it also fails, it might indicate a tool-specific or systemic issue (e.g., server error, incompatible input format).

Steps to Troubleshoot

1. Recheck Input Files: Verify and simplify your input.

2. Run a Test Sequence: Use a known sequence like HBB to see if the tool functions properly.

3. Try Alternative Tools: Use other modeling platforms like:

AlphaFold Protein Structure Database

I-TASSER

SWISS-MODEL

4. Check Tool Documentation: Look for known limitations or troubleshooting sections in the tool's help documentation.

5. Contact Support: If all else fails, consider contacting the tool's support team with your issue and input details.




--
You received this message because you are subscribed to the Google Groups "Phyre" group.
To unsubscribe from this group and stop receiving emails from it, send an email to phyre+un...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/phyre/0c61c567-a801-4e8d-b7ce-96f3834b52fen%40googlegroups.com.

Powell, Harold

Dec 17, 2024, 5:13:47 AM
to Phyre
Just for the record -

Since I run Phyre, I know a little about this.

We are having some issues with the HDDs - I hope that these will be resolved today. It's most likely that the OP's problems are linked with this (and probably due to an issue on a single node in our cluster), because no-one else has reported problems - if you have had problems like this, then PLEASE get in touch!

Once our local hardware problems are sorted out, I will be able to investigate further.

It is possible that the OP's problems are associated with the change to Phyre2.2 - I will be looking into this later today to see where things have gone wrong with their submission.

Regarding Kamogelo's points:

(1) Phyre checks the sequence to make sure it's valid before job submission - if it contains anything other than the 20 canonical amino acids, or if more than 90% of it is made up of just Alanine, Threonine, Cysteine or Glycine (guess why...), you shouldn't be able to finish job submission. If you can, then PLEASE let us know so that we can investigate.

If you use the Phyre tool to use a sequence directly from UniProt, then that is checked by the people at UniProt when they put it in the DB, by us before the Phyre "sequence box" is filled in, and then before Phyre actually runs the job. It's difficult to be too careful!

If you really need to model a protein sequence that is nearly all ATCG (you would need a fairly compelling justification), then let me know by e-mail and I can submit it for you.

(2) The sequence is limited to lengths between 30 and 6000 amino acids - I don't think that any of the OP's sequences come close to these limits (because our servers are down at the moment I can't check until later today). We don't check for things like complexity (not sure what is meant by this, but we don't look specifically for things like disulfide bridges or covalently modified AAs), but if the HMM matching finds that there are no MSAs with E-values < 1e-3, it's possible that you'd get no answers - but that would have different symptoms. However, the OP's post (and since I run the service, I can check this by examining their results locally) indicates that this is not the case.
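
As a rough illustration of the submission checks described in points (1) and (2), here is a minimal Python sketch that can be run locally before submitting a job. It is not Phyre's actual code: the function name, the constants, and the reading of the 90% rule as "fraction of residues that are A, T, C or G" are assumptions made for this example.

```python
# Hypothetical pre-submission check; NOT the code that Phyre itself runs.
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")  # one-letter codes of the 20 canonical amino acids
NUCLEOTIDE_LIKE = set("ATCG")            # residues that also look like DNA bases
MIN_LEN, MAX_LEN = 30, 6000              # length limits quoted in the thread
MAX_ATCG_FRACTION = 0.90                 # assumed reading of the "more than 90%" rule


def check_sequence(seq: str) -> list[str]:
    """Return a list of problems; an empty list means the sequence passes."""
    seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
    problems = []

    if not MIN_LEN <= len(seq) <= MAX_LEN:
        problems.append(f"length {len(seq)} is outside {MIN_LEN}-{MAX_LEN}")

    unexpected = sorted(set(seq) - CANONICAL)
    if unexpected:
        problems.append("non-canonical residues: " + "".join(unexpected))

    if seq:
        fraction = sum(c in NUCLEOTIDE_LIKE for c in seq) / len(seq)
        if fraction > MAX_ATCG_FRACTION:
            problems.append("more than 90% A/T/C/G, looks like a nucleotide sequence")

    return problems


# Arbitrary 42-residue protein-like test string (not tied to any UniProt entry).
print(check_sequence("MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRF"))
# A DNA-like string trips the composition check.
print(check_sequence("ATCG" * 10))
```

A sequence that passes this sketch can still be rejected by the server, and vice versa; it only mirrors the rules as stated in this message.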

Using a haemoglobin is a good test - it should always work because there are many HBs with similar sequences in the PDB.

(3) Yes, using alternative tools is a good idea.

(4) Troubleshooting mostly consists of asking us at Imperial when you have problems.

(5) Actually, the first thing (and not if all else fails...) that I would do is to check with us here at Imperial if you have problems - we are always happy to help. The contact e-mail is clearly marked in many places on the Phyre2 website.

The best single piece of information that you can give us when reporting a problem is the 16-character unique code that is assigned to each Phyre job - it saves me no end of time if this is included in an e-mail.
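
For anyone collecting job codes to include in a problem report, the following is a minimal Python sketch that pulls them out of pasted text such as a results e-mail. It assumes the codes are 16 lowercase hexadecimal characters, which matches the job ids quoted elsewhere in this thread but is not a documented format; the function name is hypothetical.

```python
import re

# Assumption: a job code is exactly 16 lowercase hexadecimal characters.
JOB_ID = re.compile(r"\b[0-9a-f]{16}\b")


def find_job_ids(text: str) -> list[str]:
    """Return the distinct job codes found in text, in order of first appearance."""
    found = []
    for match in JOB_ID.findall(text):
        if match not in found:
            found.append(match)
    return found


sample = "Failed jobs: 1002c434db66f325, b938511404a2dcfa and again 1002c434db66f325"
print(find_job_ids(sample))
```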

Best wishes

Harry

Alexandre Pereiras

Dec 17, 2024, 12:31:39 PM
to Phyre
Everything seems to be working now, thanks a lot!

Kamogelo Mokgwela

Dec 17, 2024, 12:35:14 PM
to alexandre...@gmail.com, Phyre

Alexandre Pereiras

Dec 17, 2024, 12:37:02 PM
to Phyre
I would rather give the credit to Harry Powell, if you don't mind ;-)

Kamogelo Mokgwela

Dec 17, 2024, 12:43:44 PM
to alexandre...@gmail.com, Phyre

geetika khosla

Dec 18, 2024, 4:18:12 AM
to alexandre...@gmail.com, Phyre

Hey!
Could you please elaborate on your problem?



Alexandre Pereiras

Dec 18, 2024, 4:18:13 AM
to geetika khosla, Phyre
Sure. I am just getting no results. The email shows no % value and no report is presented, as if no model is being generated.

geetika khosla

Dec 18, 2024, 4:18:19 AM
to Alexandre Pereiras, Phyre

Ok!
You should download the protein model from the PDB.

Alexandre Pereiras

Dec 18, 2024, 4:18:23 AM
to Phyre
Forwarding my email, as it looks like I only replied to Geetika...

All the info is there. 

---------- Forwarded message ---------
From: Alexandre Pereiras <alexandre...@gmail.com>
Date: Mon, 16 Dec 2024 at 16:08
Subject: Re: [Phyre:243] Issue when modeling proteins
To: geetika khosla <geetikak...@gmail.com>


Ok, let's start from scratch :)

I'm using Phyre as part of our studies to predict a model based on a sequence of amino acids. In order to learn the tool, we've chosen the sequence of A0A5C2HCR3, for which there is no equivalent in the PDB. When using that sequence in Phyre, it produces no results (example job ids: 1002c434db66f325, b938511404a2dcfa, c7bf0262fea39639, de5dc315b1f722c1, c89f08271ef3d7c8...)

I thought that I may have been using the tool incorrectly, so I started using different sequences to see whether other results are presented. For example: A0A6G5QL18 (with job id 4c7f491a02db0623) and P68871 (with job id a32ef8c9e3f511d3). All of them produce the same result. Example extracted from the email (note there is no value for %):

NORMAL mode.
Confidence in the model: %

Example from Phyre UI (no report generated):

[attached image: image.png]

Hence, my ask to understand what's going on here, as I would have expected results from the tool in all of the cases above.

Thanks 

Powell, Harold

Dec 18, 2024, 4:42:35 AM
to Phyre
Just a quick note on Geetika's post -

(First a note on my nomenclature - when I write "structure", I mean something for which experimental data has been obtained; when I write "model", I'm talking about things for which there is no direct experimental evidence - so no X-ray, NMR, Cryo-EM, etc).

If the protein structure is in the PDB, this is a good idea. If it isn't in the PDB, then the AlphaFold DB is a good place to look (but as far as I know it hasn't been updated for over 2 years - so it doesn't have models of newer depositions to UniProt, for example). However, for novel sequences this is not an option, and some kind of modelling software is appropriate.

Mostly, people don't use modelling tools like Phyre (or any of the others that Kamogelo mentioned in this thread) if there is a structure of their protein in the PDB - why would they bother with a model based only on the sequence when they can get a structure that has been built from experimental data (X-Ray, NMR, Cryo-EM, etc) that will in almost every case represent something much closer to reality? Even AlphaFold3 models (good as they are) are probably not as good as a moderately well solved and refined experimental structure. I don't have the figures to hand, but from memory, AlphaFold 2 models are generally considered to be about as accurate as a 3Å X-ray structure - so better than Max Perutz's original haemoglobin (~5.5Å) but considerably worse than John Kendrew's myoglobin (originally ~2Å), which readers may remember were the first two X-ray structures of proteins that were published.

Many of the modelling tools available will allow the user to base their model on an existing structure in the PDB or on a model from another source - e.g. Phyre2.2 has what we call "AlphaThread" which will use the model in the AlphaFold DB with the closest sequence to the user's as the basis for the homology model.