Improving how "No OpenID endpoint found" errors are reported


Werner Strydom

May 13, 2012, 7:51:00 PM
to dotnet...@googlegroups.com
This message aims to get some consensus on error handling when an OpenID relying party tries to authenticate with an OpenID provider, and on the best means to modify DotNetOpenAuth to implement it. As for the latter, I'd appreciate some guidance.

Assume that an OpenID relying party (https://rp.example.com) is trying to authenticate with an OpenID provider (https://op.example.com). Research at this point has highlighted the following potential issues when the RP wants to authenticate using the OP. If you know of additional ones, please let me know.

  • The host of the OP is incorrect, or it doesn't exist
  • The host of the OP is correct and the connection is dropped (as IIS/WCF does when the request is too large)
  • The host of the OP exists and never responds within a given time period
  • The host of the OP exists and returns HTTP Status 30x
  • The host of the OP exists and returns HTTP Status 400
  • The host of the OP exists and returns HTTP Status 401
  • The host of the OP exists and returns HTTP Status 403
  • The host of the OP exists and returns HTTP Status 404
  • The host of the OP exists and returns HTTP Status 500
  • The host of the OP exists and returns HTTP Status 503
  • The host of the OP exists and returns HTTP Status 504
  • The host of the OP exists, returns HTTP Status 200 but an invalid XRDS document
  • The host of the OP exists, returns HTTP Status 200, but no XRDS document
  • The host of the OP exists, returns HTTP Status 200, but the HTML returned does not satisfy HTML discovery

Some of these conditions may be temporary (such as HTTP status 503), others permanent (e.g. the host doesn't exist). In any case, an appropriate exception should be thrown by DotNetOpenAuth so that the OpenID relying party can handle it accordingly.

Here are some requirements for that exception:
  • The exception message should support localization
  • The OpenID relying party should be able to determine the underlying error without having to analyze the message.
  • The exception should be as comprehensive as possible, as it may be logged in isolation and may be the only source to diagnose an issue. This assumes that there are multiple OpenID relying parties and providers running on the same physical host.
  • Messages should not contain abbreviations as this poses problems for support personnel. 
The exception may encapsulate:
  • the source of the request (https://rp.example.com/signin), to differentiate one OpenID relying party from another.
  • the destination of the request (https://op.example.com), to differentiate one OpenID provider from another.
  • a correlation identifier to differentiate one flow from another
  • a reason, to diagnose the issue
  • a corrective action, if possible
Here are examples of messages that may be more appropriate:
  • When HTML discovery fails: "Failed to discover the OpenID provider endpoint using the HTML Discovery method at https://op.example.com/. The realm used trying to perform the discovery was https://rp.example.com".
  • When XRDS fails because the XRDS document is malformed: "Failed to discover the OpenID provider endpoint using the XRDS discovery method at https://op.example.com/. The OpenID provider returned an XRDS document, but the document was malformed. The realm used trying to perform the discovery was https://rp.example.com."
  • When the OP doesn't respond in a reasonable time: "Failed to discover the OpenID provider endpoint using XRDS and HTML discovery methods at https://op.example.com/. The OpenID provider did not respond within a reasonable time (00:00:10). The realm used trying to perform the discovery was https://rp.example.com."  
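
A minimal sketch of an exception that carries the fields listed above, so that a relying party can inspect them without parsing the message. It is written in Python only to keep it short; DotNetOpenAuth is a .NET library, and every name here (`DiscoveryError`, its attributes, the corrective-action text) is hypothetical and not part of any DotNetOpenAuth API.

```python
import uuid


class DiscoveryError(Exception):
    """Hypothetical exception holding the context proposed in this post."""

    def __init__(self, reason, source, destination,
                 corrective_action=None, correlation_id=None):
        self.reason = reason                        # why discovery failed
        self.source = source                        # the relying party (realm)
        self.destination = destination              # the OpenID provider
        self.corrective_action = corrective_action  # optional hint for support staff
        # Assumption: a random UUID is good enough to tell one flow from another.
        self.correlation_id = correlation_id or str(uuid.uuid4())
        super().__init__(self._build_message())

    def _build_message(self):
        parts = [
            f"Failed to discover the OpenID provider endpoint at {self.destination}.",
            self.reason,
            f"The realm used to perform the discovery was {self.source}.",
            f"Correlation identifier: {self.correlation_id}.",
        ]
        if self.corrective_action:
            parts.append(self.corrective_action)
        return " ".join(parts)


error = DiscoveryError(
    reason="The OpenID provider returned an XRDS document, but the document was malformed.",
    source="https://rp.example.com/signin",
    destination="https://op.example.com/",
    corrective_action="Ask the provider's operator to validate the XRDS document.",
    correlation_id="example-flow-1",
)
print(error)             # full message, usable when logged in isolation
print(error.destination)  # structured access, no message parsing needed
```

The message is assembled from the structured fields rather than the other way round, so the text can be localized or shortened without changing what calling code relies on.
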
Does anyone have anything to add? If not, any suggestions how to best approach implementing it?

Werner

PS. Many of the same requirements may apply to other parts of DotNetOpenAuth, such as the OpenID provider and OAuth components.

Andrew Arnott

May 15, 2012, 10:27:52 AM
to dotnet...@googlegroups.com
Inline...
On Sun, May 13, 2012 at 4:51 PM, Werner Strydom <blou...@gmail.com> wrote:
This message aims to get some consensus on error handling when an OpenID relying party tries to authenticate with an OpenID provider, and on the best means to modify DotNetOpenAuth to implement it. As for the latter, I'd appreciate some guidance.

Assume that an OpenID relying party (https://rp.example.com) is trying to authenticate with an OpenID provider (https://op.example.com). Research at this point has highlighted the following potential issues when the RP wants to authenticate using the OP. If you know of additional ones, please let me know.

  • The host of the OP is incorrect, or it doesn't exist
  • The host of the OP is correct and the connection is dropped (as IIS/WCF does when the request is too large)
  • The host of the OP exists and never responds within a given time period
  • The host of the OP exists and returns HTTP Status 30x
  • The host of the OP exists and returns HTTP Status 400
  • The host of the OP exists and returns HTTP Status 401
  • The host of the OP exists and returns HTTP Status 403
  • The host of the OP exists and returns HTTP Status 404
  • The host of the OP exists and returns HTTP Status 500
  • The host of the OP exists and returns HTTP Status 503
  • The host of the OP exists and returns HTTP Status 504
  • The host of the OP exists, returns HTTP Status 200 but an invalid XRDS document
  • The host of the OP exists, returns HTTP Status 200, but no XRDS document
  • The host of the OP exists, returns HTTP Status 200, but the HTML returned does not satisfy HTML discovery
We should not assume that any list we compile is comprehensive.  Nor can we take hard dependencies on how we interpret specific error codes.  For example, retrying on a specific error code that should mean 'retry later' may cause the user to just hang forever waiting for a server that just isn't online.  But I imagine you realize that. 

Some of these conditions may be temporary (such as HTTP status 503), others permanent (e.g. the host doesn't exist). In any case, an appropriate exception should be thrown by DotNetOpenAuth so that the OpenID relying party can handle it accordingly.

Here are some requirements for that exception:
  • The exception message should support localization
Already done. 
  • The OpenID relying party should be able to determine the underlying error without having to analyze the message.  
Agreed, but just so you're aware, errors can be complex. A single login attempt may result in multiple failures with different reasons. For example, a specific OpenID Identifier may yield several OPs that are authorized to log the user in. Before ultimately failing, DNOA will try to reach each and every Provider. Each one may fail for different reasons, and ultimately DNOA (currently) returns that the OpenID cannot be resolved. Or it returns just that subset of Providers that did respond, in which case no exception is thrown at all, although the user may log in with a different Provider than they usually do. However, this multiple-failure case is rare.
  • The exception should be as comprehensive as possible, as it may be logged in isolation and may be the only source to diagnose an issue.  This assumes that there are multiple OpenID relying parties and providers running on the same physical host. 
  • Messages should not contain abbreviations as this poses problems for support personnel. 
    I'll throw in there that exception messages are targeted at the end user -- not the programmer.  This is simply because most of the web sites out there simply print the exception message to the user so giving them a very technical reason is unhelpful and looks like this web site failed rather than the other one did.  Also, it's critical that the exception message not disclose any confidential information.  
     
    The exception may encapsulate:
    • the source of the request (https://rp.example.com/signin), to differentiate one OpenID relying party from another.
    • the destination of the request (https://op.example.com), to differentiate one OpenID provider from another.
    • a correlation identifier to differentiate one flow from another
    • a reason, to diagnose the issue
    • a corrective action, if possible
    Here are examples of messages that may be more appropriate:
    • When HTML discovery fails: "Failed to discover the OpenID provider endpoint using the HTML Discovery method at https://op.example.com/. The realm used trying to perform the discovery was https://rp.example.com".
    • When XRDS fails because the XRDS document is malformed: "Failed to discover the OpenID provider endpoint using the XRDS discovery method at https://op.example.com/. The OpenID provider returned an XRDS document, but the document was malformed. The realm used trying to perform the discovery was https://rp.example.com."
    • When the OP doesn't respond in a reasonable time: "Failed to discover the OpenID provider endpoint using XRDS and HTML discovery methods at https://op.example.com/. The OpenID provider did not respond within a reasonable time (00:00:10). The realm used trying to perform the discovery was https://rp.example.com."  
    These are examples of excellent log/trace messages -- but not good exception messages (IMO) for the aforementioned reason.

     

    --
    You received this message because you are subscribed to the Google Groups "DotNetOpenAuth" group.
    To view this discussion on the web visit https://groups.google.com/d/msg/dotnetopenid/-/p7Zwg_pmQ_oJ.
    To post to this group, send email to dotnet...@googlegroups.com.
    To unsubscribe from this group, send email to dotnetopenid...@googlegroups.com.
    For more options, visit this group at http://groups.google.com/group/dotnetopenid?hl=en.

    Richard Collette

    May 15, 2012, 11:14:14 AM
    to dotnet...@googlegroups.com
    Respectfully, I disagree that exceptions should be formatted for the user (unless I misunderstand your statement).  We know that good site design should never present an exception message to the user.  That being the case, exceptions should not be designed to accommodate bad programming practices.  Exceptions, when logged, should provide the developer/maintainer enough information to hopefully identify and resolve the issue.  Dumbing it down for end user presentation typically removes the information needed to resolve the issue.

    In your review of possible complex scenarios, you have provided information that might be helpful in an exception message.  WCF has started to improve their error messages to present "check this, and this, and this..." type information that I find very helpful when debugging.  Some ASP.NET exceptions tend to do this as well.

    Andrew Arnott

    May 15, 2012, 11:49:44 AM
    to dotnet...@googlegroups.com
    I think we simply disagree here (which is fine).  The reasons for my position on this point are:
    1. These exceptions can be thrown even when the RP has been perfectly written.  So the exceptions will be thrown when the developer is not actively debugging the app, and thus the exception message won't help the developer anyway unless the developer is actively logging those exceptions. But in that event, the far more useful logging is to log all ERROR messages from dotnetopenauth, which will include errors that never result in thrown exceptions, errors leading up to the exception, etc.  So developers already have the help they need.
    2. In spite of ideals, the fact that most web sites display exception messages suggests that there is value in generating exceptions with user-friendly error messages.
    I understand your points and I think they are valid.  However I feel that the above reasons are sufficient to keep the current practices for exception messages.

    That said, I am still in favor of making exceptions more programmatically actionable by the web application.  So creating new derived types for exceptions, and/or adding an ErrorCode property to the ProtocolException that allows the application to know whether this is a fatal error or one that may warrant a retry still sounds like a good idea.
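
The idea of an error code that tells the application whether a failure is fatal or may warrant a retry could look roughly like the sketch below. It is Python rather than C#, purely for brevity, and `ErrorCode`, `ProtocolError`, `RETRYABLE` and `next_step` are hypothetical names, not the actual `ProtocolException` API. Which codes count as retryable is an assumption the application would have to make for itself, and the retry is bounded so that a "retry later" response cannot leave the user waiting forever.

```python
from enum import Enum


class ErrorCode(Enum):
    """Hypothetical coarse categories; deliberately not one per HTTP status."""
    HOST_NOT_FOUND = "host_not_found"
    TIMEOUT = "timeout"
    SERVICE_UNAVAILABLE = "service_unavailable"
    INVALID_DISCOVERY_DOCUMENT = "invalid_discovery_document"


# Assumption: these categories are possibly temporary; everything else is fatal.
RETRYABLE = frozenset({ErrorCode.TIMEOUT, ErrorCode.SERVICE_UNAVAILABLE})


class ProtocolError(Exception):
    """Stand-in for a protocol exception that exposes an error code."""

    def __init__(self, message, error_code):
        super().__init__(message)
        self.error_code = error_code

    @property
    def may_warrant_retry(self):
        return self.error_code in RETRYABLE


def next_step(error, attempts_left):
    """Decide what the web application does, without reading the message text."""
    if error.may_warrant_retry and attempts_left > 0:
        return "retry"
    return "fail"


timeout = ProtocolError("The provider did not respond in time.", ErrorCode.TIMEOUT)
missing = ProtocolError("The provider host does not exist.", ErrorCode.HOST_NOT_FOUND)
print(next_step(timeout, attempts_left=2))
print(next_step(timeout, attempts_left=0))
print(next_step(missing, attempts_left=2))
```

Derived exception types would serve the same purpose; the enum simply keeps the set of categories small and visible in one place.
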

    --
    Andrew Arnott
    "I [may] not agree with what you have to say, but I'll defend to the death your right to say it." - S. G. Tallentyre



    Werner Strydom

    May 15, 2012, 1:35:58 PM
    to dotnet...@googlegroups.com
    Howzit,

    I respectfully disagree that exceptions are targeted at the end user of my application. They are aimed at me. It is then my responsibility to decide whether that message is logged, displayed to the user of the application, or just ignored. DotNetOpenAuth may never have the necessary information to cater for the end user of my application.

    At least one of the inner exceptions should contain a detailed plain-English message that describes the problem, with some corrective action. We can use AggregateException (in .NET Framework 4.0) to encapsulate multiple exceptions in one. For older frameworks, we can implement an equivalent.
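
A hand-rolled equivalent of an aggregate exception, as a rough sketch in Python rather than .NET. The names `AggregateDiscoveryError`, `discover_all` and `demo_fetch` are hypothetical, and the control flow (try every provider, return the ones that responded, raise only when none did) follows the behaviour described earlier in this thread rather than the actual DotNetOpenAuth code.

```python
class AggregateDiscoveryError(Exception):
    """Hypothetical container for every failure seen during one discovery attempt."""

    def __init__(self, message, failures):
        super().__init__(message)
        # Each entry pairs the endpoint that was tried with the error it produced.
        self.failures = tuple(failures)


def discover_all(endpoints, fetch):
    """Try every endpoint; return those that responded, or raise one aggregate error."""
    found, failures = [], []
    for endpoint in endpoints:
        try:
            found.append(fetch(endpoint))
        except OSError as error:  # covers TimeoutError and ConnectionError in Python
            failures.append((endpoint, error))
    if not found:
        raise AggregateDiscoveryError("No OpenID endpoint found.", failures)
    return found


def demo_fetch(endpoint):
    """Fake network call used only for this demonstration."""
    if "slow" in endpoint:
        raise TimeoutError(f"No response from {endpoint} within 00:00:10")
    if "down" in endpoint:
        raise ConnectionRefusedError(f"Connection to {endpoint} was refused")
    return f"endpoint document from {endpoint}"


try:
    discover_all(["https://slow.example.com/", "https://down.example.com/"], demo_fetch)
except AggregateDiscoveryError as aggregate:
    for endpoint, error in aggregate.failures:
        print(f"{endpoint}: {type(error).__name__}: {error}")
```

Keeping the endpoint next to each inner error means a log entry for the aggregate still shows which provider failed for which reason.
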

    The scenarios exist for testing purposes. They are meant to ensure that, under all those circumstances, the developer can take corrective action without the framework (DotNetOpenAuth) preventing the RP from doing so. It may never be a definitive list, but keeping a list ensures that we know the requirement is met.

    We can put whatever we want in the trace logs, but it is no substitute for proper error handling. As per the previous discussion, I differentiate between tracing (which is aimed at developers and switched off in production) and logging (which has very specific requirements for operations and support personnel). DotNetOpenAuth does tracing. If errors at runtime are reported properly, there is no need to enable tracing. Exception messages are often a source for logging, and only the application will know whether a specific event is an error, a warning or information, based on the context of that event and the requirements for how to handle it.

    It's generally not a good idea to display any exception message to the end user. The reason is that the application may be localized for Chinese while the underlying framework only supports English. And how do we know whether the message is even appropriate for that end user? Asking around, pretty much everyone, other than some developers and my product manager, is clueless about what "No OpenID endpoint found" really means. The first questions are "What is OpenID?" and "What do I need to do to fix it?".

    We might have been better off showing a whale in the clouds.

    Werner

    Richard Collette

    May 15, 2012, 2:18:25 PM
    to dotnet...@googlegroups.com
    Clearly you are passionate about the topic :)

    But I agree. Global error handling is so easy that I am not sure why a library should try to defend improper site design; one could argue that doing so only encourages a bad practice. I could be wrong, but the number of touch points to DotNetOpenAuth within an application is probably going to be minimal compared to all the business logic. Catching errors and displaying something user friendly like "Could not authenticate your id with Yahoo." or "Could not connect to the xyz server" should not be a large effort.

    Richard Collette

    May 15, 2012, 3:55:44 PM
    to dotnet...@googlegroups.com

    Here's an example of a message that I wouldn't even consider user friendly.  What would be nice is if the exception included additional contextual information (url, id, etc.)

    Werner Strydom

    May 15, 2012, 4:11:48 PM
    to dotnet...@googlegroups.com
    Please accept my apologies if the "passion" seemed inconsiderate, disrespectful or argumentative. How to handle errors is always contentious, whether you develop an application in COBOL on a mainframe or on the latest and greatest .NET framework. I have spent countless evenings diagnosing issues when I'd rather be at the pub watching the game.

    I think I understand where Andrew is coming from. Sensitive information in the exception may result in the system being compromised. We seldom consider the security implications of the exceptions we catch and the appropriate manner to handle them. Developers are often lazy. Working on error handling has never been "cool". That is perhaps why you often see exception messages being displayed (along with stack traces) to end users. And given that it's a DotNetOpenAuth message being displayed to the end user, it makes sense to consider it a requirement of the framework to display a message in a way that the end user may understand. Developers (including myself) can do much better than that.

    I'm just disputing whether that requirement is appropriate for a framework. 

    Richard, the challenge is really to understand the error in the context of the whole system, whether it is recoverable and, if not, who should be the target audience to address it. Many of the errors we deal with today originate from frameworks we did not write or from external systems (databases, other web services). We don't always have access to the logs in production and thus rely on the error messages returned to diagnose issues. So, in my experience, handling errors appropriately at a global level is rather challenging and complex.

    Werner

    Andrew Arnott

    May 28, 2012, 12:28:33 PM
    to dotnet...@googlegroups.com
    Richard,

    I replied to this email the day you sent it, but I can see that my email was lost somewhere...

    This error message comes from ASP.NET itself (I think when web.config has customErrors mode="Off"). For some reason, ASP.NET shows the innermost exception rather than the outermost exception. DNOA saves the more user-friendly error messages for the outermost exceptions and includes the technical details for developers in the InnerExceptions. These will often include more of the details you were asking for as well. I agree this error page is nasty, and web sites should go to lengths to avoid it, either by catching exceptions and dealing with them better or by using custom error pages.
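
The layering described here, with a user-friendly outer exception wrapping a technical inner one, can be sketched as follows. The example uses Python's exception chaining (`raise ... from ...`, read back through `__cause__`) as a stand-in for .NET's `InnerException`; `SignInError`, `fetch_discovery_document`, `sign_in` and `describe` are hypothetical names, and the user-facing wording is only an illustration.

```python
class SignInError(Exception):
    """Hypothetical outer exception whose message is safe to show to a user."""


def fetch_discovery_document(url):
    # Stand-in for a real HTTP request; it always fails for this demonstration.
    raise TimeoutError(f"No response from {url} within 00:00:10")


def sign_in(identifier):
    try:
        return fetch_discovery_document(identifier)
    except OSError as error:
        # The outer message stays non-technical; the cause keeps the details.
        raise SignInError("We could not sign you in with that provider.") from error


def describe(error):
    """Walk from the outermost exception to the innermost, as a logger might."""
    chain = []
    while error is not None:
        chain.append(f"{type(error).__name__}: {error}")
        error = error.__cause__
    return chain


try:
    sign_in("https://op.example.com/")
except SignInError as outer:
    print(outer)  # what a page might show to the user
    for line in describe(outer):
        print(line)  # what a log entry might record
```

A site that logs the whole chain and displays only the outermost message gets both audiences served from a single exception.
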


    Andrew Arnott

    May 28, 2012, 12:32:28 PM
    to dotnet...@googlegroups.com
    I'm in favor of adding ProtocolException-derived exception types and throwing those, and/or adding an ErrorCode property to these exceptions where the hosting web site might take different actions based on the values. Werner's earlier list (which included a variety of 40x status codes) seems a bit too precise to me: it's hard for me to imagine a web site taking a different action for each of those scenarios, even if you could depend on servers returning the correct error codes.


    Richard Collette

    May 28, 2012, 9:49:04 PM
    to dotnet...@googlegroups.com
    I'm on board with the idea that the technical details be wrapped by a more user-oriented message. This way the logging should contain the necessary details. And I agree, at best a category of error is enough. The level of detail only needs to support the decision "is this recoverable, and how so?". Five different errors that either are not recoverable or would be handled in the same manner are not justified.

    Andrew Arnott

    May 29, 2012, 12:56:31 AM
    to dotnet...@googlegroups.com
    Great.  If you (Richard) or Werner are interested in taking a stab at preparing a change that accommodates your requirements, it sounds like goodness to include in the project.  If you do, I'd like to look it over when you've put in enough work to show off what you have in mind, but before you apply it "everywhere" (or as broadly as you care to) so we can adjust the pattern slightly if needed.



    Werner Strydom

    May 29, 2012, 1:08:39 AM
    to dotnet...@googlegroups.com
    Howzit,

    I'll look into that; it may just take a while before I can submit the changes due to other priorities.

    Werner