[Django] #25452: Email validation for domain `gmail.-com` is considered valid

Django

Sep 23, 2015, 6:36:53 AM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
----------------------------+--------------------
Reporter: phalt | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.8
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 1 | UI/UX: 0
----------------------------+--------------------
When entering an email like "test@gmail.-com" the email validator returns
True.

In particular, `validate_domain_part` allows domains with a hyphen
character in the TLD:

{{{
>>> from django.core.validators import validate_email
>>> validate_email.validate_domain_part('gmail.-com')
True
}}}

Nearly all other special characters return correctly:

{{{
>>> from django.core.validators import validate_email
>>> validate_email.validate_domain_part('gmail._com')
False
}}}

Unless my knowledge of valid TLDs is wrong, I don't think this is correct
:(

--
Ticket URL: <https://code.djangoproject.com/ticket/25452>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Sep 23, 2015, 9:22:03 AM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------
Changes (by collinanderson):

* cc: cmawebsite@… (added)
* needs_better_patch: => 0
* needs_tests: => 0
* easy: 1 => 0
* needs_docs: => 0
* stage: Unreviewed => Accepted


Comment:

Confirmed on 1.8 and 1.9. Chrome's email validation rejects this, so I
assume this is unintentional.

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:1>

Django

Sep 23, 2015, 11:40:52 AM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by phalt):

We've been investigating this more, and it appears that hyphens can be in
TLDs, just not at the start or end of a label:

> Domain names may be formed from the set of alphanumeric ASCII characters
> (a-z, A-Z, 0-9), but characters are case-insensitive. In addition the
> hyphen is permitted if it is surrounded by characters, digits or hyphens,
> although it is not to start or end a label.
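
The rule quoted above can be sketched in a few lines of Python. This is only an illustration, not Django's code: the names `LABEL_RE` and `is_valid_hostname` are made up for this example, and the 63-character label limit is an assumption taken from the DNS RFCs rather than from the quote.

```python
import re

# One DNS label: letters, digits and hyphens, 1-63 characters long,
# with no hyphen as the first or last character.
# (Hypothetical helper pattern, not taken from Django.)
LABEL_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?", re.IGNORECASE)


def is_valid_hostname(domain):
    """Return True if every dot-separated label follows the hyphen rule."""
    labels = domain.split(".")
    return all(LABEL_RE.fullmatch(label) is not None for label in labels)


print(is_valid_hostname("gmail.com"))      # True
print(is_valid_hostname("gmail.-com"))     # False: last label starts with "-"
print(is_valid_hostname("my-host.co-op"))  # True: hyphens inside labels are fine
```

Checking label by label keeps the rule readable, at the cost of not covering anything else an email validator has to care about (local part, IP literals, Unicode).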

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:2>

Django

Sep 23, 2015, 1:18:51 PM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by timgraham):

Please check `URLValidator` to see if it handles this (if so, maybe you
could borrow from it) or if it requires a similar fix.

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:3>

Django

Sep 23, 2015, 4:46:15 PM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------
Changes (by bak1an):

* status: new => assigned
* owner: nobody => bak1an


Comment:

According to RFC 1035 (https://tools.ietf.org/html/rfc1035), domain labels
can contain hyphens, but not as their first character.

[https://github.com/django/django/blob/71ebcb85b931f43865df5b322b2cf06d3da23f69/django/core/validators.py#L160
EmailValidator.domain_regex] checks this for all labels except the last one
(the TLD), which looks like a bug to me rather than something that was done
intentionally.

[https://github.com/django/django/blob/71ebcb85b931f43865df5b322b2cf06d3da23f69/django/core/validators.py#L89
URLValidator] uses a more complex set of domain validation regexes
(including Unicode support, etc.).

I will double-check that borrowing those checks for {{{EmailValidator}}}
won't violate any standards, and come back with a patch if it's OK.
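
To make the difference concrete, here is a simplified sketch of the behaviour described above. The two patterns are hypothetical and written for this example only; they are not copied from `EmailValidator.domain_regex`. The "lax" one checks the leading hyphen on every label except the last, and the "strict" one adds a negative lookahead so the last label is checked too.

```python
import re

# A label that may not start or end with a hyphen (used for all labels
# before the final one in both patterns).
LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"

# Hypothetical "lax" pattern: the final label only rejects a trailing hyphen.
LAX_DOMAIN_RE = re.compile(
    r"(?:" + LABEL + r"\.)+" + r"[a-z0-9-]{2,}(?<!-)", re.IGNORECASE
)

# Hypothetical "strict" pattern: (?!-) also rejects a leading hyphen.
STRICT_DOMAIN_RE = re.compile(
    r"(?:" + LABEL + r"\.)+" + r"(?!-)[a-z0-9-]{2,}(?<!-)", re.IGNORECASE
)

for domain in ("gmail.com", "gmail.-com", "gmail._com"):
    lax = LAX_DOMAIN_RE.fullmatch(domain) is not None
    strict = STRICT_DOMAIN_RE.fullmatch(domain) is not None
    print(domain, lax, strict)
# gmail.com True True
# gmail.-com True False
# gmail._com False False
```

The change between the two patterns is a single lookahead, which matches the observation that the check exists for every label but the TLD.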

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:4>

Django

Oct 29, 2015, 8:32:28 AM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------
Changes (by DheerendraRathor):

* cc: dheeru.rathor14@… (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:5>

Django

Nov 8, 2015, 4:11:52 PM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------
Changes (by bak1an):

* has_patch: 0 => 1


Comment:

[https://github.com/django/django/pull/5612 The pull request]

Unfortunately, I have not found a good way to merge URLValidator and
EmailValidator, since there are tons of small differences.

So I decided to fix EmailValidator instead. I tried to be as precise as
possible (the actual regex changes are just a few characters long).

While reading various RFCs and articles, I found some other easily
fixable issues (like allowing quoted '@', space or backslash for the dot-atom
local part, allowing spaces inside a quoted local part, etc.); those are
included in the above PR as well.

Proper RFC-based email validation is a
[http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx/ surprisingly hard task]
and definitely no one wants to have a
[http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html few pages long regex]
in Django.

So the updated validator is still not fully RFC-compliant, but a few issues
are fixed now (or covered with test cases).

The commit messages include a detailed description of all the changes that
were made.

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:6>

Django

Mar 11, 2016, 10:58:57 AM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by nedbatchelder):

Can I respectfully suggest that continuing to tweak this complex regex to
get asymptotically closer to perfection is not worth it? Especially to
fix false positives. What real-world problem is happening because
"gmail.-com" is accepted? "gmail.ccomm" is also accepted, but is just as
useless as an email address.

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:7>

Django

Mar 11, 2016, 11:11:11 AM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by timgraham):

I am open to that if you can get consensus on the DevelopersMailingList on
a set of limitations that we can document so that we have something to
point to when we get requests for enhancements. I imagine this policy
would also include `URLValidator`.

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:8>

Django

Mar 11, 2016, 11:18:29 AM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by collinanderson):

I think we should try to match the standard HTML <input type="email">
validation. I'd imagine that most use cases would want to match that. We
could use the regex verbatim from the standard itself:

https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)

If people want to allow things outside of that, they could use a custom
regex.
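
For reference, a Python sketch of what that could look like. The pattern below is my transcription of the expression given in the HTML standard's definition of a valid e-mail address, so please check it against the spec before relying on it; the names `HTML5_EMAIL_RE` and `looks_like_html5_email` are made up for this example.

```python
import re

# Transcribed from the HTML standard's "valid e-mail address" pattern
# (an assumption to verify against the spec). The ^ and $ anchors are
# dropped because fullmatch() is used below.
HTML5_EMAIL_RE = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def looks_like_html5_email(value):
    """Return True if value matches the HTML-style email pattern."""
    return HTML5_EMAIL_RE.fullmatch(value) is not None


print(looks_like_html5_email("test@gmail.com"))    # True
print(looks_like_html5_email("test@gmail.-com"))   # False
print(looks_like_html5_email("test@gmail.ccomm"))  # True
```

Note that this pattern rejects "gmail.-com" but still accepts "gmail.ccomm", since it only checks syntax.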

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:9>

Django

Mar 11, 2016, 12:14:58 PM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by collinanderson):

Though it gets more complicated when considering Unicode. Unicode needs to
be normalized to ASCII before running it through the official regex.

Here's how Chrome does it:
https://code.google.com/p/chromium/codesearch#chromium/src/third_party/WebKit/Source/core/html/forms/EmailInputType.cpp
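
A rough Python analogue of that normalization step, as a sketch only: it uses Python's built-in `idna` codec (which implements the older IDNA 2003 rules) to convert the domain part to its ASCII form, and it is not a port of the Chrome code linked above. The function name `to_ascii_address` is made up for this example.

```python
def to_ascii_address(address):
    """Convert the domain part of an address to its ASCII (punycode) form.

    Hypothetical helper: the local part is left untouched, and the address
    is returned unchanged if there is no "@" or the domain cannot be encoded.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address
    try:
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return address
    return local + "@" + ascii_domain


print(to_ascii_address("user@bücher.example"))  # user@xn--bcher-kva.example
print(to_ascii_address("user@example.com"))     # user@example.com
```

The result can then be passed to an ASCII-only regex; addresses that fail to encode are returned as-is so that the regex can reject them.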

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:10>

Django

Mar 13, 2016, 2:01:38 PM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by bak1an):

How about resolving this issue with the
[https://github.com/django/django/compare/master...bak1an:ticket_25452_minimal smallest possible change]
and moving the discussion of future validation regex improvements into a
separate ticket? Would this be legit?

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:11>

Django

Mar 13, 2016, 4:41:09 PM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by nedbatchelder):

@bak1an: Can you explain why it's important to reject "gmail.-com"? Why
add *any* more complexity just to reject more bogus email addresses?

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:12>

Django

Mar 14, 2016, 2:10:20 PM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------
Changes (by timgraham):

* needs_better_patch: 0 => 1


Comment:

I started a
[https://groups.google.com/d/topic/django-developers/ASBJ0ge2KYo/discussion thread on django-developers]
to find a way forward.

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:13>

Django

Mar 17, 2016, 12:15:39 PM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: closed
Component: Forms | Version: 1.8
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------
Changes (by timgraham):

* status: assigned => closed
* resolution: => wontfix


Comment:

The consensus on the mailing list seems to be to simplify the validation,
not make it more comprehensive.

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:14>

Django

Mar 17, 2016, 3:51:01 PM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: closed
Component: Forms | Version: 1.8
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by claudep):

Tim, is there a ticket about the simplification, or should we create one?

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:15>

Django

Mar 30, 2016, 9:01:14 AM
to django-...@googlegroups.com
#25452: Email validation for domain `gmail.-com` is considered valid
------------------------+------------------------------------
Reporter: phalt | Owner: bak1an
Type: Bug | Status: closed
Component: Forms | Version: 1.8
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
------------------------+------------------------------------

Comment (by timgraham):

Ticket for simplifying the validation is #26423.

--
Ticket URL: <https://code.djangoproject.com/ticket/25452#comment:16>
