New URL underscore validation issue

Nick Ramsay

Jan 29, 2010, 10:31:43 PM
to inspekt
Hi Ed,

Following on from this thread (which is now closed to new replies):
http://groups.google.com/group/inspekt/browse_thread/thread/fdb284bc36577778

I have another URL that won't validate, most likely because of the
underscore in the subdomain. Try this one:
http://blue_moon.typepad.com/blue_lotus/2010/01/ogasawara.html

I'm running JapanSoc.com live with Hotaru CMS now, and since it's a
social bookmarking site, parsing URLs is one of Inspekt's main duties.

Many thanks,
Nick.

Ed Finkler

Jan 30, 2010, 10:52:22 AM
to ins...@googlegroups.com
I didn't even think it was valid to use underscores in domain names.
From what I can tell, the specs for domain names do *not* allow
underscores, even at the subdomain level.

I'm reluctant to alter the method to allow underscores. Even if they
resolve (this URL did for me), I think sticking to the spec is safer.
I worry about the possibility of introducing, say, additional phishing
possibilities, or other social engineering attacks by not following
the spec.

I'm willing to listen to arguments about this, though.

See:
<http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names>
<http://tools.ietf.org/html/rfc952>
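
For illustration only, here is a minimal sketch of the letter-digit-hyphen rule those references describe. It is written in Python rather than PHP and is not Inspekt's actual code; the function names are hypothetical. It assumes the common relaxation that a label may start with a digit, and it treats a trailing dot as invalid.

```python
import re
from urllib.parse import urlsplit

# One label: letters, digits, hyphens; 1-63 characters;
# must not start or end with a hyphen. Underscores are not allowed.
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_strict_hostname(host: str) -> bool:
    """Return True if every dot-separated label obeys the rule above."""
    if not host or len(host) > 253:
        return False
    return all(LABEL.fullmatch(label) for label in host.split("."))


def url_has_strict_host(url: str) -> bool:
    """Extract the host part of a URL and apply the strict check."""
    host = urlsplit(url).hostname
    return host is not None and is_strict_hostname(host)


print(url_has_strict_host("http://www.typepad.com/"))  # True
print(url_has_strict_host(
    "http://blue_moon.typepad.com/blue_lotus/2010/01/ogasawara.html"
))  # False: "blue_moon" contains an underscore
```

Under this rule the URL from the original report is rejected purely because of the underscore in its first label, even though the name resolves in practice.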

Nick Ramsay

Jan 30, 2010, 11:10:52 AM
to inspekt
I understand, Ed, no worries. Would you happen to know how to adapt
the line so I can patch it in myself, without you needing to change
the original script? Thanks again, Nick.

Ed Finkler

Jan 30, 2010, 11:16:58 AM
to ins...@googlegroups.com
You could create your own accessor method by extending
AccessorAbstract. That would be better than hacking core files. See
<http://github.com/funkatron/inspekt/blob/master/Examples/extending.php>

--
Ed Finkler
http://funkatron.com
Twitter:@funkatron
AIM: funka7ron
ICQ: 3922133
XMPP:funk...@gmail.com

Oliver Humpage

Jan 30, 2010, 12:44:22 PM
to inspekt
I believe that *host* names can't have underscores. However, *domain*
names can. Domain names include things like Domain Keys, or standard
Active Directory domains like _tcp.domain.tld.

So if you're parsing URLs, then underscores are invalid, because URLs
should point to hosts (or at least, I can't think of any
counterexamples). But if you're doing general domain name handling,
you should allow underscores.

Oliver.
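
To make the host-versus-domain distinction concrete, here is a rough sketch in Python (not Inspekt code). The two patterns and all function names are assumptions made for illustration: the "domain" rule simply adds the underscore to the characters permitted in a label, which is one possible relaxation rather than a statement of what any spec requires.

```python
import re

# Host-name label: letters, digits, hyphens; no leading/trailing hyphen.
HOST_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

# Assumed relaxed rule for general domain names: also permit underscores.
DOMAIN_LABEL = re.compile(r"(?!-)[A-Za-z0-9_-]{1,63}(?<!-)")


def labels_match(name: str, pattern: re.Pattern) -> bool:
    """Return True if every dot-separated label fully matches the pattern."""
    return bool(name) and all(pattern.fullmatch(p) for p in name.split("."))


def is_host_name(name: str) -> bool:
    """Strict check, suitable for the host part of a URL."""
    return labels_match(name, HOST_LABEL)


def is_domain_name(name: str) -> bool:
    """Relaxed check, for names such as service or key records."""
    return labels_match(name, DOMAIN_LABEL)


print(is_host_name("_tcp.domain.tld"))    # False
print(is_domain_name("_tcp.domain.tld"))  # True
print(is_host_name("domain.tld"))         # True
```

A URL validator would call the strict check, while code handling general DNS names would call the relaxed one. Keeping them as separate functions leaves the stricter behaviour as the default for URLs.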
