
Should we change how we decide a node is a link to not take into account the URL's validity?


Boris Zbarsky

Jul 13, 2011, 11:17:34 AM
Right now, a <html:a> is considered a link if it has an "href" attribute
and we can create an nsIURI from that attribute. That means there are
some edge cases in which it's not considered a link even though the href
attribute is nonempty (e.g. <a href="http://with space/">).

In some ways this behavior is desirable: with this behavior, if the <a>
is a link then the user can actually click on it and chances are
something will happen (because we have a URI object and all).

In some ways this behavior is undesirable: computing whether something
is a link involves expensive nsIURI creation (as opposed to a HasAttr
check), and I don't think this behavior is compatible with other
browsers, though that's not a big deal. We mitigate the expensive bit
by caching the nsIURI, but this means that checking whether something is
a link is not const-safe, which is somewhat unfortunate. Furthermore,
we end up having to jump through hoops to avoid doing the first such
check for as long as we can, because otherwise we get a pageload and
benchmark time hit.

With all that in mind, would people object to having "is a link" mean
"has a href attribute" in this situation, and having the behavior for
clicks on a link where we can't create an nsIURI just silently do
nothing (though probably still preventDefault the event, etc)?
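[A minimal JavaScript sketch of the two definitions under discussion. This is a model only, not the Gecko C++; it uses the WHATWG URL parser as a stand-in for nsIURI creation, with a made-up base URL for resolving relative hrefs.]

```javascript
// Current behavior (modeled): an element is a link only if its href
// attribute exists AND can be parsed into a URI object.
function isLinkCurrent(href) {
  if (href === null) return false;
  try {
    // Stand-in for nsIURI creation; base is hypothetical.
    new URL(href, "http://example.com/");
    return true;
  } catch (e) {
    return false; // e.g. href="http://with space/" fails to parse
  }
}

// Proposed behavior: any href attribute at all makes it a link;
// a click on an unparseable href would silently do nothing.
function isLinkProposed(href) {
  return href !== null;
}

console.log(isLinkCurrent("http://with space/"));  // false today
console.log(isLinkProposed("http://with space/")); // true under the proposal
```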

-Boris

Jonas Sicking

Jul 13, 2011, 11:32:11 AM
to Boris Zbarsky, dev-pl...@lists.mozilla.org

This sounds like a good idea to me. Simpler == better (most of the time).

/ Jonas

Neil

Jul 13, 2011, 4:32:37 PM
Boris Zbarsky wrote:

> Furthermore, we end up having to jump through hoops to avoid doing the
> first such check for as long as we can, because otherwise we get a
> pageload and benchmark time hit.

I take it the history check doesn't involve creating nsIURI objects then?

--
Warning: May contain traces of nuts.

Boris Zbarsky

Jul 13, 2011, 10:56:12 PM
On 7/13/11 4:32 PM, Neil wrote:
> Boris Zbarsky wrote:
>
>> Furthermore, we end up having to jump through hoops to avoid doing the
>> first such check for as long as we can, because otherwise we get a
>> pageload and benchmark time hit.
>
> I take it the history check doesn't involve creating nsIURI objects then?

It does, actually.

But pages do stuff like add/remove anchors to the DOM a bunch before any
style is resolved on them (e.g. using += a bunch of times on innerHTML),
and we don't need to do a history check for anchors that are just
transiently in the DOM.

-Boris

Justin Dolske

Jul 13, 2011, 11:25:21 PM
On 7/13/11 8:17 AM, Boris Zbarsky wrote:

> With all that in mind, would people object to having "is a link" mean
> "has a href attribute" in this situation, and having the behavior for
> clicks on a link where we can't create an nsIURI just silently do
> nothing (though probably still preventDefault the event, etc)?

Seems reasonable to me. At least, I can't think of a reasonable argument
for a site wanting to have <a href="xxx"> but not have it be treated as
a link. (Similarly, we don't check to make sure the link is reachable,
not a 404, etc.)

Justin

Jeff Hammel

Jul 14, 2011, 1:43:46 AM
to dev-pl...@lists.mozilla.org
...although, to mention obliquely, if there were a way of fetching e.g.
link titles, HTTP codes, last-changed dates, graphs of user navigation,
etc., in the background, there would be a lot we could do with this...

Neil

Jul 14, 2011, 5:31:16 AM
Boris Zbarsky wrote:

So why do we need to do a link check for anchors that are just
transiently in the DOM?

Boris Zbarsky

Jul 14, 2011, 10:39:43 AM
On 7/14/11 5:31 AM, Neil wrote:
> So why do we need to do a link check for anchors that are just
> transiently in the DOM?

We don't, per se. But since link state is stored as part of our general
state infrastructure (something we may want to change, but it's a good
bit of work), we have to jump through hoops to avoid computing it.

And sometimes we _do_ have to check for being a link on elements that
are only transiently in the DOM, or not in the DOM at all. Think
querySelector(":link,:visited"). Right now we just get that wrong in
disconnected subtrees as a result of all the attempts to figure out
whether we're a link as lazily as possible.

-Boris

Neil

Jul 14, 2011, 11:49:19 AM
Boris Zbarsky wrote:

> On 7/14/11 5:31 AM, Neil wrote:
>
>> So why do we need to do a link check for anchors that are just
>> transiently in the DOM?
>
> We don't, per se. But since link state is stored as part of our
> general state infrastructure (something we may want to change, but
> it's a good bit of work), we have to jump through hoops to avoid
> computing it.

Ah, so you're saying "we don't have a good story for computing
linkfulness lazily, so we'd like to make linkfulness easier to compute".
I guess that makes sense, although since so far I only know two reasons
to compute linkfulness, I'm still not sure why it's so hard to compute
lazily. Maybe I should read the relevant code rather than getting you to
drip-feed it to me second-hand...

Boris Zbarsky

Jul 14, 2011, 12:02:41 PM
On 7/14/11 11:49 AM, Neil wrote:
> Ah, so you're saying "we don't have a good story for computing
> linkfulness lazily, so we'd like to make linkfulness easier to compute".
> I guess that makes sense, although since so far I only know two reasons
> to compute linkfulness, I'm still not sure why it's so hard to compute
> lazily.

Right now we compute it lazily in one spot: during selector matching
(ignoring for the moment all the places that compute it by accident).

This makes selector matching non-const.

We would like to be able to do selector matching in parallel.

Therefore it would be good to have it not mutate the state of the DOM.

This means either precomputing linkfulness before starting selector
matching (at all possible entrypoints into selector matching; that's the
approach being taken for now) or making it simpler to compute
linkfulness so we don't have to cache whether something is a link.
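[A toy model of the lazy caching described above. This is hypothetical JavaScript, not the actual Gecko code: the first "is this a link?" query parses and caches the result, so a logically read-only check mutates element state. That mutation is what gets in the way of const, parallel selector matching.]

```javascript
// Hypothetical model of lazily-cached link state (not Gecko code).
class AnchorModel {
  constructor(href) {
    this.href = href;              // attribute value, or null
    this.cachedIsLink = undefined; // computed on first query
  }
  // A logically read-only question, but the first call writes to the
  // element: this is what makes selector matching non-const.
  isLink() {
    if (this.cachedIsLink === undefined) {
      try {
        this.cachedIsLink = this.href !== null && Boolean(new URL(this.href));
      } catch (e) {
        this.cachedIsLink = false; // unparseable href: not a link today
      }
    }
    return this.cachedIsLink;
  }
}
```

Precomputing link state before selector matching starts, or making the check cheap enough (a HasAttr test) that no cache is needed, both remove this mutation.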

Does that make more sense?

-Boris

Neil

Jul 14, 2011, 12:41:24 PM
Boris Zbarsky wrote:

> Right now we compute it lazily in one spot: during selector matching
> (ignoring for the moment all the places that compute it by accident).
>
> This makes selector matching non-const.
>
> We would like to be able to do selector matching in parallel.
>
> Therefore it would be good to have it not mutate the state of the DOM.
>
> This means either precomputing linkfulness before starting selector
> matching (at all possible entrypoints into selector matching; that's
> the approach being taken for now) or making it simpler to compute
> linkfulness so we don't have to cache whether something is a link.
>
> Does that make more sense?

I think so; lazy linkfulness doesn't scale to parallel threads because
of the locking involved. Unless somehow you can arrange for all linkful
selectors to be matched on the same thread ;-)

Boris Zbarsky

Oct 2, 2011, 4:52:31 PM
On 7/13/11 11:17 AM, Boris Zbarsky wrote:
> With all that in mind, would people object to having "is a link" mean
> "has a href attribute" in this situation, and having the behavior for
> clicks on a link where we can't create an nsIURI just silently do
> nothing (though probably still preventDefault the event, etc)?

I filed https://bugzilla.mozilla.org/show_bug.cgi?id=691195 on this.

-Boris