On Wed, Dec 11, 2013 at 7:16 AM, Tom Feist <sha...@gmail.com> wrote:
>
> On 10 Dec 2013, at 22:21, George Nachman <geo...@google.com> wrote:
>
>> The problem is that too many things are potentially URLs, and that would make the cmd-click behavior unpredictable. The regex you link to fails to catch a class of URLs that matter (at least to me and my colleagues), of the form hostname/argument, which is commonly used in the intranets of at least a couple big companies.
>>
>> Stripping trailing punctuation should be safe to do. What is the set of trailing characters that should be removed? I'm thinking
>>
>> .!)
>>
>
> Note that stripping ')' can be problematic for things like Wikipedia links, where you commonly encounter things like
> 'https://en.wikipedia.org/wiki/Culture_(disambiguation)' (although the parentheses should correctly be encoded as
> %28/%29, in practice unencoded ones have been a common irritation in things like markdown parsers).
In the same vein as what you just typed, I guess quotes should be
ignored (Gmail ignored the "'", but not the ")").
>
> I'm not sure if there's a good solution that isn't massively overkill like trying to match balanced delimiters or
> attempting to special-case (ugh) the obvious suspects.
>
> One UX workaround might be to have the terminal highlight the match-under-cursor when cmd is held, so the user can
> confirm the right bits are being {in,ex}cluded, and have the option to drag-select/refine otherwise.
That's too complicated. Just open what you think the URL is. If it's
wrong, you'll get a 404, and it will be clear (well, hopefully) that
the match included too much or too little.
>
> I remember that auto-url highlighting is an old and rather tricky feature request, but iirc that was mostly performance
> based, which requiring the keydown before scanning might solve.
>
> Priority/Precision based list of regex rules similar to the smart selection might also help, or at least punt the problem
> off to the user, if they often want to operate on weirdly-formatted urls :)
Frankly, I would try to find some well-used library that has already
solved this, to at least borrow its regular expressions. Presumably
some common markdown renderers do the job well. Projects like that
tend to get bug reports when they match too much or too little, and
the problem gets fixed.
Aaron Meurer
>
> --Tom
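
For illustration only, here is a minimal Python sketch of the trailing-punctuation idea discussed above. It is not what iTerm2 does; the function name trim_url and the exact character set are assumptions. It strips the proposed '.', '!' and ')' plus quotes, but keeps a ')' when it closes a '(' that appears earlier in the URL, which covers the Wikipedia case without full delimiter matching.

```python
def trim_url(candidate: str) -> str:
    """Strip trailing punctuation from a URL candidate (hypothetical helper).

    A trailing ')' is kept when the parentheses in the candidate are
    balanced, so Wikipedia-style links survive intact.
    """
    always_strip = ".!'\""  # assumed set: the proposed '.' and '!' plus quotes
    while candidate:
        last = candidate[-1]
        if last in always_strip:
            candidate = candidate[:-1]
        elif last == ")" and candidate.count("(") < candidate.count(")"):
            # More closers than openers: this ')' belongs to the
            # surrounding text, not to the URL.
            candidate = candidate[:-1]
        else:
            break
    return candidate


if __name__ == "__main__":
    print(trim_url("https://en.wikipedia.org/wiki/Culture_(disambiguation)'"))
    print(trim_url("hostname/argument)."))
```

Counting parentheses is cheaper than true balanced matching and can still be fooled by URLs that legitimately end in an unmatched ')', so it is a heuristic rather than a complete answer.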