In message <l09qngtdcloijqebh...@4ax.com>, at 12:43:55 on Sat, 30 Oct 2021,
Mark Goodge <use...@listmail.good-stuff.co.uk> remarked:
>On Sat, 30 Oct 2021 07:31:45 +0100, Roland Perry <rol...@perry.co.uk>
>wrote:
>
>>I understand the original url I was asking about isn't available to Martin any
>>more, so I'll repost it; asking Martin how his reader renders one of mine that's
>>more than 72 chars long, but also has hyphens in:
>>
>> "Does this one get split (and if so, at 45 or at 72 chars) or are all
>> 75 in line:"
>>
>><https://www.realtimetrains.co.uk/service/gb-nr:L22178/2021-10-23/detailed>
>>
>>Given the number of complaints about split urls (hidden words... "being
>>truncated when extracted"), I think it's important to understand which get split
>>and why, and where. Only then can I try to prevent it, where possible.
>
>Your news client hard-wraps lines even if they contain no spaces, which
>means it will break a long URL. It will try to break a long line on a
>hyphen if possible, but if not it will just hard-wrap at precisely your
>line length. So your news client will transmit this:
>
>foobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbar
>
>as this
>
>foobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfo
>obazbarfoobazbarfoobazbarfoobazbar

Can we just check that? I'll leave off the <> wrappers, although in
practice I would always add them.

foobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbarfoobazbar

>but will transmit this
>
>foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar
>
>as this
>
>foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-
>foobazbar-foobazbar-foobazbar-foobazbar-foobazbar
Let's check that too (as above, willing to be proven wrong - this is a learning experience!):
foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar-foobazbar
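
For anyone who wants to experiment, here is a minimal Python sketch of the
wrapping rule as Mark describes it above: break after the last hyphen that
fits, otherwise cut at exactly the line length. It is not the news client's
actual code. The function name hard_wrap is made up for illustration, and
the width of 74 is an assumption read off Mark's two examples rather than a
known setting. Spaces are ignored entirely, since the test strings have none.

```python
def hard_wrap(text: str, width: int = 74) -> list[str]:
    """Hard-wrap a space-free string, preferring a break after a hyphen."""
    lines = []
    while len(text) > width:
        # Look for the last hyphen within the first `width` characters.
        cut = text.rfind("-", 0, width)
        # Break just after that hyphen; if there is none, cut at `width`.
        cut = cut + 1 if cut != -1 else width
        lines.append(text[:cut])
        text = text[cut:]
    lines.append(text)
    return lines


if __name__ == "__main__":
    no_hyphens = "foobazbar" * 12
    with_hyphens = "-".join(["foobazbar"] * 12)
    for sample in (no_hyphens, with_hyphens):
        for line in hard_wrap(sample):
            print(len(line), line)
        print()
```

With width=74 this splits the two test strings at the same points as Mark's
illustrations; a different width setting would move the break accordingly.
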
>The preferred solutions to that, these days, are either not to hard-wrap
>long lines, or to soft-wrap instead so that they can easily be
>reassembled by a recipient news client which understands format=flowed.
>Agent uses the former, Thunderbird the latter, I'm not sure offhand what
>other news clients do without checking them.
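
To make the soft-wrap idea concrete, a rough Python sketch of how a
receiving client might reassemble format=flowed text: a line ending in a
space is a soft break and is joined to the next line. This is a
simplification under stated assumptions. It ignores quote depth, treats
"-- " as a signature separator that is never joined, and the function name
unflow and the example.invalid URL are hypothetical. The delsp flag stands
in for the DelSp=yes parameter, under which the trailing space is dropped
on rejoining, which is what a space-free URL would need.

```python
def unflow(lines: list[str], delsp: bool = False) -> list[str]:
    """Rejoin soft-wrapped lines into paragraphs (simplified sketch)."""
    paragraphs = []
    current = ""
    for line in lines:
        # Undo space-stuffing: a leading space added by the sender is dropped.
        if line.startswith(" "):
            line = line[1:]
        # The signature separator ends any open paragraph and is kept as-is.
        if line == "-- ":
            if current:
                paragraphs.append(current)
                current = ""
            paragraphs.append(line)
            continue
        if line.endswith(" "):
            # Soft break: keep accumulating; drop the space if DelSp applies.
            current += line[:-1] if delsp else line
        else:
            # Hard break: the paragraph ends here.
            paragraphs.append(current + line)
            current = ""
    if current:
        paragraphs.append(current)
    return paragraphs


if __name__ == "__main__":
    wrapped = ["https://example.invalid/a-very-long- ", "path/detailed"]
    print(unflow(wrapped, delsp=True))
    print(unflow(wrapped, delsp=False))
```

Without the DelSp behaviour the rejoined URL keeps a space at the break, so
soft-wrapping alone does not rescue a long URL.
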
>
>Back in the days when Usenet was mostly read on fixed-width terminal
>windows, though, neither of those options was practical. If you didn't
>wrap a line at all, then part of it would be invisible on a fixed-width
>terminal. And soft-wrapping wasn't formalised until 2004, so any
>software older than that won't support it. So a workaround, back then,
>specifically for URLs, was to enclose them within < and > delimiters.

It's still in the RFC.

>This doesn't prevent them being split in transmission, but it does give
>a visual signal of the extent of the URL - in particular, it indicates
>that the URL extends over multiple lines and therefore needs to be
>reassembled by the reader. Some news software will attempt to be clever
I think you mean "standards compliant".
>and automatically reassemble a split line within those delimiters,
>although that's not necessarily reliable.
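
As an illustration of that reassembly, a small Python sketch which pulls
out anything between < and > that looks like a URL and discards the
whitespace, including line breaks, inside the delimiters. The function name
extract_bracketed_urls is hypothetical, and the sketch assumes unquoted
body text: ">" quote markers inside a split URL would confuse it, which is
one way the approach can be unreliable.

```python
import re


def extract_bracketed_urls(text: str) -> list[str]:
    """Return URLs found between < and >, with inner whitespace removed."""
    urls = []
    for match in re.finditer(r"<([^<>]*)>", text):
        # Remove spaces, tabs and line breaks introduced by wrapping.
        candidate = re.sub(r"\s+", "", match.group(1))
        # Keep only things that start with a scheme such as "https:".
        if re.match(r"[A-Za-z][A-Za-z0-9+.-]*:", candidate):
            urls.append(candidate)
    return urls


if __name__ == "__main__":
    body = (
        "Try <https://www.realtimetrains.co.uk/service/gb-nr:L22178/2021-10-23/\n"
        "detailed> and see."
    )
    print(extract_bracketed_urls(body))
```
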
>
>However, that's pretty much outdated now, given that almost everyone
>reads news on a GUI screen rather than a terminal window, and the
>software can either cope with long lines (Agent, for example, will
>visually wrap anything wider than the viewing window without actually
>breaking the underlying text) or understands format=flowed and therefore
>isn't bound by the line length settings on the sending client.
>
>One of the main reasons why the < > convention lingers, though, is the
>malign influence of Turnpike, which stubbornly persisted with pre-GUI
>conventions long after other news software had moved on and, despite
>being obsolete and no longer maintained, still has a dedicated, if
>dwindling, cohort of fans.

Perhaps you'd like to propose a new RFC?
--
Roland Perry