Usenet Markup

Jason Evans

Nov 12, 2021, 2:30:49 AM

Some of you may have seen the HTML spammer in news.admin.misc and a few
other places. Someone else recently asked on Reddit why HTML isn't used
more on Usenet. I think the basic answer is because so many people still
use terminal-based newsreaders like SLRN and NN and HTML just makes
everything harder to read.

I think there could be a middle ground. Take something like gemtext which
is a very limited subset of the Markdown markup language. Gemtext has
only 5 different operations that are used for formatting:

Links:
=> https://blog.theuse.net My Blog

Headings:
# Heading
## Sub-heading
### Sub-sub-heading

Lists:
* Item 1
* Item 2
* Item 3

Blockquotes:
> That's what she said

Preformatted text/Source code:
```
#!/bin/bash

echo "this is a bash script"
```

Right now there are no newsreaders that handle this kind of markup, but if
there were, users who do not use such a newsreader would not be distracted
by it in an article they were reading, because it doesn't interfere with
normal reading the way HTML does.
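
As a rough illustration of how little a newsreader would need in order to
recognise the line types listed above, here is a minimal sketch in Python.
The function name classify_gemtext and the tag names it returns are
hypothetical, chosen for this example only; this is not taken from any
existing newsreader or from the Gemini specification.

```python
FENCE = "`" * 3  # the three-backtick marker that toggles preformatted mode


def classify_gemtext(text):
    """Tag each line of a gemtext body with a (kind, payload) pair."""
    out = []
    preformatted = False
    for line in text.splitlines():
        if line.startswith(FENCE):
            preformatted = not preformatted      # fence lines toggle the mode
            out.append(("fence", line[3:].strip()))
        elif preformatted:
            out.append(("pre", line))            # no markup inside a fence
        elif line.startswith("=>"):
            parts = line[2:].strip().split(maxsplit=1)
            url = parts[0] if parts else ""
            label = parts[1] if len(parts) > 1 else url
            out.append(("link", (url, label)))
        elif line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            out.append(("heading%d" % min(level, 3), line.lstrip("#").strip()))
        elif line.startswith("* "):
            out.append(("item", line[2:]))
        elif line.startswith(">"):
            out.append(("quote", line[1:].strip()))
        else:
            out.append(("text", line))
    return out
```

One thing the sketch makes visible: the ">" blockquote marker is the same
character Usenet already uses for quoting replies, so a reader would have to
decide which meaning applies.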

Just throwing this out there for discussion.

Grant Taylor

Nov 12, 2021, 1:40:34 PM

On 11/12/21 12:30 AM, Jason Evans wrote:
> Some of you may have seen the HTML spammer in news.admin.misc and a few
> other places. Someone else recently asked on Reddit why HTML isn't used
> more on Usenet.

In a word, "convention".

> I think the basic answer is because so many people still use
> terminal-based newsreaders like SLRN and NN and HTML just makes
> everything harder to read.

Maybe, maybe not.

> I think there could be a middle ground. Take something like gemtext which
> is a very limited subset of the Markdown markup language.

gemtext seems like it might be fairly innocuous, much like
format=flowed. Though I suspect that gemtext's uptake would be lower
than format=flowed's uptake.

> Right now there are no newsreaders that handle this kind of markup, but if
> there were, users who do not use such a newsreader would not be distracted
> by it in an article they were reading, because it doesn't interfere with
> normal reading the way HTML does.

Valid point.

> Just throwing this out there for discussion.

I think introducing another form of markup seems like it's only going to
muddy the water even more.

The wonderful thing about standards is that we have so many to pick
from. À la xkcd 927 -- Standards

https://xkcd.com/927/

--
Grant. . . .
unix || die

Adam H. Kerman

Nov 12, 2021, 3:35:12 PM

Grant Taylor <gta...@tnetconsulting.net> wrote:
>On 11/12/21 12:30 AM, Jason Evans wrote:

>>Some of you may have seen the HTML spammer in news.admin.misc and a few
>>other places. Someone else recently asked on Reddit why HTML isn't used
>>more on Usenet.

>In a word, "convention".

>>I think the basic answer is because so many people still use
>>terminal-based newsreaders like SLRN and NN and HTML just makes
>>everything harder to read.

>Maybe, maybe not.

>>I think there could be a middle ground. Take something like gemtext which
>>is a very limited subset of the Markdown markup language.

>gemtext seems like it might be fairly innocuous, much like
>format=flowed. Though I suspect that gemtext's uptake would be lower
>than format=flowed's uptake.

And yet format=flowed is badly implemented on any number of Mail and
News clients.

Whether gemtext is innocuous depends on successful implementation in
clients.

I like plain text ASCII. It's universally readable.

Grant Taylor

Nov 12, 2021, 7:32:31 PM

On 11/12/21 1:35 PM, Adam H. Kerman wrote:
> And yet format=flowed is badly implemented on any number of Mail and
> News clients.

I've been using format=flowed for close to 20 years without any problems.

Bugs exist in almost all computer programs. Implementations of
format=flowed, or default configurations therefor, are subject to
similar bugs.

> Whether gemtext is innocuous depends on successful implementation
> in clients.
>
> I like plain text ASCII. It's universally readable.

format=flowed *is* /plain/ /text/.

Adam H. Kerman

Nov 12, 2021, 8:32:18 PM

Grant Taylor <gta...@tnetconsulting.net> wrote:
>On 11/12/21 1:35 PM, Adam H. Kerman wrote:

>>And yet format=flowed is badly implemented on any number of Mail and
>>News clients.

>I've been using format=flowed for close to 20 years without any problems.

I think that's wonderful. I've attempted to use it in different clients.
Almost none can handle it in followup. The quotes ended up nonstandard.

>Bugs exist in almost all computer programs. Implementations of
>format=flowed, or default configurations therefor, are subject to
>similar bugs.

Ok. That's a tautology.

>>Whether gemtext is innocuous depends on successful implementation
>>in clients.

>>I like plain text ASCII. It's universally readable.

>format=flowed *is* /plain/ /text/.

What does that have to do with my comment about ASCII? It's not
character set dependent.

I've seen clients produce long lines in format=flowed, which makes it
NOT universally readable if it won't output lines of 78 characters or
less. There's no reason to output text not intended to be displayed on
a screen width of 80 characters as it's designed to treat each paragraph
as one long line and reformat on the fly based on screen width.

Grant Taylor

Nov 12, 2021, 11:13:09 PM

On 11/12/21 6:32 PM, Adam H. Kerman wrote:
> I think that's wonderful. I've attempted to use it in different clients.
> Almost none can handle it in followup. The quotes ended up nonstandard.

I've not noticed a problem in follow up replies vs new messages per se.
I think the problem has to do with the source material, be it newly
typed text or copy that's being replied to.

For instance, your comment above is not formatted per format=flowed
standards. I've slightly altered a copy below such that it is formatted
per format=flowed. I've also increased the quote depth.

>> I think that's wonderful. I've attempted to use it in different clients.
>> Almost none can handle it in followup. The quotes ended up nonstandard.

Here's another copy of it that has been completely reformatted the way
that I usually do things. Plus yet another increase in quote depth.

>>> I think that's wonderful. I've attempted to use it in different
>>> clients. Almost none can handle it in followup. The quotes ended
>>> up nonstandard.

:-)

> What does that have to do with my comment about ASCII? It's not
> character set dependent.

Sorry, I think of plain text as being ASCII. Or more precisely plain
text is a subset of ASCII.

> I've seen clients produce long lines in format=flowed, which makes it
> NOT universally readable if it won't output lines of 78 characters or
> less.

That statement tells me without a doubt that those long lines (as viewed
in the message source) are NOT format=flowed.

It sounds like you are instead talking about really long, unwrapped
lines of text. Which is something that I consider to be an
abomination.

> There's no reason to output text not intended to be displayed on a
> screen width of 80 characters as it's designed to treat each paragraph
> as one long line and reformat on the fly based on screen width.

Why is there any reason to artificially limit /display/ of text to 80
-- or pick the number you prefer -- characters?

I believe that format=flowed is a wonderful happy medium. It allows
format=flowed enabled readers to re-wrap the text to the window width
for /display/ while the underlying source is fixed width. Thus
format=flowed will respect those that want the text to be re-wrapped
/and/ those that want text to be < 80 characters per line.

Format=flowed works by doing two things:

1) Adding a header indicating that format=flowed is being used.
2) Output lines of text up to but not exceeding a fixed width. That
fixed width is usually set to 72 through 76 characters.

The output lines that are supposed to be continued end with a space.

The existence of the format=flowed header means that any line that ends
with a space is supposed to have the next line appended to it.
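
The joining rule described above can be sketched in a few lines of Python.
This is a simplified illustration, not a complete RFC 3676 implementation:
it assumes the article has already been identified as flowed (in RFC 3676
that is the format=flowed parameter on the Content-Type header), it handles
quote depth, space-stuffing and the "-- " signature separator, and it
ignores the DelSp parameter. The function name unflow is hypothetical.

```python
def unflow(text):
    """Join soft-wrapped format=flowed lines into (quote_depth, text) pairs."""
    paragraphs = []
    current = None                        # open paragraph awaiting more lines
    for raw in text.split("\n"):
        depth = len(raw) - len(raw.lstrip(">"))   # count leading '>' marks
        line = raw[depth:]
        if line.startswith(" "):
            line = line[1:]               # undo space-stuffing
        is_sig = line == "-- "            # signature separator never flows
        if current is not None:
            if current[0] == depth and not is_sig:
                line = current[1] + line  # append to the line before it
            else:
                paragraphs.append(current)  # depth changed: close it as-is
            current = None
        if line.endswith(" ") and not is_sig:
            current = (depth, line)       # trailing space: more to come
        else:
            paragraphs.append((depth, line))
    if current is not None:
        paragraphs.append(current)
    return paragraphs
```

A reader that has the joined paragraphs can then re-wrap each one to
whatever width its window happens to be, while a reader that knows nothing
about format=flowed simply shows the short source lines unchanged.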

Julien ÉLIE

Nov 13, 2021, 3:47:57 AM

Hi Adam,

> I like plain text ASCII. It's universally readable.

Just responding to say I like UTF-8 better :-)

P.-S.: I think you now decode well my messages :-)

--
Julien ÉLIE

« Dès que le silence se fait, les gens le meublent. » (Raymond Devos)

Adam H. Kerman

Nov 13, 2021, 7:42:43 AM

Grant Taylor <gta...@tnetconsulting.net> wrote:
>On 11/12/21 6:32 PM, Adam H. Kerman wrote:

>>I think that's wonderful. I've attempted to use it in different clients.
>>Almost none can handle it in followup. The quotes ended up nonstandard.

>I've not noticed a problem in follow up replies vs new messages per se.
>I think the problem has to do with the source material, be it newly
>typed text or copy that's being replied to.

The problem in my experience has not been the source material. With a
client that poorly implements it, I've observed that the quote of
material that began as flowed text is no longer flowed text.

>>. . .

>>What does that have to do with my comment about ASCII? It's not
>>character set dependent.

>Sorry, I think of plain text as being ASCII. Or more precisely plain
>text is a subset of ASCII.

I'm not going to dispute that plain text using the Latin alphabet in a
language other than English with accented characters is plain text.
However, I do not recognize as plain text substituting open and close
single and double quotes in punctuation or em and en dashes for which
there are perfectly good ASCII punctuation marks. I have yet to see one
of these "smart quote" implementations that distinguish between single
close quote, apostrophe, and acute accent. All three may be represented
by the same glyph but they have separate character codes in UTF.

>>I've seen clients produce long lines in format=flowed, which makes it
>>NOT universally readable if it won't output lines of 78 characters or
>>less.

>That statement tells me without a doubt that those long lines (as viewed
>in the message source) are NOT format=flowed.

format=flowed is not line length dependent, for the entire paragraph may
be one long line or a series of lines of widely varying length, and
still display as intended within the viewport.

I looked at RFC 3676. The recommendation not to exceed 78 characters is
a SHOULD, not a MUST.

>It sounds like you are instead talking about the simply really long /
>unwrapped lines of text. Which is something that I consider to be an
>abomination.

No, I'm not. I'm talking about those who use a variable-width
character set instead of a fixed-width character set when composing News
and Mail. The line length is then set by their viewport, ignoring the
needs of those of us who continue to expect output for an 80 character
width terminal or emulation.

>>. . .

Adam H. Kerman

Nov 13, 2021, 8:46:12 AM

Julien Ã LIE <iul...@nom-de-mon-site.com.invalid> wrote:

I am aware that you spell your last name with Latin Capital E With Acute
Accent. Your encoded word is =c3=89. I'm using the vim text editor to edit
this followup. I see Latin Capital A with Tilde.
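
The symptom described here is what one would expect if UTF-8 bytes are
being interpreted one byte at a time as Latin-1 somewhere along the way.
A small Python sketch of that mechanism follows; it only demonstrates the
byte-level effect and makes no claim about which component is actually
misconfigured in this case.

```python
import quopri

name = "\u00c9LIE"                        # U+00C9 (E with acute) then "LIE"
utf8_bytes = name.encode("utf-8")         # U+00C9 becomes the bytes C3 89
as_qp = quopri.encodestring(utf8_bytes)   # quoted-printable form of the bytes
misread = utf8_bytes.decode("latin-1")    # each byte shown as one character

# In Latin-1, byte C3 is "A with tilde" and byte 89 is a non-printing
# control character, which many tools display as <89>.
print(as_qp, [hex(ord(ch)) for ch in misread[:2]])
```

Decoding the same bytes as UTF-8 gives back the original name, which is
why the article looks fine in clients where every layer agrees on UTF-8.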

>Hi Adam,

>> I like plain text ASCII. It's universally readable.

>Just responding to say I like UTF-8 better :-)

>P.-S.: I think you now decode well my messages :-)

No. It's never displayed on my UTF-8 terminal emulations as you intended.
On the Linux Mint laptop I'm using Xfce Terminal Emulator.

>--
>Julien ÉLIE

I see Latin Capital A with Tilde, then <89> which is an undecoded display.

>« Dès que le silence se fait, les gens le meublent. » (Raymond Devos)

Here I see several non-printing characters and various accented letters
not displaying as you intended.

Grant Taylor

Nov 13, 2021, 12:10:08 PM

On 11/13/21 5:42 AM, Adam H. Kerman wrote:
> The problem in my experience has not been the source material. With
> a client that poorly implements it, I've observed that the quote of
> material that began as flowed text is no longer flowed text.

Ah.

I too have seen many email / news clients not properly handle quoting of
previously format=flowed text.

To me, that quote text is /new/ text that happens to resemble the
original format=flowed text. Somewhat like a photo copy of a photo copy.

You are referring to an area where there are far more bugs. An area
where I have taken to manually reformatting text that goes into articles
that I reply to. I have a script that I use to reformat quoted text.

Use of format=flowed is important enough to me that I take time to
reformat=flowed (reflow?) text in messages that I reply to.

> I'm not going to dispute that plain text using the Latin alphabet in
> a language other than English with accented characters is plain text.
> However, I do not recognize as plain text substituting open and close
> single and double quotes in punctuation or em and en dashes for which
> there are perfectly good ASCII punctuation marks. I have yet to see one
> of these "smart quote" implementations that distinguish between single
> close quote, apostrophe, and acute accent. All three may be represented
> by the same glyph but they have separate character codes in UTF.

That statement didn't go where I thought it was going to go. I will say
that I may have simplified "plain text" to some extent. But I didn't
think this thread was going far enough into the weeds about that. I'll
mostly bow out of that discussion.

Mostly as in there is a difference in meaning of a dash (a.k.a. hyphen),
an en-dash, and an em-dash. Admittedly they are lost on many people.

I think a comparison can be made to an "o", "O", and "0". As there was
a time when people had to be taught to use the proper letter in the
proper context, particularly when interacting with computers.

1-5 (dash / hyphen) vs 1–5 (en-dash) Am I subtracting 5 from 1 or am I
saying 1 through 5? The dash vs en-dash makes a difference.

<Title> — <subtitle> An em-dash is a form of separator / pause.

There are differences between the "-" (dash / hyphen), "–" (en-dash),
and "—" (em-dash). Some may think that the differences are an
unnecessary nuance.

> format=flowed is not line length dependent, for the entire paragraph
> may be one long line or a series of lines of widely varying length,
> and still display as intended within the viewport.

Which is one of the reasons that I like format=flowed as much as I do.

> I looked at RFC 3676. The recommendation not to exceed 78 characters
> is a SHOULD, not a MUST.

Yep.

> No, I'm not. I'm talking about those who use a variable-width character
> set instead of a fixed-width character set when composing News and
> Mail. The line length is then set by their viewport, ignoring the
> needs of those of us who continue to expect output for an 80 character
> width terminal or emulation.

The font face should not make a difference. Variable vs fixed width
text should support format=flowed perfectly fine.

The fact that you are running into a hard line length — other than
someone using the wrong value close to 78 — tells me that you are
dealing with something that's not implementing format=flowed properly.

Grant Taylor

Nov 13, 2021, 12:14:54 PM

On 11/13/21 6:46 AM, Adam H. Kerman wrote:
> No. It's never displayed on my UTF-8 terminal emulations as you
> intended. On the Linux Mint laptop I'm using Xfce Terminal Emulator.

Sounds to me like you need to get a better terminal emulator.

XTerm displays it just fine and looks identical to what Thunderbird
displays.

> I see Latin Capital A with Tilde, then <89> which is an undecoded
> display.

Or perhaps your MUA / editor needs some tweaking.

I'm able to see Julien's text just fine inside of vim in XTerm.

> Here I see several non-printing characters and various accented
> letters not displaying as you intended.

Sounds to me like you're seeing the message source, not a rendered
message. Hence the MUA comment.

Grant Taylor

Nov 13, 2021, 12:17:09 PM

On 11/13/21 10:14 AM, Grant Taylor wrote:
> Sounds to me like you need to get a better terminal emulator.

> ...

> Or perhaps your MUA / editor needs some tweaking.

> ...

> Sounds to me like you're seeing the message source, not a rendered
> message. Hence the MUA comment.

You are obviously free to use whatever software / hardware that you want
to use.

But I implore you to understand where the limitation is. Maybe it's a
configuration issue of what you're using. Maybe it's a lack of capability.

Use what you want to, but understand what and how your choice impacts
what you do.

Adam H. Kerman

Nov 13, 2021, 1:25:32 PM

Of course not, but I didn't write the client to flout conventional line
length.

>Variable vs fixed width text should support format=flowed perfectly fine.

>The fact that you are running into a hard line length — other than
>someone using the wrong value close to 78 — tells me that you are
>dealing with something that's not implementing format=flowed properly.

I'm disagreeing with you on that. Outputting conventional line length is
a separate issue from outputting standard flowed text.

Adam H. Kerman

Nov 13, 2021, 1:32:24 PM

Grant Taylor <gta...@tnetconsulting.net> wrote:
>On 11/13/21 6:46 AM, Adam H. Kerman wrote:

>>No. It's never displayed on my UTF-8 terminal emulations as you
>>intended. On the Linux Mint laptop I'm using Xfce Terminal Emulator.

>Sounds to me like you need to get a better terminal emulator.

It may not be the terminal emulator. I have a similar problem with
Julien's articles on PuTTY on a Windows 8.1 desktop.

My guess is it's something in the LOCALE setting of the remote terminal
but I have no idea what it could be since both the terminal and the two
emulations are set to UTF-8.

>Or perhaps your MUA / editor needs some tweaking.

Then you'll have to clue me in. As far as I'm aware, vim doesn't touch
this stuff and can't override the terminal emulation.

>>Here I see several non-printing characters and various accented
>>letters not displaying as you intended.

>Sounds to me like you're seeing the message source, not a rendered
>message. Hence the MUA comment.

When there's an incompatibility, non-printing characters become visible.
I sometimes do that deliberately to make sure I've eliminated them when
I intend to send plain text.

Grant Taylor

Nov 13, 2021, 7:41:42 PM

On 11/13/21 11:25 AM, Adam H. Kerman wrote:
> I'm disagreeing with you on that. Outputting conventional line length
> is a separate issue from outputting standard flowed text.

Please provide an example of where you are seeing a problem.

I'm still not tracking where the problem would relate to format=flowed.

Grant Taylor

Nov 13, 2021, 7:43:37 PM

On 11/13/21 11:32 AM, Adam H. Kerman wrote:
> My guess is it's something in the LOCALE setting of the remote terminal
> but I have no idea what it could be since both the terminal and the
> two emulations are set to UTF-8.

Maybe.

> Then you'll have to clue me in. As far as I'm aware, vim doesn't
> touch this stuff and can't override the terminal emulation.

I was thinking that the MUA might not be processing / decoding things
correctly and thus displaying them incorrectly.

Adam H. Kerman

Nov 13, 2021, 8:35:32 PM

That would be my point. It's an unrelated issue.

Julien ÉLIE

Nov 19, 2021, 1:42:16 PM

Hi Adam,

> It may not be the terminal emulator. I have a similar problem with
> Julien's articles on PuTTY on a Windows 8.1 desktop.
>
> My guess is it's something in the LOCALE setting of the remote terminal
> but I have no idea what it could be since both the terminal and the two
> emulations are set to UTF-8.

Do you have "UTF-8" as remote character set in the Window > Translation
parameter of PuTTY?

Which locale do you use on the remote terminal? Don't the following
settings do the trick?
export LC_ALL=en_US.utf8
export LANG=en_US.utf8
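
For checking which of those variables actually wins, here is a minimal
Python sketch. It relies on the POSIX precedence order (LC_ALL, then
LC_CTYPE, then LANG, with empty values treated as unset); the function
names effective_ctype and looks_utf8 are hypothetical, and the check only
inspects the variables, it does not verify that the locale is installed.

```python
import os


def effective_ctype(env):
    """Return the locale string that decides character handling, or None."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:                       # empty string counts as unset
            return value
    return None


def looks_utf8(env):
    """True if the winning locale string names a UTF-8 codeset."""
    value = effective_ctype(env)
    if value is None:
        return False
    # Locale strings look like language_TERRITORY.codeset@modifier
    codeset = value.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"


if __name__ == "__main__":
    print(effective_ctype(os.environ), looks_utf8(os.environ))
```

Running it on the remote host, inside the same session the newsreader is
started from, shows whether an LC_ALL or LC_CTYPE set elsewhere is
overriding LANG.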

--
Julien ÉLIE

« – Depuis quelque temps, Zérozérosix n'attire plus les mouches !
– Espérons qu'il ne nous attirera pas d'ennuis ! » (Astérix)

Donkey Button

Nov 20, 2021, 12:32:31 PM

On 11/12/21 2:35 PM, Adam H. Kerman wrote:

[...]

> I like plain text ASCII. It's universally readable.

I agree. I love the old vanilla. That said, English is not the only
language in use. Also: math.

We are at a point where all terminals and graphical clients should be
UTF-8 compliant, and attaching postscript-like font blobs and simple
formatting instructions could be part of the standard. With HTML we
attach CSS/JS. A simpler scheme than in HTML allows for multiple
languages. It would also allow for intricate mathematical notation.

Of course I believe default should be ASCII and nothing should be
attached that doesn't need to be. I also think that any kind of attached
resource should always be at the very bottom of the text file, never at
the top. Inline markup should be numerical indexes instead of style
tags, to prevent a lot of visual clutter. The human eye gets accustomed
to numbers in parentheses quickly and soon just tunes them out.

For instance one might have some UTF-8 math operators or Greek glyphs
with an enclosed font in postscript/base64 format. The postscript code
for the glyphs should be at the end of the file with a numerical index
like this:

(31) RG9ua2V5IEJ1dHRvbiBydWxlcyB0aGUgSW50ZXJuZXQK
(32) cG9zdHNjcmlwdCBnbHlwaHMgYXQgZW5kIG9mIGZpbGUK
(33) ZmFrZSBwb3N0c2NyaXB0IGdseXBoIGluZGV4IGNvZGUK

And in the text body it would be inserted like this:

(32:) unicode symbols or entity codes go here (:32)
(33:) Spo9AmpYwPrv220TwtNDQm61lv81m/zJ (:33)

Index 33 would theoretically be UTF-8, but I substituted base64 just for
the example.

Readers should hide the index tags by default. Readers that can't render
them should allow display of the tags and code or hiding all tagged code.
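
To make the proposal concrete, here is a minimal sketch in Python of how a
reader might hide the index tags and collect the footer entries. The exact
syntax is an assumption read off the examples above: "(N:) ... (:N)" around
a span in the body and "(N) data" on its own line in the footer. The
function names are hypothetical.

```python
import re

# "(32:) some text (:32)" in the body; the closing number must match.
SPAN = re.compile(r"\((\d+):\)\s?(.*?)\s?\(:\1\)", re.DOTALL)
# "(32) base64data" on a line of its own in the footer.
FOOTER = re.compile(r"^\((\d+)\) (\S+)$", re.MULTILINE)


def hide_tags(body):
    """Drop the (N:) and (:N) markers but keep the text between them."""
    return SPAN.sub(lambda m: m.group(2), body)


def tagged_spans(body):
    """List (index, text) for every tagged span, for readers that render."""
    return [(int(m.group(1)), m.group(2)) for m in SPAN.finditer(body)]


def footer_resources(text):
    """Map each footer index to its still-encoded resource string."""
    return {int(n): blob for n, blob in FOOTER.findall(text)}
```

A reader that cannot render the resources would call only hide_tags; one
that can would look up each index from tagged_spans in the footer map.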

Index tags can be used for any kind of formatting. However the format
instructions should not be part of the index tags. Rather the
instructions should reside at the footer of the document and be
referenced by a numerical tag. This also has the side effect of clear
and unambiguous semantic meaning without boilerplate.

In this way plain text would just render as plain text without any need
for highly-distracting markup or instruction, and formatted text and
font resources would not obstruct any plain text in the document.

Command-line browsers and readers like lynx and slrn would only need a
few lines of code to detect and adapt the scheme for modern terminals
that are UTF-8 compliant.

--
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Donkey Button

(29) Spo9AmpYwPrv220TwtNDQm61lv81m/zJ
(30) RG9ua2V5IEJ1dHRvbiBpcyBqdXN0IEFTQ0lJIHRleHQK
-----BEGIN PGP SIGNATURE-----

iIoEARYIADIWIQTVsPVXpYZunw9X5fAPDrQ1oja5vAUCYZkwjhQcZG9ua2V5QGJ1
dHRvbi5hc2NpaQAKCRAPDrQ1oja5vME7AP4svyYiU8C0ppKJSUYGjSomOqBV3hMZ
UzfhIn6di1Y9bAEAlhQ9yX/cnPxYbAT7s6Obon5TFOYZc5T10hTzIieRowg=
=P7T5
-----END PGP SIGNATURE-----

Adam H. Kerman

Nov 20, 2021, 4:32:50 PM

Donkey Button <don...@button.ascii> wrote:
>On 11/12/21 2:35 PM, Adam H. Kerman wrote:

>[...]

>>I like plain text ASCII. It's universally readable.

>I agree. I love the old vanilla. That said, English is not the only
>language in use. Also: math. . . .

What idiot would post a noncontroversial opinion like this through
mixmin?

hi seamus

fuck off seamus

bye seamus

Donkey Button

Nov 21, 2021, 12:37:54 AM

On 11/20/21 3:32 PM, Adam H. Kerman wrote:

> What idiot would post a noncontroversial opinion like this through
> mixmin?

*plonk*

The name-calling and [snipped] profanity is unhinged and pubescent
behavior. It seems like this person was asking for his address to
be killfiled. I have granted his subliminal request.

--
Donkey Button

Adam H. Kerman

Nov 21, 2021, 1:26:10 PM

hi seamus

fuck off seamus

bye bye seamus

Kefra Gotex

Nov 7, 2022, 6:48:18 AM

I think markdown would be slightly better, since it can syntactically be
rendered as HTML, with all styling controlled by the reader rather than
the author.

For those addicted to extreme simplicity, a half-dozen markdown rules are
all they would need to know, which would put it on par with Gemini text.

A custom NNTP header in each message could indicate the format for
readers aware of formats. It could be like a !DOCTYPE declaration. For
example:

Doctype: html5
Doctype: xml
Markup: gemini 1.0
Markup: markdown
Markup: commonmark
Markup: GFM 3.2
Markup: RST
Markup: asciidoc
Markup: bbcode
Markup: wikimedia

Readers without markup awareness could just display as is.
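
A minimal sketch of that fallback in Python, using the standard library
email parser. The "Markup" header is the hypothetical one proposed above,
not an existing standard, and the KNOWN set and the function name
pick_renderer are likewise made up for the example.

```python
from email import message_from_string

# Formats this imaginary reader knows how to render.
KNOWN = {"gemini", "markdown", "commonmark"}


def pick_renderer(article_text):
    """Return the declared markup name if supported, else "plain"."""
    msg = message_from_string(article_text)
    declared = msg.get("Markup", "")      # hypothetical header from the post
    parts = declared.split()              # e.g. ["gemini", "1.0"]
    name = parts[0].lower() if parts else ""
    return name if name in KNOWN else "plain"
```

An article with no Markup header, or with one the reader does not
recognise, comes back as "plain" and is shown exactly as it was posted.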

Mime dividers could allow multiple formats to be attached. This would
allow the client software to choose which format to render. Using a
proper compression algorithm like 7z or xz would deflate multiple
markups of the same text very well since they would share most words in
common. Therefore bandwidth inflation would not really be an issue.
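
The multiple-format idea maps onto the existing MIME multipart/alternative
container. Here is a minimal sketch in Python using the standard library
email package; the addresses, newsgroup and function names are placeholders
for illustration, and the sketch does not address the compression part of
the proposal.

```python
from email.message import EmailMessage


def build_article(plain, markdown):
    """Build one article carrying a plain and a markdown copy of the text."""
    msg = EmailMessage()
    msg["From"] = "poster@example.invalid"     # placeholder address
    msg["Newsgroups"] = "misc.test"            # placeholder group
    msg["Subject"] = "Markup demo"
    msg.set_content(plain)                     # becomes the text/plain part
    msg.add_alternative(markdown, subtype="markdown")   # text/markdown part
    return msg


def choose_body(msg, preferred=("markdown", "plain")):
    """Return (subtype, text) for the first preferred format present."""
    parts = {p.get_content_subtype(): p for p in msg.iter_parts()}
    for subtype in preferred:
        if subtype in parts:
            return subtype, parts[subtype].get_content()
    return None
```

The client's preference order decides which copy is shown, so a terminal
reader could pass ("plain",) and never look at the other part.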

The big tech shills would want the client to access a remote URI to get
or validate the doctype or markup declaration, so they can get IPs of
Usenetizens. The moment anyone would try to put this poison into the
protocol, it would be necessary to expose it for its true motivation.
There is no reason whatsoever for any NNTP client to access any URI to
validate any formatting declaration. An RFC would need to run ahead of
this, making clear that URI access is prohibited for rendering of any
format. Look at google API scripts and fonts for an example of how that
surveillance operation works in web pages.