The documentation generator: Modernizing the DOCTYPE to avoid quirks mode

Frank Dana (FeRD)

Jun 21, 2025, 2:27:58 AM
to lua-l
Here's a patch suggestion to adjust the manual/2html generator script so that it uses proper HTML5 DOCTYPE tags when generating the manual HTML. In HTML5, the doctype should just be <!DOCTYPE html> and nothing else, all the other cruft is deprecated.

Using a "DOCTYPE legacy string" (as the current HTML standard calls it) with the additional attributes can cause browsers to activate quirks mode (as Chrome reports it has, when viewing the Lua docs) and that can affect the layout of pages.

I'm not sure, because I can't reproduce it with offline copies, but I think quirks mode processing might be the cause of a page-reflow issue I'm seeing with the online manual, in Chrome on Linux:
  1. If I go to https://www.lua.org/manual/5.4/contents.html#index
  2. And select one of the library functions, os.getenv for example (but it could be any one)
  3. The browser will navigate to https://www.lua.org/manual/5.4/manual.html#pdf-os.getenv
  4. The page will initially load with the os.getenv (varname) heading at the top of the window
  5. Then a fraction of a second later, as it finishes loading, it will reflow the document and the targeted heading will jump offscreen
  6. I have to scroll upwards about half a page-length to get back to it
Can't reproduce it in incognito mode, can't reproduce it with the offline copy installed in /usr/share/doc/lua/, can't reproduce it in Firefox. But it's been happening to me for a while now, in Chrome, and I'm hoping this might help. Even if it doesn't, using the correct, modern HTML5 doctype tag is a good idea anyway.

0001-2html-Use-modern-HTML5-doctype.patch.txt
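
To make the doctype-to-mode relationship concrete, here is a minimal Python sketch of how a browser following the HTML Standard would pick a rendering mode from the doctype line. The function name `rendering_mode` and the two prefix tables are illustrative assumptions, and the tables cover only the two public identifiers that come up in this thread; the standard's real list is much longer, so anything else is reported as "unknown" rather than guessed.

```python
import re

# Illustrative subset only: public-identifier prefixes that always mean quirks mode.
QUIRKS_PREFIXES = ("-//w3c//dtd html 3.2 final//",)

# Prefixes whose mode depends on whether a system identifier (DTD URL) follows.
DEPENDS_ON_SYSTEM_ID = ("-//w3c//dtd html 4.01 transitional//",)

# Matches the bare form and the PUBLIC form (with optional system identifier).
DOCTYPE_RE = re.compile(
    r"""<!DOCTYPE\s+(?P<name>[^\s>]+)
        (?:\s+PUBLIC\s+"(?P<public>[^"]*)"(?:\s+"(?P<system>[^"]*)")?)?
        \s*>""",
    re.IGNORECASE | re.VERBOSE,
)


def rendering_mode(html_text: str) -> str:
    """Return 'no-quirks', 'limited-quirks', 'quirks', or 'unknown'."""
    match = DOCTYPE_RE.search(html_text)
    if match is None:
        # No doctype at all means quirks; a form this sketch cannot parse is unknown.
        return "unknown" if "<!doctype" in html_text.lower() else "quirks"
    if match["name"].lower() != "html":
        return "quirks"
    if match["public"] is None:
        return "no-quirks"  # the plain <!DOCTYPE html>
    public = match["public"].lower()
    if public.startswith(QUIRKS_PREFIXES):
        return "quirks"
    if public.startswith(DEPENDS_ON_SYSTEM_ID):
        return "limited-quirks" if match["system"] is not None else "quirks"
    return "unknown"  # identifier not covered by this partial table
```

Feeding it the manual's current header line and the proposed one shows the difference the patch is after: the HTML 3.2 doctype lands in quirks mode, the bare `<!DOCTYPE html>` does not.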

Roberto Ierusalimschy

Jun 26, 2025, 2:36:45 PM
to lu...@googlegroups.com
Does anyone else have any opinion about that?

(Changing <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
in the manual to <!DOCTYPE html>.)

I would assume that modern browsers should be able to read HTML 4,
but maybe that is not the case. Can't this change cause issues in older
browsers?

> Here's a patch suggestion to adjust the manual/2html generator script so
> that it uses proper HTML5 DOCTYPE tags when generating the manual HTML. In
> HTML5, the doctype should just be <!DOCTYPE html> and nothing else, all the
> other cruft is deprecated.
>
> Using a "DOCTYPE legacy string" (as the current HTML standard calls it
> <https://html.spec.whatwg.org/multipage/syntax.html#the-doctype>) with the
> additional attributes can cause browsers to activate quirks mode
> <https://developer.mozilla.org/en-US/docs/Web/HTML/Quirks_Mode_and_Standards_Mode> (as
> Chrome reports it has, when viewing the Lua docs) and that can affect the
> layout of pages.
>
> I'm not *sure*, because I can't reproduce it with offline copies, but I
> think quirks mode processing *might* be the cause of a page-reflow issue
> I'm seeing with the online manual, in Chrome on Linux:
>
> 1. If I go to https://www.lua.org/manual/5.4/contents.html#index
> 2. And select one of the library functions, os.getenv
> <https://www.lua.org/manual/5.4/manual.html#pdf-os.getenv> for example
> (but it could be any one)
> 3. The browser will navigate to
> https://www.lua.org/manual/5.4/manual.html#pdf-os.getenv
> 4. The page will initially load with the *os.getenv (varname)* heading
> at the top of the window
> 5. Then a fraction of a second later, as it finishes loading, it will
> reflow the document and the targeted heading will jump offscreen
> 6. I have to scroll upwards about half a page-length to get back to it
>
> Can't reproduce it in incognito mode, can't reproduce it with the offline
> copy installed in /usr/share/doc/lua/, can't reproduce it in Firefox. But
> it's been happening to me for a while now, in Chrome, and I'm hoping this
> might help. Even if it doesn't, using the correct, modern HTML5 doctype tag
> is a good idea anyway.
>

> From 68456f5b928826eccf652c07d64669bb8c5aafeb Mon Sep 17 00:00:00 2001
> From: "FeRD (Frank Dana)" <fer...@gmail.com>
> Date: Sat, 21 Jun 2025 02:00:49 -0400
> Subject: [PATCH] 2html: Use modern HTML5 doctype
>
> Using the full HTML3/HTML4-format `<!DOCTYPE html PUBLIC "...">`
> file header causes the document to be interpreted in quirks mode[1].
> The correct[2] HTML5 doctype tag is merely `<!DOCTYPE html>`, full
> stop.
>
> [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Quirks_Mode_and_Standards_Mode
> [2]: https://html.spec.whatwg.org/multipage/syntax.html#the-doctype
> ---
> manual/2html | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/manual/2html b/manual/2html
> index ac5ea043..860f8bf1 100755
> --- a/manual/2html
> +++ b/manual/2html
> @@ -8,7 +8,7 @@
>
> ---------------------------------------------------------------
> header = [[
> -<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
> +<!DOCTYPE html>
> <html>
>
> <head>
> --
> 2.49.0
>



-- Roberto

Sainan

Jun 26, 2025, 2:55:27 PM
to lu...@googlegroups.com
I think <!DOCTYPE html> has been sufficient for at least the last decade. And omitting it typically also works (tho also enables quirks mode).

-- Sainan

Luiz Henrique de Figueiredo

Jun 26, 2025, 3:17:39 PM
to lu...@googlegroups.com
> I would assume that modern browsers should be able to read HTML 4,

The Lua website has used this for a long time:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

bil til

Jun 30, 2025, 8:52:41 AM
to lu...@googlegroups.com
On Thu, 26 Jun 2025 at 20:36, Roberto Ierusalimschy
<rob...@inf.puc-rio.br> wrote:
>
> Does anyone else have any opinion about that?

The Lua documentation is perfectly fine for me.

Also, thanks for the great PiL "Lua applications cookbook".

It is all very compact and complete, and keeping the format typically
has the advantage of better compatibility with "backwards version
comparisons / search applications".

Matthew Wild

Jun 30, 2025, 9:42:31 AM
to lu...@googlegroups.com
On Thu, 26 Jun 2025 at 19:36, Roberto Ierusalimschy
<rob...@inf.puc-rio.br> wrote:
>
> Does anyone else have any opinion about that?
>
> (Changing <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
> in the manual to <!DOCTYPE html>.)
>
> I would assume that modern browsers should be able to read HTML 4,
> but maybe that is not the case. Can't this change cause issues in older
> browsers?

As with many web things, it's unfortunately not as simple as "HTML 4"
vs "HTML 5" - different browsers historically have different parsers,
renderers, quirks modes, etc.[^1]. For the widest compatibility, the
HTML5 doctype is definitely the best choice, because the HTML5
specification finally clarified lots of undefined cases in older HTML
versions which previously varied between browsers. The HTML5 doctype
is also backwards-compatible with older browsers which do not
implement HTML5[^2], this is one reason it was chosen (compared to,
for example, just deprecating the DOCTYPE).

Note that HTML5 browsers (over 96% according to
https://caniuse.com/?search=html5 ) encountering the HTML5 doctype
*will* change behaviour compared to if they see a non-HTML5 doctype,
so you may want to adjust some minor things in the document to align
with HTML5. For example, generally avoid use of tags deprecated by
HTML5[^3].

Regards,
Matthew

[^1]: Attempt to document some of the pre-HTML5 modes used by
browsers: https://hsivonen.fi/doctype/
[^2]: HTML5 doctype backwards compatibility:
https://johnresig.com/blog/html5-doctype/
[^3]: This document defines differences from HTML4:
https://www.w3.org/TR/html5-diff/ (see 'Obsolete Elements' section for
example)

Frank Dana

Jun 30, 2025, 10:08:44 AM
to lu...@googlegroups.com
On Mon, Jun 30, 2025 at 9:42 AM Matthew Wild <mwi...@gmail.com> wrote:
> For the widest compatibility, the
> HTML5 doctype is definitely the best choice, because the HTML5
> specification finally clarified lots of undefined cases in older HTML
> versions which previously varied between browsers. The HTML5 doctype
> is also backwards-compatible with older browsers which do not
> implement HTML5[^2], this is one reason it was chosen (compared to,
> for example, just deprecating the DOCTYPE).

What Matthew said. The [^2] link provided is the kicker; quoting from that article written in 2008(!):

> What’s nice about this new DOCTYPE, especially, is that all current browsers (IE, FF, Opera, Safari) will look at it and switch the content into standards mode – even though they don’t implement HTML5.

No browser will break on any DOCTYPE, but the HTML5 DOCTYPE in particular is maximally safe because (unlike all predecessors) it doesn't specify any particular HTML version or level of support. It just says, "Hey, this is HTML." Browsers will interpret the content according to the HTML standards level they support, which (as Matthew also said) for 96% of current web sessions means HTML5. It was safe to use 17 years ago, it's even safer now.

An audit of the manual's markup to assess its compliance with HTML5 couldn't hurt, just as a general housekeeping thing. (I'm sure there's some kind of linter we can point at the HTML.)

But that's like saying "backing up your files couldn't hurt". When is that ever not true? Unless the docs are using <BLINK> or <MARQUEE> and expecting them to work (they haven't for something like 30 years, now), they're probably not going to encounter any major issues in HTML5 standards mode.
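
As a rough idea of what such an audit could look like, here is a small Python sketch that counts start tags for elements the HTML Standard lists as obsolete. It uses only the standard library's `html.parser`; the `OBSOLETE` set is a partial, hand-picked list and the names `ObsoleteTagCounter` and `count_obsolete_tags` are made up for this example, so treat it as a starting point rather than a real validator.

```python
from collections import Counter
from html.parser import HTMLParser

# Partial list of elements marked obsolete in the HTML Standard (not exhaustive).
OBSOLETE = {
    "acronym", "applet", "basefont", "big", "blink", "center", "dir",
    "font", "frame", "frameset", "marquee", "noframes", "strike", "tt",
}


class ObsoleteTagCounter(HTMLParser):
    """Counts start tags whose element name is in OBSOLETE."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names already lowercased.
        if tag in OBSOLETE:
            self.counts[tag] += 1


def count_obsolete_tags(html_text):
    """Return a dict mapping each obsolete element found to its occurrence count."""
    parser = ObsoleteTagCounter()
    parser.feed(html_text)
    parser.close()
    return dict(parser.counts)
```

Running `count_obsolete_tags(open("manual.html").read())` on a local copy (the file name here is just a placeholder) would give a quick list of tags worth a second look before or after switching the doctype.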