HTMLSTREAM: an imagestream that writes HTML files

Herb Jellinek

Jun 17, 2025, 6:39:29 PM
to Interlisp core
Hi everyone,

I've written a new package, HTMLSTREAM, an implementation of the Medley
Interlisp device-independent graphics API that writes HTML file output.

To get started, please visit https://github.com/hjellinek/htmlstream and
follow the instructions.

            Herb

Paolo Amoroso

Jun 20, 2025, 11:05:26 AM
to Herb Jellinek, Interlisp core
I used HTMLSTREAM and the Hardcopy command of TEdit to generate the attached HTML file of the TEdit documentation file of the Calendar LispUsers module. Very nice.


--
You received this message because you are subscribed to the Google Groups "Medley Interlisp core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lispcore+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/lispcore/bf83768c-3524-485b-98aa-c2d2022431d5%40newscenter.com.


--
CALENDAR.html

Ron Kaplan

Jun 20, 2025, 11:46:08 AM
to Paolo Amoroso, Herb Jellinek, Interlisp core
This looks different from the PDF: the paragraphs are justified in the PDF but not in the HTML, and the line breaks are in different places.

Is that to be expected?

Herb Jellinek

Jun 20, 2025, 2:11:18 PM
to Ron Kaplan, Paolo Amoroso, Interlisp core
I wonder if the two imagestreams have implemented IMSPACEFACTOR inconsistently.  I'll have to look into that.

            Herb

Ron Kaplan

Jun 20, 2025, 2:27:53 PM
to Herb Jellinek, Paolo Amoroso, Interlisp core
I suppose we wouldn't expect the line breaks to be in the same places, because the font widths are probably different.  But if HTML can support the space factor, the justification should work.

The other thing I noticed is that the page numbers are centered on the page in the PDF, but in the HTML they're before the left margin.
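
On the space-factor point, here is a minimal sketch of how a space factor could be carried into SVG `<text>` output. It assumes the factor multiplies the natural width of each space character, which is how I understand the imagestream space factor to work. The function names and the 200-unit space width are hypothetical, not HTMLSTREAM's actual code, and browser support for `word-spacing` on SVG text would need checking.

```python
from html import escape


def word_spacing_for_space_factor(space_factor: float, natural_space_width: float) -> float:
    """Extra width to add to each space so that it renders at
    space_factor * natural_space_width (assumed semantics)."""
    return (space_factor - 1.0) * natural_space_width


def justified_text_element(text: str, x: int, y: int,
                           space_factor: float, natural_space_width: float) -> str:
    """Build one SVG <text> element whose spaces are stretched by the factor.

    word-spacing adds to the normal space width, so only the excess is emitted.
    """
    extra = word_spacing_for_space_factor(space_factor, natural_space_width)
    return (f"<text x='{x}' y='{y}' word-spacing='{extra:g}'>"
            f"{escape(text)}</text>")


# Hypothetical numbers: a 200-unit space stretched by a factor of 1.5.
print(justified_text_element("justified line of text", 8400, 4800, 1.5, 200))
```

A factor of 1 yields `word-spacing='0'`, so unjustified text is unchanged.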

Herb Jellinek

Jun 20, 2025, 2:50:42 PM
to Ron Kaplan, Paolo Amoroso, Interlisp core
You are correct about where line breaks might fall.  Page breaks, too.

I created GitHub issues for the ragged-right and centering problems you and Paolo reported.

            Herb

Paolo Amoroso

Jun 20, 2025, 3:11:50 PM
to Herb Jellinek, Interlisp core
NoteCards meets the web, take 2 😀 (courtesy of HTMLSTREAM and Hardcopy)

ncdemo1.html
ncdemo2.html

Herb Jellinek

Jun 20, 2025, 3:24:42 PM
to Paolo Amoroso, Interlisp core
I've been thinking again about how we might enable links like HTML's <a> tag in TEdit.  We discussed this a few months ago.

It would be easiest to create an imageobj that serves the purpose, like this 🔗, I suppose.  But would it be straightforward to add a sort of "character looks" to TEdit, so you could add a link to a section of a document as in HTML?

            Herb

Larry Masinter

Jun 22, 2025, 2:37:49 PM
to Herb Jellinek, Paolo Amoroso, Interlisp core
Seems to me that you'll need to add a 'link from imageobj' anyway to handle <A> around <img> and the like.

So while some HTML hyperlinks from runs of text could be modelled as extended "looks", it's a bigger change to the underlying model for not much gain.
 


Nick Briggs

Jun 25, 2025, 1:09:29 PM
to Paolo Amoroso, Herb Jellinek, Lisp Core
In the calendar.html you included, when displayed in Safari, the header renders as (viewing page source, and screenshot)

        <text class='nsm sz10' fill='#000000' x='0' y='75600'>27</text>
        <text class='nsf sz24 bold' fill='#000000' x='8400' y='4800'>en·v</text>
        <text class='nsf sz18 bold' fill='#000000' x='15332' y='4800'>Ì„</text>
        <text class='nsf sz18 bold' fill='#000000' x='15649' y='4800'>o</text>
        <text class='nsf sz24 bold' fill='#000000' x='17108' y='4800'>s</text>
        <text class='ns sz10' fill='#000000' x='47202' y='4800'>CALENDAR</text>

Screenshot 2025-06-25 at 10.04.21 AM.png

Is that how it rendered on-screen for you?  The rest of the text looks reasonable.



Herb Jellinek

Jun 25, 2025, 1:31:13 PM
to Nick Briggs, Paolo Amoroso, Lisp Core
It looked right in Chrome and, I think, Firefox, the other day.

Now, however, it renders in both with a Japanese (虅) and Simplified Chinese (路) character:

<text class='nsf sz24 bold' fill='#000000' x='8400' y='4800'>en路v</text>
<text class='nsf sz18 bold' fill='#000000' x='15332' y='4800'>虅</text>
<text class='nsf sz18 bold' fill='#000000' x='15649' y='4800'>o</text>
<text class='nsf sz24 bold' fill='#000000' x='17108' y='4800'>s</text>

(I can't account for the difference except to blame my own memory.)

That "·" stuff looks like the sort of garble you see when a client misinterprets UTF-8.  Maybe Safari wants the document to declare its encoding?  I'll investigate that and what the XCCS characters were in the TEdit document.
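
A minimal sketch of the kind of misdecoding described here. It assumes the header contains a middle dot (U+00B7) and a combining macron (U+0304) written as UTF-8, which is an inference from the renderings quoted in this thread rather than a confirmed reading of the file. The `misdecode` helper is hypothetical.

```python
MIDDLE_DOT = "\u00b7"        # the dot in the header text
COMBINING_MACRON = "\u0304"  # the diacritic emitted as its own <text> element


def misdecode(text: str, wrong_encoding: str) -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong encoding,
    as a client does when it guesses the document's charset incorrectly."""
    return text.encode("utf-8").decode(wrong_encoding)


# Read as Windows-1252, each two-byte UTF-8 sequence becomes two characters.
print(repr(misdecode(COMBINING_MACRON, "cp1252")))  # '\u00cc\u201e', i.e. I-grave + low double quote
print(repr(misdecode(MIDDLE_DOT, "cp1252")))        # '\u00c2\u00b7', i.e. A-circumflex + middle dot

# Read as GBK, the middle dot's two bytes become a single Chinese character.
print(repr(misdecode(MIDDLE_DOT, "gbk")))           # '\u8def'
```

The Windows-1252 result for the macron matches the two-character garble in the page source Nick quoted, and the GBK result for the middle dot matches the Chinese character above, so both renderings are consistent with the same UTF-8 bytes being decoded under different guessed charsets.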

            Herb

Larry Masinter

Jun 25, 2025, 2:48:38 PM
to Herb Jellinek, Nick Briggs, Paolo Amoroso, Lisp Core

Always declare the encoding of your document using a meta element with a charset attribute, or using the http-equiv and content attributes (called a pragma directive). The declaration should fit completely within the first 1024 bytes at the start of the file, so it's best to put it immediately after the opening head tag.
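
A rough way to check a generated file against that advice: the sketch below looks for a complete `<meta>` charset declaration inside the first 1024 bytes. The regular expression is a simplification of what browsers actually do when prescanning, and `declared_charset` is a hypothetical helper name.

```python
import re
from typing import Optional

# Matches both <meta charset="utf-8"> and the http-equiv/content pragma form.
# The closing ">" is required so a declaration cut off by the window is ignored.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9_\-]+)[^>]*>",
    re.IGNORECASE,
)


def declared_charset(html_bytes: bytes, window: int = 1024) -> Optional[str]:
    """Return the charset declared within the first `window` bytes, or None."""
    match = META_CHARSET.search(html_bytes[:window])
    return match.group(1).decode("ascii").lower() if match else None


sample = b'<!DOCTYPE HTML>\n<html>\n<head>\n<meta charset="utf-8">\n<title>t</title>'
print(declared_charset(sample))  # utf-8
```

A file with no declaration, or one that appears after the first 1024 bytes, returns `None`.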







Paolo Amoroso

Jun 25, 2025, 2:51:27 PM
to Nick Briggs, Herb Jellinek, Lisp Core
Firefox on Linux renders it like this:

envos.png

Herb Jellinek

Jun 25, 2025, 3:02:58 PM
to Paolo Amoroso, Nick Briggs, Lisp Core
And on my Mac running macOS 13.7.6, Safari 18.5 renders it identically to Firefox and Chrome.

screenshot_905.png

Inserting the pragma <meta charset="utf-8"> in the head doesn't change anything.

Very curious!

Nick, can you add that to CALENDAR.html and see if it fixes the garbled characters you're seeing?

<!DOCTYPE HTML>
<html>
<head>
<meta charset="utf-8">
<title>....


                Herb

Ron Kaplan

Jun 25, 2025, 3:48:32 PM
to Herb Jellinek, Paolo Amoroso, Nick Briggs, Lisp Core
FWIW, when I click on the link on my iPhone, the HTML source shows up as text.

Nick Briggs

Jun 25, 2025, 4:33:04 PM
to Herb Jellinek, Paolo Amoroso, Lisp Core
Yes, adding the <meta charset="utf-8"> makes it render the correct characters.  However, then another problem is revealed -- the macron that should be over the "o" appears offset to the left --

Screenshot 2025-06-25 at 1.20.47 PM.png
(as rendered by Safari) or

Screenshot 2025-06-25 at 1.22.32 PM.png

(as rendered by Google Chrome, which seems to assume utf-8 even without the charset declaration)

This probably isn't relevant, but if you take just the characters, without the HTML surrounding them, and have them rendered by, say, emacs or the Terminal, you get (view scaled up in emacs...)

Screenshot 2025-06-25 at 1.26.35 PM.png

which I presume is because that's a non-spacing diacritic, though in the HTML it's being explicitly placed.


Ron Kaplan

Jun 25, 2025, 4:42:24 PM
to Nick Briggs, Herb Jellinek, Paolo Amoroso, Lisp Core
That may be a consequence of the fact that MCCS has the diacritics in front of the base character, whereas Unicode puts them afterwards.

The TEdit line formatter should know how to do diacritics for the different output devices; for now it just tries to center the diacritic at the middle of the next character, which it assumes is the base.

And in some cases it should do substitutions of rendering characters for the base+diacritic sequence.

But right now, it doesn't really have reliable information about what is and what is not a diacritic; it just has a dumb predicate based on the particular character codes in charset 0.

I thought of making the distinction based on the advance-width of diacritics, but I'm not sure that is actually reflected in our various font metrics.  Perhaps that should be a character feature that we put into the Unicode tables so that we can back-fit it into MCCS.

Nick Briggs

Jun 25, 2025, 4:55:59 PM
to Ron Kaplan, Herb Jellinek, Paolo Amoroso, Lisp Core

On Jun 25, 2025, at 1:42 PM, Ron Kaplan <ron.k...@post.harvard.edu> wrote:

> That may be a consequence of the fact that MCCS has the diacritics in front of the base character, whereas Unicode puts them afterwards.

Yes.

> The TEdit line formatter should know how to do diacritics for the different output devices; for now it just tries to center the diacritic at the middle of the next character, which it assumes is the base.

Hmmm... but TEdit can never know all possible output devices -- I think TEdit should know what the input it's processing intended, and then it's the output device's job to position things (or encode them) accordingly.

> And in some cases it should do substitutions of rendering characters for the base+diacritic sequence.

Yes, and one may also find that where the author of the input document directly used a rendering code

Herb Jellinek

Jun 25, 2025, 5:12:50 PM
to Nick Briggs, Ron Kaplan, Paolo Amoroso, Lisp Core
I just pushed a new HTMLSTREAM version that writes the meta charset pragma.

https://github.com/hjellinek/HtmlStream/issues/10

            Herb

Matt Heffron

Jun 25, 2025, 11:24:52 PM
to Ron Kaplan, Nick Briggs, Herb Jellinek, Paolo Amoroso, Lisp Core
The Unicode Character Database https://www.unicode.org/ucd/ should have all the information about the characters. E.g., Alphabetic, numeric, punctuation, diacritic, UPPER/lower case, etc.
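
As an illustration of the kind of information available, Python's standard `unicodedata` module exposes a subset of the UCD. The sketch below looks up the general category and combining class for a few characters. The helper names are hypothetical, and treating category `Mn` as "is a diacritic" is a simplifying assumption.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few UCD properties for a single character."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),    # e.g. Lu, Ll, Nd, Po, Mn
        "combining": unicodedata.combining(ch),  # 0 for non-combining characters
    }


def is_nonspacing_mark(ch: str) -> bool:
    """True for characters in general category Mn (Mark, nonspacing)."""
    return unicodedata.category(ch) == "Mn"


for ch in ("A", "a", "7", "\u00b7", "\u0304"):
    print(f"U+{ord(ch):04X}", describe(ch))

print(is_nonspacing_mark("\u0304"))  # True
print(is_nonspacing_mark("o"))       # False
```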
