Character codes for different spaces?

Ron Kaplan

Mar 12, 2025, 5:54:59 PM

to Interlisp

In working through the various places that keys are bound to Tedit actions, I came across a list of space mappings:

(MSPACE 153) =231Q emquad?
(NSPACE 152) =230Q enquad?
(THINSPACE 159) =237Q
(FIGSPACE 154) =232Q

These are not legal XCCS codes, and they don't appear to be Alto-font codes. Any idea of where they might be defined, and what kinds of documents they might appear in?
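
The decimal and octal (Q-suffixed) forms in the list above can be cross-checked with a short Python sketch. The helper names `to_q` and `from_q` are made up for this illustration and are not part of Medley:

```python
# Space names and decimal codes exactly as listed in the message above.
SPACE_CODES = {"MSPACE": 153, "NSPACE": 152, "THINSPACE": 159, "FIGSPACE": 154}


def to_q(code: int) -> str:
    """Format a code in Q-suffixed octal notation, e.g. 153 -> '231Q'."""
    return f"{code:o}Q"


def from_q(text: str) -> int:
    """Parse a Q-suffixed octal literal, e.g. '231Q' -> 153."""
    return int(text.rstrip("Qq"), 8)


for name, code in SPACE_CODES.items():
    # Round-trip each code to confirm the two notations agree.
    assert from_q(to_q(code)) == code
    print(f"{name:10} {code:3d} = {to_q(code)}")
```

All four decimal values agree with the octal values quoted in the list.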

Tedit seems just to mark these spaces as TEXT characters so that they are passed over when looking for word boundaries to operate on.

(Also, 159 is one of the interrupt codes that I use in wheelscroll--a separate set of constraints.)

Matt Heffron

Mar 12, 2025, 6:43:21 PM

to Medley Interlisp core

They appear to be Press character codes.

In the file PRESS, in the function \PRESS.CONVERT.NSCHARACTER, three of these four spaces are generated from the corresponding NS character codes (along with some others).

The FIGSPACE is not handled here.

Ron Kaplan

Mar 12, 2025, 8:09:21 PM

to Matt Heffron, Medley Interlisp core

Interesting. If that's where they originated, there doesn't seem to be any reason at all for Tedit to know about them.

It may be reasonable to suppress word-breaking around those spaces, but then they should be assigned their XCCS codes (or eventually Unicode).

--
You received this message because you are subscribed to the Google Groups "Medley Interlisp core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lispcore+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/lispcore/055c5a18-675c-4803-a0b7-22551a37d6b5n%40googlegroups.com.

Ron Kaplan

Mar 12, 2025, 8:10:19 PM

to John Cowan, Interlisp

Yes, but the question is why those conceptual characters are associated with those particular numbers.

On Mar 12, 2025, at 4:49 PM, John Cowan <co...@ccil.org> wrote:

They have fixed widths relative to the font size: one em, one en (usually half an em), the space between adjacent quotation marks (usually 1/5 or 1/6 of an em), and the width of a European digit. They correspond to U+2003, U+2002, U+2009, and U+2007 respectively. They also have XCCS codes. See WP articles for details.


Nick Briggs

Mar 12, 2025, 9:27:29 PM

to Ron Kaplan, Matt Heffron, Lisp Core

If you do a FONTSAMPLER and look at, say, TimesRoman 12, with

(FontSample (FONTCREATE 'TIMESROMAN 12) 0 NIL 'DISPLAY)

you'll see (actually, you won't, since they're spaces...) those positions filled with the characters described.

As Matt points out --

(\PRESS.CONVERT.NSCHARACTER
  [LAMBDA (CHARCODE)                              (* jds " 4-Nov-85 08:02")
    (* Provide backward compatibility for extended-language characters in the PRESS
       printing environment. Converts certain of the NS characters into their
       equivalent PARC-internal charcodes)
    (SELCHARQ CHARCODE
      (357,55 (* em quad)
              153)
      (357,54 (* en quad)
              152)
      (357,57 (* Thin space)
              159)
      (357,44 (* en dash / figure dash)
              155)
      (357,45 (* em dash)
              156)
      (357,146 (* bullet)
               183)
      (0,251 (* left single quote)
             96)
      (0,271 (* right single quote)
             39)
      (\CHAR8CODE CHARCODE])
)
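
For readers who do not read Interlisp, the table in this function can be restated as a rough Python sketch. Two things here are assumptions rather than facts taken from the source: that a key such as 357,55 denotes character set 357 (octal) and character 55 (octal) packed into a 16-bit code, and that \CHAR8CODE simply keeps the low-order 8 bits. The names `xccs`, `NS_TO_PRESS`, and `press_convert_ns_character` are invented for the illustration:

```python
def xccs(charset_octal: str, char_octal: str) -> int:
    """Pack an octal 'charset,char' pair such as 357,55 into one 16-bit code.

    Assumption: the high byte is the character set, the low byte the character.
    """
    return (int(charset_octal, 8) << 8) | int(char_octal, 8)


# Transcribed from the SELCHARQ clauses: NS code -> PARC-internal code.
NS_TO_PRESS = {
    xccs("357", "55"): 153,   # em quad
    xccs("357", "54"): 152,   # en quad
    xccs("357", "57"): 159,   # thin space
    xccs("357", "44"): 155,   # en dash / figure dash
    xccs("357", "45"): 156,   # em dash
    xccs("357", "146"): 183,  # bullet
    xccs("0", "251"): 96,     # left single quote
    xccs("0", "271"): 39,     # right single quote
}


def press_convert_ns_character(charcode: int) -> int:
    """Return the mapped code, else fall back to the low 8 bits.

    Assumption: the fallback mirrors what \\CHAR8CODE does in the original.
    """
    return NS_TO_PRESS.get(charcode, charcode & 0xFF)


print(press_convert_ns_character(xccs("357", "55")))  # em quad
print(press_convert_ns_character(ord("A")))           # unmapped, passes through
```

Note that FIGSPACE (154) has no entry, which matches the earlier observation that it is not handled here.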

If TEDIT is working with an Alto/Press font on the display, does it need to know about those particular characters?


Ron Kaplan

Mar 12, 2025, 11:01:25 PM

to Nick Briggs, Matt Heffron, Lisp Core

Those characters were assigned different mappings in the font description in the old Bravo documentation. I use \ASCII2XCCSMAP in INTERPRESS to fix things up when Tedit coerces Ascii-font characters to NS encodings, and it maps (most of) these the way Press does. So there is still a little confusion.

I think I will fix up the Interpress table (maybe with multiple mappings in one direction) to bridge, and then use the XCCS names and codes for these characters, and base Tedit word-breaking on the XCCS codes. An unconverted Timesroman Tedit file just won't select, move over, or delete the same "words".
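
The plan above might look roughly like the following Python sketch. It is only an illustration, not Tedit or Interpress code: the table `LEGACY_TO_XCCS`, the helpers `normalize` and `words`, and the choice to break only on the ordinary space (code 32) are all assumptions. The XCCS values are taken from the 357,55 / 357,54 / 357,57 entries quoted earlier in the thread, read as octal character set and character; FIGSPACE (154) is left unmapped because no XCCS code for it is given in this discussion.

```python
# Legacy 8-bit space codes -> XCCS codes. A dict allows several legacy
# codes to map to the same XCCS code (many-to-one in this direction).
LEGACY_TO_XCCS = {
    153: (0o357 << 8) | 0o55,  # em quad
    152: (0o357 << 8) | 0o54,  # en quad
    159: (0o357 << 8) | 0o57,  # thin space
}

ORDINARY_SPACE = 32  # assumption: the only code that ends a word here


def normalize(codes):
    """Replace legacy space codes with XCCS codes; leave other codes alone."""
    return [LEGACY_TO_XCCS.get(c, c) for c in codes]


def words(codes):
    """Split on the ordinary space only, so the special spaces stay in-word."""
    result, current = [], []
    for c in normalize(codes):
        if c == ORDINARY_SPACE:
            if current:
                result.append(current)
                current = []
        else:
            current.append(c)
    if current:
        result.append(current)
    return result


# 'a', legacy em quad, 'b', ordinary space, 'c'
print(words([ord("a"), 153, ord("b"), 32, ord("c")]))
```

With this scheme the word-breaking logic only ever sees XCCS codes, which is why an unconverted file would group characters into different "words".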


John Cowan

Mar 14, 2025, 7:13:40 PM

to Ron Kaplan, Interlisp

They have fixed widths relative to the font size: one em, one en (usually half an em), the space between adjacent quotation marks (usually 1/5 or 1/6 of an em), and the width of a European digit. They correspond to U+2003, U+2002, U+2009, and U+2007 respectively. They also have XCCS codes. See WP articles for details.
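
The Unicode code points named above can be checked against Python's standard `unicodedata` module. This is a small sketch; the pairing of the MSPACE, NSPACE, THINSPACE, and FIGSPACE names with the code points follows the "respectively" ordering in the message and is otherwise an assumption:

```python
import unicodedata

# Names from the original list, paired with the code points given above.
SPACES = {
    "MSPACE": "\u2003",
    "NSPACE": "\u2002",
    "THINSPACE": "\u2009",
    "FIGSPACE": "\u2007",
}

for name, ch in SPACES.items():
    # Print the code point, its official Unicode name, and general category.
    print(f"{name:10} U+{ord(ch):04X} "
          f"{unicodedata.name(ch)} ({unicodedata.category(ch)})")
```

All four are in general category Zs (space separator).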

