Re: Fwd: Re: Lojban word processor for Windows?

Robert J. Chassell

Sep 19, 1999, 12:27:20 PM

>2. The typography Mark and I have been discussing is (to me, at least,
>and I imagine to Mark as well) a wholly separate issue from the text
>editor.

Yes, but since you can already type Chinese, Cyrillic, Ethiopic, and
Latin all in the same Emacs buffer, surely you can add the Tolkien
characters? The Emacs multilingual extension provides all sorts of
tools (and the next version will remove the need for fixed-width
fonts).

--
Robert J. Chassell b...@rattlesnake.com
Rattlesnake Enterprises http://www.rattlesnake.com

Bob LeChevalier (lojbab)

Sep 18, 1999, 11:46:33 AM

At 11:02 AM 9/18/99 +0100, Piermaria Maraziti wrote:
>From: Piermaria Maraziti <pier...@maraziti.it>
>
>At 11.36 18/09/1999 +0200, you wrote:
>
> >I don't see the point in having an extra word processor just for one
> >language. Also, I think there are far more important items on the
> >programming agenda: a good, portable glosser / automatic translator
> >of syntax-parsed lojban text.
>
>The idea is to do a WP with an integrated "good, portable glosser /
>automatic translator of syntax-parsed lojban text" - perhaps, at least for
>version "1.0", without the "automatic translator" part, except for the
>interlinearizing capabilities that will help so much in translating (and
>learning)!

Well, even that is probably version 2.0. Robin Turner, who originally
mentioned it on conlang, suggested that being able to click on a word and
call up its place structure (which implies calling up its tanru breakdown
if it is not in the dictionary) is the thing most needed for Lojban
writing. Having a dictionary lookup for English words would seem an
obvious thing to include at the same level. Such lookups are presumably
trivial utility routines by now, so if you started from that open software
base that several have mentioned, we have something that is small enough
for someone to tackle in their spare time in a few days.

The fancy version written from scratch that uses Java, and has the
functionality of the parser/glosser and the lujvo maker and typesetting
including Tolkienian stuff is a much bigger project - probably worth doing,
but not likely to have someone writing it very soon. You either have to
write it from scratch or translate the existing code. Just making existing
Turbo Pascal code portable has never been accomplished for any of our core
software (at least a dozen people started to translate LogFlash into C,
including using an auto-translator to help, and no one in 10 years ever got
a working program, though Eric Raymond got close. I suspect that the
glosser is bigger than LogFlash, though I haven't looked. Both are very
time consuming to debug and test).
----
lojbab ***NOTE NEW ADDRESS*** loj...@lojban.org
Bob LeChevalier, President, The Logical Language Group, Inc.
2904 Beau Lane, Fairfax VA 22031-1303 USA 703-385-0273
Artificial language Loglan/Lojban:
see Lojban WWW Server: http://xiron.pc.helsinki.fi/lojban/
Order _The Complete Lojban Language_ - see our Web pages or ask me.

Robert J. Chassell

Sep 20, 1999, 8:50:13 PM

You don't use the structured markup to denote what *font* you are
using but you might very well use it to denote what *language* you are
using.

Yes, you are right. That would be a good way to do it.

David Brookshire Conner

Sep 20, 1999, 3:18:52 PM

Robert J. Chassell writes:
> From: "Robert J. Chassell" <b...@rattlesnake.com>
>
> David Brookshire Conner <nell...@concentric.net> wrote:
[....]
> And typography, well, I'm a structured markup fiend. Mixing
> typography with word processing seems misguided and encourages lots
> of visually ugly documents...
>
> I am confused here. Suppose you are writing in Cyrillic, Tibetan,
> and Latin: do you use structured markup for the different fonts? I
> don't think so. I suspect you use markup for whether your Tibetan or
> Korean `Watch Out!' should be emphasized or not.

You don't use the structured markup to denote what *font* you are
using but you might very well use it to denote what *language* you are
using. Whether or not you do this depends on exactly what kind of
document you are writing.

Suppose you are writing a novel that includes fluently multilingual
characters. The *structure* may have more to do with who says what
than it does with what language someone is speaking.

Now suppose you are writing a textbook for learning a foreign
language. Here, clearly, marking the language can be quite important,
whether or not the languages in the book use the same glyphs. For
example, in a textbook on Russian, sections will describe Cyrillic,
including the characters, but the language will be English (well, it
will if *I* write it :-)

> Surely, structured markup is orthogonal to what glyphs are used for
> straight text?

Yes of course - I wasn't suggesting that.

Glyphs are not fonts. Unicode does not describe a font. Unicode
describes characters which have stereotypical appearances (the
glyph). You need a font to render something - that's typography, not
word-processing.

Hmmm. I suppose I'm getting definitional here, so here's how I'm using
things:

character - value in some sort of script

glyph - the archetypical appearance of a character; alt. the
particular appearance of a character represented by a particular font.

Word processing - rearranging characters (usually in groups, i.e.,
words :-)

(Structured) markup - notating the logical structure of a string of
characters.

Formatting - Mapping markup to particular renderings.

Typography - The subset of formatting concerned with fonts and
placement of glyphs (in sense 2 above).

Font - a set of graphical symbols with a mapping from symbol to
character. The map need not be complete, but is usually a function
(i.e., one graphical symbol is associated with one character. One
character may have many representations in the font).
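
As a rough illustration of that last definition (a sketch only: the glyph
names and the toy font below are invented, and Python is used simply for
convenience), a font can be modelled as a map from graphical symbol to
character. The map is a function in one direction, but the inverse is not:

```python
# Hypothetical toy font: glyph names and coverage are invented for illustration.
TOY_FONT = {
    "a.regular": "a",
    "a.swash": "a",    # a second graphical symbol for the same character
    "b.regular": "b",  # note: no glyph at all for "c", so the map is incomplete
}


def character_for(font, glyph):
    """The font's map is a function: each glyph names exactly one character."""
    return font[glyph]


def glyphs_for(font, character):
    """The inverse is not a function: a character may have many glyphs, or none."""
    return sorted(g for g, c in font.items() if c == character)
```

Asking for the glyphs of "a" gives two symbols, and asking for "c" gives an
empty list, which is the "need not be complete" case.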

> ... from taking a book from outline to camera-ready form...
>
> Gosh, a voice from out of history. :-) `Camera-ready' is only one kind
> of output format.

Of course.

> For the past couple of decades people I know have read
> manuals both online and printed: books go from outline to *two* forms,
> one of them `camera-ready', the other `display' ready.

Right - of course. This wasn't a manual. This was a textbook. Six
years ago (when I wrote it), Addison Wesley wasn't about to consider
distributing a textbook on CD ROM. Three years ago (when I was working
on the revision), they were, and had that project continued, I would
have produced both camera ready and display ready copy, most likely
from one SGML source.

> Oh, I know that. The problem is and has been for some years strictly
> legal: the Lucid/X Emacs people are unable to obtain the kinds of
> disclaimers/assignments that the lawyers I deal with require for
> widespread, safe distribution.

Ah, of course. Lawyers, gotta love em.

[...]

> Most programmers I know rightfully hate these sorts of legal concerns;
> or else they pay little attention because they lack experience and
> street smarts.

I hope I'm the former, not the latter :-)

Brook

---------
A computer's attention span is as long as its power cord.

---------
Fancy. Myth. Magic.
http://www.concentric.net/~nellardo/

Robert J. Chassell

Sep 20, 1999, 7:05:18 AM

David Brookshire Conner <nell...@concentric.net> wrote:

Sure, but the point that I was trying to make was that typography and
orthography were orthogonal to the question of a word processor.

Yes, they certainly are, or should be.

And typography, well, I'm a structured markup fiend. Mixing
typography with word processing seems misguided and encourages lots
of visually ugly documents...

I am confused here. Suppose you are writing in Cyrillic, Tibetan,
and Latin: do you use structured markup for the different fonts? I
don't think so. I suspect you use markup for whether your Tibetan or
Korean `Watch Out!' should be emphasized or not.

Surely, structured markup is orthogonal to what glyphs are used for
straight text?

... from taking a book from outline to camera-ready form...

Gosh, a voice from out of history. :-) `Camera-ready' is only one kind
of output format.

For the past couple of decades people I know have read
manuals both online and printed: books go from outline to *two* forms,
one of them `camera-ready', the other `display' ready. The two forms
provide different resolutions, different methods of search, different
portabilities, and so on. To some extent, each is truly different;
but in other ways, the differences are sufficiently regular that a
single manuscript can be the source for both kinds of output format.

Um, clearly you prefer GNU Emacs - the Lucid/X Emacs branch has
supported non-fixed width fonts for years :-)

Sorry, just had to take the cheap shot in the editor religious wars
:-)

Oh, I know that. The problem is and has been for some years strictly
legal: the Lucid/X Emacs people are unable to obtain the kinds of
disclaimers/assignments that the lawyers I deal with require for
widespread, safe distribution.

Without such disclaimers/assignments, it is easy for someone to repeat
what has happened in the past, namely to stick some code into a
program that gets used by major companies, then threaten the major
companies. Some people who tried this in the past got money from
several companies (out-of-court settlements) until faced in court by
DEC, at which point they lost. Sure, someone trying this with
Lucid/XEmacs would, we hope, lose because the XEmacs people have got
legally smarter over the years and they have a track record, but the
inclusion decision is still a question of how much time and money
you want to put into legal questions rather than programming
questions.

Most programmers I know rightfully hate these sorts of legal concerns;
or else they pay little attention because they lack experience and
street smarts.

--

Piermaria Maraziti

Sep 18, 1999, 4:37:02 AM

At 09.53 17/09/1999 -0500, you wrote:

>The idea of using Java sounds appealing for the same reasons given by the
>author below. There are some reservations, however. The Java wars are
>still raging and Java tends to be just a tad slow for the average
>end-user's expectations. I always know that if an app is dragging, then
>it's bound to be written in Java - but then, that might be Microsoft's way
>of "helping" java die out :)

Well... cooling all flames on Java and Microsoft I have a simple solution
to that.

Using Java 1.2 and only JFC (Swing) classes, you are compatible with almost
every compiler, among them Visual Cafe, which can compile to native code on
Windows 95/98/NT machines. You get quality software (thanks to a beautiful
and good language), speed on the vast majority of platforms, and
compatibility with Unix (and Macs!), which are burdened a little less than
M$'s systems by Java Virtual Machines.

Ciao!

PS: I'm project leader of a project budgeted around $400,000 with 5, soon
7, men on staff, almost entirely to be written in Java (using also Oracle 8
and Weblogic)

-------------------------------------------------------------------------
Piermaria Maraziti - pier...@maraziti.it - http://piermaria.maraziti.it
ait anuas [Ex Arcano] - ainulindale: - Discordia l'Eterno - +3934735GILDA
http://gilda.it http://www.pathos.it http://discussioni.org ICQ:744473
Gran Siniscalco del Leale Ordine della Cavalleria et Stregoneria Italica

Piermaria Maraziti

Sep 18, 1999, 6:02:16 AM

At 11.36 18/09/1999 +0200, you wrote:

>I don't see the point in having an extra word processor just for one
>language. Also, I think there are far more important items on the
>programming agenda: a good, portable glosser / automatic translator
>of syntax-parsed lojban text.

The idea is to do a WP with an integrated "good, portable glosser /
automatic translator of syntax-parsed lojban text" - perhaps, at least for
version "1.0", without the "automatic translator" part, except for the
interlinearizing capabilities that will help so much in translating (and
learning)!

Ciao!

David Brookshire Conner

Sep 18, 1999, 7:28:36 PM

Bob LeChevalier (lojbab) writes:
> From: "Bob LeChevalier (lojbab)" <loj...@lojban.org>

>
> At 11:02 AM 9/18/99 +0100, Piermaria Maraziti wrote:
> >From: Piermaria Maraziti <pier...@maraziti.it>
> >

> >At 11.36 18/09/1999 +0200, you wrote:
> > >I don't see the point in having an extra word processor just for one
> > >language.

Agreed - that's why I suggested an extension to an existing
editor. Since I know FrameMaker and Emacs best (and neither is
provided by the Evil Empire :-), I suggested extensions to those.

> > > Also, I think there are far more important items on the
> > >programming agenda: a good, portable glosser / automatic translator
> > >of syntax-parsed lojban text.
> >
> >The idea is to do a WP with an integrated "good, portable glosser /
> >automatic translator of syntax-parsed lojban text" - perhaps, at least for
> >version "1.0", without the "automatic translator" part, except for the
> >interlinearizing capabilities that will help so much in translating (and
> >learning)!

Erm, I'd be careful about what "the" idea is. See below.

> Well, even that is probably version 2.0. Robin Turner, who originally
> mentioned it on conlang, suggested that being able to click on a word and
> call up its place structure (which implies calling up its tanru breakdown
> if it is not in the dictionary) is the thing most needed for Lojban
> writing.

Yeah, this would seem to be the case - place structure is probably the
most opaque part of the language. Sure, natural languages have similar
problems (transitive vs intransitive verbs come to mind), but I'll bet
people learning English (or anyone trying to write in a non-native
tongue) would appreciate usage hints comparable to lojban place structure.

> Having a dictionary lookup for English words would seem an
> obvious thing to include at the same level.

Yes, especially with the part of speech identified and allowable
adjacent parts of speech and their meaning.

> Such lookups are presumably
> trivial utility routines by now, so if you started from that open software
> base that several have mentioned, we have something that is small enough
> for someone to tackle in their spare time in a few days.

Yep - the "hard" part is having the database of words to look up. But
le gi'uste looks regular enough in format that it could probably be
adapted pretty much as is.

So, presuming Emacs as a base, here are the stepping stones:

1. an interactive function that looks up the word under point (the
cursor) in various word databases - gismu, cmavo, etc.

2. An extension that provides tab-completion of partial words

3. A major mode that provides some basic functionality:
regexps to use with outline-minor-mode (ni'oni'o and the like)
notations for use with font-lock (what's a quoted piece of text
(string to font-lock), "paren" matching, etc)
bindings to 1. and 2.
binding to pipe buffer or region to parser

4. A major mode with more functionality
interactive syntax checking
auto-suggestion and elision of cmavo
spell-checking
dictation (!)
lujvo tools
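
To make steps 1 and 2 a little more concrete, here is a minimal sketch of the
lookup and completion logic. It is written in Python rather than Emacs Lisp
purely for illustration, and it assumes a tiny hand-typed word table in place
of the real gismu list; the function names and the table layout are
hypothetical, not part of any existing tool.

```python
import bisect
import re

# Hypothetical miniature word table; a real tool would load the full gismu
# and cmavo lists instead of these three hand-typed entries.
GISMU = {
    "klama": "x1 comes/goes to destination x2 from origin x3 via route x4 using means x5",
    "prami": "x1 loves x2",
    "tavla": "x1 talks to x2 about subject x3 in language x4",
}
SORTED_WORDS = sorted(GISMU)

# Lojban words are written with lowercase letters and the apostrophe.
WORD_RE = re.compile(r"[a-z']+")


def word_at(text, pos):
    """Return the word touching character offset pos (the 'point'), or None."""
    for match in WORD_RE.finditer(text):
        if match.start() <= pos <= match.end():
            return match.group()
    return None


def lookup_at(text, pos):
    """Step 1: look up the place structure of the word under point."""
    word = word_at(text, pos)
    return GISMU.get(word) if word else None


def complete(prefix):
    """Step 2: list every known word that starts with prefix."""
    start = bisect.bisect_left(SORTED_WORDS, prefix)
    matches = []
    for word in SORTED_WORDS[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches
```

With the text "mi klama le zarci" and point anywhere inside "klama",
lookup_at returns the place structure of klama; complete("p") offers "prami".
In an Emacs version the same two functions would presumably sit behind key
bindings in the major mode of step 3.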

> The fancy version written from scratch that uses Java, and has the
> functionality of the parser/glosser and the lujvo maker and typesetting
> including Tolkienian stuff is a much bigger project - probably
> worth doing,
> but not likely to have someone writing it very soon.

I just want to be clear on a few things:

1. Writing a text editor from scratch is an interesting student
exercise, but I don't see that you'd gain much for lojban
functionality over existing open source (or closed but extensible)
text editors.

2. The typography Mark and I have been discussing is (to me, at least,
and I imagine to Mark as well) a wholly separate issue from the text
editor. The issue got raised in the context of the text editor, but
type-setting and text editing are not the same thing (I write in
Emacs, but use TeX to type-set - and I'm damn glad I don't typeset in
Emacs).

3. Very specifically, the use of Tengwar for lojban is strictly an
esthetic exercise for me. I make no claims that it has any real
practical application beyond beauty (for a very conlang-ish kind of
beauty at that).

> You either have to
> write it from scratch or translate the existing code. Just making existing
> Turbo Pascal code portable has never been accomplished for any of our core
> software (at least a dozen people started to translate LogFlash into C,
> including using an auto-translator to help, and no one in 10 years ever got
> a working program, though Eric Raymond got close. I suspect that the
> glosser is bigger than LogFlash, though I haven't looked. Both are very
> time consuming to debug and test).

This doesn't surprise me - especially rewriting Pascal in C (something
that strikes me as a largely misguided effort at best, and positively
sadistic (or masochistic, depending on whether the task was assigned
or voluntarily undertaken) at worst).

For various kinds of parsing and transformational applications, I'd
suggest the programming language Haskell. It isn't widespread, but it
runs everywhere, and is quite well-suited to these kinds of things.

Brook

---------
All wiyht. Rho sritched mg kegtops awound?

ma...@kli.org

Sep 19, 1999, 12:06:14 PM

>From: David Brookshire Conner <nell...@concentric.net>
>Date: Sat, 18 Sep 1999 19:28:36 -0400 (EDT)
>Cc: loj...@onelist.com

>
>From: David Brookshire Conner <nell...@concentric.net>
>
>3. A major mode that provides some basic functionality:
> regexps to use with outline-minor-mode (ni'oni'o and the like)
> notations for use with font-lock (what's a quoted piece of text
> (string to font-lock), "paren" matching, etc)
> bindings to 1. and 2.
> binding to pipe buffer or region to parser

Paren matching will be nice; those terminators can get confusing in long
sentences, even when you're trying to be simple.

>4. A major mode with more functionality
> interactive syntax checking
> auto-suggestion and elision of cmavo
> spell-checking
> dictation (!)
> lujvo tools

Don't forget cmene-checking. Or maybe I deleted that.

>1. Writing a text editor from scratch is an interesting student
>exercise, but I don't see that you'd gain much for lojban
>functionality over existing open source (or closed but extensible)
>text editors.

It *could* be nice... But definitely AFTER we've learned lessons from
something simpler.

>2. The typography Mark and I have been discussing is (to me, at least,
>and I imagine to Mark as well) a wholly separate issue from the text
>editor. The issue got raised in the context of the text editor, but
>type-setting and text editing are not the same thing (I write in
>Emacs, but use TeX to type-set - and I'm damn glad I don't typeset in
>Emacs).

Agreed on all points, and I also use TeX (though I may have to break down
and find some M$-Word for some things; I can't seem to find anything on
Linux that can do TrueType kerning properly).

~mark

PILCH Hartmut

Sep 18, 1999, 5:36:32 AM

On 17 Sep 1999 ma...@kli.org wrote:

> >The idea of using Java sounds appealing for the same reasons given by the
> >author below. There are some reservations, however. The Java wars are
> >still raging and Java tends to be just a tad slow for the average
> >end-user's expectations. I always know that if an app is dragging, then
> >it's bound to be written in Java - but then, that might be Microsoft's
> >way of "helping" java die out :)

Byte-compiled java programs are just as fast as any other.

> Eh, the fine-points of distinction can often be avoided in a simple
> enough application, and word-processing hardly needs blinding speed,
> so the slowness shouldn't matter much in that application. Java
> sounds like a Good Plan to me.

What about just writing

- an ispell table for Lojban
- an Emacs mode for Lojban

?
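
One small piece of such a spelling check can be sketched directly, since every
gismu has the letter shape CVCCV or CCVCV. The snippet below is a Python
illustration only, not an ispell table, and the function name is made up; it
checks the shape alone and ignores the further rules about which consonant
pairs are permitted, so it accepts some strings that are not valid gismu.

```python
import re

# Lojban consonants and the vowels that can occur in gismu.
C = "[bcdfgjklmnprstvxz]"
V = "[aeiou]"

# A gismu is five letters long, shaped CVCCV or CCVCV.
GISMU_SHAPE = re.compile(f"(?:{C}{V}{C}{C}{V}|{C}{C}{V}{C}{V})")


def is_gismu_shaped(word):
    """True if word has the letter pattern of a gismu (shape check only)."""
    return GISMU_SHAPE.fullmatch(word) is not None
```

For example, "klama" and "tavla" pass, while "lojban" (six letters) and
"hello" (h is not a Lojban consonant) do not.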

I don't see the point in having an extra word processor just for one
language. Also, I think there are far more important items on the
programming agenda: a good, portable glosser / automatic translator
of syntax-parsed lojban text.

-phm

David Brookshire Conner

Sep 19, 1999, 8:33:04 PM

Robert J. Chassell writes:
> From: "Robert J. Chassell" <b...@rattlesnake.com>
>

> >2. The typography Mark and I have been discussing is (to me, at least,
> >and I imagine to Mark as well) a wholly separate issue from the text
> >editor.
>

> Yes, but since you can already type Chinese, Cyrillic, Ethiopic, and
> Latin all in the same Emacs buffer, surely you can add the Tolkien
> characters?

Sure, but the point that I was trying to make was that typography and
orthography were orthogonal to the question of a word processor. The
functionality of the word processor should be comparable no matter
what orthography it is using (as Mule/Emacs demonstrates in
spades). And typography, well, I'm a structured markup fiend. Mixing
typography with word processing seems misguided and encourages lots of
visually ugly documents - Microsoft products seem to be especially bad
in this regard, as their tools for supporting styles as semantic
mark-up are fairly crippled.

Just my biases. I guess I got anal from taking a book from outline to
camera-ready form. Made me really sensitive to the different tasks at
each stage.

It would certainly be nice to have lojban as a full-scale language
supported from top to bottom, with multiple possible entry forms
(type, lookup, dictate) and renderings (latin, tengwar, spoken,
others).

> The Emacs multilingual extension provides all sorts of
> tools (and the next version will remove the need for fixed-width
> fonts).

Um, clearly you prefer GNU Emacs - the Lucid/X Emacs branch has
supported non-fixed width fonts for years :-)

Sorry, just had to take the cheap shot in the editor religious wars
:-)

Brook

---------
Go ahead, make my data!
