One possible design feature that I don't think has been used before is a gematria-based checksum for all words or morphemes. If the speaker had to calculate the checksum on inflected words and add the appropriate suffix, the language probably wouldn't be speakable by humans in real time. But suppose each phoneme/letter has a given gematriatic value, and any validly formed morpheme has to have phonemes whose gematria values add up to an even number, or a number divisible by three, or something -- that would allow a certain amount of error detection and even correction without the speaker having to do any calculation on the fly. The language designer (or any speaker who wants to coin a new morpheme) just has to do the arithmetic when creating a new word. This would probably take the place of the kind of phonological redundancy I outlined on the CONLANG list a while ago, since it would be pretty hard to manage both kinds at once.
It also seems apt that the engelang could use a relatively large phoneme inventory, and an orthography that fits into ASCII -- so different phonetic values for capital and lowercase letters, and maybe some punctuation pressed into service as letters as well. (In fact the gematria values of the letters might be based on or the same as their ASCII values? But more likely we would assign gematria values in such a way that two valid morphemes differing by only one phoneme will differ by at least two distinctive features in that letter, as far as possible.)
Actually, I've done this with a conlang before. It was only a toy branch of one of my conlangs, and I never released it or anything. It was certainly interesting, but I think it's a little silly since it involves mapping phonemes to numbers--something I always try to avoid as a matter of personal aesthetics. But that's just me.
> I'd like to see something that's still within the domain of languages humans can use realistically.
> Gematria-based language might still be doable of course... but I'm not sure.
As long as the gematria applies to the forms of individual morphemes, it won't make the language any harder to learn and use than any other a priori language. It means a little extra work for the language designer(s), but the payoff is that we ensure no two morphemes are excessively similar.
Alternatively, we could use another method of ensuring phonological redundancy, like those I outlined in my article a while ago.
You said in your prospectus that we want the engelang to "display major innovation in language tech" -- I'm not sure I have any other innovative engelang ideas.
I have some ideas about syntax, but they're not terribly innovative. Probably we want to use a different word order than the artlang (apparently VSO) and auxlang (probably SVO), so maybe SOV -- but probably split-S active, more or less, with marking (either by case or postpositions) for specific theta roles. Also, probably some rules for omitting the postpositions in some circumstances for brevity's sake.
> You said in your prospectus that we want the engelang to "display major innovation in language tech" -- I'm not sure I have any other innovative engelang ideas.
Nor I, aside from NLF2DWS, but I don't think that it would be appropriate for CL101. I think it would be wise to brainstorm for a little while about this, so we can have a few different ideas to toss around and pick from.
And indeed, a gematric engelang would certainly fit the bill for major innovation. I have no objection to it on principle, only doubts about whether it can be a) plausibly used by humans in real time, b) developed (in major part at least) over the course of one textbook, and c) understood by intro textbook students.
(a) is mainly a question of good design - either very clever and suffusive, or realistically humble in its goals. (b) is likewise - good design tends to be a matter of simple core elements giving rise to complex results. Note that (a) does not mean "naturalistic"; that is IMO a very different constraint, and actually I would prefer that the engelang be relatively far from naturalistic while fitting (a), so as to provide a better spread of examples for the book. Naturalism is the artlang's job, and a posteriori is the auxlang's.
(c) would mean that it can't have gematric concepts more advanced than can be introduced in a page or two - we're teaching linguistics not numerology after all. This probably is possible, but I honestly don't know enough about gematria to be sure.
I have no comments on the details of the syntax etc; my only involvement in the design of the languages is to ensure they fit the needs of the book.
P.S. A redundant language could also be a viable candidate IMO.
It need not (and should not) be restricted to phonology and phonotactics, of course, but should run throughout the language. It could be tailored for use in high-noise, high-risk environments that nevertheless need fast communication: militaries, for example, or aircraft control. In the real world it would likely need to be a posteriori, but we don't need to worry about that and can instead aim for the "ideal" version.
That reminds me: there should be an essay in ASP about getting conlangs popularized and spread. Is having an army really the only way? :-P Perhaps it could be done by meme or subversion, or by "conlang modules" that could be fit into any existing language?
> That reminds me: there should be an essay in ASP about getting conlangs popularized and spread. Is having an army really the only way? :-P Perhaps it could be done by meme or subversion, or by "conlang modules" that could be fit into any existing language?
Gary Shannon might be a person to ask for an essay on this, given his experiences with Larry Sulky's Ilomi and his own Kalusa. Or Sonja Kisa, re: Toki Pona.
I've observed that it's a lot easier to popularize someone else's conlang than your own. When someone is beating the drum for a conlang they invented themselves, people tune it out. If you're promoting a conlang someone else invented, some people at least will sit up and take notice; a disinterested recommendation seems more reliable. My impression is that Gary Shannon was more influential in popularizing Ilomi (during its few months of popularity) than Larry Sulky, and that jan Pije and others have been more influential in popularizing Toki Pona than Sonja Kisa.
Also, a language with two speakers is probably more than twice as attractive to learn as one with a single speaker.
> P.S. A redundant language could also be a viable candidate IMO.
> It need not (and should not) be restricted to phonology and phonotactics, of course, but should run throughout the language. It could be tailored for use in high-noise, high-risk environments that nevertheless need fast communication: militaries, for example, or aircraft control. In the real world it would likely need to be a posteriori, but we don't need to worry about that and can instead aim for the "ideal" version.
Yes, that's good; it gives the language a unifying theme and a design principle that applies in some way at the phonological, orthographic, grammatical and semantic levels.
> And indeed, a gematric engelang would certainly fit the bill for major innovation. I have no objection to it on principle, only doubts about whether it can be a) plausibly used by humans in real time
As long as it's a constraint on forms of valid root morphemes, rather than something speakers have to calculate while speaking, it should not prevent the language from being speakable in realtime.
> b) developed (in major part at least) over the course of one textbook
> c) understood by intro textbook students
> ...
> (c) would mean that it can't have gematric concepts more advanced than can be introduced in a page or two - we're teaching linguistics not numerology after all. This probably is possible, but I honestly don't know enough about gematria to be sure.
Sort the phonemes by manner of articulation and then by point of articulation. Then assign the counting numbers 1, 2, 3, and so on to all the phonemes in the inventory/letters in the alphabet. Say that a valid root word has to have letters whose gematria values add up to an even number (or a number divisible by three, for higher redundancy). I think we can explain that in "a page or two" -- probably in less space than it took me to explain my method for generating a set of morphemes which all differ by at least two phonemes.
E.g. (with a simpler phonology/alphabet than we will probably use):
i 1   e 2   u 3   o 4   a 5
k 6   t 7   p 8   x 9   s 10   f 11
So if evenness is the constraint, then "kit", "kix", "kif" are valid roots but "kik", "kip", "kis" are not valid, etc.
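To make the arithmetic concrete, here is a rough Python sketch of the evenness check using the toy table above. The function names are made up for illustration, and the sketch ignores phonotactics entirely (it will happily test strings that could never be roots). It also lists which single-letter slips of "kit" would still pass the check, which gives a feel for how much error detection evenness alone buys.

```python
# Toy gematria table from the example above (not a real proposal for the inventory).
GEMATRIA = {"i": 1, "e": 2, "u": 3, "o": 4, "a": 5,
            "k": 6, "t": 7, "p": 8, "x": 9, "s": 10, "f": 11}

def gematria_sum(root):
    """Add up the gematria values of the letters in a root."""
    return sum(GEMATRIA[letter] for letter in root)

def is_valid_root(root, modulus=2):
    """A root passes when its sum is divisible by the modulus (2 = evenness)."""
    return gematria_sum(root) % modulus == 0

def undetected_substitutions(root, modulus=2):
    """List single-letter substitutions of a root that still pass the check.

    Phonotactics are ignored here; every letter is tried in every position.
    """
    misses = []
    for i, original in enumerate(root):
        for letter in GEMATRIA:
            if letter != original:
                candidate = root[:i] + letter + root[i + 1:]
                if is_valid_root(candidate, modulus):
                    misses.append(candidate)
    return misses

for root in ["kit", "kix", "kif", "kik", "kip", "kis"]:
    print(root, gematria_sum(root), is_valid_root(root))

misses = undetected_substitutions("kit")
print(len(misses), "of", 10 * len("kit"), "single substitutions go undetected")
```

With this table, a substitution slips through whenever the wrong letter has the same parity as the right one, so evenness by itself catches only about half of single-letter errors. A larger modulus, or a smarter assignment of values to phonemes, would presumably do better.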
Within a specific category of words you might have a tighter constraint. E.g., if all color-adjectives have to have the same gematria sum, you might have colors "kos", "tifi", "xup", "asa", and so forth, all adding up to 20.
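The per-category constraint can be checked the same way. This is only a sketch with the toy table and the four example color words; the target of 20 comes from the example, and the helper names are invented.

```python
# Toy gematria table from the earlier example.
GEMATRIA = {"i": 1, "e": 2, "u": 3, "o": 4, "a": 5,
            "k": 6, "t": 7, "p": 8, "x": 9, "s": 10, "f": 11}

COLOR_SUM = 20  # every color adjective must add up to this (from the example)

def gematria_sum(word):
    """Add up the gematria values of the letters in a word."""
    return sum(GEMATRIA[letter] for letter in word)

def fits_category(word, target=COLOR_SUM):
    """True when the word's gematria sum equals the category's fixed sum."""
    return gematria_sum(word) == target

for word in ["kos", "tifi", "xup", "asa"]:
    print(word, gematria_sum(word), fits_category(word))

# A one-letter slip such as "kas" for "kos" changes the sum, so it is caught.
print("kas", gematria_sum("kas"), fits_category("kas"))
```

Because every letter has a distinct value, any single-letter substitution changes the sum, so a fixed-sum category detects all such errors among words the listener already knows to be colors. It would not catch two compensating errors or a transposition.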
(This is without considering self-segregating morphology, which we probably want as well.)
Other thoughts on redundancy -- "belt and suspenders" grammar:
- use both case marking AND adpositions AND word order
- repeat conjunctions when used with a list (e.g. "eat and apples and oranges and mangoes" rather than "eat apples, oranges and mangoes")
- use both a question particle and an inversion of word order to mark yes/no questions
- use a sentence-level question particle even when a more specific interrogative word is present, as in Japanese and unlike in Esperanto
How might it work on the semantic level? Probably it would make lexical distinctions between concepts that natural languages typically conflate because context would distinguish the sense meant.
On this subject I've been collecting some notes and ideas concerning what might make a conlang "popular". So far it's just several pages of unorganized notes. I could post an outline under the ASP banner if anyone is interested.
I have a few, weirder, ideas for syntax. The stuff you're proposing, while quite interesting, doesn't quite feel abstract and confusing enough for an engelang. Though of course this is for an introduction, so we can't get too weird. If I were doing this entirely for myself, I'd base it on some abstract mathematical concept. I'm not sure if that's the best idea though.
> I have some ideas about syntax, but they're not terribly innovative. Probably we want to use a different word order than the artlang (apparently VSO) and auxlang (probably SVO), so maybe SOV -- but probably split-S active, more or less, with marking (either by case or postpositions) for specific theta roles. Also, probably some rules for omitting the postpositions in some circumstances for brevity's sake.
More accurately it would be put as maximizing the Principle of Noise Resistance (PNR).
The principles listed in the essay are indeed conflicting and meant to be so; they're tradeoffs that need to be made. It's certainly not the full scope of what an engelang could be (viz. the gematria idea) but perhaps it could suggest some other viable ones, or multiple goals for one language?
On 10/12/06, spaced...@gmail.com <spaced...@gmail.com> wrote:
> I have a few, weirder, ideas for syntax. The stuff you're proposing, while quite interesting, doesn't quite feel abstract and confusing enough for an engelang. Though of course this is for an introduction, so we
So, go ahead. Tell us what your few weirder ideas for syntax are.
I don't know if this is along the lines of what you're looking for, but a propaganda engelang of mine modifies its syntax based on the "respect" of the subject. Depending on the subject's social stature, the syntax of a sentence can be either SVO or OVS. For example, "Le gat kuolwi" means "the dog bit the man" and "Vraskwi gatstaal kuol" means "the king bit the dog."
BTW, "wi" is a tense marker, not a subject marker.
On Oct 11, 10:39 pm, "Sai Emrys" <sai...@gmail.com> wrote:
I really hope my previous reply got through, because I don't want to double-reply but I'm willing to repost the same information on my ideas for syntax if needed. In any case, my most thoroughly worked-out proposal in the post is to force the language to conform to group theory.
On Oct 13, 9:35 am, "Jim Henry" <jimhenry1...@gmail.com> wrote:
> On 10/12/06, spaced...@gmail.com <spaced...@gmail.com> wrote:
> > I have a few, weirder, ideas for syntax. The stuff you're proposing, while quite interesting, doesn't quite feel abstract and confusing enough for an engelang. Though of course this is for an introduction, so we
> So, go ahead. Tell us what your few weirder ideas for syntax are.
Well, I guess it didn't get through. In any event, here are some thoughts on group theoretic syntax:
Group theory, for those who may not know, is a branch of abstract algebra that deals with a certain class of entities characterized by the following:
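1) A group is a set of objects together with a binary operation on them, usually written as *, that combines any two members of the set into another member of the set. The operation has the following properties:
1a) Associativity: (a*b)*c = a*(b*c) for all a, b, c
1b) Neutral element: there is an element e such that a*e = e*a = a for every a
1c) Inverses: for every a there is some b such that a*b = b*a = e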
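For anyone who would rather see the axioms as something checkable, here is a small brute-force sketch in Python. The function is_group and the two mod-5 examples are my own illustration, not anything tied to a particular language design; it only works for small finite sets.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the group axioms on a small finite set."""
    elements = list(elements)
    # Closure: combining two members must give another member.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (a*b)*c == a*(b*c) for every triple.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Neutral element: some e with a*e == e*a == a for every a.
    neutral = [e for e in elements
               if all(op(a, e) == a and op(e, a) == a for a in elements)]
    if not neutral:
        return False
    e = neutral[0]
    # Inverses: every a needs some b with a*b == b*a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

def add_mod5(a, b):
    return (a + b) % 5

def mul_mod5(a, b):
    return (a * b) % 5

print(is_group(range(5), add_mod5))  # True
print(is_group(range(5), mul_mod5))  # False: 0 has no inverse
```

The second example fails only on the inverse axiom, which is the same axiom that causes trouble when the set is a vocabulary.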
The mathematics behind this is quite rich and well worked-out. The problem with cramming language into it is this:
If you make the set of objects used by the group the set of words or some other syntactic or semantic unit and you make the operation equal to syntactic or semantic composition, then you face a problem presented by inverses. If you have it as a group and want all the power that comes with that, you need inverses.
For us that means that every unit we're working with has to have an inverse somewhere that undoes the operation, so for "dog" we have an anti-dog that, when the two are combined, results in a null return. The full ramifications of this I haven't worked out yet, but it'd be interesting.
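One way to picture this, purely as a sketch: treat an utterance as a sequence of words, take concatenation as the operation, and let an adjacent word and its inverse cancel. That is the standard "free group" construction over a vocabulary; the "anti-" prefix is just a stand-in I made up for however the language would actually mark inverses.

```python
def inverse(word):
    """Return the hypothetical inverse of a word: 'dog' <-> 'anti-dog'."""
    if word.startswith("anti-"):
        return word[len("anti-"):]
    return "anti-" + word

def combine(*words):
    """Compose words left to right, cancelling adjacent word/inverse pairs.

    The empty list plays the role of the neutral element.
    """
    stack = []
    for word in words:
        if stack and stack[-1] == inverse(word):
            stack.pop()          # word meets its inverse: both vanish
        else:
            stack.append(word)
    return stack

print(combine("dog", "anti-dog"))                      # []
print(combine("the", "dog", "anti-dog", "cat"))        # ['the', 'cat']
print(combine("dog", "cat", "anti-cat", "anti-dog"))   # []
```

Whether an empty utterance, or a sentence with "dog anti-dog" buried in it, can be given any sensible meaning is exactly the part that remains to be worked out.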
Of course this is kind of an extreme example. There are a lot of other formalisms that provide wacky and engineery possibilities. What would be interesting--and again, possibly not appropriate for a beginner's book--would be considering different semantic theories, and then tying them into some sort of novel syntax.
I think that this is the most viable suggestion so far for an engelang basis: noise resistance, essentially.
So, how's this for a language spec:
* maximally tolerant of noise in all modes, e.g. critical messages in a combat situation not tolerant of mistakes
* maximally brief / terse, within that limitation (can't waste time)
* maximally simple - don't communicate info that's not relevant to the situation, but communicate everything that is
* possibly runtime-encrypted in some manner, to minimize the ability of people other than those in the discussion to understand what is being talked about (e.g. by requiring multimodality, or some sort of shared key, or ...?)
I think this poses a few interesting problems (primarily the first and last points), with good constraints, that should be relevant at all levels of the language.
> So, how's this for a language spec:
> * maximally tolerant of noise in all modes, e.g. critical messages in a combat situation not tolerant of mistakes
In addition to the phonological redundancy technique I outlined in the essay linked above, there is another technique that could be used as well or instead: no two phonemes differ by less than two distinctive features. E.g. you might have /p/, /z/, /x/, but having those would rule out /b/, /d/, /k/, /G/. If two phonemes are in the same or nearly the same point of articulation they must have different manner of articulation and voicing.
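A quick way to test a candidate inventory against this rule is sketched below. It makes a big simplifying assumption: each phoneme is described by just three features (place, manner, voicing), and any two different places count as one full feature apart, which is cruder than "the same or nearly the same point of articulation". The symbols are X-SAMPA style, so G is the voiced velar fricative; the function names are invented.

```python
from itertools import combinations

# (place, manner, voicing) for the phonemes in the example above.
FEATURES = {
    "p": ("bilabial", "stop", "voiceless"),
    "b": ("bilabial", "stop", "voiced"),
    "d": ("alveolar", "stop", "voiced"),
    "z": ("alveolar", "fricative", "voiced"),
    "k": ("velar", "stop", "voiceless"),
    "x": ("velar", "fricative", "voiceless"),
    "G": ("velar", "fricative", "voiced"),
}

def feature_distance(a, b):
    """Count how many of the three features differ between two phonemes."""
    return sum(fa != fb for fa, fb in zip(FEATURES[a], FEATURES[b]))

def too_close(inventory, minimum=2):
    """Return every pair in the inventory that differs by fewer than `minimum` features."""
    return [(a, b) for a, b in combinations(inventory, 2)
            if feature_distance(a, b) < minimum]

print(too_close(["p", "z", "x"]))  # []
for extra in ["b", "d", "k", "G"]:
    print(extra, too_close(["p", "z", "x", extra]))
```

Under these assumptions /p/, /z/, /x/ pass, and adding any one of /b/, /d/, /k/, /G/ produces at least one offending pair, matching the example.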
> * maximally brief / terse, within that limitation (can't waste time)
This is pretty hard to reconcile with the high noise resistance criterion, which makes an interesting challenge.
> * maximally simple - don't communicate info that's not relevant to the situation, but communicate everything that is
That's good, probably.
> * possibly runtime-encrypted in some manner, to minimize the ability of people other than those in the discussion to understand what is being talked about (e.g. by requiring multimodality, or some sort of shared key, or ...?)
I suspect this would render the language unspeakable by normal humans.
Maybe if it is based on a systematic polysemy in the roots, with some shared key indicating which sense of all roots is intended?
E.g., if all roots have five senses (and any core concept can be expressed by any of five roots), a simple key might be a number from one to five that tells you in which sense each root in the following utterance is meant. Or a "polyalphabetic" cypher might have a series of several numbers, e.g. 2-1-4, meaning that the 1st, 4th, 7th etc. morphemes are meant in sense #2, the 2nd, 5th, 8th etc. morphemes are meant in sense #1, and the 3rd, 6th, 9th etc. morphemes are meant in sense #4. That is almost certainly not speakable or understandable in realtime.
On 10/28/06, Jim Henry <jimhenry1...@gmail.com> wrote:
> > * maximally brief / terse, within that limitation (can't waste time)
> This is pretty hard to reconcile with the high noise resistance criterion, which makes an interesting challenge.
*grin* Exactly.
Redundancy alone without a balancing factor is too easy, because you could just take the tack the Army currently uses: repetition. Boooring.
Whereas your suggestions - e.g. minimum two distinctive features - are more compact IMO.
We'd have to further elaborate what "noise" means. Speaking under stress? Radio static? Bombs? Distinguishing people in the squad from each other (what if each person has a unique 'tag' of talking so that you can easily tell who said what without looking or hearing well enough to make out their voice signature)?
> > * possibly runtime-encrypted in some manner, to minimize the ability of people other than those in the discussion to understand what is being talked about (e.g. by requiring multimodality, or some sort of shared key, or ...?)
> I suspect this would render the language unspeakable by normal humans.
> Maybe if it is based on a systematic polysemy in the roots, with some shared key indicating which sense of all roots is intended?
> E.g., if all roots have five senses (and any core concept can be expressed by any of five roots), a simple key might be a number from one to five that tells you in which sense each root in the following utterance is meant. Or a "polyalphabetic" cypher might have a series of several numbers, e.g. 2-1-4, meaning that the 1st, 4th, 7th etc. morphemes are meant in sense #2, the 2nd, 5th, 8th etc. morphemes are meant in sense #1, and the 3rd, 6th, 9th etc. morphemes are meant in sense #4. That is almost certainly not speakable or understandable in realtime.
Shared root key is certainly possible, but it's eminently guessable contextually unless the roots have significant overlap & dissimilarity. That would be cognitively difficult; a shared root system would probably be easier to implement as something like metaphorical equivalencies across domains, e.g. cooking-knife vs combat-knife (bayonet) vs verbal-knife (good argument) vs etc-knife. More abstractly for most things of course.
I don't understand what you mean by the polyalph cypher. Elaborate / example?
I think this is potentially a very interesting problem. It would need to be attacked, IMO, in a different way than standard encryption. Make it somehow dependent on context, pragmatics. Possibly some way of varying jargon by person? (Would be possible for long-term committed squads, but not between random elements of an army meeting for the first time...)
Another element is the multimode thing - e.g. if it's simultaneously spoken + signed it could be designed in such a way that losing either makes the message unintelligible. Obviously only works if you *can* use both modes, but I think that's an acceptable restriction.
> On 10/28/06, Jim Henry <jimhenry1...@gmail.com> wrote:
> We'd have to further elaborate what "noise" means. Speaking under stress? Radio static? Bombs? Distinguishing people in the squad from each other (what if each person has a unique 'tag' of talking so that you can easily tell who said what without looking or hearing well enough to make out their voice signature)?
> > > * possibly runtime-encrypted in some manner, to minimize the ability of .....
> > Maybe if it is based on a systematic polysemy in the roots, with some shared key indicating which sense of all roots is intended?
> > E.g., if all roots have five senses (and any core concept can be expressed by any of five roots), a simple key might be a number from one to five that tells you in which sense each root in the following utterance is meant. Or a "polyalphabetic" cypher might have a series of several numbers, e.g. 2-1-4, meaning that the 1st, 4th, 7th etc. morphemes are meant in sense #2, the 2nd, 5th, 8th etc. morphemes are meant in sense #1, and the 3rd, 6th, 9th etc. morphemes are meant in sense #4. That is almost certainly not speakable or understandable in realtime.
> Shared root key is certainly possible, but it's eminently guessable contextually unless the roots have significant overlap & dissimilarity. That would be cognitively difficult; a shared root system would probably be easier to implement as something like metaphorical equivalencies across domains, e.g. cooking-knife vs combat-knife (bayonet) vs verbal-knife (good argument) vs etc-knife. More abstractly for most things of course.
That makes some amount of sense.
> I don't understand what you mean by the polyalph cypher. Elaborate / example?
So ... in a bit simpler system, where every root has three senses, you might have words like
tef - 1. man 2. dog 3. tree
kaS - 1. sing 2. bite 3. cut
then tef kaS tef
would mean something like "a man cuts (down) a tree" if the polysemic key in use is (or starts out with) 1-3-3, or "man bites dog" if the key is 1-2-2, etc. The main problem (as with any cryptographic system) is communicating the key. Or maybe the main problem is figuring out whether people can speak this kind of language in realtime: doubtful.
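Spelled out mechanically, decoding with such a key might look like the sketch below. The two-root lexicon is the toy one above; the decode function and the choice to repeat the key cyclically over longer utterances (as in the 2-1-4 description earlier) are just my way of illustrating it.

```python
from itertools import cycle

# Toy lexicon from the example: each root has three numbered senses.
LEXICON = {
    "tef": ["man", "dog", "tree"],
    "kaS": ["sing", "bite", "cut"],
}

def decode(utterance, key):
    """Pick a sense for each root, cycling through the key as needed.

    Key numbers are 1-based, as in the description above.
    """
    senses = []
    for root, sense_number in zip(utterance.split(), cycle(key)):
        senses.append(LEXICON[root][sense_number - 1])
    return senses

print(decode("tef kaS tef", [1, 3, 3]))  # ['man', 'cut', 'tree']
print(decode("tef kaS tef", [1, 2, 2]))  # ['man', 'bite', 'dog']
```

A computer does this trivially, of course; the open question is still whether a human listener could keep track of the position in the key while parsing speech.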