Long words

John Smith

Oct 10, 2022, 2:10:48 PM
What are the linguistic criteria to determine when a string of
characters is a word? The reason I am asking is because it is often said
that German has very long words, when in fact what seems to be the case
is that it is several separate words run together, each word having a
meaning in isolation. This is not the same thing as in Finnish, where one
can also have very long words by adding suffixes to a root, the suffixes
having no meaning in isolation.

Are the German and Finnish instances described above in the same
class when it comes to the linguistic concept of a word?

Peter T. Daniels

Oct 10, 2022, 4:02:10 PM
"Word" is not a particularly useful concept in linguistics. For some
languages, you can draw up criteria, which will be agreed upon. For
some, it's not so easy.

The linguist Jerry Sadock, who studied the Greenlandic language, put it
informally as "A word is what when you make a mistake in the middle of
it, you have to go back to the start to say the corrected version."
There's a more formal version in an article on the language in the journal
*Language*.

John Smith

Oct 10, 2022, 6:26:03 PM
OK, thanks for the explanation. It does make sense that a "word"
is a concept that will be strongly language-dependent - I imagine it
might even be difficult to define at all for some languages. Which I
guess renders the notion of "long word" valid within a specific language
alone, not across languages.

Ruud Harmsen

Oct 11, 2022, 3:52:25 AM
Mon, 10 Oct 2022 18:10:41 -0000 (UTC): John Smith
<12...@whatismyemailaddress.xyz> scribeva:

> What are the linguistic criteria to determine when a string of
>characters is a word?

None. It is a spelling convention, different in every language.

>The reason I am asking is because it is often said
>that German has very long words, when in fact what seems to be the case
>that it is several separate words run together, each word having a
>meaning in isolation.

Yes, writing composite words as one word. Dutch and German do, English
usually doesn't.

>This is not the same thing as in Finnish, where one
>can also have very long words by adding suffixes to a root, the suffixes
>having no meaning in isolation.
>
> Are the German and Finnish instances described above in the same
>class when it comes to the linguistic concept of a word?

German etc. also have suffixes, like in Machbarkeiten,
Mach-bar-keit-en, feasibilities.
--
Ruud Harmsen, http://rudhar.com

wugi

Oct 11, 2022, 5:18:07 AM
Op 10/10/2022 om 20:10 schreef John Smith:
Dutch and occasionally German do complicate those compounds by inserting
adverbial or case particles/suffixes:

kind, kinderen + school: kinderschool
rund, runderen (cow bull ox) + vlees (meat): rundervlees/rundsvlees
hond,-en + dag, dagen: hondsdagen (dog days)
vrouw,-en (women) + persoon: vrouwspersoon
etc.

--
guido wugi

wugi

Oct 11, 2022, 5:28:32 AM
Op 11/10/2022 om 11:18 schreef wugi:
I forgot the (in)famous "tussen-n"
rug, ruggen (back) + graat ([fish]bone) : ruggegraat, now ruggengraat
(backbone)
hart,-en + beest,-en: hartebeest, now presumably hartenbeest
I'm not even sure about harte()leed, harte()wens etc.

Personally I distinguish between
hartedief: sweetheart; and
hartendief: womaniser, ladykiller

--
guido wugi

Peter T. Daniels

Oct 11, 2022, 9:55:31 AM
On Tuesday, October 11, 2022 at 3:52:25 AM UTC-4, Ruud Harmsen wrote:
> Mon, 10 Oct 2022 18:10:41 -0000 (UTC): John Smith
> <12...@whatismyemailaddress.xyz> scribeva:

> > What are the linguistic criteria to determine when a string of
> >characters is a word?
>
> None. It is a spelling convention, different in every language.

Well, that's not correct, either.

In Hungarian (I believe you've been discussing that language recently)
each stressed syllable begins a new word; in Polish, a new word begins
with the second syllable after a stress, etc. In French, word spaces
clearly do not correspond with the spoken language, with both prefixed
and suffixed clitics and pronouns and such.

In Latin and Turkish, almost every word has desinences added to its
end. In Greek, only a limited number of phonemes could end a word.

It's when you get to the polysynthetic languages that finding criteria
other than Sadock's for demarcating words is difficult.
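
Taken at face value, the two stress rules above amount to simple segmentation procedures. What follows is only a minimal sketch in Python, assuming the input has already been split into syllables and marked for stress. The function names and the (syllable, stressed) input format are invented for illustration, and unstressed function words or stressed monosyllables would break both rules.

```python
def split_initial_stress(syllables):
    """Rule as stated for Hungarian: every stressed syllable starts a new word.

    syllables is a list of (text, stressed) pairs (a hypothetical input format).
    """
    words = []
    for text, stressed in syllables:
        if stressed or not words:
            words.append(text)      # open a new word
        else:
            words[-1] += text       # attach to the word in progress
    return words


def split_penultimate_stress(syllables):
    """Rule as stated for Polish: a word ends one syllable after the stress."""
    words, current = [], []
    close_after_next = False
    for text, stressed in syllables:
        current.append(text)
        if close_after_next:
            words.append("".join(current))
            current = []
            close_after_next = False
        elif stressed:
            close_after_next = True
    if current:                     # leftover syllables with no closing stress
        words.append("".join(current))
    return words


hungarian = [("jó", True), ("reg", True), ("gelt", False)]
polish = [("do", True), ("bra", False), ("ko", False), ("bie", True), ("ta", False)]
print(split_initial_stress(hungarian))
print(split_penultimate_stress(polish))
```

The sketch only shows why fixed stress makes word boundaries recoverable in principle; it says nothing about how reliably listeners or transcribers manage it in practice.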

Daud Deden

Oct 12, 2022, 3:34:03 AM
On Monday, October 10, 2022 at 2:10:48 PM UTC-4, John Smith wrote:
A word is merely an utterance with meaning, in Paleo-etymology, from before the era of written character symbols.
Although today suffixes might have no meaning, they probably did have in the past.

Peter T. Daniels

Oct 12, 2022, 11:00:56 AM
What is a suffix that "has no meaning"?

Do you know *any* language other than English?

NB for John Smith: "Paleo-etymology" is DD's own personal hobbyhorse,
which has nothing to do with linguistics or with facts about language.

Daud Deden

Oct 12, 2022, 1:01:38 PM
On Wednesday, October 12, 2022 at 11:00:56 AM UTC-4, Peter T. Daniels wrote:
> On Wednesday, October 12, 2022 at 3:34:03 AM UTC-4, daud....@gmail.com wrote:
> > On Monday, October 10, 2022 at 2:10:48 PM UTC-4, John Smith wrote:
>
> > > What are the linguistic criteria to determine when a string of
> > > characters is a word? The reason I am asking is because it is often said
> > > that German has very long words, when in fact what seems to be the case
> > > that it is several separate words run together, each word having a
> > > meaning in isolation. This is not the same thing as in Finnish, where one
> > > can also have very long words by adding suffixes to a root, the suffixes
> > > having no meaning in isolation.
> > >
> > > Are the German and Finnish instances described above in the same
> > > class when it comes to the linguistic concept of a word?
> >
> > A word is merely an utterance with meaning, in Paleo-etymology, from before the era of written character symbols.
> > Although today suffixes might have no meaning, they probably did have in the past.
> What is a suffix that "has no meaning"?

"... the suffixes having no meaning in isolation".

> Do you know *any* language other than English?

What 'language' is *"English"*? A tempero-geopolitical dialect of the human language, no?

> NB for John Smith: "Paleo-etymology" is DD's own personal hobbyhorse,
> which has nothing to do with linguistics or with facts about language.

Rather, it is the study of word evolution, from the perspective of primate communication.
Whereas (many) "linguists" appear to prefer to remain focused upon armies & navies and other neo-political shenanigans.

Ruud Harmsen

Oct 12, 2022, 3:14:11 PM
>On Wednesday, October 12, 2022 at 3:34:03 AM UTC-4, daud....@gmail.com wrote:
>> A word is merely an utterance with meaning, in Paleo-etymology, from before the era of written character symbols.
>> Although today suffixes might have no meaning, they probably did have in the past.

Wed, 12 Oct 2022 08:00:55 -0700 (PDT): "Peter T. Daniels"
<gram...@verizon.net> scribeva:
>What is a suffix that "has no meaning"?
>
>Do you know *any* language other than English?

Malay.

https://rudhar.com/lingtics/intrlnga/toentcqu.htm

Ruud Harmsen

Oct 12, 2022, 3:17:36 PM
>On Wednesday, October 12, 2022 at 11:00:56 AM UTC-4, Peter T. Daniels wrote:
>> Do you know *any* language other than English?

Wed, 12 Oct 2022 10:01:36 -0700 (PDT): Daud Deden
<daud....@gmail.com> scribeva:
>What 'language' is *"English"*?

Quote of the year.

>A tempero-geopolitical dialect of the human language, no?

No.

>> NB for John Smith: "Paleo-etymology" is DD's own personal hobbyhorse,
>> which has nothing to do with linguistics or with facts about language.
>
>Rather, it is the study of word evolution, from the perspective of primate communication.

No.

>Whereas (many) "linguists" appear to prefer to remain focused upon armies & navies and other neo-political shenanigans.

No.

Daud Deden

Oct 12, 2022, 3:47:20 PM
Ruud, you no everything and know nothing about Paleo-etymology, since it is not mentioned in your bible, wikipedia.
Now just try to answer this question: "What are the linguistic criteria to determine when a string of
characters is a word?"

DKleinecke

Oct 12, 2022, 6:43:11 PM
Bad question. You need to first announce your definition of "character". That is,
your assumed universal phonology. Then you can define "word" within that
phonology. I see no reason for assuming a one-to-one correspondence between
phonological words and semantic objects.

Daud Deden

Oct 12, 2022, 10:38:56 PM
The question was first posted/posed by John Smith, I pasted it for Ruud's benefit, to prevent him from going too far off track.

My earlier response seems adequate:

DKleinecke

Oct 12, 2022, 11:36:50 PM
I am beginning to suspect you don't know what phonology means. The
fact that you refer to characters rather than phonemes makes that
more likely. You can't discuss words without discussing phonemes and
the rest of the machinery of phonology. PTD suggests you don't know
any language except English - somebody adds Malay. You need to learn
some linguistics to avoid saying silly things. It isn't all like English.

Daud Deden

Oct 13, 2022, 2:37:47 AM
https://biologydictionary.net/ 'phoneme' term not found.
https://www.biologyonline.com/search/phoneme term not found.
Oxford Medical Dictionary phoneme term not found.

I have no problem discussing words (in a bio- evolutionary context) without discussing phonemes.
Once again, my interest is in primate communication, which includes the human language, which includes "long words".
PTD is not a biologist, nor are you & Ruud.
I speak a local variant of the human language. Most everybody does.

So, then, back to 'long words'.

Ruud Harmsen

Oct 13, 2022, 3:57:12 AM
Wed, 12 Oct 2022 12:47:18 -0700 (PDT): Daud Deden
<daud....@gmail.com> scribeva:

>On Wednesday, October 12, 2022 at 3:17:36 PM UTC-4, Ruud Harmsen wrote:
>> >On Wednesday, October 12, 2022 at 11:00:56 AM UTC-4, Peter T. Daniels wrote:
>> >> Do you know *any* language other than English?
>> Wed, 12 Oct 2022 10:01:36 -0700 (PDT): Daud Deden
>> <daud....@gmail.com> scribeva:
>> >What 'language' is *"English"*?
>> Quote of the year.
>> >A tempero-geopolitical dialect of the human language, no?
>> No.
>> >> NB for John Smith: "Paleo-etymology" is DD's own personal hobbyhorse,
>> >> which has nothing to do with linguistics or with facts about language.
>> >
>> >Rather, it is the study of word evolution, from the perspective of primate communication.
>> No.
>> >Whereas (many) "linguists" appear to prefer to remain focused upon armies & navies and other neo-political shenanigans.
>> No.
>> --
>> Ruud Harmsen, http://rudhar.com
>
>Ruud, you no everything and know nothing about Paleo-etymology, since it is not mentioned in your bible, wikipedia.
>Now just try to answer this question: "What are the linguistic criteria to determine when a string of
>characters is a word?"

See previous contributions by Peter T. Daniels, who pointed out some
possible approaches and some difficulties.

My own practical definition is here:
https://rudhar.com/sfreview/siworin/siworin05.htm
Quote:
"A word is a sequence of alphabetic UTF-8 characters. Dashes (-) and
apostrophes (' or ’) may occur, except at the start and the end, for
English words like ‘don’t’ and ‘isn’t’. "
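
The quoted definition can be approximated with a regular expression. The sketch below is one possible reading in Python, not the code actually used on rudhar.com: the names LETTER, WORD_RE and extract_words are made up here, and it assumes that a dash or apostrophe must sit between two letters, so doubled dashes split a word.

```python
import re

# A "letter" here is \w minus digits and underscore; in Python 3 this is
# Unicode-aware, so it also covers Greek, Cyrillic, Georgian letters etc.
LETTER = r"[^\W\d_]"

# One or more letters, optionally followed by groups of a single dash or
# apostrophe plus more letters. Dashes and apostrophes can therefore never
# start or end a match.
WORD_RE = re.compile(rf"{LETTER}+(?:[-'’]{LETTER}+)*")


def extract_words(text):
    """Return all words in text according to the spelling-based definition."""
    return WORD_RE.findall(text)


print(extract_words("I don't know - isn't it 'fine'?"))
print(extract_words("Machbarkeiten, λέξη, j'sais pas"))
```

Like the definition it illustrates, this depends on spaces or punctuation between words, and it treats letters with separate combining marks less gracefully than precomposed ones.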

wugi

Oct 13, 2022, 5:36:38 AM
Op 13/10/2022 om 9:57 schreef Ruud Harmsen:
Funny you now define a word in writing terms, I thought it ought to be a
semantic thing in the first place?

Anyway, written and spoken words are not per se congruent.

Are methinks, meseems single words?
Are E. lady killer two words and Nl. ladykiller one?
Is F. j' "sh" in j'sais pas a word?
And j't'ai dit: one or three words in one syllable?
And the (in)famous F. liaisons, after a pause or hesitation, what words
have we here:
Le- z'autres. Dan- z'un instant. Tou- t'à fait.
Are Sp. dale, dalo, dáselo single words, and E. give him, give it, give
it to him not? And F. donne-lui, donne-le, donne-le-lui?
Is Nl. inderdaad a single word and D. In der Tat not?
....

--
guido wugi

wugi

Oct 13, 2022, 5:38:43 AM
Op 13/10/2022 om 11:36 schreef wugi:

> And j't'ai dit: one or three words in one syllable?

J't'l'avais bien dit :)

--
guido wugi

Ruud Harmsen

Oct 13, 2022, 9:17:36 AM
>Op 13/10/2022 om 9:57 schreef Ruud Harmsen:
>> My own practical definition is here:
>> https://rudhar.com/sfreview/siworin/siworin05.htm
>> Quote:
>> "A word is a sequence of alphabetic UTF-8 characters. Dashes (-) and
>> apostrophes (' or ’) may occur, except at the start and the end, for
>> English words like ‘don’t’ and ‘isn’t’. "

Thu, 13 Oct 2022 11:36:34 +0200: wugi <wu...@scrlt.com> scribeva:
>Funny you now define a word in writing terms, I thought it ought to be a
>semantic thing in the first place?

Not for my practical purpose of extracting words from the HTML of a
website.

>Anyway, written and spoken words are not per se congruent.
>
>Are methinks, meseems single words?

I'd say yes.

>Are E. lady killer two words and Nl. ladykiller one?

Yes.

>Is F. j' "sh" in j'sais pas a word?

According to my algorithm, yes.

>And j't'ai dit: one or three words in one syllable?
>And the (in)famous F. liaisons, after a pause or hesitation, what words
>have we here:
>Le- z'autres. Dan- z'un instant. Tou- t'à fait.
>Are Sp. dale, dalo, dáselo single words, and E. give him, give it, give
>it to him not? And F. donne-lui, donne-le, donne-le-lui?
>Is Nl. inderdaad a single word and D. In der Tat not?
>....

Quite.

My definition, by the way, requires spaces or other punctuation
between words, so it doesn't work for Japanese or Chinese. Before
writing my own search engine, I used Hyper Estraier, written by a
Japanese, and supporting Asian languages by using n-grams. (Whatever
that is, I still don't quite understand.) But it wasn't faultless, and
moreover the software was uncompilable and unrunnable, and the
developer had been silent for many years.

Mine does work, without special effort, for languages like Greek,
Arabic, Yiddish, Georgian and Javanese.
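
A note on the n-grams mentioned above: in the general sense of the term, a character n-gram is every run of n consecutive characters in a text. Indexing those overlapping runs lets a search engine match text that has no spaces, without deciding where words begin or end. The sketch below shows only that general idea in Python; the function name char_ngrams is invented here, and nothing in it is taken from Hyper Estraier.

```python
def char_ngrams(text, n=2):
    """Return all overlapping runs of n consecutive characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Four kanji give three overlapping bigrams, with no word boundaries needed.
print(char_ngrams("日本語版"))
# A query is cut up the same way and matched against the indexed bigrams.
print(char_ngrams("日本語"))
```

The price of skipping segmentation is a larger index and occasional false matches across real word boundaries.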

Peter T. Daniels

Oct 13, 2022, 9:17:48 AM
In 1948, Robert A. Hall Jr. (who was a teacher of mine later on)
caused a firestorm, among Gallicists at least, when in his *French:
A Structural Sketch* (a supplement to the journal *Language*) he
did away with orthography entirely and with the notion of "word"
and treated the sort of thing wugi mentions as "breath groups" --
rather like Sadock's definition of "word" that I gave earlier.

He didn't do a similar sketch of Spanish, but I think you'll find
that sort of thing about English in C. C. Fries's two books from
the 1930s. At that time there were no oral databases, of course,
so he used as his data the next best thing: letters written to
President FDR from the marginally literate, whose expressions
could be assumed to have barely been tainted by prescriptivism.

Peter T. Daniels

Oct 13, 2022, 9:23:01 AM
On Thursday, October 13, 2022 at 9:17:36 AM UTC-4, Ruud Harmsen wrote:
> >Op 13/10/2022 om 9:57 schreef Ruud Harmsen:

> My definition, by the way, requires spaces or other punctuation
> between words, so it doesn't work for Japanese or Chinese. Before

Virtually every word in Japanese is written with one or two kanji
followed by several hiragana for the inflections (or else entirely
in katakana), so it should be childsplay [the squiggler doesn't like
that -- isn't it the name of the first Chucky movie?] to write a
program to chop up a Japanese text into words. Because Japanese
is rigorously suffixing, it's a language where defining "word" is easy.
(Deciding whether some affixes are "suffixes" or "clitics" is a different
question.)
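
The program described above could look roughly like the following Python sketch. It is a deliberately naive reading of the claim: a chunk is a run of kanji plus any hiragana that follow it, or a run of katakana, or a stray run of hiragana. The name chunk_japanese is invented, the character classes are the standard Unicode block ranges for Hiragana (3040-309F), Katakana (30A0-30FF) and CJK Unified Ideographs (4E00-9FFF), and practical segmenters rely on dictionaries rather than on script changes alone.

```python
import re

KANJI = "\u4e00-\u9fff"      # CJK Unified Ideographs
HIRAGANA = "\u3040-\u309f"
KATAKANA = "\u30a0-\u30ff"

# kanji run + trailing hiragana | katakana run | leftover hiragana run
CHUNK_RE = re.compile(
    rf"[{KANJI}]+[{HIRAGANA}]*|[{KATAKANA}]+|[{HIRAGANA}]+"
)


def chunk_japanese(text):
    """Split text wherever the script pattern suggests a new word starts."""
    return CHUNK_RE.findall(text)


print(chunk_japanese("私は本を読みました"))
print(chunk_japanese("ウィキペディア日本語版"))
```

Two limits show up at once: particles written in hiragana are glued to the preceding noun, and a run of several kanji comes out as a single chunk however many words it contains.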

Christian Weisgerber

Oct 13, 2022, 10:30:06 AM
On 2022-10-11, Peter T. Daniels <gram...@verizon.net> wrote:

> In French, word spaces clearly do not correspond with the spoken
> language, with both prefixed and suffixed clitics and pronouns and
> such.

It's even worse: syllabification will happily cross "word" boundaries
so that the final consonant of one word and the initial vowel of
the next word form a single syllable ("enchaînement").

--
Christian "naddy" Weisgerber na...@mips.inka.de

Christian Weisgerber

Oct 13, 2022, 10:30:06 AM
On 2022-10-10, John Smith <12...@whatismyemailaddress.xyz> wrote:

> What are the linguistic criteria to determine when a string of
> characters is a word?

As others have already mentioned, there is no good linguistic
definition for what constitutes a word. Problems arise when trying
to apply criteria across languages, but even within one language,
criteria for syntactical words, morphological words, phonological
words, etc. may be incompatible.

In fact, the notion of a "phonological word" frequently comes up
in opposition to words defined otherwise, e.g. by orthography.
Take a simple English sentence:

I gave her the book.

For a language like English, it makes sense to require that a
phonological word incorporates a stressed syllable. But in normal,
casual pronunciation, people don't pronounce "gave her" as two
syllables; instead the "her" is unstressed and attached to "gave",
"gave'er" /ˈgeɪvər/. So is "gave her" one word?

> The reason I am asking is because it is often said
> that German has very long words, when in fact what seems to be the case
> that it is several separate words run together, each word having a
> meaning in isolation.

Stop.

Indeed, those long German words are typically noun-based compounds,
frequently (noun-...-)noun-noun compounds. All Germanic languages
form such compounds, including English. The salient difference is
purely orthographic: English separates the components with spaces,
German doesn't. So when discussing this phenomenon, you don't need
to exoticize it by invoking German. English itself shows the same.

My favorite example, because it actually occurs in the wild, is

Abu Dhabi Combat Club Submission Wrestling World Championships
gold medalist

It's so long I had to break the line. Syntactically, that monster
behaves like a single noun. And it's just a fluke of orthography
that English doesn't spell it as

abudhabicombatclubsubmissionwrestlingworldchampionshipsgoldmedalist

Christian Weisgerber

Oct 13, 2022, 10:30:06 AM
On 2022-10-11, wugi <wu...@scrlt.com> wrote:

> Dutch and occasionally German do complicate those compounds by inserting
> adverbial or case particles/suffixes:

The infamous linking elements (Fugenlaute) of German.

IIRC it is Swedish that has simplified the pattern such that there
is no linking element for noun+noun, but a fixed -s- is inserted
in noun+noun+s+noun.

English doesn't have productive linking elements in compounds, but
there may be a very few isolated instances, e.g. "beeswax".

wugi

Oct 13, 2022, 10:58:48 AM
Op 13/10/2022 om 15:55 schreef Christian Weisgerber:
Anyway, it poses the question: who'll consider this a single word?*

Where to draw a line between single words? Can a whole phrase be a
single word, as it allegedly can be in various languages?
And as you said: there is no all-round definition of the, erm, word.

*Ditto for that (..)famous Welsh station name...

--
guido wugi

DKleinecke

Oct 13, 2022, 6:19:41 PM
Assuming there is a "human language" how is your concept different
than Chomsky's universal grammar?

Daud Deden

Oct 14, 2022, 12:32:19 AM
I spend so little time thinking about grammar, I have no idea. If UG is real, does that mean it has remained unchanged for 5 million years, or has it evolved from a simpler origin, possibly akin to macaque, vervet or gibbon call grammar?
Ask me the question when I've got a paleo dictionary finished. For now, I'm immersed in Paleo-etymology thinking, grammar is of no significance at all.

Ruud Harmsen

Oct 14, 2022, 1:07:05 AM
Thu, 13 Oct 2022 06:23:00 -0700 (PDT): "Peter T. Daniels"
<gram...@verizon.net> scribeva:
In the Japanese Wikipedia, e.g. here,
https://ja.wikipedia.org/wiki/%E3%82%A6%E3%82%A3%E3%82%AD%E3%83%9A%E3%83%87%E3%82%A3%E3%82%A2%E6%97%A5%E6%9C%AC%E8%AA%9E%E7%89%88
already in the title, I see what seem to be sequences of 3 or more
kanji. So how do you chop those up into words of 1 or 2?

(My working definition of kanji here is: the ones that look
complicated. I'll check what it really is using my
https://rudhar.com/sfreview/utf8cntx/index-en.htm . Result:

Ruud Harmsen via Google Groups <google@rudhar.com>

Oct 14, 2022, 1:17:34 AM
On Friday, October 14, 2022 at 7:07:05 AM UTC+2, Ruud Harmsen wrote:
> In the Japanese Wikipedia, e.g. here,
> https://ja.wikipedia.org/wiki/%E3%82%A6%E3%82%A3%E3%82%AD%E3%83%9A%E3%83%87%E3%82%A3%E3%82%A2%E6%97%A5%E6%9C%AC%E8%AA%9E%E7%89%88
> already in the title, I see what seem to be sequences of 3 or more
> kanji. So how do you chop those up into words of 1 or 2?
>
> (My working definition of kanji here is: the ones that look
> complicated. I'll check what it really is using my
> https://rudhar.com/sfreview/utf8cntx/index-en.htm . Result:

ウィキペディア日本語版
000000 0xe3-82-a6-e3 0x0030a6: ............
000003 0xe3-82-a3-e3 0x0030a3: ...........
000006 0xe3-82-ad-e3 0x0030ad: ..........
000009 0xe3-83-9a-e3 0x0030da: .........
000012 0xe3-83-87-e3 0x0030c7: ........
000015 0xe3-82-a3-e3 0x0030a3: .......
000018 0xe3-82-a2-e6 0x0030a2: ......
000021 0xe6-97-a5-e6 0x0065e5: .....
000024 0xe6-9c-ac-e8 0x00672c: ....
000027 0xe8-aa-9e-e7 0x008a9e: ...
000030 0xe7-89-88-0a 0x007248: ..
000033 0x0a-00-88-0a 0x00000a: .

Knowing from https://rudhar.com/lingtics/uniclnkl.htm that 3040-309F is
for Hiragana and 30A0-30FF is for Katakana, here we see 7 Katakana
followed by 4 Kanji (in the range 4E00-9FFF CJK Unified Ideographs).
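
The same check can be reproduced in a few lines of Python. This is only a sketch using the three block ranges named above; the names BLOCKS, block_of, script_runs and utf8_hex are invented for the example and have nothing to do with the utf8cntx tool itself.

```python
from itertools import groupby

# Unicode block ranges as given above (inclusive).
BLOCKS = [
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
]


def block_of(ch):
    """Name the block a character falls in, or 'Other'."""
    cp = ord(ch)
    for low, high, name in BLOCKS:
        if low <= cp <= high:
            return name
    return "Other"


def script_runs(text):
    """Collapse text into (block name, run length) pairs."""
    return [(name, len(list(group))) for name, group in groupby(text, key=block_of)]


def utf8_hex(ch):
    """UTF-8 bytes of one character, dash-separated as in the dump above."""
    return ch.encode("utf-8").hex("-")


title = "ウィキペディア日本語版"
print(script_runs(title))
print(utf8_hex(title[0]), hex(ord(title[0])))
```

For the title in question this reports one run of 7 Katakana followed by one run of 4 CJK Unified Ideographs, matching the count read off the dump by hand.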

Ruud Harmsen

Oct 14, 2022, 1:37:47 AM
Tue, 11 Oct 2022 06:55:30 -0700 (PDT): "Peter T. Daniels"
<gram...@verizon.net> scribeva:
>In Hungarian (I believe you've been discussing that language recently)
>each stressed syllable begins a new word;

True.

Yet, when I try to transcribe Hungarian fragments by ear, and later with
Google’s help find the real thing, there are often mistakes in my word
boundaries.

Ruud Harmsen

Oct 14, 2022, 5:03:26 AM
Thu, 13 Oct 2022 13:55:13 -0000 (UTC): Christian Weisgerber
<na...@mips.inka.de> scribeva:

>On 2022-10-10, John Smith <12...@whatismyemailaddress.xyz> wrote:
>
>> What are the linguistic criteria to determine when a string of
>> characters is a word?
>
>As others have already mentioned, there is no good linguistic
>definition for what constitutes a word. Problems arise when trying
>to apply criteria across languages, but even within one language,
>criteria for syntactical words, morphological words, phonological
>words, etc. may be incompatible.
>
>In fact, the notion of a "phonological word" frequently comes up
>in opposition to words defined otherwise, e.g. by orthography.
>Take a simple English sentence:
>
> I gave her the book.
>
>For a language like English, it makes sense to require that a
>phonological word incorporates a stressed syllable. But in normal,
>casual pronunciation, people don't pronounce "gave her" as two
>syllables; instead the "her" is unstressed and attached to "gave",
>"gave'er" /ˈgeɪvər/. So is "gave her" one word?
>
>> The reason I am asking is because it is often said
>> that German has very long words, when in fact what seems to be the case
>> that it is several separate words run together, each word having a
>> meaning in isolation.
>
>Stop.
>
>Indeed, those long German words are typically noun-based compounds,
>frequently (noun-...-)noun-noun compounds. All Germanic languages
>form such compounds, including English. The salient difference is
>purely orthographic: English separates the components with spaces,
>German doesn't. So when discussing this phenomenon, you don't need
>to exoticize it by invoking German. English itself shows the same.

And because the English spelling obscures the sometimes extreme length
of such compounds, they are often much longer than people would
reasonably tolerate them to be in languages like Dutch or German, even
though they _can_ grammatically be made that long in those languages
too. But they can also be broken up, in Romance language style, with
prepositional constructions. (Example: in Romaansetalenstijl, hmm, no,
I prefer: op de manier van Romaanse talen.)

This in combination with the fact that many English words can be a
noun or a verb without any formal difference, results in strings (not
always complete sentences) in software manuals and technical manuals
for machines etc., which can be annoyingly ambiguous. Been there, done
that. Questions to clarify things are sometimes not taken seriously,
because many monolingual English speakers, especially if they have
knowledge of the machine or software in question, don't see the
problem and think the translator is incompetent. But the real
incompetent translators are those who do not ask, and instead make
unfounded assumptions.

>My favorite example, because it actually occurs in the wild, is
>
> Abu Dhabi Combat Club Submission Wrestling World Championships
> gold medalist

Nice one.

>It's so long I had to break the line. Syntactically, that monster
>behaves like a single noun. And it's just a fluke of orthography
>that English doesn't spell it as
>
> abudhabicombatclubsubmissionwrestlingworldchampionshipsgoldmedalist

Yes. The direct Dutch or German translation should be written like
that. Therefore of course it isn't, and a different, more manageable
translation is chosen.

I could not give actual translations right away, because I am not sure
what the role of Submission is in this "word". Well, this clarifies
the matter: https://de.wikipedia.org/wiki/Submission_Wrestling .
So:
De winnaar van de gouden medaille bij het wereldkampioenschap in
submission wrestling, georganiseerd door de vechtsportclub van Abu
Dhabi.
In reorganised English:
The winner of the gold medal in the World Championship in Submission
Wrestling, organised by the Combat Club of Abu Dhabi.

Hope my interpretation is correct. Perhaps it is not, and a different
translation would be needed.

Ruud Harmsen

Oct 14, 2022, 5:07:08 AM
Fri, 14 Oct 2022 11:03:24 +0200: Ruud Harmsen <r...@rudhar.com>
scribeva:

>Thu, 13 Oct 2022 13:55:13 -0000 (UTC): Christian Weisgerber
><na...@mips.inka.de> scribeva:
>>My favorite example, because it actually occurs in the wild, is
>>
>> Abu Dhabi Combat Club Submission Wrestling World Championships
>> gold medalist

Google Translate fails on this. So does DeepL.

Helmut Richter

Oct 14, 2022, 5:17:29 AM
On Thu, 13 Oct 2022, Christian Weisgerber wrote:

> On 2022-10-11, wugi <wu...@scrlt.com> wrote:
>
> > Dutch and occasionally German do complicate those compounds by inserting
> > adverbial or case particles/suffixes:
>
> The infamous linking elements (Fugenlaute) of German.
>
> IIRC it is Swedish that has simplified the pattern such that there
> is no linking element for noun+noun, but a fixed -s- is inserted
> in noun+noun+s+noun.

IYRC: Is this feature of Swedish dependent on whether the construct is
built as (word+word)+word or as word+(word+word)?

In English the first of these has the spelling "word-word word", the
second has "word word word", and the intonation is different. I remember
the somewhat weird example

purple elephant gun: a purple gun for shooting elephants
purple-elephant gun: a gun for shooting purple elephants

In German, there is no difference between the two.
A "Mädchenhandelsschule" could be a business school for girls, or else a
school teaching Mädchenhandel (white slave trade).
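
The ambiguity is a matter of bracketing, and the possible groupings of a compound can be enumerated mechanically. Here is a small Python sketch, assuming only binary groupings of adjacent parts; the function name bracketings is made up for the example.

```python
def bracketings(parts):
    """List every way to group adjacent parts into nested pairs."""
    if len(parts) == 1:
        return [parts[0]]
    results = []
    for i in range(1, len(parts)):              # every possible split point
        for left in bracketings(parts[:i]):
            for right in bracketings(parts[i:]):
                results.append(f"({left} {right})")
    return results


for reading in bracketings(["purple", "elephant", "gun"]):
    print(reading)
print(len(bracketings(["Abu Dhabi", "Combat", "Club", "gold", "medalist"])))
```

Three parts give the two readings above; the count follows the Catalan numbers, so it grows quickly with the length of the compound, and spelling or intonation can only ever mark some of the distinctions.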

--
Helmut Richter

Peter T. Daniels

Oct 14, 2022, 9:16:22 AM
Chomsky apparently still claims that language was the result
of a single *genetic* mutation in a human ancestor that made
it possible for humans to perform the cognitive operation that
he most recently calls "Merge."

Some years ago, he collaborated with two linguists who were
already known for origin-of-language research, Hauser and Fitch.

Peter T. Daniels

Oct 14, 2022, 9:18:42 AM
On Friday, October 14, 2022 at 1:07:05 AM UTC-4, Ruud Harmsen wrote:
> Thu, 13 Oct 2022 06:23:00 -0700 (PDT): "Peter T. Daniels"
> <gram...@verizon.net> scribeva:
> >On Thursday, October 13, 2022 at 9:17:36 AM UTC-4, Ruud Harmsen wrote:
> >> >Op 13/10/2022 om 9:57 schreef Ruud Harmsen:
> >
> >> My definition, by the way, requires spaces or other punctuation
> >> between words, so it doesn't work for Japanese or Chinese. Before
> >
> >Virtually every word in Japanese is written with one or two kanji
> >followed by several hiragana for the inflections (or else entirely
> >in katakana), so it should be childsplay [the squiggler doesn't like
> >that -- isn't it the name of the first Chucky movie?] to write a
> >program to chop up a Japanese text into words. Because Japanese
> >is rigorously suffixing, it's a language where defining "word" is easy.
> >(Deciding whether some affixes are "suffixes" or "clitics" is a different
> >question.)
> In the Japanese Wikipedia, e.g. here,
> https://ja.wikipedia.org/wiki/%E3%82%A6%E3%82%A3%E3%82%AD%E3%83%9A%E3%83%87%E3%82%A3%E3%82%A2%E6%97%A5%E6%9C%AC%E8%AA%9E%E7%89%88
> already in the title, I see what seem to be sequences of 3 or more
> kanji. So how do you chop those up into words of 1 or 2?

Are you unfamiliar with the expression "virtually every"?
(Maybe it's an idiom.)
Have you never understood that the grammar of headlines/titles
often differs from the grammar of texts?

> (My working definition of kanji here is: the ones that look
> complicated. I'll what it really is using my
> https://rudhar.com/sfreview/utf8cntx/index-en.htm . Result:

Nothing at all. Hmm.

Peter T. Daniels

Oct 14, 2022, 9:23:53 AM
On Friday, October 14, 2022 at 5:03:26 AM UTC-4, Ruud Harmsen wrote:
> Thu, 13 Oct 2022 13:55:13 -0000 (UTC): Christian Weisgerber
> <na...@mips.inka.de> scribeva:

> >My favorite example, because it actually occurs in the wild, is
> >
> > Abu Dhabi Combat Club Submission Wrestling World Championships
> > gold medalist
> Nice one.
> >It's so long I had to break the line. Syntactically, that monster
> >behaves like a single noun. And it's just a fluke of orthography
> >that English doesn't spell it as
> >
> > abudhabicombatclubsubmissionwrestlingworldchampionshipsgoldmedalist
>
> Yes. The direct Dutch or German translation should be written like
> that. Therefore of course it isn't, and a different, more manageable
> translation is chosen.
>
> I could not give actual translations right away, because I am not sure
> what the role of Submission is in this "word".

No idea what it is (nor do I care), but it's obvious that "submission
wrestling" is a variety of wrestling.

> Well, this clarifies
> the matter: https://de.wikipedia.org/wiki/Submission_Wrestling .

Yet somehow you knew how to parse it so you could look it up.

> So:
> De winnaar van de gouden medaille bij het wereldkampioenschap in
> submission wrestling, georganiseerd door de vechtsportclub van Abu
> Dhabi.
> In reorganised English:
> The winner of the gold medal in the World Championship in Submission
> Wrestling, organised by the Combat Club of Abu Dhabi.

Or, the gold medalist who belongs to that club.

Christian Weisgerber

Oct 14, 2022, 11:30:07 AM
On 2022-10-11, Peter T. Daniels <gram...@verizon.net> wrote:

> In Hungarian (I believe you've been discussing that language recently)
> each stressed syllable begins a new word;

Does the definite article "a(z)" get its own stress?

wugi

Oct 14, 2022, 12:25:54 PM
Op 14/10/2022 om 11:17 schreef Helmut Richter:
In a slightly different vein, I remember a funny cartoon, was it in Lui
?;), showing two schools on opposite sides of the street at registration time:
one displayed
Ecole Publique de Filles,
the other
Ecole de Filles Publiques.
Guess where the queues were.

--
guido wugi

Daud Deden

Oct 14, 2022, 2:20:40 PM
tepozpocatetlahuilānalōni, means "train", as in the kind that goes on a railroad. [Wiktionary]

Daud Deden

Oct 14, 2022, 2:32:10 PM
Yes, I recall that.
I do wonder if the mutation which reduced the chromosome count from 48 (standard great apes) to 46 in Homo might be significant in inverting the arboreal bowl nest into the terrestrial dome hut (domeshield) and in flipping a switch from loud ape calls to conversational chattering more like monkeys. Probably not.

Peter T. Daniels

Oct 14, 2022, 2:34:00 PM
On Friday, October 14, 2022 at 11:30:07 AM UTC-4, Christian Weisgerber wrote:
> On 2022-10-11, Peter T. Daniels <gram...@verizon.net> wrote:

> > In Hungarian (I believe you've been discussing that language recently)
> > each stressed syllable begins a new word;
>
> Does the definite article "a(z)" get its own stress?

No idea. Would you call it a word?

Ruud Harmsen

Oct 15, 2022, 6:24:00 AM
Fri, 14 Oct 2022 11:17:26 +0200: Helmut Richter <hr.u...@email.de>
scribeva:
The first has secondary stress on Mäd and Schu, primary stress on Han.
The second has primary stress on Mäd, secondary on Han and Schu.

I would think. Am I right?

Ruud Harmsen

Oct 15, 2022, 6:27:21 AM
Fri, 14 Oct 2022 06:18:40 -0700 (PDT): "Peter T. Daniels"
<gram...@verizon.net> scribeva:

>On Friday, October 14, 2022 at 1:07:05 AM UTC-4, Ruud Harmsen wrote:
>> Thu, 13 Oct 2022 06:23:00 -0700 (PDT): "Peter T. Daniels"
>> <gram...@verizon.net> scribeva:
>> >On Thursday, October 13, 2022 at 9:17:36 AM UTC-4, Ruud Harmsen wrote:
>> >> >Op 13/10/2022 om 9:57 schreef Ruud Harmsen:
>> >
>> >> My definition, by the way, requires spaces or other punctuation
>> >> between words, so it doesn't work for Japanese or Chinese. Before
>> >
>> >Virtually every word in Japanese is written with one or two kanji
>> >followed by several hiragana for the inflections (or else entirely
>> >in katakana), so it should be childsplay [the squggler doesn't like
>> >that -- isn't it the name of the first Chucky movie?] to write a
>> >program to chop up a Japanese text into words. Because Japanese
>> >is rigorously suffixing, it's a language where defining "word" is easy.
>> >(Deciding whether some affixes are "suffixes" or "clitics" is a different
>> >question.)
>> In the Japanese Wikipedia, e.g. here,
>> https://ja.wikipedia.org/wiki/%E3%82%A6%E3%82%A3%E3%82%AD%E3%83%9A%E3%83%87%E3%82%A3%E3%82%A2%E6%97%A5%E6%9C%AC%E8%AA%9E%E7%89%88
>> already in the title, I see what seem to be sequences of 3 or more
>> kanji. So how do you chop those up into words of 1 or 2?
>
>Are you unfamiliar with the expression "virtually every"?
>(Maybe it's an idiom.)

There isn't just one, there are many counter examples.

>Have you never understood that the grammar of headlines/titles
>often differs from the grammar of texts?

There are lots of counter examples in the body text too.

>> (My working definition of kanji here is: the ones that look
>> complicated. I'll check what it really is using my
>> https://rudhar.com/sfreview/utf8cntx/index-en.htm . Result:
>
>Nothing at all. Hmm.

See next posting.

Ruud Harmsen

Oct 15, 2022, 6:28:59 AM
Fri, 14 Oct 2022 06:23:52 -0700 (PDT): "Peter T. Daniels"
<gram...@verizon.net> scribeva:

>On Friday, October 14, 2022 at 5:03:26 AM UTC-4, Ruud Harmsen wrote:
>> Thu, 13 Oct 2022 13:55:13 -0000 (UTC): Christian Weisgerber
>> <na...@mips.inka.de> scribeva:
>
>> >My favorite example, because it actually occurs in the wild, is
>> >
>> > Abu Dhabi Combat Club Submission Wrestling World Championships
>> > gold medalist
>> Nice one.
>> >It's so long I had to break the line. Syntactically, that monster
>> >behaves like a single noun. And it's just a fluke of orthography
>> >that English doesn't spell it as
>> >
>> > abudhabicombatclubsubmissionwrestlingworldchampionshipsgoldmedalist
>>
>> Yes. The direct Dutch or German translation should be written like
>> that. Therefore of course it isn't, and a different, more manageable
>> translation is chosen.
>>
>> I could not give actual translations right away, because I am not sure
>> what the role of Submission is in this "word".
>
>No idea what it is (nor do I care), but it's obvious that "submission
>wrestling" is a variety of wrestling.
>
>> Well, this clarifies
>> the matter: https://de.wikipedia.org/wiki/Submission_Wrestling .
>
>Yet somehow you knew how to parse it so you could look it up.

Trying one possibility is not the same as knowing already.

>> So:
>> De winnaar van de gouden medaille bij het wereldkampioenschap in
>> submission wrestling, georganiseerd door de vechtsportclub van Abu
>> Dhabi.
>> In reorganised English:
>> The winner of the gold medal in the World Championship in Submission
>> Wrestling, organised by the Combat Club of Abu Dhabi.
>
>Or, the gold medalist who belongs to that club.

Yes, also possible.

Peter T. Daniels

Oct 15, 2022, 9:43:39 AM
On Saturday, October 15, 2022 at 6:27:21 AM UTC-4, Ruud Harmsen wrote:
> Fri, 14 Oct 2022 06:18:40 -0700 (PDT): "Peter T. Daniels"
> <gram...@verizon.net> scribeva:
> >On Friday, October 14, 2022 at 1:07:05 AM UTC-4, Ruud Harmsen wrote:
> >> Thu, 13 Oct 2022 06:23:00 -0700 (PDT): "Peter T. Daniels"
> >> <gram...@verizon.net> scribeva:
> >> >On Thursday, October 13, 2022 at 9:17:36 AM UTC-4, Ruud Harmsen wrote:
> >> >> >Op 13/10/2022 om 9:57 schreef Ruud Harmsen:

> >> >> My definition, by the way, requires spaces or other punctuation
> >> >> between words, so it doesn't work for Japanese or Chinese. Before
> >> >Virtually every word in Japanese is written with one or two kanji
> >> >followed by several hiragana for the inflections (or else entirely
> >> >in katakana), so it should be childsplay [the squggler doesn't like
> >> >that -- isn't it the name of the first Chucky movie?] to write a
> >> >program to chop up a Japanese text into words. Because Japanese
> >> >is rigorously suffixing, it's a language where defining "word" is easy.
> >> >(Deciding whether some affixes are "suffixes" or "clitics" is a different
> >> >question.)
> >> In the Japanese Wikipedia, e.g. here, https://ja.wikipedia.org/wiki/%E3%82%A6%E3%82%A3%E3%82%AD%E3%83%9A%E3%83%87%E3%82%A3%E3%82%A2%E6%97%A5%E6%9C%AC%E8%AA%9E%E7%89%88
> >> already in the title, I see what seem to be sequences of 3 or more
> >> kanji. So how do you chop those up into words of 1 or 2?
> >Are you unfamiliar with the expression "virtually every"?
> >(Maybe it's an idiom.)
>
> There isn't just one, there are many counter examples.

Are you unfamiliar with the phrase "virtually every"?

Open a Japanese book and look at the damn text.

> >Have you never understood that the grammar of headlines/titles
> >often differs from the grammar of texts?
>
> There are lots of counter examples in the body text too.

Are you unfamiliar with the phrase "virtually every"?

(No, there aren't. I suspect you're not recognizing a lot of
hiragana as such.)

Ruud Harmsen

Oct 15, 2022, 11:11:35 AM
Sat, 15 Oct 2022 06:43:38 -0700 (PDT): "Peter T. Daniels"
I thought it was obvious from my previous reply that I know that
expression and know exactly what it means. Repeating the question
serves no purpose.

>Open a Japanese book and look at the damn text.

I have no Japanese book and won't buy one, because I cannot read the
language and have no intention of learning it. That's why I used the
readily available Wikipedia.

>> >Have you never understood that the grammar of headlines/titles
>> >often differs from the grammar of texts?
>>
>> There are lots of counter examples in the body text too.
>
>Are you unfamiliar with the phrase "virtually every"?

See above.

To explain the obvious: if in a random text I already find multiple
counter examples, that means "virtually every" is incorrect.
"Virtually every", as you might or might not know, means "almost all,
nearly every, with very few exceptions, with hardly any exception".
Lots is not the same as hardly any.

>(No, there aren't. I suspect you're not recognizing a lot of
>hiragana as such.)

I can see EXACTLY what is kanji, katakana and hiragana, just by
looking at the Unicode scalars, as I explained in my next post.

Ruud Harmsen via Google Groups <google@rudhar.com>

Oct 15, 2022, 11:26:11 AM
RH:
>> There are lots of counter examples in the body text too.
> Are you unfamiliar with the phrase "virtually every"?
> (No, there aren't. I suspect you're not recognizing a lot of
> hiragana as such.)

So you are saying that Unicode 65e5, 672c, 8a9e, 7248, 本語版, is not
a sequence of four kanji?

Ruud Harmsen via Google Groups <google@rudhar.com>

Oct 15, 2022, 11:37:19 AM
Sorry, I meant 日本語版 of course. Checked them in the block
https://unicode.org/charts/PDF/U4E00.pdf . They are all Kanji.
Of course, because Hiragana and Katakana each have their own
block, in the 3000s.
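
For anyone who wants to repeat that block check without looking at the charts, here is a minimal Python sketch. The names script_of and describe are made up for this example, and only the three ranges mentioned in this thread are covered; the CJK extension blocks are ignored, so this is a rough classifier and not a complete one.

```python
# Block ranges as cited in this thread; everything else is reported as "other".
BLOCKS = [
    (0x3040, 0x309F, "hiragana"),
    (0x30A0, 0x30FF, "katakana"),
    (0x4E00, 0x9FFF, "kanji"),  # CJK Unified Ideographs, main block only
]


def script_of(ch):
    """Return a rough script label for a single character (hypothetical helper)."""
    cp = ord(ch)
    for low, high, name in BLOCKS:
        if low <= cp <= high:
            return name
    return "other"


def describe(text):
    """List (character, code point, script label) for every character in text."""
    return [(ch, f"U+{ord(ch):04X}", script_of(ch)) for ch in text]


for row in describe("日本語版"):
    print(*row)
```

All four characters come out labelled as kanji, which matches the chart lookup.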

Peter T. Daniels

Oct 15, 2022, 3:10:18 PM
I wonder how you changed four code points into three characters.

Peter T. Daniels

Oct 15, 2022, 3:12:10 PM
Maybe someone who can read Japanese will tell you what they say.
The most likely guess is a two-morpheme phrase with no inflections
or clitics on the first one.

Ross Clark

Oct 15, 2022, 4:20:34 PM
Google Translate will tell you.
The meaning is 'Japanese language version'. Taken together with the
preceding sequence of katakana, we have a phrase with the structure:

[Wikipediya [[[Nihon]go]han]] "Japanese language version of Wikipedia"

followed by the topic marker wa (hiragana ha).

Elsewhere on that page I see

百科事典 hyakkajiten 'encyclopedia'
事典 by itself is 'dictionary'
百科 is 'hundred branches', perhaps not used elsewhere
So a four-kanji lexical item, albeit with binary (2+2) structure.

管理者 'administrator'
GT won't give me the pronunciation, but I recognize the last
character as 'person', so it's probably 2+1

編集者 henshuusha 'editor' likewise

日本語表記 nihongo hyooki 'Japanese [language] notation'

認知度 ninchi-do '[degree of] recognition'

閲覧数 etsuran-suu 'number of views'

純記事数 jun kiji-suu 'net article count'

OK, enough. There are plenty of 3+ kanji sequences, but they are
generally analyzable into 1- and 2-kanji constituents. GT's
romanizations give an idea (or somebody's idea) of how these are grouped
into "words".
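
To make the point concrete, here is a hedged Python sketch of the "kanji plus trailing hiragana, or all katakana" chopping rule proposed earlier in the thread. It is only an illustration of that rule, not a real Japanese segmenter; the function names are invented for this example and the block ranges are the ones quoted in this thread.

```python
def script_of(ch):
    """Rough script label based on the block ranges quoted in this thread."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    return "other"


def naive_words(text):
    """Chop text at script changes, letting hiragana attach to preceding kanji."""
    words = []
    prev = None
    for ch in text:
        script = script_of(ch)
        same_run = script == prev
        inflection = script == "hiragana" and prev == "kanji"
        if words and (same_run or inflection):
            words[-1] += ch      # extend the current chunk
        else:
            words.append(ch)     # any other script change starts a new chunk
        prev = script
    return words


def kanji_count(word):
    """Number of kanji in a chunk."""
    return sum(script_of(ch) == "kanji" for ch in word)


chunks = naive_words("ウィキペディア日本語版は")
print(chunks)
# Chunks the rule cannot reduce to one or two kanji:
print([w for w in chunks if kanji_count(w) > 2])
```

On the Wikipedia title plus topic marker, the rule yields the katakana run and then a single chunk containing all four kanji. Splitting that chunk into 日本 + 語 + 版 needs lexical knowledge that script boundaries alone do not supply.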

Ruud Harmsen via Google Groups <google@rudhar.com>

Oct 15, 2022, 10:29:59 PM
RH copy-pasted from the Japanese Wikipedia:
> >>>> ウィキペディア日本語版

> On 16/10/2022 8:12 a.m., Peter T. Daniels wrote:
> > Maybe someone who can read Japanese will tell you what they say.
> > The most likely guess is a two-morpheme phrase with no inflections
> > or clitics on the first one.

On Saturday, October 15, 2022 at 10:20:34 PM UTC+2, benl...@ihug.co.nz wrote:
> Google Translate will tell you.

Right. That was what I was going to try, but luckily I noticed your post
in time. Thanks.

On Saturday, October 15, 2022 at 10:20:34 PM UTC+2, benl...@ihug.co.nz wrote:
> OK, enough. There are plenty of 3+ kanji sequences, but they are
> generally analyzable into 1- and 2-kanji constituents. GT's
> romanizations give an idea (or somebody's idea) of how these are grouped
> into "words".

So that is probably why Mikio Hirabayashi, the creator of the search engine
Hyper Estraier (http://fallabs.com/hyperestraier/), used n-grams to do it
properly. He is (was?) Japanese, so he probably knows.
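
For readers unfamiliar with the idea, a character n-gram index sidesteps word boundaries altogether by indexing every overlapping run of n characters. The sketch below is a generic illustration in Python with a made-up function name; it is not taken from Hyper Estraier and makes no claim about how that engine implements it.

```python
def char_ngrams(text, n=2):
    """Return all overlapping character n-grams of text (hypothetical helper)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(char_ngrams("日本語版"))  # ['日本', '本語', '語版']
```

A query for 日本 or 本語 then matches the text without anyone having to decide where the words begin and end.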

As said, the latest version is from 2007, I can't get it to run anymore, and
the author hasn't shown any activity in years. That's why I wrote my own,
which explicitly does not support languages that do not use spaces (or
other non-alphabetic characters, marked as such in Unicode) to delimit
words. Any sequence of alphabetic characters (and dash and single quote
somewhere in the middle) for me is a word. Plain and simple.
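
A minimal Python sketch of that working definition, for illustration only. This is not my actual program; split_words is a name invented for this example, and "alphabetic" is taken here to mean whatever Python's str.isalpha accepts.

```python
def split_words(text):
    """Split text into runs of alphabetic characters.

    A dash or single quote is kept only when it sits between two
    alphabetic characters, so it can never start or end a word.
    """
    words = []
    current = ""
    for i, ch in enumerate(text):
        next_is_alpha = i + 1 < len(text) and text[i + 1].isalpha()
        if ch.isalpha():
            current += ch
        elif ch in "-'" and current and next_is_alpha:
            current += ch
        else:
            if current:
                words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


print(split_words("It's a purple-elephant gun."))
print(split_words("ウィキペディア日本語版は"))
```

The second call shows why unspaced scripts are out of scope: kana and kanji all count as alphabetic, so the whole Japanese string comes back as one "word".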

Ruud Harmsen

Oct 15, 2022, 10:31:48 PM
Sat, 15 Oct 2022 12:10:17 -0700 (PDT): "Peter T. Daniels"
<gram...@verizon.net> scribeva:

>On Saturday, October 15, 2022 at 11:26:11 AM UTC-4, Ruud Harmsen via Google Groups <goo...@rudhar.com> wrote:
>> On Friday, October 14, 2022 at 7:17:34 AM UTC+2, Ruud Harmsen via Google Groups <goo...@rudhar.com> wrote:
>> > On Friday, October 14, 2022 at 7:07:05 AM UTC+2, Ruud Harmsen wrote:
>> > > In the Japanese Wikipedia, e.g. here,
>> > > https://ja.wikipedia.org/wiki/%E3%82%A6%E3%82%A3%E3%82%AD%E3%83%9A%E3%83%87%E3%82%A3%E3%82%A2%E6%97%A5%E6%9C%AC%E8%AA%9E%E7%89%88
>> > > already in the title, I see what seem to be sequences of 3 or more
>> > > kanji. So how do you chop those up into words of 1 or 2?
>> > >
>> > > (My working definition of kanji here is: the ones that look
>> > > complicated. I'll check what it really is using my
>> > > https://rudhar.com/sfreview/utf8cntx/index-en.htm . Result:
>> >
>> > ウィキペディア日本語版
>> > 000000 0xe3-82-a6-e3 0x0030a6: ............
>> > 000003 0xe3-82-a3-e3 0x0030a3: ...........
>> > 000006 0xe3-82-ad-e3 0x0030ad: ..........
>> > 000009 0xe3-83-9a-e3 0x0030da: .........
>> > 000012 0xe3-83-87-e3 0x0030c7: ........
>> > 000015 0xe3-82-a3-e3 0x0030a3: .......
>> > 000018 0xe3-82-a2-e6 0x0030a2: ......
>> > 000021 0xe6-97-a5-e6 0x0065e5: .....
>> > 000024 0xe6-9c-ac-e8 0x00672c: ....
>> > 000027 0xe8-aa-9e-e7 0x008a9e: ...
>> > 000030 0xe7-89-88-0a 0x007248: ..
>> > 000033 0x0a-00-88-0a 0x00000a: .
>> >
>> > Knowing from https://rudhar.com/lingtics/uniclnkl.htm that 3040-309F is
>> > for Hiragana and 30A0-30FF is for Katakana, here we see 7 Katakana
>> > followed by 4 Kanji (in the range 4E00-9FFF CJK Unified Ideographs).
>> RH:
>> >> There are lots of counter examples in the body text too.
>> > Are you unfamiliar with the phrase "virtually every"?
>> > (No, there aren't. I suspect you're not recognizing a lot of
>> > hiragana as such.)
>>
>> So you are saying that Unicode 65e5, 672c, 8a9e, 7248, 本語版, is not
>> a sequence of four kanji?
>
>I wonder how you changed four code points into three characters.

That was a mistake in my copy-paste, later amended.
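
For anyone who wants to reproduce the byte dump quoted above, here is a rough Python sketch. It is not the utf8cntx tool itself; the function name dump is invented for this example, and unlike the quoted output it shows only each character's own bytes, without the following byte or the context column.

```python
def dump(text):
    """Return (byte offset, UTF-8 bytes in hex, code point) for each character."""
    rows = []
    offset = 0
    for ch in text:
        raw = ch.encode("utf-8")
        hex_bytes = "-".join(f"{b:02x}" for b in raw)
        rows.append((offset, hex_bytes, f"0x{ord(ch):06x}"))
        offset += len(raw)  # every character in this title takes three bytes
    return rows


for offset, hex_bytes, code_point in dump("ウィキペディア日本語版"):
    print(f"{offset:06d} 0x{hex_bytes} {code_point}")
```

The offsets advance in steps of three, and the last four code points are 65e5, 672c, 8a9e and 7248, the same four kanji as in the quoted listing.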

Peter T. Daniels

Oct 16, 2022, 9:09:00 AM
Thanks!

Christian Weisgerber

Oct 17, 2022, 10:30:07 AM
On 2022-10-14, Helmut Richter <hr.u...@email.de> wrote:

>> IIRC it is Swedish that has simplified the pattern such that there
>> is no linking element for noun+noun, but a fixed -s- is inserted
>> in noun+noun+s+noun.
>
> IYRC: Is this feature of Swedish dependent on whether the construct is
> built as (word+word)+word or as word+(word+word)?

Sorry, I seem to remember that I read it in one of Damaris Nübling's
publications on the topic of linking elements in German, but I can't
find it.

The rather lengthy Swedish Wikipedia page...
https://sv.wikipedia.org/wiki/Sammans%C3%A4ttning_(lingvistik)
... does not appear to discuss this.

> In English the first of these has the spelling "word-word word", the
> second has "word word word", and the intonation is different. I remember
> the somewhat weird example
>
> purple elephant gun: a purple gun for shooting elephants
> purple-elephant gun: a gun for shooting purple elephants

I think you'll find that spelling rule widely ignored.

Helmut Richter

Oct 21, 2022, 7:10:44 AM
On Sat, 15 Oct 2022, Ruud Harmsen wrote:

> Fri, 14 Oct 2022 11:17:26 +0200: Helmut Richter <hr.u...@email.de>
> scribeva:
> > [...]

> >A "Mädchenhandelsschule" could be a business school for girls, or else a
> >school teaching Mädchenhandel (white slave trade).
>
> The first has secondary stress on Mäd and Schu, primary stress on Han.

It depends on the alternative: is it in contrast to another girls’ school
or to another business school that is not exclusively for girls?

If there is no alternative, e.g. just explaining on a city tour what the
building is about, I would perhaps stress Schu not so much as the other
two.

> The second has primary stress on Mäd, secondary on Han and Schu.

Yes.

--
Helmut Richter