* Thomas Bushnell, BSG | Does a toneme in Norwegian extend past a single syllable, however? I | don't know the answer to that question.
Basically, the whole word is either rising, falling, or rising-falling, and in combining words, the intonation of both words changes. For instance, English as a Second Language means something different from English as the Second Language. In Norwegian we have "Norsk som andre sprog" and "Norsk som andresprog", where the former means either "like other languages" or "as the second language" depending on tone, and is also distinguished from the latter by tone, not by stress. This is particularly funny when those furriners try to find the section in the bookstore that would help them get just this point and doubly funny when the bookstore cannot even spell it correctly, which their all too young information desk attendant could not pick up from the tone difference even though several bystanders could, and laughed, when I tried in vain to point out the funny mistake to her.
| The tones actually extend beyond just the vowel, and affect timing and | intonation of the whole word, however. But they are assigned to the | stressed vowel only, and are counted as various phonemic variants of that | vowel. | | The situation might work out similarly in Norwegian, dunno.
I do not know Classical Attic Greek so I cannot say for certain, but your brief description makes me believe there is a good chance of a similarity.
/// -- In a fight against something, the fight has value, victory has none. In a fight for something, the fight is a loss, victory merely relief.
Kenny Tilton <ktil...@nyc.rr.com> writes:
> Ed L Cashin wrote:
> > I'd be happy to hear > > a good case for case-insensitive identifiers.
> I've done a ton of case-sensitive C and I've done a ton of code in > case-insensitive languages. I like case-insensitive much, much more. > Does that count?
Does for me. Also, I think Erik Naggum provided a good argument undermining my original assumption that 'a' and 'A' are different characters in reality (even though they have distinct encodings in ASCII, as opposed to his ideal), and Kent Pitman pointed out that psychologically, we think of strings of letters first, remembering case less frequently.
> A deeper reason is that it seems weird to use case to differentiate two things. If I looked down and saw an app with two functions, say, ABLE-P and able-p, meaning different things which the case was meant to convey, I would have regrettably ungenerous thoughts regarding the author.
Yes. The responses I've read beg the question, "Isn't it the code author's fault if case-sensitivity is abused?" I mean, if the language is case sensitive and people write poor-quality code, is that the language designer's fault?
As for case-folding issues in the face of different languages with different notions of upper and lower case, it seems like many hairy issues are associated with it, and I'm looking forward to the day when I can appreciate them all!
> Yes. The responses I've read beg the question, "Isn't it the code > author's fault if case-sensitivity is abused?" I mean, if the > language is case sensitive and people write poor-quality code, is that > the language designer's fault?
Yes, it is the language designer's fault. At least to some degree.
Languages make some errors easier to make and others harder, some concepts easier to express and others harder. And that is in the language designer's power to shape.
> As for case-folding issues in the face of different languages with > different notions of upper and lower case, it seems like many hairy > issues are associated with it, and I'm looking forward to the day when > I can appreciate them all!
Just curious -- which languages have different notions of upper and lower case characters (if you are talking about programming languages, of course)?
> > If I looked down and saw an app with two functions, say, > > ABLE-P and able-p, meaning different things which the case was meant > > to convey,...
> Yes. The responses I've read beg the question, "Isn't it the code > author's fault if case-sensitivity is abused?"
Oh, I jumped in on the middle of this (just can't keep up with c.l.l. anymore!) and maybe I missed something. Are you saying that ABLE-P vs able-p is poor quality code, but SomeThingElse vs somethingelse is not? If so, what would SomeThingElse be? If not...never mind. :)
--
kenny tilton clinisys, inc --------------------------------------------------------------- "Harvey has overcome not only time and space but any objections." Elwood P. Dowd
>> > If I looked down and saw an app with two functions, say, >> > ABLE-P and able-p, meaning different things which the case was meant >> > to convey,...
>> Yes. The responses I've read beg the question, "Isn't it the code >> author's fault if case-sensitivity is abused?"
> Oh, I jumped in on the middle of this (just can't keep up with > c.l.l. anymore!) and maybe I missed something. Are you saying that ABLE-P vs > able-p is poor quality code, but SomeThingElse vs somethingelse is not? > If so, what would SomeThingElse be? If not...never mind. :)
I believe what he's saying is that whichever of those are poor quality code - I would suggest that they all are - it isn't the fault of the language designer(s) but of the programmer.
Which brings up the question:
Is it at all possible to use case-sensitivity in an appropriate manner?
> ... It does an excellent job of explaining the > distinction between glyph and character. I think you need it much more > than trying to defend yourself by insulting me with your ignorance.
imagine how much time you would have saved yourself and everyone else had you just posted a useful part of the actual unicode standard, for example pp. 13, "Characters, Not Glyphs" [1]
The Unicode standard draws a distinction between /characters/ which are the smallest components of written language that have semantic value, and /glyphs/, which represent the shapes that characters can have when they are rendered or displayed. Various relationships may exist between character and glyph; a single glyph may correspond to a single character, or to a number of characters, or multiple glyphs may result from a single character.
[etc]
but it is more fun to lecture, and madly scribble on the board, isn't it? :-]
oz --- [1] The Unicode Standard Version 3.0, Addison-Wesley, 2000.
* o...@cs.yorku.ca (ozan s. yigit) | imagine how much time you would have saved yourself and everyone else | had you just posted a useful part of the actual unicode standard, for | example pp. 13, "Characters, Not Glyphs" [1]
Imagine how much time people would have saved _everybody_ if they cared to study something before they thought they had the right to produce "opinions". "When did ignorance become a point of view?" Then imagine how much time it would take to find out what some ignorant fuck needs to hear in order to become unconfused. It is not my task to educate people who voice opinions on what they do not have the intellectual honesty and wherewithal to realize that they do not know sufficiently well. People who cannot keep track of what they know and what they do not know, should shut the fuck up, but they never will, precisely because they are unaware of what they know and do not know. Wade Humeniuk gave us a good analogy to his yoga classes and the mat-abusers. Non-thinking cretins who post ignorant opinions to newsgroup are just the same kind of inconsiderate bastards. But you choose to _defend_ them. What does that make you? Those who have the intellectual honesty to separate what they know from what they just assume, also know where they heard something and can rate its probability and credibility. Those are worth helping, because they are likely to learn from it. Those who are unlikely to learn from what you tell them, are a waste of time.
| but it is more fun to lecture, and madly scribble on the board, isn't it? :-]
Your life experiences apparently differ quite significantly from mine, but if you feel happy about exposing yourself like this, please do. More idiotic drivel that lets the world know how you think is probably going to be the result of your obvious desire to inflame rather than inform, so go ahead, make a spectacle of yourself. This newsgroup is quite used to your kind by now.
IPmonger <ipmon...@delamancha.org> writes:
> Which brings up the question:
> Is it at all possible to use case-sensitivity in an appropriate manner?
Sure, at least in non-lispy languages. A lot of C code uses the convention that identifiers in all caps are macros. Prolog (Edinburgh syntax) requires leading capitalization to distinguish variables from constant symbols (eliminating the need to quote constants in Prolog). In Haskell, capitalized symbols indicate data types and constructors. This makes the Haskell pattern matching syntax nicer. Some languages guarantee that all built-in identifiers will be of one case so that the user can use identifiers with the other case without fear of colliding with a current or future language-defined id.
As was noted a long time ago in this thread, in all these cases it's harder to read the code aloud since capitalization doesn't change the pronunciation. In practice, I think users of these languages have found that case distinctions work well all the same.
Torsten <vi...@fraqz.archeron.dk> writes:
> Nils Goesche <n...@cartan.de> wrote:
> > But I have /read/ texts that didn't use capitalization in > > German, and it was very annoying in that it just makes harder > > to guess how a sentence is likely to end, something that is > > very important in German (the verb at the end...). [...] This > > has been measured, BTW.
> I hope you can see the obvious flaw in such measurements. There > is no large German speaking group not trained to capitalize > nouns.
Well, duh. Giving up the extra effort of capitalization isn't exactly something you have to train anybody for. I have read several *books* in German that didn't use capitalization at all. It was horrible. I don't have much time in the morning, but still manage to read large parts of the Frankfurter Allgemeine every morning, in very little time. I sometimes ``observe'' myself how I read that fast, and I found out that when looking at a whole block of text at a time, the visual structure of sentences indicated by capitalized words is a very useful help for the reader.
That's the whole point of capitalization. It makes reading easier, in German, anyway. The /only/ argument I /ever/ heard against capitalization was that it supposedly makes /writing/ easier for some retarded children. Well, even if that were true, which it isn't, who would want to read anything written by retarded children, anyway? Why do you write something down in the first place, if it wasn't for somebody to read? It's the reader that counts, not the writer.
I don't know Danish. Maybe it doesn't make a difference there. Maybe Danes only used to capitalize nouns because the Germans did, and as the Germans weren't exactly very popular in 1948, that might have been a good opportunity to give up on it.
Or maybe it wasn't. Who is supposed to know anymore? You said there was a controversy about it; maybe the people who were against it were right? Who would remember? How could you tell? When you lose a piece of culture like that, later generations don't remember or miss any of it anymore, but that doesn't mean it was right to drop it.
Suppose all the governments in the world suddenly decide to put an end to this babylonian mess of programming languages and make it a law that from now on, you are only allowed to program in, say, Java. We'd hate that and complain, but would be arrested and put into concentration camps until we either learn and publicly announce how great Java actually is, and how sorry we are for not recognizing that earlier, or are given the coup de grace if too stubborn.
What would happen after a few decades or so? I tell you what: People would be happy. They'd laugh about us crazy freaks who were too stupid to recognize the merits of their progressive, modern ideas. Nobody would remember any of the old languages, and how would young people know that they were worth anything, if every history book tells them that they were stupid and anti-modern? Everybody can see that they can write everything in Java, so why should they miss anything?
Regards, -- Nils Goesche Ask not for whom the <CONTROL-G> tolls.
> So, getting back to my original question about charset implementations > in Lisp/Scheme (though actually Smalltalk or any such > [much snippage] > So that means it's pretty easy to make sure the whole space of > UCS values fits in an immediate representation. That's fine for > working with actively used data.
Even for actively used data, compactness of representation pays off in better cache efficiency. In fact, particularly for actively used data should we be mindful of this. Since you seem to be thinking of a 32-bit immediate representation, an improvement to 16-bit strings or even 8-bit strings is nothing to be sneezed at.
> However, strings that are going to be kept around a long time should, > it seems to me, be stored more compactly. Essentially all strings > will be in the Basic Multilingual Plane, so they can fit in 16 bits. > That means there would be two underlying string datatypes. I don't > think this is a serious problem.
As an implementor, I can tell you that actually the step from one string type to two is the hardest bit. Once you've figured out how you want to implement that, having more is not such a big deal. From a programmer's point of view, the efficiency gains from more string types outweigh the costs (unless you think you could do without the larger ones), even if you have to deal with it explicitly.
> Is it worth having a third (for 8-bit characters) so that Latin-1 > files don't have to be inflated by a factor of two? It seems to me > that this would be important too.
Files and strings don't really have much to do with each other. Files are an externalization issue. Of course you can store files in UCS, and sometimes that's the right thing to do, but in the real world, you have to deal with all kinds of encodings, so you need the machinery, anyway, to read and write Shift-JIS, Big5, Latin-1, UTF-8, etc.
Like I said above, it _is_ important to have an 8-bit string type. People in the West, who rarely even realize they could easily support 16-bit users, will get great benefits. And between files and "actively used data", there are those people who want to load their entire database in main memory and compute with that; they'll get their size limit extended as well.
> Basically then we would have strings which are UCS-4, UCS-2 and > Latin-1 restricted (internally, not visibly to users). [...] > Procedures like string-set! therefore might have to inflate (and > thus copy) the entire string if a value outside the range is stored. > But that's ok with me; I don't think it's a serious lose.
I suppose that is a viable implementation strategy, but I don't think it's the right option. The language should expose the range of string data types to the programmer, and let them choose, because the range of memory usage is just too great to sweep under the mat. Also, having strings automatically reallocated means an extra indirection for access which cannot always be optimized away.
I note that offering multiple string types is exactly what all the CL implementations seem to have done. This doesn't preclude having features that automatically select the smallest feasible type, e.g., for "" read syntax or a STRING-APPEND function. -- Pekka P. Pirinen The gap between theory and practice is bigger in practice than in theory.
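A minimal sketch of the "automatically select the smallest feasible type" idea from the post above, in portable Common Lisp. The name COMPACT-STRING-APPEND is hypothetical and not taken from any implementation mentioned in the thread. The sketch only distinguishes BASE-CHAR from CHARACTER, because those are the only two string element types the standard names; an implementation with more string widths would have more cases to pick from.

```lisp
(defun smallest-element-type (strings)
  "Return BASE-CHAR if every character in every string of STRINGS is a
BASE-CHAR, otherwise CHARACTER."
  (if (every (lambda (string)
               (every (lambda (char) (typep char 'base-char)) string))
             strings)
      'base-char
      'character))

(defun compact-string-append (&rest strings)
  "Concatenate STRINGS into a fresh string of the narrowest element type
that can hold all of their characters."
  (let ((result (make-string (reduce #'+ strings :key #'length)
                             :element-type (smallest-element-type strings)))
        (position 0))
    ;; Copy each argument into place; REPLACE does the element-wise copy.
    (dolist (string strings result)
      (replace result string :start1 position)
      (incf position (length string)))))
```

The caller never names a string type: (compact-string-append "foo" "bar") comes back as a base string, while a single wide character among the arguments would make the result a general string. Whether that saves any memory depends on whether the implementation actually stores base strings more compactly.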
Pekka.P.Piri...@globalgraphics.com (Pekka P. Pirinen) writes:
> > Is it worth having a third (for 8-bit characters) so that Latin-1 > > files don't have to be inflated by a factor of two? It seems to me > > that this would be important too.
> Files and strings don't really have much to do with each other. Files > are an externalization issue. Of course you can store files in UCS, > and sometimes that's the right thing to do, but in the real world, you > have to deal with all kinds of encodings, so you need the machinery, > anyway, to read and write Shift-JIS, Big5, Latin-1, UTF-8, etc.
In the system I'm contemplating, there are no files in the normal sense of the term; all user data lives as strings, more or less (there might be something more clever, but whatever). Whatever strategies are done for strings (and similar structures) will be important for all files.
So such data has to be efficiently stored...
> I note that offering multiple string types is exactly what all the CL > implementations seem to have done. This doesn't preclude having > features that automatically select the smallest feasible type, e.g., > for "" read syntax or a STRING-APPEND function.
But this is, it seems to me, unclean.
I think of it as being similar to the way numbers work. Yes, I can find out whether a given number is a fixnum or a bignum, and I might well care in some special case. But normally I just use numbers and expect the system to automagically do the right thing.
Similarly, I want the string type to simply encode Unicode strings, and the user should not be forced to deal with more. The user should not need to guess at the time the string is created whether or not it will later need to hold a bigger character code, for example.
Nils Goesche <n...@cartan.de> writes: > Torsten <vi...@fraqz.archeron.dk> writes:
> > Nils Goesche <n...@cartan.de> wrote:
> > > But I have /read/ texts that didn't use capitalization in German, and it was very annoying in that it just makes harder to guess how a sentence is likely to end, something that is very important in German (the verb at the end...). [...] This has been measured, BTW.
[...]
hmm... I have been speaking and writing in German quite a while now and my knowledge of grammar says that the predicate is at the second place in the main clause. Only its non-finite part goes to the end.
> What would happen after a few decades or so? I tell you what: > People would be happy. They'd laugh about us crazy freaks who > were too stupid to recognize the merits of their progressive, > modern ideas. Nobody would remember any of the old languages, > and how would young people know that they were worth anything, if > every history book tells them that they were stupid and > anti-modern? Everybody can see that they can write everything in > Java, so why should they miss anything?
That reminds me of Paul Graham saying that when he was using BASIC which at his time did not support recursion, he never needed recursion, as he did not know that it existed. And writing a long sentence in English reminds me how I love commas in German to make reading easier. ;)
> > > Nils Goesche <n...@cartan.de> wrote:
> > > > But I have /read/ texts that didn't use capitalization in German, and it was very annoying in that it just makes harder to guess how a sentence is likely to end, something that is very important in German (the verb at the end...). [...] This has been measured, BTW.
> hmm... I have been speaking and writing in German quite a while now and my knowledge of grammar says that the predicate is at the second place in the main clause. Only its non-finite part goes to the end.
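Anscheinend will er mich einfach nicht verstehen. Hat er wirklich kein Beispiel, kein einziges Beispiel, nicht einmal nach langem Sinnen ueber mein wundervolles Posting, dafuer gefunden? [In English: "Apparently he simply does not want to understand me. Has he really found no example of it, not a single example, not even after long reflection on my wonderful posting?" In the German, the participle "gefunden" comes at the very end.] (Yes, that's what I meant)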
> > What would happen after a few decades or so? I tell you what: > > People would be happy. They'd laugh about us crazy freaks who > > were too stupid to recognize the merits of their progressive, > > modern ideas. Nobody would remember any of the old languages, > > and how would young people know that they were worth anything, if > > every history book tells them that they were stupid and > > anti-modern? Everybody can see that they can write everything in > > Java, so why should they miss anything?
> That reminds me of Paul Graham saying that when he was using BASIC > which at his time did not support recursion, he never needed > recursion, as he did not know that it existed.
Like that.
> And writing a long sentence in English reminds me how I love commas in > German to make reading easier. ;)
Me too, hehe :-)
Regards, -- Nils Goesche Ask not for whom the <CONTROL-G> tolls.
> I wouldn't want to muck about internally with a format that had > characters of various different widths: too much pain to implement, > too many chances to introduce bugs, not enough space savings. > Besides, when people read whole files as strings, do you really > want to run through the whole string counting multi-byte characters > and single-byte characters to find the value of an expression like
> (string-ref FOO charcount) ;; lookups in a 32 million character string!
> where charcount is large? I don't. Constant width means O(1) lookup > time.
Well, there are several mitigating factors and some issues with CL which cause difficulties here.
If you consider your string as a sequence, then you can see that the issues with variable width encodings produce a data-type which has the access characteristics of a list.
The arguments for and against lists apply directly to variable-width strings.
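To make the list-like access cost concrete, here is a small sketch that assumes the string is held as a vector of UTF-8 octets (an assumption for illustration; the thread does not fix an encoding). The function names are hypothetical. Finding where character number INDEX starts means walking from the front and counting lead octets, exactly as NTH walks a list.

```lisp
(defun utf8-lead-octet-p (octet)
  "True unless OCTET is a UTF-8 continuation octet (bit pattern 10xxxxxx)."
  (/= (logand octet #xC0) #x80))

(defun utf8-char-offset (octets index)
  "Return the octet offset at which character number INDEX (zero-based)
starts in OCTETS. The walk starts at the front, so the cost grows with
INDEX, unlike AREF on a fixed-width string."
  (let ((seen -1))                      ; index of the last character seen
    (dotimes (offset (length octets)
                     (error "Character index ~D is out of range." index))
      (when (utf8-lead-octet-p (aref octets offset))
        (incf seen)
        (when (= seen index)
          (return offset))))))
```

For the four octets #(97 195 169 98), which encode "a", "é" and "b", character 2 starts at octet 3, because the walk has to step over the two-octet "é" to find it.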
If we look at the use of strings it falls into two fairly distinct categories;
In fact almost everything we do with strings is iterative (which makes sense when you remember why strings are called strings).
The problem is that CL has rather poor support for iterating over sequences.
If we considered a sequence to be addressed through two spaces, one being Index-Space and the other Point-Space, we could avoid a lot of these issues, and make lists more efficiently usable as sequences.
(elt seq index) would access the sequence through index space (which might involve walking down a list N steps).
(elt-p seq point) would access the sequence through a point (which would involve no traversal).
The trick to efficiently exploiting this then would be to get a point from an index.
(dosequence (element point sequence)
  (when (char= element #\!)
    (setf (elt-p sequence point) #\$)))
for a fairly lame example.
with things like (subseq sequence :start-point a :end-point b) it starts to become more flexible.
Or the ability to say (dosequence (element point sequence :start-point point) ...) to allow the continuation of an iteration.
I'm not suggesting that this is an ideal solution, but it should at least point out some inadequacies in the current model.
With appropriate primitives the wide-spread use of list-like strings should not even be considered problematic, imho.
And in answer to the example above, I don't think that anyone would suggest forcing someone to use a variable-width string representation at all times. If random access to a particular string is important to you, then a vector-like string is obviously the way to go.
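One way the point-space proposal in this post could be prototyped in today's CL is sketched below. DOSEQUENCE and ELT-P are the names used in the post; FIRST-POINT and NEXT-POINT are hypothetical helpers invented for this sketch, and the choice of "cons cell for lists, index for vectors" as the point representation is an assumption, not something the post specifies. The :start-point and :end-point extensions are left out.

```lisp
;; A "point" can be dereferenced without traversal: here it is the cons
;; cell for a list and the index for a vector (strings included).

(defgeneric first-point (sequence)
  (:documentation "Point of the first element, or NIL if SEQUENCE is empty.")
  (:method ((sequence list)) sequence)
  (:method ((sequence vector)) (if (plusp (length sequence)) 0 nil)))

(defgeneric next-point (sequence point)
  (:documentation "Point following POINT, or NIL at the end of SEQUENCE.")
  (:method ((sequence list) point) (cdr point))
  (:method ((sequence vector) point)
    (let ((next (1+ point)))
      (if (< next (length sequence)) next nil))))

(defgeneric elt-p (sequence point)
  (:documentation "Element of SEQUENCE at POINT, with no traversal.")
  (:method ((sequence list) point) (car point))
  (:method ((sequence vector) point) (aref sequence point)))

(defgeneric (setf elt-p) (new-value sequence point)
  (:method (new-value (sequence list) point)
    (setf (car point) new-value))
  (:method (new-value (sequence vector) point)
    (setf (aref sequence point) new-value)))

(defmacro dosequence ((element point sequence) &body body)
  "Run BODY with ELEMENT and POINT bound for each element of SEQUENCE."
  (let ((seq (gensym "SEQ")))
    `(let ((,seq ,sequence))
       (do ((,point (first-point ,seq) (next-point ,seq ,point)))
           ((null ,point))
         (let ((,element (elt-p ,seq ,point)))
           ,@body)))))
```

With these definitions the "fairly lame example" from the post runs unchanged on a list of characters and on a string, and in both cases each step costs the same regardless of how far into the sequence it is. A variable-width string would need a third set of methods whose points are octet offsets.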
tb+use...@becket.net (Thomas Bushnell, BSG) wrote in message
> Similarly, I want the string type to simply encode Unicode strings, > and the user should not be forced to deal with more. The user should > not need to guess at the time the string is created whether or not it > will later need to hold a bigger character code, for example.
I think you need to differentiate between mutable and immutable strings.
A mutable string which is not explicitly restricted (such as simple-base-string) needs to be able to hold any character, so it needs to be conservative.
An immutable string cannot be modified, so you are free to encode it however you like, as long as you can represent whatever you have in it.
The remainder of the problem is the idea of strings as vectors rather than sequences; as sequences, O(1) access is no longer an issue (although you'd want better iteration support than CL currently provides).
Beyond this it should be trivial to have an immutable string type which knows what encoding it is using, and can tell the system what accessor to use.
As a side-note, string literals and the names of symbols are immutable in CL.
In addition you would need an operator to encode a mutable string as an immutable string (using a given encoding). Options for immutable construction in subseq, concatenate, string-output-stream, etc. would also be useful.
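A rough sketch of such an immutable string type, under loudly stated assumptions: all names here (FROZEN-STRING, FREEZE-STRING, FROZEN-CHAR, THAW-STRING) are hypothetical, the "encoding" is reduced to a choice among 8, 16 and 32 bits per character code, and CHAR-CODE is assumed to return something like a Unicode code point, which the standard does not promise. Nothing stops a determined caller from mutating the underlying vector; the immutability is by convention only.

```lisp
(defstruct (frozen-string (:constructor %make-frozen-string (width codes)))
  ;; WIDTH is the number of bits per stored character code: 8, 16 or 32.
  (width 8 :read-only t)
  ;; CODES is a specialized vector of character codes, never modified.
  (codes nil :read-only t))

(defun freeze-string (string)
  "Encode STRING as a FROZEN-STRING using the narrowest width that can
represent every character in it."
  (let* ((max-code (reduce #'max string :key #'char-code :initial-value 0))
         (width (cond ((< max-code #x100) 8)
                      ((< max-code #x10000) 16)
                      (t 32)))
         (codes (make-array (length string)
                            :element-type `(unsigned-byte ,width))))
    (map-into codes #'char-code string)
    (%make-frozen-string width codes)))

(defun frozen-char (frozen index)
  "Character number INDEX of FROZEN; O(1) because the width is fixed."
  (code-char (aref (frozen-string-codes frozen) index)))

(defun thaw-string (frozen)
  "Return a fresh, mutable string with the contents of FROZEN."
  (map 'string #'code-char (frozen-string-codes frozen)))
```

Because the width is decided once, at freezing time, nobody has to guess in advance how large a character the string will later need to hold, which is the objection raised earlier in the thread against exposing several string types.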
> Well, duh. Giving up the extra effort of capitalization isn't > exactly something you have to train anybody for. I have read > several *books* in German that didn't use capitalization at all. > It was horrible.
I am talking about the measurement results. You claimed that the capitalization makes reading easier. But what was measured? The ability to read something that differs from the conventional way of writing German in comparison to the way everybody and his dog learned in school. Somehow the result isn't all that surprising. Where was the control group? The people who had grown up using only non-capitalized German. How do you think they would fare in such a test if they existed? They would have had no preconceived ideas about non-capitalization looking weird.
> I don't have much time in the morning, but still manage to read > large parts of the Frankfurter Allgemeine every morning, in > very little time. I sometimes ``observe'' myself how I read > that fast, and I found out that when looking at a whole block > of text at a time, the visual structure of sentences indicated > by capitalized words is a very useful help for the reader.
Most likely because that's what you are used to.
> I don't know Danish. [...] You said there was a controversy > about it; maybe the people who were against it were right?
The most vocal opponents were the kind of people who always think the world is coming to an end if anything changes. They are now spending their energy on how to, or not to, place commas. In general, they are surprisingly clueless about the subjects they make sarcastic remarks about.
* Brian Spilsbury | I think you need to differentiate between mutable and immutable | strings.
I have suggested that strings need to be separated into two more basic types: a stream which you read one element at a time, and a vector which provides random access. The former maps directly to files and is suitable for parsing and formatting, while a vector of characters is more useful for repeated access to the same characters.
We have the system class string-stream today, which offers stream access to a string, but I think we need a subclass of string like stream-string, which may contain such things as the octets from another stream such as directly from an input file, and be processed sequentially, and therefore should also be able to use stateful encodings such that reading through them with the string-stream functions would maintain that state.
Torsten <vi...@fraqz.archeron.dk> writes:
> Nils Goesche <n...@cartan.de> wrote:
> > Well, duh. Giving up the extra effort of capitalization isn't > > exactly something you have to train anybody for. I have read > > several *books* in German that didn't use capitalization at all. > > It was horrible.
> I am talking about the measurement results. You claimed that the capitalization makes reading easier. But what was measured? The ability to read something that differs from the conventional way of writing German in comparison to the way everybody and his dog learned in school. Somehow the result isn't all that surprising. Where was the control group? The people who had grown up using only non-capitalized German. How do you think they would fare in such a test if they existed? They would have had no preconceived ideas about non-capitalization looking weird.
I can give you an example from a learner of Japanese: The use of Kanji (Chinese characters) in place of their hiragana (phonetic) counterparts is just as redundant as capitalization is in German. A literate speaker of Japanese can easily read text that is written entirely in hiragana (although it might bother him or her a bit). In fact, historically, there was a time when women were not allowed to write Kanji, so this had to be true more or less by definition.
There are on the order of 50 hiragana, but there are several thousands of Kanji -- which means that learning just hiragana is immensely easier than learning both. According to the above, one would expect that someone without prior exposure to either system would have an easier time reading pure hiragana text.
I, having not been raised in Japan, fall into this category of having no prior exposure. But what can I tell you? The moment I managed to memorize even just a tiny number of Kanji, sentences that actually used them (in place of their hiragana spellings) became *vastly* easier to read for me. I am not a psychologist or linguist, so I won't speculate on why that is.
So if it were true that either way would be equally easy to read for someone without prior training, why would an utterly untrained person such as I (and pretty much all of my fellow students as well, BTW) see this effect? In other words, there is certainly more going on than just a "trained dog effect".
> The most vocal opponents were the kind of people who always think > the world is coming to an end if anything changes. They are now > spending their energy on how to, or not to, place commas. In > general, they are surprisingly clueless about the subjects they > make sarcastic remarks about.
[ ... which brings us back to the topic of "ad hominems". Just because idiots or bigots defend something, that something does not have to be wrong. (It, of course, does not mean that it is right either.) ]
>There are on the order of 50 hiragana, but there are several thousands of Kanji -- which means that learning just hiragana is immensely easier than learning both. According to the above, one would expect that someone without prior exposure to either system would have an easier time reading pure hiragana text.
>I, having not been raised in Japan, fall into this category of having >no prior exposure. But what can I tell you? The moment I managed to >memorize even just a tiny number of Kanji, sentences that actually >used them (in place of their hiragana spellings) became *vastly* >easier to read for me. I am not a psychologist or linguist, so I >won't speculate on why that is.
>So if it were true that either way would be equally easy to read for >someone without prior training, why would an utterly untrained person >such as I (and pretty much all of my fellow students as well, BTW) see >this effect? In other words, there is certainly more going on than >just a "trained dog effect".
Do kanji perhaps serve as some sort of abbreviation, or, I should rather say, syntactic abstraction? If so, their appeal may have the same reason as why no-longer-newbie users of a programming language prefer to extend the language with (their own or others') procedural and textual abstractions rather than sticking to core procedures and core syntax. I'm speculating only.
(BTW, abbreviations I should note are not just for saving space or "typing". They actually aid comprehension by reducing the time taken for cliches that don't deserve that time, and correspondingly letting the non-cliche part of the communication be highlighted more. Even electronic communication, where space is not expensive the same way as on paper, and where "completion" aids abound, profits from abbreviations.)
> >There are on the order of 50 hiragana, but there are several thousands of Kanji -- which means that learning just hiragana is immensely easier than learning both. According to the above, one would expect that someone without prior exposure to either system would have an easier time reading pure hiragana text.
> >I, having not been raised in Japan, fall into this category of having > >no prior exposure. But what can I tell you? The moment I managed to > >memorize even just a tiny number of Kanji, sentences that actually > >used them (in place of their hiragana spellings) became *vastly* > >easier to read for me. I am not a psychologist or linguist, so I > >won't speculate on why that is.
> Do kanji perhaps serve as some sort of abbreviation, > or, I should rather say, syntactic abstraction? If so, > their appeal may have the same reason as why > no-longer-newbie users of a programming language prefer > to extend the language with (their own or > others') procedural and textual abstractions rather > than sticking to core procedures and core syntax. I'm > speculating only.
Kanji often directly denote a certain meaning of a word; they're like images. When I ask my wife about the meaning of a Japanese word I'd heard or read (in Latin characters) somewhere, she is usually helpless: She can't tell until she sees the kanji sign. Hiragana only describes the sound of a word, like our Latin characters. She told me that sometimes Japanese would actually draw a kanji sign in the air when talking, to indicate what meaning of a word they're saying is intended. For instance, there is a kanji sign that has one and only one meaning: Kant's notion of ``category'' :-)
Regards, -- Nils Goesche Ask not for whom the <CONTROL-G> tolls.
In article <fo3cyjfm6j....@trex10.cs.bell-labs.com>, Matthias Blume <matth...@shimizu-blume.com> wrote:
> [...] The moment I managed to > memorize even just a tiny number of Kanji, sentences that actually > used them (in place of their hiragana spellings) became *vastly* > easier to read for me. I am not a psychologist or linguist, so I > won't speculate on why that is.
_One_ reason is that Japanese does not use white space to delimit words. So all-hiragana text feels like reading
MakeLoadFormSavingSlots
instead of
make-load-form-saving-slots
-- <keke at mac com> Are you sure that sound might want to have an idiot?
> > [...] The moment I managed to > > memorize even just a tiny number of Kanji, sentences that actually > > used them (in place of their hiragana spellings) became *vastly* > > easier to read for me. I am not a psychologist or linguist, so I > > won't speculate on why that is.
> _One_ reason is that Japanese does not use white spaces to delimit > the words.