>> Capitalization _is_ incidental. It is ceremonially marked in written text, but my impression, based on a basic knowledge of linguistics and a casual outside view of German [I don't purport to speak the language], is that German people may claim that "weg" and "Weg" are different words, but the capitalization is not pronounced audibly, so there is generally enough contextual information to disambiguate in speech.
> Well, in fact 'Weg' and 'weg' *are* pronounced differently, one with a long 'e' and the other with a short one - that is because they are different words. Should you incidentally start a sentence with 'weg', thus writing it with a capital 'W', it would still be pronounced like 'weg'. This might be difficult to understand, but that is how natural languages are, I guess.
To me, the claim that case is indeed ornamental is supported by the fact that it appears to be permissible to upper-case a German sentence in its entirety without construing it as a loss of information.
BITTE EIN BIT
ICH BIN EIN BERLINER
DIE MAUER MUSS WEG!
usw.
I.e., things like titles, slogans, and billboards; but also consider the GPL or other license text in German, where large globs of the prose are in all caps. Legal prose, it seems to me, would especially avoid courting information loss in this manner if it were felt that there really was a risk.
I'm curious: Is there an example, however frivolous, where WEG in an all-caps sentence could be ambiguous?
BTW, the {Weg, weg} pair seems very like the {produce (noun), produce (verb)} pair in English. Like Weg/weg, produce/produce are pronounced differently. However, they don't rely on capitalization, even though the grammatical context used to disambiguate between them has fewer cues than the German.
Matthias Blume <matth...@shimizu-blume.com> writes:
> By the way, here is an example in a case-sensitive natural language where the distinction between uppercase and lowercase gets *pronounced*: "mit" vs. "MIT" in German. The first means "with" and is pronounced like "mitt", the second is the Massachusetts Institute of Technology and is pronounced like speakers of English would pronounce it: em-ay-tee. I think that there are enough examples of this around
This is "supremely silly", if there is such a thing, even ignoring for the time being that MIT is neither a German word nor a German abbreviation, and that probably a large number of German speakers will not recognize MIT as standing for "the" MIT, nor pronounce it as speakers of English would. The different pronunciation of mit vs. MIT doesn't result from the difference in case at all. If you receive a telex that informs you of an invitation to "the mit", you will pronounce "mit" just as you would "MIT". qed.
Of course that doesn't mean that case should be completely ignored; it just means that case is just another attribute of text, like fonts, and that there is little reason to encode it in the character.
It also means that you want to distinguish between mit (with) and MIT (the institute) not based on case, but based on packages, i.e.
(and (not (eq 'german-words:mit 'universities:mit))
     ;; And now an example where case will not help in disambiguation
     ;; namely the sequence "tub", standing for both the english word
     ;; tub and the common abbreviation for the Technische Universität
     ;; Berlin
     (not (eq 'english-words:tub 'universities:tub)))
Regs, Pierre.
-- Pierre R. Mai <p...@acm.org> http://www.pmsf.de/pmai/ The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents. -- Nathaniel Borenstein
d...@goldshoe.gte.com (Dorai Sitaram) writes:
> I'm curious: Is there an example, however frivolous, where WEG in an all-caps sentence could be ambiguous?
Yes, there is a joke about a stupid person who tries to figure out which street he is in and comes up with
"We are on the trail with the nukes."
because he misread the slogan
"WEG MIT DEN ATOMWAFFEN" (meaning "GET RID OF THE NUKES")
as a streetsign.
> BTW, the {Weg, weg} pair seems very like the {produce (noun), produce (verb)} pair in English. Like Weg/weg, produce/produce are pronounced differently.
In this case, there is at best a very remote semantic relationship (if any). It is definitely nowhere near a noun/verb sort of thing.
> > I'm curious: Is there an example, however frivolous, where WEG in an all-caps sentence could be ambiguous?
> Yes, there is a joke about a stupid person who tries to figure out which street he is in and comes up with
> "We are on the trail with the nukes."
> because he misread the slogan
> "WEG MIT DEN ATOMWAFFEN" (meaning "GET RID OF THE NUKES")
> as a streetsign.
Yes, but this kind of confusion can happen whether case is involved or not, and I think it's not fair to ascribe it to case as the principal cause. We have signs on our highways that say "FINE FOR LITTERING". Writing them in lowercase won't help. ;-)
> > BTW, the {Weg, weg} pair seems very like the {produce (noun), produce (verb)} pair in English. Like Weg/weg, produce/produce are pronounced differently.
> In this case, there is at best a very remote semantic relationship (if any). It is definitely nowhere near a noun/verb sort of thing.
There is a phenomenon in English speech wherein stress matters, too, and we sometimes italicize not just to control emphasis but to actively disambiguate. A prime example of this is an effect called anaphoric de-stressing (that is, lessening stress in order to turn a reference into an anaphoric reference--that is, a reference to a previously known entity--instead of a non-anaphoric reference, that is, a reference to a newly introduced entity). The example I've seen is a story of a newsreader misreading an account of how a man, upon hearing his wife had had an affair with another man, had said he wanted to shoot the bastard. (Note how the sentence changes meaning depending on whether you put stress on _shoot_ or on _bastard_.) Written English doesn't mark this distinction, even though it's present and by some stretch important in spoken English. People figure it out.
* Matthias Blume
| I was under the impression that you thought you already did. :-)
Wipe that moronic grin off your face, dimwit. What your retarded impression of other people might be should not concern anybody else. Such despicably stupid behavior should have been punished by people who cared about you. Why have they not?
| To be frank, I do not care *one bit* about what this discussion was originally about.
Of course not. Moronic grins are a pretty strong indicator of impaired mental capacity, starting with the sheer inability to take other people seriously.
| I was merely commenting on your claim about capitalization being "incidental". The debate of whether or not case-sensitive identifiers in programming languages are Good or Evil, or which character set designs use up more bits than others, etc., bores me.
I tried to suggest _strongly_ that you should go back to daytime TV, but did you get it? No. How amazingly dense you must be.
/// -- In a fight against something, the fight has value, victory has none. In a fight for something, the fight is a loss, victory merely relief.
* Thomas Bushnell, BSG
| The GNU/Linux world is rapidly converging on using UTF-8 to hold 31-bit Unicode values. Part of the reason it does this is so that existing byte streams of Latin-1 characters can (pretty much) be used without modification, and it allows "soft conversion" of existing code, which is quite easy and thus helps everybody switch.
UTF-8 is in fact extremely hostile to applications that would otherwise have dealt with ISO 8859-1. The addition of a prefix byte has some very serious implications. UTF-8 is an inefficient and stupid format that should never have been proposed. However, it has computational elegance in that it is a stateless encoding. I maintain that encoding is stateful regardless of whether it is made explicit or not. I therefore strongly suggest that serious users of Unicode employ the compression scheme that has been described in Unicode Technical Report #6. I recommend reading this technical report.
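A small sketch may make the prefix-byte point concrete. It is written in Python purely for illustration, and the sample string and helper name are assumptions, not anything from the posts above. It shows that ASCII bytes are identical in both encodings, while an ISO 8859-1 character above 0x7F becomes two bytes in UTF-8, so a Latin-1 byte stream containing such characters is not valid UTF-8 as it stands.

```python
# Illustrative sample text; any string with a non-ASCII Latin-1 character works.
text = "Universit\u00e4t"

latin1 = text.encode("iso-8859-1")
utf8 = text.encode("utf-8")

print(len(latin1))  # one byte per character
print(len(utf8))    # the a-umlaut takes two bytes in UTF-8

# Pure ASCII text is byte-for-byte identical in both encodings.
ascii_same = "Weg".encode("iso-8859-1") == "Weg".encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: report whether the bytes decode as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The Latin-1 bytes for the sample text are rejected by a UTF-8 decoder.
latin1_is_utf8 = is_valid_utf8(latin1)
print(ascii_same, latin1_is_utf8)
```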
Incidentally, if I could design things all over again, I would most probably have used a pure 16-bit character set from the get-go. None of this annoying 7- or 8-bit stuff. Well, actually, I would have opted for more than 16-bit units -- it is way too small. I think I would have wanted the smallest storage unit of a computer to be 20 bits wide. That would have allowed addressing of 4G of today's bytes with only 20 bits. But I digress...
| So even if strings are "compressed" this way, they are not UTF-8. That's Right Out. They are just direct UCS values. Procedures like string-set! therefore might have to inflate (and thus copy) the entire string if a value outside the range is stored. But that's ok with me; I don't think it's a serious lose.
There is some value to the C/Unix concept of a string as a small stream. Most parsing of strings needs to proceed from start to end, so there is no point in optimizing them for direct access. However, a string would then be different from a vector of characters. It would, conceptually, be more like a list of characters, but with a more compact encoding, of course. Emacs MULE, with all its horrible faults, has taken a stream approach to character sequences and then added direct access into it, which has become amazingly expensive.
I believe that trying to make "string" both a stream and a vector at the same time is futile and only leads to very serious problems. The default representation of a string should be a stream, not a vector, and accessors should use the stream, such as with make-string-{input,output}-stream, with new operators like dostring, instead of trying to use the string as a vector when it clearly is not. The character concept needs to be able to accommodate this, too. Such pervasive changes are of course not free.
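The cost difference between stream access and vector access can be sketched as follows. This is an illustrative Python example only: it uses UTF-8 as a stand-in for any compact variable-width encoding (it is not a description of MULE's internal format), it assumes well-formed input, and the function names are hypothetical. Walking the characters front to back is a single pass, while fetching the character at a given index has to walk the stream from the start.

```python
def iter_chars(data: bytes):
    """Yield characters from well-formed UTF-8 bytes, front to back."""
    i = 0
    while i < len(data):
        first = data[i]
        if first < 0x80:            # 0xxxxxxx: single byte
            width = 1
        elif first >> 5 == 0b110:   # 110xxxxx: two-byte sequence
            width = 2
        elif first >> 4 == 0b1110:  # 1110xxxx: three-byte sequence
            width = 3
        else:                       # 11110xxx: four-byte sequence
            width = 4
        yield data[i:i + width].decode("utf-8")
        i += width


def char_at(data: bytes, index: int) -> str:
    """Vector-style access on a stream: cost grows with the index."""
    for n, ch in enumerate(iter_chars(data)):
        if n == index:
            return ch
    raise IndexError(index)


encoded = "\u00c5ngstr\u00f8m".encode("utf-8")
print(list(iter_chars(encoded)))
print(char_at(encoded, 6))
```

A vector of fixed-width code points would make `char_at` constant-time at the price of more storage, which is the trade-off under discussion.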
| Ok, then the second question is about combining characters. Level 1 support is really not appropriate here. It would be nice to support Level 3. But perhaps Level 2 with Hangul Jamo characters [are those required for Level 2?] would be good enough.
Level 2 requires every other combining character except Hangul Jamo.
| It seems to me that it's most appropriate to use Normalization Form D.
I agree for the streams approach. I think it is important to make sure that there is a single code for all character sequences in the stream when it is converted to a vector. The private use space should be used for these things, and a mapping to and from character sequences should be maintained such that if a private use character is queried for its properties, those of the character sequence would be returned.
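One way to picture the private-use mapping is the following Python sketch. It is an illustration under stated assumptions, not a description of any existing implementation: the class and method names are hypothetical, only the Basic Multilingual Plane private use area (U+E000 to U+F8FF) is used, and "the properties of the character sequence" is simplified to the general category of the base character.

```python
import unicodedata

PRIVATE_USE_START = 0xE000
PRIVATE_USE_END = 0xF8FF


class SequenceTable:
    """Hypothetical table mapping combining sequences to private-use codes."""

    def __init__(self):
        self._to_code = {}
        self._to_seq = {}
        self._next = PRIVATE_USE_START

    def intern(self, sequence: str) -> str:
        """Return a single character standing for the whole sequence."""
        if len(sequence) == 1:
            return sequence
        if sequence not in self._to_code:
            if self._next > PRIVATE_USE_END:
                raise OverflowError("private use area exhausted")
            ch = chr(self._next)
            self._next += 1
            self._to_code[sequence] = ch
            self._to_seq[ch] = sequence
        return self._to_code[sequence]

    def expand(self, ch: str) -> str:
        """Map a private-use character back to its sequence, if it has one."""
        return self._to_seq.get(ch, ch)

    def category(self, ch: str) -> str:
        """Simplifying assumption: report the base character's category."""
        return unicodedata.category(self.expand(ch)[0])


table = SequenceTable()
sequence = "q\u0301"             # q + COMBINING ACUTE ACCENT, no precomposed form
single = table.intern(sequence)  # one code point in the vector representation
print(len(single), table.expand(single) == sequence, table.category(single))
```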
| Or is that crazy? It has the advantage of holding all the Level 3 values in a consistent way. (Since precombined characters do not exist for all possibilities, Normalization Form C results in some characters precombined and some not, right?)
Correct.
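The point about Normalization Form C can be checked with a short Python sketch using the standard `unicodedata` module (the two sample sequences are chosen for illustration). A sequence with a precomposed equivalent collapses to one code point under NFC, one without such an equivalent stays decomposed, and under NFD both are held as base character plus combining mark.

```python
import unicodedata

composable = "e\u0301"    # e + COMBINING ACUTE ACCENT; precomposed U+00E9 exists
uncomposable = "q\u0301"  # q + COMBINING ACUTE ACCENT; no precomposed form

nfc_lengths = [len(unicodedata.normalize("NFC", s))
               for s in (composable, uncomposable)]
nfd_lengths = [len(unicodedata.normalize("NFD", s))
               for s in ("\u00e9", uncomposable)]

print(nfc_lengths)  # mixed: one precombined, one not
print(nfd_lengths)  # uniform: both fully decomposed
```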
| And finally, should the Lisp/Scheme "character" data type refer to a single UCS code point, or should it refer to a base character together with all the combining characters that are attached to it?
Primarily the code point, but both, effectively, by using the private use space as outlined above.
* Michael Parker
| OTOH, if terminals had gotten color and typefaces earlier, maybe programming languages would have evolved to use them.
Only if we had also had a stateless coding for them, statefulness being so frightening to the kinds of programmers who are likely to invent new syntaxes.
| Maybe give each namespace its own color, so you would specify the value of a name by putting it in blue, the function by using red, keywords in italics, macros in green. The mind boggles at the possibilities.
Especially if they also used XML to write it all, and then we can use cascading style sheets to control both background and foreground color. And programmers would have to be selected from those who are not color blind. This is unlikely to succeed, since the current selection from those who can spell has not been successful, either, and that is at least something you can learn.
Thanks for the URL, though. My mind boggles at statements like these: "With the huge RAM of modern computers, an operating system is no longer necessary."
* Matthias Blume
| The two words are pronounced very differently.
But so are house and house, distinguished by a voiced and unvoiced s. Some languages also have tonemes, not just phonemes. Norwegian is among them. The phonemes of the Norwegian words for "farmers", "prayers" and "beans" are the same, but the tonemes differ. Immigrants often have farmers for dinner and purchase produce directly from beans as a result. The word for "farmers" is spelled "bønder" but "beans" and "prayers" are both spelled "bønner". Note that this is not a question of stress. All three stress the first syllable exactly the same, and do not stress the final syllable.
| Anyway, this whole debate is supremely silly, IMHO.
Then you are supremely silly who continue to post your drivel to it.
| Fortunately neither you nor Erik get to dictate the rules, at least not for those languages that I speak or program in...
Of course, you are a Scheme freak and a tourist in comp.lang.lisp, the very canonicalization of the irresponsible trouble-maker who thinks he is an outsider to the community he torments with "you are silly who do it differently from me" attitudes. Thank you for contributing to the _impression_ that Scheme is the language of choice of deranged lunatics.
Erik Naggum <e...@naggum.net> writes:
> Some languages also have tonemes, not just phonemes. Norwegian is among them. The phonemes of the Norwegian words for "farmers", "prayers" and "beans" are the same, but the tonemes differ. Immigrants often have farmers for dinner and purchase produce directly from beans as a result. The word for "farmers" is spelled "bønder" but "beans" and "prayers" are both spelled "bønner". Note that this is not a question of stress. All three stress the first syllable exactly the same, and do not stress the final syllable.
Huh? If they are different words, then *by the definition of a phoneme* the sound which distinguishes them is a phoneme. What is a "toneme"?
* Matthias Blume
| Sorry, I was unreasonably harsh on you, Kent.
You are a clever little asshole, aren't you?
| By the way, here is an example in a case-sensitive natural language where the distinction between uppercase and lowercase gets *pronounced*: "mit" vs. "MIT" in German. The first means "with" and is pronounced like "mitt", the second is the Massachusetts Institute of Technology and is pronounced like speakers of English would pronounce it: em-ay-tee.
Geez, dude, you are _so_ full of yourself. No wonder you think this is supremely silly -- your own contributions are ludicrous and stupid.
Whether the M, I, and T of the words that make up "MIT" are capitalized or not is incidental. That one chooses to uppercase initials of words is precisely what I am talking about. Sheesh, some people.
| I think that there are enough examples of this around so that making a distinction between uppercase and lowercase is warranted in the natural language case.
Hello? Of course there is a _distinction_ you incredibly retarded jerk! Have you been arguing for a _distinction_? Man, how can you survive being so goddamn _stupid_? Nobody has argued against a distinction, you insufferably arrogant moron. The point is how it should be REPRESENTED! (Incidental capitalization added purely for effect.) Is it even possible to be so unintelligent that this is not something you could have avoided by _thinking_ a little? Of course, you are in this "you guys are silly" mode, so thinking on your own is out of the question, but the whole point is that you are so unconscious and so unwilling to engage your brain to understand what somebody else argues that you effectively reduce the discussion to your pathetically ignorant level. Of _course_ there is a distinction! Geez, you are such an idiot. The question is: should that visible distinction have been coded to represent the incidental quality apart from the intrinsic quality, and the answer is so "advanced" that your puny little brain will in all likelihood not grasp its simplicity.
Let me give your severely reduced mental capacity a simple enough example that you might actually be inspired to think about the ramifications. The symbol for Ångstrøm in Unicode is exactly the same as the glyph for the letter A with ring above, because the guy's name was spelled with that letter, just like Celsius and Fahrenheit, but all these three letters should never be lowercased even though they are upper-case letters. This is an intrinsic quality. For this reason, Unicode has chosen to represent them as _symbols_, not letters. The same applies to Greek omega, pi, rho, and sigma, which are different symbols in each case. Can you wrap your exceptionally pitiful brain around these few and simple examples to perhaps grasp that incidental qualities and intrinsic qualities are important? Or are you so unphilosophical and such a leering idiot with a moronic grin permanently attached to his skull that being able to grasp what other people have thought about before you has become impossible for you?
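For readers who want to look at the code points involved, here is a small Python sketch using the standard `unicodedata` module. It is offered as a way to inspect the data rather than as a verdict on the argument, and the selection of characters is an assumption made for illustration. It shows that the unit symbols have code points of their own, separate from the ordinary letters, and what the character database records for each.

```python
import unicodedata

# Dedicated symbol code points next to the ordinary letter A WITH RING ABOVE.
samples = {
    "angstrom sign": "\u212b",
    "letter A with ring": "\u00c5",
    "degree Celsius": "\u2103",
    "degree Fahrenheit": "\u2109",
}

for label, ch in samples.items():
    # Print code point, official name, and general category for each.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# The Celsius and Fahrenheit symbols have no lowercase form at all.
celsius_unchanged = "\u2103".lower() == "\u2103"
fahrenheit_unchanged = "\u2109".lower() == "\u2109"

# The angstrom sign and the letter are distinct code points, although
# normalization treats the sign as equivalent to the letter.
distinct_code_points = "\u212b" != "\u00c5"
nfc_of_sign = unicodedata.normalize("NFC", "\u212b")
```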
No wonder you think those who think are _gods_ in their own mind: If you had been able to think at all, you would probably experience _several_ revelations of such magnitude that one "god" would not be enough.
| Again, I do not think that this needs to be in any way correlated with the PL case.
Is the stuff you are smoking legal? Go back to your Scheme community, where being supremely silly is not considered rude to your compatriots.
Erik Naggum <e...@naggum.net> writes:
> Some languages also have tonemes, not just phonemes. Norwegian is among them. The phonemes of the Norwegian words for "farmers", "prayers" and "beans" are the same, but the tonemes differ. Immigrants often have farmers for dinner and purchase produce directly from beans as a result. The word for "farmers" is spelled "bønder" but "beans" and "prayers" are both spelled "bønner". Note that this is not a question of stress. All three stress the first syllable exactly the same, and do not stress the final syllable.
So what? What does this have to do with anything? I have already pointed out examples (albeit not from Norwegian, which I don't know at all) for this phenomenon. Pronunciation and spelling are often at odds. Therefore, one cannot argue on the basis of phonetics which visual distinctions in the written language matter and which ones don't. As far as I am concerned, uppercase and lowercase are not the same. In German, this is simply a fact of how the written language is defined. Getting the capitalization wrong is a spelling error just like using the wrong vowel, missing an 'h' somewhere, using 'ss' where 'ß' should be used, joining words where they ought to be separated and vice versa, and so on and so forth. Of course, many of these distinctions are redundant to some degree. Case distinctions are not the only redundancies. Should we abolish all whitespace just because with some practice one can infer where word boundaries are? I haven't seen anyone suggesting this. (And again, there are precedents for such a thing, for example in some far eastern languages where words are not visibly separated in the written language.)
> Of course, you are a Scheme freak and a tourist in comp.lang.lisp, the very canonicalization of the irresponsible trouble-maker who thinks he is an outsider to the community he torments with "you are silly who do it differently from me" attitudes. Thank you for contributing to the _impression_ that Scheme is the language of choice of deranged lunatics.
Quite funny that you think I am a Scheme person... (Especially considering that Scheme, like CL, uses case-insensitive identifiers.)
> * Thomas Bushnell, BSG
> | The GNU/Linux world is rapidly converging on using UTF-8 to hold 31-bit Unicode values. Part of the reason it does this is so that existing byte streams of Latin-1 characters can (pretty much) be used without modification, and it allows "soft conversion" of existing code, which is quite easy and thus helps everybody switch.
> UTF-8 is in fact extremely hostile to applications that would otherwise have dealt with ISO 8859-1. The addition of a prefix byte has some very serious implications. UTF-8 is an inefficient and stupid format that should never have been proposed. However, it has computational elegance in that it is a stateless encoding. I maintain that encoding is stateful regardless of whether it is made explicit or not. I therefore strongly suggest that serious users of Unicode employ the compression scheme that has been described in Unicode Technical Report #6. I recommend reading this technical report.
> Incidentally, if I could design things all over again, I would most probably have used a pure 16-bit character set from the get-go. None of this annoying 7- or 8-bit stuff. Well, actually, I would have opted for more than 16-bit units -- it is way too small. I think I would have wanted the smallest storage unit of a computer to be 20 bits wide. That would have allowed addressing of 4G of today's bytes with only 20 bits. But I digress...
You should have a chat with Charles Moore, of Forth fame. He designed, using a CAD system he wrote in Forth, called OK, a 20 bit microprocessor that (surprise, surprise... NOT!) has an instruction set designed specifically for Forth.
Something that is unfortunate is that the 36 bit processors basically died off in favor of 32 bit ones. Which means we have great gobs of algorithms that assume 32 bit word sizes, with the only leap anyone can conceive of being to 64 bits, and meaning that if you need a tag bit or two for this or that, 32 bit operations wind up Sucking Bad.
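As a rough illustration of the tag-bit complaint, here is a small Python sketch. The two-bit tag layout and the function names are assumptions made for illustration, not a description of any particular Lisp implementation. It shows how much signed integer range is left in a 32-bit versus a 36-bit word once the low-order bits are reserved for a tag.

```python
TAG_BITS = 2       # assumed tag width, for illustration only
FIXNUM_TAG = 0b00  # assumed tag value marking a small integer


def fixnum_range(word_bits, tag_bits=TAG_BITS):
    """Return (min, max) of the signed integers that fit beside the tag."""
    payload_bits = word_bits - tag_bits
    return -(1 << (payload_bits - 1)), (1 << (payload_bits - 1)) - 1


def tag_fixnum(value, word_bits, tag_bits=TAG_BITS):
    """Pack a signed integer and its tag into one machine word."""
    low, high = fixnum_range(word_bits, tag_bits)
    if not low <= value <= high:
        raise OverflowError(f"{value} does not fit in a {word_bits}-bit tagged word")
    word_mask = (1 << word_bits) - 1
    return ((value << tag_bits) | FIXNUM_TAG) & word_mask


def untag_fixnum(word, word_bits, tag_bits=TAG_BITS):
    """Recover the signed integer from a tagged word."""
    payload_bits = word_bits - tag_bits
    payload = word >> tag_bits
    if payload >= 1 << (payload_bits - 1):  # sign bit of the payload is set
        payload -= 1 << payload_bits
    return payload


print(fixnum_range(32))  # only 30 bits of integer remain
print(fixnum_range(36))  # 34 bits remain, more than a full 32-bit integer
```

Under these assumptions a value such as 2**30 needs a boxed representation on the 32-bit machine but still fits in a single tagged word on the 36-bit one.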
> * Michael Parker
> | OTOH, if terminals had gotten color and typefaces earlier, maybe programming languages would have evolved to use them.
> Only if we had also had a stateless coding for them, statefulness being so frightening to the kinds of programmers who are likely to invent new syntaxes.
> | Maybe give each namespace its own color, so you would specify the value of a name by putting it in blue, the function by using red, keywords in italics, macros in green. The mind boggles at the possibilities.
> Especially if they also used XML to write it all, and then we can use cascading style sheets to control both background and foreground color. And programmers would have to be selected from those who are not color blind. This is unlikely to succeed, since the current selection from those who can spell has not been successful, either, and that is at least something you can learn.
> Thanks for the URL, though. My mind boggles at statements like these: "With the huge RAM of modern computers, an operating system is no longer necessary."
Yes, that seems rather a strange comment.
Note that one of Moore's more-publicized quasi-recent projects involved building a CAD system for designing microprocessors.
His approach was to basically write the application-cum-operating system based on a tiny kernel of Forth instructions which basically meant he started with 80486 assembler, and built on top of that.
Apparently it offered vast opportunities to avoid all kinds of cruft that tends to get built into CAD systems, but what it really amounted to was that he built his system as an embedded system on top of bare Intel metal.
I think a lot of his argument is that people keep building cruft on top of cruft, when they might be better off with a _good_ embedded system.
Consider the horrors of MS Office: We might be better off if, instead of continually being mandated by the latest bloatware upgrade to upgrade their system to the latest "Pentium IV with more memory than anyone could _conceive_ of ten years ago," people bought cheap electronic typewriters with bare bits of computing power.
If people spent their time _typing_, instead of trying to figure out which menu allows them to change some bit of formatting, they might get more work done. Consider that back in the old days, Unix used to run in 128K words of memory, and CP/M machines could handle word processing, spreadsheets, and databases in 56K of RAM. The notion that you need 256MB of RAM to realistically run Windows XP should be offensive.
In any case, Moore is a fascinating character. He is perhaps not always to be taken seriously, but he's had more inspired ideas than most people ever learn about...
--
(reverse (concatenate 'string "gro.mca@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/wp.html
"Cars move huge weights at high speeds by controlling violent explosions many times a second. ...car analogies are always fatal..." -- <westp...@my-dejanews.com>
* Thomas Bushnell, BSG
| Huh? If they are different words, then *by the definition of a phoneme* the sound which distinguishes them is a phoneme. What is a "toneme"?
Stress is generally not considered to be a difference in phoneme.
The sound is exactly the same, but whether you have entering, departing, rising, falling, high, low, up-down, down-up, or level tone can and does change the meaning of the word. Thai, for instance, has explicit tone markers. Chinese has different ideographs for words that are pronounced with the same phonemes and different tonemes.
Consider the phonemes of the word "really". The toneme is the difference in pronunciation between "Really?" and "Really." and "Really!".
French, for instance, has no stress, but tends to use marginally shorter and longer vowels. They also have no tonemes, so the French have very _serious_ problems dealing with other languages and sound ridiculous in almost every other language than their own.
* Matthias Blume
| So what? What does this have to do with anything?
Why are you still talking? This is "supremely silly" and you keep blabbering? What for?
| As far as I am concerned, uppercase and lowercase are not the same.
Nobody has said they are. Please just grasp this, OK? That some distinction is incidental does not mean that it is not there. I wonder what your limited brainpower has concluded that this discussion is all about when you are so devoid of understanding. Geez, you are _so_ stupid.
Erik Naggum <e...@naggum.net> writes:
> * Thomas Bushnell, BSG
> | Huh? If they are different words, then *by the definition of a phoneme* the sound which distinguishes them is a phoneme. What is a "toneme"?
> Stress is generally not considered to be a difference in phoneme.
Oh, ok. That's a good point; the term "phoneme" is ambiguous I think. Tonal differences are sometimes phonemic and sometimes not, but I now understand what you mean. Whether a tonal or length difference should be officially phonemic is a matter of style and not any real linguistics, as far as I can tell.
> Consider the phonemes of the word "really". The toneme is the difference in pronunciation between "Really?" and "Really." and "Really!".
Yeah, but there it's a matter of marking, which is different than tone. A better example in English is between homographs like "conduct" (a noun, stress on the first syllable) and "conduct" (a verb, stress on the second syllable).
Because stress is contextual, it's not normally counted as a phoneme. Tone and length are not contextual, so I think those are usually counted as phonemes. But (as I said above) I think this is a pretty gray area.
> French, for instance, has no stress, but tends to use marginally shorter and longer vowels. They also have no tonemes, so the French have very _serious_ problems dealing with other languages and sound ridiculous in almost every other language than their own.
Actually French does have stress as a word marker; the last syllable of each word gets a stress. (Obviously, stress is therefore not phonemic in French.)
> Something that is unfortunate is that the 36 bit processors basically died off in favor of 32 bit ones. Which means we have great gobs of algorithms that assume 32 bit word sizes, with the only leap anyone can conceive of being to 64 bits, and meaning that if you need a tag bit or two for this or that, 32 bit operations wind up Sucking Bad.
hello, personally I don't really know what the big difference is... I would have imagined that in any case a slightly larger word size would have been useful, but it is not... sometimes for some of my code I use 48 bit ints (when 32 bits is too small and 64 is overkill). I would think that with 36 bits the next size up would be 72, and 36 is not evenly divisible by 8 so you would need a different byte size as well (ie: 9 or 12). sorry, I don't really know of byte sizes other than 8... am I missing something?
(little has changed in my life since before, except that I am working on an os now... again...).
* cr88192 <cr88...@hotmail.com>
| sorry, I don't really know of byte sizes other than 8... am I missing something?
Yes. A "byte" is only a contiguous sequence of bits in a machine word, and has been used that way by most vendors, for us notably DEC, which contributed the machine instructions we know as LDB and DPB and the notion of a byte specifier, which has bit position in word and length in bits. Failure to support LDB and DPB in hardware is very costly for a large number of useful operations, but in a byte-addressable world with 8-bit bytes, using anything smaller than bytes that might cross byte boundaries has serious penalties. In a word-addressable world, this saves a lot of memory, even relative to the byte-addressable machines. C has bit fields because it was intended to run on the Honeywell 6000, which had 36-bit words, so its "char" was 9 bits wide. (See page 34 of Kernighan & Ritchie, 1st ed.)
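The idea of a byte specifier can be sketched in a few lines. The following is an illustrative Python model only, with argument order borrowed from the Common Lisp functions of the same names; the 9-bit packing is an assumption chosen to match the 36-bit word mentioned above. A byte specifier is a (size, position) pair, LDB extracts that field from a word, and DPB deposits a new value into it.

```python
def byte_spec(size, position):
    """A byte specifier: field width in bits and offset of its low bit."""
    return (size, position)


def ldb(spec, word):
    """Load byte: extract the field described by spec from word."""
    size, position = spec
    return (word >> position) & ((1 << size) - 1)


def dpb(new_byte, spec, word):
    """Deposit byte: return word with the field described by spec replaced."""
    size, position = spec
    mask = ((1 << size) - 1) << position
    return (word & ~mask) | ((new_byte << position) & mask)


# Pack four 9-bit characters into one 36-bit word, leftmost character first.
word = 0
for i, code in enumerate(b"LISP"):
    word = dpb(code, byte_spec(9, 27 - 9 * i), word)

unpacked = bytes(ldb(byte_spec(9, 27 - 9 * i), word) for i in range(4))
print(oct(word), unpacked)

# A field need not align with 8-bit boundaries: bits 5..10 of the same word.
odd_field = ldb(byte_spec(6, 5), word)
```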
IBM chose a more specific terminology: 4-bit nybbles (the same spelling deviation as "byte" from "bite"), 8-bit bytes, 16-bit half-words, 32-bit words, and 64-bit double-words. On the PDP-10, we had 36-bit words, 18-bit half-words (and halfword instructions), but bytes were all over the place. I know several people who think this is a much better design than the stupid 8-bit design we have today. Sadly, only several, not millions and millions who think Intel's designs are better just because they can buy them.
* Thomas Bushnell, BSG
| Oh, ok. That's a good point; the term "phoneme" is ambiguous I think. Tonal differences are sometimes phonemic and sometimes not, but I now understand what you mean. Whether a tonal or length difference should be officially phonemic is a matter of style and not any real linguistics, as far as I can tell.
*sigh* My native language has tonemes. Yours does not. Trust me on this, OK? Go look it up if you doubt me.
Tone is the musical pitch with which you pronounce a phoneme, or more precisely, the relative direction of the change in pitch throughout the word.
> Consider the phonemes of the word "really". The toneme is the difference > in pronunciation between "Really?" and "Really." and "Really!".
| Yeah, but there it's a matter of marking, which is different than tone.
*sigh* No, this is a tone difference. The rising tone at the end of a question is precisely this -- tone. One does not usually talk about tonemes when dealing with the changing meaning of a sentence, but it is the same idea.
| A better example in English is between homographs like "conduct" (a noun, | stress on the first syllable) and "conduct" (a verb, stress on the second | syllable).
No, that would be stress, not tone. I was trying to give you an example of what tone is, not how the same sequence of phonemes can have different meaning in differing ways.
| Because stress is contextual, it's not normally counted as a phoneme. | Tone and length are not contextual, so I think those are usually counted | as phonemes. But (as I said above) I think this is a pretty gray area.
No, it is not a grey area. It just does not apply to English. Study Norwegian or Thai.
Erik Naggum wrote in article <3226112482576...@naggum.net>:
> * Thomas Bushnell, BSG >| Tonal differences are sometimes phonemic and sometimes not
> *sigh* My native language has tonemes. Yours does not. Trust me on > this, OK? Go look it up if you doubt me.
Some data points on "toneme" from the web:
The American Heritage® Dictionary: A type of phoneme
The Concise Oxford Dictionary of Linguistics: A unit of pitch, especially in tone languages, treated as or analogously to a phoneme.
http://www.factmonster.com: a phoneme consisting of a contrastive feature of tone in a tone language
Erik Naggum <e...@naggum.net> writes: > * Thomas Bushnell, BSG > | Oh, ok. That's a good point; the term "phoneme" is ambiguous I think. > | Tonal differences are sometimes phonemic and sometimes not, but I now > | understand what you mean. Whether a tonal or length difference should be > | officially phonemic is a matter of style and not any real linguistics, as > | far as I can tell.
> *sigh* My native language has tonemes. Yours does not. Trust me on > this, OK? Go look it up if you doubt me.
I'm trusting you about the way Norwegian works, and I'm trying to understand it in the terminology used in English to speak about linguistics.
I do understand perfectly well what tone is.
> | Because stress is contextual, it's not normally counted as a phoneme. > | Tone and length are not contextual, so I think those are usually counted > | as phonemes. But (as I said above) I think this is a pretty gray area.
> No, it is not a grey area. It just does not apply to English. Study > Norwegian or Thai.
I know perfectly well what tone is.
The question is whether tonal difference is a phonemic difference.
Since a phoneme is a minimal unit distinguishing two words, if there are two words that differ only in tone, the difference must therefore be phonemic.
I mentioned stress (in English, with the "conduct" example), because stress is also sometimes thought not to distinguish phonemes, but really it does.
What is a gray area is how rigid one wants to be about the definition of "phoneme".
> French, for instance, has no stress, but tends to use marginally shorter > and longer vowels. They also have no tonemes, so the French have very > _serious_ problems dealing with other languages and sound ridiculous in > almost every other language than their own.
What makes you think they don't sound equally ridiculous in French? ;-)
In high school, I never did understand what the English teacher was going on about, with his "iambic pentameter" stuff. If you come from a monotonic language, the whole thing doesn't make a lot of sense. Oh well, _our_ rhymes are a lot more exact.
*Years* later, having married an anglophone and lived in English society for a few years, it was finally explained to me that English has this "stress" thing... my accent improved markedly after that.
-- It would be difficult to construe this as a feature. (Larry Wall, in article <1995May29.062427.3...@netlabs.com>)
* Thomas Bushnell, BSG | Since a phoneme is a minimal unit distinguishing two words, if there are | two words that differ only in tone, the difference must therefore be | phonemic.
Apparently, this is how some people see it -- I have not seen a difference in tone referred to as "phonemic". However, phonemes are supposed to be discrete elements of speech. A toneme is not -- the change in tone usually spans several phonemes. Therefore, it is either a phoneme of its own, which seems odd, or an additional speech element. If a "phoneme" is the _only_ smallest unit of sound, it appears no longer to be possible to enumerate the phonemes of a language.
| I mentioned stress (in English, with the "conduct" example), because | stress is also sometimes thought not to distinguish phonemes, but | really it does.
So when something, anything distinguishes phonemes, they become two? That does not appear to be useful. It seems rather to multiply them without bounds.
| What is a gray area is how rigid one wants to be about the | definition of "phoneme".
Seems if you can put whatever you want into it, it is rendered useless.