From: Tim Bradshaw
Newsgroups: comp.lang.lisp
Subject: Re: characters in CL
Date: Fri, 18 Mar 2011 09:55:07 +0000

On 2011-03-18 08:19:32 +0000, Mark Tarver said:

> You can think of them as having strings composed of substrings, with a
> fixed composition mechanism.

I think that this kind of thing is basically a disaster for a language which wants to have a coherent type system. You really, I think, have two options: strings are a special magic type orthogonal to any other type (in particular they are not arrays or sequences); or strings are some kind of sequence/array type. For a language which is not entirely about string bashing the latter option is the obvious one.

Then you have to ask the question: what type are strings sequences of? The answer can't be "strings" because then the type system falls to bits in some horrible way: you need another type (or the option of another type: strings could be sequences whose elements are (strings OR ).) That other type, in CL, is characters.

The other option, having strings *not* be sequences but some special magic type, is possible, and I suspect it's basically what Perl does, for instance (in so far as Perl has a coherent type system at all). I would not want CL to do this.
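
To make the element-type question concrete, here is a rough sketch in Python. Python is not mentioned anywhere in this thread; it is used only because its built-in `str` happens to be a sequence whose elements are again `str`. The `Char` class and the `to_chars` helper are hypothetical names invented for the illustration, standing in loosely for a distinct character type of the kind CL has.

```python
from dataclasses import dataclass
from typing import Tuple

# Design 1: Python's built-in str. Indexing a string yields another string,
# so there is no separate element type underneath it.
s = "abc"
first = s[0]
first_is_str = isinstance(first, str)   # the element is itself a string
same_again = first[0] == first          # indexing it again gives the same thing


# Design 2 (hypothetical): a distinct element type, roughly in the spirit
# of CL's characters. Char is an assumption made up for this sketch.
@dataclass(frozen=True)
class Char:
    code: int  # code point of the character


def to_chars(text: str) -> Tuple[Char, ...]:
    """Turn a Python string into a sequence of Char elements."""
    return tuple(Char(ord(c)) for c in text)


chars = to_chars(s)
# A Char is not itself a sequence, so the "sequence of what?" question
# has a definite answer here.
element_is_sequence = hasattr(chars[0], "__getitem__")

print(first_is_str, same_again, element_is_sequence)
```

In the first design the element type never bottoms out; in the second, the sequence type and its element type are clearly separate.
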

I don't know how this maps on to Unicode: I do know that Unicode has lots of cases where complicated things happen, and I also know that I don't understand it. Unfortunately the only person I knew who I was sure really *did* understand it is dead, so we can't get his opinion.

I don't think that the sharp-s / ss thing in German has much to do with this, because there are lots of complicated rules around that (which I can no longer remember). It may be that assuming things like string-upcase / char-upcase &c are simple is just a huge mistake.

--tim
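
As a footnote to the point about case conversion not being simple, here is a small sketch of one way it goes wrong. It uses Python's `str.upper`, which applies Unicode case mappings; this is an illustration only and says nothing about what any CL implementation's string-upcase or char-upcase does. The sample word is an arbitrary choice.

```python
# A word containing the German sharp s.
word = "straße"

# Upcasing the whole string: the sharp s maps to two letters.
upper = word.upper()

# Upcasing one character at a time: one of the results is not a
# single character any more.
per_char = [c.upper() for c in word]
lengths = [len(u) for u in per_char]

# Going back down does not recover the original spelling.
round_trip = upper.lower()

print(upper, lengths, round_trip)
```

The upcased string is longer than the original, so a character-by-character mapping that returns exactly one character per input cannot produce it, and the case conversion does not round-trip.
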