> Lars Marius Garshol <lar...@ifi.uio.no> writes:
>
> > * Lars Marius Garshol
> > |
> > | I've stuffed the chars into a vector (using vector-push-extend) and
> > | then used coerce to make a string of it.
> >
> > * Kent M. Pitman
> > |
> > | Coerce can make a string from a list. You don't have to first convert
> > | it to a vector. (coerce '(#\a #\b #\c) 'string)
> >
> > I assumed the vector approach to be more effective (and made sure to
> > allocate a reasonable initial-size vector to begin with). That is a
> > reasonable assumption, no?
> >
>
> No, it's a bad assumption. Your approach allocates a vector (why not
> a string, in the first place?) and then makes a string with the same
> contents (remember that COERCE does not modify its argument, it makes
> a new object `corresponding to' its argument but with the right type).
> Using COERCE directly will traverse the list (I guess) twice, but only
> allocate a single string of the right length.
Well, making a string using the vector-push-extend approach is not all that
difficult. In LW4.1,
EVENT-SEARCH 20 > (setq x (make-array 20 :element-type 'base-char
:adjustable t :fill-pointer 0))
""
EVENT-SEARCH 21 > (vector-push-extend #\a x)
0
EVENT-SEARCH 22 > (vector-push-extend #\b x)
1
EVENT-SEARCH 23 > (vector-push-extend #\c x)
2
EVENT-SEARCH 24 > x
"abc"
EVENT-SEARCH 25 > (stringp x)
T
The only difference when using another version of lisp is the type of
character you use to make up the elements of the array. I believe in MCL,
for instance, it's base-character.
Sunil
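A minimal sketch of the difference discussed above, assuming only standard Common Lisp. A general vector (element type T) has to be copied by COERCE to become a string, while a vector created with a character element type already is one. The variable names are illustrative and not from the thread.

```lisp
;; General vector: the element type defaults to T, so this is not a string.
(defparameter *general*
  (make-array 3 :fill-pointer 0 :adjustable t))

;; Character vector: already a string, so no COERCE is needed afterwards.
(defparameter *chars*
  (make-array 3 :element-type 'character :fill-pointer 0 :adjustable t))

(dolist (c '(#\a #\b #\c))
  (vector-push-extend c *general*)
  (vector-push-extend c *chars*))

;; COERCE leaves its argument alone and returns a new object.
(defparameter *copy* (coerce *general* 'string))

(list (stringp *general*) (stringp *copy*) (eq *general* *copy*)) ; (NIL T NIL)
(list (stringp *chars*) (string= *chars* *copy*))                 ; (T T)
```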
* Tim Bradshaw
|
| No, it's a bad assumption. Your approach allocates a vector (why
| not a string, in the first place?) and then makes a string with the
| same contents (remember that COERCE does not modify its argument, it
| makes a new object `corresponding to' its argument but with the
| right type).
You seem to be correct about this. I didn't know this about COERCE,
and I also wasn't aware of the possibility of directly modifying
strings like this. (The explanation of this perhaps surprising
ignorance is that I'm a Lisp newbie. :)
(format t "Trying vector approach.~%")
(time
 (dotimes (ix 10000 'ok)
   (let ((str (make-array 30 :fill-pointer 0 :adjustable t)))
     (vector-push-extend #\a str)
     (vector-push-extend #\a str)
     (vector-push-extend #\a str)
     (vector-push-extend #\a str)
     (vector-push-extend #\a str)
     (vector-push-extend #\a str)
     (vector-push-extend #\a str)
     (vector-push-extend #\a str)
     (vector-push-extend #\a str)
     (vector-push-extend #\a str)
     (setq str (coerce str 'string)))))

(format t "~%~%Trying string approach.~%")
(time
 (dotimes (ix 10000 'ok)
   (let ((str (make-string 30)))
     (dotimes (fp 10)
       (setf (char str fp) #\a)))))
Running this gives me:
[larsga@pc-larsga meta]$ clisp timer.lsp
Trying vector approach.
Real time: 1.119389 sec.
Run time: 0.98 sec.
Space: 1800572 Bytes
GC: 3, GC time: 0.04 sec.
Trying string approach.
Real time: 5.550576 sec.
Run time: 5.55 sec.
Space: 17562752 Bytes
GC: 33, GC time: 0.55 sec.
[larsga@pc-larsga meta]$ clisp -q -c timer.lsp
Compiling file /home/tosca/c1/larsga/privat/prog/clisp/meta/timer.lsp ...
Compilation of file /home/tosca/c1/larsga/privat/prog/clisp/meta/timer.lsp is finished.
0 errors, 0 warnings
[larsga@pc-larsga meta]$ clisp timer
Trying vector approach.
Real time: 0.470378 sec.
Run time: 0.47 sec.
Space: 1800072 Bytes
GC: 3, GC time: 0.04 sec.
Trying string approach.
Real time: 0.179193 sec.
Run time: 0.18 sec.
Space: 400072 Bytes
GC: 1, GC time: 0.01 sec.
[larsga@pc-larsga meta]$ lisp timer.lsp
;;; *** Don't forget to edit /var/lib/cmucl/site-init.lisp! ***
CMU Common Lisp 18a+ release x86-linux 2.4.7 6 November 1998 cvs, running on pc-larsga
Send bug reports and questions to your local CMU CL maintainer,
or to pvan...@debian.org
or to cmucl...@cons.org. (prefered)
type (help) for help, (quit) to exit, and (demo) to see the demos
Loaded subsystems:
Python 1.0, target Intel x86
CLOS based on PCL version: September 16 92 PCL (f)
* (load "timer")
; Loading #p"/home/tosca/c1/larsga/privat/prog/clisp/meta/timer.lsp".
Trying vector approach.
Compiling LAMBDA NIL:
Compiling Top-Level Form:
[GC threshold exceeded with 2,003,432 bytes in use. Commencing GC.]
[GC completed with 106,136 bytes retained and 1,897,296 bytes freed.]
[GC will next occur when at least 2,106,136 bytes are in use.]
Evaluation took:
1.9 seconds of real time
1.15 seconds of user run time
0.04 seconds of system run time
[Run times include 0.09 seconds GC run time]
837 page faults and
2232128 bytes consed.
Trying string approach.
Compiling LAMBDA NIL:
Compiling Top-Level Form:
Evaluation took:
0.02 seconds of real time
0.02 seconds of user run time
0.0 seconds of system run time
0 page faults and
399304 bytes consed.
*
| Using COERCE directly will traverse the list (I guess) twice, but
| only allocate a single string of the right length.
As I've pointed out before this is all done under the assumption that
the initial character source is not a list.
--Lars M.
Is there a standardized way to do this? It really would be nice to
have extensible strings for this, since in my case I'm doing this in
an OMG IDL parser (which preferably shouldn't barf on long names).
--Lars M.
> You now assume the characters initially came from a list. Mine come
> from a character stream and the original poster wrote "assemble a
> string from various chars...or from maybe a list of chars".
>
> In the case you assume I'd use coerce.
Right -- I assumed they came from a list. At the point I joined
the thread KMP had said "COERCE can make a string from a list"
and you'd said "I thought going via vectors was more efficient";
it was clear that KMP thought you were taking your characters
from a list, and I sort of unconsciously assumed you were since
you didn't object. :-)
--
Gareth McCaughan Dept. of Pure Mathematics & Mathematical Statistics,
gj...@dpmms.cam.ac.uk Cambridge University, England.
This *is* the standard way. You can get the element type of a string by
typing:
(array-element-type "a")
in any common lisp. Again, in LW4.1,
GRADER 90 > (array-element-type "a")
BASE-CHAR
Sunil
> Is there a standardized way to do this? It really would be nice to
> have extensible strings for this, since in my case I'm doing this in
> an OMG IDL parser (which preferably shouldn't barf on long names).
Yes, this should be easy. You can use your approach of an adjustable
array with a fill pointer, but make the element-type be whatever is
right (I forget what it is now for strings, but the hyperspec will say).
--tim
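A rough sketch of that suggestion for the parser case: characters are taken from a stream and pushed onto an adjustable string with a fill pointer, so long names simply grow the buffer. READ-NAME, its initial size of 16, and its notion of a name character (alphanumeric or underscore) are assumptions made for illustration, not part of the thread or of any IDL grammar.

```lisp
(defun read-name (stream)
  "Collect consecutive alphanumeric or underscore characters from STREAM
into a freshly allocated, extensible string.  READ-NAME is a hypothetical
helper for illustration."
  (let ((name (make-array 16 :element-type 'character
                             :fill-pointer 0 :adjustable t)))
    ;; PEEK-CHAR looks at the next character without consuming it, so the
    ;; character that ends the name stays on the stream.
    (loop for c = (peek-char nil stream nil nil)
          while (and c (or (alphanumericp c) (char= c #\_)))
          do (vector-push-extend (read-char stream) name))
    name))

;; The name below is longer than the initial 16 elements, so the buffer grows.
(with-input-from-string (in "very_long_interface_name_0123456789 { };")
  (read-name in))
```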
Groetjes, Peter
--
It's logic Jim, but not as we know it. | pvan...@debian.org for pleasure,
"God, root, what is difference?",Pitr | pvan...@inthan.be for more pleasure!
* Peter Van Eynde
|
| Not wanting to pick on you, but why haven't you done this?
Simply because this is an installation I did on my computer at work
just to check out CMU-CL. I did that a couple of days ago and so far
haven't run CMU-CL more than 3-4 times, so I haven't bothered yet.
| Should I automatically fill in the "right" values at installation
| time?
Sorry, I don't understand this question.
--Lars M.
You're supposed to fill in a bit of trivial data like the site-name.
I was wondering if people would prefer the installation routine to
ask for this when installing, rather than bugging people to edit
this file.
This person wouldn't. I convert your packages to rpms with "alien";
rpm installation is traditionally a non-interactive thing.
You could make cmuclconfig sort it out, though. That's interactive
anyway.
-dan
> On 18 Mar 1999 13:19:36 +0100, Lars Marius Garshol wrote:
> >| Should I automatically fill in the "right" values at installation
> >| time?
> >
> >Sorry, I don't understand this question.
>
> You're supposed to fill in a bit of trivial data like the site-name.
> I was wondering if people would prefer the installation routine to
> ask for this when installing, rather than bugging people to edit
> this file.
IMHO, asking people to edit files at installation time is not a good thing.
Cheers
--
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa
(defun change-first-letter (word letter)
  (concatenate 'string letter (subseq word 1 (length word))))
(change-first-letter "mood" "f")
"food"
Peat wrote:
>
> another thing...can anyone help me with:
>
> say i had a word "mood"...how can i replace the first char on mood
> with an f making it "food"?? please help again....
>
> Thanks!!!
>
> -Pete
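An alternative sketch, assuming the replacement is given as a character rather than a one-letter string: copy the word and overwrite its first element with (SETF CHAR). REPLACE-FIRST-CHAR is a hypothetical name. The copy matters because modifying a literal string such as "mood" in place is undefined.

```lisp
(defun replace-first-char (word char)
  "Return a copy of WORD whose first character is CHAR.
Hypothetical helper; WORD itself is left unmodified."
  (let ((copy (copy-seq word)))
    ;; Guard against the empty string, which has no first character.
    (when (plusp (length copy))
      (setf (char copy 0) char))
    copy))

(replace-first-char "mood" #\f) ; => "food"
```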
> pvan...@mail.inthan.be (Peter Van Eynde) writes:
>
> > You're supposed to fill in a bit of trivial data like the site-name.
> > I was wondering if people would prefer the installation routine to
> > ask for this when installing, rather than bugging people to edit
> > this file.
>
> IMHO, asking people to edit files at installation time is not a good thing.
I agree with Marco.
Absolutely under no circumstances should one edit a file upon
installation. The list of ways this can go wrong is large and
varied. It's not to say I've never done it. But neither I nor anyone
should ever feel they have followed normal, healthy engineering practice
when they do. It's an ugly kludge that one should feel embarrassed
about and should strive to eliminate.
* It is equivalent to editing source code for software reuse.
  It is the antithesis of modular programming. It is the worst
  of what C++ has to offer in the way of so-called oo design.
* This completely thwarts the ability to use md5 and
  other checksum tools to verify the integrity of an installed piece of
  foreign software.
* It risks that people who don't competently edit it
  will create problems they are not aware of and nightmares for people
  having to support them. It makes it impossible to say "I have foo 1.4"
  because you don't--you have 1.4 as modified by me. That's a new
  version. It needs a new name. And everyone will have different names.
* It risks the insertion of bogus or confidential information that will
  be retransmitted to other sites if someone retransmits the software
  installation to someone else instead of grabbing it fresh from the source.
* It isn't perspicuous. It's easy to overlook an edit you have to make.
* It forces people to be programmers continuing the worst trend of
  computer science which is to make everyone have to be computer-savvy
  just to do ordinary tasks rather than making computers be people-savvy.
All an installation guide should ever say is "Press install."
That's not to say that they shouldn't ask you questions, log what they
did for inspection by those who want it, make the installation undoable,
etc. But anything more than those two words is a bug in the doc and
the system design.
I'm in a very grumpy mood today so unlike usual where I say there is
two sides to everything and everything is just a trade-off, I'm going to
just assert I'm right on this one.
Part of my bad mood may be related to having wasted two weeks learning
about how to compile the linux kernel, poking around in network card
device drivers, running all manner of configuration and diagnostic
tools, etc. just trying to get linux to boot with two network cards.
There isn't just a tool that does everything. One always just has to
edit a single line of a file. It's just that there are 50,000 linux
programmers and 50,000 single lines one might have to edit. There is
simply no way that is the right model of anything.
I'd much prefer to be asked.
--Lars M.
Agreed, the new version will ask for these details.
> [... arguments against editing files ...]
I had written a long explanation of why debian makes certain
choices, but this is distinctly off-charter. Let me just say that
I/we try to solve this problem the best way we can, and that at
least I know of no better system to date (I would be happy to be
corrected) and that nearly all of your comments are either already
solved or solutions are being discussed[1].
Groetjes, Peter
[1] It's a difficult road to walk _between_ point-and-click dumbing
down and sendmail (old style, no m4) configuration...
from the description of the system class STRING in ANSI X3.226:
A string is a specialized vector whose elements are of type CHARACTER or a
subtype of type CHARACTER. When used as a type specifier for object
creation, STRING means (VECTOR CHARACTER).
so CHARACTER is already standard. there is no need to use BASE-CHAR, and
no need to worry about portability problems.
in my view, however, the actual task at hand is to extract a subsequence
of a stream's input buffer. I do this with a mark in the stream and
avoid copying until absolutely necessary. this, however, requires access
to and meddling with stream internals.
a less "internal" solution is to use WITH-OUTPUT-TO-STRING and simply
write characters to it until the terminating condition is met, to wit:
(with-output-to-string (name)
  (stream-copy <input-stream> name <condition>))
=> <string>
the function STREAM-COPY could be defined like this:
(defun stream-copy (input output &key (count -1) end-test filter transform)
  "Copy characters from INPUT to OUTPUT.
COUNT is the maximum number of characters to copy.
END-TEST if specified, causes termination when true for a character.
FILTER if specified, causes only characters for which it is true to be copied.
TRANSFORM if specified, causes its value to be copied instead of character."
  (loop
    (when (zerop count)
      (return))
    (let ((character (read-char input nil :eof)))
      (when (eq :eof character)
        (return))
      (when (and end-test (funcall end-test character))
        (unread-char character input)
        (return))
      (when (or (null filter) (funcall filter character))
        (write-char (if transform
                        (funcall transform character)
                        character)
                    output)))
    (decf count)))
#:Erik
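For readers who only want the WITH-OUTPUT-TO-STRING idea without the keyword options, a cut-down sketch follows. READ-UNTIL is a hypothetical name and is not the STREAM-COPY defined above. It keeps just the end-test behaviour, including putting the terminating character back on the stream.

```lisp
(defun read-until (input end-test)
  "Return the characters read from INPUT up to, but not including, the
first character satisfying END-TEST.  READ-UNTIL is a hypothetical name."
  (with-output-to-string (out)
    (loop for c = (read-char input nil nil)
          while c
          do (when (funcall end-test c)
               ;; Leave the terminating character for the next reader.
               (unread-char c input)
               (return))
             (write-char c out))))

(with-input-from-string (in "interface Foo;")
  (read-until in (lambda (c) (char= c #\Space)))) ; => "interface"
```

The string stream does the buffer growth itself, so no element type or fill pointer has to be chosen by hand.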
try it like this, instead:
(make-array 30 :fill-pointer 0 :adjustable t :element-type 'character)
BTW, if you have to time things, you might as well do it with proper
declarations and optimizations. otherwise, you don't know what you're
timing.
#:Erik
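One way the vector half of the earlier benchmark might look with both suggestions applied is sketched below. The function name BUILD-STRING and the particular OPTIMIZE settings are assumptions chosen for illustration. No timings are claimed here, since they vary by implementation.

```lisp
(defun build-string (n)
  "Push N copies of #\\a onto an extensible string and return it."
  (declare (type fixnum n)
           (optimize (speed 3) (safety 1)))
  ;; The element type makes this a string from the start, so the final
  ;; COERCE of the original benchmark is no longer needed.
  (let ((str (make-array 30 :element-type 'character
                            :fill-pointer 0 :adjustable t)))
    (dotimes (i n str)
      (vector-push-extend #\a str))))

;; Compile explicitly so that interpreted evaluation is not what gets timed.
(compile 'build-string)

(time (dotimes (ix 10000 'ok)
        (build-string 10)))
```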
> * Lars Marius Garshol <lar...@ifi.uio.no>
> | Is there a standardized way to do this? It really would be nice to have
> | extensible strings for this, since in my case I'm doing this in an OMG
> | IDL parser (which preferably shouldn't barf on long names).
>
> from the description of the system class STRING in ANSI X3.226:
>
> A string is a specialized vector whose elements are of type CHARACTER or a
> subtype of type CHARACTER. When used as a type specifier for object
> creation, STRING means (VECTOR CHARACTER).
>
> so CHARACTER is already standard. there is no need to use BASE-CHAR, and
> no need to worry about portability problems.
The first of the two uses of the word "need" here is odd. Strictly,
there is no "need" to use Lisp, nor even to use computers. There is a
"need" for food, clothing, and shelter. But if we extend "need" to
sometimes mean "want" (which I assume you mean here), then there is
sometimes a "need" to use BASE-CHAR because in some implementations
CHARACTER may be inefficient (e.g., it might reserve a much more
heavy-duty space capable of holding multi-byte characters), and it may
sometimes be "necessary" to avoid this. The problem Lars might be
perceiving, and I think it's a legit concern in certain limited
contexts, is that you can't know when one "needs" to use BASE-CHAR to
avoid overallocating space.
I think Erik is saying that one shouldn't pre-optimize something without
first knowing the general case will be a problem. And what Lars is
saying is that he perceives it will be a problem. This is something of
a clash of absolutes and both have some merit. Mostly I think one should
probably define Erik's approach to be the most conservative, even if not
the one most people do. I, too, prefer to write general code first and
get the shape and functionality right, and then to tune as needed where
a problem is discovered. Since there is no a priori functional problem
with CHARACTER, one should just use it until a problem is discovered.
And then one might find one "needs" BASE-CHAR. But not otherwise.
Pre-optimizing the type specifier before knowing that CHARACTER leads
to problems is mostly a bad idea. It needlessly increases program
complexity at a time when you're just exploring what you want from
your program. You may later throw away that line of code, and there's
no sense in having optimized it. Or you may later find the code
doesn't get enough play and doesn't need a declaration.
Life is short, and one isn't meant to spend it writing gratuitous
declarations that don't actually do anything other than make code
harder to write. At least, that's my own personal religious belief.
(Apologies in advance to those in this multicultural forum if I've
stepped on the toes of anyone whose religion teaches them that this IS
a good way to spend one's life.)
Over-optimizing also encourages you to build fragile interfaces. For
example, I ran into a bug in some Lisp implementation where the vendor
had decided to use simple strings for symbol names. Maybe an ok
assumption, but they didn't get it from the book, and they didn't fix
INTERN and friends to coerce symbol names to be simple, so when you
made a symbol with an adjustable string as a name, it let you do this,
but this made a mess. (Note that the spec says "a string" not "a
simple string" as the argument to INTERN. It doesn't forbid the
implementation from copying the string to be simple or another
element-type more suited to the characters it contains or whatever.
But it says this fact shouldn't be revealed to users.) The
particulars of the bug are not as relevant as the point about your
responsibility when you narrow a type: your interface points must
minimally check the incoming type and preferably should coerce
reasonable alternate types so that people don't do what I had to do
while learning some Java a few weeks ago to cast an "integer
represented as an object" back to an integer by something nutty like
[if memory serves me right--my Java memory is very flaky]:
((Integer)someObject).intValue() having to do not one but two
coercions just to say "this is in fact the integer it looks like". I
don't want to get into a long discussion about Java or my inability to
navigate it smoothly or the fact that this clumsiness doesn't
overwhelm me with a desire to give up Lisp for it. My point here is
simply this: type restriction has its place but it also has its cost.
And you should try to avoid paying the cost because that cost can
include infection of unrelated modules with needless paranoia. (For
varying values of "need" again.) In the INTERN case above, the spec
said it wasn't supposed to work the way the implementation did it, so
it was easy for me to complain, but when you design your own
interfaces your users won't have that luxury--they have to do what you
implement. So make sure you're being sensitive to what's rational for
them to use.
- - - -
If one does feel compelled to pre-optimize this, what I recommend
doing is something akin to:
;;; !!! KLUDGE: Hide BASE-CHAR (ANSI) vs BASE-CHARACTER (CLTL2) distinction
(defconstant +base-char-type+
  '#.(array-element-type "foo"))
(make-array <dimensions> :element-type +base-char-type+ ...options...)
This is a kludge and isn't quite 100% right because theoretically the
system could allocate a more restricted string representation for the
constant string "foo" of known character composition than it would
for base-char, but in practice I haven't observed implementations to
do this and so the kludge is pretty portable.
Sometimes MAKE-STRING saves you from having to do this, but MAKE-STRING
doesn't take all the arguments MAKE-ARRAY does, so in practice I've had to
do this in some cases.
(Sometimes you'll run into cases where the +base-char-type+ needs to be
used in a not-for-evaluation situation and it may be helpful to use
#.+base-char-type+ in that case. If you do this, an EVAL-WHEN around the
DEFCONSTANT to make sure the variable is ready in the read-time environment
may be needed. I left it out of the above just to keep things simple.)
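A sketch of the EVAL-WHEN variant mentioned above, assuming the file is processed one top-level form at a time (as LOAD and COMPILE-FILE do), so that the constant exists before the later #. is read. MAKE-BUFFER and BUFFER-P are hypothetical names, and here the element type is computed when the DEFCONSTANT is evaluated rather than at read time.

```lisp
;; Make the constant available at compile time as well as load time, so
;; that #.+base-char-type+ can be used in forms read later in the file.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defconstant +base-char-type+ (array-element-type "foo")))

;; Evaluated position: the constant is used as an ordinary value.
(defun make-buffer (&optional (size 16))
  "Return an empty, extensible string.  MAKE-BUFFER is a hypothetical name."
  (make-array size :element-type +base-char-type+
                   :fill-pointer 0 :adjustable t))

;; Not-for-evaluation position: inside a quoted type specifier the value
;; has to be spliced in at read time with #. instead.
(defun buffer-p (object)
  "True if OBJECT is a vector with the element type chosen above."
  (typep object '(vector #.+base-char-type+)))
```

Which symbol ends up in the constant depends on the implementation, so nothing here should be read as a claim about any particular Lisp.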
> from the description of the system class STRING in ANSI X3.226:
>
> A string is a specialized vector whose elements are of type CHARACTER or a
> subtype of type CHARACTER. When used as a type specifier for object
> creation, STRING means (VECTOR CHARACTER).
>
> so CHARACTER is already standard. there is no need to use BASE-CHAR, and
> no need to worry about portability problems.
Erik,
This is exactly what I had thought when I had first tried this in lispworks
3.2. Alas, what I got was
CL-USER 6 > (make-array 10 :element-type 'character)
#(NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL)
Hardly a string... Which had me pretty confused and led me to believe that
'character was not the right type to use. (I probably should have asked,
but I was much more of a newbie back then than I am now, and this bit of
information became a decontextualized fact over time.)
Thankfully, lispworks 4.1 does the right thing:
CL-USER 13 > (make-array 10 :element-type 'character)
"
Thanks for clearing things up.
Sunil
ouch. (I'm glad it has been fixed later.)
| Hardly a string... Which had me pretty confused and led me to believe that
| 'character was not the right type to use. (I probably should have asked,
| but I was much more of a newbie back then than I am now, and this bit of
| information became a decontextualized fact over time.)
stuff like this is why I think programmers who don't read specifications
learn bad habits: failure to get what you expect must be investigated and
the culprit must actually be _found_: either you did something wrong, or
somebody else did something wrong. "oh, that didn't work, let's try
something else" is good if you deal with the physical world and people,
but when you're dealing with computers and programming languages, it's
the _last_ property of the physical world I want to imitate. if an
expectation doesn't come true, either the expectation is wrong, you made
a mistake in preparation for it, or there is a flaw in the system. if
you don't do the work necessary to figure out which of these three is the
right one, you have 1/3 chance of getting it right by luck. I think the
most important desideratum for a programmer is an _unwillingness_ just to
try something until it works -- a good programmer needs to know _why_.
| Thanks for clearing things up.
sure.
#:Erik