read text from file, a chunk of more lines at a time
hans  
Newsgroups: comp.lang.lisp
From: hans <schatzer.joh...@gmail.com>
Date: Sun, 30 Oct 2011 12:48:40 -0700 (PDT)
Local: Sun, Oct 30 2011 3:48 pm
Subject: read text from file, a chunk of more lines at a time
How to read a file, one "record" (of more lines, with a consistent
record delimiter) at a time?

RECORD1
some
text
RECORD2
some
other
text
RECORD3
and
much
more
text
RECORD4
etc.

thanks


 
Paul Wallich
Newsgroups: comp.lang.lisp
From: Paul Wallich <p...@panix.com>
Date: Sun, 30 Oct 2011 16:09:00 -0400
Subject: Re: read text from file, a chunk of more lines at a time
On 10/30/11 3:48 PM, hans wrote:

Probably the simplest way is a loop of READ-LINE calls (collecting or
concatenating the lines) with a check for the delimiter, nested inside a
loop that does whatever you want with the records. Oh, and the inner loop
will need an EOF check as well. You can use an explicit LOOP, a while/until
construct, or an IF or COND with an explicit transfer of control. You
could even use recursion.
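
A minimal sketch of that loop, under assumptions not stated in the post: a record starts at any line beginning with the literal prefix RECORD, and the caller supplies a function to run on each record. The names DELIMITER-LINE-P and MAP-RECORDS are hypothetical.

```lisp
(defun delimiter-line-p (line)
  "True when LINE starts with the literal prefix RECORD (an assumption)."
  (and (>= (length line) 6)
       (string= "RECORD" line :end2 6)))

(defun map-records (function stream)
  "Call FUNCTION on each record read from STREAM.
A record is a list of lines with its header line first."
  (let ((record '()))
    (flet ((flush ()
             ;; Hand the accumulated lines to the caller, if there are any.
             (when record
               (funcall function (nreverse record))
               (setf record '()))))
      (loop for line = (read-line stream nil nil) ; NIL signals EOF
            while line
            do (when (delimiter-line-p line)
                 (flush))
               (push line record))
      ;; EOF check: the last record has no delimiter after it.
      (flush))))
```

Lines that appear before the first header are passed along as a headerless record; whether that should be an error depends on the file format.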

paul


 
Tim Bradshaw
Newsgroups: comp.lang.lisp
From: Tim Bradshaw <t...@tfeb.org>
Date: Mon, 31 Oct 2011 00:37:19 +0000 (UTC)
Local: Sun, Oct 30 2011 8:37 pm
Subject: Re: read text from file, a chunk of more lines at a time

hans <schatzer.joh...@gmail.com> wrote:
> How to read a file, one "record" (of more lines, with a consistent
> record delimiter) at a time?

I will no doubt be crucified for saying so but: Perl. Read it in Perl, spit
out sexps from Perl, and read those with Lisp.
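
The Lisp half of such a pipeline can be sketched as follows, assuming the external tool has already written one list per record to a file or pipe. READ-ALL-FORMS is a hypothetical helper name, and binding *READ-EVAL* to NIL is a precaution added here, not something the post mentions.

```lisp
(defun read-all-forms (stream)
  "Read every S-expression from STREAM and return them as a list."
  (let ((*read-eval* nil))              ; refuse #. forms in untrusted input
    ;; The stream object itself serves as a unique end-of-file marker.
    (loop for form = (read stream nil stream)
          until (eq form stream)
          collect form)))

;; Possible use, reading from a string instead of a file:
;; (with-input-from-string (in "(1 a b c d) (2 x y z) (3 d)")
;;   (read-all-forms in))
```

For a large file, processing each form inside the loop instead of collecting them all would keep memory use flat.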

 
XeCycle
Newsgroups: comp.lang.lisp
From: XeCycle <xecy...@gmail.com>
Date: Mon, 31 Oct 2011 09:47:54 +0800
Local: Sun, Oct 30 2011 9:47 pm
Subject: Re: read text from file, a chunk of more lines at a time

What's the variable "$/"?  Check perlvar(1perl).

--
Carl Lei (XeCycle)
Department of Physics, Shanghai Jiao Tong University
OpenPGP public key: 7795E591
Fingerprint: 1FB6 7F1F D45D F681 C845 27F7 8D71 8EC4 7795 E591


 
XeCycle
Newsgroups: comp.lang.lisp
From: XeCycle <xecy...@gmail.com>
Date: Mon, 31 Oct 2011 09:49:48 +0800
Local: Sun, Oct 30 2011 9:49 pm
Subject: Re: read text from file, a chunk of more lines at a time

Sorry, I thought I was in comp.lang.perl.misc.

But I recommend Perl, too.

--
Carl Lei (XeCycle)
Department of Physics, Shanghai Jiao Tong University
OpenPGP public key: 7795E591
Fingerprint: 1FB6 7F1F D45D F681 C845 27F7 8D71 8EC4 7795 E591


 
Rob Warnock
Newsgroups: comp.lang.lisp
From: r...@rpw3.org (Rob Warnock)
Date: Sun, 30 Oct 2011 21:06:04 -0500
Local: Sun, Oct 30 2011 10:06 pm
Subject: Re: read text from file, a chunk of more lines at a time
hans  <schatzer.joh...@gmail.com> wrote:

+---------------
| How to read a file, one "record" (of more lines, with a consistent
| record delimiter) at a time?
|
| RECORD1
| some
| text
| RECORD2
| some
| other
| text
| RECORD3
| and
| much
| more
| text
| RECORD4
| etc.
+---------------

If your record delimiter is actually something that matches the
regexp pattern "RECORD[0-9]+", then you can match/parse it very
easily with MISMATCH and PARSE-INTEGER:

    > (loop with lines = '("RECORD1"
                           "some"
                           "text"
                           "RECORD2"
                           "some"
                           "other"
                           "text")
            for line in lines
            for delim-p = (eql 6 (mismatch "RECORD" line))
            for datum = (if delim-p (parse-integer line :start 6) line)
        collect (list delim-p datum))

    ((T 1) (NIL "some") (NIL "text") (T 2) (NIL "some") (NIL "other")
     (NIL "text"))
    >

Adding the logic that batches the thus-tagged lines into "records"
is left as an exercise for the student.
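
One possible answer to that exercise, as a sketch only: it reuses the MISMATCH and PARSE-INTEGER test from the loop above and groups the lines into lists of the form (number line...). The function name BATCH-TAGGED-LINES is hypothetical.

```lisp
(defun batch-tagged-lines (lines)
  "Group LINES into records of the form (NUMBER LINE...)."
  (let ((records '())
        (current nil))
    (dolist (line lines)
      (if (eql 6 (mismatch "RECORD" line))
          ;; Header line: close the record in progress and start a new one.
          (progn
            (when current
              (push (nreverse current) records))
            (setf current (list (parse-integer line :start 6))))
          ;; Data line: append it to the record in progress.
          (push line current)))
    (when current
      (push (nreverse current) records))
    (nreverse records)))
```

As written, a header such as RECORDB makes PARSE-INTEGER signal an error, and data lines before the first header end up in a record with no number; both cases would need an explicit policy in real code.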

-Rob

-----
Rob Warnock             <r...@rpw3.org>
627 26th Avenue         <http://rpw3.org/>
San Mateo, CA 94403


 
Pascal J. Bourguignon
Newsgroups: comp.lang.lisp
From: "Pascal J. Bourguignon" <p...@informatimago.com>
Date: Mon, 31 Oct 2011 03:14:08 +0100
Local: Sun, Oct 30 2011 10:14 pm
Subject: Re: read text from file, a chunk of more lines at a time

By programming.  That is, using one's brain.

I don't understand this kind of question.  What problem do you have?

Do you have a problem of not knowing lisp I/O primitives?

Do you have a problem of not knowing how to read structured files?

Do you have a problem of not recognizing the structure of the file
(i.e., not being able to come up with a specification)?

What's your problem?

--
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.


 
hans
Newsgroups: comp.lang.lisp
From: hans <schatzer.joh...@gmail.com>
Date: Sun, 30 Oct 2011 23:00:40 -0700 (PDT)
Local: Mon, Oct 31 2011 2:00 am
Subject: Re: read text from file, a chunk of more lines at a time
On Oct 31, 3:14 am, "Pascal J. Bourguignon" <p...@informatimago.com>
wrote:

The file is 145 MB, has about 20000 records, a record may have over
500 lines, but the record separator is simply and always *RECORD* on a
separate line.
Sorry for the above complication with RECORD1, RECORD2 ...

In Perl you would simply do
$/ = "*RECORD*";
as Tim Bradshaw and XeCycle say.
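
For comparison, a pull-style reader in Common Lisp might look like the sketch below. It assumes a line consisting of exactly *RECORD* acts as the separator. READ-RECORD and its :EOF return value are hypothetical choices made for this sketch.

```lisp
(defun read-record (stream &key (delimiter "*RECORD*"))
  "Return the lines up to the next DELIMITER line (or EOF) as a list.
Return :EOF when STREAM is already exhausted."
  (let ((first (read-line stream nil nil)))
    (if (null first)
        :eof
        (loop for line = first then (read-line stream nil nil)
              ;; Stop at EOF or at a line that is exactly the delimiter.
              until (or (null line) (string= line delimiter))
              collect line))))

;; Possible use on a large file, one record in memory at a time:
;; (with-open-file (in "records.txt")          ; hypothetical file name
;;   (loop for record = (read-record in)
;;         until (eq record :eof)
;;         do (print (length record))))
```

If the file opens with a *RECORD* line, the first call returns an empty record, which the caller can skip or treat as an error.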


 
Kaz Kylheku
Newsgroups: comp.lang.lisp
From: Kaz Kylheku <k...@kylheku.com>
Date: Mon, 31 Oct 2011 07:20:44 +0000 (UTC)
Local: Mon, Oct 31 2011 3:20 am
Subject: Re: read text from file, a chunk of more lines at a time
On 2011-10-31, Tim Bradshaw <t...@tfeb.org> wrote:

> hans <schatzer.joh...@gmail.com> wrote:
>> How to read a file, one "record" (of more lines, with a consistent
>> record delimiter) at a time?

> I will no doubt be crucified for saying so but: Perl. Read it in Perl, spit
> out sexps from Perl, and read those with Lisp.

No way. There is a new text mangler with Lisp roots.

http://www.nongnu.org/txr

@(collect)
RECORD@num
@  (collect)
@text
@  (until)
RECORD@(skip)
@  (end)
@(end)
@(output)
@  (repeat)
(@num @(rep)@text @(last)@text@(end))
@  (end)
@(end)

Test:

$ ./txr rec2sexp.txr -
RECORD1
a
b
c
d
RECORD2
x
y
z
RECORD3
d
[Ctrl-D]
(1 a b c d)
(2 x y z)
(3 d)


 
Pascal J. Bourguignon
Newsgroups: comp.lang.lisp
From: "Pascal J. Bourguignon" <p...@informatimago.com>
Date: Mon, 31 Oct 2011 15:33:53 +0100
Local: Mon, Oct 31 2011 10:33 am
Subject: Re: read text from file, a chunk of more lines at a time

Ok, so it seems you can recognize more or less the structure of the
file.

You say "separator", but in your example, it looks like the 'RECORD'
token is a prefix.    You must choose what file structure you have:

    file   ::= { record } .
    record ::= 'RECORD' { line } .

    file   ::= { record } .
    record ::= { line } 'RECORD' .

    file   ::= [ record { 'RECORD' record } ] .
    record ::= { line } .
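
To make the difference concrete, here is a sketch that splits the same list of lines under each of the three grammars. SPLIT-RECORDS, its :MODE keyword, and the error messages are hypothetical; the token is assumed to occupy a whole line.

```lisp
(defun split-records (lines &key (mode :separator) (token "RECORD"))
  "Split LINES (a list of strings) into records, each a list of lines.
MODE is :PREFIX, :SUFFIX or :SEPARATOR, matching the three grammars."
  (let ((records '())
        (current '())
        ;; In :PREFIX mode no record is open until the first token is seen.
        (started (not (eq mode :prefix))))
    (flet ((close-record ()
             (push (reverse current) records)
             (setf current '())))
      (dolist (line lines)
        (cond ((string/= line token)
               (unless started
                 (error "Line ~S appears before the first ~A" line token))
               (push line current))
              ((eq mode :prefix)
               (when started (close-record))
               (setf started t))
              (t (close-record))))      ; :SUFFIX and :SEPARATOR close here
      (ecase mode
        (:prefix (when started (close-record)))
        (:suffix (when current
                   (error "Input ends without a closing ~A" token)))
        (:separator (close-record))))
    (nreverse records)))
```

Given the lines RECORD, a, b, RECORD, c, the :PREFIX reading yields two records, the :SEPARATOR reading yields three (the first one empty), and the :SUFFIX reading signals an error because the last record is never closed.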

But you didn't answer the other questions:

>> Do you have a problem of not knowing lisp I/O primitives?

>> Do you have a problem of not knowing how to read structured files?

--
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.

 
Anton Kovalenko
Newsgroups: comp.lang.lisp
From: Anton Kovalenko <an...@sw4me.com>
Date: Mon, 31 Oct 2011 20:59:45 +0400
Local: Mon, Oct 31 2011 12:59 pm
Subject: Re: read text from file, a chunk of more lines at a time
"Pascal J. Bourguignon" <p...@informatimago.com> writes:

>> The file is 145 MB, has about 20000 records, a record may have over
>> 500 lines, but the record separator is simply and always *RECORD* on a
>> separate line.
>> Sorry for the above complication with RECORD1, RECORD2 ...

> Ok, so it seems you can recognize more or less the structure of the
> file.

> You say "separator", but in your example, it looks like the 'RECORD'
> token is a prefix.    You must choose what file structure you have:

It's a wonderful illustration of Perl vs. CL differences.

Pascal Bourguignon mentioned possible variants of input grammar. Each of
them is fairly trivial to code in CL, and it might be just as trivial in
Perl. But that's not how people program in Perl, apparently: they
recognize $/ as something that has a chance to work, and they go on and
use it because it's simple and terse and "beautiful". Now let's look
closer at this beauty.

$/="*RECORD*" is obviously wrong:

 *RECORD*
 An item, that would set the *RECORD* straight.
 *RECORD*
 A previous record triggered a bug.

$/="\n*RECORD*\n" is somewhat better, but the *first* record header (if
there are headers) won't be recognized as record separator anymore.
(For a variant without \n's, we get an empty first record in this case,
 but, of course, Perl people would "solve" it by ignoring empty records).

Now, a line-sensitive regular expression could be useful as a separator
instead, but $/ is _not_ a regex, so we're out of luck. It's "better" to
leave it as $/="*RECORD*". A good Perl programmer would _document_ the
problem with inline *RECORD*s; that's the maximum quality we could
reasonably expect.
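
The inline-occurrence problem can be reproduced in a few lines of Lisp. This sketch contrasts splitting on the substring wherever it occurs with comparing whole lines; *SAMPLE*, SPLIT-ON-SUBSTRING and COUNT-DELIMITER-LINES are hypothetical names, and the sample text is the one from the example above.

```lisp
(defparameter *sample*
  (format nil "*RECORD*~%An item, that would set the *RECORD* straight.~%~
               *RECORD*~%A previous record triggered a bug.~%"))

(defun split-on-substring (text separator)
  "Split TEXT at every occurrence of SEPARATOR, wherever it appears."
  (loop with step = (length separator)
        for start = 0 then (+ pos step)
        for pos = (search separator text :start2 start)
        collect (subseq text start pos)
        while pos))

(defun count-delimiter-lines (text delimiter)
  "Count the lines of TEXT that consist of exactly DELIMITER."
  (with-input-from-string (in text)
    (loop for line = (read-line in nil nil)
          while line
          count (string= line delimiter))))
```

On the sample, the substring split finds three separators and cuts the first item in the middle of its sentence, while the line-based check finds only the two real delimiter lines.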

Seriously, such a thing is "perfect" as one-shot throwaway code _only_.
But when I want to massage a text file once and forget about it, I'd
rather open it in the _editor_, and with some replace-regexps it will
become a 145-MB file of S-expressions, which I'll then read with
CL:READ.

--
http://github.com/akovalenko/sbcl-win32-threads/wiki
+7(916)345-34-02 | Elektrostal' MO, Russia


 
Kaz Kylheku
Newsgroups: comp.lang.lisp
From: Kaz Kylheku <k...@kylheku.com>
Date: Mon, 31 Oct 2011 17:57:17 +0000 (UTC)
Local: Mon, Oct 31 2011 1:57 pm
Subject: Re: read text from file, a chunk of more lines at a time
On 2011-10-31, Kaz Kylheku <k...@kylheku.com> wrote:

Improved.

- :vars on the inner collect ensures that an empty collect still
  produces a binding for the text variable (a binding to the empty list nil),
  even if there is no match.

- Output simplified.

@(collect)
RECORD@num
@  (collect :vars (text))
@text
@  (until)
RECORD@(skip)
@  (end)
@(end)
@(output)
@  (repeat)
(@num@(rep) @text@(end))
@  (end)
@(end)

$ ./txr rec2sexp.txr  -
RECORD1
RECORD2
a      
RECORD3
a  
b
foo RECORD4
RECORD4
[Ctrl-D]
(1)
(2 a)
(3 a b foo RECORD4)
(4)


 
Tim Bradshaw
Newsgroups: comp.lang.lisp
From: Tim Bradshaw <t...@tfeb.org>
Date: Mon, 31 Oct 2011 19:42:28 +0000 (UTC)
Local: Mon, Oct 31 2011 3:42 pm
Subject: Re: read text from file, a chunk of more lines at a time

Kaz Kylheku <k...@kylheku.com> wrote:
> No way. There is a new text mangler with Lisp roots.

I don't think this is really different: my point wasn't really "use Perl"
it was "use the appropriate tool" (OK, I should have said that).  There
probably are cases where there is a real reason to use x for everything,
but generally the "reason" is some kind of invented thing in people's
minds, and in fact it is just fine to use a combination of tools: AWK or
Perl or txr or what-have-you for file-munging and Lisp or ... for other
bits.

txr looks interesting.


 
Carlos
Newsgroups: comp.lang.lisp
From: Carlos <an...@quovadis.com.ar>
Date: Mon, 31 Oct 2011 21:18:49 +0100
Local: Mon, Oct 31 2011 4:18 pm
Subject: Re: read text from file, a chunk of more lines at a time
[Anton Kovalenko <an...@sw4me.com>, 2011-10-31 20:59]
[...]

To solve your "problem", a Perl programmer would probably just read and
discard the first header, and then set $/ to "\n*RECORD*\n". Your
strawman Perl programmers are too incompetent, you should fire them.
--

 
Anton Kovalenko
Newsgroups: comp.lang.lisp
From: Anton Kovalenko <an...@sw4me.com>
Date: Tue, 01 Nov 2011 03:34:54 +0400
Local: Mon, Oct 31 2011 7:34 pm
Subject: Re: read text from file, a chunk of more lines at a time

Carlos <an...@quovadis.com.ar> writes:
> To solve your "problem", a Perl programmer would probably just read and
> discard the first header, and then set $/ to "\n*RECORD*\n". Your
> strawman Perl programmers are too incompetent, you should fire them.

As we can see, even a competent, caring Perl programmer proposes "read
and discard" instead of "read, check and discard", and that's in the
discussion of correctness.  Why would I need a strawman?

--
Regards, Anton Kovalenko
+7(916)345-34-02 | Elektrostal' MO, Russia


 
Kaz Kylheku
Newsgroups: comp.lang.lisp
From: Kaz Kylheku <k...@kylheku.com>
Date: Tue, 1 Nov 2011 00:02:22 +0000 (UTC)
Local: Mon, Oct 31 2011 8:02 pm
Subject: Re: read text from file, a chunk of more lines at a time
On 2011-10-31, Carlos <an...@quovadis.com.ar> wrote:

If you discard the first header, but that record is empty,
then you're again left with a header which does not match
"\n*RECORD*\n"

\n is not a good substitute for anchors like ^ and $ which are not
character matches, but a semantic extension to regexes.

--
Alan Perlis Epigram 32. Programmers are not to be measured by their ingenuity
and their logic but by the completeness of their case analysis.


 
Tim Bradshaw
Newsgroups: comp.lang.lisp
From: Tim Bradshaw <t...@tfeb.org>
Date: Tue, 1 Nov 2011 00:03:57 +0000 (UTC)
Local: Mon, Oct 31 2011 8:03 pm
Subject: Re: read text from file, a chunk of more lines at a time

Anton Kovalenko <an...@sw4me.com> wrote:
> As we can see, even a competent, caring Perl programmer proposes "read
> and discard" instead of "read, check and discard", and that's in the
> discussion of correctness.  Why would I need a strawman?

It's this kind of thing that makes me want to take Lisp programmers out and
shoot them.

 
Anton Kovalenko
Newsgroups: comp.lang.lisp
From: Anton Kovalenko <an...@sw4me.com>
Date: Tue, 01 Nov 2011 05:08:31 +0400
Local: Mon, Oct 31 2011 9:08 pm
Subject: Re: read text from file, a chunk of more lines at a time

Tim Bradshaw <t...@tfeb.org> writes:
>> As we can see, even a competent, caring Perl programmer proposes "read
>> and discard" instead of "read, check and discard", and that's in the
>> discussion of correctness.  Why would I need a strawman?

> It's this kind of thing that makes me want to take Lisp programmers out and
> shoot them.

Your own suggestion to spit out sexps was perfectly sane (and it
doesn't need Perl, which is a good sign). What's ridiculous here is not
Perl, or Perl's $/, it's how people stick to a specific Perl feature
($/), even after it was shown to be a wrong tool in a number of ways
(Kaz Kylheku noticed an additional danger of empty records).

A similar thing could happen with CL. Imagine that we're parsing
command-line arguments, and there's one that should be an
integer-bounded range, like 1222-33334. Let's use parse-integer.  Then
it turns out that 0xDEAD-0xDEEF is also valid, and 0177-0755 should be
octal and it's silently misinterpreted as decimal. Let's insert some
special cases and still use parse-integer. Then it turns out that we
accept 0x+12-0x+FF, which we shouldn't, and we insert some more code but
_still_ use parse-integer. Then 0-283 turns out to be misdetected as
octal and signals an error on 8...
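
For illustration, a purpose-built parser avoids that pile of special cases. The sketch below is one possible shape, assuming C-style non-negative literals (0x for hex, a leading 0 for octal) and a single dash between the bounds. PARSE-C-INTEGER and PARSE-RANGE are hypothetical names, and PARSE-INTEGER is called only after the digits have been validated.

```lisp
(defun parse-c-integer (string &key (start 0) (end (length string)))
  "Parse a non-negative C-style integer literal from STRING[START,END)."
  (flet ((prefix-p (prefix)
           (let ((stop (+ start (length prefix))))
             (and (< stop end)          ; at least one character must follow
                  (string-equal prefix string :start2 start :end2 stop)))))
    (multiple-value-bind (radix digits-start)
        (cond ((prefix-p "0x") (values 16 (+ start 2)))
              ((prefix-p "0") (values 8 (+ start 1)))
              (t (values 10 start)))
      ;; Reject signs, empty digit strings and digits outside the radix.
      (unless (and (< digits-start end)
                   (every (lambda (c) (digit-char-p c radix))
                          (subseq string digits-start end)))
        (error "Not a C integer literal: ~S" (subseq string start end)))
      (parse-integer string :start digits-start :end end :radix radix))))

(defun parse-range (string)
  "Parse LOW-HIGH into a cons (LOW . HIGH); each bound is a C literal."
  (let ((dash (position #\- string)))
    (unless dash
      (error "No dash in range: ~S" string))
    (cons (parse-c-integer string :end dash)
          (parse-c-integer string :start (1+ dash)))))
```

Because each bound is parsed separately after the split, 0-283 is read as decimal 0 and decimal 283, and 0x+12-0x+FF is rejected because of the sign characters.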

Surely it _could_ happen with CL, but I have yet to see it happening.

--
Regards, Anton Kovalenko <http://github.com/akovalenko/sbcl-win32-threads>
+7(916)345-34-02 | Elektrostal' MO, Russia


 
Carlos
Newsgroups: comp.lang.lisp
From: Carlos <an...@quovadis.com.ar>
Date: Tue, 1 Nov 2011 02:23:43 +0100
Local: Mon, Oct 31 2011 9:23 pm
Subject: Re: read text from file, a chunk of more lines at a time
[Kaz Kylheku <k...@kylheku.com>, 2011-11-01 00:02]

Come on, you are testing a sketch of an algorithm against a made-up specification.
He was talking about *RECORD* being not a separator but a header. Now
you say there can be empty records? Then the Perl programmer would set
$/ to "*RECORD*\n" and join records if needed.

My point is that Perl programmers aren't necessarily stupid. That's all.

Oh, and also that Perl's augmented read-line simplifies the solution a
lot.

--


 
Carlos
Newsgroups: comp.lang.lisp
From: Carlos <an...@quovadis.com.ar>
Date: Tue, 1 Nov 2011 02:25:25 +0100
Local: Mon, Oct 31 2011 9:25 pm
Subject: Re: read text from file, a chunk of more lines at a time
[Anton Kovalenko <an...@sw4me.com>, 2011-11-01 03:34]

> Carlos <an...@quovadis.com.ar> writes:

> > To solve your "problem", a Perl programmer would probably just read
> > and discard the first header, and then set $/ to "\n*RECORD*\n".
> > Your strawman Perl programmers are too incompetent, you should fire
> > them.

> As we can see, even a competent, caring Perl programmer proposes "read
> and discard" instead of "read, check and discard", and that's in the
> discussion of correctness.  Why would I need a strawman?

Because I said "read and discard the first header", not "read and
discard anything whatsoever".

--


 
Carlos
Newsgroups: comp.lang.lisp
From: Carlos <an...@quovadis.com.ar>
Date: Tue, 1 Nov 2011 02:29:53 +0100
Local: Mon, Oct 31 2011 9:29 pm
Subject: Re: read text from file, a chunk of more lines at a time
[Carlos <an...@quovadis.com.ar>, 2011-11-01 02:25]
> [Anton Kovalenko <an...@sw4me.com>, 2011-11-01 03:34]
> > Carlos <an...@quovadis.com.ar> writes:

> > > To solve your "problem", a Perl programmer would probably just
> > > read and discard the first header, and then set $/ to
> > > "\n*RECORD*\n". Your strawman Perl programmers are too
> > > incompetent, you should fire them.

> > As we can see, even a competent, caring Perl programmer proposes
> > "read and discard" instead of "read, check and discard", and that's
> > in the discussion of correctness.  Why would I need a strawman?

> Because I said "read and discard the first header", not "read and
> discard anything whatsoever".

                   ^^^^^^^^^^ I think this "whatsoever" here isn't
                   right; I withdraw it.

 
Anton Kovalenko
Newsgroups: comp.lang.lisp
From: Anton Kovalenko <an...@sw4me.com>
Date: Tue, 01 Nov 2011 05:39:35 +0400
Local: Mon, Oct 31 2011 9:39 pm
Subject: Re: read text from file, a chunk of more lines at a time

Anton Kovalenko <an...@sw4me.com> writes:
>>> As we can see, even a competent, caring Perl programmer proposes "read
>>> and discard" instead of "read, check and discard", and that's in the
>>> discussion of correctness.  Why would I need a strawman?

>> It's this kind of thing that makes me want to take Lisp programmers out and
>> shoot them.

[...]

> [I]t's how people stick to a specific Perl feature
> ($/), even after it was shown to be a wrong tool in a number of ways
> (Kaz Kylheku noticed an additional danger of empty records).

[...]

> Surely it _could_ happen with CL, but I have yet to see it happening.

Well, that was a gross overstatement: it happens all the time with
FORMAT ("~a-~a" is incorrect for making symbol names from other symbol
names, but widely used). And I have an idea why it happens with Perl and
with FORMAT, but not with most other CL stuff.
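
One way the "~a-~a" idiom goes wrong, as a sketch: the ~a directive prints a symbol according to the current printer settings, so the resulting name changes with *PRINT-CASE*. The two helper names below are hypothetical, and the standard readtable (which upcases symbol names on reading) is assumed.

```lisp
(defun symbol-append-format (a b)
  "Build a symbol name with FORMAT; sensitive to printer settings."
  (intern (format nil "~a-~a" a b)))

(defun symbol-append-names (a b)
  "Build a symbol name from the SYMBOL-NAMEs; independent of the printer."
  (intern (concatenate 'string (symbol-name a) "-" (symbol-name b))))

;; Possible demonstration:
;; (let ((*print-case* :downcase))
;;   (list (symbol-name (symbol-append-format 'foo 'bar))
;;         (symbol-name (symbol-append-names 'foo 'bar))))
```

With *PRINT-CASE* bound to :DOWNCASE, the FORMAT version interns a symbol whose name is lowercase, which is not the symbol a reader of the source would expect.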

If we leave out FORMAT, CL doesn't have "killer features", that is,
things so shining with elegance and brevity that we're instantly tempted
to use them. There's nothing magic about PARSE-INTEGER, or SEARCH, or
MAPCAR..., you can write your own and use it, sometimes without any
performance penalty. When a tool is appropriate, you use it; when it's
not quite there, you roll your own.  The original tool we wanted to use
usually provides some good hints on the interface we want to export
(e.g. our own parse-c-integer could take string, end, start, radix,
junk-allowed too, and :test & :key are useful for many other things).

In Perl, OTOH, _any_ feature is a killer feature. How would I roll my
own $/ or $_, if they were not there? Therefore, each feature that we're
using for a specific task has a chance of becoming addictive: it looks
like too much work to do if we dare to throw it away, even if it's not
really so hard for a specific task. It's not hard in Perl, after all, to
read a line at a time in a loop, check for "*RECORD*", collect a list --
that kind of boring thing we would do in CL.

--
Regards, Anton Kovalenko <http://github.com/akovalenko/sbcl-win32-threads>
+7(916)345-34-02 | Elektrostal' MO, Russia


 
Kaz Kylheku
Newsgroups: comp.lang.lisp
From: Kaz Kylheku <k...@kylheku.com>
Date: Tue, 1 Nov 2011 01:53:30 +0000 (UTC)
Local: Mon, Oct 31 2011 9:53 pm
Subject: Re: read text from file, a chunk of more lines at a time
On 2011-10-31, Kaz Kylheku <k...@kylheku.com> wrote:

Enough of the trivial Hello, World stuff, and on to a more robust, realistic
solution to the problem.

New requirements:

- produce literals, and escape occurrences of " and single
  escapes within literals

- catch RECORDX where X is not a number

- enforce that records start with RECORD<NUM>

We use a filter (filters are based on a trie data structure) to do the
stringification.  A sprinkle of TXR's "blub-style for the Java spewing masses"
exception handling for the errors. We define a custom exception, derived
from exception type error.

We tighten the record collect with :gap 0 so that it does not skip nonmatching
garbage in its search for a header (not because we have to, but just for the
hell of it).

Look, Ma, one single regex used. For what regexes are designed for:
recognizing/validating a token.

@(deffilter lispstr ("\"" "\\\"") ("\\" "\\\\"))
@(defex badusage error)
@(try)
@  (collect :gap 0)
@    (cases)
RECORD@{num /[0-9]+/}
@    (or)
RECORD@nonnum
@      (throw badusage `RECORD followed by "@nonnum" which is not a number`)
@    (or)
@blah
@      (throw badusage `RECORD<N> missing, "@blah" found instead`)
@    (end)
@    (collect :vars (text))
@text
@    (until)
RECORD@(skip)
@    (end)
@  (end)
@  (output :filter lispstr)
@    (repeat)
(@num@(rep) "@text"@(end))
@    (end)
@  (end)
@(catch badusage (message))
@  (output)
ERROR: @message
@  (end)
@  (fail)
@(end)

Tests:

$ echo "foo" | txr rec2sexp.txr -
ERROR: RECORD<N> missing, "foo" found instead

$ echo "RECORDB" | txr rec2sexp.txr -
ERROR: RECORD followed by "B" which is not a number

$ echo "RECORD1" | txr rec2sexp.txr -
(1)

$ ./txr rec2sexp.txr -
RECORD1
a
b
c
d
RECORDB
3
ERROR: RECORD followed by "B" which is not a number

$ ./txr rec2sexp.txr -
RECORD1
a\b"cdef
g h i
j k
RECORD2
RECORD3
\
RECORD4
"
[Ctrl-D]
(1 "a\\b\"cdef" "g h i" "j k")
(2)
(3 "\\")
(4 "\"")
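
For readers who want to compare, the same three requirements can be sketched in plain Common Lisp. This is not a translation of the TXR script, only an approximation: PARSE-HEADER and RECORDS->SEXPS are hypothetical names, the input is assumed to be a list of lines already in memory, and string escaping is left to the standard printer (PRIN1).

```lisp
(defun parse-header (line)
  "Return the record number if LINE is RECORD<digits>, NIL if LINE does
not start with RECORD, and signal an error for RECORD<non-number>."
  (when (eql 0 (search "RECORD" line))
    (let ((rest (subseq line 6)))
      (if (and (plusp (length rest))
               (every #'digit-char-p rest))
          (parse-integer rest)
          (error "RECORD followed by ~S which is not a number" rest)))))

(defun records->sexps (lines)
  "Turn LINES into a list of (NUMBER \"text\"...) records."
  (let ((records '())
        (current nil))
    (dolist (line lines)
      (let ((num (parse-header line)))
        (cond (num
               (when current
                 (push (nreverse current) records))
               (setf current (list num)))
              (current
               (push line current))
              ;; Data before any header violates the "must start with
              ;; RECORD<NUM>" requirement.
              (t
               (error "RECORD<N> missing, ~S found instead" line)))))
    (when current
      (push (nreverse current) records))
    (nreverse records)))

;; Possible use: print one readable list per record.
;; (dolist (record (records->sexps '("RECORD1" "a" "RECORD2")))
;;   (prin1 record)
;;   (terpri))
```

PRIN1 writes strings with " and \ escaped, so the printed lists can be read back with READ.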


 
Tim Bradshaw
Newsgroups: comp.lang.lisp
From: Tim Bradshaw <t...@tfeb.org>
Date: Tue, 1 Nov 2011 07:33:11 +0000 (UTC)
Local: Tues, Nov 1 2011 3:33 am
Subject: Re: read text from file, a chunk of more lines at a time

Carlos <an...@quovadis.com.ar> wrote:
> My point is that Perl programmers aren't necessarily stupid. That's all.

My point was that as well, with the additional one that Lisp programmers
are often really disturbingly literal-minded (I'd like to believe it's just
the 8 of them remaining in cll, but I don't).

 
Tim Bradshaw
Newsgroups: comp.lang.lisp
From: Tim Bradshaw <t...@tfeb.org>
Date: Tue, 1 Nov 2011 07:33:12 +0000 (UTC)
Local: Tues, Nov 1 2011 3:33 am
Subject: Re: read text from file, a chunk of more lines at a time

Anton Kovalenko <an...@sw4me.com> wrote:
> Your own suggestion to spit out sexps was perfectly sane (and it
> doesn't need Perl, which is a good sign). What's ridiculous here is not
> Perl, or Perl's $/, it's how people stick to a specific Perl feature
> ($/), even after it was shown to be a wrong tool in a number of ways
> (Kaz Kylheku noticed an additional danger of empty records).

That, of course, wasn't what I meant.

 