How to split a string (or arbitrary sequence) at each occurrence of a value.
Newsgroups: comp.lang.lisp
From: Daniel Pittman <dan...@rimspace.net>
Date: Fri, 12 Oct 2001 18:35:25 +1000
Local: Fri, Oct 12 2001 4:35 am
Subject: How to split a string (or arbitrary sequence) at each occurrence of a value.
I am looking for the simplest way to split a string into four strings
based on a character -- to parse an IP address string, specifically.

What is the best, easiest, fastest, etc, way to split a string into
substrings based on a character position.  In Emacs Lisp I would just:

(let ((address "210.23.138.16"))
  (split-string address "\\."))  ; second arg is regexp to split on.

Now, I don't actually need regexp functionality here; a literal '.' is
enough for me.

This strikes me as the sort of idiom that would be common enough for
Common Lisp[1] to feature it as part of the standard.

I would also be interested to know if y'all can suggest a general way to
do this for generalized sequences as well as for strings, but that's not
what I need to do right now.

Oh, and am I making a really silly mistake storing an IP address in a
slot of ":type (vector (integer 0 255) 4)"?

        Daniel

Footnotes:
[1]  CLISP 2.27, specifically, with the HyperSpec as reference.

--
Money won't buy happiness, but it will pay the salaries of
a large research staff to study the problem.
        -- Bill Vaughan


 
Newsgroups: comp.lang.lisp
From: e...@agharta.de (Dr. Edmund Weitz)
Date: 12 Oct 2001 11:24:17 +0200
Local: Fri, Oct 12 2001 5:24 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

Daniel Pittman <dan...@rimspace.net> writes:
> I am looking for the simplest way to split a string into four strings
> based on a character -- to parse an IP address string, specifically.

I needed something similar last week and came up with this solution:

(defun split (sequence &key
                       ;; EQL rather than EQ: EQ is not reliable on characters.
                       (test #'(lambda (x) (eql x #\Space))))
  "Returns a list of sub-sequences of SEQUENCE where each
element that satisfies TEST is treated as a separator."
  (let (result)
    (do* ((old-pos (position-if-not test sequence)
                   (when old-pos
                     (position-if-not test sequence
                                      :start old-pos))))
         ((null old-pos) (nreverse result))
     (let ((new-pos
              (position-if test sequence
                           :start old-pos)))
       (if new-pos
           (setf result (cons
                         (subseq sequence old-pos new-pos)
                         result)
                 old-pos (1+ new-pos))
         (setf result (cons
                       (subseq sequence old-pos)
                       result)
               old-pos nil))))))

Note that this might not be very fast; I didn't need it to be. Also note
that I'm rather new to CL, so others here will definitely have better
solutions.

Best regards,
Edi.
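
For the IP-address case in the original question, a simpler variant that splits on one literal character might look like the sketch below. The names SPLIT-ON-CHAR and PARSE-DOTTED-QUAD are made up for illustration and do not come from the thread. Unlike the SPLIT above, this version keeps empty fields instead of skipping runs of separators.

```lisp
;; Hypothetical helpers, not from the thread: a sketch only.
(defun split-on-char (string delimiter)
  "Return a list of the substrings of STRING between occurrences of DELIMITER.
Empty fields are kept, so \"a..b\" yields three strings."
  (loop with start = 0
        for end = (position delimiter string :start start)
        ;; When END is NIL, SUBSEQ runs to the end of STRING.
        collect (subseq string start end)
        while end
        do (setf start (1+ end))))

(defun parse-dotted-quad (string)
  "Parse \"a.b.c.d\" into a list of integers, one per field."
  (mapcar #'parse-integer (split-on-char string #\.)))
```

Calling (parse-dotted-quad "210.23.138.16") should give a list of the four integers; PARSE-INTEGER signals an error on a field that is empty or not a number.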


 
Newsgroups: comp.lang.lisp
From: Erik Haugan <e...@haugan.no>
Date: Fri, 12 Oct 2001 09:28:07 GMT
Local: Fri, Oct 12 2001 5:28 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
* Daniel Pittman <dan...@rimspace.net>

> What is the best, easiest, fastest, etc, way to split a string into
> substrings based on a character position.  In Emacs Lisp I would just:

This may not be fast (I don't know), but it's straightforward and readable.

(defun split (string &optional (delimiter #\Space))
  (with-input-from-string (*standard-input* string)
    (let ((*standard-output* (make-string-output-stream)))
      (nconc (loop for char = (read-char nil nil nil)
                   while char
                   if (char= char delimiter)
                     collect (get-output-stream-string *standard-output*)
                   else
                     do (write-char char))
             (list (get-output-stream-string *standard-output*))))))

Erik


 
Newsgroups: comp.lang.lisp
From: Christophe Rhodes <cs...@cam.ac.uk>
Date: 12 Oct 2001 10:40:41 +0100
Local: Fri, Oct 12 2001 5:40 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

Daniel Pittman <dan...@rimspace.net> writes:
> I am looking for the simplest way to split a string into four strings
> based on a character -- to parse an IP address string, specifically.

> [snip]

> This strikes me as the sort of idiom that would be common enough for
> Common Lisp[1] to feature it as part of the standard.

> I would also be interested to know if y'all can suggest a general way to
> do this for generalized sequences as well as for strings, but that's not
> what I need to do right now.

See <URL:http://ww.telent.net/cliki/PARTITION>.

> Oh, and am I making a really silly mistake storing an IP address in a
> slot of ":type (vector (integer 0 255) 4)"?

No, that's less stupid than a lot of other representations :-)

Cheers,

Christophe
--
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)


 
Newsgroups: comp.lang.lisp
From: Christophe Rhodes <cs...@cam.ac.uk>
Date: 12 Oct 2001 10:43:03 +0100
Local: Fri, Oct 12 2001 5:43 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
[ superseded to clarify ]

Daniel Pittman <dan...@rimspace.net> writes:
> I am looking for the simplest way to split a string into four strings
> based on a character -- to parse an IP address string, specifically.

> [snip]

> This strikes me as the sort of idiom that would be common enough for
> Common Lisp[1] to feature it as part of the standard.

> I would also be interested to know if y'all can suggest a general way to
> do this for generalized sequences as well as for strings, but that's not
> what I need to do right now.

See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
community-discussed function is described in roughly
specification-level detail, with links to a reference implementation.

> Oh, and am I making a really silly mistake storing an IP address in a
> slot of ":type (vector (integer 0 255) 4)"?

No, that's less stupid than a lot of other representations :-)

Cheers,

Christophe
--
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)


 
Newsgroups: comp.lang.lisp
From: Erik Haugan <e...@haugan.no>
Date: Fri, 12 Oct 2001 13:01:41 GMT
Local: Fri, Oct 12 2001 9:01 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
Sorry for replying to my own article, however, I made such an inelegant
twist in the code I posted that I feel I have to correct it:

(defun string-split (string &optional (delimiter #\Space))
  (with-input-from-string (*standard-input* string)
    (let ((*standard-output* (make-string-output-stream)))
      (loop for char = (read-char nil nil nil)
            if (or (null char)
                   (char= char delimiter))
              collect (get-output-stream-string *standard-output*)
            else
              do (write-char char)
            while char))))

Erik


 
Newsgroups: comp.lang.lisp
From: Marco Antoniotti <marc...@cs.nyu.edu>
Date: 12 Oct 2001 09:49:03 -0400
Local: Fri, Oct 12 2001 9:49 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

I am sorry to be sooo nagging (again) on such a stupid matter. But......

The name PARTITION is inappropriate.  SPLIT-SEQUENCE is much more
descriptive of what the function does.

Cheers

--
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group        tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                 fax  +1 - 212 - 995 4122
New York, NY 10003, USA                 http://bioinformatics.cat.nyu.edu
                    "Hello New York! We'll do what we can!"
                           Bill Murray in `Ghostbusters'.


 
Newsgroups: comp.lang.lisp
From: "Wade Humeniuk" <humen...@cadvision.com>
Date: Fri, 12 Oct 2001 08:33:25 -0600
Local: Fri, Oct 12 2001 10:33 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
I usually do this type of thing with

(defun read-delimited-string (string &optional (delimiter #\.))
  "Returns a read list of delimited values from a string"
  (read-from-string
   (concatenate 'string "("
                (substitute #\space delimiter string)
                ")")))

CL-USER 3 > (read-delimited-string "210.23.138.16")
(210 23 138 16)
15

CL-USER 4 >

Wade

"Daniel Pittman" <dan...@rimspace.net> wrote in message

news:873d4pjlwy.fsf@inanna.rimspace.net...
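
One caveat with the reader-based approach is that READ-FROM-STRING will honour #. and will return whatever objects the text happens to spell. A more defensive variant, assuming the input may come from an untrusted source, could look like this sketch. The name READ-DELIMITED-INTEGERS is hypothetical, and the integer check is an added assumption about what the caller wants.

```lisp
;; Hypothetical variant of READ-DELIMITED-STRING: a sketch only.
(defun read-delimited-integers (string &optional (delimiter #\.))
  "Read STRING as a list of integers separated by DELIMITER.
Signals an error if any field is not an integer."
  (with-standard-io-syntax
    ;; WITH-STANDARD-IO-SYNTAX binds *READ-EVAL* to T, so turn it off again.
    (let* ((*read-eval* nil)
           (fields (read-from-string
                    (concatenate 'string "("
                                 (substitute #\Space delimiter string)
                                 ")"))))
      (unless (every #'integerp fields)
        (error "Non-integer field in ~S" string))
      fields)))
```

This still trusts the reader with parentheses and other syntax in the input, so it is only a small step up in safety, not a validator.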


 
Newsgroups: comp.lang.lisp
From: Christophe Rhodes <cs...@cam.ac.uk>
Date: 12 Oct 2001 15:43:14 +0100
Local: Fri, Oct 12 2001 10:43 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

I suppose this depends if you're a physicist or a set theorist; to a
physicist (me, for example) partition has connotation of putting
partitions into something, to divide it up;

I freely give permission to vendors to include the partition code in
their Lisps; if vendors think that it will help, they are free to call
it 'SPLIT-SEQUENCE' if they like, or 'SPLIT', or whatever. Not that I
generally believe in appeals to the market to determine correctness,
but in this case it's my way of dodging the issue. I *like* the name
'PARTITION', so that's what I call it; others are free to do
otherwise, though as a matter of unifying the community I would rather
hope that they didn't. Ultimately, I accept the possibility that I
will be in a minority of one.

Anyone else want to volunteer ideas for utility functions that
everyone writes? Imagine that CL had a 'PARTITION' in the language;
what would people lament the absence of to comp.lang.lisp once a week?

Cheers,

Christophe
--
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)


 
Newsgroups: comp.lang.lisp
From: "Tim Moore" <mo...@bricoworks.com>
Date: 12 Oct 2001 14:57:41 GMT
Local: Fri, Oct 12 2001 10:57 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
In article <y6citdlarzk....@octagon.mrl.nyu.edu>, "Marco Antoniotti"

<marc...@cs.nyu.edu> wrote:
> Christophe Rhodes <cs...@cam.ac.uk> writes:
>> See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
>> community-discussed function is described in roughly
>> specification-level detail, with links to a reference implementation.
> I am sorry to be sooo nagging (again) on such a stupid matter. But......
>  The name PARTITION is inappropriate.  SPLIT-SEQUENCE is much more
> descriptive of what the function does.  Cheers

Get over it!

Tim

Add smileys as necessary


 
Newsgroups: comp.lang.lisp
From: Russell Senior <seni...@aracnet.com>
Date: 12 Oct 2001 13:26:30 -0700
Local: Fri, Oct 12 2001 4:26 pm
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

>>>>> "Wade" == Wade Humeniuk <humen...@cadvision.com> writes:

Wade> I usually do this type of thing with

Wade> (defun read-delimited-string (string &optional (delimiter #\.))
Wade>   "Returns a read list of delimited values from a string"
Wade>   (read-from-string
Wade>    (concatenate 'string "("
Wade>                 (substitute #\space delimiter string)
Wade>                 ")")))
Wade>
Wade> CL-USER 3 > (read-delimited-string "210.23.138.16")
Wade> (210 23 138 16)
Wade> 15

This, of course, won't work the way you want if the delimited values
also contain spaces.  

I've been using a split-sequence function that was discussed here on
comp.lang.lisp back in Sept 1998, which works reasonably well.  The
problem above, though, raises the question of how one might handle
quoting of delimiters.  It hasn't been a problem for me, as usually
things are arranged so that it won't be, but in the general case it
could.

--
Russell Senior         ``The two chiefs turned to each other.        
seni...@aracnet.com      Bellison uncorked a flood of horrible      
                         profanity, which, translated meant, `This is
                         extremely unusual.' ''                      


 
Newsgroups: comp.lang.lisp
From: Shannon Spires <svsp...@remove-this.nmia.com>
Date: Fri, 12 Oct 2001 15:52:04 -0600
Local: Fri, Oct 12 2001 5:52 pm
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
In article <873d4pjlwy....@inanna.rimspace.net>, Daniel Pittman

<dan...@rimspace.net> wrote:
> Oh, and am I making a really silly mistake storing an IP address in a
> slot of ":type (vector (integer 0 255) 4)"?

I usually store them as 32-bit integers. It's simple that way, and
my TCP/IP stack routines use integers internally anyway. Provided you
have good conversion routines to and from dotted notation for human I/O,
it works well.

Shannon Spires
svsp...@nmia.com
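
The conversion routines mentioned above are not shown in the post. A minimal sketch of what they might look like follows; the function names are hypothetical, and the first octet is assumed to be the most significant byte.

```lisp
;; Hypothetical conversion helpers: a sketch, not Shannon Spires's code.
(defun octets-to-ipv4-integer (a b c d)
  "Pack four octets into one 32-bit integer, A being the most significant."
  (logior (ash a 24) (ash b 16) (ash c 8) d))

(defun ipv4-integer-to-octets (address)
  "Return the four octets of ADDRESS as a list, most significant first."
  (loop for shift from 24 downto 0 by 8
        collect (ldb (byte 8 shift) address)))

(defun ipv4-integer-to-dotted (address)
  "Format ADDRESS in dotted-quad notation for human consumption."
  (format nil "~{~D~^.~}" (ipv4-integer-to-octets address)))
```

With these definitions, (ipv4-integer-to-dotted (octets-to-ipv4-integer 210 23 138 16)) round-trips back to the dotted string.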


 
Newsgroups: comp.lang.lisp
From: "Pierre R. Mai" <p...@acm.org>
Date: 13 Oct 2001 00:19:52 +0200
Local: Fri, Oct 12 2001 6:19 pm
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

Russell Senior <seni...@aracnet.com> writes:
> I've been using a split-sequence function that was discussed here on
> comp.lang.lisp back in Sept 1998, which works reasonably well.  The
> problem above, though, raises the question of how one might handle
> quoting of delimiters.  It hasn't been a problem for me, as usually
> things are arranged so that it won't be, but in the general case it
> could.

Once you throw escaping, or similar things into the equation, IMHO the
time has come to write a lexer/parser.  This is often only slightly
more complex than calling split-sequence/partition/what-have-you, but
offers you much more flexibility, and IMHO clarity.

Regs, Pierre.
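
As a rough illustration of how small such a lexer can be, here is a sketch that splits on a delimiter while honouring a single escape character. The name SPLIT-WITH-ESCAPES and the choice of backslash as the default escape are assumptions made for the example, not something proposed in the thread.

```lisp
;; Hypothetical one-pass lexer: a sketch only.
(defun split-with-escapes (string &key (delimiter #\,) (escape #\\))
  "Split STRING at unescaped DELIMITER characters.
A character preceded by ESCAPE is taken literally; the ESCAPE itself is dropped."
  (let ((fields '())
        (field (make-string-output-stream))
        (escaped nil))
    (loop for char across string
          do (cond (escaped
                    ;; Previous character was ESCAPE: copy this one verbatim.
                    (write-char char field)
                    (setf escaped nil))
                   ((char= char escape)
                    (setf escaped t))
                   ((char= char delimiter)
                    ;; GET-OUTPUT-STREAM-STRING also resets the stream.
                    (push (get-output-stream-string field) fields))
                   (t
                    (write-char char field))))
    ;; The last field has no trailing delimiter.
    (push (get-output-stream-string field) fields)
    (nreverse fields)))
```

A trailing escape character with nothing after it is silently dropped here; a real parser would probably want to report that.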

--
Pierre R. Mai <p...@acm.org>                    http://www.pmsf.de/pmai/
 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein


 
Newsgroups: comp.lang.lisp
From: "Wade Humeniuk" <humen...@cadvision.com>
Date: Fri, 12 Oct 2001 17:09:23 -0600
Local: Fri, Oct 12 2001 7:09 pm
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

> This, of course, won't work the way you want if the delimited values
> also contain spaces.

Of course not, but it does not have to in the case of dotted IP addresses.

This raises the issue of a generalized parser/reader for any conceivable
situation or writing a special purpose reader for specific cases.  The time
needed to implement a generalized solution (like regular expressions)
outweighs the time to implement a 100 of the specific readers.  Lazy man's
way out.

Here is a snippet of a parsing/reading problem from the LWW port for Aserve.
Delimiters are slightly more complex.

;;;
;;; DATE-TO-UNIVERSAL-TIME
;;; This is a contribution of Wade Humeniuk <humen...@cadvision.com>
;;; It reimplements the original function without MATCH-REGEXP
;;; which is not fully implemented in ACL-COMPAT
;;;

(defvar *net.aserve-package* (find-package :net.aserve))

(defun date-to-universal-time (date)
  ;; convert  a date string to lisp's universal time
  ;; we accept all 3 possible date formats

  ;; check preferred type first (rfc1123 (formerly rfc822)):
  ;;    Sun, 06 Nov 1994 08:49:37 GMT
  ;; now second best format (but used by Netscape sadly):
  ;;    Sunday, 06-Nov-94 08:49:37 GMT
  ;; finally the third format, from unix's asctime
  ;;    Sun Nov  6 08:49:37 1994

  (let ((date (copy-seq date))
        (*read-eval* nil)
        (*package* *net.aserve-package*))
    (loop for char across date
          for i = 0 then (1+ i)
          when (or (char= #\, char)
                   (char= #\- char)
                   (char= #\: char))
          do (setf (elt date i) #\space))
    (setf date (concatenate 'string "(" date ")"))

    (destructuring-bind (day-of-week day month year hour minute second
                                     &optional timezone)
        (read-from-string date)
      (declare (ignore day-of-week timezone))
      (when (symbolp day) ;; probably third format, swap values
        (let ((real-day month)
              (real-month day)
              (real-hour year)
              (real-minute hour)
              (real-second minute)
              (real-year second))
          (setf day real-day
                month real-month
                year real-year
                hour real-hour
                minute real-minute
                second real-second)))
      (setf month (ecase month
                    (jan 1)
                    (feb 2)
                    (mar 3)
                    (apr 4)
                    (may 5)
                    (jun 6)
                    (jul 7)
                    (aug 8)
                    (sep 9)
                    (oct 10)
                    (nov 11)
                    (dec 12)))
      (cond
       ((and (> year 70) (< year 100)) (incf year 1900))
       ((<= year 70) (incf year 2000)))
      ;; Pass 0 as the time zone: HTTP dates are GMT, as in the original code.
      (encode-universal-time second minute hour day month year 0))))

#| The original code
(defun date-to-universal-time (date)
  ;; convert  a date string to lisp's universal time
  ;; we accept all 3 possible date formats

  (flet ((cvt (str start-end)
    (let ((res 0))
      (do ((i (car start-end) (1+ i))
    (end (cdr start-end)))
   ((>= i end) res)
        (setq res
   (+ (* 10 res)
      (- (char-code (schar str i)) #.(char-code #\0))))))))
    ;; check preferred type first (rfc1123 (formerly rfc822)):
    ;;   Sun, 06 Nov 1994 08:49:37 GMT
    (multiple-value-bind (ok whole
     day
     month
     year
     hour
     minute
     second)
 (match-regexp
  "[A-Za-z]+, \\([0-9]+\\) \\([A-Za-z]+\\) \\([0-9]+\\) \\([0-9]+\\):\\([0-9]+\\):\\([0-9]+\\) GMT"
  date
  :return :index)
      (declare (ignore whole))
      (if* ok
  then (return-from date-to-universal-time
  (encode-universal-time
   (cvt date second)
   (cvt date minute)
   (cvt date hour)
   (cvt date day)
   (compute-month date (car month))
   (cvt date year)
   0))))

    ;; now second best format (but used by Netscape sadly):
    ;;  Sunday, 06-Nov-94 08:49:37 GMT
    ;;
    (multiple-value-bind (ok whole
     day
     month
     year
     hour
     minute
     second)
 (match-regexp

  "[A-Za-z]+, \\([0-9]+\\)-\\([A-Za-z]+\\)-\\([0-9]+\\) \\([0-9]+\\):\\([0-9]+\\):\\([0-9]+\\) GMT"
  date
  :return :index)

      (declare (ignore whole))

      (if* ok
  then (return-from date-to-universal-time
  (encode-universal-time
   (cvt date second)
   (cvt date minute)
   (cvt date hour)
   (cvt date day)
   (compute-month date (car month))
   (cvt date year) ; cl does right thing with 2 digit dates
   0))))

    ;; finally the third format, from unix's asctime
    ;;     Sun Nov  6 08:49:37 1994
    (multiple-value-bind (ok whole
     month
     day
     hour
     minute
     second
     year
     )
 (match-regexp

  "[A-Za-z]+ \\([A-Za-z]+\\) +\\([0-9]+\\) \\([0-9]+\\):\\([0-9]+\\):\\([0-9]+\\) \\([0-9]+\\)"
  date
  :return :index)

      (declare (ignore whole))

      (if* ok
  then (return-from date-to-universal-time
  (encode-universal-time
   (cvt date second)
   (cvt date minute)
   (cvt date hour)
   (cvt date day)
   (compute-month date (car month))
   (cvt date year)
   0))))

    ))
|#

Wade


 
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Fri, 12 Oct 2001 23:15:29 GMT
Local: Fri, Oct 12 2001 7:15 pm
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
* Christophe Rhodes
| See <URL:http://ww.telent.net/cliki/PARTITION>, wherein a
| community-discussed function is described in roughly specification-level
| detail, with links to a reference implementation.

* Marco Antoniotti
| I am sorry to be sooo nagging (again) on such a stupid matter. But......
| The name PARTITION is inappropriate.  SPLIT-SEQUENCE is much more
| descriptive of what the function does.

* Tim Moore
| Get over it!

  But "partition" is such a _fantastically_ bad name, especially to people
  who know a bit of mathematical terminology.  Effectively using up that
  name forever for something so totally unrelated to the mathematical
  concept is hostile.  It is like defining a programming language where
  "sin" and "tan" are operations on (in) massage parlor just because the
  designers are more familiar with them than with mathematics.  "Partition"
  is a good name for a string-related function when the _only_ thing you
  think about is strings, or sequences at best.  At the very least, it
  should be called partition-sequence, but even this sounds wrong to me.

  I tend to use :start and :end arguments to various functions instead of
  splitting one string into several, and make sure that functions I write
  accept :start and :end arguments, and that they work with all sequences
  and useful element types, not only strings and characters.

///
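
To illustrate the :start/:end style described above, here is a sketch that parses a dotted quad in place, without creating any substrings. The function name and the choice to return a fresh (unsigned-byte 8) vector are assumptions made for the example; only POSITION and PARSE-INTEGER with their standard bounding keywords are used.

```lisp
;; Hypothetical in-place parser: a sketch only.
(defun parse-dotted-quad-in-place (string &key (start 0) (end (length string)))
  "Parse four dot-separated integers from STRING between START and END.
Returns a fresh vector of four octets; no intermediate strings are made."
  (let ((octets (make-array 4 :element-type '(unsigned-byte 8)))
        (field-start start))
    (dotimes (i 4 octets)
      (let ((field-end (if (< i 3)
                           (or (position #\. string
                                         :start field-start :end end)
                               (error "Too few fields in ~S" string))
                           ;; The last field runs to END.
                           end)))
        ;; PARSE-INTEGER signals an error on empty or non-numeric fields.
        (setf (aref octets i)
              (parse-integer string :start field-start :end field-end))
        (setf field-start (1+ field-end))))))
```

Because the bounds are passed through, the address can sit in the middle of a longer string, for example (parse-dotted-quad-in-place "ip=10.0.0.1;" :start 3 :end 11).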


 
Newsgroups: comp.lang.lisp
From: Kenny Tilton <ktil...@nyc.rr.com>
Date: Sat, 13 Oct 2001 02:11:00 GMT
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

hmmm. my dictionary says partition means to divide into parts. if
partition means something else to mathematicians, that's fine, natural
language is like that, but it's a bit harsh to moan about someone using
a word correctly just because someone else took liberties with it.

besides, in a custody fight between mathematics and sequences over the
symbol-function of 'partition, well this is Lisp, I think sequences win.
we could solomon-like split the baby in half and not let anyone use
'partition, but consider this: the only sequence function I see listed
in CLTL2 which does not take a generic name (such as 'position) all for
itself is the trivial case of 'copy-seq.

i think the math literates amongst us gots to remember whose house they
are in when reading Lisp. (y'all can grok (+ 2 2) right?). seems to me
unadorned function names go to sequences, and it is the guest domains
that need to tack on tie-break syllables.

kenny
clinisys



 
Newsgroups: comp.lang.lisp
From: Jochen Schmidt <j...@dataheaven.de>
Date: Sat, 13 Oct 2001 04:32:50 +0200
Local: Fri, Oct 12 2001 10:32 pm
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

Wade Humeniuk wrote:
>> This, of course, won't work the way you want if the delimited values
>> also contain spaces.

> Of course not, but it does not have to in the case of dotted IP addresses.

> This raises the issue of a generalized parser/reader for any conceivable
> situation or writing a special purpose reader for specific cases.  The
> time needed to implement a generalized solution (like regular expressions)
> outweighs the time to implement a 100 of the specific readers.  Lazy man's
> way out.

> Here is a snippet of a parsing/reading problem from the LWW port for
> Aserve. Delimiters are slightly more complex.

[example snipped]

This parsing routine was replaced a while ago by a function using the
META parser. The problem with using the reader for this was that some
browsers (Netscape) put a semicolon and some further characters after the
date. If you wrap that string in parens, the closing paren ends up after
the semicolon and is therefore commented out.

The actual code in portableaserve is like this (which is a quick hack
written with META and not really nice...)

(eval-when (:compile-toplevel :load-toplevel :execute)
  (meta:enable-meta-syntax)
(deftype alpha-char () '(and character (satisfies alpha-char-p)))
(deftype digit-char () '(and character (satisfies digit-char-p)))
)

(defun date-to-universal-time (date)
  ;; convert  a date string to lisp's universal time
  ;; we accept all 3 possible date formats

  ;; check preferred type first (rfc1123 (formerly rfc822)):
  ;;    Sun, 06 Nov 1994 08:49:37 GMT
  ;; now second best format (but used by Netscape sadly):
  ;;    Sunday, 06-Nov-94 08:49:37 GMT
  ;; finally the third format, from unix's asctime
  ;;    Sun Nov  6 08:49:37 1994

  (let (last-result)
    (meta:with-string-meta (buffer date)
           (labels ((make-result ()
                        (make-array 0
                                    :element-type 'base-char
                                    :fill-pointer 0 :adjustable t))
               (skip-day-of-week (&aux c)
                                 (meta:match [$[@(alpha-char c)]
                                               !(skip-delimiters)]))
               (skip-delimiters ()
                                (meta:match $[{#\: #\, #\space #\-}]))
               (word (&aux (old-index meta::index) c
                           (result (make-result)))
                     (or (meta:match [!(skip-delimiters) @(alpha-char c)
                                       !(vector-push-extend c result)
                                      $[@(alpha-char c)
                                         !(vector-push-extend c result)]
                                      !(setf last-result result)])
                         (progn (setf meta::index old-index) nil)))
               (integer (&aux (old-index meta::index) c
                              (result (make-result)))
                        (or (meta:match [!(skip-delimiters) @(digit-char c)
                                          !(vector-push-extend c result)
                                         $[@(digit-char c)
                                            !(vector-push-extend c result)]
                                         !(setf last-result
                                                (parse-integer result))])
                            (progn (setf meta::index old-index) nil)))
               (date (&aux day month year hours minutes seconds)
                     (and (meta:match [!(skip-day-of-week)
                                       {[!(word) !(setf month last-result)
                                         !(integer) !(setf day last-result)]
                                        [!(integer) !(setf day last-result)
                                         !(word) !(setf month
                                                        last-result)]}
                                       !(integer) !(setf year last-result)
                                       !(integer) !(setf hours last-result)
                                       !(integer) !(setf minutes
                                                         last-result)
                                       !(integer) !(setf seconds
                                                         last-result)])
                         ; (values seconds minutes hours day month)
                          (encode-universal-time seconds minutes hours day
                                                 (net.aserve::compute-month
                                                  (coerce month 'simple-string)
                                                  0)
                                                 year
                                                 0))))
              (date)))))

ciao,
Jochen

--
http://www.dataheaven.de


 
Wade Humeniuk  
Newsgroups: comp.lang.lisp
From: "Wade Humeniuk" <humen...@cadvision.com>
Date: Fri, 12 Oct 2001 22:29:06 -0600
Local: Sat, Oct 13 2001 12:29 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

> This parsing routine got replaced a while ago through a function using the
> META Parser. The problem with using the READER for that stuff was that some
> Browsers (Netscape) had a semicolon and some further characters behind the
> date and if you wrap that string in parens, the closing paren is behind the
> semicolon and therefore commented out.

Would it have worked to substitute #\space for the #\; first?  Same
kind of routine, but discarding the extra vars in destructuring-bind?
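
A minimal sketch of that idea, assuming the reader-based routine wrapped
the date string in parens and read it as a list.  The names date-tokens
and rfc1123-fields are made up for illustration, and the set of characters
turned into spaces is an assumption, chosen so that READ only ever sees
symbols and integers.  Other reader-significant characters (#, |, quotes,
parens) would still need handling before this is safe on arbitrary input.

```lisp
(defun date-tokens (string)
  "Turn delimiters into spaces, wrap STRING in parens and READ it as a list."
  (let ((*read-eval* nil)               ; never evaluate #. forms from input
        (clean (substitute-if #\space
                              (lambda (ch) (find ch ",;:-"))
                              string)))
    (read-from-string (concatenate 'string "(" clean ")"))))

(defun rfc1123-fields (string)
  "Pick the date fields out of the token list, discarding anything extra."
  (destructuring-bind (day-name day month year hours minutes seconds
                       &rest ignored)
      (date-tokens string)
    (declare (ignore day-name ignored))
    (list day month year hours minutes seconds)))

;; Trailing "; length=1234" no longer comments out the closing paren:
;; (rfc1123-fields "Sun, 06 Nov 1994 08:49:37 GMT; length=1234")
;; => (6 NOV 1994 8 49 37)
```

The &rest variable is what absorbs the time zone and whatever the browser
appended after the semicolon.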

> The actual code in portableaserve is like this (which is a quick hack
> written with META and not really nice...)

Wow, I would not have thought a macro like meta existed.

Wade


 
Bulent Murtezaoglu  
Newsgroups: comp.lang.lisp
From: Bulent Murtezaoglu <b...@acm.org>
Date: Sat, 13 Oct 2001 06:37:36 GMT
Local: Sat, Oct 13 2001 2:37 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
>>>>> "KT" == Kenny Tilton <ktil...@nyc.rr.com> writes:

[...]
    KT> hmmm. my dictionary says partition means to divide into
    KT> parts. if partition means something else to mathematicians,
    KT> that's fine, natural language is like that, but it's a bit
    KT> harsh to moan about someone using a word correctly just
    KT> because someone else took liberties with it. [...]

Unfortunately it also means something to computer scientists, possibly
the same thing it means to mathematicians (what an equivalence relation
does to a set) so the overlap is not just with some remote mathematical
lingo.  When you say partition, a CS type would think of sets, not strings.
I therefore don't think Erik was being unduly harsh.  

Of course I was too distracted/lazy to say any of this and even read cll
when what became partition was being discussed, so I should probably shut
up now.

cheers,

BM


 
Kenny Tilton  
Newsgroups: comp.lang.lisp
From: Kenny Tilton <ktil...@nyc.rr.com>
Date: Sat, 13 Oct 2001 08:49:52 GMT
Local: Sat, Oct 13 2001 4:49 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

Bulent Murtezaoglu wrote:
>  When you say partition, a CS type would think of sets, not strings.
> I therefore don't think Erik was being unduly harsh.

<h> actually i was mimicking the teenager usage of "harsh", which usage
is highly exaggerated as with most teenspeak. and i was thinking of the
general case of objecting to someone using a word correctly, wasn't
thinking about EN's post at all at that point tho I can see why one
would construe it that way.

actually we used "partition" in our code recently in the sense you
described, in a partial DB replication scheme: a DB instance viewed as
partitioning the set of all DB instances according to whether the key
instance had a direct or indirect owning relationship of any given
instance.

that said, turning from my dictionary to my thesaurus I discover split
and partition listed together under "allocation". :(

do i hear you all saying that the objection is that this string
manipulation we are discussing takes an ordered sequence and chops it up
by finding certain delimiters and then crucially considering the order
when dividing up the string, ie, every element _between_ two delimiters
ends up in the same partition, whereas in partitioning order does not
matter, each set member gets tested individually with the predicate? if
so, ok, i get that distinction.
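
The distinction can be sketched in code.  Both function names below are
made up for illustration and are not taken from any library discussed in
this thread: split-at cuts an ordered sequence at delimiters, so position
decides which piece an element lands in, while partition-by tests each
element on its own and ignores where it sat.

```lisp
(defun split-at (delimiter sequence)
  "Return the pieces of SEQUENCE between occurrences of DELIMITER, in order."
  (loop with start = 0
        for end = (position delimiter sequence :start start)
        collect (subseq sequence start end)
        while end
        do (setf start (1+ end))))

(defun partition-by (predicate sequence)
  "Return two values: the elements satisfying PREDICATE, and the rest."
  (values (remove-if-not predicate sequence)
          (remove-if predicate sequence)))

;; (split-at #\. "210.23.138.16")      => ("210" "23" "138" "16")
;; (partition-by #'digit-char-p "a1b2") => "12", "ab"
```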

sadly, i just looked up "split" and though the definition sounded as if
order was a factor in the partitioning denoted by "split", the two
examples given were "split into groups" and "split up the money". :(

interesting, what synonym for partition implies order matters? i guess
"subseq" kinda hits the problem over the head (just checked, that was
omitted from the list of sequence functions I saw in CLTL2) so with that
precedent something like 'split-sequence or 'splitseq would indeed be
preferable.

kenny
clinisys


 
Erik Naggum  
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Sat, 13 Oct 2001 09:27:43 GMT
Local: Sat, Oct 13 2001 5:27 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
* Kenny Tilton
| hmmm. my dictionary says partition means to divide into parts. if
| partition means something else to mathematicians, that's fine, natural
| language is like that, but it's a bit harsh to moan about someone using
| a word correctly just because someone else took liberties with it.

  To repeat myself from the article you responded to, since a teenager's
  attention span is so short:

  At the very least, it should be called partition-sequence, but even this
  sounds wrong to me.

  The more general a name, the more general the functionality it should
  provide in order to defend usurping the general name.  If it only works
  on sequences and only uses _one_ meaning of a word at the exclusion of
  another, make it more specific.  I posted the first version of the code
  that got discussed and transmogrified and then renamed into "partition"
  without any discussion here.  It was called "split-sequence" as I recall.
  The code that they base "partition" on was initially called just "split"
  and renamed "partition".  Bad move.

  Common Lisp does not have a simple way to import a symbol from a package
  under another name.  This means the connection to a badly chosen name is
  broken if you choose to rename it.  This is all the more reason to be a
  little careful when you name things very generally.  "split" was horrible
  in that sense, too.  I notice in passing that Franz Inc's "aserve" has
  split-on-character, split-into-words, and split-string functions which
  all seem overly specific, but which are at least properly named.
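
  A rough sketch of the renaming problem, with a stand-in function defined
  here only for illustration (it is not the code from partition.lisp).  The
  closest cheap workaround is to hand the function object to a second
  symbol, which yields two unrelated names rather than one imported name:

```lisp
;; Stand-in for a library function whose name we would rather not use.
(defun partition (delimiter sequence)
  (loop with start = 0
        for end = (position delimiter sequence :start start)
        collect (subseq sequence start end)
        while end
        do (setf start (1+ end))))

;; There is no "import as", so give a second symbol the same function object.
(setf (fdefinition 'split-sequence) (fdefinition 'partition))

(split-sequence #\. "a.b")        ; behaves like PARTITION at this moment
(eq 'split-sequence 'partition)   ; => NIL, the names are distinct symbols
```

  If PARTITION is later redefined, SPLIT-SEQUENCE keeps the old function
  object, and anything keyed on the symbol itself (documentation, setf
  functions, compiler macros) does not follow the alias.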

///


 
Christophe Rhodes  
Newsgroups: comp.lang.lisp
From: Christophe Rhodes <cs...@cam.ac.uk>
Date: 13 Oct 2001 10:45:23 +0100
Local: Sat, Oct 13 2001 5:45 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

I can't help but be slightly irritated by this, I'm afraid, as I noted
at the time the conspicuous absence of certain people (not just Erik)
in the debate about the splitting function and its naming, at times
when I thought they might well have something to contribute.

Nevertheless, the question is probably more "so what are we going to
do about it?" Well, that's a good question... my personal attitude at
this point right now is "why bother?"

No doubt my idealism will resurface at some point,

Christophe
--
Jesus College, Cambridge, CB5 8BL                           +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/                  (defun pling-dollar
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)


 
Russell Senior  
Newsgroups: comp.lang.lisp
From: Russell Senior <seni...@aracnet.com>
Date: 13 Oct 2001 03:54:13 -0700
Local: Sat, Oct 13 2001 6:54 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.

>>>>> "Erik" == Erik Naggum <e...@naggum.net> writes:

Erik> [...] I posted the first version of the code that got discussed
Erik> and transmogrified and then renamed into "partition" without any
Erik> discussion here.  It was called "split-sequence" as I recall.

I think I might have been the one to call it split-sequence.  This
function was discussed on this newsgroup in September 1998, initially
in a thread titled "I don't understand Lisp".  During a discussion of
regular expressions (I think it was) Erik posted a function with a
slightly different interface and purpose called delimited-substrings,
and I followed up with one (pretty horrifying, but functioning) called
split-sequence, which I had adapted/generalized from one I'd found
called split-string.  Over the next few days it was substantially
revised/rewritten several times on the newsgroup by various authors.
At the end of that thread, it was still being called split-sequence,
which I continue to like and still use.

It appears this is what resurfaced in a still mutating form about a
year ago, called variously split and partition.

When the Christophe Rhodes "split-sequence/partition" thread started
back in June/July, I wasn't paying very much attention and so I didn't
participate.

BTW, one useful feature that got lost along the way seems to be the
ability to provide a value for empty subsequences.
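
A sketch of what such a feature could look like.  This is not the 1998
code; the function name and the :empty-marker keyword are assumptions made
up for this example.

```lisp
(defun split-with-default (delimiter sequence
                           &key (empty-marker nil markerp))
  "Split SEQUENCE at DELIMITER.  When EMPTY-MARKER is supplied, it is
returned in place of every empty subsequence."
  (loop with start = 0
        for end = (position delimiter sequence :start start)
        for piece = (subseq sequence start end)
        collect (if (and markerp (zerop (length piece)))
                    empty-marker
                    piece)
        while end
        do (setf start (1+ end))))

;; (split-with-default #\, "a,,b")                         => ("a" "" "b")
;; (split-with-default #\, "a,,b" :empty-marker :missing)  => ("a" :MISSING "b")
```

The supplied-p variable lets a caller pass NIL as the marker and still have
it distinguished from not asking for a marker at all.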

For what it's worth, I still like the name split-sequence.

--
Russell Senior         ``The two chiefs turned to each other.        
seni...@aracnet.com      Bellison uncorked a flood of horrible      
                         profanity, which, translated meant, `This is
                         extremely unusual.' ''                      


 
Erik Naggum  
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Sat, 13 Oct 2001 11:16:32 GMT
Local: Sat, Oct 13 2001 7:16 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
* Christophe Rhodes <cs...@cam.ac.uk>
| I can't help but be slightly irritated by this, I'm afraid, as I noted
| at the time the conspicuous absence of certain people (not just Erik)
| in the debate about the splitting function and its naming, at times
| when I thought they might well have something to contribute.

  Where did this debate occur?  I have just stuffed a private archive of a
  _lot_ of news into a huge database, and cannot find any discussion of the
  name "partition" in this forum.  If you go away and make up your own
  community and you do something stupid and somebody complains about it, it
  is fairly bad taste to blame the people _you_ left behind for not taking
  part in your discussion.  This is one of the reasons I do not think those
  mini-communities are doing any good.  You need a large number of people
  to weed out the silly ideas that look good to everyone in a small group.

| Nevertheless, the question is probably more "so what are we going to do
| about it?" Well, that's a good question... my personal attitude at this
| point right now is "why bother?"

  Yeah, why use something that is so badly named?  So, who cares?

  As I have indicated, I think splitting strings and creating huge amounts
  of garbage during parsing is bad software design.  The incessant copying
  of characters that plagues most parsers is _the_ source of bad performance.
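
  A minimal sketch of parsing in place, using the dotted IP address from
  the original question.  The function name is made up for illustration.
  POSITION finds each delimiter and PARSE-INTEGER reads the digits between
  the :start and :end bounds, so no substring is ever allocated:

```lisp
(defun parse-dotted-integers (string)
  "Return the dot-separated integers in STRING as a list.
Malformed fields signal the error that PARSE-INTEGER signals."
  (loop with start = 0
        for end = (position #\. string :start start)
        collect (parse-integer string :start start :end end)
        while end
        do (setf start (1+ end))))

;; (parse-dotted-integers "210.23.138.16") => (210 23 138 16)
```

  The only consing left is the result list itself.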

///


 
Erik Naggum  
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Sat, 13 Oct 2001 11:40:35 GMT
Local: Sat, Oct 13 2001 7:40 am
Subject: Re: How to split a string (or arbitrary sequence) at each occurrence of a value.
* Russell Senior <seni...@aracnet.com>
| I think I might have been the one to call it split-sequence.

  Yes.  Thank you for the correction and clarification.

| When the Christophe Rhodes "split-sequence/partition" thread started back
| in June/July, I wasn't paying very much attention and so I didn't
| participate.

  It looked to me like nobody really liked "partition" and the consensus
  was clearly on "split-sequence".  The name "partition" was just handed to
  us as something to accept despite the strong opposition.  However, I have
  not found the discussion behind this comment in "partition.lisp":

;;; * naming the function PARTITION rather than SPLIT.

  I wonder how this change was chosen.  Where can I find the discussion?

///


 