
Get subseq of alist using regexp matching on cars


Bruce Lambert

Jul 26, 2000
Imagine I have an association list where the car of each member of the alist
is a symbol and the cdr is some other lisp object (it doesn't really matter
what type). Now I want to be able to retrieve from the alist a sequence of
sublists whose cars match some regexp.

So I want to be able to say something like

(regexp-assoc <some-regexp> <some-alist>)

and get the elements of alist whose sequence of cars matches the regexp.
This function would probably also have to take a :from-start and :from-end
keyword. It would also return the location in the sequence where the match
started.

I'm not sure how to approach this, short of writing my own code to compile
regexps into finite state machines that will traverse the cars of a sequence
of lists. That's too much for me.

So a couple of questions:

1. Is it clear what I want to do?
2. How might I go about it?

One way I thought of would be to grab the cars from all the sublists in
alist and coerce them into a string (assume for simplicity that the cars are
all just single-character symbols or even single characters). Then I could
do the regexp matching on the string of cars, determine where the string
started and how long it was, and grab the corresponding set of sublists from
the original alist using subseq.

Any other ideas?

-bruce

Bruce Lambert

Jul 26, 2000
This is what I came up with myself. It seems to work in the simple case
where the cars are all single-character symbols. This will work for my
current application. It could, of course, be much fancier. I still welcome
suggestions.

(defun regexp-assoc (regexp alist)
  "Takes a regular expression and an alist and returns the subsequence of
alist whose cars match the regexp. Assume cars are all symbols whose
print-names are single characters."
  (let ((car-string "")
        (matching-string "")
        (match-position 0))

    ;; Grab all cars from alist and turn them into a string.
    (setf car-string
          (coerce
           (mapcar #'(lambda (list) (coerce (car list) 'character))
                   alist)
           'string))

    ;; Do regular expression match
    (multiple-value-bind (boolean string)
        (match-regexp regexp car-string)
      (setf matching-string string))

    ;; Search for the position in car-string where matching string begins
    (setf match-position
          (search matching-string car-string))

    ;; Use match position and length of matching string to grab correct
    ;; subseq of original alist.
    (subseq alist match-position
            (+ match-position (length matching-string)))))
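
The index bookkeeping in the function above can be tried without any regexp library. The following is a minimal sketch, not the poster's code: it uses the standard SEARCH function (a literal substring match) as a stand-in for the regexp matcher, and the names cars-to-string and literal-subseq are made up for illustration. It also returns NIL when nothing matches, a case in which the function above would hand NIL to SEARCH.

```lisp
;; Hypothetical helper: build a string from the cars of ALIST.
;; CHARACTER accepts characters, one-character strings, and symbols
;; whose names are one character long.
(defun cars-to-string (alist)
  (map 'string (lambda (entry) (character (car entry))) alist))

;; Hypothetical stand-in for REGEXP-ASSOC: SEARCH does a literal
;; substring match here, where a real version would call a regexp matcher.
(defun literal-subseq (pattern alist)
  "Return the run of ALIST entries whose cars spell PATTERN, and its start index."
  (let* ((car-string (cars-to-string alist))
         (start (search pattern car-string)))
    (if start
        (values (subseq alist start (+ start (length pattern))) start)
        (values nil nil))))
```

With the default reader, symbols are upcased, so the pattern has to be upper case as well: (literal-subseq "BC" '((a . 1) (b . 2) (c . 3) (d . 4))) returns ((B . 2) (C . 3)) and 1.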


-bruce
Bruce Lambert <lamb...@uic.edu> wrote in message
news:8ln0j7$1qa$1...@newsx.cc.uic.edu...

Erik Naggum

Jul 26, 2000
* "Bruce Lambert" <lamb...@uic.edu>

| This is what I came up with myself. It seems to work in the simple
| case where the cars are all single-character symbols. This will
| work for my current application. It could, of course, be much
| fancier. I still welcome suggestions.

I must admit to not having understood your original request, as
"assoc" is just not what you're asking for.

| "Takes a regular expression and an alist and returns the subsequence
| of alist whose cars match the regexp. Assume cars are all symbols
| whose print-names are single characters."

It doesn't really assume that. The cars may be one-character
strings, characters, or symbols with one-character names.

(defun regexp-subseq (sequence regexp &key (key #'car))
  (destructuring-bind (match &optional (start . end))
      (multiple-value-list
       (match-regexp regexp
                     (map 'string (compose #'character key) sequence)
                     :return :index))
    (if match
        (subseq sequence start end)
        nil)))

compose is either a macro or a function that composes functions. In
this case, it produces (lambda (x) (character (funcall key x))).
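
COMPOSE is not part of standard Common Lisp, so it has to come from a utility library or be defined locally. The following is one possible function definition, offered as a sketch rather than as the definition the poster had in mind; *sample-alist* is an illustrative name.

```lisp
;; One possible COMPOSE (an assumption, not a standard function):
;; (compose f g h) returns a function computing (f (g (h x))).
(defun compose (&rest functions)
  "Return a function that applies FUNCTIONS right to left."
  (if (null functions)
      #'identity
      (let ((fns (reverse functions)))
        (lambda (&rest args)
          ;; The rightmost function receives the original arguments;
          ;; each remaining function receives the previous result.
          (let ((result (apply (first fns) args)))
            (dolist (fn (rest fns) result)
              (setf result (funcall fn result))))))))

;; Illustrative data: symbols with one-character names as cars.
(defparameter *sample-alist* '((a . 1) (b . 2) (c . 3)))

;; The string that would be handed to the regexp matcher.
(map 'string (compose #'character #'car) *sample-alist*)
```

Under the default reader case the last form yields "ABC", because the symbols are read as A, B and C.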

#:Erik
--
If this is not what you expected, please alter your expectations.

Thomas A. Russ

Jul 26, 2000
"Bruce Lambert" <lamb...@uic.edu> writes:

>
> Imagine I have an association list where the car of each member of the alist
> is a symbol and the cdr is some other lisp object (it doesn't really matter
> what type). Now I want to be able to retrieve from the alist a sequence of
> sublists whose cars match some regexp.
>
> So I want to be able to say something like
>
> (regexp-assoc <some-regexp> <some-alist>)
>
> and get the elements of alist whose sequence of cars matches the regexp.
> This function would probably also have to take a :from-start and :from-end
> keyword. It would also return the location in the sequence where the match
> started.

I would start with one of the regex packages you can get from the CMU
archives or alternatively from the www.alu.org web page. My personal
favorite is the nregex package. That one allows you to take a regular
expression (in a string) and compile a lisp function that tests another
string for a match against it.

OK, assume that you have a predicate for regular expressions; you can
then use it as the test function for ASSOC-IF.

Your function would then look pretty much like the following:

(defun regexp-assoc (regular-expression a-list)
  (let ((test-fn (compile-regex regular-expression)))
    (assoc-if #'(lambda (x) (funcall test-fn x)) a-list)))

Note that you might need to do something like
(funcall test-fn (string x))
if the keys are not strings.
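
The predicate approach can be sketched without committing to a particular regex package. In the sketch below, make-prefix-test is a hypothetical stand-in for a compiled regexp (it only tests for a literal prefix), and predicate-assoc and predicate-assoc-all are illustrative names. Note that ASSOC-IF returns only the first matching entry, and that neither function returns a contiguous run of entries.

```lisp
;; Hypothetical stand-in for a compiled regexp: a closure that is true
;; when its argument starts with PREFIX.
(defun make-prefix-test (prefix)
  (lambda (text)
    (let ((pos (mismatch prefix text)))
      ;; MISMATCH returns NIL when the sequences are equal, otherwise
      ;; the index in PREFIX where they first differ.
      (or (null pos) (>= pos (length prefix))))))

;; First entry whose key, converted with STRING, satisfies TEST-FN.
(defun predicate-assoc (test-fn a-list)
  (assoc-if (lambda (key) (funcall test-fn (string key))) a-list))

;; All entries whose keys satisfy TEST-FN, in their original order.
(defun predicate-assoc-all (test-fn a-list)
  (remove-if-not (lambda (key) (funcall test-fn (string key)))
                 a-list
                 :key #'car))
```

For example, (predicate-assoc (make-prefix-test "FO") '((bar . 1) (foo . 2) (fob . 3))) returns (FOO . 2), while predicate-assoc-all with the same arguments returns ((FOO . 2) (FOB . 3)).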

For regular expression matchers, see

http://www-cgi.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/match/0.html

--
Thomas A. Russ, USC/Information Sciences Institute t...@isi.edu

Wolfhard Buß

Jul 27, 2000
Erik Naggum <er...@naggum.net> writes:

> (defun regexp-subseq (sequence regexp &key (key #'car))
>   (destructuring-bind (match &optional (start . end))
>       (multiple-value-list
>        (match-regexp regexp
>                      (map 'string (compose #'character key) sequence)
>                      :return :index))
>     (if match
>         (subseq sequence start end)
>         nil)))

How about

(defun regexp-subseq (sequence regexp &key (key #'car))
  (destructuring-bind (match start &optional end)
      ...

or

(defun regexp-subseq (sequence regexp &key (key #'car))
  (multiple-value-bind (match start end)
      (match-regexp regexp
                    (map 'string (compose #'character key) sequence)
                    :return :index)
    (if match
        (subseq sequence start end)
        nil)))

?

-wb

Erik Naggum

Jul 27, 2000
* wb...@gmx.net (Wolfhard Buß)
| How about
:
| ?

No.

lamb...@uic.edu

Aug 25, 2000
Erik was kind enough to post the following code in response to a
request I made some weeks ago. However, when I compile this on ACL 5 I
get:

Error: Attempt to take the car of END which is not listp.

The error has something to do with the optional argument. Should it be
((start . end))? It compiles and seems to run when I make this change,
but I'd like to understand what's going on.

> (defun regexp-subseq (sequence regexp &key (key #'car))
>   (destructuring-bind (match &optional (start . end))
>       (multiple-value-list
>        (match-regexp regexp
>                      (map 'string (compose #'character key) sequence)
>                      :return :index))
>     (if match
>         (subseq sequence start end)
>         nil)))
>

> compose is either a macro or a function that composes functions. In
> this case, it produces (lambda (x) (character (funcall key x))).
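
On the question of the extra parentheses: after &optional, a list is read as (variable default supplied-p), so in (start . end) the dotted tail END sits where the rest of that list is expected, which would explain the complaint about taking the car of END. Writing ((start . end)) makes the dotted pair the variable position itself, so it is treated as a destructuring pattern. The following is a minimal sketch of that reading. fake-match is a hypothetical stand-in that mimics a matcher returning T and a (start . end) cons, as the quoted code implies for :return :index, and the explicit default '(nil . nil) is an addition so that the pattern always has something to destructure.

```lisp
;; Hypothetical stand-in for the matcher: on success it returns T and a
;; (START . END) cons, on failure a single NIL.
(defun fake-match (matched-p)
  (if matched-p
      (values t (cons 1 3))
      nil))

(defun match-bounds (matched-p)
  "Return (START END) for a successful match, otherwise NIL."
  ;; ((start . end) '(nil . nil)) is an optional parameter whose variable
  ;; is the pattern (start . end) and whose default is the cons (NIL . NIL).
  (destructuring-bind (match &optional ((start . end) '(nil . nil)))
      (multiple-value-list (fake-match matched-p))
    (if match
        (list start end)
        nil)))
```

Here (match-bounds t) returns (1 3) and (match-bounds nil) returns NIL.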

