Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

question on sort

7 views

Kent M Pitman

Aug 1, 2000, 3:00:00 AM8/1/00
to
kp gores <go...@sip.medizin.uni-ulm.de> writes:

...
> unfortunately they sort only sequences or vectors. I'd like to sort
> arrays of whatever dimensions (my array is (* 6)).
> The key should be a element in the array (eg. accessor for column 2).
...
> BTW, 1) how do i declare the types of the columns in my array?

Arrays are assumed to be of homogenous type. Given your desire not to
have this be so, coupled with your desire to sort, I suggest that what
you want is not a single array of high arity but rather a vector of vectors,
or better still, a vector of structs.

Erik Naggum

Aug 1, 2000, 3:00:00 AM8/1/00
to
* kp gores <go...@sip.medizin.uni-ulm.de>

| unfortunately they sort only sequences or vectors.

A sequence is a list or a vector.

| I'd like to sort arrays of whatever dimensions (my array is (* 6)).
| The key should be a element in the array (eg. accessor for column 2).

| does a fast and flexible implentation of sort/stable-sort for arbitrary
| array exist somewhere on the net?

Build a vector of the appropriate size with all the keys in it:

(let ((keys (make-array <size>)))
(dotimes (i <size>)
(setf (aref keys i) i))
keys)

Sort the keys vector, using a :key argument that retrieves the value
from your array with the value in the keys vector as index.

If you absolutely have to, rearrange your multidimensional array.
Otherwise, you just use an indirection through the sorted keys vector.

I consider the above an example of a pattern in Common Lisp, btw.

#:Erik
--

Barry Margolin

Aug 1, 2000, 3:00:00 AM8/1/00
to
In article <gores-E38B18....@news.uni-ulm.de>,
kp gores <go...@sip.medizin.uni-ulm.de> wrote:
>hello,
>
>Hyperspec knows the functions sort, stable-sort:
>
> sort sequence predicate &key key => sorted-sequence
>
> If sequence is a vector, the result is a vector that has the same
> actual array element type as sequence. If sequence is a list, the
> result is a list.
>
>unfortunately they sort only sequences or vectors. I'd like to sort
>arrays of whatever dimensions (my array is (* 6)).
>The key should be a element in the array (eg. accessor for column 2).
>
>does a fast and flexible implentation of sort/stable-sort for arbitrary
>array exist somewhere on the net?

You could create a sequence of row numbers:

(setq indices
(loop for i below (array-dimension 0 old-array)
collect i))

then sort these:

(setq sorted-indices
(sort indices #'< :key (lambda (index) (aref old-array index 2)))

then finally fill in a new array using the sorted indexes:

(loop for i upfrom 0
for sorted-index in sorted-indices
do
(dotimes (j array-dimension 1 old-array)
(setf (aref new-array i j) (aref old-array sorted-index j))))

> BTW, 1) how do i declare the types of the columns in my array?

With the :ELEMENT-TYPE option to MAKE-ARRAY.

> 2) what is the correct form for (declare (type (list my-type))),
> i.e a declarationn telling the compiler that the type is a list
> of elements of type my-type

(defun list-of-my-type (list)
(every #'(lambda (x) (typep x 'my-type))))

(deftype list-of-my-type ()
'(and list (satisfies list-of-my-type)))

(declare (type list-of-my-type ...))

Such a declaration is probably not going to do much; this type is useful
with CHECK-TYPE, but types that involve SATISFIES generally can't do much
for optimization.

--
Barry Margolin, bar...@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

David Hanley

Aug 1, 2000, 3:00:00 AM8/1/00
to

I am not sure what you are looking for.

What does sorting a higher dimensional array mean?

Why not sort a vector of vectors?

Answer these, and more help can be provided.

dave

Christophe Rhodes

Aug 2, 2000, 3:00:00 AM8/2/00
to
kp gores <go...@sip.medizin.uni-ulm.de> writes:

> hm, i meant if it is possible for my array of dimensions (* 6) to
> declare that column 0 is of type string, column 1 and 2 of type date
> etc...

Um. Are you sure that you have the optimal data type for your
application? An array should be seen as a set of the same types of
things -- for example, a temperature field in a room could be
represented as a three-dimensional array of floats (having set up a
suitable grid for the indices). What you seem to want is an array of
structs, or even possibly a list of structs, since you're not sure how

Or am I missing something?

Christophe

Raymond Toy

Aug 2, 2000, 3:00:00 AM8/2/00
to
>>>>> "kp" == kp gores <go...@sip.medizin.uni-ulm.de> writes:

kp> hm, donT know. probably because the definition would look like
kp> (make-array 20 :initial-element (make-array 6 )). whats wrong with

This probably isn't what you want either. Each element of the
20-element array is now pointing to exactly the same 6-element array:

* (setf z (make-array 3 :initial-element (make-array 6)))
#(#(0 0 0 0 0 0) #(0 0 0 0 0 0) #(0 0 0 0 0 0))
* (setf (aref (aref z 1) 3) "???")
"???"
* z
#(#(0 0 0 "???" 0 0) #(0 0 0 "???" 0 0) #(0 0 0 "???" 0 0))

Ray

Raymond Toy

Aug 2, 2000, 3:00:00 AM8/2/00
to
>>>>> "kp" == kp gores <go...@sip.medizin.uni-ulm.de> writes:

kp> in another post i gave his example:
kp> i want to put the long file listing of a directory into an array.
kp> the array has the dimensions (* 6):
kp> i dont know how many files there will be (thats why i wrote * as first
kp> index), but for each file i need the six entries
kp> name, owner, group ,size, creation date, modification date etc..
kp> each entry has a different type.

kp> of course i could define an array with elements of a type file-info, eg.
kp> a defclass with the six slots i need.
kp> but it was obvious to me to use an array.

kp> i'll try to think about why an array of structs is better.

Because it states more cleary what your intent is?
Because the elements have names?
What's in entry 3? Or was entry 5 what I wanted?
What happens if I decide to add a new entry?
What happens all the code if I rearrange the entries?

Pretty good reasons for me.

Ray

David Hanley

Aug 2, 2000, 3:00:00 AM8/2/00
to

kp gores wrote:

> In article <39872A42...@ncgr.org>, David Hanley <d...@ncgr.org>

> wrote:
>
> > I am not sure what you are looking for.
> > What does sorting a higher dimensional array mean?
>

> my array could be of dimensions (* 6) eg. for a files:
> there you could have entries

> name, owner, group ,size, creation date, modification date etc..
>

> sorting means to sort by name (key for sort column 0), or by creation
> date (key is column 4). even nicer it would be to sort by name as
> primary key and date as secondary key... you get the idea.

Ahh, I see. I wasn't understanding earlier.

I think others are right; what is probably better is to do
something like the following:

(defstruct rec
name
blah
blah
creation-date
... )

(setq arr (make-array *n*))
(dotimes (i *n*)
(setf (aref arr i) (make-rec)))

>
>
> >
> > Why not sort a vector of vectors?
>

> hm, donT know. probably because the definition would look like

> (make-array 20 :initial-element (make-array 6 )). whats wrong with

> (make-array '(20 6)) ??

Firstly, what's wrong is that every element of the array will point to
the same sub-array :) That's why I used a loop above.

What's wrong with the multidimensional array approach is that it's not
an array of things; rather it's 2d-array of all the same things. That's
the reason why sorting it is hard. Sorting a multidimensional array is
an ambiguous thing, because you could impose any arbitrary
structure on it. For example, are columns records, or are rows?
Maybe two consecutive rows compose records? What happens
when you get clever and use a 3 dimensional array?

But, for what you want, the following works.

(defun multisort( arr comp )
"sort a multidimensional array--not in place, either"
(let* (
(size (length arr))
(index-array (make-array size)))
(dotimes(i size)
(setf (aref index-array i) i))
(sort index-array
#'(lambda(x y)(funcall comp (aref arr x) (aref arr y))))
(map 'vector #'(lambda(x)(aref arr x)) index-array)))

;; testing code

(defparameter a #( #(1 2 3) #(2 2 2) #(3 4 1)))

(defun te( c )
"test by sorting based on a specific column"
(multisort a #'(lambda(x y)(< (aref x c)(aref y c)))))

By the way, something like initial-contents that called a function
would be really keen. That way you could fill the array
with individual (different) objects more easily.

dave

Barry Margolin

Aug 2, 2000, 3:00:00 AM8/2/00
to
In article <gores-E995B0....@news.uni-ulm.de>,
kp gores <go...@sip.medizin.uni-ulm.de> wrote:
>probably not. it seems like nobody in this group looks at an array the
>way i do :-(

You sound like a spreadsheet geek. When all you have is a hammer,
everything looks like a nail. In languages with that provide more
expressive data structures, arrays are generally only used for uniform
data. Lisp certainly allows you to put arbitrary data in any array
element, but it's usually more appropriate to use structures for groups of
related items that aren't uniform.

Coby Beck

Aug 2, 2000, 3:00:00 AM8/2/00
to

"kp gores" <go...@sip.medizin.uni-ulm.de> wrote in message
news:gores-E995B0....@news.uni-ulm.de...

>
> in another post i gave his example:
> i want to put the long file listing of a directory into an array.
> the array has the dimensions (* 6):
> i dont know how many files there will be (thats why i wrote * as first
> index), but for each file i need the six entries

> name, owner, group ,size, creation date, modification date etc..
> each entry has a different type.
>
> of course i could define an array with elements of a type file-info, eg.
> a defclass with the six slots i need.

THIS is the obvious approach!

> but it was obvious to me to use an array.

An array, yes, but not a multi-dimensional array.

>
> i'll try to think about why an array of structs is better.

because otherwise you are trying to put a square peg in a round hole!!

if it looks like a duck and quacks like a duck, model it after a duck ;-)

Coby

Robert Monfera

Aug 3, 2000, 3:00:00 AM8/3/00
to
Raymond Toy wrote:

> kp> i'll try to think about why an array of structs is better.
>
> Because it states more cleary what your intent is?
> Because the elements have names?
> What's in entry 3? Or was entry 5 what I wanted?
> What happens if I decide to add a new entry?
> What happens all the code if I rearrange the entries?
>
> Pretty good reasons for me.

Why could a heterogenous array _sometimes_ be better?

Because of better locality of data in memory (->cache hits)?
Because you may save ~24 type overhead bytes per row?
Because you don't have to garbage collect a huge array?
How fast can you access information?

OK, these reasons are all optimization-related, and there is a lot to
lose if one goes that way (no heterogenous objects, no class
dispatching, manual memory management...).

Robert

Kent M Pitman

Aug 3, 2000, 3:00:00 AM8/3/00
to
Robert Monfera <mon...@fisec.com> writes:

> OK, these reasons are all optimization-related, and there is a lot to
> lose if one goes that way (no heterogenous objects, no class
> dispatching, manual memory management...).

And no sorting support... :-)

Raymond Toy

Aug 3, 2000, 3:00:00 AM8/3/00
to
>>>>> "Robert" == Robert Monfera <mon...@fisec.com> writes:

Robert> Raymond Toy wrote:
kp> i'll try to think about why an array of structs is better.
>>
>> Because it states more cleary what your intent is?
>> Because the elements have names?
>> What's in entry 3? Or was entry 5 what I wanted?
>> What happens if I decide to add a new entry?
>> What happens all the code if I rearrange the entries?
>>
>> Pretty good reasons for me.

Robert> Why could a heterogenous array _sometimes_ be better?

Robert> Because of better locality of data in memory (->cache hits)?

How is locality improved? If the array is heterogenous, doesn't that
mean each array element is now some boxed thing so it's just a pointer
to where the real data is stored? When I access the real data, my
cache is blown.

Am I missing something obvious here?

Robert> Because you may save ~24 type overhead bytes per row?

I'm sorry, I don't quite see this.

Robert> Because you don't have to garbage collect a huge array?

Robert> How fast can you access information?

Well, of the Lisp implementations I know, multidimensional array
access is significantly slower than 1-D array access. However, in this
case, I suppose you might make up for that by not having to access the
structure slots.

Ray

Barry Margolin

Aug 3, 2000, 3:00:00 AM8/3/00
to
In article <4nr986e...@rtp.ericsson.se>,

Raymond Toy <t...@rtp.ericsson.se> wrote:
> Robert> Because of better locality of data in memory (->cache hits)?
>
>How is locality improved? If the array is heterogenous, doesn't that
>mean each array element is now some boxed thing so it's just a pointer
>to where the real data is stored? When I access the real data, my
>cache is blown.

That will be true of the structure slot contents as well. Since it's the
same for both representations, it is irrelevant to the comparison.

However, if all the structures are created in a loop (which is reasonable
to expect if you create them at the same time as you would have allocated
the 2-D array), they'll probably be adjacent in memory, which should
provide similar locality (not quite the same, because you'll have to jump
back and forth between the vector of structure references and the
structures themselves, but unless you're really hard up for RAM they should
both stay in memory).

>Am I missing something obvious here?
>
> Robert> Because you may save ~24 type overhead bytes per row?
>
>I'm sorry, I don't quite see this.

He's referring to the header data for each structure instance. Although 24
bytes sounds excessive. I wouldn't expect much more than a 4-byte pointer
to a structure or class definition object, which would be shared by all
instances of the same structure type or class.

Robert Monfera

Aug 3, 2000, 3:00:00 AM8/3/00
to
Raymond Toy wrote:

> How is locality improved? If the array is heterogenous, doesn't that
> mean each array element is now some boxed thing so it's just a pointer
> to where the real data is stored? When I access the real data, my
> cache is blown.
>

> Am I missing something obvious here?

Certainly :-) We talk about a non-existent type of array where one
column may be a single-float, another one a fixnum etc. Data are
unboxed.

> Robert> Because you may save ~24 type overhead bytes per row?
>

> I'm sorry, I don't quite see this.\

Try this:

(defclass loan ()
((amount :type double-float :initform 100.0d0)
(rate :type single-float :initform 0.07d0)))

(defparameter *v* (make-array 1000000 :element-type 'loan))

(defun test (v)
(loop for row from 0 to 999999
do (setf (aref v row) (make-instance 'loan))))

> Robert> Because you don't have to garbage collect a huge array?
>
> I don't quite follow here.

Immediate data do not have to be GC'd, whether it's a vector of vectors
or the above mentioned non-existent type.

Robert