I'm looking for a method to sort a 2d array
Each row of my array is made up of N columns. The first N-1 columns
are indexes of a co-ordinate in a N-1 dimensional space and the Nth
element is a number of times that co-ordinate is used. The N-1
dimensional space is sampled M times. If each co-ordinate is only
used once then the array has size N*M.
I want to sort the array so that the first coordinate is sorted
followed by the second co-ordinate up until the N-1 co-ordinate.
e.g. if i start with a 2 dimensional space, with each dimension split
into 4 bins, for 10 samples i might pick the following co-ordinates
1,4
2,2
4,4
4,1
3,1
2,4
4,4
2,1
2,1
1,3
un-sorted my array would look like
1 4 1
2 2 1
4 4 2
4 1 1
3 1 1
2 4 1
2 1 2
1 3 1
what i want is to sort it so that i get
1 3 1
1 4 1
2 1 2
2 2 1
2 4 1
3 1 1
4 1 1
4 4 2
then when i add a new sample it will be much easier to see if i have
already used the co-ordinate. If not i add the new co-ordinate to the
end and i re-sort after P new samples.
Any help comments would be fantastic.
1. Suppose you have B bins per dimension. Look at the first N-1
columns as "digits" in base B notation (or B+1 if subtracting 1 is not
convenient). Convert from base B to decimal and you have reduced your
N-1 keys to a single key. Sort on that key.
2. Do a classical "punched card" sort. Find a stable sorting algorithm
(one that preserves the order of previously examined keys). Sort by
columns N-1 downto 1.
3. Modern Fortran supports creating advanced data structures such as
trees. A binary tree might do the job for you. (Or you can simulate
one with arrays :-<.)
- e
> 1. Suppose you have B bins per dimension. Look at the first N-1
> columns as "digits" in base B notation (or B+1 if subtracting 1 is not
> convenient). Convert from base B to decimal and you have reduced your
> N-1 keys to a single key. Sort on that key.
> 2. Do a classical "punched card" sort. Find a stable sorting algorithm
> (one that preserves the order of previously examined keys). Sort by
> columns N-1 downto 1.
> 3. Modern Fortran supports creating advanced data structures such as
> trees. A binary tree might do the job for you. (Or you can simulate
> one with arrays :-<.)
I think this is approximately what what suggested above:
C:\gfortran\clf\sort_row>type sort_row.f90
module row_ptr_mod
implicit none
private
public row_ptr
type row_ptr
integer, pointer :: ptr(:)
end type row_ptr
public operator(<)
interface operator(<)
module procedure less_than
end interface operator(<)
contains
function less_than(x,y)
type(row_ptr), intent(in) :: x, y
logical less_than
integer i
if(.NOT.associated(x%ptr) .OR. .NOT.associated(y%ptr)) then
write(*,*) 'Error in less than: disassociated pointer'
stop
end if
if(size(x%ptr) /= size(y%ptr)) then
write(*,*) 'Error in less than: unequal row lengths'
end if
do i = 1, size(x%ptr)
if(x%ptr(i) < y%ptr(i)) then
less_than = .TRUE.
return
else if(x%ptr(i) > y%ptr(i)) then
less_than = .FALSE.
return
end if
end do
less_than = .FALSE.
return
end function less_than
end module row_ptr_mod
module sort_row_mod
use row_ptr_mod
implicit none
private
public bubblesort
contains
subroutine bubblesort(arr)
integer, target :: arr(:,:)
type(row_ptr) rows(size(arr,1))
integer i
integer j
integer k
type(row_ptr) temp
do k = 1, size(arr,1)
rows(k)%ptr => arr(k,:)
end do
do j = size(arr,1), 2, -1
do i = 1, j-1
if(rows(i+1) < rows(i)) then
temp = rows(i+1)
rows(i+1) = rows(i)
rows(i) = temp
end if
end do
end do
arr = reshape([(rows(k)%ptr,k=1,size(arr,1))], &
shape(arr), order = [2,1])
end subroutine bubblesort
end module sort_row_mod
program test_rows
use sort_row_mod
implicit none
integer A(8,3)
character(80) fmt
A = reshape([ &
1, 4, 1, &
2, 2, 1, &
4, 4, 2, &
4, 1, 1, &
3, 1, 1, &
2, 4, 1, &
2, 1, 2, &
1, 3, 1], &
shape(A), order = [2,1])
write(fmt,'(a,i0,a)') '(',size(A,2),'i4)'
write(*,'(a)') 'A = '
write(*,fmt) transpose(A)
call bubblesort(A)
write(*,'(a)') 'Sorted A = '
write(*,fmt) transpose(A)
end program test_rows
C:\gfortran\clf\sort_row>gfortran sort_row.f90 -osort_row
C:\gfortran\clf\sort_row>sort_row
A =
1 4 1
2 2 1
4 4 2
4 1 1
3 1 1
2 4 1
2 1 2
1 3 1
Sorted A =
1 3 1
1 4 1
2 1 2
2 2 1
2 4 1
3 1 1
4 1 1
4 4 2
Now, I used a fairly inefficient sort here (a bubblesort) but it's
written generically enough that you could pretty much copy and paste
any sorting algorithm in there where the j-loop currently lives.
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
I'm hoping to follow this but I get errors with silverfrost f95:error 174 -
Unexpected '[' in expression. The first expression is so difficult to match
left and right braces that I couldn't say if braces are matched, but I get
the same error in the second statement (I think I snipped a little too close
on that one). I tried changing out the brackets for parens in the order
specifier without success.
--
"We wouldn't have a police department that had to show a profit every
year - or a fire department that had to show a profit every year. We
shouldn't do that with our health care system, either."
~~ Michael Moore
> arr = reshape([(rows(k)%ptr,k=1,size(arr,1))], &
> & shape(arr), order = [2,1])
>
> shape(A), order = [2,1])
>
> I'm hoping to follow this but I get errors with silverfrost f95:error 174 -
> Unexpected '[' in expression.
The square brackets are an f2003 feature. It is one supported by
some/many f95 compilers because it is easy to implement, but it wouldn't
be shocking to find compilers that it won't work on. Try substituting
the more cryptic tw0 character forms (/ instead of [ and /) instead of
].
--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain
> Ron Ford <r...@nowhere.ford> wrote:
>
>> arr = reshape([(rows(k)%ptr,k=1,size(arr,1))], &
>> & shape(arr), order = [2,1])
>>
>> shape(A), order = [2,1])
>>
>> I'm hoping to follow this but I get errors with silverfrost f95:error 174 -
>> Unexpected '[' in expression.
>
> The square brackets are an f2003 feature. It is one supported by
> some/many f95 compilers because it is easy to implement, but it wouldn't
> be shocking to find compilers that it won't work on. Try substituting
> the more cryptic tw0 character forms (/ instead of [ and /) instead of
> ].
A = reshape([ &
1, 4, 1, &
2, 2, 1, &
4, 4, 2, &
4, 1, 1, &
3, 1, 1, &
2, 4, 1, &
2, 1, 2, &
1, 3, 1], &
shape(A), order = /2,1/)
I commented out the first error after everything I could fiddle with
failed. This draws the same error. Clearly the prudent path is to use an
f03 compiler. This isn't even the latest f95 now that Plato iv is
available, so it might be something that they've improved.
--
ron Ford
I took a final look at this with the added input of my reading glasses and
see better what Richard means.
A = reshape((/ &
1, 4, 1, &
2, 2, 1, &
4, 4, 2, &
4, 1, 1, &
3, 1, 1, &
2, 4, 1, &
2, 1, 2, &
1, 3, 1/), &
shape(A), order = (/2,1/))
This compiles.
arr = reshape((/(rows(k)%ptr,k=1,size(arr,1))/), &
shape(arr), order = (/2,1/))
This doesn't, but with a different error: error 113 - Internal compiler
error
Gotta run, i.e., bike.
if (N-1 < 8) then
you could set up a rank N-1 array, and bin your samples as you go
with array indexing
end if
No sweat, no fuss, no sorting. Entirely without supporting data, I
assert that this approach will, even with a relatively sparse matrix,
be faster than sorting. For one thing, you won't have to write a sort
routine. Sure, if M is large enough maybe your array ought to be
sparse. And if N > 7 this won't work (easily).
Another approach would be to use your coordinates as inputs to a hash
function and store your data in a hash table. Again, no need to sort,
just un-hash in coordinate order. You could, if sorting is expensive
enough and hashing is cheap enough, keep your existing 2d array and,
when all your samples are in, create the hash table and extract the
entries in sorted order.
Regards
Mark Westwood
PS Feel free to disregard these suggestions as inapplicable to your
problem -- but without a lot more background information I couldn't
stop myself.
On May 15, 5:19 am, Mark Westwood <markc.westw...@gmail.com> wrote:
> Or,
>
> if (N-1 < 8) then
>
> you could set up a rank N-1 array, and bin your samples as you go
> with array indexing
>
> end if
>
> No sweat, no fuss, no sorting. Entirely without supporting data, I
> assert that this approach will, even with a relatively sparse matrix,
> be faster than sorting. For one thing, you won't have to write a sort
> routine. Sure, if M is large enough maybe your array ought to be
> sparse. And if N > 7 this won't work (easily).
>
> Another approach would be to use your coordinates as inputs to a hash
> function and store your data in a hash table. Again, no need to sort,
> just un-hash in coordinate order. You could, if sorting is expensive
> enough and hashing is cheap enough, keep your existing 2d array and,
> when all your samples are in, create the hash table and extract the
> entries in sorted order.
In languages that have associative arrays (AWK, Perl) or multi-
dimensional subscripts as data (MUMPS) the problem is trivial. Of
course Fortran will be much, much faster. (Python compiled might be
close...)
- e