Groups keyboard shortcuts have been updated
Dismiss
See shortcuts

Re: Array vs. Range Indexing

153 views
Skip to first unread message

Patrick O'Leary

unread,
Aug 5, 2012, 9:52:07 AM8/5/12
to juli...@googlegroups.com
On Sunday, August 5, 2012 3:25:00 AM UTC-5, Jonathan Smith wrote:
1. Array vs. Range Indexing

The first test works fine, the second fails with a "BoundsError". The problem turns out to be where I am splitting the items array:

first, others = items[1], items[2:]

Like MATLAB, the special index end represents the last index:

first, others = items[1], items[2:end]

However, this might be quicker, using a dequeue method from Base:

first = shift(items)

for example:

a = [1, 2, 3]
r = 1:3

a[4:] # empty array
r[4:] # BoundsError

Should these be consistent?

Probably, but I'll let Jeff or Stefan comment.
 
2. Append Syntax

The syntax:

[item, items]

for appending an item or two arrays seems a little odd to me. I do see why Julia reserves '+' for numeric addition, but is it possible to have an operator for appending items or appending arrays? Something along the lines of Scala's ++ would help.

That's a MATLAB carryover, but more efficiently, again using the dequeue methods:

enqueue(items, item)

Besides that, I expect adding to and splitting from the end of the array is more efficient than from the beginning.

Hope this helps!

Jonathan Smith

unread,
Aug 5, 2012, 4:45:40 PM8/5/12
to juli...@googlegroups.com
The subsum example is a bit like the programs that interest me, but in a miniature. I more often deal with trees and graphs and search algorithms, and less often arrays of floating point numbers.


On Sunday, August 5, 2012 9:52:07 AM UTC-4, Patrick O'Leary wrote:
Like MATLAB, the special index end represents the last index:

first, others = items[1], items[2:end]

Thanks! It looks like 'end' will be very useful. In this case the initial index '2' still does cause the error.
 
However, this might be quicker, using a dequeue method from Base:

first = shift(items)

With the use of shift, the function gradually removes elements from the items array. The items array is passed in to two recursive calls of subsum. Unfortunately the first call chews up the array before the second call can use it. One fix for this problem is to make a copy of the array before it is passed in to the first recursive call (so it can chew harmlessly on that) and the second call can work on the original. The version below uses 'shift' and it works, but it also does an extra copy:

function subsum(total, items)
    if total == 0
        {[]}
    elseif items == []
        {}
    else
        first = shift(items)
        if first > total
            subsum(total, items)
        else
            [subsum(total, copy(items)), 
              insert_each(first, subsum(total-first, items))] 
        end
    end
end

The purely functional version is easier to reason about, and safer in multi-threaded programs. My style is not at all pure, but I try for a functional style as much as I can without being too wasteful.

That's a MATLAB carryover, but more efficiently, again using the dequeue methods:

enqueue(items, item)

This also changes each one of the items arrays rather than creating new ones. In subsum it would not not harmful, but I was being cautious. The function might not be as useful for other applications if it alters the original elements of its input.
 
Besides that, I expect adding to and splitting from the end of the array is more efficient than from the beginning.

This would certainly be true if I did not make copies. Its is also interesting to compare this version written in Scala:

type IntVectors = Vector[Vector[Int]]

def subsum(total: Int, items: Vector[Int]): IntVectors = {
  if (total == 0)
    Vector(Vector())
  else if (items.isEmpty)
    Vector()
  else {
    val (first, others) = (items.head, items.tail)
    if (first > total)
      subsum(total, others)
    else
      subsum(total, others) ++
        (subsum(total - first, others) map (first +: _))
  }
}

println(subsum(10, Vector(1, 2, 3, 4, 5, 6, 7, 8, 9)))

I intensionally used Vectors (immutable 1-d arrays) instead of Scala's cons-lists. This version does something interesting: it creates tiny 'display' objects that represent slices (pointers of sorts) to the original Vector Because displays are passed into the recursive calls of subsum, there is no need to make copies of the original integer Vectors. 'head' and 'tail' operations just involve adjusting an index in the display. Scala also creates objects that represent concatenations of display slices, so the result is actually (efficiently) assembled only at the end of the process.

Laziness sometimes pays off. 
 

Hope this helps!

It was very helpful. Thanks! 

Patrick O'Leary

unread,
Aug 6, 2012, 1:45:05 AM8/6/12
to juli...@googlegroups.com
On Sunday, August 5, 2012 3:45:40 PM UTC-5, Jonathan Smith wrote:
The subsum example is a bit like the programs that interest me, but in a miniature. I more often deal with trees and graphs and search algorithms, and less often arrays of floating point numbers.

On Sunday, August 5, 2012 9:52:07 AM UTC-4, Patrick O'Leary wrote:
Like MATLAB, the special index end represents the last index:

first, others = items[1], items[2:end]

Thanks! It looks like 'end' will be very useful. In this case the initial index '2' still does cause the error.

Yes, you'll have this problem on a single-element Vector.
 
 
However, this might be quicker, using a dequeue method from Base:

first = shift(items)

With the use of shift, the function gradually removes elements from the items array. The items array is passed in to two recursive calls of subsum. Unfortunately the first call chews up the array before the second call can use it. One fix for this problem is to make a copy of the array before it is passed in to the first recursive call (so it can chew harmlessly on that) and the second call can work on the original. The version below uses 'shift' and it works, but it also does an extra copy:

I didn't work out the algorithm--I was focused on your particular questions. Note that *not* using shift() and doing the head/tail operations the original way does a whole lot of extra copying, since the slices need to be copied to be bound to new variables. I believe that SubArray can avoid that.
 
<snip program>
The purely functional version is easier to reason about, and safer in multi-threaded programs. My style is not at all pure, but I try for a functional style as much as I can without being too wasteful.

Perfectly reasonable.
 
That's a MATLAB carryover, but more efficiently, again using the dequeue methods:

enqueue(items, item)

This also changes each one of the items arrays rather than creating new ones. In subsum it would not not harmful, but I was being cautious. The function might not be as useful for other applications if it alters the original elements of its input.

Yep. On the other hand, unlike a well-optimized functional language this can lead to a whole lot of extra copying in Julia (much as in MATLAB, where this will inevitably cause an Mlint warning). I'm not sure how much Julia or LLVM can figure out regarding reuse if you're using the explicit notation.

Besides that, I expect adding to and splitting from the end of the array is more efficient than from the beginning.

This would certainly be true if I did not make copies. Its is also interesting to compare this version written in Scala:
<snip second program>

I intensionally used Vectors (immutable 1-d arrays) instead of Scala's cons-lists. This version does something interesting: it creates tiny 'display' objects that represent slices (pointers of sorts) to the original Vector Because displays are passed into the recursive calls of subsum, there is no need to make copies of the original integer Vectors. 'head' and 'tail' operations just involve adjusting an index in the display. Scala also creates objects that represent concatenations of display slices, so the result is actually (efficiently) assembled only at the end of the process.

Here's what it looks like with SubArray:

function subsum(total, items)
   if total == 0
       {[]}
   elseif items == []
       {}
   else
       first, others = items[1], length(items) > 1 ? sub(items, 2:length(items)) : {}
       if first > total
           subsum(total, others)
       else
           [subsum(total, others), insert_each(first, subsum(total-first, others))]
       end
   end
end


Hope this helps!

It was very helpful. Thanks! 

You're welcome!
Reply all
Reply to author
Forward
0 new messages