breeze status

David Hall

Jun 21, 2012, 3:16:01 PM
to sca...@googlegroups.com, scal...@googlegroups.com
Hi everyone,

I've ported enough of Scalala so that all of the Scalanlp portions
compile and all tests pass. Yay! (Sometimes a few RNG tests still
fail. I'm working on stabilizing them.)

The math library is probably of most interest. Here are some
notable features:

1) Currently DenseVectors, SparseVectors, DenseMatrices, Counters, and
Counter2s are supported.

2) In general, Doubles, Floats, and Ints are supported. Doubles have
some BLAS code, where usable. In general, DV[Double] and DM[Double]
are the most tested. All of these should be pretty fast, since they're
done using code generation and not generics.

3) In terms of operators, DV/DV, DV/SV, DM/DV, and C/C operators are supported.

4) V/V operators also exist, and they use a multimethod registry so
that the faster DV- and SV- specific operators can be delegated to.
The multimethod lookup is probably slow enough that for shorter
vectors it makes more sense to just use the generic operators... Maybe
we can benchmark that sometime.

5) I haven't benchmarked anything yet, but I suspect that many
operations (though not all) are much, much faster than in Scalala. I
don't think anything is slower, at least not for large vectors.

6) There are the beginnings of a VectorSpace hierarchy along the lines of:

VectorSpace       -> MutableVectorSpace        (corresponding to mathematical definitions)
|                    |
V                    V
NormedVectorSpace -> MutableNormedSpace        (adds a norm of some sort)
|                    |
V                    V
InnerProductSpace -> MutableInnerProductSpace  (adds dot products)
|                    |
V                    V
CoordinateSpace   -> MutableCoordinateSpace    (basically for spaces that more or less
                                                correspond to Euclidean spaces, possibly
                                                infinite though)
|
V
TensorSpace                                    (supports everything on Tensors, which
                                                mostly means finite index set and access
                                                to a value for a given coordinate)


7) I introduced two new types, inspired by numpy. The first is UFunc,
a function that can be applied to a single element of type V and to
most collections of V. The second is UReduceable, which supports
efficient reduction operations:

trait UFunc[@specialized -V, @specialized +V2] {
  def apply(v: V): V2
  def apply[T, U](t: T)(implicit cmv: CanMapValues[T, V, V2, U]): U =
    cmv.map(t, apply _)
  def applyActive[T, U](t: T)(implicit cmv: CanMapValues[T, V, V2, U]): U =
    cmv.mapActive(t, apply _)
}

trait UReduceable[T, @specialized A] {
  def apply[Final](c: T, f: URFunc[A, Final]): Final
}

trait URFunc[@specialized A, +B] {
  def apply(cc: TraversableOnce[A]): B

  def apply[T](c: T)(implicit urable: UReduceable[T, A]): B = {
    urable(c, this)
  }

  def apply(arr: Array[A]): B = apply(arr, arr.length)
  def apply(arr: Array[A], length: Int): B =
    apply(arr, 0, 1, length, {_ => true})
  def apply(arr: Array[A], offset: Int, stride: Int, length: Int,
            isUsed: Int=>Boolean): B = {
    apply((0 until length).filter(isUsed).map(i => arr(offset + i * stride)))
  }

  def apply(as: A*): B = apply(as)

}

The intent with URFunc is to allow for fast Array-based
implementations where possible, and slower Traversable-based
implementations where not. Basically mean(iterable) will always work,
but mean(denseVector) or mean(sparseVector) is potentially much
faster.
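
To make the fast-path/slow-path split concrete, here is a minimal
self-contained sketch in plain Scala. It is not Breeze's code: Reducer,
Mean, and applyArray are made-up names for illustration, and the isUsed
predicate from URFunc is left out to keep it short. The default array
method falls back to the iterator version, and a concrete reduction can
override it with a tight loop.

```scala
// Hypothetical names throughout; a sketch of the idea, not Breeze's API.
trait Reducer[A, B] {
  // Slow, fully general path: anything that can be iterated.
  def apply(it: Iterator[A]): B

  // Array path. By default it walks the strided window and delegates to
  // the iterator version, so every Reducer supports arrays for free.
  def applyArray(arr: Array[A], offset: Int, stride: Int, length: Int): B =
    apply(Iterator.tabulate(length)(i => arr(offset + i * stride)))
}

object Mean extends Reducer[Double, Double] {
  def apply(it: Iterator[Double]): Double = {
    var sum = 0.0
    var n = 0
    while (it.hasNext) { sum += it.next(); n += 1 }
    if (n == 0) Double.NaN else sum / n
  }

  // Fast path: one loop over the primitive array, no intermediate
  // collection and no iterator.
  override def applyArray(arr: Array[Double], offset: Int, stride: Int,
                          length: Int): Double = {
    var sum = 0.0
    var i = 0
    while (i < length) { sum += arr(offset + i * stride); i += 1 }
    if (length == 0) Double.NaN else sum / length
  }
}
```

With this, Mean(Iterator(1.0, 2.0, 3.0)) goes through the generic path,
while Mean.applyArray(data, 0, 2, 3) reads every other element of data
directly, which is roughly what a strided dense vector would want.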

Examples of these are in breeze.linalg's and breeze.numerics's package
objects.
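
For readers who haven't seen the pattern, the UFunc idea from item 7 (one
function that works on a scalar and, via an implicit, on collections of
that scalar) can be sketched in plain Scala as follows. This is a
simplified illustration, not Breeze's implementation: MapValues,
ElementwiseFunc, Square, and over are hypothetical names, and the
collection method is called over rather than an overloaded apply just to
keep overload resolution out of the picture.

```scala
// Hypothetical, simplified stand-in for a CanMapValues-style type class.
trait MapValues[C, V, V2, C2] {
  def map(c: C, f: V => V2): C2
}

object MapValues {
  // Instance for primitive double arrays.
  implicit val doubleArray: MapValues[Array[Double], Double, Double, Array[Double]] =
    new MapValues[Array[Double], Double, Double, Array[Double]] {
      def map(c: Array[Double], f: Double => Double): Array[Double] = c.map(f)
    }

  // Instance for lists of any element type.
  implicit def list[V, V2]: MapValues[List[V], V, V2, List[V2]] =
    new MapValues[List[V], V, V2, List[V2]] {
      def map(c: List[V], f: V => V2): List[V2] = c.map(f)
    }
}

// A function on single values that lifts itself to any collection
// for which a MapValues instance is in implicit scope.
trait ElementwiseFunc[V, V2] {
  def apply(v: V): V2

  def over[C, C2](c: C)(implicit mv: MapValues[C, V, V2, C2]): C2 =
    mv.map(c, (v: V) => apply(v))
}

object Square extends ElementwiseFunc[Double, Double] {
  def apply(v: Double): Double = v * v
}
```

Square(3.0) applies to a single value, and Square.over(List(1.0, 2.0, 3.0))
or Square.over(Array(1.0, 2.0)) picks the matching instance and returns the
same kind of collection. Supporting a new container only means adding one
more implicit instance.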

MAJOR MISSING COMPONENTS:

1) sum(DenseMatrix, Axis.Horizontal/Vertical), and their kin. I'd like
to be able to implement these more generically, along the lines of
UFunc or URFunc...
2) The linear algebra routines, which shouldn't be hard to port, I
just haven't. This is probably what I'll do next.
3) Test Coverage. Right now, we have depressingly low test coverage.
Some of this is just because Scala generates a lot of useless methods
in @specialized land, but it's also because Scalala had terrible test
coverage, and I didn't add that many tests.
4) DM/SV multiply operations
5) Generic slicing. Only ranges are supported right now, and only for
DenseVectors and DenseMatrices. Limited slicing works for counters
too.

What else am I missing?

David Hall

Jun 22, 2012, 2:02:52 PM
to sca...@googlegroups.com, scal...@googlegroups.com
Ok, progress on the math front. *I think that people's code bases
should in general be portable now, mostly with some import renaming.
If that's not true, let me know what to add and I will add it.*

Test coverage is now at 44% of instructions and 28% of branches, with
242 unit tests. They exercise almost everything except the Matrix
operators at least a little. (On a related note, by removing
Binary[Update]Op's inheritance of Function2, I decreased the number of
instructions in the code base from ~300K to ~86K... Fun times.)

I've brought back all linear algebra routines.

I've reintroduced the axis-based reduction operations (sum, mean,
etc.) in a way that is nice and generic. Basically sum(Tensor, Axis)
looks for a CanCollapse[Tensor, Axis, AxisTensor, ...] that then
reduces the AxisTensor down to a single value.

For instance:

assert(sum(DenseMatrix((1.0,3.0),(2.0,4.0)), Axis._0) ===
  DenseMatrix((3.0, 7.0)))
assert(sum(DenseMatrix((1.0,3.0),(2.0,4.0)), Axis._1) ===
  DenseVector(4.0, 6.0))
assert(sum(Counter2((1,'a,1.0), (1,'b,3.0), (2,'a,2.0), (2,'b,4.0)), Axis._0) ===
  Counter('a -> 3.0, 'b -> 7.0))
assert(sum(Counter2((1,'a,1.0), (1,'b,3.0), (2,'a,2.0), (2,'b,4.0)), Axis._1) ===
  Counter(1 -> 4.0, 2 -> 6.0))
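
In case the CanCollapse lookup is hard to picture, here is a rough
self-contained sketch of the same shape in plain Scala. It is only an
illustration under simplifying assumptions: Collapse and AxisOps are
made-up names, the matrix is a row-major Vector[Vector[Double]] assumed to
be rectangular, and the real CanCollapse has more type parameters than
shown here. The axis singleton type selects the implicit instance, which
decides how to slice the tensor before the reduction runs on each slice.

```scala
// Hypothetical sketch; names and signatures are simplified guesses.
sealed trait Axis
object Axis {
  case object _0 extends Axis // collapse the rows: one result per column
  case object _1 extends Axis // collapse the columns: one result per row
}

// Knows how to cut a T into slices along axis Ax and reduce each slice.
trait Collapse[T, Ax, Slice, R] {
  def collapse(t: T, axis: Ax)(f: Slice => Double): R
}

object Collapse {
  type Matrix = Vector[Vector[Double]] // row-major, assumed rectangular

  implicit val matrixAxis0: Collapse[Matrix, Axis._0.type, Vector[Double], Vector[Double]] =
    new Collapse[Matrix, Axis._0.type, Vector[Double], Vector[Double]] {
      def collapse(m: Matrix, axis: Axis._0.type)(f: Vector[Double] => Double): Vector[Double] =
        m.transpose.map(f) // each slice is one column
    }

  implicit val matrixAxis1: Collapse[Matrix, Axis._1.type, Vector[Double], Vector[Double]] =
    new Collapse[Matrix, Axis._1.type, Vector[Double], Vector[Double]] {
      def collapse(m: Matrix, axis: Axis._1.type)(f: Vector[Double] => Double): Vector[Double] =
        m.map(f) // each slice is one row
    }
}

object AxisOps {
  // The reduction is written once; the implicit decides how to slice.
  def sum[T, Ax, R](t: T, axis: Ax)(implicit c: Collapse[T, Ax, Vector[Double], R]): R =
    c.collapse(t, axis)(_.sum)
}
```

For the matrix with rows (1.0, 3.0) and (2.0, 4.0), AxisOps.sum(m, Axis._0)
gives Vector(3.0, 7.0) and AxisOps.sum(m, Axis._1) gives Vector(4.0, 6.0),
the same numbers as the DenseMatrix asserts above. A mean along an axis
would be the same method with a different function passed to collapse.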

That's it.

-- David

David Hall

Jun 22, 2012, 10:43:09 PM
to sca...@googlegroups.com, scal...@googlegroups.com
Great, thanks!

Let me know what you find.

-- David

On Fri, Jun 22, 2012 at 6:52 PM, Keith Stevens <fozzie...@gmail.com> wrote:
> Hey David,
>
> Awesome! Compiling together as a tight package is most excellent.  I'm going
> to take a crack at moving one of my packages over to see if there's anything
> missing/rough during the transition this weekend and onto next week.  As I
> do this, I'll add unit tests where they're missing and make note of what went
> wrong.
>
> --Keith

David Hall

Jun 22, 2012, 10:43:43 PM
to sca...@googlegroups.com, scal...@googlegroups.com
Oh, I should mention that I am going to be gone next week (M-F), away
from email pretty much the whole time. Barbaric, I know.

-- David