convolging 2 vectors with DenseVector - Performance

116 views
Skip to first unread message

Witold Eryk Wolski

unread,
Oct 8, 2012, 4:52:26 PM10/8/12
to scala-...@googlegroups.com

The following class convolves 2 DenseVectors.
For filter.length == 11 and
for data.length=500k I am able to filter 20 data vectors per second (20Hz).
Thats the same performance I get with an implementation based on Array, where I call Array.slice

Using an analogous implementation but based on Array and were I pass
also i (to compute the slice in data) into dotProduct I am able to filter 120 data vectors per second.

I thought DenseVector does are more clever slicing than Array... Or what I am doing wrong?
How can I tune my DenseVector based implemenation?


class BreezeFilter(filter : DenseVector[Double]) {
  //check if Filter is odd
  private def checkFilter() {
    val l: Int = filter.length
    if (l % 2 == 0) {
      throw new IllegalArgumentException("filter length must be odd")
    }
  }
  checkFilter
  val fHalf = (filter.length - 1) / 2

  def dotProduct[T <% Double](as: DenseVector[Double],  bs: DenseVector[Double]) = {
    var sum = 0.
    var i = 0
    while (i < bs.length) {
      sum += bs(i) * as(i)
      i += 1
    }
    sum
  }

  def filter(data: DenseVector[Double]): DenseVector[Double] = {
    val s:Int = data.length
    var myArray = DenseVector.zeros[Double](s)
    var x = data.length - filter.length
    for (i <- 0 to x) {
      myArray(i + fHalf) = dotProduct(data(i until i+filter.length) ,  filter )
    }
    return myArray
  }


}

David Hall

unread,
Oct 8, 2012, 6:11:13 PM10/8/12
to scala-...@googlegroups.com
What happens if you use "as dot bs" rather than your hand-rolled loop?
> --
> You received this message because you are subscribed to the Google Groups
> "Scala Breeze" group.
> To post to this group, send email to scala-...@googlegroups.com.
> To unsubscribe from this group, send email to
> scala-breeze...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/scala-breeze/-/6pSBX1PCP5kJ.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Jason Zaugg

unread,
Oct 8, 2012, 6:17:53 PM10/8/12
to scala-...@googlegroups.com
On Tue, Oct 9, 2012 at 12:11 AM, David Hall <dl...@cs.berkeley.edu> wrote:
> What happens if you use "as dot bs" rather than your hand-rolled loop?

I measured a 2x speedup.

Witold: you really ought to provide code for the fast array version,
and your benchmarking code. It is much easier for others to help you
out with profiling that way.

-jason

Witold Eryk Wolski

unread,
Oct 9, 2012, 5:36:53 AM10/9/12
to scala-...@googlegroups.com
Hi Jason

The default filter which gives me the 120HZ followed by the class I use for benchmarking. I Benchmark the BreezeFilter using mytest3 function.





package ch.ethz.ralab.filters

class DefaultFilter(filter: Array[Double]) {

  //check if Filter is odd
  private def checkFilter() {
    val l: Int = filter.length
    if (l % 2 == 0) {
      throw new IllegalArgumentException("filter length must be odd")
    }
  }
  checkFilter
  val fHalf = (filter.length - 1) / 2

  def dotProduct[T <% Double](as: Array[T], pos: Int, bs: Array[T]) = {
    //def dotProduct(as: Array[Double], pos :Int, bs: Array[Double]): Double = {

    var sum = 0.
    var i = 0
    while (i < bs.size) {
      sum += bs(i) * as(pos + i)
      i += 1
    }
    sum
  }

  def filter(data: Array[Double]): Array[Double] = {
    var myArray = new Array[Double](data.length)
    for (i <- 0 to data.length - filter.length) {
      myArray(i + fHalf) = dotProduct(data, i, filter)
    }
    return myArray
  }

}


>>>>>>>>>>>>>>>>>>>>HERE starts the test App.


package ch.ethz.ralab.filters

import breeze.linalg._

/**
 * @author ${user.name}
 */
object App2 {

  def main(args: Array[String]) {
    mytest3
    //mytest2
  }

  def mytest() {
    val x: AverageFilter = new AverageFilter(3);
    println(x.fSize)
    println(x.halfFilter)
    //println(arr.length)
    val arr = List[Double](1, 2, 3, 1, 2, 3, 1, 2, 3)
    for (al: Double <- arr) print(al + " ")
    println
    val brr = x.filter(arr.toArray)
    println(brr.length)
    for (bl: Double <- brr) print(bl + " ")
    var totalt = 0.
    val times = 500
    for (i <- 0 to times) {
      val arrb: Array[Double] = Array.fill(500000)(math.random)
      val t0 = System.nanoTime()
      x.filter(arrb)
      val t1 = System.nanoTime()
      totalt += t1 - t0
    }
    println
    val s = totalt / 1000000000.
    println("AVG FILTER Elapsed time: " + s + "s")
    println("AVG Filter hz : " + times / s + "Hz")
  }

  def mytest2() {
    val filter = Array[Double](1, 1, 1)
    val x = new DefaultFilter(filter);
    println(x.fHalf)
    //println(arr.length)
    val arr = List[Double](1, 2, 3, 1, 2, 3, 1, 2, 3)
    for (al: Double <- arr) print(al + " ")
    println
    val brr = x.filter(arr.toArray)
    println(brr.length)
    for (bl: Double <- brr) print(bl + " ")

    val filter2 = Array[Double](-1, -1, -1, -1, -1, 0, 1, 1, 1, 1, 1)
    val x2 = new DefaultFilter(filter2);

    var totalt = 0.
    var nrspec = 100
    println("starting test")
    for (i <- 0 to nrspec) {
      val arrb: Array[Double] = Array.fill(500000)(math.random)
      val t0 = System.nanoTime()
      x2.filter(arrb)
      val t1 = System.nanoTime()
      totalt += t1 - t0
    }
    println
    var sec = totalt / 1000000000.
    println("Elapsed time: " + sec + "s")
    println("Frequency : " + nrspec / sec + "Hz")
  }
 
  // breeze test
  def mytest3() {
   
    val filter2 = DenseVector[Double](-1, -1, -1, -1, -1, 0, 1, 1, 1, 1, 1)
    val x2 = new BreezeFilter(filter2);

    var totalt = 0.
    var nrspec = 100
    println("starting Breeze test")
    for (i <- 0 to nrspec) {
      val arrb: DenseVector[Double] = DenseVector.rand(500000)
      val t0 = System.nanoTime()
      x2.filter(arrb)
      val t1 = System.nanoTime()
      totalt += t1 - t0
    }
    println
    var sec = totalt / 1000000000.
    println("Elapsed time: " + sec + "s")
    println("Frequency : " + nrspec / sec + "Hz")

David Hall

unread,
Oct 10, 2012, 11:15:02 PM10/10/12
to scala-...@googlegroups.com
Thanks!

Could we have "averagefilter"?

-- David
> --
> You received this message because you are subscribed to the Google Groups
> "Scala Breeze" group.
> To post to this group, send email to scala-...@googlegroups.com.
> To unsubscribe from this group, send email to
> scala-breeze...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/scala-breeze/-/nYQ-oe0eK1UJ.

David Hall

unread,
Oct 11, 2012, 2:42:48 AM10/11/12
to scala-...@googlegroups.com
Ok, I ran Witold's code in a new benchmark project I set up that uses
Google calipers. I got a reasonable speedup of about 20 percent or so.
We're not only 2.5 slower than his hand rolled loop. I don't know that
we can get any better without macros or a different way of doing
slicing that doesn't involve creating temporary objects.

lessons learned:
1) I hate SBT. I have no idea how to do the equivalent of
https://github.com/sirthias/scala-benchmarking-template/blob/master/project/Build.scala
line 28 in a way that works for subprojects, without needing an
additional manually run command (patchclasspath).
2) Range#head is incredibly slow. Range#start is much faster.
3) hprof will lie about line numbers, as in lines that aren't even
being run get cited. sigh.

The remaining issues, according to hprof:

45% of the time is spent on this line in ConvolutionBenchmark.scala (131)
myArray(i + fHalf) = (data(i until i+filter.length) dot filter )

18% calling ddot in netlib
16% making new DenseVectors

re: the line pasted above, I suspect this is gc time, maybe boxing?
Though boxing doesn't show up in the profile. (I'm pretty annoyed
there's boxing here.) For contrast, in the hand-rolled version of the
benchmark, the equivalent line takes just 18%

This is what JD-GUI decompiles that line to:

myArray.update$mcD$sp(i + fHalf(),
BoxesRunTime.unboxToDouble(((NumericOps)data.apply(Predef..MODULE$.intWrapper(i).until(i
+ this.filter.length()),
DenseVector..MODULE$.canSlice())).dot(this.filter,
DenseVector..MODULE$.canDotD())));

Thoughts?

-- David

Jason Zaugg

unread,
Oct 11, 2012, 3:45:46 AM10/11/12
to scala-...@googlegroups.com
You could try to remove instantiation and the reading of Range from
the equation altogether, by a slice(from: Int, to: Int) method to
DenseVector.

I tend to use YourKit for profiling; you can get a free licence for
use with open source projects.

Can you lodge a bug for the Range#head performance problem?

-jason

Matthew Pocock

unread,
Oct 11, 2012, 9:32:15 AM10/11/12
to scala-...@googlegroups.com, scala-i...@googlegroups.com
On 11 October 2012 07:42, David Hall <dl...@cs.berkeley.edu> wrote:
2) Range#head is incredibly slow. Range#start is much faster.

This is a performance bug. Range#start looks up the constructor val of that name, which is very cheep. Range#head is inherited from IterableLike, and implemented as iterator.next, which is unavoidably expensive. The fix is to override `head` in Range to chain on to `start`.

Is this still the case in 2.10?

Matthew

--
Dr Matthew Pocock
Integrative Bioinformatics Group, School of Computing Science, Newcastle University
skype: matthew.pocock
tel: (0191) 2566550

David Hall

unread,
Oct 11, 2012, 7:08:33 PM10/11/12
to scala-...@googlegroups.com, scala-i...@googlegroups.com
On Thu, Oct 11, 2012 at 6:32 AM, Matthew Pocock
<turingate...@gmail.com> wrote:
>
>
> On 11 October 2012 07:42, David Hall <dl...@cs.berkeley.edu> wrote:
>>
>> 2) Range#head is incredibly slow. Range#start is much faster.
>
>
> This is a performance bug. Range#start looks up the constructor val of that
> name, which is very cheep. Range#head is inherited from IterableLike, and
> implemented as iterator.next, which is unavoidably expensive. The fix is to
> override `head` in Range to chain on to `start`.
>
> Is this still the case in 2.10?

Right. It's still a problem in 2.10. All that's needed is to inherit
from IndexedSeqOptimized, but I am having trouble building scala.

-- David

>
> Matthew
>
> --
> Dr Matthew Pocock
> Integrative Bioinformatics Group, School of Computing Science, Newcastle
> University
> mailto: turingate...@gmail.com
> mailto: matthew...@ncl.ac.uk
> gchat: turingate...@gmail.com
> msn: matthew...@yahoo.co.uk
> irc.freenode.net: drdozer
> skype: matthew.pocock
> tel: (0191) 2566550
> mob: +447535664143
>
> --
> You received this message because you are subscribed to the Google Groups
> "Scala Breeze" group.
> To post to this group, send email to scala-...@googlegroups.com.
> To unsubscribe from this group, send email to
> scala-breeze...@googlegroups.com.

Witold Eryk Wolski

unread,
Oct 22, 2012, 4:15:57 AM10/22/12
to scala-...@googlegroups.com
Hi Jason,

I could not find a method slice in the DenseVector API?

Jason Zaugg

unread,
Oct 22, 2012, 5:47:16 AM10/22/12
to scala-...@googlegroups.com
It doesn't currently exists. I was trying to say that David could try
adding such a method to DenseVector to avoid the indirection of
constructing and deconstructing a Range object.

-jason

David Hall

unread,
Oct 23, 2012, 12:17:51 AM10/23/12
to scala-...@googlegroups.com
Just added a slice method and published artifacts.

-- David
Reply all
Reply to author
Forward
0 new messages