new Array vs new DenseVector from Array

Alexander Ulanov

Oct 14, 2015, 2:45:25 PM
to Scala Breeze
Dear Breeze developers,

Could you explain why creating a DenseVector from an Array takes much more time than creating the Array? I have not found an explanation within the code. It seems to do nothing. 

import breeze.linalg.{DenseVector => BDV}
val n = 12000000
val t2 = System.nanoTime()
val a = new Array[Double](n)
val bdva = new BDV[Double](a)
println("BDV from array: " + (System.nanoTime() - t2) / 1e9)
>> BDV from array: 8.45790515

val t3 = System.nanoTime()
val a = new Array[Double](n)
println("Just array: " + (System.nanoTime() - t3) / 1e9)
>> Just array: 0.216550366

Best regards, Alexander

David Hall

Oct 14, 2015, 3:04:01 PM
to scala-...@googlegroups.com
Are you timing repeated invocations, or just that one? The first class load is going to be very slow (it has to load the implementations of all the operators and register them with the multimethod registries), but after that it should be faster.
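
As a rough way to check this, here is a minimal sketch that times the same allocation several times, so the first (cold) run can be compared with later ones. The names `WarmupTiming`, `timeSeconds`, and `timeRepeated` are made up for illustration. The body uses a plain Array so the snippet runs without Breeze; with Breeze on the classpath, it can be swapped for `new DenseVector(new Array[Double](n))`. Hand-rolled timings like this are noisy, so treat the numbers as indicative only.

```scala
object WarmupTiming {
  // Hypothetical helper: evaluates `body` once and returns the elapsed time in seconds.
  def timeSeconds[A](body: => A): Double = {
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1e9
  }

  // Evaluates `body` `reps` times and returns the elapsed time of each run, in order.
  def timeRepeated[A](reps: Int)(body: => A): Seq[Double] =
    (1 to reps).map(_ => timeSeconds(body))

  def main(args: Array[String]): Unit = {
    val n = 12000000
    // Assumption: a plain Array stands in for the DenseVector construction here.
    val times = timeRepeated(5) { new Array[Double](n) }
    times.zipWithIndex.foreach { case (t, i) =>
      println(f"run ${i + 1}: $t%.6f s")
    }
  }
}
```

If only the first run is slow, one-time initialization is the likely cost; if every run is equally slow, something else is going on.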

Alexander Ulanov

Oct 14, 2015, 3:14:58 PM
to Scala Breeze
The time does not differ much if I create a new DenseVector a second time:

val n = 12000000
val t1 = System.nanoTime()
val a = new Array[Double](n)
val bdva = new BDV[Double](a)
println("BDV from array: " + (System.nanoTime() - t1) / 1e9)
>>BDV from array: 8.192933733

val t2 = System.nanoTime()
val a = new Array[Double](n)
val bdva = new BDV[Double](a)
println("BDV from array second time: " + (System.nanoTime() - t2) / 1e9)
>>BDV from array second time: 7.908517467

David Hall

Oct 14, 2015, 3:45:27 PM
to scala-...@googlegroups.com
Benchmarking the JVM is tricky, especially for really small operations, and you shouldn't try to do it like that. In fact, you shouldn't try to microbenchmark things by hand at all. Using Google Caliper (which knows how to do it), I ran the following benchmarks:

val n = 12000000
def timeAllocate(reps: Int) = run(reps) {
  DenseVector.zeros[Double](n)
}

def timeAllocateArray(reps: Int) = run(reps) {
  new Array[Double](n)
}

def timeAllocateDVArray(reps: Int) = run(reps) {
  new DenseVector(new Array[Double](n))
}
And caliper says:

[info]            benchmark       ns linear runtime
[info]             Allocate 12334670 ==============================
[info]        AllocateArray 11672541 ============================
[info]      AllocateDVArray 12219076 =============================

So there's definitely overhead, but it's on the order of 4% (on average). However, the standard deviation is 4561770.39 ns, or roughly 37% of the mean, so you're probably mostly just profiling the garbage collector.

If we set n to something smaller, say 12000, we get 6413.83 ns for AllocateArray and 6454.17 ns for AllocateDVArray, a difference of about 0.6%. So, I wouldn't worry too much.
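
For reference, the percentages quoted above can be recomputed from the reported timings. This is only a sketch of the arithmetic: `OverheadCheck` and its two helpers are hypothetical names, and because the message does not say which benchmark the standard deviation belongs to, the AllocateDVArray mean is assumed here.

```scala
object OverheadCheck {
  // Percentage by which `candidate` exceeds `baseline`.
  def relativeOverheadPercent(baseline: Double, candidate: Double): Double =
    (candidate - baseline) / baseline * 100.0

  // Standard deviation expressed as a percentage of the mean.
  def relativeStdDevPercent(stdDev: Double, mean: Double): Double =
    stdDev / mean * 100.0

  def main(args: Array[String]): Unit = {
    // n = 12000000: AllocateArray vs AllocateDVArray, in ns (from the Caliper table).
    val largeOverhead = relativeOverheadPercent(11672541.0, 12219076.0) // roughly 4.7
    // Assumption: the quoted standard deviation is paired with the AllocateDVArray mean.
    val spread = relativeStdDevPercent(4561770.39, 12219076.0) // roughly 37.3
    // n = 12000: AllocateArray vs AllocateDVArray, in ns.
    val smallOverhead = relativeOverheadPercent(6413.83, 6454.17) // roughly 0.6

    println(f"overhead at n = 12000000: $largeOverhead%.1f%%")
    println(f"std dev relative to mean: $spread%.1f%%")
    println(f"overhead at n = 12000:    $smallOverhead%.1f%%")
  }
}
```

The spread is several times larger than the measured overhead, which is why the difference at the large size cannot be taken at face value.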

-- David

David Hall

Oct 14, 2015, 3:46:28 PM
to scala-...@googlegroups.com
(And by "definitely overhead," I mean probably almost none. No statistical test is going to say that's significant :-))

Alexander Ulanov

Oct 14, 2015, 5:34:22 PM
to Scala Breeze
Thanks, this looks convincing. Actually, if I move the allocation of Array inline (as you do), then I get much faster results:
val t2 = System.nanoTime()
val bdva = new BDV[Double](new Array[Double](n))
println("BDV from array: " + (System.nanoTime() - t2) / 1e9)
>>BDV from array: 0.198465029

Could you check what happens with your test if you move Array allocation outside the BDV constructor call?

David Hall

Oct 14, 2015, 5:42:59 PM
to scala-...@googlegroups.com
[info]  0% Scenario{vm=java, trial=0, benchmark=Allocate} 6546.46 ns; σ=645.62 ns @ 10 trials
[info]  7% Scenario{vm=java, trial=0, benchmark=AllocateArray} 6418.17 ns; σ=49.26 ns @ 3 trials
[info] 13% Scenario{vm=java, trial=0, benchmark=AllocateDVArray} 6550.07 ns; σ=62.46 ns @ 4 trials
[info] 20% Scenario{vm=java, trial=0, benchmark=AllocateDVArray2} 6549.97 ns; σ=132.70 ns @ 10 trials

No difference.

Alexander Ulanov

Oct 14, 2015, 6:43:54 PM
to Scala Breeze
Thank you for the help!

David Hall

Oct 14, 2015, 6:48:56 PM
to scala-...@googlegroups.com