NumericProperty

Hicks, Matt

Jul 18, 2010, 9:49:16 AM
to sgin...@googlegroups.com
This is primarily directed at Lex, but additional feedback is always appreciated. :)

I've created a NumericProperty to remove the possibility of boxing/unboxing on Properties and have pushed it into a branch named "numericproperty".  I am not seeing any performance benefits from this change so I'm not sure if I'm doing something wrong, or it's not making enough difference to be noticeable.  It's very likely that something I'm doing is causing auto-boxing / unboxing to occur and that's why I'm tossing this out for someone to take a look at.  I've combined all the functionality of AdvancedProperty into NumericProperty in the org.sgine.property package.
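
A minimal sketch of the boxing concern, for readers following along. The class names below are hypothetical and are not taken from the "numericproperty" branch. A generic property stores a `T`, which erases to `Object` on the JVM, so a `Double` stored in it is boxed; a `Double`-only property keeps a primitive field and primitive accessors.

```scala
// Hypothetical sketch; not sgine's actual Property or NumericProperty.

// Generic property: T erases to Object, so Double values are boxed on the way in and out.
class GenericProperty[T](private var value: T) {
  def get: T = value
  def set(v: T): Unit = { value = v }
}

// Double-only property: the field and both accessors stay primitive.
final class DoubleProperty(private var value: Double) {
  def get: Double = value
  def set(v: Double): Unit = { value = v }
}

object PropertyDemo {
  // Increments a DoubleProperty n times without touching java.lang.Double.
  def sum(n: Int): Double = {
    val p = new DoubleProperty(0.0)
    var i = 0
    while (i < n) { p.set(p.get + 1.0); i += 1 }
    p.get
  }
}
```

Any other generic layer on the update path (listeners, animators, collections of values) can reintroduce boxing even when the property itself is primitive.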

Lex

Jul 19, 2010, 12:08:47 AM
to sgine-dev
Double-check that the easing functions are called without boxing/unboxing
overhead. Also, there is a general performance penalty for having
functions as objects.
The best thing to do is to benchmark the property update method separately
from everything else, and compare the results to code that does
the same thing but is implemented as a simple method, without any
property/functional indirection. This way it is easy to see what
impacts performance and by how much.
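
The comparison suggested above could be sketched roughly as follows. `ListenedProperty` is a hypothetical stand-in for a property with change listeners, not sgine's class, and a hand-rolled `System.nanoTime` loop like this only gives a rough indication (JIT warm-up and dead-code elimination can distort it).

```scala
// Hypothetical stand-in for a property with change listeners; not sgine's API.
final class ListenedProperty(private var value: Double) {
  private var listeners: List[Double => Unit] = Nil
  def listen(f: Double => Unit): Unit = { listeners = f :: listeners }
  def get: Double = value
  def set(v: Double): Unit = {
    value = v
    listeners.foreach(l => l(v)) // every update notifies every listener
  }
}

object UpdateBenchmark {
  // Baseline: the same update as plain arithmetic on a local variable.
  def plainUpdate(steps: Int, delta: Double): Double = {
    var x = 0.0
    var i = 0
    while (i < steps) { x += delta; i += 1 }
    x
  }

  // The same update routed through a property with one listener attached.
  def propertyUpdate(steps: Int, delta: Double): Double = {
    val p = new ListenedProperty(0.0)
    var notified = 0L
    p.listen(_ => notified += 1)
    var i = 0
    while (i < steps) { p.set(p.get + delta); i += 1 }
    p.get
  }

  // Runs body once and returns (result, elapsed nanoseconds).
  def time(body: => Double): (Double, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, System.nanoTime() - start)
  }

  def main(args: Array[String]): Unit = {
    val steps = 10000000
    // Warm-up runs so the JIT has compiled both paths before timing.
    plainUpdate(steps, 0.5)
    propertyUpdate(steps, 0.5)
    val (a, plainNs) = time(plainUpdate(steps, 0.5))
    val (b, propNs) = time(propertyUpdate(steps, 0.5))
    println("plain: " + plainNs / 1e6 + " ms, property: " + propNs / 1e6 + " ms, same result: " + (a == b))
  }
}
```

Both paths compute the same value, so any timing gap comes from the indirection rather than the math.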



Lex

Jul 19, 2010, 12:30:52 AM
to sgine-dev
Only Function0 is fully specialized, Function1 and Function2 are
partially specialized, and Function3 and above are not specialized. I
had a look at the property animator; it's an instance of Function3.
This means you are getting 4 boxing/unboxing events (3 arguments + return value)
for each animator invocation.
Also, I have found a lot of listeners, events, and many other things
going on inside properties. All these things cost CPU cycles and are
likely to be part of the problem.
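
To illustrate the Function3 point, here is a sketch of a dedicated trait with a primitive signature next to the same logic exposed as a Function3. The argument list (current, target, elapsed) and the `LinearAnimator` class are assumptions for illustration; sgine's real animator may take different arguments.

```scala
// Hypothetical signature; sgine's real animator may differ.
trait Animator {
  // All parameters and the result are primitive doubles, so a direct call needs no boxing.
  def apply(current: Double, target: Double, elapsed: Double): Double
}

// Moves toward the target at a fixed rate (units per second) without overshooting.
final class LinearAnimator(rate: Double) extends Animator {
  def apply(current: Double, target: Double, elapsed: Double): Double = {
    val step = rate * elapsed
    if (math.abs(target - current) <= step) target
    else if (target > current) current + step
    else current - step
  }
}

object AnimatorDemo {
  val linear = new LinearAnimator(2.0)

  // The same logic as a Function3[Double, Double, Double, Double]. Because
  // Function3 is not specialized, its generic apply works on boxed values.
  val asFunction: (Double, Double, Double) => Double =
    (current, target, elapsed) => linear(current, target, elapsed)
}
```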

Hicks, Matt

Jul 19, 2010, 10:35:32 AM
to sgin...@googlegroups.com
Do you think an Animator trait would perform better than using Function?

Lex

Jul 19, 2010, 12:19:55 PM
to sgine-dev
Yes, that would help. However, I am more concerned about events and
listeners. Having those on UI widgets works out, since you have about
100 widgets and a low frame rate. But putting them on each numeric
property, with thousands of properties and a high FPS, may be too much
overhead.



Hicks, Matt

Jul 19, 2010, 12:21:23 PM
to sgin...@googlegroups.com
Yeah, I might set a timing delay so it only sends an event every 100ms or something
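
One way such a timing delay could look is sketched below. `ThrottledProperty` and its parameters are hypothetical names, not sgine's event system. The current time is passed in explicitly to keep the sketch deterministic; a caller would typically pass `System.currentTimeMillis()`.

```scala
// Hypothetical sketch of an event threshold; not sgine's event system.
final class ThrottledProperty(minIntervalMs: Long, onChange: Double => Unit) {
  private var value = 0.0
  private var hasFired = false
  private var lastFiredMs = 0L

  def get: Double = value

  // Always stores the new value, but fires the change event at most once
  // per minIntervalMs milliseconds.
  def set(v: Double, nowMs: Long): Unit = {
    value = v
    if (!hasFired || nowMs - lastFiredMs >= minIntervalMs) {
      hasFired = true
      lastFiredMs = nowMs
      onChange(v)
    }
  }
}
```

Changes that arrive inside the interval are stored but not announced, so anything driven by the event only sees them when the next event fires.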

Hicks, Matt

Jul 19, 2010, 8:12:54 PM
to sgin...@googlegroups.com
Tried switching to using an Animator trait instead of Function and it made absolutely no difference in performance so I switched back.

I'm about to test the effect of event thresholds.


Hicks, Matt

Jul 19, 2010, 8:29:28 PM
to sgin...@googlegroups.com
Just got done setting an event fire threshold and bumping it up to every 100ms got me from 38fps to 56fps, which is a pretty good jump, but those events are what trigger revalidation of the matrix and so everything looks extremely choppy.  If I drop it down further to about 20ms there doesn't seem to be any real benefit and the animations are slightly less smooth than they should be.  Just removed the functionality as it doesn't seem beneficial.

It does seem that the ScaleComponent, LocationComponent, and RotationComponents are causing a great deal of the processing time as you previously stated.  If you could send me the combined alternative to determine those values all at once I would like to do some performance testing against that to see how much of an impact that makes.


Lex

Jul 20, 2010, 1:18:06 AM
to sgine-dev
Combining the Components into one will allow the use of optimized
transformation routines, but you will not see the difference. I
have a demo running 30 000 animated objects using a full set of
transformations and not skipping a beat. There is about a 10% difference
when running with optimized transformation routines, so the bottleneck
is not the math or the unoptimized transformations.
There was a bottleneck in the renderer, but that was resolved to
some degree, and you are seeing big improvements when animation is
turned off. The raw math throughput is more than enough for 30 000
objects, never mind 100-400. The fact that turning on the animation
causes FPS to fall so low means that there is a MAJOR bottleneck
somewhere. This bottleneck would mask any other performance
problems or improvements, so the test with the Animator trait is simply
inconclusive. The last remaining place the performance problems could
be hiding is in the properties themselves. I suggest making a test
without any properties, using straightforward math, so you can compare
the difference. I am convinced the problem lies with the events,
listeners, and other features the properties are packed with.



Hicks, Matt

Jul 20, 2010, 8:46:30 AM
to sgin...@googlegroups.com
I intend to spend more time this week with YourKit and would recommend you start using it as well.  It will take the guess-work out of this problem as it will help us determine exactly where the problem areas are and correct them.

I'm going to be doing more work on the new Shape system and after swapping that in to replace ImageRenderer I think that will cause a major performance improvement as well.

Hicks, Matt

Jul 24, 2010, 9:56:57 AM
to sgin...@googlegroups.com
It would appear that the copying of data from matrix to direct buffer accounts for about 20% of the thread utilization on the rendering thread.  I'm attaching a screenshot from YourKit analyzing TestStressCubes:

sgine_performance.jpg

I'm tempted to put the camera matrix into a stored DirectDoubleBuffer and each render item into its own as well and update them on change then glLoadMatrix(cameraBuffer), glMultMatrix(itemWorldBuffer).  Anyone have any other suggestions of better ways to handle this?  This appears to be the single greatest performance hit on the render thread.
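
A rough sketch of the "stored direct buffer, updated on change" idea, using only `java.nio`. The class and member names are hypothetical and nothing here comes from sgine. The buffer is refreshed from the matrix values only when something has changed, so an unchanged matrix costs no copy per frame.

```scala
import java.nio.{ByteBuffer, ByteOrder, DoubleBuffer}

// Hypothetical sketch; class and member names are not from sgine.
final class CachedMatrixBuffer {
  private val values = new Array[Double](16)
  // Start as the identity matrix (diagonal entries at indices 0, 5, 10, 15).
  for (i <- 0 until 4) values(i * 5) = 1.0

  // Direct buffer allocated once and reused for every frame.
  private val buffer: DoubleBuffer =
    ByteBuffer.allocateDirect(16 * 8).order(ByteOrder.nativeOrder()).asDoubleBuffer()
  private var dirty = true

  // Counts array-to-buffer copies; here for illustration only.
  var copies = 0

  // Marks the buffer stale only when an entry actually changes.
  def set(index: Int, v: Double): Unit = {
    if (values(index) != v) { values(index) = v; dirty = true }
  }

  // Returns the direct buffer, copying the matrix into it only if it is stale.
  // The result is what would be handed to the glLoadMatrix / glMultMatrix calls.
  def direct: DoubleBuffer = {
    if (dirty) {
      buffer.clear()
      buffer.put(values)
      buffer.flip()
      dirty = false
      copies += 1
    }
    buffer
  }
}
```

With one such object for the camera and one per render item, static items would skip the copy entirely; items that change every frame would still pay for one copy per frame.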