Marlin 0.9.0 performance


Botond Kósa

Feb 8, 2018, 6:03:36 PM
to marlin-renderer
Hi Laurent,

I tried Marlin 0.9.0 and it seems lightning fast, sometimes more than 2x faster than 0.8.2 when drawing polygon borders, using default settings. I wonder where this speedup might come from. You mentioned "higher sub-pixel accuracy (256x8) + larger tiles (128x64)" in the release notes, but even after I set sun.java2d.renderer.subPixel_log2_X back to 3 (to revert to 8x8 subpixels) I still got the 2x speedup.

By the way, what is the reason for using non-uniform subpixels and tile sizes in x and y directions? The 256x8 subpixel default setting seems particularly strange. Does this mean that near-vertical lines are smoothed differently than near-horizontal ones?

It would be great to set the renderer settings on a per-graphics basis (I am using MarlinGraphics2D), is that possible? Or would that hurt performance?

Cheers,
Botond


Laurent Bourgès

Feb 9, 2018, 4:52:53 AM
to marlin-...@googlegroups.com
Hi Botond,

Thanks for your feedback!


I tried Marlin 0.9.0 and it seems lightning fast, sometimes more than 2x faster than 0.8.2 when drawing polygon borders, using default settings. I wonder where this speedup might come from. You mentioned "higher sub-pixel accuracy (256x8) + larger tiles (128x64)" in the release notes, but even after I set sun.java2d.renderer.subPixel_log2_X back to 3 (to revert to 8x8 subpixels) I still got the 2x speedup.

If you use VolatileImage(s), the larger tile size (128x64 instead of 32x32) lowers the D3D / OpenGL overhead, so I suspect this change accounts for the gains you observed.
I ran tests last November (tweet on 4 Nov); here is a plot illustrating the performance difference (Volatile images - 1 thread):

For large shape fills, gains are up to 65% in my benchmark results.
 
By the way, what is the reason for using non-uniform subpixels and tile sizes in x and y directions? The 256x8 subpixel default setting seems particularly strange. Does this mean that near-vertical lines are smoothed differently than near-horizontal ones?

The main reason is performance vs quality:
I increased the x-axis subpixel accuracy to 1/256th instead of 1/8th because it improves visual quality at no cost.
Adjusting the y-axis subpixel count, on the other hand, increases or decreases the number of scanlines processed per pixel row (8 by default), so performance goes down or up accordingly, as illustrated in my JavaOne talk (page 83):

-  2 subpixels: 32% faster
-  4 subpixels: 20% faster
- 16 subpixels: 35% slower

Of course, if I increased the y-axis subpixel accuracy to 1/256th, rendering would become very slow!
Finally, to really improve the accuracy on the y-axis, another algorithm (AGG-like) would have to be implemented that computes the exact covered area (triangle / trapezoid), but that is a big R&D effort and would take too much time for me alone!
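
The relationship between the log2 settings and the scanline count described above can be sketched as follows. This is only an illustration, not Marlin code: the class and method names are made up, and the property name sun.java2d.renderer.subPixel_log2_Y is assumed by analogy with the _X property quoted in this thread.

```java
/** Illustrative helper (not part of Marlin): turns log2 settings into subpixel counts. */
final class SubpixelSettings {
    // Property named in this thread.
    static final String KEY_LOG2_X = "sun.java2d.renderer.subPixel_log2_X";
    // Assumed by analogy with the X property; not spelled out in this thread.
    static final String KEY_LOG2_Y = "sun.java2d.renderer.subPixel_log2_Y";

    /** Number of subpixels along one axis for a given log2 setting. */
    static int subpixels(int log2) {
        return 1 << log2;
    }

    /** Scanlines processed per pixel row: one per y-axis subpixel. */
    static int scanlinesPerPixelRow(int log2Y) {
        return subpixels(log2Y);
    }

    /** Reads a log2 setting from the system properties, falling back to the given default. */
    static int readLog2(String key, int defaultLog2) {
        return Integer.getInteger(key, defaultLog2);
    }
}
```

With the 0.9.0 defaults (log2 X = 8, log2 Y = 3) this gives 256 subpixels horizontally and 8 scanlines per pixel row. Note that the timings quoted above do not scale linearly with the scanline count: doubling it to 16 is reported as 35% slower, not 2x slower.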

It would be great to set the renderer settings on a per-graphics basis (I am using MarlinGraphics2D), is that possible? Or would that hurt performance?

For performance reasons, I use lots of static final variables (constants), as the HotSpot JIT can then eliminate dead code and produce very efficient native code!
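
A minimal sketch of the static final pattern mentioned here, assuming settings are read once from system properties. The class name and the example.renderer.doChecks switch are hypothetical; this is not Marlin's actual source.

```java
/** Illustrative only: settings captured once in static final fields. */
final class RendererConstants {
    // Property named in this thread; 8 (i.e. 256 subpixels) is the 0.9.0 default quoted above.
    static final int SUBPIXEL_LG_X =
            Integer.getInteger("sun.java2d.renderer.subPixel_log2_X", 8);

    // Hypothetical debug switch, invented for this example.
    static final boolean DO_CHECKS = Boolean.getBoolean("example.renderer.doChecks");

    /** Converts a pixel x coordinate to subpixel units. */
    static int toSubpixelX(int pixelX) {
        // When DO_CHECKS is false, the JIT can treat it as a constant and drop this branch.
        if (DO_CHECKS && pixelX < 0) {
            throw new IllegalArgumentException("negative x: " + pixelX);
        }
        return pixelX << SUBPIXEL_LG_X;
    }
}
```

The trade-off is that such values are fixed when the class is initialized, so they cannot vary from one Graphics2D instance to another, which is presumably why per-graphics settings are not available.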

However, in a future release (0.9.2) I will implement an enhancement that lets Marlin know the value of the Graphics2D RenderingHint KEY_RENDERING:
- VALUE_RENDER_QUALITY
- VALUE_RENDER_SPEED
- VALUE_RENDER_DEFAULT

It will allow tuning Marlin's runtime settings depending on what the application wants: SPEED or QUALITY.
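
On the application side, this hint is set through the standard Java2D API, as sketched below. The class name is made up for the example; how (or whether) a given Marlin version reacts to the hint is not shown here, since the thread only announces it as a future enhancement.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Illustrative only: requesting QUALITY rendering on one Graphics2D. */
final class RenderingHintDemo {
    /** Sets KEY_RENDERING on a fresh Graphics2D and checks that it is stored. */
    static boolean hintIsStored() {
        BufferedImage image = new BufferedImage(16, 16, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g2d = image.createGraphics();
        try {
            g2d.setRenderingHint(RenderingHints.KEY_RENDERING,
                                 RenderingHints.VALUE_RENDER_QUALITY);
            return g2d.getRenderingHint(RenderingHints.KEY_RENDERING)
                    == RenderingHints.VALUE_RENDER_QUALITY;
        } finally {
            g2d.dispose();
        }
    }
}
```

Because the hint lives on the Graphics2D instance, it is a natural per-graphics channel for expressing a SPEED or QUALITY preference.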

PS: Could you try Marlin 0.9.1? Its path clipper is better than ever.

Cheers,
Laurent

Botond Kósa

Feb 9, 2018, 6:54:52 AM
to marlin-renderer
Thanks for the explanation. We use VolatileImage to render layers of polygons (administrative areas of a map). The fill is performed without AA, the borders are drawn with AA. This can be done because the borders are drawn using a 2px thick opaque stroke that completely hides any jagged edges produced by the non-AA fill.
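
The fill-then-stroke approach described above can be sketched with the standard Java2D API as follows. The shape, colors, image size and class name are placeholders, and a BufferedImage stands in for the VolatileImage so that the sketch stays self-contained.

```java
import java.awt.BasicStroke;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.Path2D;
import java.awt.image.BufferedImage;

/** Illustrative only: non-AA fill hidden under a 2px antialiased opaque border. */
final class PolygonLayerDemo {
    static BufferedImage render() {
        BufferedImage image = new BufferedImage(64, 64, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = image.createGraphics();
        try {
            // Placeholder triangle standing in for an administrative area.
            Path2D.Double area = new Path2D.Double();
            area.moveTo(8, 8);
            area.lineTo(56, 20);
            area.lineTo(32, 56);
            area.closePath();

            // 1) Fast fill without antialiasing (edges are jagged).
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                               RenderingHints.VALUE_ANTIALIAS_OFF);
            g.setColor(Color.YELLOW);
            g.fill(area);

            // 2) Antialiased 2px opaque border drawn on top of the jagged edges.
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                               RenderingHints.VALUE_ANTIALIAS_ON);
            g.setStroke(new BasicStroke(2f));
            g.setColor(Color.BLACK);
            g.draw(area);
        } finally {
            g.dispose();
        }
        return image;
    }
}
```

Only the second step goes through the antialiasing renderer, which is why the tile size affects the "draw with AA" timings.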

On a particular location I measured the following rendering times:
  • fill without AA: 10 ms
  • draw with AA, 32x32 tile size: 124 ms
  • draw with AA, 128x64 tile size: 46 ms
  • draw with AA, 256x256 tile size: 32 ms
  • draw without AA: 26 ms
This explains why upgrading marlin from version 0.8.2 to 0.9.1 more than doubled the performance in this particular workload. Increasing the tile size to the maximum of 256x256 results in even higher performance, almost reaching the non-AA version.

What are the drawbacks of using large tiles? Higher memory usage? Does it affect the performance of drawing small images? (e.g. we render map feature labels on small BufferedImages like 100x20 pixels, is a 256x256 tile size suitable for that?)

Botond

Laurent Bourgès

Feb 9, 2018, 9:18:40 AM
to marlin-...@googlegroups.com
Hi,

2018-02-09 12:54 GMT+01:00 Botond Kósa <boton...@idata.hu>:
Thanks for the explanation. We use VolatileImage to render layers of polygons (administrative areas of a map). The fill is performed without AA, the borders are drawn with AA. This can be done because the borders are drawn using a 2px thick opaque stroke that completely hides any jagged edges produced by the non-AA fill.

On a particular location I measured the following rendering times:
  • fill without AA: 10 ms
  • draw with AA, 32x32 tile size: 124 ms
  • draw with AA, 128x64 tile size: 46 ms
  • draw with AA, 256x256 tile size: 32 ms
  • draw without AA: 26 ms
This explains why upgrading marlin from version 0.8.2 to 0.9.1 more than doubled the performance in this particular workload. Increasing the tile size to the maximum of 256x256 results in even higher performance, almost reaching the non-AA version.

I did my tests on both BufferedImage & VolatileImage outputs on Linux (64-bit), i.e. with the xrender backend. Of course, tile size tuning depends on the backend (D3D, OpenGL for macOS, xrender) but also on the GPU capabilities & performance.

The tile size settings define the largest tile that Marlin hands to the Java2D pipeline. The backend then processes that tile either by subdividing it or by copying it directly to the GPU, depending on the backend. If the shape is small, Marlin only uses the needed size and produces a small tile (except that row interleaving will be high, so a small overhead can occur because pixel rows are less contiguous).

Moreover, large tiles can slow things down if tiles are almost empty / full: for a diagonal line, smaller tiles are faster because fewer pixels are processed, whereas larger tiles cause lots of useless pixels to be processed anyway.
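
The diagonal-line argument can be made concrete with a deliberately simplified cost model. It assumes square tiles, a thin 45-degree line that touches one tile per diagonal step, and that every touched tile is processed in full; none of this is taken from Marlin's actual tile generator, and the names are invented for the example.

```java
/** Illustrative cost model only: pixels processed for a thin diagonal line. */
final class TileCostModel {
    /**
     * Pixels processed for a 45-degree line crossing a size x size area,
     * assuming each touched square tile (side 2^tileLog2) is processed in full.
     */
    static long pixelsProcessed(int size, int tileLog2) {
        int tile = 1 << tileLog2;
        // One tile per diagonal step, rounding up for a partial last tile.
        long tilesTouched = (size + tile - 1) / tile;
        return tilesTouched * tile * tile;
    }
}
```

Under these assumptions, a diagonal across a 1024x1024 area costs 32 x 1024 = 32768 pixels with 32x32 tiles, but 4 x 65536 = 262144 pixels with 256x256 tiles, i.e. 8x more pixels for the same line.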
 

What are the drawbacks of using large tiles? Higher memory usage? Does it affect the performance of drawing small images? (e.g. we render map feature labels on small BufferedImages like 100x20 pixels, is a 256x256 tile size suitable for that?)

Finally, the tile size parameters allow customizing the Marlin renderer for particular scenes, but I am very happy that the new default settings proved to be better than before for general use.
 
FYI, the tile size can be adjusted up to 1024x1024 (log2 = 10).
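
A sketch of how such log2 tile settings might be applied programmatically. The two property names below are assumptions (this thread does not spell them out), the helper class is invented, and the properties would have to be set before the renderer classes are initialized, since the settings are read into constants.

```java
/** Illustrative only: validating and setting tile size properties. */
final class TileSizeSettings {
    // Assumed property names; check the Marlin documentation for your version.
    static final String TILE_WIDTH_LOG2 = "sun.java2d.renderer.tileWidth_log2";
    static final String TILE_HEIGHT_LOG2 = "sun.java2d.renderer.tileSize_log2";

    // Upper bound stated in this thread: log2 = 10, i.e. 1024 pixels.
    static final int MAX_LOG2 = 10;

    /** Tile side length in pixels for a given log2 setting. */
    static int pixels(int log2) {
        return 1 << log2;
    }

    /** Sets both properties after checking the range 0..MAX_LOG2. */
    static void configure(int widthLog2, int heightLog2) {
        check(widthLog2);
        check(heightLog2);
        System.setProperty(TILE_WIDTH_LOG2, Integer.toString(widthLog2));
        System.setProperty(TILE_HEIGHT_LOG2, Integer.toString(heightLog2));
    }

    private static void check(int log2) {
        if (log2 < 0 || log2 > MAX_LOG2) {
            throw new IllegalArgumentException("log2 out of range: " + log2);
        }
    }
}
```

For example, configure(7, 6) would correspond to the 128x64 default mentioned in this thread, and configure(8, 8) to the 256x256 setting measured above.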

In the future, I would like to revisit tile handling to store block flags (32x32) and use an adaptive tile size in the Marlin tile generator:
- small tiles for small features
- larger tiles where marked blocks are contiguous (on the x & y axes), i.e. merge small tiles to reduce tile processing overhead
However, it is certainly tricky to implement, but it looks like the best approach to gain in any situation.
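
To give a rough idea of the merging step, here is a hypothetical sketch restricted to a single row of blocks (x axis only). It is not an existing Marlin feature, and all names are invented: given one flag per 32x32 block, it groups contiguous marked blocks into runs that could each be emitted as one wider tile.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: merges contiguous marked blocks of one row into runs. */
final class BlockRunMerger {
    /** Returns {firstBlockIndex, blockCount} for each run of contiguous marked blocks. */
    static List<int[]> mergeRuns(boolean[] marked) {
        List<int[]> runs = new ArrayList<>();
        int start = -1; // index of the first block of the current run, or -1 if none
        for (int i = 0; i < marked.length; i++) {
            if (marked[i]) {
                if (start < 0) {
                    start = i; // a new run begins here
                }
            } else if (start >= 0) {
                runs.add(new int[] {start, i - start}); // run ended at block i - 1
                start = -1;
            }
        }
        if (start >= 0) {
            runs.add(new int[] {start, marked.length - start}); // run reaches the row end
        }
        return runs;
    }
}
```

Unmarked blocks produce no tile at all, isolated marked blocks stay small, and long runs become large tiles. Extending this to the y axis is the tricky part mentioned above.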

Laurent