huge performance decrease with additional groups?


josch

Oct 22, 2008, 5:25:45 PM
to pyglet-users
hey guys!
Up to now I had only one big texture and one group for all my
drawing.
As the amount of stuff increases, one texture isn't enough, so I
created a new texture atlas and with it a new ordered TextureGroup.
The problem is that even when I add only one vertex list belonging to
this new group/texture to my batch, the performance drops from 200fps
to 40fps. Is this normal?

josch

Oct 23, 2008, 1:36:27 AM
to pyglet-users
By reducing my setup to a small test case I found the culprit by
accident!
Consider this code:

import pyglet

class Window(pyglet.window.Window):
    def __init__(self):
        super(Window, self).__init__(800, 600, resizable=True, vsync=False)
        self.fps = pyglet.clock.ClockDisplay()
        pyglet.clock.schedule(lambda dt: None)
        atlas1 = pyglet.image.atlas.TextureAtlas(width=2048, height=2048)
        atlas2 = pyglet.image.atlas.TextureAtlas(width=1024, height=1024)
        #atlas2 = pyglet.image.atlas.TextureAtlas(width=2048, height=2048)
        group1 = pyglet.graphics.TextureGroup(atlas1.texture)
        group2 = pyglet.graphics.TextureGroup(atlas2.texture)
        tile1 = atlas1.add(pyglet.image.load(None,
            file=pyglet.resource.file('data/tiles/grass/1.png')))
        tile2 = atlas2.add(pyglet.image.load(None,
            file=pyglet.resource.file('data/tiles/grass/1.png')))
        self.batch = pyglet.graphics.Batch()
        vertex_list = []

        vertex_list.append(self.batch.add(4, pyglet.gl.GL_QUADS, group1,
            ('v2i', [0, 0, 32, 0, 32, 32, 0, 32]),
            ('t3f', tile1.tex_coords),
            ('c4B', (255, 255, 255, 255) * 4)))
        vertex_list.append(self.batch.add(4, pyglet.gl.GL_QUADS, group2,
            ('v2i', [32, 0, 64, 0, 64, 32, 32, 32]),
            ('t3f', tile2.tex_coords),
            ('c4B', (255, 255, 255, 255) * 4)))

    def on_draw(self):
        pyglet.gl.glClear(pyglet.gl.GL_COLOR_BUFFER_BIT)
        self.batch.draw()
        self.fps.draw()

window = Window()
pyglet.app.run()

You only need one small 32x32 tile to test this.
This runs at 500fps for me.
But when I uncomment the texture atlas line, so that the second atlas
is also 2048x2048, it drops from 500fps to 40fps!!
All other combinations of atlas sizes work; only two 2048x2048 atlases
give this strange performance drop!
Why is that?
It seems that I have to think about rearranging my map objects....

Alex Holkner

Oct 23, 2008, 1:58:00 AM
to pyglet...@googlegroups.com
On Thu, Oct 23, 2008 at 4:36 PM, josch <j.sc...@web.de> wrote:

> but then when i uncomment the texture atlas line and have another
> 2048x2048 texture - it drops from 500fps to 40fps!!
> all other texture atlas size combinations work - only two times
> 2048x2048 gives this strange performance drop!

I'd guess that you're hitting your video RAM limit -- with two
2048x2048 textures, the driver has to page them in and out of
video memory within each frame.
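
For a rough sense of scale, the uncompressed size of a texture can be estimated from its dimensions. The sketch below assumes 4 bytes per pixel (8-bit RGBA) and no mipmaps; the helper name `texture_bytes` is made up for illustration, and real drivers may pad or align textures differently. Framebuffers also share the same video memory, so the room left for textures is smaller than the nominal total.

```python
MIB = 1024 * 1024

def texture_bytes(width, height, bytes_per_pixel=4):
    """Approximate size of one uncompressed texture, ignoring driver padding."""
    return width * height * bytes_per_pixel

# One 2048x2048 RGBA texture is 16 MiB, one 1024x1024 texture is 4 MiB.
for side in (1024, 2048):
    print(side, texture_bytes(side, side) // MIB, "MiB")

# Two 2048x2048 atlases together already take 32 MiB.
print(2 * texture_bytes(2048, 2048) // MIB, "MiB")
```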

Alex.

josch

Oct 23, 2008, 2:12:47 AM
to pyglet-users
I tried replacing the 2048x2048 texture with four 1024x1024 ones, and
the performance decrease only hit when adding the fourth texture. That
makes sense because I have 64MB of video memory on my Intel X3100.
So what do I do? I need this much space!
Should I divide everything up into smaller textures, so that only the
texture that is really needed is fetched from memory and it is also
not so big?
When playing around with the four 1024x1024 textures I clearly saw that
the one texture that was apparently swapped out to system memory was
much faster to fetch, as the frame rate only dropped to 100fps and not
40fps.
Or should I dynamically TextureAtlas.add() stuff?


Nathan Whitehead

Oct 23, 2008, 12:44:12 PM
to pyglet...@googlegroups.com
On Wed, Oct 22, 2008 at 11:12 PM, josch <j.sc...@web.de> wrote:
> ... So what do I do? I need this much space!

I think there are two general ways to proceed, depending on what your
application needs.

Case 1) You need the 2048x2048 space for texture data, but not all at
once on the screen. This would be the case if the application is a
game and the texture data contains all the animation frames for
sprites, backgrounds, etc., that appear in the game.

In this case I would proceed by dividing the data into more textures
that get used as needed. An idea is to have one texture per enemy
type. If the enemy type doesn't appear on screen, the texture can be
paged out until needed. With this method you would still get a
slowdown if lots of enemy types appeared onscreen at once, but that is
reasonable.

Case 2) You really do need the 2048x2048 space on the screen all at
once. I would proceed by resizing the texture data itself. Instead
of one 2048x2048 texture, scale the image data down by a factor of 2
to get a single 1024x1024 texture (perhaps using PIL). Then draw this
texture to the same size as previously. It won't be as crisp with the
same detail but it will be fast. This is like turning down the
"graphic detail" setting on lots of games to get your framerate up.
You should make this an option so that users with better graphics
cards get the full textures.
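
To make "scale the image data down by a factor of 2" concrete, here is a standard-library-only sketch that averages each 2x2 block of tightly packed 8-bit RGBA pixels. The function name `halve_rgba` is hypothetical. In practice an imaging library such as PIL would do this in one resize call, and for pixel art a different filter may look better.

```python
def halve_rgba(pixels, width, height):
    """Downscale tightly packed RGBA8 data by 2 in each direction.

    Each output pixel is the average of a 2x2 block of input pixels.
    Width and height are assumed to be even.
    """
    out_w, out_h = width // 2, height // 2
    out = bytearray(out_w * out_h * 4)
    for y in range(out_h):
        for x in range(out_w):
            for c in range(4):  # r, g, b, a
                total = 0
                for dy in (0, 1):
                    for dx in (0, 1):
                        src = ((2 * y + dy) * width + (2 * x + dx)) * 4 + c
                        total += pixels[src]
                out[(y * out_w + x) * 4 + c] = total // 4
    return bytes(out), out_w, out_h

# A 2x2 image collapses to one pixel holding the average color.
src = bytes([0, 0, 0, 255,   100, 0, 0, 255,
             200, 0, 0, 255, 100, 0, 0, 255])
small, w, h = halve_rgba(src, 2, 2)
```

The result holds a quarter of the pixels, so a 2048x2048 source would need a quarter of the texture memory.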
--
Nathan Whitehead

Casey Duncan

Oct 23, 2008, 12:45:07 PM
to pyglet...@googlegroups.com
Two solutions come to mind:

- Reduce the size of the textures, either by using native texture
compression (GL_ARB_texture_compression if supported), reducing the
resolution of certain textures (this could be configurable as in many
commercial games), or simply using fewer different images in your
atlas

- Reorder the textures so that swapping is minimized. Put all of your
background textures together and make sure they are drawn first, then
do another batch from the next texture atlas, etc. There will still be
swapping, but hopefully not as much. You might find that using smaller
atlases is better if you aren't going over by much.
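
The reordering idea can be illustrated without any OpenGL at all. The sketch below counts how often the bound atlas changes for a given draw order; the `(layer, atlas)` tuples and the name `texture_binds` are invented for this example. Sorting is only valid when it keeps things that must be drawn in a particular order in that order, which is why the layer comes first in each tuple.

```python
from itertools import groupby

def texture_binds(draw_calls):
    """Count runs of consecutive draw calls that use the same atlas."""
    atlases = (atlas for _layer, atlas in draw_calls)
    return sum(1 for _run in groupby(atlases))

# Hypothetical draw list: (layer, atlas name), interleaved as it was created.
draw_calls = [(0, "terrain"), (1, "objects"), (0, "terrain"),
              (1, "objects"), (0, "terrain")]

print(texture_binds(draw_calls))          # 5 atlas changes when interleaved
print(texture_binds(sorted(draw_calls)))  # 2 after sorting by layer, then atlas
```

In pyglet 1.x this roughly corresponds to giving each atlas its own TextureGroup under an ordered parent group, so that the batch draws everything from one atlas before moving on to the next.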

Bottom line though is that 64MB of VRAM isn't very much these days,
something's probably got to give in the quality dept.

-Casey

josch

Oct 23, 2008, 5:04:28 PM
to pyglet-users
hey guys!
thx for the replies! Here is roughly what a scene looks like:
http://www.assembla.com/spaces/heroes-renaissance/documents/crWD3WOnSr3ApNab7jnrAJ/download/Screenshot-hr.py-1.png
This map nearly fits on the screen, and even this small one needs more
than 64MB of space for its textures, and this does NOT include
additional animation frames! Regular maps are even larger - up to
sixteen times the size. Fortunately the overall number of different map
objects will not increase that much with bigger map sizes.

@nathan:
As this is about showing a map, I really need all objects on the screen
at once. This is even more the case when the user zooms out or uses his
native screen size, which can be huge. Drawing this many vertices is
not a problem - drawing fullscreen is still possible at 60fps on my
X3100, and the slowdown only appears when texture data has to be
swapped.
As you can see from the screenshot, resizing the textures isn't an
option either. I must draw them at full size.

@casey:
I didn't know about compression - is this doable in pyglet?
I already order my stuff - first I add 32x32 tiles, then 64x32
stuff, then 64x64.... Fortunately every width/height is a multiple of
32.

At least for the map objects I really have to divide stuff up into
several textures, as for bigger maps all the map objects won't fit into
2048x2048 (again, this is without animation frames...)

And you are right, 64MB isn't much these days. I just can't believe
that it is too small for a reimplementation of a game from 1998.
Obviously the texture data of one map doesn't fit into 64MB in most
cases, but back then there was no hardware acceleration at all, so no
worries about video memory.

I think I will have to live with video memory being swapped to system
memory. I don't believe that it would make sense to dynamically fill
textures on the fly with only the currently needed stuff.

Alex Holkner

Oct 23, 2008, 5:43:39 PM
to pyglet...@googlegroups.com
On 10/24/08, josch <j.sc...@web.de> wrote:
>
> hey guys!
> thx for the reply! here is how a scene roughly looks like:
> http://www.assembla.com/spaces/heroes-renaissance/documents/crWD3WOnSr3ApNab7jnrAJ/download/Screenshot-hr.py-1.png

Looking good. Texture compression won't really be suitable for your
game given the pixel-art style; however you may benefit from using a
lower bit depth for the texture internal format; e.g. GL_RGBA4 (4 bits
per component), GL_RGB5_A1 (5 bits per r/g/b, 1 for alpha), etc. I
haven't tried these myself, so I don't know if this makes a practical
difference on modern video cards.
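
As a rough illustration of the trade-off, both formats named above use 16 bits per pixel instead of 32, which would halve texture memory if the driver honors the requested internal format. The price is fewer distinct values per channel. The sketch below round-trips 8-bit channel values through a narrower channel; the helper names are made up for this example and it does not touch OpenGL.

```python
def quantize(value, bits):
    """Round-trip an 8-bit channel value through a channel of `bits` bits."""
    levels = (1 << bits) - 1
    stored = round(value * levels / 255)
    return round(stored * 255 / levels)

def distinct_levels(bits):
    """How many different 8-bit values survive the round trip."""
    return len({quantize(v, bits) for v in range(256)})

print(distinct_levels(4))  # 16 levels per channel with GL_RGBA4
print(distinct_levels(5))  # 32 color levels with GL_RGB5_A1
print(distinct_levels(1))  # 2 alpha levels with GL_RGB5_A1: on or off
```

So GL_RGB5_A1 only suits sprites with fully opaque or fully transparent pixels, while GL_RGBA4 keeps 16 steps of translucency at the cost of coarser color.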

Alex.

Nathan Whitehead

Oct 23, 2008, 5:45:49 PM
to pyglet...@googlegroups.com
On Thu, Oct 23, 2008 at 2:04 PM, josch <j.sc...@web.de> wrote:
>
> hey guys!
> thx for the reply! here is how a scene roughly looks like:
> http://www.assembla.com/spaces/heroes-renaissance/documents/crWD3WOnSr3ApNab7jnrAJ/download/Screenshot-hr.py-1.png

I don't know how the gameplay works, but another idea is to not clear
the graphics buffer between frame updates. I.e., draw the highly
detailed big map on the color buffer, then leave it there. If the
screen isn't moving then you don't have to redraw each frame entirely.
Just don't call glClear at the start of drawing. Then when something
happens on part of the screen, redraw just that part. You can update
nonrectangular regions using the stencil buffer, or you can change the
viewport and use scissoring to update a rectangular area. The OpenGL
guide has a chapter on this.

This might take some extra coding to calculate which parts to update.
And if most of the screen is changing it wouldn't help very much. I
can't think of any reason you couldn't use the pyglet functions to do
things still. If you're updating rectangles then you would just need
a couple glViewport and glScissor calls before the update.
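
The bookkeeping for rectangular updates might look like the sketch below. The helper names are hypothetical, rectangles are `(x, y, width, height)` in window coordinates with the origin at the bottom left, and the OpenGL calls are only shown in comments so that the example runs without a window.

```python
def union_rect(rects):
    """Smallest (x, y, width, height) rectangle covering all given rectangles."""
    left = min(x for x, y, w, h in rects)
    bottom = min(y for x, y, w, h in rects)
    right = max(x + w for x, y, w, h in rects)
    top = max(y + h for x, y, w, h in rects)
    return left, bottom, right - left, top - bottom

def clamp_to_window(rect, win_w, win_h):
    """Clip a rectangle to the window so the scissor box stays on screen."""
    x, y, w, h = rect
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, win_w), min(y + h, win_h)
    return x0, y0, max(x1 - x0, 0), max(y1 - y0, 0)

# Two tiles changed this frame; redraw only the area that covers both.
dirty = [(32, 32, 32, 32), (96, 64, 64, 32)]
x, y, w, h = clamp_to_window(union_rect(dirty), 800, 600)

# Inside on_draw one would then restrict drawing to that box, roughly:
#   pyglet.gl.glEnable(pyglet.gl.GL_SCISSOR_TEST)
#   pyglet.gl.glScissor(x, y, w, h)
#   batch.draw()
#   pyglet.gl.glDisable(pyglet.gl.GL_SCISSOR_TEST)
```

Using a single union rectangle is the simplest choice; when the changed tiles are far apart it may be cheaper to scissor and redraw each rectangle separately.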
--
Nathan Whitehead

Alex Holkner

Oct 23, 2008, 5:47:42 PM
to pyglet...@googlegroups.com

Remember to disable double-buffering if you take this route.

Alex.

Casey Duncan

Oct 23, 2008, 5:55:40 PM
to pyglet...@googlegroups.com
On Thu, Oct 23, 2008 at 3:04 PM, josch <j.sc...@web.de> wrote:
>
> hey guys!
> thx for the reply! here is how a scene roughly looks like:
> http://www.assembla.com/spaces/heroes-renaissance/documents/crWD3WOnSr3ApNab7jnrAJ/download/Screenshot-hr.py-1.png
> this map nearly fits on the screen and even this small one needs more
> than 64MB space for the textures and this does NOT include additional
> animation frames! Normal maps are even larger - up to sixteen times
> the size. fortunately the overall number of different map objects will
> not increase that much with bigger map sizes.
[..]

Given this is a more old-school sprite style game, perhaps a more old
school approach is needed.

I seriously doubt the original game redrew the entire map every frame.
More than likely it used the tried and true "dirty rectangle" approach
and only redrew what had actually changed each frame. Perhaps you
should consider the same approach, don't clear the buffer and redraw
every frame, just figure out what parts of the map actually changed
and redraw that. Even if it takes a few frames to render the whole
map, you would rarely have to do it in practice so it wouldn't matter.

I would suggest reading this interesting essay on how SimCity 4
renders its cityscapes:

http://simcity.ea.com/about/inside_scoop/3d1.php

One thing they admit at the end is how much better suited 3D graphics
technology is to first-person style games than god games, since you
can naturally limit the level of detail over distances.

-Casey

josch

Oct 24, 2008, 1:25:16 AM
to pyglet-users
thx for the great input guys! I will consider these ideas!
Lower bit depths won't work for me, I think, because I need e.g.
multiple alpha values.
Scissoring and "dirty rectangles" sound reasonable, but I doubt they
will be useful most of the time, as the user only needs high framerates
when moving around the map a lot. When the user doesn't do anything,
I only need 6 updates/second. While moving, the scissoring
might have an effect, but I doubt it would be worth the effort for now.
I think I will try to stick with optimizing the texture usage and
come back here if that doesn't work as expected either.

As far as I understand it, texture regions are packed better when
those of the same size are added together, in order, right?
If I add for example 32x32 and 64x64 images alternately, more space
will be wasted than when for example first adding all 32x32 and then
all 64x64, right?
Where can I look up how textures are packed and how I can
intelligently improve the packing?

When I compared how many pixels my loaded images have with the space
taken up in the texture, only 64% of the texture was actually used.
Since all images are multiples of 32 in size, I think this could be
optimized a lot!
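
The effect of insertion order can be tried out with a toy strip packer. This is a simplified model written for illustration, not pyglet's code; as far as I know pyglet's own allocator lives in pyglet/image/atlas.py and also fills horizontal strips from left to right, but check the source for the details. In the model, only the most recently opened strip may still grow in height; older strips accept a rectangle only if it is no taller than the strip.

```python
def packed_height(sizes, atlas_width):
    """Pack (width, height) rectangles into strips in the given order.

    Returns the total height of all strips, i.e. how much of the atlas
    is consumed for this insertion order.
    """
    strips = []  # each strip is [used_width, strip_height]
    for w, h in sizes:
        for i, strip in enumerate(strips):
            is_last = i == len(strips) - 1
            if strip[0] + w <= atlas_width and (is_last or h <= strip[1]):
                strip[0] += w
                strip[1] = max(strip[1], h)
                break
        else:
            strips.append([w, h])  # no strip had room: open a new one
    return sum(height for _used, height in strips)

small, large = (32, 32), (64, 64)
alternating = [small, large] * 4

print(packed_height(alternating, 128))          # 192 rows used
print(packed_height(sorted(alternating), 128))  # 160 rows used
```

In this model the alternating order wastes space because 32-pixel tiles end up in 64-pixel-high strips, while adding all tiles of one size first fills every strip completely.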

Jonathan Hartley

Oct 24, 2008, 6:45:57 PM
to pyglet-users
Hey there,

I'm a bit confused why your map textures are so big in the above
example. If the map is about the same size as the screen, containing
unique textures all over the map, then you'll need around 1600x1200
pixels, at three bytes per pixel. That's 5.7MB for enough textures to
cover a screen-sized map.

You said you're not using animation frames yet, so I don't think
that's the reason why your texture is so much bigger than my
calculation. Can the user zoom in on the map, so the textures are
really higher resolution than the screenshot you posted?

I'm probably just confused, but thought it worth asking, just on the
offchance that it isn't me who's overlooking something for once. :-)

Cheers,

Jonathan