optimizing a memory-heavy app for low-vram systems

josch

Nov 18, 2008, 6:08:40 AM11/18/08
to pyglet-users
hello there!

i'm writing a tile engine with animation, zooming and so on.
my problem arose when i added animations:
i had to load a lot more into video memory, and my intel X3100 is
unable to cope with something like 64MB of data in vram.

i tested this with this little script:
http://phpfi.com/373981
when i add more than one 2048x2048 texture to the scene, the frame
rate drops below 12fps even though only a few GL_RECTS are drawn.

my original setup was to distribute all my map objects and animation
frames across several 1024x1024 textures, arranged so that no
animation had to be split between two textures; that way i could
animate by only exchanging the tex_coords.
i move around the map by dynamically resizing the current vertex
lists, add()-ing new ones and delete()-ing old ones as required.
i have to delete() and add() because if i let zero-sized vertex lists
be drawn, their textures still get bound each frame even when no
texture_region from them is used, and that drops performance to 3fps
on big maps with an overall 64MB of texture data.

this proved to be fast on small maps: i could animate by only
changing tex_coords, and since the overall texture size was small,
drawing ran at around 50fps.

but on big maps this approach is nearly impossible: with a lot of
map objects, a lot of different textures have to be loaded, and they
add up to too much data for my gfx card.

so i decided to use even smaller textures, so that more textures
would be needed in the same scene but the overall size would be
smaller. this proved to be true, and on big maps i get more fps with
this method.

but the problem is: when using textures of 512x512 or smaller, i
also have to split a lot of animations across different textures. so
on every animation frame i have to do the same dynamic
resize/add/delete as when moving, and this slows things down again -
on small maps from 50fps to 35fps. that alone is still okay, but
combined with movement i drop below 8fps.

to summarize:
my problem is not that my gfx card can't draw enough rectangles, but
that it is unable to cope with more than one 2048x2048 texture.
the smaller i make the textures, the more resize/delete/add i have
to do, which slows things down again.
the larger the textures, the slower it gets due to my vram
restrictions.

my only solution up to now is to just disable animations so that not
much vram is used - but can you guys think of other alternatives?
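
for reference, the tex_coords animation i mean looks roughly like
this (frame_tex_coords and the row-major atlas layout are made-up
illustrations, not my actual code):

```python
# purely illustrative sketch: compute the t2f texture coordinates for
# animation frame i, assuming frames are packed left-to-right,
# bottom-to-top in a square atlas texture.

def frame_tex_coords(i, frame_w, frame_h, atlas_size, frames_per_row):
    """Return a flat [u, v, ...] list for the 4 corners of frame i."""
    col = i % frames_per_row
    row = i // frames_per_row
    u0 = col * frame_w / float(atlas_size)
    v0 = row * frame_h / float(atlas_size)
    u1 = u0 + frame_w / float(atlas_size)
    v1 = v0 + frame_h / float(atlas_size)
    # corner order: bottom-left, bottom-right, top-right, top-left
    return [u0, v0, u1, v0, u1, v1, u0, v1]
```

animating a quad is then just assigning
`vertex_list.tex_coords = frame_tex_coords(frame, ...)` - the
geometry never changes.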

Alex Holkner

Nov 18, 2008, 7:27:23 AM11/18/08
to pyglet...@googlegroups.com

In your position I would think about modifying the contents of the
texture dynamically to cope with the needs of sprites' animations
(especially if the animations are synchronised, such that each
instance of the sprite is showing the same frame at a time). For
example, by using the blit_into function (or glTexSubImage2D). Note
that this technique is fast enough to play back full-motion video into
a texture even on low-end hardware.
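
A rough sketch of what I mean (SlotAllocator and the fixed-grid
layout are just illustrative; only Texture.blit_into is actual
pyglet API):

```python
# Give every animated object a fixed cell in one shared texture and
# overwrite that cell with the current frame each tick; the tex
# coords in the batch never change.

class SlotAllocator:
    """Assign each object a fixed (x, y) cell in a square atlas."""
    def __init__(self, atlas_size, cell_size):
        self.cells_per_row = atlas_size // cell_size
        self.cell_size = cell_size
        self.count = 0

    def allocate(self):
        i = self.count
        self.count += 1
        x = (i % self.cells_per_row) * self.cell_size
        y = (i // self.cells_per_row) * self.cell_size
        return x, y

def upload_frame(texture, frame_image, x, y):
    # texture is a pyglet.image.Texture; this overwrites the region
    # in place (glTexSubImage2D under the hood).
    texture.blit_into(frame_image, x, y, 0)
```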

Alex.

josch

Nov 18, 2008, 9:52:43 AM11/18/08
to pyglet-users
wow, this idea is very interesting!
since i've never used blit_into before, i would never have come up
with it on my own!

and yes, all animations are synchronous.

so this solves many issues.
with this method i only need enough space to store a single frame of
each object.
as my previous benchmarks show, even large maps will fit into my
limited memory resources when the additional animation frames aren't
loaded too.
i also don't have to use add()/delete() anymore, because i can keep
all vertex lists around all the time and only resize them as
necessary.

i also don't have to resize anything or change the texture region on
animation anymore (no touching of the batch, as far as i understand
it - which is good), because i only blit_into the texture and the
result is drawn automatically on the next frame.

now i only hope blit_into is fast enough for this. i just have to
figure out how best to implement it, as blitting into a texture is
new to me.
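
something like this little frame clock should be enough to drive the
synchronous animations (just a sketch, all names made up; the 6 fps
rate is only an example):

```python
# one shared clock works because all animations are synchronous:
# every object shows frame `clock.frame % len(its_frames)` of its
# own sequence.

class FrameClock:
    def __init__(self, fps=6):
        self.interval = 1.0 / fps
        self.frame = 0
        self._elapsed = 0.0

    def update(self, dt):
        """Accumulate dt; return True when the global frame advanced."""
        self._elapsed += dt
        advanced = False
        while self._elapsed >= self.interval:
            self._elapsed -= self.interval
            self.frame += 1
            advanced = True
        return advanced
```

whenever update(dt) returns True i would do the blit_into calls for
every object; the next draw then picks up the new frames
automatically.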

josch

Nov 19, 2008, 10:59:21 PM11/19/08
to pyglet-users
okay, it works now, but it is way too slow.

depending on the number of blit_to_texture calls (after binding the
target texture), this takes 0.2 to 0.25 seconds for 200-250 blits.

i assume this is because i store the animation frames as
AbstractImages, so the blit_to_texture method of ImageData is
called, which does a lot of gl stuff before it calls
glTexSubImage2D.

is this the right way to blit an AbstractImage into my texture? it
has to be, as having a texture to blit from wouldn't make any sense
at all.

next i will try to rewrite ImageData's blit_to_texture method for my
needs, which will hopefully improve performance :(

josch

Nov 20, 2008, 12:38:29 AM11/20/08
to pyglet-users
by implementing my own blit_to_texture and leaving out everything
that isn't necessary in my setup (i always have RGBA data and want
to use glTexSubImage2D), i got some minor improvements that aren't
worth the effort of modifying this method - i still get around 4fps.
the real culprit is the _convert() section, which i replaced with
the code that is actually executed on my system:
a lot of nasty regex and string operations :(
(apparently the image data is stored upside down, and as a ctypes
array rather than a string, so both have to be corrected)

imagedata = obj.next_frame()

data_pitch = abs(imagedata._current_pitch)

# Workaround: don't use GL_UNPACK_ROW_LENGTH
if pyglet.gl.current_context._workaround_unpack_row_length:
    data_pitch = imagedata.width * 4

###### start costly
imagedata._ensure_string_data()
data = imagedata._current_data
rows = re.findall('.' * data_pitch, data, re.DOTALL)
rows.reverse()
data = ''.join(rows)
###### end costly

if data_pitch & 0x1:
    alignment = 1
elif data_pitch & 0x2:
    alignment = 2
else:
    alignment = 4
row_length = data_pitch / 4  # pixels per row (4 bytes per RGBA pixel)
pyglet.gl.glPushClientAttrib(pyglet.gl.GL_CLIENT_PIXEL_STORE_BIT)
pyglet.gl.glPixelStorei(pyglet.gl.GL_UNPACK_ALIGNMENT, alignment)
pyglet.gl.glPixelStorei(pyglet.gl.GL_UNPACK_ROW_LENGTH, row_length)

pyglet.gl.glTexSubImage2D(obj.tex.owner.target, obj.tex.owner.level,
                          obj.tex.x, obj.tex.y,
                          imagedata.width, imagedata.height,
                          pyglet.gl.GL_RGBA,  # the data is RGBA, not RGB
                          pyglet.gl.GL_UNSIGNED_BYTE,
                          data)
pyglet.gl.glPopClientAttrib()

# Flush image upload before data gets GC'd.
pyglet.gl.glFlush()

of course i could prepare my images up front so that they don't have
to be converted on every blit in the first place.
but even when i leave out the convert code (drawing map objects
upside down), i only get around 18fps while doing just 120 blits
every 1/6 second - so even the blitting itself seems slow.
the original fps without animation is 55.
maybe i should try my code on other hardware in the next few days...
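
for the record, the costly regex-based row flip can be done with
plain slicing instead (my sketch, not pyglet's code):

```python
# replace the re.findall row split with slicing - same result, much
# cheaper.  data is a byte string; pitch is the row length in bytes.

def flip_rows(data, pitch):
    """Reverse the row order of raw image data with row length pitch."""
    rows = [data[i:i + pitch] for i in range(0, len(data), pitch)]
    rows.reverse()
    return b''.join(rows)
```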

swiftcoder

Nov 20, 2008, 7:59:04 PM11/20/08
to pyglet-users
On Nov 20, 12:38 am, josch <j.scha...@web.de> wrote:
> (apparently the image data is turned upside down

It is more a matter of OpenGL's coordinate system being flipped. The
simple fix is to invert your texture coordinates while leaving the
image upside down, which corrects the issue with no performance
loss.
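
In code, the flip is just this (a sketch, assuming normalised
coordinates in a flat [u0, v0, u1, v1, ...] list):

```python
# Invert the v component of each (u, v) pair so the quad samples the
# upside-down texture data right side up.

def invert_v(tex_coords):
    out = list(tex_coords)
    for i in range(1, len(out), 2):
        out[i] = 1.0 - out[i]
    return out
```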

josch

Nov 20, 2008, 8:24:19 PM11/20/08
to pyglet-users
On Nov 21, 1:59 am, swiftcoder <swiftco...@gmail.com> wrote:
> The simple fix is to invert your texture coordinates while leaving
> the image upside down, which corrects the issue with no performance
> loss.

yeah, good idea - i will use that. but i have to process the image
data once anyway (i moved this to resource loading) to convert it
from a ctypes array to a binary string by calling
_ensure_string_data().

i also observed that i can blit my image data without these calls:

pyglet.gl.glPushClientAttrib(pyglet.gl.GL_CLIENT_PIXEL_STORE_BIT)
pyglet.gl.glPixelStorei(pyglet.gl.GL_UNPACK_ALIGNMENT, alignment)
pyglet.gl.glPixelStorei(pyglet.gl.GL_UNPACK_ROW_LENGTH, row_length)
pyglet.gl.glPopClientAttrib()

so all i do now is to call glBindTexture followed by multiple
glTexSubImage2D calls.
but it would be nice to know why these calls are there in the first
place and why it works for me without them.

somehow this is still slow (250 blits take 0.1 sec), but it's still
faster than it took my gfx hw to handle all the additional data in
vram with my previous memory-heavy approach.
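
my best guess at why those calls exist: GL_UNPACK_ALIGNMENT tells
the driver how each client-side row is padded (default 4), and
GL_UNPACK_ROW_LENGTH handles rows wider than the uploaded region.
tightly packed RGBA rows are always a multiple of 4 bytes, which
matches the defaults - presumably that's why it works for me without
them. pyglet's alignment rule restated as a sketch:

```python
# largest of 1, 2 or 4 that divides the row pitch in bytes; this is
# the value pyglet passes to glPixelStorei(GL_UNPACK_ALIGNMENT, ...).

def unpack_alignment(pitch):
    if pitch & 0x1:
        return 1
    if pitch & 0x2:
        return 2
    return 4
```

for width*4 RGBA rows the pitch is always divisible by 4, so the GL
default alignment of 4 is already correct.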

Simon

Dec 15, 2008, 3:27:19 PM12/15/08
to pyglet-users
Hi josch,

Did you find a way to enhance performance?

I did notice Alex spoke about blit_into
(http://www.pyglet.org/doc/api/pyglet.image.Texture-class.html#blit_into)
whilst you used blit_to_texture.
I don't know if this would affect performance, though.