Using zero-copy FW and gpu raster for Android.


Sohan Jyoti Ghosh

Jan 21, 2015, 6:11:21 AM
to graphi...@chromium.org

Hi,

 

I was going through some code in the old stock browser, where they use SurfaceTexture buffers and render using Ganesh.

They also set up an eglSurface/device for the tiles directly and map it to the Canvas for rendering.

http://osxr.org/android/source/external/webkit/Source/WebCore/platform/graphics/android/rendering/GaneshContext.cpp

https://android.googlesource.com/platform/external/webkit/+/11d7657ebc4dfb1462db8cdc8b48ffa25478e442/Source/WebCore/platform/graphics/android/rendering/GaneshRenderer.cpp


Has there been any experimentation on this front?

Do you foresee any performance improvement from this? Would it be possible with our current design?


I think this would let us avoid the intermediate texture and the cached tiles used for GPU raster. But that would in turn hurt performance on static pages.

What about the BindImage call that zero-copy does now; would we save the binding cost here?


Please let us know your thoughts on this.


Thanks,

Sohan

Alexandre Elias

Jan 21, 2015, 5:38:54 PM
to Sohan Jyoti Ghosh, graphics-dev
We shipped gralloc-based buffers (gralloc is the backing for SurfaceTexture) as the persistent tile backing in WebView in M30/M33.  So we've not only experimented with that, we've shipped it.  We then unshipped it because gralloc memory turned out to be expensive to lock/unlock every frame, it led to severe memory fragmentation issues on some drivers, and there is a global limit on grallocs due to having one or more filehandles for them.  Basically it's not well-suited for having large numbers of them.  Having a small queue of buffers used for upload, as the stock browser did, is a better fit, but that would create other pain points around queue management and we couldn't easily use it for Chrome because synchronous SurfaceTexture usage is not a public API.  We'd rather pursue approaches based on standard OpenGL features like PBOs.

Ganesh never shipped on the classic stock browser -- it was nowhere near polished enough to ship 3 years ago.  We're only just getting it to a shippable state today.  And speaking of Ganesh, the SurfaceTexture queue stuff is purely to improve texture upload performance compared to glTexSubImage so Ganesh actually makes work in that area less relevant.

Sohan Jyoti Ghosh

Jan 22, 2015, 4:23:12 AM
to graphi...@chromium.org
Thanks aelias!

We know that SurfaceTexture/gralloc is used in the current zero-copy and one-copy raster framework in Chrome for Android.

We were just trying to figure out whether we can get the best of both Ganesh and zero-copy together, and whether that is worth experimenting with at all.

Thanks,
Sohan

Alexandre Elias

Jan 22, 2015, 9:57:41 AM
to Sohan Jyoti Ghosh, graphics-dev

Well, Ganesh involves uploading textures for things like decoded images and various Skia caches.  It's a different workload from the software path: far fewer uploads overall, but in potentially bigger chunks that might cause jank when they do happen.  We'll evaluate how best to handle uploads when that bubbles up as one of the major remaining perf issues with Ganesh; I'm not sure a priori what upload approach will be best though.

sangh...@gmail.com

Jan 23, 2015, 1:43:59 AM
to graphi...@chromium.org, sohan...@samsung.com
Hi Alexandre,
> Having a small queue of buffers used for upload, as the stock browser did, is a better fit, but that would create other pain points around queue management and we couldn't easily use it for Chrome because synchronous SurfaceTexture usage is not a public API.

=> I think this concept is what one-copy is in current Chromium. If you think there are pain points in one-copy, what is the reason for implementing it in Chrome now?

I wonder what Chrome's final fallback path for GPU raster is. Will you use pixel buffers or one-copy as the fallback?

Dongseong Hwang

Jan 23, 2015, 7:54:54 AM
to graphi...@chromium.org, sohan...@samsung.com, sangh...@gmail.com
PBOs might not be a solution because of Chromium's multi-process architecture. The GPU process cannot pass pinned memory (which is mapped into its virtual memory space) to the renderer process without some revolutionary idea.

David Reveman

Jan 23, 2015, 12:12:56 PM
to Dongseong Hwang, graphi...@chromium.org, sohan...@samsung.com, sangh...@gmail.com
On Fri Jan 23 2015 at 4:54:57 AM Dongseong Hwang <dongseo...@intel.com> wrote:


On Friday, January 23, 2015 at 8:43:59 AM UTC+2, sangh...@gmail.com wrote:
> Hi Alexandre,
> Having a small queue of buffers used for upload, as the stock browser did, is a better fit, but that would create other pain points around queue management and we couldn't easily use it for Chrome because synchronous SurfaceTexture usage is not a public API.
>
> => I think this concept is what one-copy is in current Chromium. If you think there are pain points in one-copy, what is the reason for implementing it in Chrome now?
>
> I wonder what Chrome's final fallback path for GPU raster is. Will you use pixel buffers or one-copy as the fallback?

I'm working on making one-copy using standard shared memory the default mechanism for initializing tile textures with software rasterized content. Pixelbuffer mechanism can hopefully be removed soon.

JungJik Lee

Jan 25, 2015, 11:52:51 PM
to graphi...@chromium.org, sohan...@samsung.com
Hi, could I ask about the memory fragmentation issues?
You said they occur on some drivers. Does this happen on Adreno or Mali? And does that mean Chromium has no plan to enable zero-copy by default?
The reason I ask is that I had expected the SurfaceTexture issue (a reference leak?) to be solved on Lollipop, and zero-copy to be turned on after L.
Thanks.

On Thursday, January 22, 2015 at 7:38:54 AM UTC+9, Alexandre Elias wrote:

Sangheo...@samsung.com

Jan 26, 2015, 12:56:16 AM
to graphi...@chromium.org, dongseo...@intel.com, sohan...@samsung.com, sangh...@gmail.com
On Saturday, January 24, 2015 at 2:12:56 AM UTC+9, David Reveman wrote:
> On Fri Jan 23 2015 at 4:54:57 AM Dongseong Hwang <dongseo...@intel.com> wrote:
>
>
>
> On Friday, January 23, 2015 at 8:43:59 AM UTC+2, sangh...@gmail.com wrote:Hi Alexandre
>
> Having a small queue of buffers used for upload, as the stock browser did, is a better fit, but that would create other pain points around queue management and we couldn't easily use it for Chrome because synchronous SurfaceTexture usage is not a public API.
>
>
>
> => i think this concept seem to be one-copy in current Chromium. if you think there is pain points in one-copy, what is reason to implement it in Chrome now  ?
>
>
>
> i wonder what is Chrome's final fallback path of GPU Raster ? Will you use pixelbuffer or one-copy for GPU Raster's fallback ?
>
>
> I'm working on making one-copy using standard shared memory the default mechanism for initializing tile textures with software rasterized content. Pixelbuffer mechanism can hopefully be removed soon.


=> What would be the benefit of one-copy + shared memory compared to pixel buffers?
In terms of memory, one-copy seems to use more memory because of the additional staging resource pool.
And I am not sure whether one-copy with shared memory would use AsyncUpload or not. If it does not use AsyncUpload, its performance would be worse than that of pixel buffers.



David Reveman

Jan 26, 2015, 9:58:11 AM
to Sangheo...@samsung.com, graphi...@chromium.org, dongseo...@intel.com, sohan...@samsung.com, sangh...@gmail.com
On Mon Jan 26 2015 at 12:56:18 AM <Sangheo...@samsung.com> wrote:
On Saturday, January 24, 2015 at 2:12:56 AM UTC+9, David Reveman wrote:
> On Fri Jan 23 2015 at 4:54:57 AM Dongseong Hwang <dongseo...@intel.com> wrote:
>
>
>
> On Friday, January 23, 2015 at 8:43:59 AM UTC+2, sangh...@gmail.com wrote:Hi Alexandre
>
> Having a small queue of buffers used for upload, as the stock browser did, is a better fit, but that would create other pain points around queue management and we couldn't easily use it for Chrome because synchronous SurfaceTexture usage is not a public API.
>
>
>
> => i think this concept seem to be one-copy in current Chromium. if you think there is pain points in one-copy, what is reason to implement it in Chrome now  ?

There might be configurations where SurfaceTextures (STs) work well, and we'll consider using them there. In general, as resolution increases, the benefit of using STs increases. And if not on Android, we'll use this mechanism on other platforms.
 
>
>
>
> i wonder what is Chrome's final fallback path of GPU Raster ? Will you use pixelbuffer or one-copy for GPU Raster's fallback ?
>
>
> I'm working on making one-copy using standard shared memory the default mechanism for initializing tile textures with software rasterized content. Pixelbuffer mechanism can hopefully be removed soon.


> => What would be the benefit of one-copy + shared memory compared to pixel buffers?
> In terms of memory, one-copy seems to use more memory because of the additional staging resource pool.
> And I am not sure whether one-copy with shared memory would use AsyncUpload or not. If it does not use AsyncUpload, its performance would be worse than that of pixel buffers.

1. Reduced complexity.
2. Better control over memory usage. Not more usage. The staging resource usage can more easily be controlled by the compositor instead of being hidden behind the GLES interface.
3. Provides a more efficient mechanism to update textures than uploads on platforms where supported without adding more complexity to the compositor.

FYI, one-copy on Android will have some async behavior, but it will be hidden from the compositor. Performance of one-copy + shared memory should be the same as or better than pixel buffers in all situations we care about. This is not the case yet, but we're working on it.
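
To make point 2 concrete, here is a minimal sketch of a one-copy style flow with a bounded staging pool. It is a toy model, not Chromium's implementation: `StagingPool`, `raster_tile`, and `one_copy_raster` are hypothetical names, a `bytearray` stands in for a shared memory segment, and a `bytes` copy stands in for the copy into the tile texture. The only thing it tries to show is that the owner of the pool (the compositor in this thread) decides how much staging memory exists.

```python
from collections import deque


class StagingPool:
    """Bounded pool of staging buffers (hypothetical, for illustration only)."""

    def __init__(self, max_buffers, buffer_size):
        self.max_buffers = max_buffers
        self.buffer_size = buffer_size
        self.free = deque()
        self.allocated = 0

    def acquire(self):
        # Reuse a free buffer if there is one; otherwise allocate up to the cap.
        if self.free:
            return self.free.popleft()
        if self.allocated < self.max_buffers:
            self.allocated += 1
            # The bytearray stands in for a shared memory segment.
            return bytearray(self.buffer_size)
        return None  # The caller has to wait until a buffer is released.

    def release(self, buf):
        self.free.append(buf)


def raster_tile(tile_id, buf):
    # Stand-in for software rasterization: fill the buffer with a per-tile byte.
    buf[:] = bytes([tile_id % 256]) * len(buf)


def one_copy_raster(tile_ids, pool):
    """Rasterize each tile into a staging buffer, then copy it once into its texture."""
    textures = {}
    for tile_id in tile_ids:
        staging = pool.acquire()
        if staging is None:
            raise RuntimeError("staging pool exhausted")
        raster_tile(tile_id, staging)
        textures[tile_id] = bytes(staging)  # the single copy: staging -> tile texture
        pool.release(staging)
    return textures


pool = StagingPool(max_buffers=2, buffer_size=16)
textures = one_copy_raster(range(8), pool)
```

In this model, eight tiles are initialized while the pool never holds more than `max_buffers` staging buffers, which is the kind of explicit limit that is harder to enforce when the staging memory lives behind the GLES interface.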

David

Pradeep Mishra

Nov 23, 2017, 11:31:31 AM
to Graphics-dev, Sangheo...@samsung.com, dongseo...@intel.com, sohan...@samsung.com, sangh...@gmail.com
Hi David,

Did you find a solution for sharing memory between the GPU and the CPU for software rasterization? I am able to render the bitmap in software (using a custom algorithm), but I take a performance hit when uploading the texture into GPU memory. Is there any way to hand the memory handle/address over to the GPU without uploading?

David Reveman

Dec 5, 2017, 10:43:59 AM
to Pradeep Mishra, Graphics-dev, Sangheo...@samsung.com, Dongseong Hwang, Sohan Jyoti Ghosh, Sangheon Kim
On Thu, Nov 23, 2017 at 12:10 AM, Pradeep Mishra <pradee...@gmail.com> wrote:
> Hi David,
>
> Did you find a solution for sharing memory between the GPU and the CPU for software rasterization? I am able to render the bitmap in software (using a custom algorithm), but I take a performance hit when uploading the texture into GPU memory. Is there any way to hand the memory handle/address over to the GPU without uploading?

You can use GMBs (GpuMemoryBuffers) for this, but it will only work on platforms where sharing memory between CPU and GPU is supported (Mac and ChromeOS today). A texture upload will take place automatically on other platforms.
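
The behavior described above can be modeled roughly as follows. This is a conceptual sketch only and does not use the real GpuMemoryBuffer API: `Texture`, `bind_buffer`, and `SHARED_MEMORY_PLATFORMS` are hypothetical names, and the platform set simply restates the two platforms named in this reply.

```python
# Platforms named in the reply as supporting CPU/GPU memory sharing (as of 2017).
SHARED_MEMORY_PLATFORMS = {"mac", "chromeos"}


class Texture:
    """Toy texture that records how its storage was populated."""

    def __init__(self):
        self.storage = None
        self.uploads = 0


def bind_buffer(platform, cpu_buffer, texture):
    """Hypothetical helper: attach software-rasterized pixels to a texture."""
    if platform in SHARED_MEMORY_PLATFORMS:
        # Zero-copy: the texture aliases the memory the CPU rasterized into.
        texture.storage = cpu_buffer
    else:
        # Fallback: an upload copies the pixels into separate GPU-side storage.
        texture.storage = bytearray(cpu_buffer)
        texture.uploads += 1
    return texture


pixels = bytearray(b"\x7f" * 16)
shared = bind_buffer("mac", pixels, Texture())
uploaded = bind_buffer("android", pixels, Texture())
```

Here `shared.storage` is the very buffer the CPU wrote to, while `uploaded.storage` is a separate copy, which mirrors the automatic upload on platforms without shared memory support.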