[Android WebView] The slow eglCreateImageKHR may hurt the overall graphics performance of WebView


Roger Yi

Mar 23, 2016, 5:21:13 AM
to Graphics-dev
Hi,

I am testing the graphics performance of the current Android WebView (50.0.2656.0) on devices such as the Galaxy Nexus (Android 4.3) and the Nexus 4 (Android 5.1).

Fast-scroll performance on this page (http://3g.sina.com.cn/) is OK on the N4 but unacceptable on the GN. The GN's CPU/GPU is older and weaker than the N4's, but the traces suggest that its extremely slow eglCreateImageKHR call may be the bottleneck: on the N4, eglCreateImageKHR takes about 1~2 ms, while on the GN it sometimes takes more than 10 ms. DoProduceTextureDirectCHROMIUM also seems to be called more often on the GN than on the N4. In addition, an eglCreateImageKHR call on Chrome's GPU thread usually blocks Android's GPU thread and causes dropped frames.

I am curious about the situations in which DoProduceTextureDirectCHROMIUM has to be called. Is there any chance of reducing how often it is needed?

PS:

To make WebView run on Android 4.x, I let the parent compositor run on Android's UI thread.
trace_galaxy_nexus.json
trace_nexus4.json

Bo Liu

Mar 23, 2016, 1:03:13 PM
to Roger Yi, Graphics-dev
This was discussed a bit here as well: https://groups.google.com/a/chromium.org/forum/#!topic/android-webview-dev/RDYN9kAx-PQ

WebView uses EGLImage to share textures between EGLContexts on different threads.

To hide the expensive eglCreateImageKHR, you can look into these:
* increase the tile size, assuming the cost of eglCreateImageKHR is fixed
* look into re-using textures along with mailboxes for delegated frames. I *think* textures are recycled right now, but the mailboxes aren't
* pay the cost of eglCreateImageKHR as part of upload rather than draw. In the compositor, this would mean also producing tile textures into mailboxes after upload. This would generate more mailboxes and significantly hurt upload performance, but maybe it's the right trade-off here.
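
To put rough numbers on the first option: if each eglCreateImageKHR call has a fixed cost regardless of tile size, the total cost scales with the number of tiles. The following is a minimal sketch of that arithmetic, not Chromium code. The function names, the 720x1280 viewport and the 10 ms per-call figure are assumptions chosen for illustration.

```cpp
#include <cassert>

// Number of tiles needed to cover one dimension (ceiling division).
int TilesAlong(int extent_px, int tile_px) {
  assert(extent_px > 0 && tile_px > 0);
  return (extent_px + tile_px - 1) / tile_px;
}

// Hypothetical cost model: every square tile that covers the viewport pays
// one eglCreateImageKHR call, and each call costs the same fixed time.
double ImageCreationCostMs(int viewport_w, int viewport_h, int tile_px,
                           double cost_per_call_ms) {
  const int tiles =
      TilesAlong(viewport_w, tile_px) * TilesAlong(viewport_h, tile_px);
  return tiles * cost_per_call_ms;
}
```

Under these assumptions, a 720x1280 viewport needs 3 x 5 = 15 tiles of 256 px (150 ms of image creation at 10 ms each) but only 2 x 3 = 6 tiles of 512 px (60 ms). If the driver's cost actually grows with tile area, the saving would be smaller.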

We won't be doing any of this upstream though, since performance on more modern devices is acceptable.

Btw, I'm very impressed you got this to work on 4.3

Roger Yi

Mar 24, 2016, 10:20:29 PM
to Graphics-dev, roge...@gmail.com, bo...@chromium.org
Hi Bo,

Thanks for your reply. Actually, backporting the current WebView to 4.x is not that difficult ^_^. I know EGLImage is used to share textures among different EGLContexts, and I understand the necessity of letting Chromium have its own GPU thread in the current design, as opposed to the previous design (m40) in which only Android's GPU thread was used. But to be honest, the compatibility and performance issues of EGLImage were what worried me most in the first place, and what happens on the Galaxy Nexus actually proves this...

What I am trying to figure out is:

1, Does the EGLImage for a texture only need to be created once during the texture's whole lifetime?
2, Or does the EGLImage need to be created each time the texture is updated?
3, Or does the EGLImage need to be created each time the resource attached to this texture is passed from the child cc to the parent cc?

From your reply, it seems the second or the third is true? If so, is the first achievable, and how can it be achieved?

On Thursday, March 24, 2016 at 1:03:13 AM UTC+8, Bo Liu wrote:

Bo Liu

Mar 24, 2016, 11:49:40 PM
to Roger Yi, Graphics-dev
On Thu, Mar 24, 2016 at 7:20 PM, Roger Yi <roge...@gmail.com> wrote:
> What I am trying to figure out is:
>
> 1, Does the EGLImage for a texture only need to be created once during the texture's whole lifetime?
> 2, Or does the EGLImage need to be created each time the texture is updated?
> 3, Or does the EGLImage need to be created each time the resource attached to this texture is passed from the child cc to the parent cc?

Case 1 actually, with correct synchronization.

I was wrong about re-using mailboxes. If you look at the implementation in MailboxManagerSync, the EGLImage is alive as long as any texture associated with it is alive. So it doesn't matter if mailbox is reused or not, as long as textures are reused.
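
The lifetime rule described above can be modeled with shared ownership. This is only a sketch of the rule, not the MailboxManagerSync implementation: `FakeEglImage`, `FakeTexture`, `ConsumeImage` and the counters are hypothetical stand-ins, and `std::shared_ptr` is used here simply because it expresses "alive as long as any holder is alive".

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an EGLImage. The counters record how many
// (expensive) creations happened and how many images are still alive.
struct FakeEglImage {
  static int created;
  static int alive;
  FakeEglImage() { ++created; ++alive; }
  ~FakeEglImage() { --alive; }
};
int FakeEglImage::created = 0;
int FakeEglImage::alive = 0;

// Hypothetical texture that shares ownership of its backing image.
struct FakeTexture {
  std::shared_ptr<FakeEglImage> image;

  // Creates the image on first use only; later calls reuse it.
  std::shared_ptr<FakeEglImage> ProduceImage() {
    if (!image) image = std::make_shared<FakeEglImage>();
    return image;
  }
};

// A second texture (e.g. in another context) referencing the same image.
FakeTexture ConsumeImage(const std::shared_ptr<FakeEglImage>& image) {
  return FakeTexture{image};
}

void DemoImageLifetime() {
  FakeTexture producer;
  FakeTexture consumer = ConsumeImage(producer.ProduceImage());
  producer.ProduceImage();            // texture reused: no second creation
  assert(FakeEglImage::created == 1);

  producer.image.reset();             // one texture goes away...
  assert(FakeEglImage::alive == 1);   // ...the other keeps the image alive
  consumer.image.reset();
  assert(FakeEglImage::alive == 0);   // last texture gone, image destroyed
}
```

In this model the mailbox name plays no part in the image's lifetime, which matches the point that only texture reuse matters.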

Looks like cc::ResourcePool keeps unused resources for only 1 second though. Previously it was keeping resources forever as long as everything is under the memory budget. New behavior helps with steady state memory a lot, but I guess not so much with texture reuse. I'm not sure how much of a change that was in practice.
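
To illustrate why a short expiration limits texture reuse, here is a simplified, hypothetical pool. It is not the cc::ResourcePool code: the class name, the methods and the explicit timestamps are assumptions made so that the behavior is easy to follow.

```cpp
#include <cassert>
#include <chrono>
#include <deque>

using Clock = std::chrono::steady_clock;

// Hypothetical pool: unused textures are kept for a fixed delay and then
// freed, so reuse only happens within that window.
class ExpiringPool {
 public:
  explicit ExpiringPool(std::chrono::milliseconds expiry) : expiry_(expiry) {}

  // Returns a texture id, reusing an idle one when available.
  int Acquire() {
    if (!idle_.empty()) {
      const int id = idle_.back().id;
      idle_.pop_back();
      return id;
    }
    ++created_;
    return next_id_++;
  }

  void Release(int id, Clock::time_point now) { idle_.push_back({id, now}); }

  // Frees idle textures that were released at least `expiry_` ago.
  void EvictExpired(Clock::time_point now) {
    while (!idle_.empty() && now - idle_.front().released >= expiry_)
      idle_.pop_front();
  }

  int created() const { return created_; }

 private:
  struct Idle {
    int id;
    Clock::time_point released;
  };
  std::deque<Idle> idle_;
  std::chrono::milliseconds expiry_;
  int created_ = 0;
  int next_id_ = 1;
};

void DemoExpiringPool() {
  using std::chrono::milliseconds;
  ExpiringPool pool(milliseconds(1000));
  const Clock::time_point t0 = Clock::now();

  const int a = pool.Acquire();               // first texture is created
  pool.Release(a, t0);
  pool.EvictExpired(t0 + milliseconds(500));
  const int b = pool.Acquire();               // within 1 s: reused
  assert(b == a && pool.created() == 1);

  pool.Release(b, t0 + milliseconds(600));
  pool.EvictExpired(t0 + milliseconds(2000));
  const int c = pool.Acquire();               // expired: created again
  assert(c != a && pool.created() == 2);
}
```

Every newly created texture in this model would also need a new EGLImage, so a scroll that pauses for longer than the expiry pays the creation cost again.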

Xing

Apr 4, 2016, 9:30:23 PM
to Graphics-dev
It seems you are running AwShellActivity rather than the system WebView. With the system WebView, the DrawFunctor runs on Android's RenderThread, while the DrawFunctor of AwShellActivity runs on the UI thread.
Could this lead to a performance difference?


On Wednesday, March 23, 2016 at 5:21:13 PM UTC+8, Roger Yi wrote:

Roger Yi

Apr 4, 2016, 9:48:17 PM
to Graphics-dev
Hi Xing,

1, It is a system WebView; to make it run on Android 4.x, I let the parent cc run on Android's UI thread.
2, There is indeed a performance difference between running on Android 4.x and on Android 5.0+, but as I said, this difference is not the bottleneck.

On Tuesday, April 5, 2016 at 9:30:23 AM UTC+8, Xing wrote:

Roger Yi

Apr 6, 2016, 1:56:02 AM
to Graphics-dev, roge...@gmail.com, bo...@chromium.org
Hi Bo,

I have tried some experiments to reduce the performance impact of eglCreateImageKHR, and got some interesting results.

I force the mailbox to be created before the texture is first written (the first rasterization for GPU rasterization, or the first upload for CPU rasterization). On a phone with a Tegra4, the cost of eglCreateImageKHR drops significantly, from 10+ ms to 0.x ms, and the overall performance improvement is perceivable.

My guess is that when the EGLImage is created before the texture has real image data, EGL can behave as if the data did not need to be preserved, and so saves the cost of the data copy, cache synchronization, or whatever else it would otherwise need to do.

Actually, calling eglCreateImageKHR with EGL_IMAGE_PRESERVED_KHR set to false gives the same result on Tegra4, although creating the EGLImage with EGL_IMAGE_PRESERVED_KHR set to false after the texture has real image data causes incorrect content.

So far this effect has only been observed on Tegra4; the Galaxy Nexus is still the same. But I think it is worth trying. BTW, the mailbox only needs to be created once during the texture's lifetime, so we are just changing when that happens.
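
The change in timing can be sketched as follows. This is a toy model with hypothetical names (`TileTexture`, `EnsureMailbox`, `Phase`), not the actual patch: `EnsureMailbox` stands in for the path that ends in eglCreateImageKHR, and the model only records when that one-time work happens and whether the texture was still empty at that point.

```cpp
#include <cassert>

// Hypothetical points in a tile's life at which the mailbox could be created.
enum class Phase { kNone, kAllocate, kSubmitFrame };

class TileTexture {
 public:
  // Stand-in for the path that ends in eglCreateImageKHR. It does the
  // expensive work at most once per texture; later calls are no-ops.
  void EnsureMailbox(Phase phase) {
    if (created_in_ != Phase::kNone) return;
    created_in_ = phase;
    created_while_empty_ = !has_content_;
  }

  // Stand-in for the first raster (GPU raster) or upload (CPU raster).
  void WriteContent() { has_content_ = true; }

  Phase created_in() const { return created_in_; }
  bool created_while_empty() const { return created_while_empty_; }

 private:
  Phase created_in_ = Phase::kNone;
  bool has_content_ = false;
  bool created_while_empty_ = false;
};

// Original timing: the mailbox is created when the frame is handed over.
TileTexture OriginalTiming() {
  TileTexture tile;
  tile.WriteContent();
  tile.EnsureMailbox(Phase::kSubmitFrame);
  return tile;
}

// Experimental timing: the mailbox is created before the first write.
TileTexture EarlyTiming() {
  TileTexture tile;
  tile.EnsureMailbox(Phase::kAllocate);
  tile.WriteContent();
  tile.EnsureMailbox(Phase::kSubmitFrame);  // no-op: already created
  return tile;
}

void DemoMailboxTiming() {
  assert(OriginalTiming().created_in() == Phase::kSubmitFrame);
  assert(!OriginalTiming().created_while_empty());
  assert(EarlyTiming().created_in() == Phase::kAllocate);
  assert(EarlyTiming().created_while_empty());
}
```

The number of creations per texture is the same in both variants. Only the moment changes, which is why the experiment is not expected to cost anything on drivers that do not benefit from it.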

On Friday, March 25, 2016 at 11:49:40 AM UTC+8, Bo Liu wrote:

Willy Yu (游佳偉)

Apr 6, 2016, 2:55:35 AM
to Roger Yi, Graphics-dev, bo...@chromium.org

Hi Roger,

I think this needs more testing on other platforms, because eglCreateImageKHR is platform-dependent: its cost is mainly determined by the GPU driver.

It may fix the slowness on Tegra4, but may not on other platforms.

Thanks


Roger Yi

Apr 6, 2016, 3:09:37 AM
to Graphics-dev, roge...@gmail.com, bo...@chromium.org
As I said, this change only alters when the texture's mailbox is created, so in theory it does no harm on other devices, whether or not they gain from it.

On Wednesday, April 6, 2016 at 2:55:35 PM UTC+8, Willy Yu wrote:

Willy Yu (游佳偉)

Apr 6, 2016, 3:26:43 AM
to Roger Yi, Graphics-dev, bo...@chromium.org

Got your point

Bo Liu

Apr 6, 2016, 1:11:34 PM
to Willy Yu (游佳偉), Roger Yi, Graphics-dev
On Wed, Apr 6, 2016 at 12:26 AM, Willy Yu (游佳偉) <Will...@mediatek.com> wrote:

> Got your point
>
> Thanks

 


Roger Yi

Apr 6, 2016, 10:57:51 PM
to Graphics-dev, Will...@mediatek.com, roge...@gmail.com, bo...@chromium.org
I am glad to help. Before I tried the 'create mailbox before the texture's first write' approach, my idea was to avoid blocking Android's GPU thread when a child frame is passed to the parent cc. You can also check whether this works; it may help a little, although not that obviously.

On Thursday, April 7, 2016 at 1:11:34 AM UTC+8, Bo Liu wrote:

Bo Liu

Apr 6, 2016, 11:14:25 PM
to Roger Yi, Graphics-dev, Willy Yu
You mean blocking on the sync point of the frame? That can't be avoided. Otherwise the resources in the frame may not be ready to draw which can cause rendering corruptions.

Roger Yi

Apr 6, 2016, 11:39:41 PM
to Graphics-dev, roge...@gmail.com, Will...@mediatek.com, bo...@chromium.org
My thinking is this: originally, the mailbox is created when the texture is passed from the child cc to the parent cc, and the parent cc must wait for the mailbox to be created, so the cost of eglCreateImageKHR causes a long block. When I change the timing of mailbox creation, if the texture does not need to be passed from the child cc to the parent cc in the current frame, the parent cc does not need to wait for this mailbox to be created. Maybe this helps a little in avoiding frame drops, but the parent cc still needs to wait on the sync point, so the benefit will be very small...


On Thursday, April 7, 2016 at 11:14:25 AM UTC+8, Bo Liu wrote:

Bo Liu

Apr 6, 2016, 11:51:01 PM
to Roger Yi, Graphics-dev, Willy Yu
You are describing the "pay the cost of eglCreateImageKHR as part of upload rather than draw" idea.

Bo Liu

Apr 6, 2016, 11:56:43 PM
to Roger Yi, Graphics-dev, Willy Yu
On Wed, Apr 6, 2016 at 8:50 PM, Bo Liu <bo...@chromium.org> wrote:
> You are describing the "pay the cost of eglCreateImageKHR as part of upload rather than draw" idea.

You are welcome to experiment with that idea in your fork. But at this point I'm not sure whether that's the right thing to do for upstream Chromium. I don't really have time right now to experiment with the potential trade-offs.

Willy Yu (游佳偉)

Apr 7, 2016, 9:10:02 AM
to bo...@chromium.org, Roger Yi, Graphics-dev

Another issue concerns the sync point.

The sync point may block the main or render thread because of GPU rasterization or HTML5 canvas raster work on the GPU thread.

If rasterization is slow on the GPU thread, the sync point must wait until the GPU has completed.

This is somewhat similar to the slow eglCreateImageKHR case, and there seems to be no better solution for it at the moment.

My thought is to query whether the texture resource is complete at AppendToQuad time.

Once it is complete, the resource can be sent to the parent; otherwise the compositor would not append it in AppendToQuad.

But:

1. It needs an interface to query whether a resource is ready.

2. Querying across the GPU thread may not be efficient.

Or is there any better solution?
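
A minimal sketch of the proposal above, under stated assumptions: `TileResource`, `NotifyReady` and `AppendReadyQuads` are hypothetical names, not cc interfaces, and the readiness flag is assumed to be set by the GPU thread when the tile's raster work completes. It only shows the filtering step, not how skipped tiles would be handled.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical tile resource. `ready` is assumed to be set from the GPU
// thread once the raster work for this tile has completed.
struct TileResource {
  explicit TileResource(int tile_id) : id(tile_id) {}
  void NotifyReady() { ready.store(true, std::memory_order_release); }

  int id;
  std::atomic<bool> ready{false};
};

// Returns the ids of the tiles that could go into the frame without
// waiting; tiles that are not ready yet are left out of this frame.
std::vector<int> AppendReadyQuads(
    const std::vector<const TileResource*>& tiles) {
  std::vector<int> quads;
  for (const TileResource* tile : tiles) {
    if (tile->ready.load(std::memory_order_acquire)) quads.push_back(tile->id);
  }
  return quads;
}

void DemoReadyFilter() {
  TileResource a(1), b(2);
  a.NotifyReady();                       // only tile 1 has finished raster
  const std::vector<int> quads = AppendReadyQuads({&a, &b});
  assert(quads.size() == 1 && quads[0] == 1);
}
```

Reading a flag avoids a blocking round trip to the GPU thread, which addresses the second concern in the list, but it leaves open what the frame shows in place of the tiles that were skipped.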

Bo Liu

Apr 7, 2016, 6:14:26 PM
to Willy Yu (游佳偉), Roger Yi, Graphics-dev
On Thu, Apr 7, 2016 at 6:09 AM, Willy Yu (游佳偉) <Will...@mediatek.com> wrote:

> Another issue concerns the sync point.
>
> The sync point may block the main or render thread because of GPU rasterization or HTML5 canvas raster work on the GPU thread.
>
> If rasterization is slow on the GPU thread, the sync point must wait until the GPU has completed.

Are you saying that frame N's sync point is waiting on work from frame N+1? That sounds like a scheduling problem to me. There are graphics folks working on better command buffer scheduling right now. See the "GPU Service Scheduler" thread.

If it's waiting for work for the current frame, then that just means that frame is taking a long time. Then there really is nothing we can do.
 

 

> This is somewhat similar to the slow eglCreateImageKHR case, and there seems to be no better solution for it at the moment.
>
> My thought is to query whether the texture resource is complete at AppendToQuad time.
>
> Once it is complete, the resource can be sent to the parent; otherwise the compositor would not append it in AppendToQuad.
>
> But:
>
> 1. It needs an interface to query whether a resource is ready.
>
> 2. Querying across the GPU thread may not be efficient.
>
> Or is there any better solution?


imo that's the wrong problem to solve. cc should not put resources that are not ready to draw in a frame. Slow ProduceTextureMailbox/eglCreateImageKHR is a bug itself that should be fixed. Should not introduce yet another complex system just to hide it.

We can't really introduce asynchronous or polling behavior everywhere just to avoid blocking, because then latency of the whole system will go through the roof.

(I'm making grand claims about the whole graphics system here. I'm sure someone will correct me if I'm wrong :p)

Willy Yu (游佳偉)

Apr 8, 2016, 7:35:51 AM
to bo...@chromium.org, Roger Yi, Graphics-dev

 

 

From: Bo Liu [mailto:bo...@chromium.org]
Sent: Friday, April 08, 2016 6:14 AM
To: Willy Yu (游佳偉)
Cc: Roger Yi; Graphics-dev
Subject: Re: [Android WebView] The slow eglCreateImageKHR may hurt the overall graphics performance of WebView

 

 

 


 

> Are you saying that frame N's sync point is waiting on work from frame N+1? That sounds like a scheduling problem to me. There are graphics folks working on better command buffer scheduling right now. See the "GPU Service Scheduler" thread.
>
> If it's waiting for work for the current frame, then that just means that frame is taking a long time. Then there really is nothing we can do.

Yes, this is what I meant.

I have checked the M51 WebView. It seems to have improved a lot.

I didn't find the render thread being blocked by the sync point; I will check further.

 

 


 

> We can't really introduce asynchronous or polling behavior everywhere just to avoid blocking, because then latency of the whole system will go through the roof.

It would not really mean polling every resource; it could be a notification that the texture resource is ready.

 
