Unsafe uses of base::RefCounted

186 views
Skip to first unread message

Éric Noyau

unread,
Mar 17, 2015, 12:27:38 PM3/17/15
to Chromium-dev, Kibeom Kim
tl;dr: base::RefCounted seems to be used for classes that are passed between threads, leading to possible corruptions and crashes.

While chasing an issue with an instance of gfx::Image send to another thread for resizing someone noticed a crash on Android. It turns out that gfx::Image uses an ImageStorage class which is subclassing base::RefCounted. The current best guess for the crash is that the refcount for imageStorage was indirectly incremented on one thread while being decremented on another:

1- gfx::Image is allocated on main thread, refcount of storage is 1
2- A copy of the gfx::Image is bound in a post task, refcount is 2
3- Image goes out of scope on main thread, refcount is 1
4- Image goes out of scope on resizing thread, refcount is 0, storage is deallocated.

The issue is that 3 and 4 are run on different thread, so they could be run in a different order, and end up executing at the same time. Slim chance, but possible. At this point I'm asking for guidance on how to move forward with this. Not being able to pass gfx::Image safely between threads was quite a surprise to me.

What worries me most is that it could be a quite common issue, as the refcount is indirect and as such not really visible to the developer passing an object between threads. To try to detect the extend of the problem I made a CL that makes base::RefCountedBase inherit from base::NonThreadSafe. This change causes a number of tests failures as they use non thread safe refcounted classes on multiple threads, most of the time indirectly.

It is surprising to me that I'm the first one to encounter this issue, so maybe I'm missing something obvious.

Thoughts?

-- Eric




Dana Jansens

unread,
Mar 17, 2015, 12:44:42 PM3/17/15
to no...@chromium.org, Chromium-dev, Kibeom Kim
I can't imagine that these are anything but bugs, and we should fix them to use RefCountedThreadSafe, or stop inc/decing refcounts on multiple threads.

Because RefCounted/ThreadSafe has to be inherited from, it's easy to change the usage of the class but not realize you needed to change its base class. I've seen the other way too where we had a RefCountedThreadSafe class used entirely on one thread. Less dangerous, just wasteful overhead.

It is surprising to me that I'm the first one to encounter this issue, so maybe I'm missing something obvious.

It certainly feels that way, doesn't it.. I'm supportive of DCHECKing that RefCounted is used correctly as you're doing.
 

Thoughts?

-- Eric




--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

Matt Giuca

unread,
Mar 17, 2015, 2:37:32 PM3/17/15
to dan...@chromium.org, no...@chromium.org, Robert Sesek, Chromium-dev, Kibeom Kim
I've got a CL out to make gfx::Image threadsafe by using RefCountedThreadSafe:

It's about 11 months old though and stagnated due to performance concerns (the reviewer, +rsesek, raised performance concerns and I didn't follow up -- sorry about that).

I'm happy to push ahead with this CL if we have sufficient motivation to make gfx::Image thread-safe, but I think it would warrant some kind of perf analysis, and I'm not sure what we'd do if we found that it was having an adverse impact on performance.

Robert Sesek

unread,
Mar 17, 2015, 2:41:02 PM3/17/15
to Matt Giuca, Dana Jansens, no...@chromium.org, Chromium-dev, Kibeom Kim
I'd be curious to know the answers to the questions I raised on that review. I do think there could be a perf hit with moving to RefCountedThreadSafe, but if people are passing gfx::Image across threads, then it's definitely something to consider.

Eric: Did you see this pattern in existing code, or were you trying to add new code that did this?

rsesek / @chromium.org

Dana Jansens

unread,
Mar 17, 2015, 2:41:02 PM3/17/15
to Matt Giuca, no...@chromium.org, Robert Sesek, Chromium-dev, Kibeom Kim
On Tue, Mar 17, 2015 at 11:36 AM, Matt Giuca <mgi...@chromium.org> wrote:
I've got a CL out to make gfx::Image threadsafe by using RefCountedThreadSafe:

It's about 11 months old though and stagnated due to performance concerns (the reviewer, +rsesek, raised performance concerns and I didn't follow up -- sorry about that).

I'm happy to push ahead with this CL if we have sufficient motivation to make gfx::Image thread-safe, but I think it would warrant some kind of perf analysis, and I'm not sure what we'd do if we found that it was having an adverse impact on performance.

The alternative is to stop reffing the class on multiple threads. We should do one or the other. (I'm all for simpler/less refcounting.)

Robert Sesek

unread,
Mar 17, 2015, 2:44:43 PM3/17/15
to Dana Jansens, Matt Giuca, no...@chromium.org, Chromium-dev, Kibeom Kim
On Tue, Mar 17, 2015 at 2:39 PM, Dana Jansens <dan...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 11:36 AM, Matt Giuca <mgi...@chromium.org> wrote:
I've got a CL out to make gfx::Image threadsafe by using RefCountedThreadSafe:

It's about 11 months old though and stagnated due to performance concerns (the reviewer, +rsesek, raised performance concerns and I didn't follow up -- sorry about that).

I'm happy to push ahead with this CL if we have sufficient motivation to make gfx::Image thread-safe, but I think it would warrant some kind of perf analysis, and I'm not sure what we'd do if we found that it was having an adverse impact on performance.

The alternative is to stop reffing the class on multiple threads. We should do one or the other. (I'm all for simpler/less refcounting.)

The class is internally ref counted, much like SkImage is, so that we don't copy image resource data unnecessarily. Making it RefCountedThreadSafe is reasonable if we need to do it. I just don't know what the performance tradeoff is.

Matt Giuca

unread,
Mar 17, 2015, 2:49:17 PM3/17/15
to Dana Jansens, no...@chromium.org, Robert Sesek, Chromium-dev, Kibeom Kim
OK, I think we need two investigations:
1. The perf implications of gfx::Image switching to RefCountedThreadSafe, and
2. How easily we can "fix" all usages of gfx::Image to not rely on thread safety.

I've filed a bug http://crbug.com/468010 to track this issue. I'm travelling this week but I should be able to look at it next week. If anyone else does any investigating, please update that bug.

Éric Noyau

unread,
Mar 17, 2015, 3:09:09 PM3/17/15
to Robert Sesek, Matt Giuca, Dana Jansens, Chromium-dev, Kibeom Kim
On Tue, Mar 17, 2015 at 7:38 PM, Robert Sesek <rse...@chromium.org> wrote:
I'd be curious to know the answers to the questions I raised on that review. I do think there could be a perf hit with moving to RefCountedThreadSafe, but if people are passing gfx::Image across threads, then it's definitely something to consider.

Eric: Did you see this pattern in existing code, or were you trying to add new code that did this?

Existing code, used only in mobile: components/enhanced_bookmarks/bookmark_image_service and its two concrete subclasses. Note that in this particular case the Image is considered immutable, the resize builds and returns a new image, and is not changing the existing image in any way (Some of the things that Matt's CL fixes are more related to const state, which is a separate issue from what I see here). 
 
-- Eric

Nicolas Zea

unread,
Mar 17, 2015, 4:03:50 PM3/17/15
to no...@chromium.org, Robert Sesek, Matt Giuca, Dana Jansens, Chromium-dev, Kibeom Kim, will...@chomium.org
FYI, Will Chan I believe did a similar investigation (cost of using RefCounted vs RefCountedThreadSafe everywhere) some years ago. See https://groups.google.com/a/chromium.org/forum/#!msg/chromium-dev/5OBvPvCctAw/udfXE82jYF0J for background.

To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.

Pawel

unread,
Mar 17, 2015, 9:23:45 PM3/17/15
to chromi...@chromium.org, no...@chromium.org, kkim...@chromium.org
One issue is, from my personal experience, not too many people actually test with DCHECKs on. This is a personal pet peeve of mine, as I run into DCHECKs hitting (on CrOS at least) pretty often during my own testing, which I almost always do on Debug.

I think we don't have any bots running tests with DCHECKs on?

Nico Weber

unread,
Mar 17, 2015, 9:29:00 PM3/17/15
to Pawel Osciak, Chromium-dev, no...@chromium.org, Kibeom Kim
On Tue, Mar 17, 2015 at 9:23 PM, Pawel <pos...@chromium.org> wrote:
On Wednesday, March 18, 2015 at 1:44:42 AM UTC+9, Dana Jansens wrote:
On Tue, Mar 17, 2015 at 9:26 AM, Éric Noyau <no...@chromium.org> wrote:
It is surprising to me that I'm the first one to encounter this issue, so maybe I'm missing something obvious.

It certainly feels that way, doesn't it.. I'm supportive of DCHECKing that RefCounted is used correctly as you're doing.

One issue is, from my personal experience, not too many people actually test with DCHECKs on. This is a personal pet peeve of mine, as I run into DCHECKs hitting (on CrOS at least) pretty often during my own testing, which I almost always do on Debug.

I think we don't have any bots running tests with DCHECKs on?

Huh? All debug bots on the main waterfall run with DCHECKs on. All trybots run tests in release with dcheck_always_on. We have tons of bots running tests with dchecks. Maybe the CrOS code doesn't have good test coverage?

Pawel Osciak

unread,
Mar 17, 2015, 9:32:13 PM3/17/15
to Nico Weber, Chromium-dev, no...@chromium.org, Kibeom Kim
My apologies, I stand corrected. This is great to hear. Not enough test coverage might be the case, and also I think because we don't run ARM CrOS tests (I might be wrong again here though) and we don't run on real CrOS machines in CQ (this I know to be true).

oshima

unread,
Mar 17, 2015, 9:32:42 PM3/17/15
to Dana Jansens, Matt Giuca, no...@chromium.org, Robert Sesek, Chromium-dev, Kibeom Kim
On Tue, Mar 17, 2015 at 11:39 AM, Dana Jansens <dan...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 11:36 AM, Matt Giuca <mgi...@chromium.org> wrote:
I've got a CL out to make gfx::Image threadsafe by using RefCountedThreadSafe:

It's about 11 months old though and stagnated due to performance concerns (the reviewer, +rsesek, raised performance concerns and I didn't follow up -- sorry about that).

I'm happy to push ahead with this CL if we have sufficient motivation to make gfx::Image thread-safe, but I think it would warrant some kind of perf analysis, and I'm not sure what we'd do if we found that it was having an adverse impact on performance.

The alternative is to stop reffing the class on multiple threads. We should do one or the other. (I'm all for simpler/less refcounting.)

+1. gfx::Image itself isn't thread safe, so using RefCountedThreadSafe won't help. ImageStorage should implement NonThreadSafe to catch such usage.

Éric Noyau

unread,
Mar 18, 2015, 6:17:24 AM3/18/15
to Nicolas Zea, Robert Sesek, Matt Giuca, Dana Jansens, Chromium-dev, Kibeom Kim, will...@chomium.org

On Tue, Mar 17, 2015 at 9:03 PM, Nicolas Zea <z...@google.com> wrote:
FYI, Will Chan I believe did a similar investigation (cost of using RefCounted vs RefCountedThreadSafe everywhere) some years ago. See https://groups.google.com/a/chromium.org/forum/#!msg/chromium-dev/5OBvPvCctAw/udfXE82jYF0J for background.

Very relevant thread. Thanks for the link.

-- Eric

Éric Noyau

unread,
Mar 18, 2015, 6:19:49 AM3/18/15
to oshima, Dana Jansens, Matt Giuca, Robert Sesek, Chromium-dev, Kibeom Kim
On Wed, Mar 18, 2015 at 2:31 AM, oshima <osh...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 11:39 AM, Dana Jansens <dan...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 11:36 AM, Matt Giuca <mgi...@chromium.org> wrote:
I've got a CL out to make gfx::Image threadsafe by using RefCountedThreadSafe:

It's about 11 months old though and stagnated due to performance concerns (the reviewer, +rsesek, raised performance concerns and I didn't follow up -- sorry about that).

I'm happy to push ahead with this CL if we have sufficient motivation to make gfx::Image thread-safe, but I think it would warrant some kind of perf analysis, and I'm not sure what we'd do if we found that it was having an adverse impact on performance.

The alternative is to stop reffing the class on multiple threads. We should do one or the other. (I'm all for simpler/less refcounting.)

+1. gfx::Image itself isn't thread safe, so using RefCountedThreadSafe won't help. ImageStorage should implement NonThreadSafe to catch such usage.


The consequences of doing something like this are dire. This would prevent any code from sending an image to another thread to perform tasks such as:
  • Calculate its dominant color
  • Generate a resized version
  • Generate a desaturated version
  • <whatever my UI designer is going to come up next>
  • Prepare any kind of animation
If we make gfx::Image DCHECK on thread boundaries all that will be achieved is to force all this expensive computation to happen on the main thread, a very big nono in my book. Not to mention break a lot of code.

Being able to work on images on a separate thread is one of the main tenet of modern UI, especially on mobile. Maybe this usage should be made more explicit in the API, and only allowed for const Images, but it must be allowed, otherwise the nice smooth animations and the long lists of images scrolling effortlessly are going to be impossible to achieve.

-- Eric
 

James Robinson

unread,
Mar 18, 2015, 12:20:47 PM3/18/15
to no...@chromium.org, oshima, Dana Jansens, Matt Giuca, Robert Sesek, Chromium-dev, Kibeom Kim
On Wed, Mar 18, 2015 at 3:18 AM, Éric Noyau <no...@chromium.org> wrote:


On Wed, Mar 18, 2015 at 2:31 AM, oshima <osh...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 11:39 AM, Dana Jansens <dan...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 11:36 AM, Matt Giuca <mgi...@chromium.org> wrote:
I've got a CL out to make gfx::Image threadsafe by using RefCountedThreadSafe:

It's about 11 months old though and stagnated due to performance concerns (the reviewer, +rsesek, raised performance concerns and I didn't follow up -- sorry about that).

I'm happy to push ahead with this CL if we have sufficient motivation to make gfx::Image thread-safe, but I think it would warrant some kind of perf analysis, and I'm not sure what we'd do if we found that it was having an adverse impact on performance.

The alternative is to stop reffing the class on multiple threads. We should do one or the other. (I'm all for simpler/less refcounting.)

+1. gfx::Image itself isn't thread safe, so using RefCountedThreadSafe won't help. ImageStorage should implement NonThreadSafe to catch such usage.


The consequences of doing something like this are dire. This would prevent any code from sending an image to another thread to perform tasks such as:

No, it would mean you couldn't use gfx::Image to do these tasks.  There are many data types that can represent sending image data around.

- James

Jeffrey Yasskin

unread,
Mar 18, 2015, 1:06:27 PM3/18/15
to Mitsuru Oshima, Dana Jansens, Matt Giuca, no...@chromium.org, Robert Sesek, Chromium-dev, Kibeom Kim
On Tue, Mar 17, 2015 at 6:31 PM, oshima <osh...@chromium.org> wrote:


On Tue, Mar 17, 2015 at 11:39 AM, Dana Jansens <dan...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 11:36 AM, Matt Giuca <mgi...@chromium.org> wrote:
I've got a CL out to make gfx::Image threadsafe by using RefCountedThreadSafe:

It's about 11 months old though and stagnated due to performance concerns (the reviewer, +rsesek, raised performance concerns and I didn't follow up -- sorry about that).

I'm happy to push ahead with this CL if we have sufficient motivation to make gfx::Image thread-safe, but I think it would warrant some kind of perf analysis, and I'm not sure what we'd do if we found that it was having an adverse impact on performance.

The alternative is to stop reffing the class on multiple threads. We should do one or the other. (I'm all for simpler/less refcounting.)

+1. gfx::Image itself isn't thread safe, so using RefCountedThreadSafe won't help. ImageStorage should implement NonThreadSafe to catch such usage.

Since "thread safe" is a really ambiguous term, can you clarify what you mean? Assuming it's changed to use RefCountedThreadSafe under the covers, is it invalid to read a gfx::Image from multiple threads at once, or is it only invalid to modify the image from multiple threads at once? Are you worrying about the fact that it's refcounted, so a read-only guarantee has to be about all copies, which might be too difficult to ensure?

oshima

unread,
Mar 18, 2015, 1:24:47 PM3/18/15
to Jeffrey Yasskin, Dana Jansens, Matt Giuca, no...@chromium.org, Robert Sesek, Chromium-dev, Kibeom Kim
On Wed, Mar 18, 2015 at 10:05 AM, Jeffrey Yasskin <jyas...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 6:31 PM, oshima <osh...@chromium.org> wrote:


On Tue, Mar 17, 2015 at 11:39 AM, Dana Jansens <dan...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 11:36 AM, Matt Giuca <mgi...@chromium.org> wrote:
I've got a CL out to make gfx::Image threadsafe by using RefCountedThreadSafe:

It's about 11 months old though and stagnated due to performance concerns (the reviewer, +rsesek, raised performance concerns and I didn't follow up -- sorry about that).

I'm happy to push ahead with this CL if we have sufficient motivation to make gfx::Image thread-safe, but I think it would warrant some kind of perf analysis, and I'm not sure what we'd do if we found that it was having an adverse impact on performance.

The alternative is to stop reffing the class on multiple threads. We should do one or the other. (I'm all for simpler/less refcounting.)

+1. gfx::Image itself isn't thread safe, so using RefCountedThreadSafe won't help. ImageStorage should implement NonThreadSafe to catch such usage.

Since "thread safe" is a really ambiguous term, can you clarify what you mean? Assuming it's changed to use RefCountedThreadSafe under the covers, is it invalid to read a gfx::Image from multiple threads at once, or is it only invalid to modify the image from multiple threads at once? Are you worrying about the fact that it's refcounted, so a read-only guarantee has to be about all copies, which might be too difficult to ensure?

RefCountedThreadSafe simply make ref counting thread safe, but not the class itself, and "reading" gfx::Image may modify internal state. For example, AsXxx can methods modifies the internal ImageStorage, so two threads calling these methods can lead to bad state.

Alexander Potapenko

unread,
Mar 18, 2015, 1:28:13 PM3/18/15
to no...@chromium.org, Chromium-dev, Kibeom Kim
Shameless plug: I think this issue has been reported by
ThreadSanitizer 1 year ago (https://crbug.com/364006)

> Thoughts?
>
> -- Eric
>
>
>
>
> --
> --
> Chromium Developers mailing list: chromi...@chromium.org
> View archives, change email options, or unsubscribe:
> http://groups.google.com/a/chromium.org/group/chromium-dev



--
Alexander Potapenko
Software Engineer
Google Moscow

Peter Kasting

unread,
Mar 18, 2015, 8:27:35 PM3/18/15
to Mitsuru Oshima, Jeffrey Yasskin, Dana Jansens, Matt Giuca, Eric Noyau, Robert Sesek, Chromium-dev, Kibeom Kim
On Wed, Mar 18, 2015 at 10:23 AM, oshima <osh...@chromium.org> wrote:

On Wed, Mar 18, 2015 at 10:05 AM, Jeffrey Yasskin <jyas...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 6:31 PM, oshima <osh...@chromium.org> wrote:


On Tue, Mar 17, 2015 at 11:39 AM, Dana Jansens <dan...@chromium.org> wrote:
On Tue, Mar 17, 2015 at 11:36 AM, Matt Giuca <mgi...@chromium.org> wrote:
I've got a CL out to make gfx::Image threadsafe by using RefCountedThreadSafe:

It's about 11 months old though and stagnated due to performance concerns (the reviewer, +rsesek, raised performance concerns and I didn't follow up -- sorry about that).

I'm happy to push ahead with this CL if we have sufficient motivation to make gfx::Image thread-safe, but I think it would warrant some kind of perf analysis, and I'm not sure what we'd do if we found that it was having an adverse impact on performance.

The alternative is to stop reffing the class on multiple threads. We should do one or the other. (I'm all for simpler/less refcounting.)

+1. gfx::Image itself isn't thread safe, so using RefCountedThreadSafe won't help. ImageStorage should implement NonThreadSafe to catch such usage.

Since "thread safe" is a really ambiguous term, can you clarify what you mean? Assuming it's changed to use RefCountedThreadSafe under the covers, is it invalid to read a gfx::Image from multiple threads at once, or is it only invalid to modify the image from multiple threads at once? Are you worrying about the fact that it's refcounted, so a read-only guarantee has to be about all copies, which might be too difficult to ensure?

RefCountedThreadSafe simply make ref counting thread safe, but not the class itself, and "reading" gfx::Image may modify internal state. For example, AsXxx can methods modifies the internal ImageStorage, so two threads calling these methods can lead to bad state.

Still, that doesn't mean one has to go to NonThreadSafe, if doing that prevents using gfx::Image from multiple threads at all.

Instead, one could simply ensure that the gfx::Image is only ever accessed on one thread at a time.  If you want to pass it to another thread, make sure you lose access to it and don't regain it until the thread passes it back.  Such a model would probably make sense for some of the tasks Eric cited as making sense for a background thread.  Or put a lock around it to guarantee exclusive access, if that makes sense.

These would be analogous to how we treat most data types in most classes today.

PK

Daniel Cheng

unread,
Mar 18, 2015, 8:38:16 PM3/18/15
to pkas...@chromium.org, Mitsuru Oshima, Jeffrey Yasskin, Dana Jansens, Matt Giuca, Eric Noyau, Robert Sesek, Chromium-dev, Kibeom Kim

Would making something RefCountedThreadSafe give the impression it's safe to use on multiple threads even when it isn't?

Daniel

(I seem to recall a similar thread in the past but I'm having trouble finding it)


--

Peter Kasting

unread,
Mar 18, 2015, 8:39:48 PM3/18/15
to Daniel Cheng, Mitsuru Oshima, Jeffrey Yasskin, Dana Jansens, Matt Giuca, Eric Noyau, Robert Sesek, Chromium-dev, Kibeom Kim
On Wed, Mar 18, 2015 at 5:37 PM, Daniel Cheng <dch...@chromium.org> wrote:

Would making something RefCountedThreadSafe give the impression it's safe to use on multiple threads even when it isn't?

I wouldn't make something RefCountedThreadSafe if it's not safe to use on multiple threads.

IOW, I wouldn't make the internal storage of gfx::Image RefCountedThreadSafe if we can instead safely use it from multiple threads by ensuring only one thread is accessing it at a time.  In that case RefCounted storage should work correctly, unless I'm mistaken about how RefCounted works. 

Daniel Cheng

unread,
Mar 18, 2015, 8:44:03 PM3/18/15
to Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Dana Jansens, Matt Giuca, Eric Noyau, Robert Sesek, Chromium-dev, Kibeom Kim

What I'm thinking about is refcount issues like this:
other_thread->PostTask(..., base::Bind(&DoImageStuff, image));
// Current thread and other thread release refs in a racy manner.

Daniel

Dana Jansens

unread,
Mar 18, 2015, 8:49:44 PM3/18/15
to Daniel Cheng, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Matt Giuca, Eric Noyau, Robert Sesek, Chromium-dev, Kibeom Kim
On Wed, Mar 18, 2015 at 5:43 PM, Daniel Cheng <dch...@chromium.org> wrote:

What I'm thinking about is refcount issues like this:
other_thread->PostTask(..., base::Bind(&DoImageStuff, image));
// Current thread and other thread release refs in a racy manner.

If you have to synchronize use of it across threads, maybe you don't need/want to be reffing it on the other thread at all? Not sure if that is really needed for this case or not, it just happens to be the default easiest behaviour to achieve.

John Bauman

unread,
Mar 18, 2015, 8:50:09 PM3/18/15
to Daniel Cheng, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Dana Jansens, Matt Giuca, Eric Noyau, Robert Sesek, Chromium-dev, Kibeom Kim
You could allocate a single gfx::Image on the heap and pass a scoped_ptr to it between threads, which is ugly but would prevent the refcounts from changing in this manner.

--

Matt Giuca

unread,
Mar 23, 2015, 10:33:52 PM3/23/15
to John Bauman, Daniel Cheng, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Dana Jansens, Eric Noyau, Robert Sesek, Chromium-dev, Kibeom Kim
I've done some more digging and filed a new bug about the general RefCounted thread safety (http://crbug.com/469952).

I think the best approach is to make RefCounted DCHECK when referenced from multiple threads (which is what Eric did in his patch referenced in the original email), but in order to not break Chrome (or block on fixing all of these issues), introduce a fallback class UnsafeRefCounted which can be used by any classes that do not pass the DCHECKs. This allows us to have a clearly visible list of transgressors that we can fix up over time.

I've put up a CL to do that:
(Not ready for review yet; need to fix test failures.)

(Again, apologies to Eric for making an alternative CL.)

Would folks be amenable to this approach? (Keep it broken in the short term, but clearly visible in the source code where it is broken.)

Kibeom Kim

unread,
Mar 24, 2015, 12:08:26 AM3/24/15
to Matt Giuca, John Bauman, Daniel Cheng, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Dana Jansens, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
It will be really nice to have RefCounted DCHECK, but one issue is, often we just want to pass the ownership to another thread, maybe using scoped_ptr as jbauman@ said (and I was going to fix the case Eric mentioned by scoped_ptr) and will fail. We should support that flow.

Matt Giuca

unread,
Mar 24, 2015, 3:46:17 AM3/24/15
to Kibeom Kim, John Bauman, Daniel Cheng, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Dana Jansens, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
I see, so you are suggesting that for gfx::Image (i.e., a class that has a scoped_refptr to a RefCounted), if you want to create it on Thread A and pass it to Thread B, you create it in a scoped_ptr and pass it via that mechanism. Thus, the Image never gets copied and so its internal refcount remains at 1 during the entire transfer process.

Although the refcount is incremented on Thread A and later manipulated on Thread B, it is thread-safe because the reference is never manipulated while the object is in transit between the threads.

Good point: this will violate the DCHECK I proposed but it should be perfectly thread-safe. I will have to think more about whether it is possible to have some kind of DCHECK to catch the legitimate threading violations. (That would be highly desirable, rather than having to wait for TSan to catch them.)

I want to make sure that we catch code like this:

void FunctionOnUIThread() {
  base::scoped_refptr<MyObject> my_object(new MyObject);
  BrowserThread::PostTask(
      BrowserThread::FILE,
      FROM_HERE,
      base::Bind(&FunctionOnFILEThread, my_object);
}

void FunctionOnFILEThread(base::scoped_refptr<MyObject> my_object) {
  g_my_object = my_object;
}

Because even though it *looks* benign (just passing an object via scoped_refptr from one thread to another), there is (I believe) a race to update the refcount on the two threads.

Dana Jansens

unread,
Mar 24, 2015, 1:18:09 PM3/24/15
to Matt Giuca, Kibeom Kim, John Bauman, Daniel Cheng, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Tue, Mar 24, 2015 at 12:45 AM, Matt Giuca <mgi...@chromium.org> wrote:
I see, so you are suggesting that for gfx::Image (i.e., a class that has a scoped_refptr to a RefCounted), if you want to create it on Thread A and pass it to Thread B, you create it in a scoped_ptr and pass it via that mechanism. Thus, the Image never gets copied and so its internal refcount remains at 1 during the entire transfer process.

I'm not sure but I think maybe the suggestion meant to wrap a reference in a object owned by a scoped_ptr, and pass that around? You need to ensure you pass it back to the original thread to destroy it though.

Putting a refcounted thing directly in a scoped_ptr is bad because it deletes it on release, but that violates the reference counting contract, so you can end up with use-after-frees.

Alternately, putting a RefCounted* in base::Unretained() prevents Bind from adding references to it, and you'll get back a raw pointer on the other side.

Kibeom Kim

unread,
Mar 24, 2015, 2:07:11 PM3/24/15
to Dana Jansens, Matt Giuca, John Bauman, Daniel Cheng, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
Sorry for not elaborating enough... :( Matt parsed correctly.


I was planning to allocate |image| in heap at the first place, put inside scoped_ptr, and pass by base::Passed. (At least for the meantime until we find a better solution, we're observing crashes due to this.)

Dana Jansens

unread,
Mar 24, 2015, 2:12:45 PM3/24/15
to Kibeom Kim, Matt Giuca, John Bauman, Daniel Cheng, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Tue, Mar 24, 2015 at 11:06 AM, Kibeom Kim <kkim...@chromium.org> wrote:
Sorry for not elaborating enough... :( Matt parsed correctly.


I was planning to allocate |image| in heap at the first place, put inside scoped_ptr, and pass by base::Passed. (At least for the meantime until we find a better solution, we're observing crashes due to this.)

OK That's a bit better since gfx::Image itself is not refcounted, but it still does have a copy constructor. So, FWIW this would still allow you to assign *scoped_gfx_image to something and end up reffing it on the other thread. While it introduces a level of indirection around doing the wrong thing, it's by no means making things bulletproof.

Daniel Cheng

unread,
Mar 24, 2015, 6:34:37 PM3/24/15
to Matt Giuca, Kibeom Kim, John Bauman, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Dana Jansens, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
We could do something similar to NonThreadSafe::DetachFromThread() that allows you to unbind from a thread, right?

Daniel

Dana Jansens

unread,
Mar 24, 2015, 7:23:11 PM3/24/15
to Matt Giuca, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Tue, Mar 24, 2015 at 4:12 PM, Matt Giuca <mgi...@google.com> wrote:
Daniel: I thought a bit about that (basically as a way to let me proceed with adding a DCHECK mechanism to catch thread violations, but allow hand-off from one thread to another). But the problem is it's hard to do the hand-off with layers of containers (like Image).

In fact, in my specific instance of this problem, I'm passing a ShortcutInfo to another thread. ShortcutInfo contains an ImageFamily which contains a map of Images which contain a scoped_refptr to an ImageStorage, which is RefCounted. So to do that hand-off correctly, I would have to add a DetachFromThread() delegate method to Image, ImageFamily and ShortcutInfo. That sounds onerous if everybody has to do that to get around the rather blunt DCHECKs.

Dana: Can you elaborate on "You need to ensure you pass it back to the original thread to destroy it though." I wasn't aware of this requirement; that sounds like creating an object on one thread and handing it off to another is not safe after all?

If the RefCounted class inherits from NonThreadSafe, then destroying it on the wrong thread will fire a DCHECK.

Generally, destroying things on a different thread, when they are not explicitly designed for this is a problem and causes bugs, so that DCHECK seems WAI. Handing it off to another thread to use until its done and gives it back should be safe. Other behaviours sound like they need the class to be thread-safe. Destruction is changing the ref-counts so it would require thread-safe atomic ref counting anyhow.

Matt Giuca

unread,
Mar 24, 2015, 7:50:09 PM3/24/15
to Dana Jansens, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
Oops, sent from wrong email address. Repeating my previous message:

Daniel: I thought a bit about that (basically as a way to let me proceed with adding a DCHECK mechanism to catch thread violations, but allow hand-off from one thread to another). But the problem is it's hard to do the hand-off with layers of containers (like Image).

In fact, in my specific instance of this problem, I'm passing a ShortcutInfo to another thread. ShortcutInfo contains an ImageFamily which contains a map of Images which contain a scoped_refptr to an ImageStorage, which is RefCounted. So to do that hand-off correctly, I would have to add a DetachFromThread() delegate method to Image, ImageFamily and ShortcutInfo. That sounds onerous if everybody has to do that to get around the rather blunt DCHECKs.

Dana: Can you elaborate on "You need to ensure you pass it back to the original thread to destroy it though." I wasn't aware of this requirement; that sounds like creating an object on one thread and handing it off to another is not safe after all?

On Wed, 25 Mar 2015 at 10:22 Dana Jansens <dan...@chromium.org> wrote:

If the RefCounted class inherits from NonThreadSafe, then destroying it on the wrong thread will fire a DCHECK.

Generally, destroying things on a different thread, when they are not explicitly designed for this is a problem and causes bugs, so that DCHECK seems WAI. Handing it off to another thread to use until its done and gives it back should be safe. Other behaviours sound like they need the class to be thread-safe. Destruction is changing the ref-counts so it would require thread-safe atomic ref counting anyhow.

Hmm, it doesn't *necessarily* require the class to be thread-safe, in that if you create a class on Thread A, safely hand it off to Thread B (even without the cooperation of that class), and then destroy it on Thread B, you aren't ever going to simultaneously run methods of that class on two threads. So even a non-thread-safe class should support that. But there is the possibility that the class will be specifically designed to be destroyed on a certain thread, so you are right in general.

I still think there's a valid case for creating an object on one thread and handing it off to another thread where it will be destroyed, and we should support that if we're going to introduce DCHECKs.

Dana Jansens

unread,
Mar 24, 2015, 7:56:57 PM3/24/15
to Matt Giuca, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Tue, Mar 24, 2015 at 4:49 PM, Matt Giuca <mgi...@chromium.org> wrote:
Oops, sent from wrong email address. Repeating my previous message:

Daniel: I thought a bit about that (basically as a way to let me proceed with adding a DCHECK mechanism to catch thread violations, but allow hand-off from one thread to another). But the problem is it's hard to do the hand-off with layers of containers (like Image).

In fact, in my specific instance of this problem, I'm passing a ShortcutInfo to another thread. ShortcutInfo contains an ImageFamily which contains a map of Images which contain a scoped_refptr to an ImageStorage, which is RefCounted. So to do that hand-off correctly, I would have to add a DetachFromThread() delegate method to Image, ImageFamily and ShortcutInfo. That sounds onerous if everybody has to do that to get around the rather blunt DCHECKs.

Dana: Can you elaborate on "You need to ensure you pass it back to the original thread to destroy it though." I wasn't aware of this requirement; that sounds like creating an object on one thread and handing it off to another is not safe after all?

On Wed, 25 Mar 2015 at 10:22 Dana Jansens <dan...@chromium.org> wrote:

If the RefCounted class inherits from NonThreadSafe, then destroying it on the wrong thread will fire a DCHECK.

Generally, destroying things on a different thread, when they are not explicitly designed for this is a problem and causes bugs, so that DCHECK seems WAI. Handing it off to another thread to use until its done and gives it back should be safe. Other behaviours sound like they need the class to be thread-safe. Destruction is changing the ref-counts so it would require thread-safe atomic ref counting anyhow.

Hmm, it doesn't *necessarily* require the class to be thread-safe, in that if you create a class on Thread A, safely hand it off to Thread B (even without the cooperation of that class), and then destroy it on Thread B, you aren't ever going to simultaneously run methods of that class on two threads. So even a non-thread-safe class should support that. But there is the possibility that the class will be specifically designed to be destroyed on a certain thread, so you are right in general.

I still think there's a valid case for creating an object on one thread and handing it off to another thread where it will be destroyed, and we should support that if we're going to introduce DCHECKs.

That sounds like a reasonable case for RefCountedThreadSafe? Or do you think that's too heavyweight for this use case?

Matt Giuca

unread,
Mar 24, 2015, 11:08:07 PM3/24/15
to Dana Jansens, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Mitsuru Oshima, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Wed, 25 Mar 2015 at 10:55 Dana Jansens <dan...@chromium.org> wrote:
On Tue, Mar 24, 2015 at 4:49 PM, Matt Giuca <mgi...@chromium.org> wrote:
Oops, sent from wrong email address. Repeating my previous message:

Daniel: I thought a bit about that (basically as a way to let me proceed with adding a DCHECK mechanism to catch thread violations, but allow hand-off from one thread to another). But the problem is it's hard to do the hand-off with layers of containers (like Image).

In fact, in my specific instance of this problem, I'm passing a ShortcutInfo to another thread. ShortcutInfo contains an ImageFamily which contains a map of Images which contain a scoped_refptr to an ImageStorage, which is RefCounted. So to do that hand-off correctly, I would have to add a DetachFromThread() delegate method to Image, ImageFamily and ShortcutInfo. That sounds onerous if everybody has to do that to get around the rather blunt DCHECKs.

Dana: Can you elaborate on "You need to ensure you pass it back to the original thread to destroy it though." I wasn't aware of this requirement; that sounds like creating an object on one thread and handing it off to another is not safe after all?

On Wed, 25 Mar 2015 at 10:22 Dana Jansens <dan...@chromium.org> wrote:

If the RefCounted class inherits from NonThreadSafe, then destroying it on the wrong thread will fire a DCHECK.

Generally, destroying things on a different thread, when they are not explicitly designed for this is a problem and causes bugs, so that DCHECK seems WAI. Handing it off to another thread to use until its done and gives it back should be safe. Other behaviours sound like they need the class to be thread-safe. Destruction is changing the ref-counts so it would require thread-safe atomic ref counting anyhow.

Hmm, it doesn't *necessarily* require the class to be thread-safe, in that if you create a class on Thread A, safely hand it off to Thread B (even without the cooperation of that class), and then destroy it on Thread B, you aren't ever going to simultaneously run methods of that class on two threads. So even a non-thread-safe class should support that. But there is the possibility that the class will be specifically designed to be destroyed on a certain thread, so you are right in general.

I still think there's a valid case for creating an object on one thread and handing it off to another thread where it will be destroyed, and we should support that if we're going to introduce DCHECKs.

That sounds like a reasonable case for RefCountedThreadSafe? Or do you think that's too heavyweight for this use case?

So my original proposal was just to make gfx::Image use RefCountedThreadSafe instead of trying to force all the clients to carefully avoid referencing Image objects on multiple threads at a time. (See my original CL: https://codereview.chromium.org/240293006/).

As a sort of straw-man, I have created a CL doing what Kibeom suggested for the images around app shortcut creation (which was my original stake in this matter):

So just as one data point: I think this was quite an excessive refactor and could present architectural problems in the future. I was forced to make the ShortcutInfo class (which wraps the gfx::Images) non-copyable and convert all clients to pass it around in a scoped_ptr. I had to follow the trail all the way back up into the UI level where the ShortcutInfo was being copied from, and rewrite the code so the ShortcutInfo gets moved in a scoped_ptr, rather than being copied. This makes it impossible to let the dialog box keep a copy of the image that's being written to disk (fortunately, we don't need to do this right now, but it's an example of something that becomes impossible under this scheme).

And we're talking about potentially every cross-thread usage of gfx::Image having to go through a similar process. Based on this single data point, I think it's much simpler to convert Image to use RefCountedThreadSafe.

oshima

unread,
Mar 25, 2015, 12:32:08 AM3/25/15
to Matt Giuca, Dana Jansens, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
As I mentioned before, gfx::Image (and other imaged stored in ImageStorage) isn't thread safe, so making gfx::Image (I assume what you really meant is ImageStorage, not gfx::Image itself though) RefCountedThreadSafe can increase confusion and misuse. Even if you make gfx::Image's copy thread-safe, image representations stored in the gfx::Image aren't thread safe, so any attempt to create a copy of gfx::ImageSkia in gfx::Image in different threads can cause similar issue.

I assume what you want to achieve is to safely pass an object to another thread? If so, how about creating generic wrapper class that simply move ownership on copy? It prevents a chance for the object to be accessed on two threads.

Daniel Cheng

unread,
Mar 25, 2015, 1:59:00 AM3/25/15
to oshima, Matt Giuca, Dana Jansens, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Wed, Mar 25, 2015 at 12:30 PM oshima <osh...@chromium.org> wrote:
As I mentioned before, gfx::Image (and other imaged stored in ImageStorage) isn't thread safe, so making gfx::Image (I assume what you really meant is ImageStorage, not gfx::Image itself though) RefCountedThreadSafe can increase confusion and misuse. Even if you make gfx::Image's copy thread-safe, image representations stored in the gfx::Image aren't thread safe, so any attempt to create a copy of gfx::ImageSkia in gfx::Image in different threads can cause similar issue.

I assume what you want to achieve is to safely pass an object to another thread? If so, how about creating generic wrapper class that simply move ownership on copy? It prevents a chance for the object to be accessed on two threads.

How is this different/better than scoped_ptr<gfx::Image>? Also, I think I'd be more comfortable if this transfer was explicit rather than on copy.

Matt Giuca

unread,
Mar 25, 2015, 2:01:08 AM3/25/15
to oshima, Dana Jansens, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Wed, 25 Mar 2015 at 15:30 oshima <osh...@chromium.org> wrote:
As I mentioned before, gfx::Image (and other imaged stored in ImageStorage) isn't thread safe, so making gfx::Image (I assume what you really meant is ImageStorage, not gfx::Image itself though) RefCountedThreadSafe can increase confusion and misuse. Even if you make gfx::Image's copy thread-safe, image representations stored in the gfx::Image aren't thread safe, so any attempt to create a copy of gfx::ImageSkia in gfx::Image in different threads can cause similar issue.

I think this is making perfect the enemy of good, so to speak.

Let's say all I'm trying to do is this (which is true for my use case in app shortcut creation):
1. Create an Image on Thread A.
2. Pass it to Thread B. Image will not be read or written again on Thread A.
3. Read the Image on Thread B.
4. Destroy the Image on Thread B.

As I see it, there are two ways to accomplish this without a data race:
a) Make the ImageStorage use RefCountedThreadSafe, or
b) Wrap the Image in a scoped_ptr and pass it to the other thread without touching the ref count.

Now, neither of these approaches solve the problem of the thread-safety of the underlying ImageSkia. Depending on your use case, that may or may not be a problem. In my use case, I don't think it is a problem, but in other use cases (concurrently reading or writing the image), it would still be a problem. Crucially, it's still a problem for both approach (a) and (b).

So the fact that (a) won't make Image completely thread-safe cannot be used as an argument in favour of (b). The fact is that (a) is a partial solution to the problem, and I think (having fully implemented both solutions) that (a) is a much simpler, less error-prone and general solution than (b).

I don't think we should let the fact that it is not a complete thread-safety solution stop us from going ahead with it as a partial solution (as long as the documentation for Image still states that it is not thread-safe).

Another thing we could do is add a MakeThreadSafe() method to gfx::Image which delegates to ImageSkia::MakeThreadSafe(). That way, we would truly have an optionally-thread-safe Image class. But we would need to make sure the bits in between ImageSkia and Image (i.e., ImageStorage's ref count) are also thread-safe.

Peter Kasting

unread,
Mar 25, 2015, 3:00:06 AM3/25/15
to Matt Giuca, Dana Jansens, Daniel Cheng, Kibeom Kim, John Bauman, Mitsuru Oshima, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Tue, Mar 24, 2015 at 8:07 PM, Matt Giuca <mgi...@chromium.org> wrote:
As a sort of straw-man, I have created a CL doing what Kibeom suggested for the images around app shortcut creation (which was my original stake in this matter):

So just as one data point: I think this was quite an excessive refactor and could present architectural problems in the future. I was forced to make the ShortcutInfo class (which wraps the gfx::Images) non-copyable and convert all clients to pass it around in a scoped_ptr. I had to follow the trail all the way back up into the UI level where the ShortcutInfo was being copied from, and rewrite the code so the ShortcutInfo gets moved in a scoped_ptr, rather than being copied. This makes it impossible to let the dialog box keep a copy of the image that's being written to disk (fortunately, we don't need to do this right now, but it's an example of something that becomes impossible under this scheme).

And we're talking about potentially every cross-thread usage of gfx::Image having to go through a similar process. Based on this single data point, I think it's much simpler to convert Image to use RefCountedThreadSafe.

In fairness, most classes aren't supposed to be copyable to begin with.  I wonder if the reason this feels somewhat painful is that we let classes be made copyable that never should have been.

I don't know enough to know whether that is accurate.

PK 

oshima

unread,
Mar 25, 2015, 3:50:53 AM3/25/15
to Daniel Cheng, Matt Giuca, Dana Jansens, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Tue, Mar 24, 2015 at 10:58 PM, Daniel Cheng <dch...@chromium.org> wrote:
On Wed, Mar 25, 2015 at 12:30 PM oshima <osh...@chromium.org> wrote:
As I mentioned before, gfx::Image (and other imaged stored in ImageStorage) isn't thread safe, so making gfx::Image (I assume what you really meant is ImageStorage, not gfx::Image itself though) RefCountedThreadSafe can increase confusion and misuse. Even if you make gfx::Image's copy thread-safe, image representations stored in the gfx::Image aren't thread safe, so any attempt to create a copy of gfx::ImageSkia in gfx::Image in different threads can cause similar issue.

I assume what you want to achieve is to safely pass an object to another thread? If so, how about creating generic wrapper class that simply move ownership on copy? It prevents a chance for the object to be accessed on two threads.

How is this different/better than scoped_ptr<gfx::Image>? Also, I think I'd be more comfortable if this transfer was explicit rather than on copy.


It's mostly for convenience. You can't use scoped_ptr as a member variable of copyable struct (and gfx::Image is designed to be copy-able). I know it's debatable, and I don't have strong opinion about which is better. I'm perfectly fine with scoped_ptr<gfx::Image>, although other people may feel the other way.

oshima

unread,
Mar 25, 2015, 3:55:37 AM3/25/15
to Peter Kasting, Matt Giuca, Dana Jansens, Daniel Cheng, Kibeom Kim, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
gfx::ImageSkia is designed to be copy-able, backed with RefCounted ImageStorage to share actual image data.
I don't know the history (It's already this way when I joined), but maybe because it's more convenient than directly using scoped_refptr everywhere? I'm not sure.

oshima

unread,
Mar 25, 2015, 4:15:45 AM3/25/15
to Matt Giuca, Dana Jansens, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Tue, Mar 24, 2015 at 11:00 PM, Matt Giuca <mgi...@chromium.org> wrote:
On Wed, 25 Mar 2015 at 15:30 oshima <osh...@chromium.org> wrote:
As I mentioned before, gfx::Image (and other imaged stored in ImageStorage) isn't thread safe, so making gfx::Image (I assume what you really meant is ImageStorage, not gfx::Image itself though) RefCountedThreadSafe can increase confusion and misuse. Even if you make gfx::Image's copy thread-safe, image representations stored in the gfx::Image aren't thread safe, so any attempt to create a copy of gfx::ImageSkia in gfx::Image in different threads can cause similar issue.

I think this is making perfect the enemy of good, so to speak.

Let's say all I'm trying to do is this (which is true for my use case in app shortcut creation):
1. Create an Image on Thread A.
2. Pass it to Thread B. Image will not be read or written again on Thread A.
3. Read the Image on Thread B.
4. Destroy the Image on Thread B.

As I see it, there are two ways to accomplish this without a data race:
a) Make the ImageStorage use RefCountedThreadSafe, or
b) Wrap the Image in a scoped_ptr and pass it to the other thread without touching the ref count.

Now, neither of these approaches solve the problem of the thread-safety of the underlying ImageSkia.
Depending on your use case, that may or may not be a problem. In my use case, I don't think it is a problem, but in other use cases (concurrently reading or writing the image), it would still be a problem. Crucially, it's still a problem for both approach (a) and (b).

Yes, that's because that's different issue. There are two kinds of "thread-safety".
1) an object can be passed/shared between multiple objects and can be safely destroyed.
  a) If you need to copy reference on multiple threads at the same time, you have to use RefCountedThreadSafe.
      This needs extra caution because "using" such object from multiple thread isn't guaranteed to be safe, unless
      the object is designed to be thread safe.
  b) If you just need to pass an object to another thread, and the object will only be used on that thread, you don't have to use RefCountedThreadSafe.

2) an object can be accessed from multiple threads safely. RefCountedThreadSafe won't solve this.
 
My understanding is that you're trying to solve 1.b).


So the fact that (a) won't make Image completely thread-safe cannot be used as an argument in favour of (b). The fact is that (a) is a partial solution to the problem, and I think (having fully implemented both solutions) that (a) is a much simpler, less error-prone and general solution than (b).

The code may look simpler but implication is not. I understand that you want to simplify the code, and that's why I suggested to create a wrapper. It'd be like unique_ptr, but limited. I believe it won't make the code much more complicated.


I don't think we should let the fact that it is not a complete thread-safety solution stop us from going ahead with it as a partial solution (as long as the documentation for Image still states that it is not thread-safe).

My point is that using RefCountedThreadSafe just to pass an object to another thread is overkill and potentially dangerous.
I prefer a design that prevents misuse than promote. just my 2c.

Matt Giuca

unread,
Mar 25, 2015, 6:46:07 PM3/25/15
to oshima, Dana Jansens, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
Yes, that's because that's different issue. There are two kinds of "thread-safety".
1) an object can be passed/shared between multiple objects and can be safely destroyed.
  a) If you need to copy reference on multiple threads at the same time, you have to use RefCountedThreadSafe.
      This needs extra caution because "using" such object from multiple thread isn't guaranteed to be safe, unless
      the object is designed to be thread safe.
  b) If you just need to pass an object to another thread, and the object will only be used on that thread, you don't have to use RefCountedThreadSafe.

2) an object can be accessed from multiple threads safely. RefCountedThreadSafe won't solve this.

Thanks for breaking it down into numbers and letters. Very useful for reference.

My understanding is that you're trying to solve 1.b).

Well, I'm technically trying to solve 1a, but my use case is fairly close to 1b.

That is: the current use of gfx::Image (in my use case of shortcut creation) is absolutely 1a, in that we have a reference to the gfx::Image in a dialog box. The dialog box passes a copy of the gfx::Image to the FILE thread, while still holding a reference to it. Then it closes, releasing its handle on the gfx::Image. (By "coincidence", it closes immediately after sending the image to the FILE thread, but we could easily keep the dialog open for further user input.)

So that is 1a "you need to copy reference on multiple threads at the same time", which is why I've been suggesting we use RefCountedThreadSafe. Maybe it's overkill to convert the entire Image class, but I'm guessing there are other uses of gfx::Image that also fall into 1a (because we are seeing TSan errors on other usages of gfx::Image besides mine).

Having said all that, my use case is reasonably close to 1b, in that I technically don't need the gfx::Image in the dialog box after passing it to the FILE thread. Converting my use case to 1b is the subject of https://codereview.chromium.org/1038573002/ which I mentioned earlier. My point is that it made my code quite a bit more constrained (in how it can be extended in the future). All other things being equal, I'd rather be allowed to write 1a uses of gfx::Image, than have all clients forced to do 1b.
 

The code may look simpler but implication is not. I understand that you want to simplify the code, and that's why I suggested to create a wrapper. It'd be like unique_ptr, but limited. I believe it won't make the code much more complicated.

You're basically suggesting that I use scoped_ptr. (I could make a wrapper to avoid the syntactic overhead of writing "scoped_ptr" everywhere.) The issue I have is not syntactic overhead. It's that throughout the entire codebase, every object that holds a reference to a gfx::Image that might be transferred to another thread at some point in the future must be non-copyable and single-owned (either inside a scoped_ptr, or some equivalent mechanism).

I can do this. I've done it on that CL. But I just want to be clear about the architectural (not syntactic) burden that this potentially places on all clients.

Another approach suggested by benwells@ on that CL is that instead of making sure the Image is never copied, we DeepCopy() the underlying ImageSkia just before handing off to the other thread. That adds some additional overhead, obviously, but at least it localizes the problem. It would mean adding a DeepCopy() method to the Image and ImageStorage classes.

Dana Jansens

unread,
Mar 25, 2015, 6:52:48 PM3/25/15
to Matt Giuca, oshima, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Wed, Mar 25, 2015 at 3:45 PM, Matt Giuca <mgi...@chromium.org> wrote:
Yes, that's because that's different issue. There are two kinds of "thread-safety".
1) an object can be passed/shared between multiple objects and can be safely destroyed.
  a) If you need to copy reference on multiple threads at the same time, you have to use RefCountedThreadSafe.
      This needs extra caution because "using" such object from multiple thread isn't guaranteed to be safe, unless
      the object is designed to be thread safe.
  b) If you just need to pass an object to another thread, and the object will only be used on that thread, you don't have to use RefCountedThreadSafe.

2) an object can be accessed from multiple threads safely. RefCountedThreadSafe won't solve this.

Thanks for breaking it down into numbers and letters. Very useful for reference.

My understanding is that you're trying to solve 1.b).

Well, I'm technically trying to solve 1a, but my use case is fairly close to 1b.

That is: the current use of gfx::Image (in my use case of shortcut creation) is absolutely 1a, in that we have a reference to the gfx::Image in a dialog box. The dialog box passes a copy of the gfx::Image to the FILE thread, while still holding a reference to it. Then it closes, releasing its handle on the gfx::Image. (By "coincidence", it closes immediately after sending the image to the FILE thread, but we could easily keep the dialog open for further user input.)

So that is 1a "you need to copy reference on multiple threads at the same time", which is why I've been suggesting we use RefCountedThreadSafe. Maybe it's overkill to convert the entire Image class, but I'm guessing there are other uses of gfx::Image that also fall into 1a (because we are seeing TSan errors on other usages of gfx::Image besides mine).

Having said all that, my use case is reasonably close to 1b, in that I technically don't need the gfx::Image in the dialog box after passing it to the FILE thread. Converting my use case to 1b is the subject of https://codereview.chromium.org/1038573002/ which I mentioned earlier. My point is that it made my code quite a bit more constrained (in how it can be extended in the future). All other things being equal, I'd rather be allowed to write 1a uses of gfx::Image, than have all clients forced to do 1b.
 

The code may look simpler but implication is not. I understand that you want to simplify the code, and that's why I suggested to create a wrapper. It'd be like unique_ptr, but limited. I believe it won't make the code much more complicated.

You're basically suggesting that I use scoped_ptr. (I could make a wrapper to avoid the syntactic overhead of writing "scoped_ptr" everywhere.) The issue I have is not syntactic overhead. It's that throughout the entire codebase, every object that holds a reference to a gfx::Image that might be transferred to another thread at some point in the future must be non-copyable and single-owned (either inside a scoped_ptr, or some equivalent mechanism).

I can do this. I've done it on that CL. But I just want to be clear about the architectural (not syntactic) burden that this potentially places on all clients.

It may be just me but I generally find it to be the opposite. Ref-counting adds a burden of complexity in that now we have email threads just as this one. Single ownership is very clear about where things are owned and where they are destroyed. You can always keep around raw pointers if you want to access it in more than one place, you just need to coordinate the lifetime, which makes the code explicit. It requires more thought when writing code, but improves understanding of it for others, and hackability in the future, IMHO.

Peter Kasting

unread,
Mar 25, 2015, 7:29:07 PM3/25/15
to Dana Jansens, Matt Giuca, oshima, Daniel Cheng, Kibeom Kim, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Wed, Mar 25, 2015 at 3:51 PM, Dana Jansens <dan...@chromium.org> wrote:
You're basically suggesting that I use scoped_ptr. (I could make a wrapper to avoid the syntactic overhead of writing "scoped_ptr" everywhere.) The issue I have is not syntactic overhead. It's that throughout the entire codebase, every object that holds a reference to a gfx::Image that might be transferred to another thread at some point in the future must be non-copyable and single-owned (either inside a scoped_ptr, or some equivalent mechanism).

I can do this. I've done it on that CL. But I just want to be clear about the architectural (not syntactic) burden that this potentially places on all clients.

It may be just me but I generally find it to be the opposite. Ref-counting adds a burden of complexity in that now we have email threads just as this one. Single ownership is very clear about where things are owned and where they are destroyed. You can always keep around raw pointers if you want to access it in more than one place, you just need to coordinate the lifetime, which makes the code explicit. It requires more thought when writing code, but improves understanding of it for others, and hackability in the future, IMHO.

+1.  I generally find refcounting to be an anti-pattern.  This may be because I've been in too many threads where we were trying to track down how and when an object got destroyed, and it would have been a lot clearer to have 50 lines of code that explicitly managed the (single) object's lifetime than 10 lines of code that just refcounted.  Perhaps some places it's actually awesome.  And I haven't looked at your specific patch.

This does line up with longstanding team policy, though, which I would state as "in general, don't make things threadsafe; if you need to use them on multiple threads, pass them (or copy them) between threads and ensure that only one thread can possibly access them at once".  That's been a rule of thumb since very early in the project's life.

PK 

Ben Wells

unread,
Mar 25, 2015, 7:30:25 PM3/25/15
to dan...@chromium.org, Matt Giuca, oshima, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
I had a look at the scoped_ptr CL Matt put up. I think what irked me about it is the way we've got a hidden ref counted thing (i.e. the storage inside an image) we don't really trust much, so we're wrapping it in a scoped ptr to handle its lifetime issues in a way we can understand.

It felt like a hack to work around some deeper problem, but I am not an expert in the history or usage of images in chromium.

Jeffrey Yasskin

unread,
Mar 25, 2015, 7:39:37 PM3/25/15
to oshima, Matt Giuca, Dana Jansens, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Tue, Mar 24, 2015 at 9:30 PM, oshima <osh...@chromium.org> wrote:
> As I mentioned before, gfx::Image (and other imaged stored in ImageStorage)
> isn't thread safe, so making gfx::Image (I assume what you really meant is
> ImageStorage, not gfx::Image itself though) RefCountedThreadSafe can
> increase confusion and misuse.

We could take Will Chan's suggestion and make RefCounted itself
thread-safe, so that there's no written indication that a class
inheriting from it is, itself, thread-safe for things other than
ownership transfers. That would eliminate a footgun from the
refcounting infrastructure, which seems like a win even if refcounting
itself is a bad idea.

Matt Giuca

unread,
Mar 25, 2015, 7:42:05 PM3/25/15
to Ben Wells, dan...@chromium.org, oshima, Daniel Cheng, Kibeom Kim, John Bauman, Peter Kasting, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
Right. I'm totally in agreement (with Dana and Peter) about single ownership vs ref counting. I'm a huge fan of C++s default mentality of everything is a value with clear ownership. I'm not trying to argue for reference counting.

The problem here is that Image is already refcounted, and that means it breaks the C++ default value semantics for everything that contains it (ImageFamily, ShortcutInfo, and so on). I'm used to things either being non-copyable, or copyable with deep copy semantics. What's murky about this is that normally, I only have to think about thread safety when I pass a pointer (or scoped_refptr) to an object on another thread. Image is forcing me to consider thread safety of passing an object by value to another thread. There is the problem (and my suggestion is that an object that pretents to be a value but secretly refcounts behind the scenes should have the courtesy of also secretly being thread-safe).

Perhaps the correct answer is "just assume Image is non-copyable" (which is effectively what I'm doing in my CL). That demonstrates the weirdness Ben is talking about: having an object that is conceptually "non-copyable" but with a reference count of exactly 1 inside it. But yeah, I'm happy to go ahead with this since it's a localized fix that won't require changing the way this heavily-used class works.

Peter Kasting

unread,
Mar 25, 2015, 7:45:23 PM3/25/15
to Matt Giuca, Ben Wells, Dana Jansens, oshima, Daniel Cheng, Kibeom Kim, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
On Wed, Mar 25, 2015 at 4:41 PM, Matt Giuca <mgi...@chromium.org> wrote:
Right. I'm totally in agreement (with Dana and Peter) about single ownership vs ref counting. I'm a huge fan of C++s default mentality of everything is a value with clear ownership. I'm not trying to argue for reference counting.

The problem here is that Image is already refcounted, and that means it breaks the C++ default value semantics for everything that contains it (ImageFamily, ShortcutInfo, and so on). I'm used to things either being non-copyable, or copyable with deep copy semantics. What's murky about this is that normally, I only have to think about thread safety when I pass a pointer (or scoped_refptr) to an object on another thread. Image is forcing me to consider thread safety of passing an object by value to another thread. There is the problem (and my suggestion is that an object that pretents to be a value but secretly refcounts behind the scenes should have the courtesy of also secretly being thread-safe).

Perhaps the correct answer is "just assume Image is non-copyable" (which is effectively what I'm doing in my CL). That demonstrates the weirdness Ben is talking about: having an object that is conceptually "non-copyable" but with a reference count of exactly 1 inside it. But yeah, I'm happy to go ahead with this since it's a localized fix that won't require changing the way this heavily-used class works.

That's a good way of phrasing things.  Perhaps Image should really be made non-copyable and not have any kind of reference-counting.

Someone like Dana would be more able than me to say if that makes sense.

PK 

Kibeom Kim

unread,
Mar 25, 2015, 8:15:49 PM3/25/15
to Peter Kasting, Matt Giuca, Ben Wells, Dana Jansens, oshima, Daniel Cheng, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
If that can be done, I think it's the cleanest. I suspect that many places took advantage of gfx::Image's internally ref counted copy, for moving stack allocated images. Since we don't allow std::move yet, probably we should implement .Pass() too.



oshima

unread,
Mar 25, 2015, 8:37:57 PM3/25/15
to Kibeom Kim, Peter Kasting, Matt Giuca, Ben Wells, Dana Jansens, Daniel Cheng, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
I agree that most objects shouldn't be ref-counted, but there are reasons why (internal data of) gfx::Image is ref counted. Images are often shared, and copying them is very expensive. (and SkBitmap does basically the same thing)

I don't think it's impossible to remove ref-counting, but it may have different kinds of issues. (right now, no one cares who owns the image data, but you'll have to).

- oshima

Matt Giuca

unread,
Mar 25, 2015, 8:54:58 PM3/25/15
to oshima, Kibeom Kim, Peter Kasting, Ben Wells, Dana Jansens, Daniel Cheng, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington
FYI I just had a go at adding a DeepCopy() method to gfx::Image (so if you have no idea how many backing references there are and need to pass it to another thread, you can just DeepCopy() it and go in a scoped_ptr). I got as far as having a Skia-backed Image deep-copyable, but it was a lot of work and I would still need to add deep copying support to all of the other representations. I decided it wasn't worth it, but if anyone thinks it is, let me know.

Kibeom Kim

unread,
Apr 8, 2015, 8:46:32 PM4/8/15
to Matt Giuca, oshima, Peter Kasting, Ben Wells, Dana Jansens, Daniel Cheng, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington, Ted Choc
Another data point:
  1. When user bookmarks a page, we download a salient image from the page. BookmarkImageServiceAndroid::BitmapFetcherHandler::OnFetchComplete [UI thread]
  2. Resize the image and extract dominant color BookmarkImageService::ResizeAndStoreImage. [Thread 1]
    • Save the image locally. PersistentImageStore::Insert [Thread 2]
    • Pass the image back to callback. (This happens in parallel with the above). |Callback| in enhanced_bookmarks_bridge.cc [UI thread]

I wrapped gfx::Image by scoped_refptr. And due to the parallel execution inside the step 2. I made ImageRecord class RefCountedThreadSafe, which has scoped_ptr<gfx::Image> member https://codereview.chromium.org/1031293002/diff/260001/components/enhanced_bookmarks/image_record.h .

(1. Yeah I didn't like this workaround.  2. I noticed that |gfx::ImageSkia::storage_| is RefCountedThreadSafe unlike |gfx::Image::storage_|)


Matt Giuca

unread,
Apr 8, 2015, 9:45:22 PM4/8/15
to Kibeom Kim, oshima, Peter Kasting, Ben Wells, Dana Jansens, Daniel Cheng, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington, Ted Choc
On Thu, 9 Apr 2015 at 10:45 Kibeom Kim <kkim...@chromium.org> wrote:
I wrapped gfx::Image by scoped_refptr.

Yeah so this is what I was afraid of happening if ImageStorage is not RefCountedThreadSafe.

We are already in a pretty silly situation with 3 nested reference counted objects:
  • Image has a ref count on ImageStorage which has...
  • ImageSkia which has a ref count on ImageSkiaStorage which has...
  • SkBitmap which has a ref count on SkPixelRef.
And you're adding (localized to your code) a fourth layer of refcounting, apparently for no other reason than that the gfx::Image refcount is non-thread-safe.

Perhaps you could directly use an ImageSkia instead of an Image? Then you would have just two layers of ref counting instead of four.

Just floating another general suggestion: we could remove the ref counting on Image altogether, and have its copy operator copy its ImageRep. For ImageSkia-backed images, this will continue to be ref-counted (but the ref counting will be thread-safe). For non-Image-Skia-backed images this might do a deep copy. If that's a problem, we could add ref counting to the individual representations. That would remove one layer of refcounting from gfx::Image. (Though it would have the same drawback as my original suggestion, which is that it would add the overhead of thread-safety around refcounting for copying a gfx::Image.)

oshima

unread,
Apr 8, 2015, 10:48:17 PM4/8/15
to Matt Giuca, Kibeom Kim, Peter Kasting, Ben Wells, Dana Jansens, Daniel Cheng, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington, Ted Choc
How about implementing Pass(), which transfers ownership of the storage? (probably with check if the ref count is 1) That can also prevent the instance being shared with threads.

Kibeom Kim

unread,
Apr 8, 2015, 10:53:16 PM4/8/15
to Matt Giuca, oshima, Peter Kasting, Ben Wells, Dana Jansens, Daniel Cheng, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington, Ted Choc
On Wed, Apr 8, 2015 at 6:44 PM Matt Giuca <mgi...@chromium.org> wrote:
Perhaps you could directly use an ImageSkia instead of an Image? Then you would have just two layers of ref counting instead of four.

Unfortunately that wasn't an option for us as we share the code with ios and they use UIImage instead of skia. :(

Kibeom Kim

unread,
Apr 9, 2015, 7:42:04 PM4/9/15
to oshima, Matt Giuca, Peter Kasting, Ben Wells, Dana Jansens, Daniel Cheng, John Bauman, Jeffrey Yasskin, Eric Noyau, Robert Sesek, Chromium-dev, Theresa Wellington, Ted Choc
On Wed, Apr 8, 2015 at 7:46 PM oshima <osh...@chromium.org> wrote:
How about implementing Pass(), which transfers ownership of the storage? (probably with check if the ref count is 1) That can also prevent the instance being shared with threads.

Actually, independent from our issue, I think it makes sense for |scoped_refptr| to support move and Pass(). Uploaded CL https://codereview.chromium.org/1076953002/

Reply all
Reply to author
Forward
Message has been deleted
0 new messages