Is it okay to use std::function in the PartitionAlloc library?

31 views
Skip to first unread message

Yuki Shiino

unread,
Nov 24, 2021, 2:36:04 AM11/24/21
to memor...@chromium.org, Takashi Sakamoto, bar...@chromium.org, li...@chromium.org
Hi team,

We're working to make PartitionAlloc library a standalone library (independent from //base library), and one of our challenges is how to get rid of base::OnceCallback and base::RepeatingCallback, which are quite complex and we don't want to copy them into the PartitionAlloc library.

We're wondering whether it's an option or forbidden to use `std::function` in the new PartitionAlloc library instead of base::OnceCallback and base::RepeatingCallback.  Do you have any opinions, thoughts, and/or suggestions about use of std::function?

We could use raw function pointers + void* as an alternative (e.g. a callback + a deleter of the callback + void* to represent the callback object), but use of void* may be more discouraged than use of std::function because void* is type-unsafe.

Cheers,
Yuki Shiino

Kentaro Hara

unread,
Nov 24, 2021, 2:45:03 AM11/24/21
to Yuki Shiino, Dana Jansens, Daniel Cheng, memor...@chromium.org, Takashi Sakamoto, bar...@chromium.org, li...@chromium.org
+1 to avoiding a dependency from PA to base::Callbacks.

I'm personally fine with using std::function but @Dana Jansens and @Daniel Cheng will be more familiar with it :)



--
You received this message because you are subscribed to the Google Groups "memory-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to memory-dev+...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/memory-dev/CAN0uC_RNB%2BfnTHU0M5bdzRvNve-RhRbXjVedYKQW7Jt%2BcngU_g%40mail.gmail.com.


--
Kentaro Hara, Tokyo

Daniel Cheng

unread,
Nov 24, 2021, 3:21:46 AM11/24/21
to Kentaro Hara, Yuki Shiino, Dana Jansens, memor...@chromium.org, Takashi Sakamoto, bar...@chromium.org, li...@chromium.org
IIRC, v8 uses std::function but it's not exposed in //v8/include: https://source.chromium.org/search?q=std::function&sq=&ss=chromium%2Fchromium%2Fsrc:v8%2Finclude%2F

If we can limit the usage of std::function<T> to PA-internal code, then I think that's fine.
If we do need to expose it from PA for various hooks (I'm guessing we might?), it would be useful to understand:

- does a function pointer suffice? It sounds like a context pointer would be needed in some (most?) cases?
- if a function pointer does not suffice, can we limit how many files need to use the PA API to limit the use of std::function<T> in Chrome code itself?

It's probably worth noting that std::function<T> itself isn't safe by default. It can reference a lambda that captures by reference (or captures raw pointers), which would result in potentially unsafe accesses that aren't protected by raw_ptr.

In comparison, void* requires a cast to the actual type, but presumably, this will generally be done in coordination (e.g. the same code setting the function pointer is also setting the context pointer).

All things considered though, modern C++ would probably encourage the use of std::function<T> over function pointer + void*, so from that perspective, I'm OK with it.

Also, do we need the deleter for the void* context? I would have expected most PA callbacks to be scoped to the lifetime of the program. Presumably we don't want to introduce std::function objects with static lifetime.

Daniel

Yuki Shiino

unread,
Nov 25, 2021, 7:25:24 AM11/25/21
to Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, Yuki Shiino, Dana Jansens, memor...@chromium.org, bar...@chromium.org
I checked where we need callbacks and why.  I think I now have a better understanding.  Would you folks check my understanding below (especially tasak@ and lizeb@)?  Actually I got most of my understanding from tasak@.

Where do we need callbacks and why?
1. PartitionAllocMemoryReclaimer
We'd like to run "reclaim memory" every after T sec.

2. ThreadCacheRegistry
We'd like to run "purge thread cache" after T sec, where T is dynamically tuned according to the usage of thread cache.
We schedule another run (with a new interval) once one run finishes so that we periodically run "purge thread cache".

3. MUAwareTaskBasedBackend (*Scan)
We'd like to run "scan the quarantine" after T sec, where T is dynamically tuned according to the memory usage.
We schedule another run (with a new interval) once one run finishes so that we periodically run "scan the quarantine".

So, the usage of three cases are mostly the same.  The pattern here is:
1. Run X.
2. T = new time interval
3. Schedule these steps to run when T sec passes.
note that the callback X needs no input argument and time T is a parameter for a scheduler (task runner), which lives outside of the PartitionAlloc library.  The usage is not so complicated, relatively simple.

Then, I think we can go in the following way (see the following pseudo code example).  In short, the idea is just to use a non-generic closure type instead of a generic closure type (= base::OnceCallback<T>).  Since it's not generic, the maintenance cost should be low, I think.  In the above three use cases, non-generic closures only need to have a pointer to an object that must outlive the closure.

// PartitionAlloc library
class TaskXClosure {
 public:
  void operator()();
 
 private:
  TaskXObject* task_x_object_;
};

using TaskXScheduler = void (*)(TaskXClosure, TimeDelta);
TaskXScheduler g_scheduler;
void StartTaskX(TaskXScheduler scheduler) {
  g_scheduler = scheduler;
  TaskXClosure task_x = create_task_X();
  TimeDelta t = calculate_new_interval();
  g_scheduler(task_x, t);
}

void TaskXClosure::operator()() {
  do_X();
  TimeDelta t = calculate_new_interval();
  g_scheduler(*this, t);
}

// Client code of PartitionAlloc library
// TaskRunner is available here.
void ScheduleTaskX(TaskXClosure task_x, TimeDelta t) {
  ThreadTaskRunnerHandle::Get()->PostDelayedTask(
    FROM_HERE,
    BindOnce(
      [](TaskXClosure task_x) { task_x(); },
      task_x),
    t);
}
int main() {
  StartTaskX(ScheduleTaskX);
}

About dcheng's questions,
Q1. Is a context pointer needed?
A2. Yes, we need a MemoryReclaimer, ThreadCache, or MUAwareTaskBasedBackend object.

Q2. How many files need to use PA library?
A2. At most 3, I think.

Q3. Do we need the deleter for a context object?
A3. In prod, I don't think we need it.  For testing, we may need it.  In the above "non-generic closure", I assumed that the client code stops the periodic task X being scheduled before destroying the context object.  (We need an API to stop scheduling a new task X and wait for an already scheduled task X to be done.)

After all, std::function may look much simpler than the above "non-generic closure" idea.

Or, we can use a function pointer + context object pointer instead, but in that case, there is no room to add a control for the lifetime of the context object in the future.  Function pointers might look simpler, but I'm afraid that there is no room to add something in the future (for debugging or extensions).  Since PartitionAlloc library will be separated from the Chromium code base, we have to care about source-code-level backward compatibility in the future.  Function pointers look a little bit too poor to me from this point of view.

What do you folks think?

Cheers,
Yuki Shiino


2021年11月24日(水) 17:21 Daniel Cheng <dch...@chromium.org>:

dan...@chromium.org

unread,
Nov 25, 2021, 9:13:43 AM11/25/21
to Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
Might be a silly question, but can the code thats doing the timeout and then runs these various methods just learn about them and run them directly, instead of using a callback?

IOW:

<wait_for_timer>
memory_reclaimer_->TimeElapsed();
thread_cache_registry_->TimeElapsed();

If the timer mechanism was in base then that won't work obviously. But if you're pulling this out of base, the timer code could be specialized for PA?

Yuki Shiino

unread,
Nov 25, 2021, 11:34:35 AM11/25/21
to dan...@chromium.org, Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
Unfortunately, timer is implemented in //base.  Plus, I vaguely remember that Chromium wants to have a (strong) control of event / task scheduling with their own prioritization.  For example, Chromium may want to choose which task runner should be used for TaskX by themselves.  I think it might not be desired that PartitionAlloc library implements its own timer and scheduling.

Cheers,
Yuki Shiino


2021年11月25日(木) 23:13 <dan...@chromium.org>:

dan...@chromium.org

unread,
Nov 25, 2021, 11:37:12 AM11/25/21
to Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
On Thu, Nov 25, 2021 at 11:34 AM Yuki Shiino <yukis...@chromium.org> wrote:
Unfortunately, timer is implemented in //base.  Plus, I vaguely remember that Chromium wants to have a (strong) control of event / task scheduling with their own prioritization.  For example, Chromium may want to choose which task runner should be used for TaskX by themselves.  I think it might not be desired that PartitionAlloc library implements its own timer and scheduling.

Hm, but the whole premise here is "We're working to make PartitionAlloc library a standalone library (independent from //base library),"

So there is no base?

Yuki Shiino

unread,
Nov 25, 2021, 11:49:34 AM11/25/21
to dan...@chromium.org, Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
2021年11月26日(金) 1:37 <dan...@chromium.org>:
On Thu, Nov 25, 2021 at 11:34 AM Yuki Shiino <yukis...@chromium.org> wrote:
Unfortunately, timer is implemented in //base.  Plus, I vaguely remember that Chromium wants to have a (strong) control of event / task scheduling with their own prioritization.  For example, Chromium may want to choose which task runner should be used for TaskX by themselves.  I think it might not be desired that PartitionAlloc library implements its own timer and scheduling.

Hm, but the whole premise here is "We're working to make PartitionAlloc library a standalone library (independent from //base library),"

So there is no base?

Right, there is no base.  So, our idea is: PA library provides functions to be run periodically, and it's the application's responsibility to actually schedule tasks to run those functions.

Timer and scheduler (task runner) must be implemented in applications.  It's the application's responsibility to initialize the PA library and set up various things such as memory reclaimer, thread cache purger, and *scan.  If the application never uses *scan, it's okay that the application doesn't set up a scheduler for *scan.

Does this make sense?

dan...@chromium.org

unread,
Nov 25, 2021, 11:53:05 AM11/25/21
to Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
On Thu, Nov 25, 2021 at 11:49 AM Yuki Shiino <yukis...@chromium.org> wrote:
2021年11月26日(金) 1:37 <dan...@chromium.org>:
On Thu, Nov 25, 2021 at 11:34 AM Yuki Shiino <yukis...@chromium.org> wrote:
Unfortunately, timer is implemented in //base.  Plus, I vaguely remember that Chromium wants to have a (strong) control of event / task scheduling with their own prioritization.  For example, Chromium may want to choose which task runner should be used for TaskX by themselves.  I think it might not be desired that PartitionAlloc library implements its own timer and scheduling.

Hm, but the whole premise here is "We're working to make PartitionAlloc library a standalone library (independent from //base library),"

So there is no base?

Right, there is no base.  So, our idea is: PA library provides functions to be run periodically, and it's the application's responsibility to actually schedule tasks to run those functions.

Timer and scheduler (task runner) must be implemented in applications.  It's the application's responsibility to initialize the PA library and set up various things such as memory reclaimer, thread cache purger, and *scan.  If the application never uses *scan, it's okay that the application doesn't set up a scheduler for *scan.

Does this make sense?

Thanks for clarifying.

Then my follow-up question is why does PA need to construct std::function() at all? If it exposes the things that need to be called, then the application would set up the timer and make a Callback (or std::function) for it to put in the timer, if that made sense for them. Or call them directly if not. Do you want PA to return a std::function() to the application for what needs to be run? Why not just expose the function instead?

Yuki Shiino

unread,
Nov 25, 2021, 12:19:04 PM11/25/21
to dan...@chromium.org, Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
The functions to be called back need some arguments.  If we made all necessary arguments to global variables (global objects), then we don't need any argument, and just exposing functions works fine.  This idea works well as long as memory reclaimer, thread cache registry, and *scan are singletons, but I'm afraid that it's fragile design and I'm not really sure if it's okay to assume these objects are singletons.

If we don't like the way of global variables, a callback needs a pointer to a context object (= a bag of necessary arguments) at least.  Thus, "a function pointer (callback) + a pointer to a context object" is another option on the table.

I showed another idea using "non-generic closure".

std::function is yet another option.

I agree that "just a function + global variables" is an option, too.


2021年11月26日(金) 1:53 <dan...@chromium.org>:

dan...@chromium.org

unread,
Nov 25, 2021, 12:33:32 PM11/25/21
to Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
On Thu, Nov 25, 2021 at 12:19 PM Yuki Shiino <yukis...@chromium.org> wrote:
The functions to be called back need some arguments.  If we made all necessary arguments to global variables (global objects), then we don't need any argument, and just exposing functions works fine.  This idea works well as long as memory reclaimer, thread cache registry, and *scan are singletons, but I'm afraid that it's fragile design and I'm not really sure if it's okay to assume these objects are singletons.

If we don't like the way of global variables, a callback needs a pointer to a context object (= a bag of necessary arguments) at least.  Thus, "a function pointer (callback) + a pointer to a context object" is another option on the table.

I showed another idea using "non-generic closure".

std::function is yet another option.

I agree that "just a function + global variables" is an option, too.

Or you can have a public struct that holds the data you want to store on it, and has a function that delegates to the internal thing, passing along the data?

class PublicMemoryReclaimer {
  int field_;
  bool other_field_;
  MemoryReclaimer* internal_;
 public:
  void TimeElapsed() {
    internal_->DoTheThing(field_, other_field_);
  }
};

PublicMemoryReclaimer GetReclaimerToRunSometimes();


This gives a typesafe API to the application using PA. It's a bit like rolling your own std::function but it's only for a few things. I wish we did this sort of thing more as the type erasure of Callback/function makes it hard to debug and understand what happens when.

This way the application knows which thing it's activating in PA from the type, not just from the variable name.

What do you think of this?

Yuki Shiino

unread,
Nov 25, 2021, 12:59:18 PM11/25/21
to dan...@chromium.org, Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
Looks great!  However, we'd like some more features...

- The PA library would like to control the time interval per each call (e.g. depending on how much memory is used/freed, the next scan can be delayed more (or less)).

- The PA library might want to stop it in case of unit tests.  (In prod, it's unlikely that we need to stop memory reclaimer, etc.)

The first item looks "must have" looking at the current usage.  I understand that we can easily add more methods to PublicMemoryReclaimer.

class PublicMemoryReclaimer {
  ...
  TimeDelta GetNextTimeInterval() const;
  bool ShouldRunMore() const;
};

IIUC, this is very similar to TaskXClosure in my example, except that the PA library never touches the scheduler (applications do not inject their scheduler to the PA library).  Then, the question is whether it's important or not for the PA library to have a liberty to touch a scheduler.

I'm happy with the idea and I think it's worth giving it a try.  What do you folks think?

Cheers,
Yuki Shiino



2021年11月26日(金) 2:33 <dan...@chromium.org>:

dan...@chromium.org

unread,
Nov 25, 2021, 1:15:03 PM11/25/21
to Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
On Thu, Nov 25, 2021 at 12:59 PM Yuki Shiino <yukis...@chromium.org> wrote:
Looks great!  However, we'd like some more features...

- The PA library would like to control the time interval per each call (e.g. depending on how much memory is used/freed, the next scan can be delayed more (or less)).

Adding a function to the public class as you've done looks good. And it seems that otherwise we'd have to have a Callback/function for those as well, which is even more typeless functions, so this seems even better.

- The PA library might want to stop it in case of unit tests.  (In prod, it's unlikely that we need to stop memory reclaimer, etc.)

This seems like the responsibility of the application - if it wants to not run it anymore it stops calling it. PA doesn't need to know about what mode the application is, right?

Chris Hamilton

unread,
Nov 25, 2021, 2:04:09 PM11/25/21
to dan...@chromium.org, Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
Maybe an approach similar to what V8 takes would work? They have their own abstraction of a base library (for task scheduling, closures, timers, platform abstractions, etc), and then when building against Chrome provide an implementation that is a wrapper of Chrome base. This lets other embedders provide their own similar primitives as an impl of the PA base library, while simultaneously letting Chrome's base impl have full knowledge of PAs tasks/timers/etc when Chrome is the embedder.

The other approach of exposing endpoints that the embedder has to specifically call at the appropriate times (ie, the embedder brings their own timers, and sets them up, not PA) works as well. As long as we avoid a solution that leads to another anonymous thread running tasks that the base/ scheduler doesn't know about.

Cheers,

Chris

Egor Pasko

unread,
Nov 25, 2021, 2:40:24 PM11/25/21
to Chris Hamilton, dan...@chromium.org, Yuki Shiino, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org, bar...@chromium.org
Does the PA library want to support distribution as a dynamic library that replaces the system allocator with minimal or no effort from the client app? This would mandate maintaining a C API (and ABI as well), which makes it significantly different from the v8 embedding model.

The _extended_ PA API discussed here (for posting periodic tasks) would be more user-friendly to maintain in C as well. It would make life easier for those clients not aiming to depend on C++. Another issue is worrying about C++ library differences from different compilers (keeping it sane adds complexity in PA).

Such a C API can be wrapped into a C++ library for clients (like Chrome) that want it - this would add easy coverage for the C API even if we are not using it directly in Chrome.


Bartek Nowierski

unread,
Nov 25, 2021, 7:40:53 PM11/25/21
to Egor Pasko, Yuki Shiino, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, Kentaro Hara, memor...@chromium.org
+1 on C interface, if easily doable (not a top priority though).

Regarding the interface, to continue on Dana's point, I think everything can be done in the caller without callbacks. I'm guessing we just got trapped in the mindset of the current API state, which has a complex callback mechanism.

From a point of view of an in-Chromium call-site, it could look like this:
void ScheduleNextThreadCachePurge(delay) {
FROM_HERE,
BindOnce(
[]() {
            TRACE_EVENT0("memory", "PeriodicPurge");
            auto new_delay = partition_alloc::PurgeThreadCache();
            ScheduleNextThreadCachePurge(new_delay)
          }),
      delay);
}
[...]
ScheduleNextThreadCachePurge(partition_alloc::RecommendedInitialPurgeThreadCacheInterval());

A more naive implementation in another app could spawn a new thread with code like:
auto delay = partition_alloc::RecommendedInitialPurgeThreadCacheInterval();
while (1) {
  sleep(delay);
  delay = partition_alloc::PurgeThreadCache();
}

We could even get rid of RecommendedInitialPurgeThreadCacheInterval, if we're fine with the first purge call to come immediately, which I think we are.

If we really need the caller to stop purging, we could return a special value of delay (e.g. 0, or -1), or a pair with bool.


MemoryReclaimer could be even simpler, because it doesn't have an adjustable interval... currently. But I understand that to make the API forward-looking, we can provide this option already.


Cheers,
Bartek

Kentaro Hara

unread,
Nov 25, 2021, 7:48:58 PM11/25/21
to Bartek Nowierski, Egor Pasko, Yuki Shiino, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
+1 to the Bartek's API idea :)


--
Kentaro Hara, Tokyo

Yuki Shiino

unread,
Nov 26, 2021, 9:02:11 AM11/26/21
to Kentaro Hara, Bartek Nowierski, Egor Pasko, Yuki Shiino, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
Thanks for a lot of input.  I'm now very interested in the way of Platform object that V8 and Blink use (chrisha@'s idea).

About C API, my impression is that the expected users of the PA library as of now are all C++ applications.  I think it's okay for now to focus on C++ API.  When any need of C API arises, we can build C API on top of C++ API.  Binary compatibility (ABI) is out of scope of the PA library, IIUC.  Just like V8 and Blink, we won't provide any binary compatibility of the PA library.  We provide only source code compatibility.

About the idea of Platform object, it looks great to me.  We now have three existing use cases of callback functions in PA, but this number could change in the future (actually *scan's use case is relatively new).  If we made it the application's responsibility to invoke these "callback"-ish functions, whenever the number of these callback-ish functions changes, we have to request applications to adapt to new APIs.  In the way of Platform object (injecting timer, scheduler, etc., whatever necessary), the PA library could automatically start repeating tasks if appropriate.

Plus, the idea of Platform object can solve other dependency issues in general, not limited to callbacks.  The pattern of a Platform object is a) generally extensible to support more things, b) able to make it error when pure virtual member functions are not overridden, c) suitable for the testing purpose (mock implementation of Platform object), and maybe more goodness.  It's already been proved that the idea works well in the cases of V8 and Blink.

What do you folks think?

Cheers,
Yuki Shiino


2021年11月26日(金) 9:49 Kentaro Hara <har...@chromium.org>:

Kentaro Hara

unread,
Nov 26, 2021, 9:23:52 AM11/26/21
to Yuki Shiino, Bartek Nowierski, Egor Pasko, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
I agree that the Platform pattern is very generic and have no objection to it. The Platform pattern is used between Blink and //content/ too.

However, I like Bartek's idea for its simplicity -- it doesn't need to expose callbacks to the embedder. My personal preference is to 1) start with the simple API, 2) make PDFium, d8 and ARC++ work with PA, and 3) revise the API based on the learning. I'd prefer avoiding over-generalizing APIs before doing 2).



--
Kentaro Hara, Tokyo

dan...@chromium.org

unread,
Nov 26, 2021, 9:26:00 AM11/26/21
to Kentaro Hara, Yuki Shiino, Bartek Nowierski, Egor Pasko, Chris Hamilton, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
On Fri, Nov 26, 2021 at 9:23 AM Kentaro Hara <har...@chromium.org> wrote:
I agree that the Platform pattern is very generic and have no objection to it. The Platform pattern is used between Blink and //content/ too.

However, I like Bartek's idea for its simplicity -- it doesn't need to expose callbacks to the embedder. My personal preference is to 1) start with the simple API, 2) make PDFium, d8 and ARC++ work with PA, and 3) revise the API based on the learning. I'd prefer avoiding over-generalizing APIs before doing 2).

+1 :)

Egor Pasko

unread,
Nov 26, 2021, 10:33:53 AM11/26/21
to Yuki Shiino, Kentaro Hara, Bartek Nowierski, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
I like what Bartek suggested!

On Fri, Nov 26, 2021 at 3:02 PM Yuki Shiino <yukis...@chromium.org> wrote:
Thanks for a lot of input.  I'm now very interested in the way of Platform object that V8 and Blink use (chrisha@'s idea).

About C API, my impression is that the expected users of the PA library as of now are all C++ applications.  I think it's okay for now to focus on C++ API.  When any need of C API arises, we can build C API on top of C++ API.

My point previously was that if you build the C API on top of the C++ API, then Chrome will not provide the coverage for this C API, so I'm expecting more issues with the other clients of the PA library. It is workable, just not ideal.
 
  Binary compatibility (ABI) is out of scope of the PA library, IIUC.  Just like V8 and Blink, we won't provide any binary compatibility of the PA library.  We provide only source code compatibility.

This is an important bit that I did not know about. Thank you for clarifying.

To make sure we are on the same page with what source code compatibility means, let me ask a question. If someone hypothetically wants to try the PA library with ARC, would they need to modify the source of the libc in ARC (bionic) and rebuild both the bionic and the PA library? If so, I guess I am confused by the terminology, this sounds more like a "source tarball" or a "source repo" than a "library" to me. It could be fine depending on requirements. The requirements would be nice to outline in advance though (sorry if the hint sounds too explicit, and doubly so if the document exists and I just failed to find it:)

A few more words to wrap up my thoughts. A library with a C interface would have had less potential hurdles when rolling dependencies on various platforms and systems. Managing library dependencies is generally a hard and thankless job. However, if the requirements are not assuming a lot of integration with other software, then a source tarball may be good enough.

Yuki Shiino

unread,
Nov 30, 2021, 12:24:24 PM11/30/21
to Egor Pasko, Yuki Shiino, Kentaro Hara, Bartek Nowierski, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
Thank you all for providing your thoughts and better ideas.  It looks that everyone is happy with starting with a simple thing; it's the applications' responsibility to run periodic tasks.  The PA library provides APIs to run periodic tasks, to get the next time interval, to get whether we should continue running the periodic tasks.  We'll give it a shot to Dana + Bartek's idea.


My point previously was that if you build the C API on top of the C++ API, then Chrome will not provide the coverage for this C API, so I'm expecting more issues with the other clients of the PA library. It is workable, just not ideal.

The current implementation of the PartitionAlloc has already been written in C++.  I guess that you're suggesting the following structure?

  C++ API --(use)--> C API --(use)--> C++ implementation of PA

rather than

  C API --(use)--> C++ API --(use)--> C++ implementation of PA

in order to exercise C API?

If that's the only purpose, we can do it later I think.  We can see whether we can build another C++ API on top of the proposed C API that is built on top of C++ API.

  C++ API for design evaluation --(use)--> C API --(use)--> C++ API --(use)--> C++ impl of PA

Plus, I thought that we wouldn't want very different C API from C++ API.  I was thinking of something like below, almost 1:1 mappings.

  // C++ API
  class File {
    static File* Open(...);
    bool Read(...);
  };

  // C API
  FilePtr FileOpen(...) { return File::Open(...); }
  bool FileRead(FilePtr fp, ...) { return ((File*)fp)->Read(...); }
  // where the first argument `fp` is essentially `this` pointer.

This should be quite similar to what fopen(3C), fread(3C), etc. look like.  I was optimistic about C API design given that we could design C++ API elegantly; C API should naturally be elegant due to the elegantness of C++ API.


To make sure we are on the same page with what source code compatibility means, let me ask a question. If someone hypothetically wants to try the PA library with ARC, would they need to modify the source of the libc in ARC (bionic) and rebuild both the bionic and the PA library?

No, no need to modify the source of libc.  Both PA and PA-E (PartitionAlloc-Everywhere aka PartitionAlloc as malloc) have been shipped on Chrome for Android, but we've not modified libc source.

When I wrote "no binary compatibility", I meant "binary compatibility to applications", i.e. ABI between the PA library and its application code, not ABI between the PA library and libc nor libc++.

IIUC, we're going with the "source tarball" way, and it's good enough?

Cheers,
Yuki Shiino


2021年11月27日(土) 0:33 Egor Pasko <pa...@chromium.org>:

Egor Pasko

unread,
Dec 1, 2021, 3:43:09 PM12/1/21
to Yuki Shiino, Kentaro Hara, Bartek Nowierski, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
On Tue, Nov 30, 2021 at 6:24 PM Yuki Shiino <yukis...@chromium.org> wrote:
Thank you all for providing your thoughts and better ideas.  It looks that everyone is happy with starting with a simple thing; it's the applications' responsibility to run periodic tasks.  The PA library provides APIs to run periodic tasks, to get the next time interval, to get whether we should continue running the periodic tasks.  We'll give it a shot to Dana + Bartek's idea.


My point previously was that if you build the C API on top of the C++ API, then Chrome will not provide the coverage for this C API, so I'm expecting more issues with the other clients of the PA library. It is workable, just not ideal.

The current implementation of the PartitionAlloc has already been written in C++.  I guess that you're suggesting the following structure?

  C++ API --(use)--> C API --(use)--> C++ implementation of PA

rather than

  C API --(use)--> C++ API --(use)--> C++ implementation of PA

in order to exercise C API?

If that's the only purpose, we can do it later I think.  We can see whether we can build another C++ API on top of the proposed C API that is built on top of C++ API.

  C++ API for design evaluation --(use)--> C API --(use)--> C++ API --(use)--> C++ impl of PA

Plus, I thought that we wouldn't want very different C API from C++ API.  I was thinking of something like below, almost 1:1 mappings.

  // C++ API
  class File {
    static File* Open(...);
    bool Read(...);
  };

  // C API
  FilePtr FileOpen(...) { return File::Open(...); }
  bool FileRead(FilePtr fp, ...) { return ((File*)fp)->Read(...); }
  // where the first argument `fp` is essentially `this` pointer.

This should be quite similar to what fopen(3C), fread(3C), etc. look like.  I was optimistic about C API design given that we could design C++ API elegantly; C API should naturally be elegant due to the elegantness of C++ API.

Now I am confused. I thought PA as a library would expose malloc/calloc/free/memalign etc. along with extended API for more fine control of purging, thread cache sizing etc. Is the API with FilePtr above an example/inspiration? I don't see how it fits the goals, sorry :)

To make sure we are on the same page with what source code compatibility means, let me ask a question. If someone hypothetically wants to try the PA library with ARC, would they need to modify the source of the libc in ARC (bionic) and rebuild both the bionic and the PA library?

No, no need to modify the source of libc.  Both PA and PA-E (PartitionAlloc-Everywhere aka PartitionAlloc as malloc) have been shipped on Chrome for Android, but we've not modified libc source.

Without modifying the sources of the libc, how would Android apps in ARC, like PlayStore, be switched to use PA? If I understand correctly, ARC runs inside Chrome, but Chrome does not run inside ARC :)

When I wrote "no binary compatibility", I meant "binary compatibility to applications", i.e. ABI between the PA library and its application code, not ABI between the PA library and libc nor libc++.

This would mean every app using libc for allocating memory will need to be modified to use PA and be re-linked with it? If we throw C++ across the board, we'd need to be careful of what we use: because the app code could rely on a different libc++ version than PA. If there aren't too many apps to relink like that, then we can surely solve this case by case.

Yuki Shiino

unread,
Dec 2, 2021, 3:05:36 AM12/2/21
to Egor Pasko, Yuki Shiino, Kentaro Hara, Bartek Nowierski, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
Replied inline.

2021年12月2日(木) 5:43 Egor Pasko <pa...@chromium.org>:


On Tue, Nov 30, 2021 at 6:24 PM Yuki Shiino <yukis...@chromium.org> wrote:
Thank you all for providing your thoughts and better ideas.  It looks that everyone is happy with starting with a simple thing; it's the applications' responsibility to run periodic tasks.  The PA library provides APIs to run periodic tasks, to get the next time interval, to get whether we should continue running the periodic tasks.  We'll give it a shot to Dana + Bartek's idea.


My point previously was that if you build the C API on top of the C++ API, then Chrome will not provide the coverage for this C API, so I'm expecting more issues with the other clients of the PA library. It is workable, just not ideal.

The current implementation of the PartitionAlloc has already been written in C++.  I guess that you're suggesting the following structure?

  C++ API --(use)--> C API --(use)--> C++ implementation of PA

rather than

  C API --(use)--> C++ API --(use)--> C++ implementation of PA

in order to exercise C API?

If that's the only purpose, we can do it later I think.  We can see whether we can build another C++ API on top of the proposed C API that is built on top of C++ API.

  C++ API for design evaluation --(use)--> C API --(use)--> C++ API --(use)--> C++ impl of PA

Plus, I thought that we wouldn't want very different C API from C++ API.  I was thinking of something like below, almost 1:1 mappings.

  // C++ API
  class File {
    static File* Open(...);
    bool Read(...);
  };

  // C API
  FilePtr FileOpen(...) { return File::Open(...); }
  bool FileRead(FilePtr fp, ...) { return ((File*)fp)->Read(...); }
  // where the first argument `fp` is essentially `this` pointer.

This should be quite similar to what fopen(3C), fread(3C), etc. look like.  I was optimistic about C API design given that we could design C++ API elegantly; C API should naturally be elegant due to the elegantness of C++ API.

Now I am confused. I thought PA as a library would expose malloc/calloc/free/memalign etc. along with extended API for more fine control of purging, thread cache sizing etc. Is the API with FilePtr above an example/inspiration? I don't see how it fits the goals, sorry :)

I'm sorry, it's just an example of how we can build C API on top of C++ API in general.  I just wanted to explain the idea.


To make sure we are on the same page with what source code compatibility means, let me ask a question. If someone hypothetically wants to try the PA library with ARC, would they need to modify the source of the libc in ARC (bionic) and rebuild both the bionic and the PA library?

No, no need to modify the source of libc.  Both PA and PA-E (PartitionAlloc-Everywhere aka PartitionAlloc as malloc) have been shipped on Chrome for Android, but we've not modified libc source.

Without modifying the sources of the libc, how would Android apps in ARC, like PlayStore, be switched to use PA? If I understand correctly, ARC runs inside Chrome, but Chrome does not run inside ARC :)

We suppose two cases: explicit use of PartitionAlloc and PartitionAlloc-Everywhere.

[Explicit use of PartitionAlloc]
You instantiate PartitionRoot and call PartitionRoot::Alloc(size, type_name), for example, in order to allocate a memory region.  You're expected to use PartitionAlloc APIs explicitly instead of malloc/free or operator new/delete.

[PartitionAlloc-Everywhere]
This is a way to make malloc/free/etc. backed by PartitionAlloc, so you don't need to explicitly use PartitionAlloc APIs (except for the initial setup).  The way how to intercept the system default malloc/free/etc. highly depends on platforms.  On Linux, we hijack the exported symbols of `malloc`, `free`, etc.  On Android, the apps are expected to use `-Wl,--wrap=malloc`, etc.  On Android P and later, there is a better way but we've not yet supported it.  On macOS, we hijack the system default malloc zone.


When I wrote "no binary compatibility", I meant "binary compatibility to applications", i.e. ABI between the PA library and its application code, not ABI between the PA library and libc nor libc++.

This would mean every app using libc for allocating memory will need to be modified to use PA and be re-linked with it? If we throw C++ across the board, we'd need to be careful of what we use: because the app code could rely on a different libc++ version than PA. If there aren't too many apps to relink like that, then we can surely solve this case by case.

Yes, every app is expected to be modified to use PA (at least the initialization of PA-E) and re-linked with the PA library.

Overall, we expect that the PA library will look similar to tcmalloc, jemalloc, and mimalloc (not exactly the same, but also not very different).  The applications' source code needs slight modifications and the apps need to be linked with the library.

Egor Pasko

unread,
Dec 3, 2021, 9:23:25 AM12/3/21
to Yuki Shiino, Kentaro Hara, Bartek Nowierski, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
Thank you for the detailed explanation of the goals. It all makes sense. I feel sorry now for having sent so many inquiries.

Some more thoughts below.

On Thu, Dec 2, 2021 at 9:05 AM Yuki Shiino <yukis...@chromium.org> wrote:
To make sure we are on the same page with what source code compatibility means, let me ask a question. If someone hypothetically wants to try the PA library with ARC, would they need to modify the source of the libc in ARC (bionic) and rebuild both the bionic and the PA library?

No, no need to modify the source of libc.  Both PA and PA-E (PartitionAlloc-Everywhere aka PartitionAlloc as malloc) have been shipped on Chrome for Android, but we've not modified libc source.

Without modifying the sources of the libc, how would Android apps in ARC, like PlayStore, be switched to use PA? If I understand correctly, ARC runs inside Chrome, but Chrome does not run inside ARC :)

We suppose two cases: explicit use of PartitionAlloc and PartitionAlloc-Everywhere.

[Explicit use of PartitionAlloc]
You instantiate PartitionRoot and call PartitionRoot::Alloc(size, type_name), for example, in order to allocate a memory region.  You're expected to use PartitionAlloc APIs explicitly instead of malloc/free or operator new/delete.

[PartitionAlloc-Everywhere]
This is a way to make malloc/free/etc. backed by PartitionAlloc, so you don't need to explicitly use PartitionAlloc APIs (except for the initial setup).  The way how to intercept the system default malloc/free/etc. highly depends on platforms.  On Linux, we hijack the exported symbols of `malloc`, `free`, etc.  On Android, the apps are expected to use `-Wl,--wrap=malloc`, etc.  On Android P and later, there is a better way but we've not yet supported it.  On macOS, we hijack the system default malloc zone.

On Android (such as ARC) injecting PA-E early in the system zygote should easily enable experiments and perhaps even deploy systemwide, without the apps even knowing which allocator they use (unless an app overrides with jemalloc/etc explicitly - that should be rare) and observe the magnitude of memory savings. There are several ways to do it, and having traditional C APIs would simplify it. An additional problem of asking every app to link against PA-E explicitly is that we will end up with many copies of PA on the system - consuming disk and RAM. All in all, a scalable solution is not at all obvious (for Android and ARC). Worth another discussion?
 
When I wrote "no binary compatibility", I meant "binary compatibility to applications", i.e. ABI between the PA library and its application code, not ABI between the PA library and libc nor libc++.

This would mean every app using libc for allocating memory will need to be modified to use PA and be re-linked with it? If we throw C++ across the board, we'd need to be careful of what we use: because the app code could rely on a different libc++ version than PA. If there aren't too many apps to relink like that, then we can surely solve this case by case.

Yes, every app is expected to be modified to use PA (at least the initialization of PA-E) and re-linked with the PA library.

I think that's OK as a goal. I was thinking of more scalable ways as in the response above, but it might be too early now to aim at them.
 
Overall, we expect that the PA library will look similar to tcmalloc, jemalloc, and mimalloc (not exactly the same, but also not very different).  The applications' source code needs slight modifications and the apps need to be linked with the library.

Thanks! This makes a lot of sense.

Yuki Shiino

unread,
Dec 6, 2021, 8:30:36 AM12/6/21
to Egor Pasko, Yuki Shiino, Kentaro Hara, Bartek Nowierski, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
2021年12月3日(金) 23:23 Egor Pasko <pa...@chromium.org>:
Thank you for the detailed explanation of the goals. It all makes sense. I feel sorry now for having sent so many inquiries.

Some more thoughts below.

On Thu, Dec 2, 2021 at 9:05 AM Yuki Shiino <yukis...@chromium.org> wrote:
To make sure we are on the same page with what source code compatibility means, let me ask a question. If someone hypothetically wants to try the PA library with ARC, would they need to modify the source of the libc in ARC (bionic) and rebuild both the bionic and the PA library?

No, no need to modify the source of libc.  Both PA and PA-E (PartitionAlloc-Everywhere aka PartitionAlloc as malloc) have been shipped on Chrome for Android, but we've not modified libc source.

Without modifying the sources of the libc, how would Android apps in ARC, like PlayStore, be switched to use PA? If I understand correctly, ARC runs inside Chrome, but Chrome does not run inside ARC :)

We suppose two cases: explicit use of PartitionAlloc and PartitionAlloc-Everywhere.

[Explicit use of PartitionAlloc]
You instantiate PartitionRoot and call PartitionRoot::Alloc(size, type_name), for example, in order to allocate a memory region.  You're expected to use PartitionAlloc APIs explicitly instead of malloc/free or operator new/delete.

[PartitionAlloc-Everywhere]
This is a way to make malloc/free/etc. backed by PartitionAlloc, so you don't need to explicitly use PartitionAlloc APIs (except for the initial setup).  The way how to intercept the system default malloc/free/etc. highly depends on platforms.  On Linux, we hijack the exported symbols of `malloc`, `free`, etc.  On Android, the apps are expected to use `-Wl,--wrap=malloc`, etc.  On Android P and later, there is a better way but we've not yet supported it.  On macOS, we hijack the system default malloc zone.

On Android (such as ARC) injecting PA-E early in the system zygote should easily enable experiments and perhaps even deploy systemwide, without the apps even knowing which allocator they use (unless an app overrides with jemalloc/etc explicitly - that should be rare) and observe the magnitude of memory savings. There are several ways to do it, and having traditional C APIs would simplify it. An additional problem of asking every app to link against PA-E explicitly is that we will end up with many copies of PA on the system - consuming disk and RAM. All in all, a scalable solution is not at all obvious (for Android and ARC). Worth another discussion?

I think it's worth another discussion.  When the overall goodness of PartitionAlloc is well proved for Android apps, Android (OS) team should be interested in using the PartitionAlloc library as the system default allocator.  In that case, no changes for Android apps are required because malloc and operator new will be backed by PA by default on Android.  malloc() will be implemented using PartitionRoot::Alloc in that case.

Cheers,
Yuki Shiino

Kentaro Hara

unread,
Dec 6, 2021, 9:17:38 AM12/6/21
to Yuki Shiino, Egor Pasko, Bartek Nowierski, Chris Hamilton, dan...@chromium.org, Daniel Cheng, Takashi Sakamoto, li...@chromium.org, memor...@chromium.org
Android (OS) team should be interested in using the PartitionAlloc library as the system default allocator.  In that case, no changes for Android apps are required because malloc and operator new will be backed by PA by default on Android.  malloc() will be implemented using PartitionRoot::Alloc in that case.

Right. This is what I want to experiment with after building the standalone library.

The recent experiment of replacing Fuchsia's allocator (Scudo) with PA showed a ~20% memory reduction. This is a strong signal that Android OS may be able to reduce memory usage massively by replacing Scudo with PA. I understand this is controversial for many reasons but It's definitely worth measuring it :)


Reply all
Reply to author
Forward
0 new messages