In the new text world, we'll be caching "gfxTextRun" objects that can contain strings of positioned glyphs that we use to measure and paint text. This is generally a good thing but we need a cache management policy; there could be lots of these objects and they could be large.
Currently in the in-memory caches that I know about, we have a limit of N bytes or K items, and evict items using a least-recently-used policy or some approximation thereof. I've never liked this approach. Memory we use to cache items that aren't being accessed could be used somewhere else (either in Firefox or by other applications) to speed up other things. It seems to me that the amount of memory we spend caching items should be related to how often those items are actually being used. In the limit, a quiescent Firefox should release all memory used by caches to the operating system.
Now, that may actually happen in effect, if Firefox's caches get paged out. But then we hit another issue, which is that paging in cache contents is often more expensive than just taking a cache miss and recreating the cached item. That will probably be true for text runs, especially if we can't localize the textrun data in memory to minimize seeks to the pagefile, which we can't because some of the cached data is allocated by platform libraries.
I think the right way to think about this is that memory used for caching has a cost proportional to time ... i.e., it's rented from the system. We should bound that cost by evicting cached items after they haven't been used for some time, even if the number of cached items is small.
For the textruns, I'm thinking of starting eviction at around 5 seconds of unuse, enough time to reflow and paint in most cases.
Robert O'Callahan wrote: > In the new text world, we'll be caching "gfxTextRun" objects that can > contain strings of positioned glyphs that we use to measure and paint > text. This is generally a good thing but we need a cache management > policy; there could be lots of these objects and they could be large.
> Currently in the in-memory caches that I know about, we have a limit of > N bytes or K items, and evict items using a least-recently-used policy > or some approximation thereof. I've never liked this approach. Memory we > use to cache items that aren't being accessed could be used somewhere > else (either in Firefox or by other applications) to speed up other > things. It seems to me that the amount of memory we spend caching items > should be related to how often those items are actually being used. In > the limit, a quiescent Firefox should release all memory used by caches > to the operating system.
> Now, that may actually happen in effect, if Firefox's caches get paged > out. But then we hit another issue, which is that paging in cache > contents is often more expensive than just taking a cache miss and > recreating the cached item. That will probably be true for text runs, > especially if we can't localize the textrun data in memory to minimize > seeks to the pagefile, which we can't because some of the cached data is > allocated by platform libraries.
> I think the right way to think about this is that memory used for > caching has a cost proportional to time ... i.e., it's rented from the > system. We should bound that cost by evicting cached items after they > haven't been used for some time, even if the number of cached items is > small.
> For the textruns, I'm thinking of starting eviction at around 5 seconds > of unuse, enough time to reflow and paint in most cases.
> Rob
If our cached data would be already paged out, evicting items would actually page them back in first, right? I wonder if we have a good idea on when OSes page out to get rid of our not-so-used items before that happens.
Axel Hecht wrote: > If our cached data would be already paged out, evicting items would > actually page them back in first, right? I wonder if we have a good idea > on when OSes page out to get rid of our not-so-used items before that > happens.
Most OSes don't support detection of page presence. Even if they did, it would be hard to use. For caches where we can limit the lifetime of unused items to no more than say ten seconds, we should mostly avoid this problem.
Axel Hecht wrote: > If our cached data would be already paged out, evicting items would > actually page them back in first, right? I wonder if we have a good idea > on when OSes page out to get rid of our not-so-used items before that > happens.
I think if we were to just |free()| the cached items then they wouldn't get paged in even if they were paged out. At least i'd hope that's the case. However if the cached items are C++ objects that we |delete| and in their dtor access a bunch of members then we'd probably end up swapping in the memory.
Neil wrote: > Jonas Sicking wrote: >> I think if we were to just |free()| the cached items then they >> wouldn't get paged in even if they were paged out.
> Doesn't the CRT track free blocks in a chain or some such? That would > involve writing to the freed block (or its header).
It does.
This is not really what I'm aiming for though. I want to free cached stuff before it gets paged out. Even the pageout wastes resources.
Robert O'Callahan wrote: > Neil wrote: >> Jonas Sicking wrote: >>> I think if we were to just |free()| the cached items then they >>> wouldn't get paged in even if they were paged out.
>> Doesn't the CRT track free blocks in a chain or some such? That would >> involve writing to the freed block (or its header).
> It does.
> This is not really what I'm aiming for though. I want to free cached > stuff before it gets paged out. Even the pageout wastes resources.
Very true indeed.
I think for textruns it would be good to try to keep the data in memory until the page is fully loaded. During loading I bet that we do a lot of measuring since we are continuously reflowing and rerendering the page. Given a timeout of 5 seconds I think we should accomplish that unless the network is really slow, in which case the network will be the limiting factor anyway.
Of course, we should keep in mind the other timers that we have that throttles reflows and the likes during loading.
Jonas Sicking wrote: > Robert O'Callahan wrote: >> Neil wrote: >>> Jonas Sicking wrote: >>>> I think if we were to just |free()| the cached items then they >>>> wouldn't get paged in even if they were paged out.
>>> Doesn't the CRT track free blocks in a chain or some such? That would >>> involve writing to the freed block (or its header).
>> It does.
>> This is not really what I'm aiming for though. I want to free cached >> stuff before it gets paged out. Even the pageout wastes resources.
> Very true indeed.
> I think for textruns it would be good to try to keep the data in memory > until the page is fully loaded. During loading I bet that we do a lot of > measuring since we are continuously reflowing and rerendering the page. > Given a timeout of 5 seconds I think we should accomplish that unless > the network is really slow, in which case the network will be the > limiting factor anyway.
> Of course, we should keep in mind the other timers that we have that > throttles reflows and the likes during loading.
I have seen portals taking up to 7 seconds to load all their adds and stuff.
But, as roc said that time would start after an object hasn't been used, to the timer would probably be constantly reset during a load.
I wonder if there's a way to find reoccurring objects used every now and then. I bet in ajax applications, we could have those. Maybe clobbering after load more aggressively and loosely later may be a strategy to speed up DHTML apps that switch back and forth, like menus and the like.
Axel Hecht wrote: > But, as roc said that time would start after an object hasn't been used, > to the timer would probably be constantly reset during a load.
In practice we won't be resetting timers all the time but doing some approximation with a repeating timer and age bits, but yes, if an object gets touched frequently, it won't be released.
> I wonder if there's a way to find reoccurring objects used every now and > then. I bet in ajax applications, we could have those. Maybe clobbering > after load more aggressively and loosely later may be a strategy to > speed up DHTML apps that switch back and forth, like menus and the like.
We could count the number of resurrections and increase the age limit for such objects, if that seems necessary.