I am trying to implement an LRU cache. Several global LRU lists could be accessed concurrently by multiple goroutines, which could be a disaster on a machine with 24 or more cores.
Therefore, it would be great if I could add items to P-local storage and flush the batched items into the LRU list as a whole. This should greatly reduce contention on the global LRU list.
How can I do this? I saw some related GitHub issues, #8281 and #21355, which led me to a project called gls, but the code seems like too much to integrate into my project (and I'd rather not include any third-party package, to avoid potential legal issues). Is there any built-in way to achieve this?
Thanks
The simple, common way--if I understand your need correctly--is to launch a method goroutine per worker, each owning its own state:

type CacheManager struct {
	// Things a worker needs to know, such as the global cache,
	// this specific worker's local cache, etc.
	x int // placeholder for per-worker state
}

func (m *CacheManager) Worker() {
	// Work loop: operate on m's local state, flushing to the
	// global cache as needed.
}

func master() {
	const workers = 8 // placeholder count
	for i := 0; i < workers; i++ {
		m := new(CacheManager)
		m.x = i // set up your per-worker ("thread local") storage
		go m.Worker()
	}
}

Unfortunately this pattern does not seem to appear in any intro guides, which pushes people toward complicated workarounds.
You will need to use cgo. See Dave Cheney's "id" project for ideas. Since goroutines share the underlying OS threads but will not be switched off a thread while in C code, it is fairly straightforward; having a robust typed cache, though, will be problematic.

There are discussions on golang-dev about named goroutines which might be a workaround, but so far the TLS (or PLS) discussion has been a non-starter.
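For illustration only, a minimal sketch of what the cgo route looks like (the tls_get/tls_set helpers are hypothetical names of mine, not from the id project): C thread-local storage is reachable from Go as long as the goroutine stays pinned to its OS thread.

package main

/*
// GNU/Clang thread-local storage: one slot per OS thread.
static __thread long tls_value;
static long tls_get(void)   { return tls_value; }
static void tls_set(long v) { tls_value = v; }
*/
import "C"

import (
	"fmt"
	"runtime"
)

func main() {
	// Pin this goroutine to its OS thread so both cgo calls
	// below see the same thread-local slot.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	C.tls_set(42)
	fmt.Println(C.tls_get()) // prints 42
}

Keep in mind that every cgo call carries a fixed overhead, so on a hot cache path this can erase the very win you are after.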
https://github.com/davecheney/junk/tree/master/id -- but see the warnings. Still, I would use a mutex and shared caches until you can prove that the costs matter before resorting to that library. You might also be interested in github.com/robaho/go-concurrency-test.
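To make the "mutex and shared cache" baseline concrete, here is a minimal sketch (my own illustration, hypothetical names) of the usual map + container/list LRU behind a single sync.Mutex:

package cache

import (
	"container/list"
	"sync"
)

type entry struct {
	key string
	val interface{}
}

// LRU is the simple baseline: one mutex guarding a list plus a map.
type LRU struct {
	mu    sync.Mutex
	max   int
	ll    *list.List               // front = most recently used
	items map[string]*list.Element // key -> element in ll
}

func New(max int) *LRU {
	return &LRU{max: max, ll: list.New(), items: make(map[string]*list.Element)}
}

func (c *LRU) Get(key string) (interface{}, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.ll.MoveToFront(el) // this list move is what makes every read a write
		return el.Value.(*entry).val, true
	}
	return nil, false
}

func (c *LRU) Put(key string, val interface{}) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.ll.MoveToFront(el)
		el.Value.(*entry).val = val
		return
	}
	c.items[key] = c.ll.PushFront(&entry{key, val})
	if c.ll.Len() > c.max {
		oldest := c.ll.Back()
		c.ll.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

Profile this first; it is often fast enough, and it is the easiest version to reason about.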
On Jul 23, 2019, at 9:55 PM, Zihan Yang <whois.z...@gmail.com> wrote:
Cgo sounds good, but I remember that cgo invocations are not cheap? (Not sure where I saw that.) Would you provide a reference to the "Cheney id project"? I can't find much information about it. I will look into the named-goroutine discussion. Thanks.
Instead of starting new goroutines, send requests to existing goroutines via a channel.
In effect you are simulating an N-core processor with per-core local caches and one shared cache, so you can look at how processors manage their caches. Just create N goroutines at the start and profile to see what happens; if necessary you can adjust the number of goroutines to match the load.
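A minimal runnable sketch of that idea (all names hypothetical; a plain counter map stands in for the global LRU list): N long-lived workers receive keys over a channel, batch them in worker-local slices with no locking, and merge each full batch into the shared structure under a single lock acquisition.

package main

import (
	"fmt"
	"sync"
)

type shared struct {
	mu   sync.Mutex
	hits map[string]int // stand-in for the global LRU bookkeeping
}

func worker(keys <-chan string, g *shared, batchSize int, wg *sync.WaitGroup) {
	defer wg.Done()
	batch := make([]string, 0, batchSize) // worker-local: no lock needed
	flush := func() {
		if len(batch) == 0 {
			return
		}
		g.mu.Lock() // one lock acquisition amortized over the whole batch
		for _, k := range batch {
			g.hits[k]++
		}
		g.mu.Unlock()
		batch = batch[:0]
	}
	for k := range keys {
		batch = append(batch, k)
		if len(batch) == batchSize {
			flush()
		}
	}
	flush() // drain leftovers when the channel closes
}

func main() {
	g := &shared{hits: make(map[string]int)}
	keys := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ { // fixed pool created up front, as suggested
		wg.Add(1)
		go worker(keys, g, 32, &wg)
	}
	for i := 0; i < 1000; i++ {
		keys <- fmt.Sprintf("key-%d", i%10)
	}
	close(keys)
	wg.Wait()
	fmt.Println(len(g.hits), "distinct keys recorded")
}

The trade-off is staleness: the global view lags each worker by up to one batch, which is usually acceptable for cache-recency information.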
Note the cost of managing LRU. If there is no clear pattern of repeated access to similar items, you may be better off using random replacement. For example, you could use LRU for the smaller local caches and random replacement for the global cache.
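Random replacement is attractive here precisely because hits need no bookkeeping at all. A tiny sketch of the eviction side, exploiting the fact that Go randomizes map iteration order:

// Evict one arbitrary entry. Go's map iteration starts at a
// random position, so the first key visited is effectively a
// random pick.
func evictRandom(m map[string]interface{}) {
	for k := range m {
		delete(m, k)
		return
	}
}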
But if I were you I'd spend time understanding the access patterns before anything else. If I had no idea of the pattern, I'd pick the simplest implementation and evolve it based on profiling. In other words, do not assume that "greatly reducing contention on the global LRU list" is the right thing to do *unless* profiling tells you so.
The OS file cache is almost certainly going to be better than rolling your own. You can probably still find the studies the Lucene team did in this regard.
As I said before, my use case is a little complicated, so let me explain it.
I am trying to add more caching inside a project called gVisor, an open-source user-space kernel from Google. Simply put, it is a secure container sandbox runtime that leverages hardware-assisted virtualization (VT-x on Intel). Sentry, the core part of gVisor, can run in GR0 or HR3 (it usually runs in GR0, unless it encounters privileged instructions that cannot be executed in non-root mode).
For IO operations, Sentry will most of the time trap down to HR3 and let the host kernel (e.g., Linux) deal with the OS cache, exactly as you said. However, due to the complex architecture and the many layers of abstraction in gVisor, the performance is bad. Therefore, we would like to keep more page cache inside gVisor to avoid trapping to HR3 on every IO operation.

Since we are adding cache to Sentry, there must be a way to manage and reclaim it. LRU is the most intuitive strategy, but I find that the contention overhead completely cancels out the performance improvement brought by the cache. So I would like a way to avoid contention on the same LRU list.
I tried to evaluate the performance with and without the LRU: with a global LRU it is 10x slower than with no LRU at all.
I should have said that my evaluation is just self-written cycle measurement, which is very rough and lacks repeated experiments, so the factor might be different in your case. But the contention on the single sync.Mutex really does hinder performance in my case.

> "what evidence is there a thread-local cache would?"

Strictly speaking, I'm not looking for thread-local but P-local storage (similar to per-CPU data in Linux). Since a P can run only one goroutine at any time, the P-local storage needs no locking. Exclusive locking is needed only when I flush the batched items into the global LRU cache.

Since P-local storage does not exist, I'm thinking about reducing the operations on the list instead. Not every operation needs to delete and reinsert an element in the list; some can just change attributes of the item itself. That is a compromise of the LRU strategy, but this way we only need a read lock when we operate on the item itself rather than on the list.

I am not an expert in caching or Go, so please correct me if I misunderstand anything.
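For what it's worth, a sketch of that compromise (hypothetical names, my own illustration): hits only bump a per-item timestamp under the read lock, with an atomic store so concurrent readers don't race, and eviction takes the write lock.

package cache

import (
	"sync"
	"sync/atomic"
	"time"
)

type item struct {
	val      interface{}
	lastUsed int64 // unix nanoseconds, updated atomically on each hit
}

type approxLRU struct {
	mu    sync.RWMutex // read lock for hits, write lock for insert/evict
	items map[string]*item
}

func (c *approxLRU) Get(key string) (interface{}, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	it, ok := c.items[key]
	if !ok {
		return nil, false
	}
	// Bump recency without touching any shared list. Atomic store,
	// because many readers may hit the same item concurrently.
	atomic.StoreInt64(&it.lastUsed, time.Now().UnixNano())
	return it.val, true
}

func (c *approxLRU) evictOlderThan(cutoff int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for k, it := range c.items {
		if atomic.LoadInt64(&it.lastUsed) < cutoff {
			delete(c.items, k) // deleting during range is safe in Go
		}
	}
}

This gives approximate LRU: readers scale because the read lock is shared, and eviction accuracy depends only on timestamp granularity.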
Jesper Louis Andersen <jesper.lo...@gmail.com> wrote on Wed, Jul 24, 2019 at 10:44 PM:
On Wed, Jul 24, 2019 at 7:16 AM Zihan Yang <whois.z...@gmail.com> wrote:

> I tried to evaluate the performance with and without the LRU: with a global LRU it is 10x slower than with no LRU at all.

This would make me step back a bit and revisit my initial plan. If a global cache isn't helping, what evidence is there that a thread-local cache would? Said another way, the work you are caching must have a certain computational overhead to it which warrants a cache in the first place.
A sync.Map is lock-free on reads and takes a single lock on writes, so it will outperform this solution in all cases.
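A minimal sketch of how that looks (hypothetical wrapper names). Note that sync.Map provides a concurrent map, not LRU ordering, so eviction still needs a separate policy on top:

package cache

import "sync"

var m sync.Map // key (string) -> value (interface{})

// Reads take no lock in the read-mostly fast path.
func get(key string) (interface{}, bool) {
	return m.Load(key)
}

// Writers need no external mutex; sync.Map synchronizes internally.
func put(key string, val interface{}) {
	m.Store(key, val)
}

Per the sync package docs, sync.Map is optimized for read-mostly workloads and for goroutines writing disjoint key sets; under write-heavy load a plain map with a mutex can still win, so benchmark both.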