Jake
Thanks for your thoughts.
My approach made the assumption that Wait() worked as advertised - all the goroutines are released when the Wait() hits zero.
What you’re pointing out is that Wait() is in some sense racy itself.
I had assumed that no goroutine in a waitgroup was ‘released’ until they all were.
Your explanation says they’re released in some non-deterministic manner. And nobody can tell when.
If the waitgroup is being used in the classic manner (in essence, goroutine A waiting for a bunch of other goroutines to signal completion, with those other goroutines doing nothing after completion), then the ‘non-deterministic release’ semantics are fine. But, obviously, it doesn’t work for my purposes.
This is the likely situation, therefore.
I’m trying to avoid using channel communication to separate the ‘phases’ (the computations interspersed by barriers) for efficiency reasons.
I think the approach is sound - barriers are a known thing - but my misunderstanding of Wait() means that my barrier isn’t sound at all. So I shall seek to build a better barrier, and also measure channel performance, because I think that to make the barrier work as desired I need to construct it from two waitgroups and whack one then the other, being careful about resets.
But with that extra path length I should also look again at channels.
— P
Pete Wilson <peter....@bsc.es>: Jan 16 10:28AM -0600
Gentlepersons
I asked for advice on how to handle a problem a few days ago, and have constructed a testbed of what I need to do, using WaitGroups in what seems to be a standard manner.
But the code fails and I don’t understand why.
(A simple version of) the code is at https://play.golang.org/p/-TEZqik6ZPB
In short, what I want to do is to have a controller goroutine (main) plus some number of worker goroutines
I implement a Barrier function which operates on a properly-initialised waitgroup.
The Barrier function simply does Done() then Wait()
What I want is that each worker does a two-phase operation
- wait until everybody has passed a start barrier
- do some work
- wait until everybody has passed an end barrier
- do some work
.. doing this some number of times
In parallel, main has created and initialised the start and end waitgroups wgstart and wgend
main has then created the worker goroutines (in the real thing I want roughly one worker per core, so there’s also some setting of GOMAXPROCS)
main then enters a loop in which it
- waits until everybody including it has passed the start barrier
- resets the start barrier
- waits until everybody has passed the end barrier
- resets the end barrier
This behaviour is observed, except the code panics, both in the playground and on my machine. Typical failure is:
----------- [2] main about to barrier start ---------------
w[7] enters barrier start
panic: sync: WaitGroup is reused before previous Wait has returned

goroutine 10 [running]:
sync.(*WaitGroup).Wait(0xc00002c030)
	/usr/local/go-faketime/src/sync/waitgroup.go:132 +0xae
main.Barrier(0x4, 0x4bef21, 0x3, 0xc00002c030)
	/tmp/sandbox686473236/prog.go:51 +0x12b
main.worker(0x4, 0xc00002c020, 0xc00002c030, 0xa)
	/tmp/sandbox686473236/prog.go:35 +0x309
created by main.main
	/tmp/sandbox686473236/prog.go:78 +0x295
What have I misunderstood and done wrongly?
Thanks!
— P
WARNING / LEGAL TEXT: This message is intended only for the use of the individual or entity to which it is addressed and may contain information which is privileged, confidential, proprietary, or exempt from disclosure under applicable law. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, you are strictly prohibited from disclosing, distributing, copying, or in any way using this message. If you have received this communication in error, please notify the sender and destroy and delete any copies you may have received.
http://www.bsc.es/disclaimer
|
"jake...@gmail.com" <jake...@gmail.com>: Jan 16 08:59AM -0800
There may be other problems as well, but the WaitGroup.Add
<https://golang.org/pkg/sync/#WaitGroup.Add> documentation says:
"If a WaitGroup is reused to wait for several independent sets of events,
new Add calls must happen after all previous Wait calls have *returned*."
You have a race condition. What I believe is happening is the following:
- The last goroutine calls `Barrier(w, "start", wgstart)`. That calls
barrier.Done(). It then calls Wait(), but Wait() has not returned.
- Meanwhile main() calls `Barrier(threads, "start", &wgstart)`. The
Wait() in that call returns because all the goroutines have called Done().
- main() calls `wgstart.Add(threads + 1)`
- The goroutine from above is still in the Wait() call, hence the
panic.
There is also another possible scenario, that is not causing the panic I
see, but could cause incorrect behavior:
- The last goroutine calls `Barrier(w, "start", wgstart)`. That calls
barrier.Done().
- Meanwhile main() calls `Barrier(threads, "start", &wgstart)`. The
Wait() in that call returns because all the goroutines have called Done().
- main() calls `wgstart.Add(threads + 1)`
- The goroutine from above now calls Wait(), but since Add was already
called, it blocks. That goroutine is now 'stuck', because Wait() will
never return, which will in turn end up blocking all the other goroutines
eventually.
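One way to sidestep both scenarios is to stay with the classic WaitGroup usage and never reuse one across phases. The sketch below starts fresh goroutines for each phase and gives each phase its own WaitGroup. The helper name runPhase and the toy work functions are assumptions for illustration, not taken from the original program.

```go
package main

import (
	"fmt"
	"sync"
)

// runPhase starts one goroutine per worker, runs work(w) in each, and returns
// only after all of them have finished. Every call gets its own WaitGroup, so
// no WaitGroup is ever reused while a previous Wait is still returning.
func runPhase(workers int, work func(w int)) {
	var wg sync.WaitGroup
	wg.Add(workers)
	for w := 0; w < workers; w++ {
		go func(w int) {
			defer wg.Done()
			work(w)
		}(w)
	}
	wg.Wait()
}

func main() {
	const workers = 4
	data := make([]int, workers) // each worker touches only its own slot
	for iter := 0; iter < 3; iter++ {
		runPhase(workers, func(w int) { data[w]++ })    // first phase
		runPhase(workers, func(w int) { data[w] *= 2 }) // second phase
	}
	fmt.Println(data) // prints [14 14 14 14]
}
```

The trade-off is that goroutines are created once per phase rather than once per run, so this only makes sense if the work in each phase outweighs that startup cost.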
Honestly, I think you need to rethink your whole model.
Hope that helps.
On Saturday, January 16, 2021 at 11:28:59 AM UTC-5 Pete Wilson wrote:
|
Brian Candler <b.ca...@pobox.com>: Jan 16 01:02PM -0800
On Saturday, 16 January 2021 at 16:28:59 UTC Pete Wilson wrote:
> In short, what I want to do is to have a controller goroutine (main) plus
> some number of worker goroutines
This doesn't answer your question, but if you haven't seen it already I
recommend this video about concurrency patterns in go:
https://www.youtube.com/watch?v=5zXAHh5tJqQ
All of it is well worth watching, but an example of using a semaphore
channel instead of a worker pool starts at 32:15.
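For readers who have not seen the pattern, a semaphore channel can be sketched roughly as follows. This is not a transcription of the code in the talk, and processAll is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll runs fn on every item with at most limit calls in flight at once.
func processAll(items []int, limit int, fn func(int) int) []int {
	results := make([]int, len(items))
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		sem <- struct{}{} // acquire: blocks while limit goroutines are running
		go func(i, item int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			results[i] = fn(item)
		}(i, item)
	}
	wg.Wait()
	return results
}

func main() {
	squares := processAll([]int{1, 2, 3, 4, 5}, 2, func(n int) int { return n * n })
	fmt.Println(squares) // prints [1 4 9 16 25]
}
```

Each task gets its own short-lived goroutine, and the channel capacity caps how many run concurrently, so there is no long-lived pool to coordinate.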
|
Manlio Perillo <manlio....@gmail.com>: Jan 16 10:10AM -0800
I'm reading the https://tip.golang.org/pkg/embed/ package documentation and
I found a possible inconsistency.
At the end of https://tip.golang.org/pkg/embed/#hdr-Directives:
"Patterns must not match files outside the package's module, such as
‘.git/*’ or symbolic links"
and
"Patterns must not contain ‘.’ or ‘..’ path elements nor begin with a
leading slash"
It seems to me that the first phrase is not necessary, since the second
phrase prevents matching files outside the package module.
Thanks
Manlio Perillo
|
Axel Wagner <axel.wa...@googlemail.com>: Jan 16 07:24PM +0100
I don't think they do. There are two examples in the first phrase, which
are not excluded by the second - the ".git" directory and a symbolic link
(pointing outside of the module).
On Sat, Jan 16, 2021 at 7:11 PM Manlio Perillo <manlio....@gmail.com>
wrote:
|
Dmitri Shuralyov <dmit...@golang.org>: Jan 16 10:28AM -0800
I think both are needed; they don't overlap. Note that the second phrase
says "must not contain '.' or '..' path *elements*", with the emphasis on each being a
complete path element. So "./git" is disallowed by the second phrase, but
".git" is not.
On Saturday, January 16, 2021 at 1:10:47 PM UTC-5 manlio....@gmail.com
wrote:
|
Axel Wagner <axel.wa...@googlemail.com>: Jan 16 07:29PM +0100
To put it another way:
The second phrase is a lexical requirement about the pattern. It must not
contain a . or .. element - whether or not the result is included in the
module (e.g. "foo/../foo/bar" is not allowed either, even though it's
equivalent to "foo/bar").
But a lexical path *in* the module might still refer to a file not
included in it, either via a symlink or by being in the .git directory
(and maybe other cases I'm unaware of). So the first phrase excludes any
case where the file is not included in the module, whether or not the name you
refer to it by lexically contains '.' or '..'.
Both phrases are necessary.
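To make the lexical rule concrete, here is a small sketch of a check that mirrors only the second phrase. lexicallyAllowed is a hypothetical helper written for illustration. It is not the go command's implementation, and it knows nothing about symlinks or module boundaries, which is exactly why the first phrase is still needed.

```go
package main

import (
	"fmt"
	"strings"
)

// lexicallyAllowed reports whether pattern satisfies the second phrase:
// no "." or ".." path elements and no leading slash.
// It inspects the text of the pattern only, never the file system.
func lexicallyAllowed(pattern string) bool {
	if strings.HasPrefix(pattern, "/") {
		return false
	}
	for _, elem := range strings.Split(pattern, "/") {
		if elem == "." || elem == ".." {
			return false
		}
	}
	return true
}

func main() {
	patterns := []string{".git/*", "./git", "foo/../foo/bar", "foo/bar", "/etc/passwd"}
	for _, p := range patterns {
		fmt.Println(p, lexicallyAllowed(p))
	}
}
```

Here ".git/*" passes the lexical check, since ".git" is a name that merely starts with a dot, while "./git" and "foo/../foo/bar" fail it.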
On Sat, Jan 16, 2021 at 7:24 PM Axel Wagner <axel.wa...@googlemail.com>
wrote:
|
Manlio Perillo <manlio....@gmail.com>: Jan 16 12:02PM -0800
Thanks. I was only considering the parent of the module's root directory.
Is the concept of "outside the module" defined somewhere?
Manlio Perillo
On Saturday, January 16, 2021 at 19:30:05 UTC+1
|
Manlio Perillo <manlio....@gmail.com>: Jan 16 12:04PM -0800
As an example: is testdata outside the package's module?
Thanks
Manlio
On Saturday, January 16, 2021 at 21:02:25 UTC+1 Manlio Perillo
wrote:
|
Axel Wagner <axel.wa...@googlemail.com>: Jan 16 09:08PM +0100
I think this is the best doc about what is included in a module:
https://golang.org/ref/mod#zip-path-size-constraints
Everything not in that list is "outside" that module.
On Sat, Jan 16, 2021 at 9:02 PM Manlio Perillo <manlio....@gmail.com>
wrote:
|
Dmitri Shuralyov <dmit...@golang.org>: Jan 16 12:08PM -0800
Directories named testdata are included in the module; they're needed for
tests to run. The most important things left out are subdirectories
that contain a go.mod file, since the content of such directories is a
different module.
Some good places to look for full details include
https://golang.org/ref/mod#zip-path-size-constraints and
https://pkg.go.dev/golang.org/x/mod/zip.
On Saturday, January 16, 2021 at 3:04:58 PM UTC-5 manlio....@gmail.com
wrote:
|
Manlio Perillo <manlio....@gmail.com>: Jan 16 12:23PM -0800
https://golang.org/ref/mod#zip-path-size-constraints prevents directories
that begin with a dot, but only because the directory is interpreted as a
package.
It is not clear, to me, if `.git` is ignored by the `embed` directive
because it is the private directory of the VCS or because it starts with a
dot.
Thanks
Manlio Perillo
On Saturday, January 16, 2021 at 21:09:08 UTC+1
|
Dmitri Shuralyov <dmit...@golang.org>: Jan 16 12:45PM -0800
It gets pretty subtle. The ".git" directories aren't included in module
zips by the go command (I don't know if this is documented anywhere, but
it's very sensible behavior), but they aren't disallowed. A custom module
zip may include a ".foo", "_foo", or even ".git" directory with files.
In the phrase you mentioned:
> Patterns must not match files outside the package's module, such as
> ‘.git/*’ or symbolic links
Symbolic links are neither included nor allowed.
.git/* files aren't included by the go tool.
As I understand, the "such as ‘.git/*’ or symbolic links" part is just an
example of some common types of files that aren't included in modules. The
important part of that phrase is "Patterns must not match files outside the
package's module". For example, if you have this tree:
$ tree .
.
├── LICENSE
├── go.mod // module example.com/m1
├── p.go
├── p_test.go
└── nested
    ├── go.mod // module example.com/m1/nested
    ├── foo.txt
    └── ...
Then p.go can't embed "nested/foo.txt", because nested/foo.txt is going to
be outside of the m1 module.
If you're looking to improve package embed documentation, I suggest filing
an issue <https://golang.org/issue/new>. If your goal is to understand this
better for your own interests, I hope you find the nuanced details above
interesting. :)
On Saturday, January 16, 2021 at 3:23:32 PM UTC-5 manlio....@gmail.com
wrote:
|
Axel Wagner <axel.wa...@googlemail.com>: Jan 16 09:47PM +0100
In general, embedding files from directories starting with a dot ("hidden
directories") works fine. But you must take care to mention either the
hidden directory or the file you want to include explicitly, as otherwise
the hidden directory will be skipped by embed (see
https://github.com/golang/go/issues/42328).
.git is thus special. As
https://pkg.go.dev/golang.org/x/mod/zip#CreateFromDir mentions, .git and
similar directories are skipped when creating the zip file of a module,
because they are not deemed "part of the module" (which, I think, makes a
lot of sense), so they can't be embedded based on the rule that embedded
files must be part of the module (i.e. they must be included in the zip
file).
It might be reasonable to spell the skippage of .git etc. out more
specifically in the docs.
On Sat, Jan 16, 2021 at 9:24 PM Manlio Perillo <manlio....@gmail.com>
wrote:
|
"atd...@gmail.com" <atd...@gmail.com>: Jan 16 05:20AM -0800
Hello,
Just wondering if there is a way to create references to other fields in go
comments.
If it does not exist, wouldn't it be something valuable, for easier
navigation in godoc and its new iteration? (I would assume that we would
have to check comments for broken references on code change)
|
Tim Hockin <tho...@google.com>: Jan 15 03:43PM -0800
> example.com/m/submod v0.0.0 => ./submod
> )
> I think you might have tried this already. It gives the same "main module ... does not contain package" error. I believe that's a bug. I've opened #43733 to track it.
Interesting. If that's a bug, then maybe I'll be able to do what I
need once fixed.
> In general, it should be possible to give 'go list' an absolute or relative path (starting with ./ or ../) to any directory containing a package which is part of any module in the build list. For example, some tools list directories in the module cache to find out what package a .go file belongs to.
> As a workaround, you could put a go.mod in an otherwise empty directory (in /tmp or something), then require the relevant modules from the repo and replace them with absolute paths. Then you can run 'go list' in that directory with absolute paths of package directories.
Interesting - is the difference the absolute paths vs relative?
I hoped maybe `-modfile` would do the same trick, but alas not:
```
$ (cd /tmp/gomodhack/; go list /tmp/go-list-modules/submod/used/)
example.com/m/submod/used
$ go list --modfile /tmp/gomodhack/go.mod /tmp/go-list-modules/submod/used/
main module (tmp) does not contain package tmp/submod/used
```
It also fails in some cases:
```
$ (cd /tmp/gomodhack/; go list /tmp/go-list-modules/submod/used/)
example.com/m/submod/used
$ (cd /tmp/gomodhack/; go list /tmp/go-list-modules/staging/src/example.com/other1/used/)
go: finding module for package example.com/m/staging/src/example.com/other1/used
cannot find module providing package
example.com/m/staging/src/example.com/other1/used: unrecognized import
path "example.com/m/staging/src/example.com/other1/used": reading
https://example.com/m/staging/src/example.com/other1/used?go-get=1:
404 Not Found
```
It seems that is because the "main" (top-level dir) go.mod has
`replace` directives with relative paths, which Kubernetes really
does have.
> Incidentally, golang.org/x/tools/go/packages will call 'go list' under the hood in module mode. go/build might do the same, depending on how it's invoked. 'go list' may be the best thing to use if it gives the information you need.
Yeah, I noticed. When GO111MODULE=off, everything I am doing is much
faster. I'm wary of depending on that forever, though.
Stepping back, I fear I am pushing the square peg into a round hole.
Let me restate what I am trying to do.
I want to run a slow codegen process only if the packages it depends
on have ACTUALLY changed (mtime is a good enough proxy) and I don't
know a priori which packages need codegen. I want to scan the file
tree, find the files that need codegen, check their deps, and only
then run the codegen.
We do this today with `go list` and GO111MODULE=off, but I was advised
at some point that x/tools/go/packages was the future-safe approach.
If there's a better way, I am all ears.
Tim
|
Randall O'Reilly <rcore...@gmail.com>: Jan 16 12:45AM -0800
Is the slowness here related to https://github.com/golang/go/issues/29427 ? I'm not sure the earlier fix for that actually fixed the issues I was having (and the issue remains open) -- I need to do more research for my situation, but just in case this might be another reason to revisit the slowness factor there..
- Randy
|
"pat2...@gmail.com" <pat2...@gmail.com>: Jan 15 08:00PM -0800
> got 'can't fork: out of memory' if I set overcommit_memory to 2, change it
> to 0 made this disappear. However for embedded systems I normally set
> overcommit as 2 and no swap to avoid OOM in the field.
The big footprint is from common libraries and runtime system. I believe
this is a clear result of the design decision trying to avoid
"DLL hell" that we all lived with in the early Windows era.
If we all ran a good operating system, such as Tenex, most of the libraries
would be shared automatically by the virtual paging system, so that even a
system with little real memory would be fine. The
early Tenex systems typically had about 480 K of memory. Back in 1969, that
was very expensive. Now that I think about it, the memory size and CPU
speed of those early Tenex systems were far smaller than those of most embedded
microcontrollers. See Tenex, by
BBN: https://en.wikipedia.org/wiki/TENEX_(operating_system)
|
Nikolay Dubina <nikolay.d...@gmail.com>: Jan 15 06:10PM -0800
*What is this?*
Fast feature preprocessing in Go with feature parity to sklearn
https://github.com/nikolaydubina/go-featureprocessing
*What is new?*
- Added batch processing
- Added inplace processing
- Added parallel processing
- Made all inference code run with zero memory allocations
- Added sklearn Python comparison benchmarks
- Did a lot of benchmarking
- Cleaned up API
- Cleaned up generated code a bit
- Made more realistic generated tests
- Made more realistic benchmarks
- Improved documentation
Thanks! 🛠
- Nikolay
|
Ross Light <ro...@zombiezen.com>: Jan 15 03:44PM -0800
The Go CDK stores a Context in a struct while performing I/O:
https://pkg.go.dev/gocloud.dev/blob#Bucket.NewWriter
It could be argued that this is done for compatibility with the io.Reader
and io.Writer interfaces. However, I think this pattern, used sparingly, is
suitable for API interactions where multiple method calls are required for
a single conceptual task.
-Ross
On Tuesday, January 12, 2021 at 9:29:39 AM UTC-8 Jean de Klerk wrote:
|