Re: [go-nuts] A goroutine using an infinite loop


Dave Cheney

Mar 27, 2013, 8:35:49 PM
to Ciaran Doherty, golan...@googlegroups.com
Some suggestions:

> func genRandNum() *[]byte {

You don't need to return a pointer to a slice; a slice value already
refers to its underlying array, so returning it copies only the small
slice header.

> count := 1024
> rand_store := make([]byte, count)
> io.ReadFull(rand.Reader, rand_store)

You must check the error returned by io.ReadFull here.

> return &rand_store
> }
>
> var id_buffer chan []byte = make(chan []byte, 10)
> var gen_ids_once sync.Once
>
> func bufferNewIds() {

Pass in a <-chan struct{} here, which will be used to signal the
goroutine to stop:

func bufferNewIds(done <-chan struct{}) {

> for {
> rand_store := genRandNum()
> hasher := sha1.New()
> hasher.Write(*rand_store)
> id_buffer <- hasher.Sum(nil)[0:20]
> }

Use a select here to either send to id_buffer or receive from done:

select {
case <-done:
	// we're done
	return
case id_buffer <- hasher.Sum(nil)[0:20]:
}

> }
>
> func genIds() {
> gen_ids_once.Do(func() {
> go bufferNewIds()
> })
> }


Also, idiomatic Go uses camelCase, not underscores.
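
Putting the suggestions above together, a minimal sketch might look like the following. The camelCase names, the `count` parameter, the error return values, and passing the channel in as an argument (rather than using the package-level `id_buffer`) are assumptions made for this sketch, not part of the original code.

```go
package main

import (
	"crypto/rand"
	"crypto/sha1"
	"io"
)

// genRandNum returns count bytes read from the system CSPRNG.
// It returns the slice itself (not a pointer) and reports any read error.
func genRandNum(count int) ([]byte, error) {
	buf := make([]byte, count)
	if _, err := io.ReadFull(rand.Reader, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// bufferNewIDs keeps idBuffer filled with 20-byte IDs until done is closed.
// It returns nil on a clean shutdown, or the first read error it hits.
func bufferNewIDs(idBuffer chan<- []byte, done <-chan struct{}) error {
	for {
		randStore, err := genRandNum(1024)
		if err != nil {
			return err
		}
		sum := sha1.Sum(randStore) // fresh [20]byte on every iteration
		select {
		case <-done:
			// we're done
			return nil
		case idBuffer <- sum[:]:
		}
	}
}
```

A caller would create `done := make(chan struct{})`, start the loop with `go bufferNewIDs(ids, done)`, and later call `close(done)` to stop the goroutine.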

DisposaBoy

Mar 28, 2013, 3:46:09 AM
to golan...@googlegroups.com
In addition to what Dave Cheney already said, I'd move the goroutine startup into an init function and let it fill the buffer forever. I wouldn't stop it, because if you only needed, say, ten IDs, you could probably just generate the ten in the init function. The API, if you will, would then consist solely of a function that reads from the buffer channel, or you could expose that channel directly.
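
A rough sketch of that layout, assuming the IDs are simply 20 random bytes and that a failing random source should be treated as fatal; the names `idBuffer` and `NextID` are made up for illustration.

```go
package main

import (
	"crypto/rand"
	"io"
)

// idBuffer holds pre-generated IDs; the capacity of 10 mirrors the original post.
var idBuffer = make(chan []byte, 10)

// init starts the filler goroutine once, at program start-up.
// The goroutine is never stopped; it simply blocks while the buffer is full.
func init() {
	go func() {
		for {
			id := make([]byte, 20)
			if _, err := io.ReadFull(rand.Reader, id); err != nil {
				panic(err) // assumption: an unreadable CSPRNG is fatal
			}
			idBuffer <- id
		}
	}()
}

// NextID is the entire public API: it hands out one buffered ID.
func NextID() []byte {
	return <-idBuffer
}
```

Because nothing outside the package can see the channel or the goroutine, the generation strategy can be changed later without touching callers.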

Johann Höchtl

Mar 28, 2013, 5:26:53 AM
to golan...@googlegroups.com


On Thursday, 28 March 2013 00:27:38 UTC+1, Ciaran Doherty wrote:

> I have a need to get a lot of unique ids very quickly. After reading the docs, I was told that the best way to get unique ids was to use the crypto package, but that this would run slowly. So what I have done is use a channel as a buffer and have an infinite loop in a goroutine to keep this channel filled. The upside is that I 'always' have quick access to a new unique id. The downside is that I have no idea whether or not this is a good idea, or how to safely shut down the process after starting this loop.


Why not use UUIDs? There are a couple of packages out there that give you UUID v4.
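
For reference, a version 4 UUID can also be sketched with the standard library alone: fill 16 bytes from crypto/rand, then overwrite the version and variant bits. The function name `newUUIDv4` is hypothetical, and a maintained UUID package is probably the better choice in real code.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"io"
)

// newUUIDv4 returns a random (version 4) UUID in its canonical
// 36-character text form, e.g. xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx.
func newUUIDv4() (string, error) {
	var u [16]byte
	if _, err := io.ReadFull(rand.Reader, u[:]); err != nil {
		return "", err
	}
	u[6] = (u[6] & 0x0f) | 0x40 // high nibble 0100: version 4
	u[8] = (u[8] & 0x3f) | 0x80 // top two bits 10: RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		u[0:4], u[4:6], u[6:8], u[8:10], u[10:16]), nil
}
```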

Volker Dobler

Mar 28, 2013, 6:31:27 AM
to golan...@googlegroups.com
On Thursday, 28 March 2013 00:27:38 UTC+1, Ciaran Doherty wrote:
> I have a need to get a lot of unique ids very quickly. After reading the docs, I was told that the best way to get unique ids was to use the crypto package, but that this would run slowly. So what I have done is use a channel as a buffer and have an infinite loop in a goroutine to keep this channel filled. The upside is that I 'always' have quick access to a new unique id. The downside is that I have no idea whether or not this is a good idea, or how to safely shut down the process after starting this loop.
>
> Any insights/feedback will be much appreciated. Thank you (code below)

I'd like to challenge the whole setup:
If you need *unique* ids, why not just use a var id uint64 and id++ approach?
Or do you need "unpredictable" ids? Is a tiny chance of id collisions okay?
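
A sketch of the counter approach, using sync/atomic so that several goroutines can draw ids safely. The names `lastID` and `nextID` are made up here, and the ids are unique only within a single process.

```go
package main

import "sync/atomic"

// lastID is the most recently issued id; it starts at zero.
var lastID uint64

// nextID returns 1, 2, 3, ... and is safe for concurrent use.
func nextID() uint64 {
	return atomic.AddUint64(&lastID, 1)
}
```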

I am not a cryptographer, but I think hashing 20 new random bytes is pretty
much overkill:
 1. generating a stream of random bytes with a cryptographically secure PRNG
 2. hashing those with a cryptographically secure one-way function
Which type of attacks do you intend to prevent with this twofold work?

You are not really using SHA-1 as a hash function: you are just taking
20 bytes of input and converting them into 160 bits of output (which is the
same size). I think hashing a uniformly distributed source will not reduce
the probability of collisions.
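
To put a number on "probability of collisions": for n uniformly random ids of b bits each, the usual birthday approximation is p ≈ 1 - exp(-n(n-1) / 2^(b+1)). It depends only on the id length and the number of ids drawn. The helper below is a sketch of that approximation (the function name is made up), not an exact calculation.

```go
package main

import "math"

// collisionProbability approximates the chance that at least two of n
// uniformly random ids of the given bit length coincide (birthday bound).
func collisionProbability(n float64, bits int) float64 {
	x := n * (n - 1) / 2 / math.Pow(2, float64(bits))
	return -math.Expm1(-x) // 1 - e^(-x), computed accurately for tiny x
}
```

For a billion 160-bit ids the result is far below 1e-29, whereas a billion 64-bit ids would collide with a probability of roughly 2.7%.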

You stated that you need these ids "fast", but your code will read as many
bytes from /dev/urandom as you need for your ids. Arguably this happens
in a separate goroutine, so it can be done in parallel, but the problem will
be too little entropy in the kernel's entropy pool to read from /dev/urandom.

Why not seed AES from crypto/rand and generate the ids from AES in
CTR mode?  Most likely this will be much faster.
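
A minimal sketch of that idea: read a key and an initial counter block from crypto/rand once, then take ids from the AES-CTR keystream. The type and function names are hypothetical, the 128-bit key size is an assumption, and the generator as written is not safe for concurrent use without a lock.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"io"
)

// idGen produces ids from an AES-CTR keystream seeded once from crypto/rand.
type idGen struct {
	stream cipher.Stream
}

// newIDGen reads a 128-bit key plus one block of initial counter
// from the system CSPRNG and sets up the CTR stream.
func newIDGen() (*idGen, error) {
	seed := make([]byte, 16+aes.BlockSize)
	if _, err := io.ReadFull(rand.Reader, seed); err != nil {
		return nil, err
	}
	block, err := aes.NewCipher(seed[:16])
	if err != nil {
		return nil, err
	}
	return &idGen{stream: cipher.NewCTR(block, seed[16:])}, nil
}

// next returns the next size bytes of keystream as an id.
func (g *idGen) next(size int) []byte {
	id := make([]byte, size) // all zeros, so XOR leaves the raw keystream
	g.stream.XORKeyStream(id, id)
	return id
}
```

After the one-time seeding, each id costs only a few AES block operations and no further reads from the kernel.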

One piece of advice: never roll your own algorithms for anything cryptographic.

V.

P.S. For me it is easier to review idiomatic (randStore instead of
rand_store) Go code.

Ciaran Doherty

Mar 29, 2013, 7:08:00 AM
to golan...@googlegroups.com
Thanks for replying. The channel length is just something to allow the code to run in an async manner, and 10 is a round number. I am using these to generate an arbitrary number of ids. Moving the goroutine into a startup function is something I will look to do. (Once I get as far as writing a startup function :) )

Ciaran Doherty

Mar 29, 2013, 7:09:52 AM
to golan...@googlegroups.com, Ciaran Doherty
Thank you for your feedback. I will apply these changes to my code. This is just the sort of help I was looking for to help me learn Go :)

Ciaran Doherty

Mar 29, 2013, 7:36:58 AM
to golan...@googlegroups.com

> I'd like to challenge the whole setup:
> If you need *unique* ids why not just use an var id uint64 and id++ approach?
> Or do you need "unpredictable" ids? Is a tiny chance of id collisions okay?

The tiny chance of collisions is the type of id I want. I am making unique references to an arbitrary number of pieces of data. This data could be running over multiple processes. So it is the tiny chance of collision that I am interested in. 

> I am not a cryptographer, but I think hashing 20 new random bytes is pretty much overkill:
>  1. generating a stream of random bytes with a cryptographically secure PRNG
>  2. hashing those with a cryptographically secure one-way function
> Which type of attacks do you intend to prevent with this twofold work?

The hash of 20 new random bytes is because 20 is a nice round number. I really don't know what a good value is here. I am just using this hash as an id to access data. The hashing function has the advantage of giving a consistent output, which makes it much better for an id. (At least this is what I intended to do. I am not trying to defend against any attacks here.)

> You are not really using SHA-1 as a hash function: you are just taking
> 20 bytes of input and converting them into 160 bits of output (which is the same size).
> I think hashing a uniformly distributed source will not reduce the probability of collisions.

Aha. Now this is a mistake on my part. What is the correct way of using a SHA-1 hash as an id?

> You stated that you need these ids "fast", but your code will read as many
> bytes from /dev/urandom as you need for your ids. Arguably this happens
> in a separate goroutine, so it can be done in parallel, but the problem will
> be too little entropy in the kernel's entropy pool to read from /dev/urandom.

Well, I wasn't trying to be clever here, just generating the ids before I needed them. To be honest, this comes from a little project I am doing to teach myself Go. This seemed like a logical place to put a useful goroutine. (It does make my tests run a little quicker.)

> Why not seed AES from crypto/rand and generate the ids from AES in
> CTR mode?  Most likely this will be much faster.

I hadn't thought of this. My starting point for these ids was to look around and see what other software uses for ids (document stores etc.). They seem to use either SHA-1 or SHA-256. Would AES in CTR mode be as good?

> One piece of advice: never roll your own algorithms for anything cryptographic.

Definitely good advice that I always try to live up to (except when I don't).

> V.

On Wednesday, 27 March 2013 23:27:38 UTC, Ciaran Doherty wrote:
Hi,

I have a newbie question here, hopefully some wise insight can be shed on this.

I have a need to get a lot of unique ids very quickly. After reading the docs, I was told that the best way to get unique ids was to use the crypto package, but that this would run slowly. So what I have done is use a channel as a buffer and have an infinite loop in a goroutine to keep this channel filled. The upside is that I 'always' have quick access to a new unique id. The downside is that I have no idea whether or not this is a good idea, or how to safely shut down the process after starting this loop.

Any insights/feedback will be much appreciated. Thank you (code below)


import (
	"crypto/rand"
	"crypto/sha1"
	"io"
	"sync"
)

func genRandNum() *[]byte {
	count := 1024
	rand_store := make([]byte, count)
	io.ReadFull(rand.Reader, rand_store)
	return &rand_store
}

var id_buffer chan []byte = make(chan []byte, 10)
var gen_ids_once sync.Once

func bufferNewIds() {
	for {
		rand_store := genRandNum()
		hasher := sha1.New()
		hasher.Write(*rand_store)
		id_buffer <- hasher.Sum(nil)[0:20]
	}
}

Tamás Gulácsi

Mar 29, 2013, 1:49:32 PM
to golan...@googlegroups.com
Use UUID version 4. That is a 16-byte random ID (122 of its 128 bits are random; the rest encode the version and variant).

Volker Dobler

Mar 30, 2013, 8:09:15 AM
to golan...@googlegroups.com


On Friday, 29 March 2013 12:36:58 UTC+1, Ciaran Doherty wrote:

> > Why not seed AES from crypto/rand and generate the ids from AES in
> > CTR mode?  Most likely this will be much faster.
>
> I hadn't thought of this. My starting point for these ids was to look around and see what other software uses for ids (document stores etc.). They seem to use either SHA-1 or SHA-256. Would AES in CTR mode be as good?

There is no "as good" here: it is either suitable or not, and the SHA of a
document is something completely different from AES in CTR mode.

I admit I am confused. The most typical variants of IDs are:
- Just any handy identifier: Typically sequential ints
- Un-*guessable* identifiers: Typically some strong random numbers
- Un-*forgeable* identifiers: Typically the cryptographic hash of the data

What your code produces are un-guessable IDs. But if you do not intend
to prevent *any* kind of attack, then this is plain overkill: just use 1, 2, 3, ...
Document stores compute a document ID from the document itself,
not from a random source like crypto/rand.


V.
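
For contrast with the random ids discussed in this thread, here is a sketch of a content-derived id of the kind a document store might use. SHA-256 and hex encoding are assumptions made for the example; the function name `documentID` is made up.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// documentID derives an id from the document's bytes, so the same
// content always maps to the same id and no random source is involved.
func documentID(doc []byte) string {
	sum := sha256.Sum256(doc)
	return hex.EncodeToString(sum[:])
}
```

Two processes that store the same document will compute the same id independently, which is exactly what a random id cannot give you.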