Generation of Strings - generation

294 views
Skip to first unread message

The MrU

unread,
Jul 12, 2016, 5:36:15 PM7/12/16
to golang-nuts

Hi,

I have this code https://play.golang.org/p/9o5TReZ7jT3

My idea was to generate all possible combinations pretty much this:

aaa
bbb
ccc
abb
acc
baa
bbb
bcc
caa
cbb
ccc
aba
abb
abc

you get the picture I think.

I got this solution but the objective is to simplify the solution without using channels if possible :

package main

import "fmt"

func GenerateCombinations(alphabet string, length int) <-chan string {
	c := make(chan string)

	go func(c chan string) {
		defer close(c)

		AddLetter(c, "", alphabet, length)
	}(c)

	return c
}

func AddLetter(c chan string, combo string, alphabet string, length int) {
	if length <= 0 {
		return
	}

	var newCombo string
	for _, ch := range alphabet {
		newCombo = combo + string(ch)
		c <- newCombo
		AddLetter(c, newCombo, alphabet, length-1)
	}
}

func main() {

	list := "abc"

	for combination := range GenerateCombinations(list, 3) {
		if len(combination) == 3 {
			fmt.Println(combination)
		}

	}

	fmt.Println("Done!")
}

Paul Borman

unread,
Jul 12, 2016, 6:26:51 PM7/12/16
to The MrU, golang-nuts

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

catal...@gmail.com

unread,
Jul 13, 2016, 11:12:29 AM7/13/16
to golang-nuts

the code don't create goroutine. thus you don't have to care about troubles with managing goroutine.

2016년 7월 13일 수요일 오전 6시 36분 15초 UTC+9, The MrU 님의 말:

Michael Jones

unread,
Jul 14, 2016, 2:17:58 AM7/14/16
to catal...@gmail.com, golang-nuts

The first time this came up I wrote code to generate general alphabet combinations as quickly as I could. The approach I used was a generator object that one calls to set the length and alphabet size, and a Next() function that is called repeatedly until it signals that all values have been generated. The values are handled in a slightly indirect way.

 

There are three use modes:

(a)     Simply advance the internal state by calling Next() until done, this lets you count the number of solutions without looking at the details, as in some of the associated tests.

(b)     Interpret the present state as an array of symbols from the user-supplied alphabet string in a reused array. In this case, there is no allocation or garbage during generation.

(c)     Interpret the present state as a string, such simply creates a string. This is (b) plus string creation and GC overhead. It is about 2.6 to 3.1 times slower.

 

There are two modes of generation as well, ordered (“AA, AB, BB”) and unordered (“AA, AB, BA, BB”). From the original post here, clearly unordered is the desired mode. The speeds of this on my MacBook feel fast…

 

BenchmarkUnordered1-8                200000000                    8.21 ns/op

BenchmarkUnordered2-8                50000000                     32.3 ns/op

BenchmarkUnordered3-8                10000000                    194 ns/op

BenchmarkUnordered4-8                 1000000                     1538 ns/op

BenchmarkUnordered5-8                  100000                     18404 ns/op

BenchmarkUnordered6-8                    5000       255203 ns/op

BenchmarkUnordered7-8                     300       4163767 ns/op

BenchmarkUnordered8-8                      20       81721059 ns/op

BenchmarkUnordered9-8                       1       1883039855 ns/op

 

BenchmarkUnorderedRate1-8       500000000                    3.68 ns/op

BenchmarkUnorderedRate2-8       200000000                    7.92 ns/op

BenchmarkUnorderedRate3-8       100000000                   13.4 ns/op

BenchmarkUnorderedRate4-8       100000000                   18.6 ns/op

BenchmarkUnorderedRate5-8       100000000                   24.1 ns/op

BenchmarkUnorderedRate6-8       50000000                     27.9 ns/op

BenchmarkUnorderedRate7-8       50000000                     32.2 ns/op

BenchmarkUnorderedRate8-8       30000000                     37.3 ns/op

BenchmarkUnorderedRate9-8       30000000                     40.7 ns/op

 

…with speeds for s-symbol alphabets of lengths 1..s being low tens of nanoseconds per combination in array modes and ~3x that in string mode. This is no channel, no allocation or freeing, no GC needed, no recursion. Because it has tests and the code in separate files, it is not really a good go playground candidate. Sorry.

--

Bakul Shah

unread,
Jul 15, 2016, 2:10:32 AM7/15/16
to golang-nuts
What the OP wants is equivalent to generating *all* n digit numbers in base n. For example, given

aa
ab
ba
bb

if you map a to 0 and b to 1 you get numbers 0..3 (base 2). In general you have n^n combinations..

If you consider only the cost of *copying* (or changing an existing string to a different one), it is clear the least cost can be obtained by changing 1 digit at a time to go to the next combination (using base N greycoding - if there is such a thing). So for example, for abc, you’d do something like

aaa
aab
aac
bac
bab
baa
caa
cab
cac

and so on. Then the total cost is n^n-1 digit changes. Though I do not know of an efficient base N greycoding algorithm.

ON the other hand a traditional counter (as below) won’t be significantly more expensive.

var dig [n]int
outer:
for {
for j = n-1; j >= 0; j— {
++dig[j]
if dig[j] < n { /* output and */ continue outer }
dig[j] = 0
}
if j == -1 { break }

Paul Borman

unread,
Jul 15, 2016, 12:57:09 PM7/15/16
to Bakul Shah, golang-nuts
BTW, the solution I provided does change one digit at a time, working just like an odometer.  The OP privately said that even the reporting channel was undesired in my solution, so I sent the OP a link to one that allocated a slice and then filled it in, rather than sending it down a channel.  The OP never responded after I sent a link to the second version (https://play.golang.org/p/bL9g8CM4DC) , so I guess it addressed the OP's issue, or the OP found a different solution.

Bakul Shah

unread,
Jul 15, 2016, 1:08:29 PM7/15/16
to Paul Borman, golang-nuts
Ah. Should've read your original play link more carefully!
You could've made it a bit more efficient by allocating one
large string (N^N+N) and one Printf! Of course, that'd fall
apart once you have more combinations than can fit in memory
but that is easily solved.

Still interested in knowing if anyone has an more efficient
greycode sequence algorithm.

On Fri, 15 Jul 2016 09:56:55 PDT Paul Borman <bor...@google.com> wrote:
>
> BTW, the solution I provided does change one digit at a time, working just
> like an odometer. The OP privately said that even the reporting channel
> was undesired in my solution, so I sent the OP a link to one that allocated
> a slice and then filled it in, rather than sending it down a channel. The
> OP never responded after I sent a link to the second version (
> https://play.golang.org/p/bL9g8CM4DC) , so I guess it addressed the OP's
> issue, or the OP found a different solution.
>
> On Thu, Jul 14, 2016 at 11:10 PM, Bakul Shah <ba...@bitblocks.com> wrote:
>
> > What the OP wants is equivalent to generating *all* n digit numbers in
> > base n. For example, given
> >
> > aa
> > ab
> > ba
> > bb
> >
> > if you map a to 0 and b to 1 you get numbers 0..3 (base 2). In general
> > you have n^n combinations..
> >
> > If you consider only the cost of *copying* (or changing an existing strin=
> g
> > to a different one), it is clear the least cost can be obtained by changi=
> ng
> > 1 digit at a time to go to the next combination (using base N greycoding =
> -
> > if there is such a thing). So for example, for abc, you=E2=80=99d do some=
> thing like
> >
> > aaa
> > aab
> > aac
> > bac
> > bab
> > baa
> > caa
> > cab
> > cac
> >
> > and so on. Then the total cost is n^n-1 digit changes. Though I do not
> > know of an efficient base N greycoding algorithm.
> >
> > ON the other hand a traditional counter (as below) won=E2=80=99t be signi=
> ficantly
> > more expensive.
> >
> > var dig [n]int
> > outer:
> > for {
> > for j =3D n-1; j >=3D 0; j=E2=80=94 {
> > ++dig[j]
> > if dig[j] < n { /* output and */ continue outer }
> > dig[j] =3D 0
> > }
> > if j =3D=3D -1 { break }
> > }
> >
> > On Jul 13, 2016, at 11:17 PM, Michael Jones <michae...@gmail.com>
> > wrote:
> >
> > The first time this came up I wrote code to generate general alphabet
> > combinations as quickly as I could. The approach I used was a generator
> > object that one calls to set the length and alphabet size, and a Next()
> > function that is called repeatedly until it signals that all values have
> > been generated. The values are handled in a slightly indirect way.
> >
> > There are three use modes:
> > (a) Simply advance the internal state by calling Next() until done,
> > this lets you count the number of solutions without looking at the detail=
> s,
> > as in some of the associated tests.
> > (b) Interpret the present state as an array of symbols from the
> > user-supplied alphabet string in a reused array. In this case, there is n=
> o
> > allocation or garbage during generation.
> > (c) Interpret the present state as a string, such simply creates a
> > string. This is (b) plus string creation and GC overhead. It is about 2.6
> > to 3.1 times slower.
> >
> > There are two modes of generation as well, ordered (=E2=80=9CAA, AB, BB=
> =E2=80=9D) and
> > unordered (=E2=80=9CAA, AB, BA, BB=E2=80=9D). From the original post here=
> , clearly
> > unordered is the desired mode. The speeds of this on my MacBook feel fast=
> =E2=80=A6
> >
> > BenchmarkUnordered1-8 200000000 8.21
> > ns/op
> > BenchmarkUnordered2-8 50000000 32.3
> > ns/op
> > BenchmarkUnordered3-8 10000000 194 ns/o=
> p
> > BenchmarkUnordered4-8 1000000 1538
> > ns/op
> > BenchmarkUnordered5-8 100000 18404
> > ns/op
> > BenchmarkUnordered6-8 5000 255203 ns/op
> > BenchmarkUnordered7-8 300 4163767 ns/op
> > BenchmarkUnordered8-8 20 81721059 ns/op
> > BenchmarkUnordered9-8 1 1883039855 ns/op
> >
> > BenchmarkUnorderedRate1-8 500000000 3.68 ns/op
> > BenchmarkUnorderedRate2-8 200000000 7.92 ns/op
> > BenchmarkUnorderedRate3-8 100000000 13.4 ns/op
> > BenchmarkUnorderedRate4-8 100000000 18.6 ns/op
> > BenchmarkUnorderedRate5-8 100000000 24.1 ns/op
> > BenchmarkUnorderedRate6-8 50000000 27.9 ns/op
> > BenchmarkUnorderedRate7-8 50000000 32.2 ns/op
> > BenchmarkUnorderedRate8-8 30000000 37.3 ns/op
> > BenchmarkUnorderedRate9-8 30000000 40.7 ns/op
> >
> > =E2=80=A6with speeds for s-symbol alphabets of lengths 1..s being low ten=
> s of
> > nanoseconds per combination in array modes and ~3x that in string mode.
> > This is no channel, no allocation or freeing, no GC needed, no recursion.
> > Because it has tests and the code in separate files, it is not really a
> > good go playground candidate. Sorry.
> >
> >
> > *From: *<golan...@googlegroups.com> on behalf of catal...@gmail.com
> >
> > 2016=EB=85=84 7=EC=9B=94 13=EC=9D=BC =EC=88=98=EC=9A=94=EC=9D=BC =EC=98=
> =A4=EC=A0=84 6=EC=8B=9C 36=EB=B6=84 15=EC=B4=88 UTC+9, The MrU =EB=8B=98=EC=
> =9D=98 =EB=A7=90:
> >
> > Hi,
> >
> > I have this code https://play.golang.org/p/9o5TReZ7jT3
> > <https://play.golang.org/p/9o5TReZ7jT>
> >
> > My idea was to generate all possible combinations pretty much this:
> >
> > aaa
> > bbb
> > ccc
> > abb
> > acc
> > baa
> > bbb
> > bcc
> > caa
> > cbb
> > ccc
> > aba
> > abb
> > abc
> >
> > you get the picture I think.
> >
> > I got this solution but the objective is to simplify the solution without
> > using channels if possible :
> >
> > package main
> >
> >
> >
> > import "fmt"
> >
> >
> >
> > func GenerateCombinations(alphabet string, length int) <-chan string {
> >
> > c :=3D make(chan string)
> >
> >
> >
> > go func(c chan string) {
> >
> > defer close(c)
> >
> >
> >
> > AddLetter(c, "", alphabet, length)
> >
> > }(c)
> >
> >
> >
> > return c
> >
> > }
> >
> >
> >
> > func AddLetter(c chan string, combo string, alphabet string, length int) =
> {
> >
> > if length <=3D 0 {
> >
> > return
> >
> > }
> >
> >
> >
> > var newCombo string
> >
> > for _, ch :=3D range alphabet {
> >
> > newCombo =3D combo + string(ch)
> >
> > c <- newCombo
> >
> > AddLetter(c, newCombo, alphabet, length-1)
> >
> > }
> >
> > }
> >
> >
> >
> > func main() {
> >
> >
> >
> > list :=3D "abc"
> >
> >
> >
> > for combination :=3D range GenerateCombinations(list, 3) {
> >
> > if len(combination) =3D=3D 3 {

Paul Borman

unread,
Jul 15, 2016, 1:25:34 PM7/15/16
to Bakul Shah, golang-nuts
It wasn't clear to me what the OP wanted.  Did they just want to print them?  Did they want to use them?  For use, I think the channel is probably the most straightforward to use, but the OP didn't want any channels at all.  The underlying array that slices refer to could also be made smaller, for example, set = "ab" and size = 3 can be satisfied with the underlying text of aaababbbaa.  I didn't bother to try and figure out a general algorithm, however, I just did that one by hand.

    -Paul

Michael Jones

unread,
Jul 16, 2016, 1:36:05 AM7/16/16
to Paul Borman, Bakul Shah, golang-nuts

Whatever they wanted, we've put far more into answering it than than they have in responding. I sent my response taking time from the Farnborough Air Show, not wanting their concerns to go unaddressed. Disappointed to see no following reply to all the good advice from everyone here.

Tong Sun

unread,
Jul 22, 2016, 2:42:47 PM7/22/16
to golang-nuts, bor...@google.com, ba...@bitblocks.com
Don't feel frustrated Michael. 

All the good advice from everyone here will benefit many more people besides OP. 

I myself learned a lot (especially, me too don't like the idea why channels has to be involved for such simple task initially, then saw that mind-opening comment from Paul Borman about print vs. use). 

Thanks again everyone!


On Saturday, July 16, 2016 at 1:36:05 AM UTC-4, Michael Jones wrote:

Whatever they wanted, we've put far more into answering it than than they have in responding. I sent my response taking time from the Farnborough Air Show, not wanting their concerns to go unaddressed. Disappointed to see no following reply to all the good advice from everyone here.

On Jul 15, 2016 6:25 PM, "'Paul Borman' via golang-nuts" <golan...@googlegroups.com> wrote:
It wasn't clear to me what the OP wanted.  Did they just want to print them?  Did they want to use them?  For use, I think the channel is probably the most straightforward to use, but the OP didn't want any channels at all.  The underlying array that slices refer to could also be made smaller, for example, set = "ab" and size = 3 can be satisfied with the underlying text of aaababbbaa.  I didn't bother to try and figure out a general algorithm, however, I just did that one by hand.
Reply all
Reply to author
Forward
0 new messages