# Re: Sprouts notation

20 views

### Dan Hoey

Dec 24, 2008, 3:00:38 PM12/24/08
Yper Cube sent me a note requesting input on Sprouts notation suitable
for a position library as in GLOP. With his permission, I'm
responding to sprouts-theory generally.

Yper Cube wrote:
> I liked the parenthesis idea.It doesn't work in positions arousing from
> torii and proj. planes surfaces but maybe it can still be used sometimes.
> (Example: we can't use it in .abcabc.
> but we can write .abacbc.
> as .(b)(b). , can't we?

I think this is a good idea when possible.

> I would also prefer GLOP to use "{" to mark the start of a region and not
> only to mark its ending and "." to mark for boundary starting as well. Like,
> 0.0.AB.}AB.}]!
> use something like
> {.0.0.AB.}{.AB.}]!
> or maybe
> [.0.0.AB.}{.AB.}]
>
> In a recent sprouts theory post, I mentioned another possible technique to
> copmact notation further.
> Use of * or ^ when we have multiple identical boundaries.
> Example. instead of 0.0.0.0.}], use:
> 0*4.}]
> (GLOP uses the above now)
> or
> [{(.0.)^4}]
> or
> [{(.0.)*4}]

I strongly disagree with this. We should save our matching brackets
for things that need to be grouped recursively, whereas boundaries
within regions within lands are handled by hierarchy. Boundaries
within a region should be _separated_ (as with "."), regions should be
separated (say, with "/"), and lands should be separated (say, with
"%"). The use of a trailing separator is superfluous and should be
avoided, so the above would be 0.0.AB/AB . The "!" for
end-of-position should only be needed if there's some other thing
following the position. A trailing separator should be permitted on
input, but ignored.

For nonplanar regions, I suggest a region should be suffixed with +n
for T^n or -n for P^n. Since the region will always be followed by
"/", "%", "!", or end-of-string, there is no restriction on multiple
digits for n.

> Using parenthesis, we can extend this to include more complex objects (than
> boundaries)
> Like, instead of 0.0.0.0.2.2.2.2.2.AB.CD.EF.}AB.}CD.}EF.} ,use:
> {(.0.)*4(.2.)*5(.AB.{.AB.})*3}

This can be done in several ways without the leading or trailing
separators. Perhaps the simplest is to always use a kind of brackets
(say "<>"), so that <0.>4 means 0.0.0.0
<1.2A/>2 means 1.2A/1.2A
<12>32 means 1212122
<(>3<(<)>2>3 means (((())())()) .
The <...> is always followed by a single digit specifying the repeat.
A position canonicalizer may not want to use all of these constructs,
but they should be recognized on input.

Your last example in which the parentheses imply a change of variables
is somewhat problematic. I think it should be done with a different
approach, which I call the "partial position" representation. I've
talked about partial positions before--I think they will eventually
appear in the database by themselves, along with simplifications such
as the ^x:x.0 = ^x:x2 that has been observed before.

A partial position is a land that has some unmatched pivots called
"parameters". The parameters are listed in the front of the partial
position in a sort of lambda notation, like ^xyz:1xy2z.2 . This
refers to a part of a position whose pivots are to be matched up with
another part of the position in a way to be determined. Copies of the
partial position may be matched up in several different places.

When there is only one parameter, the presence of that parameter
outside the partial position denotes a pivot attached to a copy of the
partial position. So ^x:xA/A1%xx.x means AB.C/AD/D1/BE/E1/CF/F1 .
Partial positions can instantiate other partial positions:
^x:x((1))%^z:x.xz%zzx means
ABC/D.EA/D((1))/E((1))/F.GB/F((1))/G((1))/C((1)) .

With a multiple-parameter partial position instantiated more than
once, we must specify which parameters go with which instantiation.
In the simplest case, on a plane where the matching pivots appear in
the same region (and therefore on a single boundary), the parameters
parenthesize, so we can specify parameters as [x...y...z] without
ambiguity. For instance, ^xyz:1xy2z.2%xy[z1[xyz]yx]z would mean
ABF1GHJEDC/1AB2C.2/1DE2F.2/1GH2J.2 . In more complicated situations,
we can specify multiple parameter sets for a partial position as in
^xyz^tuv:x(y(z))%xtuyvz for ADEBFC/A(B(C))/D(E(F)) . The primary
parameter set can still be parenthesized if necessary, but the others
must be unambiguously matched with each other.

Finally, I would like a construct for specifying the join of two
partial positions. I propose using "=xyz=tuv" , where the first "="
replaces the preceding "%" character. So ^xyzw:xyzw1=xyzw=xywz refers
to the (nonplanar) position ABCD1/ABDC1

On alphabets: Glop introduced the idea of lower-case characters for
pier spots, which can appear on multiple boundaries within a position.
upper-case characters were reserved for pivots. The problem is that
we sometimes run out of one kind or another. With the addition of
parameter letters, the problem is worse, so I propose a more general
approach. For readability, a program is encouraged to use ABC... for
pivots, abc... for labeled pier spots, and xyz... for parameters so
far as possible. The rules for recognizing which is which should be
loosened so that any letter can be used for any of these purposes.

A letter that appears in a lambda expression "^xyz:" is always a
parameter, and those are the only parameters. For multiple parameters
in brackets, each parameter must appear just once except within inner
brackets. Other multiple parameter sets appearing on a boundary are
assumed to match each other, so ^xy%xy.xy means AB.CD/AB/CD; we need
^xy^zw:xy%xz.yw to refer to AC.BD/AB/CD . No other unbracketed
parameters of the set may appear on that boundary. Other sets of
unbracketed parameters that appear within a region are assumed to
match each other, so ^xy%Axy/A1xy means ABC/A1CD/BC/CD ; again we
can't have other stray members of the set in the region. If none of
these is possible, each set of parameters can appear only once in a

Excluding the parameters, a letter that appears twice on a boundary is
a pier spot, and may appear only twice on that boundary. Excluding
these, a letter that appears on different boundaries in the same
region may appear only twice in the region, and the two letters match
each other. (This can only happen on nonplanar boards). After all of
these are taken into account, remaining letters must appear exactly
twice per land, and match each other.

Just in case we end up with a need for more than 52 letters, I believe
we should allow an sequence like &word; to be used for a letter, where
"word" is composed entirely of letters.

So the grammar we have is

<position> := <gamelabel> <spec> | <spec> | <position> "!"
<gamelabel> := <number> ( "+" | "-" ) ":"
<spec> := <simple>
| <partial> "=" <letters> "=" <letters>
| <partial> "%" <spec>
<partial> := "^" <letters> ":" <land>
| "^" <letters> ":" <partial>
<regular> := <land>
| <land> "%" <regular>
| <regular> "%"
<land> := <toporegion>
| <toporegion> "/" <land>
| <land> "/"
<toporegion> := <region>
| <region> ( "+" | "-" ) <number>
<region> := <boundary>
| <boundary> "." <region>
| <region> "."
<boundary> := <subboundary>
| <subboundary> <boundary>
<subboundary> := <site>
| "(" <boundary> ")"
| "[" <letter> <boundary> <letter> "]"
<site> := <letter> | 1 | 2 | "&" <letters> ";"
<letters> := <letter> | <letter> <letters>

This grammar accidentally prohibits extended letters as parameters,
but maybe that's a good thing. There are other requirements such as
that parameter lists can't repeat a letter, but that is for symantic
analysis.

The gamelabel is in case someone needs to mark the position as normal
or misere or specify an initial number of spots. I suppose I should
allow the number or sign to be omitted.

The use of "<>" for repetition is a metalanguage used for abbreviating
this language. It is expanded textually first before applying this
grammar.

We still have a set of brackets "{}" that aren't used, but I find they
tend to look too much like parentheses to be very useful. Other
symbols such as @~#\$*_\,;?'"` may turn out to have some use as well.

Finally, I should note that GLOP databases need to have a version
number at the top so that older versions can be recognized and

Dan

### Josh Purinton

Jan 4, 2009, 2:16:31 PM1/4/09
On Wed, Dec 24, 2008 at 3:00 PM, Dan Hoey <Ho...@aic.nrl.navy.mil> wrote:
> Yper Cube wrote:
>>
>> I liked the parenthesis idea.It doesn't work in positions arousing from
>> torii and proj. planes surfaces but maybe it can still be used sometimes.
>> (Example: we can't use it in .abcabc.
>> but we can write .abacbc.
>> as .(b)(b). , can't we?

I'm also a huge fan of the parenthesis idea. It's so obvious in
retrospect, but no one (besides Dan) thought of it. However, I'd
prefer to use angle brackets ("<" and ">") instead of parenthesis,
since I want to reserve parenthesis for more general purposes, such as
enclosing the whole game (see below).

> We should save our matching brackets
> for things that need to be grouped recursively, whereas boundaries
> within regions within lands are handled by hierarchy. Boundaries
> within a region should be _separated_ (as with "."), regions should be
> separated (say, with "/"), and lands should be separated (say, with
> "%").

I agree, except I'd prefer to see "+" as the component ("land")
separator,since it's use for this purpose has a long history in
combinatorial game theory. I would retain "%" as the separator between
partial positions and the final position in which they are
instantiated.

> For nonplanar regions, I suggest a region should be suffixed with +n
> for T^n or -n for P^n. Since the region will always be followed by
> "/", "%", "!", or end-of-string, there is no restriction on multiple
> digits for n.

I'd like to avoid the use of plus and minus here, since they have
other connotations in sprouts notation. Perhaps instead we could write
Tn or Cn as a suffix. For example, we would write (0.0+1<1>)T3. This
would give us a notation for representing sums of games on different
surfaces, such as 0.0C3+1<1>T3. This might be useful if it turns out
that certain games on more exotic surfaces are equivalent to other
games on simpler surfaces, in the same way that we consider some
sprouts components to be equivalent to nim-heaps and write, for
example, *2+1<1>.

> Finally, I would like a construct for specifying the join of two
> partial positions. I propose using "=xyz=tuv" , where the first "="
> replaces the preceding "%" character. So ^xyzw:xyzw1=xyzw=xywz refers
> to the (nonplanar) position ABCD1/ABDC1

I don't understand this. Do you have a planar example? Also, what

### danny purvis

Jan 6, 2009, 12:47:31 AM1/6/09
The bad thing about angle brackets is that they might not work so well in html contexts.

From: Josh Purinton <josh.p...@gmail.com>
Cc: Yper Cube <yper...@gmail.com>
Sent: Sunday, January 4, 2009 2:16:31 PM
Subject: Re: Sprouts notation

### Dan Hoey

Jan 11, 2009, 1:25:47 PM1/11/09
to sprouts-theory
Josh Purinton wrote:

> I'm also a huge fan of the parenthesis idea. It's so obvious in
> retrospect, but no one (besides Dan) thought of it. However, I'd
> prefer to use angle brackets ("<" and ">") instead of parenthesis,
> since I want to reserve parenthesis for more general purposes, such
> as enclosing the whole game (see below).

I suppose this makes sense, since parentheses are traditional for
general grouping. The problems with HTML would arise in any case,
since the alternative is to use angle brackets for repetitions.

[ On my suggestion of separating boundaries with ".", regions with
"/", and lands with "%"]

> I agree, except I'd prefer to see "+" as the component ("land")
> separator,since it's use for this purpose has a long history in
> combinatorial game theory. I would retain "%" as the separator between
> partial positions and the final position in which they are
> instantiated.

That's fine, though I still want to be able to instantiate partial
positions in a multiple-land position. It might even be worthwhile to
allow a partial position to include multiple lands, so that the
partial position ^x:x1 could be written ^x:x2+1 or even ^x:x2+*1.

> > For nonplanar regions, I suggest a region should be suffixed with
> > +n for T^n or -n for P^n. Since the region will always be
> > followed by "/", "%", "!", or end-of-string, there is no
> > restriction on multiple digits for n.

> I'd like to avoid the use of plus and minus here, since they have
> other connotations in sprouts notation. Perhaps instead we could write
> Tn or Cn as a suffix.

That sounds good.

> For example, we would write (0.0+1<1>)T3.

This is confusing, since the topology notation refers to a particular
region. Each region has its own topology signifier (or none, if the
region is planar). So is this supposed to mean 0.0T3+1<1>T3? I would
rather not apply a topology signifier to all the regions in a
position, since different regions usually end up with different
topologies.

> > Finally, I would like a construct for specifying the join of two
> > partial positions. I propose using "=xyz=tuv" , where the first
> > "=" replaces the preceding "%" character. So
> > ^xyzw:xyzw1=xyzw=xywz refers to the (nonplanar) position
> > ABCD1/ABDC1

> I don't understand this. Do you have a planar example? Also, what
> about joining three partial positions?

This notation could be used for joining an arbitrary number of partial
positions. The reason for the equal sign notation is that a partial
position is normally instantiated by mentioning the exterior site of a
pivot that appears in the partial position. But if the exterior site
of one partial position is internal to the other, this doesn't work.
Perhaps the simplest case is the empty loop partial position
^xy^st^uv:xy!. Three of them could be put together in a cycle as
^xy^st^uv:xy=xsu=tvy!. This is just a long way of saying AB/BC/CA!.

On further consideration, we may want to use partial positions only in
cases where the external points appear in a single region. In this
case, we canonicalize a position P by taking its region graph, R(P).
The vertices of R(P) are the regions of P, and the edges of R(P)
connect regions that share a pivot in P. We separate R(P) into its
singly-connected components (SCCs): two vertices are in the same SCC
if there are two disjoint paths between them.

The SCCs form a tree. A tree is canonicalized by height: The height
of a leaf node is zero, and a non-leaf node of degree N has height one
greater than the maximum height of its N-1 neighbors of least height.
A neighbor of equal or greater height is called the parent. When all
heights are assigned, there will either be a root node of greatest
height with no parent, or two nodes tied for greatest height,
connected by an edge, which are each other's parents.

We create a partial position for each SCC except the root (if any).
The partial position's parameters are the pivots shared with the
parent, and the partial position refers to the partial positions of
which it is the parent. Of course, if two partial positions are
isomorphic, only one need be defined. If there is a root, it is the
position at the end of the partial positions. If there are two
co-parents, then the equals-sign notation is used to express their
connection.

As an example, consider the position 0.A/AB/BC.DE/0.C/DF/EF!. The
SCCs at height 0 are 0.A, 0.C, and DF/EF. The other two regions are
co-parents. So the partial position notation would be
^x:0.x%^yz:yA/zA%^s:sx%^t:tx.yz=s=t!.

### Dan Hoey

Jan 12, 2009, 4:46:20 PM1/12/09
I wrote:
> As an example, consider the position 0.A/AB/BC.DE/0.C/DF/EF!. The
> SCCs at height 0 are 0.A, 0.C, and DF/EF. The other two regions are
> co-parents. So the partial position notation would be
> ^x:0.x%^yz:yA/zA%^s:sx%^t:tx.yz=s=t!.

On further consideration, the percent signs are superfluous. A partial
position can be followed by a "^", introducing another partial
position, an "=", introducing a coparent correspondence, or a ":"
(instead of a "%"), introducing the root SCC. Then this position
would be ^r:0.r^st:sA/tA^u:ru^v:rv.st=u=v!.

If we are representing a multi-land position, I suggest that coparent
pairs appear first (each as an "=...=..." clause) followed by the
root SCCs, where all but the first root SCC can be introduced with a

So 0.A/AB/0.B+0.A/AB/BC/0.C+0.A/AB/BC/CD/0.D+0.A/AB/BC/CD/DE/0.E would
be written ^r:0.r^s:rs^t:st=s=s=t=t+rr+ss!. Of course, this would
probably be placed in the database as r:2r^s:rs^t:st=s=s=t=t+rr+ss!,
since ^r:0.r = ^r:2r .

Note that coparent correspondences do not have topological signifiers
(since they don't represent any new regions) but the regions of the
SCCs may be suffixed with Tn or Cn for nonplanar topologies.

This decomposition into SCCs is a first step toward graph isomorphism
canonicalization.

Dan

### Josh Purinton

Jan 13, 2009, 5:48:27 PM1/13/09
On Sun, Jan 11, 2009 at 1:25 PM, Dan Hoey <hao...@gmail.com> wrote:
> That's fine, though I still want to be able to instantiate partial
> positions in a multiple-land position. It might even be worthwhile to
> allow a partial position to include multiple lands, so that the
> partial position ^x:x1 could be written ^x:x2+1 or even ^x:x2+*1.

Looks good. Perhaps we could write that as ^x:(x2+1) or ^x:(x2+*1) if
it was part of a larger position specifier.

> [On my suggestion of applying topology signifiers to multiple regions]

>
> This is confusing, since the topology notation refers to a particular
> region.

I see what you mean. I like your suggestion of 0T3+1<1>T3.

If I understand the notation for joining multiple partial positions,
the syntax is
[partial position specifiers for sites abc...xyz...]=abc...=xyz...
and the meaning is that all the partial positions involving the sites
abc...xyz... are instantiated such that a & x are two sites of the
same spot, b & y are two sites of the same spot, c & z are two sites
of the same spot, etc. Clever.

See the attached diagram of your example ^xy^st^uv:xy=xsu=tvy.

partial.png

### Dan Hoey

Jan 13, 2009, 7:38:42 PM1/13/09
Josh Purinton wrote:

> If I understand the notation for joining multiple partial positions,
> the syntax is
> [partial position specifiers for sites abc...xyz...]=abc...=xyz...
> and the meaning is that all the partial positions involving the sites
> abc...xyz... are instantiated such that a & x are two sites of the
> same spot, b & y are two sites of the same spot, c & z are two sites
> of the same spot, etc. Clever.
>
> See the attached diagram of your example ^xy^st^uv:xy=xsu=tvy.

Yes, that's what the "=" notation means. I see I totally omitted
explaining how I intended it to set up a matching between parameters,
so thanks for your patient reconstruction.

But if we use partial positions for simply-connected components, then
that wouldn't be used because AB/BC/CA! is not simply-connected. In an
SCC tree, the "=" notation is only used for co-roots. Those would be
partial positions representing SCCs at the same height that appear on
opposite sides of a boundary, like ^xy:xyAB/BC/CA=xy=xy! for
ABCD/ABEF/CG/GD/EH/HF!.

Dan

### Dan Hoey

Jan 22, 2009, 6:33:34 PM1/22/09
I propose using parentheses in this neo-Glop notation in a different
way: Parentheses can enclose a pivot/loop chain as proposed by Josh
that parentheses be used in pivot/loop notation to enclose neo-Glop
notation, we can use double sets of parentheses if neo-Glop needs
grouping. I'm not sure about a good notation for repetition in
neo-Glop, but pivot/loop notation makes most of it unnecessary.

When parentheses are used in pivot/loop notation, I expect they
introduce a new boundary in the region. In neo-Glop, parentheses
may introduce part of a boundary, as in 0.2(L1)<(P2)> , which means
0.2AB<C>/0.AB/0.0.C .