[9fans] anchors broken in the g command in sam on p9p?

smi...@icebubble.org

Aug 21, 2013, 1:11:49 AM

Maybe someone here can help me make sense of this simple sam session:

,c
this is a file, one of
many files with singular
and/or plurals
.
,y/ / g/.+s$/ p
plurals

I would expect that to have responded with "thisfilesplurals".
According to the docs, g/.+s$/ should check that dot ends with "s".

,y/ / g/^[ao].*/ p
singular
and/or

I would have thought that would return "aoneof".

It looks as if ^ and $ are acting as \n.

What am I missing?

--
+---------------------------------------------------------------+
|Smiley <smi...@icebubble.org> PGP key ID: BC549F8B |
|Fingerprint: 9329 DB4A 30F5 6EDA D2BA 3489 DAB7 555A BC54 9F8B|
+---------------------------------------------------------------+

Rob Pike

Aug 21, 2013, 2:45:18 AM

Nothing. That's exactly what ^ and $ do.

-rob
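
A rough way to see why the session above prints what it does: ^ and $ anchor to line boundaries in the file, not to the edges of dot. The Python sketch below is only a toy model of that behaviour, not sam's implementation. The helper names (y_ranges, g_matches, y_g_p) are made up for illustration, and the anchor translation handles only a leading ^ and a trailing $.

```python
import re

TEXT = "this is a file, one of\nmany files with singular\nand/or plurals\n"

def y_ranges(text, sep):
    """Yield (lo, hi) for the stretches between matches of sep, like y/sep/."""
    lo = 0
    for m in re.finditer(sep, text):
        yield lo, m.start()
        lo = m.end()
    yield lo, len(text)

def g_matches(text, lo, hi, pattern):
    """Toy model of g/pattern/: search inside text[lo:hi], but let ^ and $
    match only at line boundaries of the whole text, not at the edges of dot."""
    starts_line = lo == 0 or text[lo - 1] == "\n"
    ends_line = hi == len(text) or text[hi] == "\n"
    # The edge of dot counts as an anchor only if it is also a line boundary.
    bol = r"(?:\A|(?<=\n))" if starts_line else r"(?<=\n)"
    eol = r"(?:\Z|(?=\n))" if ends_line else r"(?=\n)"
    if pattern.startswith("^"):
        pattern = bol + pattern[1:]
    if pattern.endswith("$"):
        pattern = pattern[:-1] + eol
    return re.search(pattern, text[lo:hi]) is not None

def y_g_p(text, sep, pattern):
    """Model of ,y/sep/ g/pattern/ p: collect the pieces that would print."""
    return [text[lo:hi] for lo, hi in y_ranges(text, sep)
            if g_matches(text, lo, hi, pattern)]
```

With this model, y_g_p(TEXT, " ", r".+s$") selects only the piece "plurals\n", and y_g_p(TEXT, " ", r"^[ao].*") selects only "singular\nand/or", the same pieces the session printed.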

smi...@icebubble.org

Aug 21, 2013, 1:19:18 PM

Rob Pike <rob...@gmail.com> writes:

> Nothing. That's exactly what ^ and $ do.
>

OK. How does one match the start/end of dot in a g// or v// regexp?

> -rob
>

Wait, Rob Pike? That "Rob Pike?" As in the guy who wrote the program?
How fortuitous. :)

I read your paper on sam, as well as the tutorial, and I understand the
examples, but I'm having trouble figuring out how to use sam for Real
Work(TM). Every time I try to, I get unexpected results: either an
error message, or something I didn't expect. One thing that I'm having
trouble with is figuring out how to SHRINK dot. Expanding dot (by
regexp search) is easy:

,c/this is a line of letters/
,x/l..e/p
line
,x/(a|the) l..e [a-zA-Z]+/p
a line of

But if I want to shrink dot, say to just those two (..) characters, how
can I do it? I could y/l/, but I'd need a way to detect on which "l"
dot is being split. To wit, there is also an "l" matched by [a-zA-Z].
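
For readers who think in other regexp libraries, the two x commands above can be modelled in Python as a loop over matches, where each match span plays the role of dot. The second helper narrows each match to a parenthesised group. That is a Python-only analogy for the "shrink" being asked about, not a sam feature, and both helper names are invented here.

```python
import re

LINE = "this is a line of letters"

def x(text, pattern):
    """Toy model of x/pattern/: the span of every match becomes dot in turn."""
    return [m.span() for m in re.finditer(pattern, text)]

def shrink(text, pattern, group):
    """Python-only analogy: narrow each match to one parenthesised group."""
    return [m.span(group) for m in re.finditer(pattern, text)]

def show(text, spans):
    """Return the text covered by each (lo, hi) span."""
    return [text[lo:hi] for lo, hi in spans]
```

Here show(LINE, x(LINE, r"l..e")) gives ["line"], and show(LINE, shrink(LINE, r"l(..)e", 1)) gives ["in"].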

One of the things I had difficulty figuring out was how dot behaves
between subcommands in a compound command:

,c

this is some text for sam
to edit using commands in
a command block, to see
whether or not dot is passed
from command to subsequent command
within the block

.
,x/(.+\n)+/ y/command/ {
/[ \n][a-zA-Z]+/-/[ \n][a-zA-Z]+/-/[ \n][a-zA-Z]+/-/[ \n][a-zA-Z]+/d
a/X/
}

The docs don't state explicitly how dot changes (or doesn't change)
between subcommands. The tutorial says that "{" sets dot for each
subcommand, but doesn't reveal that "{" resets dot *to the same thing*
for each subcommand. As this example shows, a d// followed by an a// is
not the same as a c//, because dot is reset between subcommands.
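
The observation can be sketched with a toy model in Python. It assumes every subcommand in a block receives the same original dot and describes its change as an edit against the original text. The function names and the (lo, hi, replacement) edit format are invented for illustration and say nothing about how sam is actually implemented.

```python
def d_before(n):
    """Hypothetical subcommand: an address n characters before dot, then d."""
    def cmd(text, dot):
        lo, hi = dot
        return (lo - n, lo, "")
    return cmd

def a(s):
    """Append s after dot."""
    def cmd(text, dot):
        lo, hi = dot
        return (hi, hi, s)
    return cmd

def c(s):
    """Change dot to s."""
    def cmd(text, dot):
        lo, hi = dot
        return (lo, hi, s)
    return cmd

def run_block(text, dot, subcommands):
    """Every subcommand sees the SAME dot; edits refer to the original text."""
    edits = [cmd(text, dot) for cmd in subcommands]
    # Apply from the right so that earlier offsets stay valid.
    for lo, hi, repl in sorted(edits, reverse=True):
        text = text[:lo] + repl + text[hi:]
    return text
```

With text "one two three" and dot on "two" (offsets 4 to 7), run_block with [d_before(4), a("X")] yields "twoX three": the X lands after the original dot, not where the deletion happened, whereas [c("X")] yields "one X three".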

A number of sources on the Interweb state that you use sam as your
preferred editor. Maybe that's easier if you happen to be the person who
wrote the program. :) I thought that, perhaps, acme might be more
usable, but the acme docs state that it implements the same command set
as sam (with the exception of the k, n, q, !, and = commands). Do you
still use sam as your day-to-day editor? Or have you switched to
ema^H^H^Hacme?

Another feature (limitation?) of sam is that changes made by subcommands
must be in left-to-right order. IIRC, the acme docs mentioned something
about sorting changes to the file in order to implement its Undo and
Redo functions. Does the sam-like editing in acme have the same
left-to-right limitation? I figure, if anyone would know, it would be
the person who wrote acme, too. :)
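
One way to picture the left-to-right rule is a single pass over the original text with an output cursor that only moves forward. The Python sketch below is a guess at the general idea, not sam's or acme's code. The function name, the edit format and the error message are all invented for illustration.

```python
def apply_in_sequence(text, edits):
    """Apply (lo, hi, replacement) edits, given as offsets into the ORIGINAL
    text, in one forward pass. An edit that starts before the cursor fails."""
    out, pos = [], 0
    for lo, hi, repl in edits:
        if lo < pos:
            raise ValueError(f"edit at {lo} starts before position {pos}")
        out.append(text[pos:lo])   # copy the untouched stretch
        out.append(repl)           # emit the replacement
        pos = hi                   # skip the replaced stretch
    out.append(text[pos:])
    return "".join(out)
```

Edits listed in file order go through in one pass. The same edits listed in reverse order are rejected, which is the kind of constraint described above.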

The use of structural regular expressions looks like it could be very
expressive and, ultimately, very useful. It would be great if I could
figure out how to use sam (or acme, if it's any better) for real life
work.

Thanks! (...for your help ...and for writing this infernal program in
the first place ;) )

Rudolf Sykora

Aug 21, 2013, 4:14:53 PM

On 21 August 2013 07:11, <smi...@icebubble.org> wrote:
>
> Maybe someone here can help me make sense of this simple sam session:
>
> ,c
> this is a file, one of
> many files with singular
> and/or plurals
> .
> ,y/ / g/.+s$/ p
> plurals
>
> I would expect that to have responded with "thisfilesplurals".

,y/ / g/.+s/ p

does it

> According to the docs, g/.+s$/ should check that dot ends with "s".
>
> ,y/ / g/^[ao].*/ p
> singular
> and/or

>
> I would have thought that would return "aoneof".
>

similarly, I believe
,y/ / g/[ao].*/ p
would do it

I think the pattern in g must match the dot entirely...

(sure, I might be wrong, I haven't tested it thoroughly.)

PS.: I believe there are some dark places in the sam language that can
lead to unexpected behaviour. Particularly the line endings are a
pain.

Rudolf Sykora

Aug 21, 2013, 4:23:34 PM

On 21 August 2013 22:14, Rudolf Sykora <rudolf...@gmail.com> wrote:

> I think the pattern in g must match the dot entirely...

Well, sorry, having checked the paper, this is not true.

Ruda

Rudolf Sykora

Aug 22, 2013, 2:24:26 AM

On 21 August 2013 19:19, <smi...@icebubble.org> wrote:

> Rob Pike <rob...@gmail.com> writes:
>
> OK. How does one match the start/end of dot in a g// or v// regexp?

... seems like a good question to me

Steve Simon in his Sam command reference card also uses ^ and $
for his TODAY example, so this might actually be wrong.

Ruda

Rob Pike

Aug 22, 2013, 5:35:45 AM

Short answer: you can't. It would be nice though.

-rob

smi...@icebubble.org

Aug 22, 2013, 1:03:52 PM

Well, I finally figured it out: how to use sam for Real Life Work(TM)!
It took me about 8 hours to figure out, but I finally managed to create
my first practical sam script. I just kind of pulled a Buddha, you
know, "I will not move from this spot until I can program sam!" ;)

Far from being mystical, however, the experience was reminiscent of
coding in Scheme. You have to kind of think of some things backwards,
and the code ends up looking all but unreadable. In fact, sam is
probably LESS readable than Scheme, because all its commands are single
letters, and it doesn't have any syntax for comments.

The script ended up being 1722 bytes long, occupying 50 lines. It's not
perfect; it's not even elegant; but it gets the job done. While I did
not reach enlightenment, I did end up with a script that can convert a
(very poorly-formatted) HTML page into reStructuredText. And it just
might have made me just as happy. :)

Erez Schatz

Aug 25, 2013, 3:58:40 AM

On 22 August 2013 20:03, <smi...@icebubble.org> wrote:

> Well, I finally figured it out: how to use sam for Real Life Work(TM)!
> It took me about 8 hours to figure out, but I finally managed to create
> my first practical sam script. I just kind of pulled a Buddha, you
> know, "I will not move from this spot until I can program sam!" ;)

Care to share your script here? I'd love to see what you came up with.

--

Erez

Dentro: an outliner with an agenda
http://erezschatz.github.com/dentro/

smi...@icebubble.org

Sep 1, 2013, 8:52:54 PM

Erez Schatz <moon...@gmail.com> writes:

> Care to share your script here? I'd love to see what you came up with.

OK. I've stripped out the application-specific data from the script;
here it is in its redacted form:

,{
,y/<form/ g/<head/ s/(.|\n)*>(Expected Page Title)<(.|\n)*updated.*>([0-9]+\/[0-9]+\/[0-9]+)<(.|\n)*/:Title: \2\n:Date: \4\n/
s/<form//
,y/<form/ v/<head/ y/<h2/ v|/h2| d
,y/<form/ v/<head/ s/<h2/\n/
,y/<form/ v/<head/ y/<h2/ {
g|/h2| y/<tr/ {
v/<td/ s/([ >]| )*([^<]*)<\/h2>(.|\n)*/\n\n\2\n==========================\n/
g/<td/ g/Entry#/ y/<td/ {
v|/td| c/\n/
}
g/<td/ v/Entry#/ y/<td/ v|/td| d
g/<td/ y/<td/ {
g|/td| v/colspan="3"/ g/>(Header Foo)/ y/(Header Foo)/ {
v|/td| c/:/
g|/td| s/ : /:/
g|/td| x/(\n|<[^<>]+>\??| )+/ c/ /
g|/td| a/\n/
}
g|/td| v/colspan="3"/ g/>(Header Bar)/ y/(Header Bar)/ {
v|/td| c/:/
g|/td| i/:/
g|/td| x/( |\n|<[^<>]+>\??| )+/ c/ /
g|/td| a/\n/
}
g|/td| v/colspan="3"/ g/>(Entry#|Field Foo|Filed Bar|Field Baz|Field Quux|Field Snarf|Field Barf)/ y/(Entry#|Field Foo|Filed Bar|Field Baz|Field Quux|Field Snarf|Field Barf)/ {
v|/td| c/:/
g|/td| s/[:.?]*/:/
g|/td| x/( |\n|<[^<>]+>| )+/ c/ /
g|/td| a/\n/
}
g|/td| v/colspan="3"/ g/inch/ {
s/[^>]*>( |\n|<[^<>]+>| )*(([a-zA-Z0-9\-]+ )+inch ?([a-zA-Z0-9\-]+ )*[a-zA-Z0-9\-]+)( |\n|<[^<>]+>| )*/:Inches: \2\n/
}
g|/td| v/colspan="3"/ g/Stock/ {
i/:Density:/
x/( |\n|<?[^<>]+>| )+/ c/ /
a/\n/
}
g|/td| g/colspan="3"/ {
s/[^>]*>/:Details:/
y/[^>]*colspan="3"[^>]*>/ x/( |\n|<[^<>]+>\??| )+/ c/ /
a/\n:Notes: \n\n/
}
g|/td| g|checkbox| d
}
}
}
}
,x/<h2|<tr|<td/d

That last line is a bit of a hack. I needed it because there didn't
appear to be any way to delete the "<h2" delimiters from within the
,y/<h2/. But it works because those HTML tags do not appear in the
final reStructuredText.
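
For comparison, the clean-up pass described above amounts to deleting every match of a three-way alternation. A rough Python equivalent, with an invented function name and made-up sample input, might look like this:

```python
import re

def strip_markers(text):
    """Delete every leftover "<h2", "<tr" or "<td" delimiter."""
    return re.sub(r"<h2|<tr|<td", "", text)
```

For example, strip_markers("<h2\nTitle\n<tr<td:Field: value\n") returns "\nTitle\n:Field: value\n".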

A lot of the complexity of the script comes from the need to keep the
changes in sequence. I really hope that the implementation of the sam
language in Acme doesn't impose the same requirement to keep changes in
order; it's a Real Pain(TM).

One of the things that still perplexes me is the apparent necessity of
the s command. The sam paper claims that the s command isn't necessary.
But I couldn't find any way to do the edits without resorting to it. If
you could figure out how to replace the s commands with a combination of
other sam commands, I'd be quite impressed indeed!
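
To make concrete what the s commands contribute, the first substitution in the script copies two parenthesised sub-expressions (\2 and \4) from the match into the replacement. The Python sketch below mirrors that one command with re.sub. The sample PAGE is invented, the function name is hypothetical, and Python's regexp dialect is not identical to sam's, so this is an analogy rather than a translation.

```python
import re

# Invented sample input; the real page was not posted.
PAGE = ("<head><title>Expected Page Title</title></head>\n"
        "<p>Last updated: <b>8/21/2013</b></p>\n")

# Same shape as the pattern in the script; "/" needs no escaping in Python.
PATTERN = (r"(.|\n)*>(Expected Page Title)<(.|\n)*"
           r"updated.*>([0-9]+/[0-9]+/[0-9]+)<(.|\n)*")

def title_and_date(text):
    """Replace the first match with reStructuredText field lines built
    from groups 2 (the title) and 4 (the date)."""
    return re.sub(PATTERN, r":Title: \2\n:Date: \4\n", text, count=1)
```

On the sample, title_and_date(PAGE) returns ":Title: Expected Page Title\n:Date: 8/21/2013\n". The replacement text depends on what was matched, which is the part a fixed-text replacement cannot express.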