
Code rewriter


Graham Dicker

Jul 20, 2023, 8:41:25 AM
How do I use the Code Rewriter please?

Graham Dicker

Jul 21, 2023, 4:11:37 AM
On Thursday, July 20, 2023 at 1:41:25 PM UTC+1, Graham Dicker wrote:
> How do I use the Code Rewriter please?
Perhaps I should give a bit of background:

I wrote a Dolphin Smalltalk 6.1 application for my employer back in the noughties. They have been using it ever since, almost all day, every day. I retired in 2010. Now they need some changes. I am happy to do the work, but it's 13 years since I worked on it, so I have been looking through it and trying to remember how it works. One of the things I have been doing as I look around is using the Code Mentor. For the code below it complains about ifTrue:/ifFalse: returns instead of and:/or:. I have never used the Code Rewriter before, but I wondered whether it would 'fix' my code so that the Mentor doesn't complain about it. I have no idea how to use it, though.

autoSignonOK: aDriveMapper

"Private - If the auto signon box is ticked but the username/password are not defined, ask the user what to do. Returns true to carry on, false to stop"

(autosignon value and: [ aDriveMapper autoSignonDefined not ]) ifTrue: [
^MessageBox confirm: 'Auto Signon details not defined. Continue anyway?' caption: 'SVPanel'.
].
^true.

Thanks very much for any help

danie...@gmail.com

Jul 22, 2023, 11:28:07 PM
I've used the Code Rewriter a ton at my own workplace, with their product which dates back to Dolphin 5 and is actively developed in D7. It's incredibly useful, and frustratingly difficult to explain. I got a quick primer from another developer, then just dove in and started using it—everything else I know is self-taught, and my experience trying to teach more than the very basics to others was less successful than I'd hoped—in no small part my own fault, I'm sure. There's just something about it that

Perhaps the most useful thing I can give you in a couple paragraphs is this: The match/replace expressions are a superset of normal Smalltalk, and all the added syntax starts with a backtick. I generally refer to them as "pattern variables". The little cheatsheet on the righthand side of the window gives you some hints, but a few examples should help flesh it out:

* `var matches any variable (temp, argument, global, etc—capitalization not important, and this includes special variables like self, super, thisContext)
* `#lit matches any literal (true, false, numbers, symbols, strings, etc)
* self `msg matches any unary self-send.
* The receiver can also be a pattern variable itself, e.g. `#lit `msg
* You can match a message with a fixed number of arguments by hard-coding them, e.g. self `msg: `#arg. Note this will also match binary messages even though you write it like a single-argument keyword message. Additional args don't get their own backtick, the first backtick implies the entire message is a pattern, so e.g. self `msg: `#arg1 withAnother: `#arg2. The names of the selector parts don't influence the match, but when you write a replace expression they must match exactly.
* `.stmt matches a whole statement—from the beginning of the line to a period, basically.
* The @ is where things get complicated, because what it means is highly context-dependent; "use a list" is not an adequate summary:
  * `@exp matches any expression: more-or-less any statement *except* a ^-return. This is the wildcard-of-wildcards; the canonical "any unary message send" is `@rcv `msg, or with args `@rcv `msg: `@arg1 andAnother: `@arg2.
  * `@vars, if it appears inside a temp declaration like | foo `@vars |, instead matches zero-or-more declared temps. The name is meaningless, there's nothing special about "exp" or "vars".
  * `@#lits is valid only inside an array literal and matches zero-or-more elements.
  * `@.stmts matches a sequence of statements...but the first time you try this it will not work right, I almost guarantee it, but I don't have time to try to explain, so feel free to write me when it comes up.
  * `@rcv `@msgs: `@args is the canonical "message with any number of arguments" pattern. The @ in the selector means you can only have one "argument", and it is actually an array of all of them. This is awkward sometimes and I definitely wish the matching were more flexible.
* When it says { = use a block...hooo boy. Subject for another day.
* ``, recurse-into, just means that whatever is matched by ``@arg in e.g. self `msg: ``@arg is also, itself, searched for matches of the whole pattern. Because of the existence of blocks, a "message argument" can easily contain a whole little world of its own, so this definitely comes up.
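
To make the list above a little more concrete, here are a few search/replace pairs as a sketch. The first two follow the shape of the stock Refactoring Browser transformations; the pattern variable names (obj, a, b, c, msg, arg) are arbitrary, and the quoted comments are only labels, not part of what goes in the fields. The notes on what the last pattern matches are taken from the description above, not from a run of the tool.

```smalltalk
"Search for:"
``@obj not not
"Replace with:"
``@obj

"Search for: (the same pattern variable used twice must match equal code both times)"
``@a >= ``@b and: [``@a <= ``@c]
"Replace with:"
``@a between: ``@b and: ``@c

"Search only: any one-argument send to self whose argument is a literal."
"Per the notes above this should match  self foo: 3  and also  self + 1,"
"but not  self foo: bar  (variable argument) or  self foo: 1 bar: 2  (two arguments)."
self `msg: `#arg
```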

Okay, given all that, what you enter in the search field is checked, recursively at every point, against the parse tree of each method being examined (just one if one is selected, a whole class if not, or the whole system if no class is selected—but you can't _de_select so you generally have to open a new window. Also this is very, very slow.). As it goes, it puts whatever _actually_ appeared in the method in place of one of your pattern variables in a dictionary. Then it takes the replace expression, substitutes in those matched chunks for the pattern variables there, and plugs that in where it found the match. Think named backreferences in a regex—indeed, you can think of the whole thing as a regex of sorts, except it's not regular and it's not an expression :p. But it is a pattern, just one that is matched against a parse tree rather than a stream of characters. Everything is parse trees here—that's the most basic thing you have to understand, and the mindset you have to adopt. If you get good at using it you'll be seeing them in your dreams, and you'll likely pick up the ability to basically hand-write a parse tree for simple expressions without much thought.
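
The same match-then-substitute cycle can also be driven from a workspace, which sometimes makes it easier to see what is going on. This is a minimal sketch, assuming the Refactoring Browser classes are loaded in the image: ParseTreeRewriter with replace:with: and executeTree: is the usual Refactoring Browser API, while SmalltalkParser is assumed to be the parser class name in Dolphin 7 (older images are believed to call it RBParser), and formattedCode is assumed to be available on parse nodes. Check the class names in your own image before relying on it.

```smalltalk
| tree rewriter |
"Assumption: SmalltalkParser in Dolphin 7; try RBParser in older images."
tree := SmalltalkParser parseExpression: 'x not not ifTrue: [y foo]'.

rewriter := ParseTreeRewriter new.
"Register one search/replace pair; ``@obj is captured and reused in the replacement."
rewriter replace: '``@obj not not' with: '``@obj'.

"executeTree: answers true if anything was rewritten; the result is then in #tree."
(rewriter executeTree: tree)
	ifTrue: [Transcript show: rewriter tree formattedCode; cr]
	ifFalse: [Transcript show: 'no match'; cr]
```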

Regarding your Code Mentor example, a few things. The Code Mentor absolutely _should_ be able to fix the things it points out. In some cases it can—I *think* there's sometimes a link saying "an automated transformation is available to address this issue"? There's also a set of canned transformations available by pressing the "Transform..." button in the code rewriter pane. The guts of these live in class methods of TransformationRule—and for that matter the match expressions used by the Code Mentor, because that's mostly what they are, live class-side on ParseTreeLintRule and BlockLintRule—the latter being essentially one giant pattern block, like the `{} syntax, which, subject for another day, but you can probably figure out some by looking at them. Also a quick note—all of this is much more developed in Pharo in terms of ease-of-use, though the basic transform functionality is no different. I'm much less familiar with Pharo though, so I find it much more difficult to dig around for examples there.

For your specific example, honestly I like your code the way it is—many of the Code Mentor's rules are very subjective or situational. But it makes for a fine example—a naive transformation to turn that case into a single expression might look like:

Search for:
`@exp ifTrue: [^`@trueRet].
^`@falseRet.

Replace with:
^`@exp ifTrue: [`@trueRet] ifFalse: [`@falseRet].
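
Worked by hand against the autoSignonOK: method from earlier in the thread (so this is the expected shape, not output captured from the tool, and the formatter may lay it out differently): `@exp binds to the parenthesised and: expression, `@trueRet to the MessageBox send, and `@falseRet to true. The second version is a hand-written and:/or: form of the kind the Code Mentor message seems to be asking for; the pattern above does not produce it.

```smalltalk
"Expected result of the search/replace pair above:"
autoSignonOK: aDriveMapper
	"Private - If the auto signon box is ticked but the username/password are not defined, ask the user what to do. Returns true to carry on, false to stop"

	^(autosignon value and: [aDriveMapper autoSignonDefined not])
		ifTrue: [MessageBox confirm: 'Auto Signon details not defined. Continue anyway?' caption: 'SVPanel']
		ifFalse: [true]

"Hand-written equivalent using or: (not generated by the pattern):"
autoSignonOK: aDriveMapper
	^(autosignon value and: [aDriveMapper autoSignonDefined not]) not
		or: [MessageBox confirm: 'Auto Signon details not defined. Continue anyway?' caption: 'SVPanel']
```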

You would need another to handle an early-out-if-false:
"Search for:" `@exp ifFalse: [^`@falseRet]. ^`@trueRet. "Replace with same as above"

However. When I say "naive", well, remember what I said about `@.stmts? It's not just that, it's any match expression extending across multiple statements. To handle this for methods that have other statements at the beginning, the required expressions become more complicated—essentially you have to add:
|`@temps|
`@.stmts.
before each pattern to capture the beginning of the method and pass it through unchanged.
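
Put together, the early-out-if-true pair with that prefix added would look roughly like the sketch below. It is assembled from the pieces described above rather than tested against a particular image, and the pattern variable names are arbitrary.

```smalltalk
"Search for:"
| `@temps |
`@.stmts.
`@exp ifTrue: [^`@trueRet].
^`@falseRet

"Replace with:"
| `@temps |
`@.stmts.
^`@exp ifTrue: [`@trueRet] ifFalse: [`@falseRet]
```

The leading temps and statements are captured only so they can be written back unchanged; the actual rewrite still happens in the last two statements.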

Okay, this is a lot and it barely scratches the surface. I'm going to leave it there even though I probably left out something important—feel free to ask further questions!

Daniel

Graham Dicker

Jul 23, 2023, 5:46:12 AM
Daniel

Thank you for your very extensive explanation. I have now got the rewriter to work by creating a method specifically matching one of the canned transformations. I think my application is probably too tiny to need the code rewriter. It's only around seventy classes and a dozen or so windows in total and I'm probably the only person who will ever do any work on it. But I shall certainly file away your posting for future reference and I am sure it will prove valuable to many other people.

Graham

danie...@gmail.com

Jul 24, 2023, 4:01:33 PM
FWIW, I start reaching for the rewriter when I expect to make the same change as little as...maybe even a half-dozen times, certainly a dozen (depending on the complexity of the required rewrite expression itself). Even if it's not *faster*, it eliminates a source of human error when making repetitive changes—and sometimes you learn something in the process, if your match expression doesn't match something you think it should, and then you go, "oh wait, that's not actually the same". So I wouldn't rule out it being useful to you on an application of that size, though you certainly won't have the sort of sweeping changes affecting hundreds of methods that I've occasionally had.

While I'm here, why don't I leave a quick primer on pattern blocks ("quick", he says—nature of the beast):

There are actually three kinds: "wrapper" blocks, non-wrapper (I'll call them "bare") matching blocks, and replacement blocks. The syntax is internally the same for all three: `{:arg ... | <body of a block—temps and statements> }. They are differentiated by context:
* Wrapper blocks are only meaningful in a search expression. What I'm calling a bare matching block is just a bare block that exists in a search expression—it doesn't know the difference per se, but the semantics are different. In turn a replacement block is just a bare block that exists in a replace expression.
* Bare blocks (match or replace) can appear anywhere an "expression" could—anywhere `@exp would make sense. Wrapper blocks have the same precedence as a unary selector, and they "wrap" whatever is in the "receiver" position.
* Matching blocks, wrapper or not, have (up to/optionally) two arguments: The parse node being matched against, and the "context"—the dictionary of captures so far. The block returns true/false to indicate whether the match should proceed, as you'd expect. For a wrapper block, the node being matched is first checked against the wrapped node, and only if that matches is the block executed.
* Replacing blocks have only the one argument, the context. Their return value must be a *parse node* which is inserted in the tree at that point. I wrote a bunch of helpers to make constructing these more fluid, which I may share if I can find the time to put stuff all together.
* Within a pattern block, you can directly refer to pattern variables, and what you get is whatever's in the context at that variable—whatever it had previously matched (so, no forward references). Internally there's a step that rewrites them—`{:node | `@exp} is rewritten to `{:node :ctx | ctx at: (RBPatternVariableNode named: '`@exp')}. (A replacement block wouldn't have the node arg, and would be much more likely to have explicitly declared the ctx arg in the first place.)

I bet none of this makes a damn bit of sense, so, some examples. These are going to be either trivial or kinda pointless, because the easiest things to think of already have better tools, but they give you an idea of what's *possible*:

"Match a message with any of a list of selectors. The parentheses are important otherwise the block would only wrap `@args."
(`@rcv `@msgs: `@args) `{:msg | "msg is an StMessageNode" #(#foo #bar #foo:bar:) includes: msg selector}

"Perhaps a more real example, match any Duration constructor shorthand. In this case the first block wraps the receiver, the second one wraps the whole unary message send"
`#lit `{:lit | "lit is an StLiteralNode" lit value isNumber} `msg `{:msg | #(#second #seconds #minute #minutes #hour #hours #day #days) includes: msg selector}
"We could use a bare block for the receiver, but we would need to first test if it is literal, as other node types don't understand #value"
`{:node | node isLiteralNode and: [node value isNumber]} ...etc...
"Or, we could even do the whole thing as a single block, but...why?"
`{:node | node isMessage and: [node receiver isLiteralNode and: [node receiver value isNumber and: [#(...list of selectors...) includes: node selector]]]}
"Notice that the version with two wrapper blocks is actually the shortest—I find bare matching blocks are very rare, because usually even just a *little* refining by way of the wrapped node can save a lot of code."

"Replace assignments to a variable with self-sends of a setter—basically a half-assed Abstract Instance Variables refactoring. Search for:"
`var := `@exp
"Replace with:"
`{StMessageNode receiver: (StVariableNode named: 'self') selector: (`var name , ':') asSymbol arguments: {"No backtick, this is a brace array" `@exp}}

In general, one of the best ways to learn about pattern blocks, like anything in Smalltalk, is to stick a breakpoint in one—`#lit `{:lit | self halt} is perfectly valid, and will happily drop you into a debugger deep in the heart of the matching process. You can then inspect the nodes and context to get a better feel for how to talk to them. When writing a replacing block, it can be useful to halt, examine the context, manually write out the replacement expression, parse it, and look at that parse tree to see how you would go about constructing it node-by-node.

For some ideas of what's possible, I wrote a pattern to consolidate repetitive sends like:
self foo.
self bar.
self baz: #abc.
into a cascade:
self
foo;
bar;
baz: #abc.
and note that this works even if the receiver is something more complicated than a variable—which is when it's really useful, as otherwise the cascade is arguably *longer* than the original, as above. The code for that is actually quite complicated to handle in a fully general way, and depends on extensions not present in the base system (though mostly for conciseness). On a similar note, it's possible to rewrite (OrderedCollection new) add: #x; add: #y; ...etc...; yourself to and from (Array with: #x with: #y ...etc...). Really the sky's the limit. If I ever get to release those extensions, I'll release some of these patterns with them.

Have fun! :)

Daniel