Rewriter and Positional

Malte Schwerhoff

unread,

Dec 22, 2011, 8:58:29 AM12/22/11

to ki...@googlegroups.com

Hi all,

it seems that rewriting a tree of nodes which are subtypes of
scala.util.parsing.input.Positional does not preserve the position
information. More precisely, if a node is duplicated internally by
Rewriter.dup the .pos field isn't duplicated.

The following hacky patch to Rewriter.dup overcomes this:

val t1 = ctor.newInstance (children : _*).asInstanceOf[T]
t1 match {
case p: Positional => p.setPos(t.asInstanceOf[Positional].pos)
case _ => ()
}

t1

However, I am quite sure that there is a more elegant and more general
way of handling situations where not all relevant fields are available
through the Product interface.

Cheers,
Malte

Tony Sloane

unread,

Jan 12, 2012, 10:39:41 PM1/12/12

to ki...@googlegroups.com

Hi Malte,

Please excuse my delay in responding. Summer vacation has been too tempting...

On 23/12/2011, at 12:58 AM, Malte Schwerhoff wrote:

> it seems that rewriting a tree of nodes which are subtypes of
> scala.util.parsing.input.Positional does not preserve the position
> information. More precisely, if a node is duplicated internally by
> Rewriter.dup the .pos field isn't duplicated.

Yes, that's true.

> The following hacky patch to Rewriter.dup overcomes this:
>
> val t1 = ctor.newInstance (children : _*).asInstanceOf[T]
> t1 match {
> case p: Positional => p.setPos(t.asInstanceOf[Positional].pos)
> case _ => ()
> }
>
> t1
>
> However, I am quite sure that there is a more elegant and more general
> way of handling situations where not all relevant fields are available
> through the Product interface.

This change would certainly do the trick for many uses, but it would be nice to have a more general scheme. The dup path will only catch those nodes that are created as copies of existing nodes whose children are rewritten.

For example, imagine you have a rule such as the following (which does not involve dup at all):

rule {
case Foo (a, b) => Foo (b, a)
}

Assume that Foo extends Positional. The new Foo construct will not have a position. If you wanted the position from the matched Foo to be transferred across to the new one, you would have to set it explicitly.

I haven't been able to see a simple general scheme that would make this kind of position propagation easier to do than adding lots of setPos calls to the rules. Suggestions are welcome...

cheers,
Tony

Tony Sloane

unread,

Jan 26, 2012, 11:14:23 AM1/26/12

to ki...@googlegroups.com

Further to the recent discussion about rewrites losing position information, I have just pushed a change that adds a new PositionalRewriter object. If you wish to preserve positions in generic rewrites, use PositionalRewriter instead of the standard Rewriter object. Terms with a position in this context means anything that is a scala.util.parsing.input.Positional.

Note that if you have explicit rules that create new nodes, you still have to set their positions manually. The new object only helps with the nodes that are created behind the scenes by rewrites that use the generic traversal operators all, one, some, etc. E.g., in the strategy

everywhere (rule {
case n @ Leaf (i) =>
Leaf (i + 1).setPos (n.pos)
})

we have to explicitly set the position of the new leaves, but not the positions of the new inner nodes that are created by the everywhere.

Assuming no problems spring into view, this change will be in the 1.2 release which I hope to have out in the next few weeks.

cheers,
Tony

Malte Schwerhoff

unread,

Feb 1, 2012, 3:56:40 AM2/1/12

to ki...@googlegroups.com

Hi Tony,

On 1/13/2012 04:39, Tony Sloane wrote:
> [...]

>
>> The following hacky patch to Rewriter.dup overcomes this:
>>
>> val t1 = ctor.newInstance (children : _*).asInstanceOf[T]
>> t1 match {
>> case p: Positional => p.setPos(t.asInstanceOf[Positional].pos)
>> case _ => ()
>> }
>>
>> t1
>>

> [...]

>
> This change would certainly do the trick for many uses, but it would
> be nice to have a more general scheme. The dup path will only catch
> those nodes that are created as copies of existing nodes whose
> children are rewritten.

A possible and quite general solution would be to add callbacks to dup
that are invoked with the original and the clone as arguments. Such
callbacks could pattern match on the pair (original, duplicate), modify
the duplicate and then return it (or even another fresh copy).

E.g.

def keepPosition[T <: Product](orig: T, dup: T): T =
(orig, dup) match {
case (op: Positional, dp: Positional) => dp.setPos(op.pos)
case _ => dup
}

Rewriter.addDupCallback(keepPosition)

Callbacks could e.g. be chained, and maybe return a tuple (Boolean, T)
where the first element is true if the chaining should continue.

It would be even better if we could use Scala's type system to relief
the user from pattern matching on the expected type him/herself, e.g. by
something like

def keepPosition(orig: Positional, dup: Positional): Positional =
dp.setPos(op.pos)

Rewriter.addDupCallback[Positional](keepPosition)

But I am not sure how one would achieve this in a type-safe manner.

> For example, imagine you have a rule such as the following (which
> does not involve dup at all):
>
> rule { case Foo (a, b) => Foo (b, a) }
>

> [...]

>
> I haven't been able to see a simple general scheme that would make
> this kind of position propagation easier to do than adding lots of
> setPos calls to the rules. Suggestions are welcome...

That one is trickier, I agree, and not solved by adding callbacks to
Rewriter.dup. I also don't see a way of solving the problem that
automatically preserves var-fields without explicitly copying them over.

Although ... maybe the callbacks do work here, too, if they could be
triggered after the rule has been applied, and if original and return
object are (still) available. Checking for identity might prevent
invoking the callback although the object hasn't actually been copied.

Cheers,
Malte

Tony Sloane

unread,

Feb 6, 2012, 3:22:08 AM2/6/12

to ki...@googlegroups.com

Hi Malte,

Callbacks are not a bad idea for this kind of customisation. I will add it to my list of things to look at when I have a bit more time. One important factor though: I think it's vital to not add any overhead to the dup operation in the normal course of events, since in a generic traversal it is called a lot. Having *Rewriter objects such as the PositionalRewriter that I added recently means that we can allow extensions but not impose an overhead on people who don't need them.

> It would be even better if we could use Scala's type system to relief
> the user from pattern matching on the expected type him/herself, e.g. by
> something like
>
> def keepPosition(orig: Positional, dup: Positional): Positional =
> dp.setPos(op.pos)
>
> Rewriter.addDupCallback[Positional](keepPosition)
>
> But I am not sure how one would achieve this in a type-safe manner.

Agreed, but I'm not sure how to do it either at the moment.

>> For example, imagine you have a rule such as the following (which
>> does not involve dup at all):
>>
>> rule { case Foo (a, b) => Foo (b, a) }
>>
>> [...]
>>
>> I haven't been able to see a simple general scheme that would make
>> this kind of position propagation easier to do than adding lots of
>> setPos calls to the rules. Suggestions are welcome...
>
> That one is trickier, I agree, and not solved by adding callbacks to
> Rewriter.dup. I also don't see a way of solving the problem that
> automatically preserves var-fields without explicitly copying them over.
>
> Although ... maybe the callbacks do work here, too, if they could be
> triggered after the rule has been applied, and if original and return
> object are (still) available. Checking for identity might prevent
> invoking the callback although the object hasn't actually been copied.

Perhaps callbacks could be made to work in this more general case, but it gets trickier if there are internal nodes. E.g.,

rule { case Foo (a, b) => Bar (Ble (a), Blo (b)) }

which callbacks get called on which nodes?

cheers,
Tony

Malte Schwerhoff

unread,

Feb 6, 2012, 3:59:57 AM2/6/12

to ki...@googlegroups.com

I agree that performance is vital, but I am not sure if checking the set
of callbacks for emptiness causes a significant overhead. I guess only
benchmarks will tell. It might also be possible to check for emptiness
before starting the traversal, and if true, continue using a dup-version
that doesn't consider the callbacks at all.
Then again, having *Rewriter objects could allow customisation beyond
what is possible with callbacks, so it would anyway be nice to have them.

In such cases it is probably best to let the user make such a decision.
The callback could provide the following callback

(origNode, dupNode) match {
case (foo: Foo, bar: Bar) =>
bar.ble.pos = foo.a.pos
bar.blo.pos = foo.b.pos
bar.x = foo.y
bar
}

I would only match callbacks against the input-output pair of rules. If
the callback needs to affect deeper nodes the user has to implement the
callback accordingly.

Cheers,
Malte

Tony Sloane

unread,

Feb 6, 2012, 5:06:23 AM2/6/12

to ki...@googlegroups.com

On 06/02/2012, at 9:59 AM, Malte Schwerhoff wrote:

>> Callbacks are not a bad idea for this kind of customisation. I will
>> add it to my list of things to look at when I have a bit more time. One
>> important factor though: I think it's vital to not add any overhead to
>> the dup operation in the normal course of events, since in a generic
>>> traversal it is called a lot. Having *Rewriter objects such as the
>> PositionalRewriter that I added recently means that we can allow
>> extensions but not impose an overhead on people who don't need them.
>
> I agree that performance is vital, but I am not sure if checking the set
> of callbacks for emptiness causes a significant overhead. I guess only
> benchmarks will tell. It might also be possible to check for emptiness
> before starting the traversal, and if true, continue using a dup-version
> that doesn't consider the callbacks at all.
> Then again, having *Rewriter objects could allow customisation beyond
> what is possible with callbacks, so it would anyway be nice to have them.

Yes, I think we will add them. The performance issues are never entirely
clear, since whether something like this is a problem depends on exactly
what rewrite you are doing. E.g., rewrites involve Scala collection
classes do not use dup, so no overhead would be incurred there.

<snip>

>> Perhaps callbacks could be made to work in this more general case,
>> but it gets trickier if there are internal nodes. E.g.,
>>
>> rule { case Foo (a, b) => Bar (Ble (a), Blo (b)) }
>>
>> which callbacks get called on which nodes?
>
> In such cases it is probably best to let the user make such a decision.
> The callback could provide the following callback
>
> (origNode, dupNode) match {
> case (foo: Foo, bar: Bar) =>
> bar.ble.pos = foo.a.pos
> bar.blo.pos = foo.b.pos
> bar.x = foo.y
> bar
> }
>
> I would only match callbacks against the input-output pair of rules. If
> the callback needs to affect deeper nodes the user has to implement the
> callback accordingly.