Reducing algorithm obfuscation


Andrew Simper

Jun 6, 2014, 3:17:31 AM
to julia...@googlegroups.com
In implementations where you want named data, I've noticed that the algorithm gets obfuscated by lots of variable names with dots after them. For example, here is a basic analog model of a state variable filter used as a sine wave generator:

immutable SvfSinOscCoef
    g0::Float64
    g1::Float64
end
immutable SvfSinOsc
    ic1eq::Float64
    ic2eq::Float64
end
function SvfSinOscCoef_Init (;freq=1.0, sr=44100.0)    
    local g::Float64 = tan (2pi*freq/sr)
    local g0 = 1.0/(1.0+g^2)
    SvfSinOscCoef (g0,g*g0)
end
function SvfSinOsc_Init (startphase::Float64)
    SvfSinOsc (cos(startphase), sin(startphase))
end

But the tick function looks a bit messy:

function tick (state::SvfSinOsc, coef::SvfSinOscCoef)
    local v1::Float64 = coef.g0*state.ic1eq - coef.g1*state.ic2eq
    local v2::Float64 = coef.g1*state.ic1eq + coef.g0*state.ic2eq
    SvfSinOsc (2*v1 - state.ic1eq, 2*v2 - state.ic2eq)
end
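
For readers on current Julia (1.x), here is a minimal sketch of the same oscillator. It assumes the modern spellings `struct` (for `immutable`) and no space between a function name and its parenthesis; the names and formulas are otherwise taken from the post. Working through the algebra, with g = tan(theta) the update is a plane rotation by 2*theta, so the state should stay on the unit circle, and with `g` exactly as written above each tick advances the phase by 2*atan(g).

```julia
# Modern-Julia port of the posted oscillator (assumption: Julia 1.x syntax).
struct SvfSinOscCoef
    g0::Float64
    g1::Float64
end

struct SvfSinOsc
    ic1eq::Float64
    ic2eq::Float64
end

function SvfSinOscCoef_Init(; freq=1.0, sr=44100.0)
    g = tan(2pi*freq/sr)          # same expression as in the post
    g0 = 1.0/(1.0 + g^2)
    return SvfSinOscCoef(g0, g*g0)
end

SvfSinOsc_Init(startphase::Float64) = SvfSinOsc(cos(startphase), sin(startphase))

# One step: returns a new immutable state, exactly as in the posted tick.
function tick(state::SvfSinOsc, coef::SvfSinOscCoef)
    v1 = coef.g0*state.ic1eq - coef.g1*state.ic2eq
    v2 = coef.g1*state.ic1eq + coef.g0*state.ic2eq
    return SvfSinOsc(2*v1 - state.ic1eq, 2*v2 - state.ic2eq)
end

# Helper for experimenting (hypothetical, not in the post): run n ticks.
function run_ticks(state::SvfSinOsc, coef::SvfSinOscCoef, n::Integer)
    for _ in 1:n
        state = tick(state, coef)
    end
    return state
end
```

Calling `run_ticks(SvfSinOsc_Init(0.0), SvfSinOscCoef_Init(), 1000)` and checking `ic1eq^2 + ic2eq^2` is a quick way to confirm the state has not drifted off the unit circle.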


It would be really cool if there was a way to shorthand the syntax of this to something like the following, which is a lot more readable:

function tick (state::SvfSinOsc, coef::SvfSinOscCoef)
    using state, coef
    local v1::Float64 = g0*ic1eq - g1*ic2eq
    local v2::Float64 = g1*ic1eq + g0*ic2eq
    SvfSinOsc (2*v1 - ic1eq, 2*v2 - ic2eq)
end


Lots of algorithms take several arguments of the same type, but even then you could apply "using" to just the most frequently used argument. And if it doesn't make the code clearer, people don't have to use it at all.



Another pattern that would be nice to handle cleanly is: fetch state to locals, compute on the locals, store the locals back to state. I have written code that generates code to handle this, since it is such a pain to keep everything in sync, but it would really rock if there were some way to automate it at the language level. Here is the longhand way. It isn't too bad for this example, but imagine there are 20 or so variables and you are writing multiple tick functions:

type SvfSinOsc
    ic1eq::Float64
    ic2eq::Float64
end

function tick (state::SvfSinOsc, coef::SvfSinOscCoef)
    local ic1eq::Float64 = state.ic1eq
    local ic2eq::Float64 = state.ic2eq
    for i = 1:100
        # compute algorithm using local copies of state.ic1eq and state.ic2eq
    end
    state.ic1eq = ic1eq
    state.ic2eq = ic2eq
    return state
end


I have a feeling that macros may be able to help out here to result in something like:

function tick (state::SvfSinOsc, coef::SvfSinOscCoef)
    @fetch state
    for i = 1:100
        # compute iterative algorithm using local copies of state.ic1eq and state.ic2eq
    end
    @store state
    return state
end

But I'm not sure how to code such a beast. I tried something like:

macro fetch(obj::SvfSinOsc)
    return quote
        local ic1eq = obj.ic1eq
        local ic2eq = obj.ic2eq
    end
end

macro store(obj::SvfSinOsc)
    return quote
        obj.ic1eq = ic1eq
        obj.ic2eq = ic2eq
    end
end

dump(osc)
macroexpand (:(@fetch osc))
macroexpand (:(@store osc))

SvfSinOsc 
  ic1eq: Float64 1.0
  ic2eq: Float64 0.0

Out[28]: :($(Expr(:error, TypeError(:anonymous,"typeassert",SvfSinOsc,:osc))))







Billou Bielour

Jun 6, 2014, 5:33:38 AM
to julia...@googlegroups.com
For the macro problem, note that the arguments of a macro are expressions or Symbols, not values:

Just as functions map a tuple of argument values to a return value, macros map a tuple of argument expressions to a returned expression. They allow the programmer to arbitrarily transform the written code to a resulting expression, which then takes the place of the macro call in the final syntax tree.

macro test(x); dump(x); end

@test var
Symbol var

So you can do something like this:

macro fetch(obj)
    quote
        local ic1eq = $obj.ic1eq
        local ic2eq = $obj.ic2eq
    end
end

This will not work as written, however, because macro hygiene renames the local variables in the expansion:

macroexpand (:(@fetch var))

:(begin  # /opt/mandelbrot/session-manager/src/Worker.jl, line 3:
        local #174#ic1eq = var.ic1eq # line 4:
        local #175#ic2eq = var.ic2eq
    end)

You just need to escape your quote block:

macro fetch(obj)
    quote
        local ic1eq = $obj.ic1eq
        local ic2eq = $obj.ic2eq
    end  |> esc
end
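
Putting the two pieces together, here is a minimal sketch of a matching `@fetch`/`@store` pair, written for current Julia (1.x) rather than the version used in this thread. It assumes `mutable struct` in place of `type`, calls `esc(...)` directly instead of piping with `|> esc`, and adds a hypothetical `tick!` with an iteration count `n` purely for illustration. The field names are hard-coded, as in the posts above.

```julia
mutable struct SvfSinOsc
    ic1eq::Float64
    ic2eq::Float64
end

struct SvfSinOscCoef
    g0::Float64
    g1::Float64
end

# Copy the fields of `obj` into locals of the calling scope.
# esc() switches hygiene off so the names are not renamed.
macro fetch(obj)
    esc(quote
        local ic1eq = $obj.ic1eq
        local ic2eq = $obj.ic2eq
    end)
end

# Write the locals back into the fields of `obj`.
macro store(obj)
    esc(quote
        $obj.ic1eq = ic1eq
        $obj.ic2eq = ic2eq
    end)
end

# Hypothetical driver: fetch once, iterate on locals, store once.
function tick!(state::SvfSinOsc, coef::SvfSinOscCoef, n::Integer)
    @fetch state
    for _ in 1:n
        v1 = coef.g0*ic1eq - coef.g1*ic2eq
        v2 = coef.g1*ic1eq + coef.g0*ic2eq
        ic1eq = 2v1 - ic1eq
        ic2eq = 2v2 - ic2eq
    end
    @store state
    return state
end
```

Because the macros are fully escaped, they silently capture whatever `ic1eq` and `ic2eq` mean at the call site, which is exactly the behaviour wanted here but also the reason hygiene exists in the first place.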

You can also look here for a similar question:

https://groups.google.com/forum/#!searchin/julia-users/unpack/julia-users/IQS2mT1ITwU/gEyj6JNJsuAJ

Andrew Simper

Jun 7, 2014, 3:24:33 AM
to julia...@googlegroups.com
Hi Billou, thanks for letting me know about the |> esc part! I'll make a new topic to cover the macro part of this post.

Andrew Simper

Jun 12, 2014, 12:13:42 AM
to julia...@googlegroups.com
So just to post again to make things clearer, right now algorithms tend to look pretty ugly and obfuscated since you have to prefix function arguments with the argument names using dot notation:

function tick (state::SvfSinOsc, coef::SvfSinOscCoef)
    local v1::Float64 = coef.g0*state.ic1eq - coef.g1*state.ic2eq
    local v2::Float64 = coef.g1*state.ic1eq + coef.g0*state.ic2eq
    SvfSinOsc (2*v1 - state.ic1eq, 2*v2 - state.ic2eq)
end


It would be super useful to have a "using"-type operation, similar to the one for namespaces but operating on variables instead. The following would be equivalent to the code above, but to me it is a lot more readable and makes it much easier to see what is going on:

function tick (state::SvfSinOsc, coef::SvfSinOscCoef)
    using state, coef
    local v1::Float64 = g0*ic1eq - g1*ic2eq
    local v2::Float64 = g1*ic1eq + g0*ic2eq
    SvfSinOsc (2*v1 - ic1eq, 2*v2 - ic2eq)
end

What are people's opinions on this? Would anyone else find it useful?

John Myles White

Jun 12, 2014, 12:14:45 AM
to julia...@googlegroups.com
Personally, I'm not super excited about this idea.

 -- John

Keno Fischer

Jun 12, 2014, 12:30:05 AM
to julia...@googlegroups.com
I don't think it warrants syntax, but might be nice in a macro. I've had cases where I just put my entire simulation state in a single object, so I don't need to give 100s of parameters to every object. In that case (where the object is more of a container than an abstraction), it might be nice to use. 

Andrew Simper

Jun 12, 2014, 1:07:16 AM
to julia...@googlegroups.com
The problem with using a macro is that you will always have to make a local copy of the data. If it were a language feature, then a mutable type could be passed in as the argument and the same non-obfuscated code could be used to update the state in place, which may be preferable depending on the situation.

Here is another example from the Julia.org webpage:

immutable Pixel
    r::Uint8
    g::Uint8
    b::Uint8
end

function rgb2gray!(img::Array{Pixel})
    for i=1:length(img)
        p = img[i]
        v = uint8(0.30*p.r + 0.59*p.g + 0.11*p.b)
        img[i] = Pixel(v,v,v)
    end
end

function rgb2gray2!(img::Array{Pixel})
    for i=1:length(img)
        using img[i]
        v = uint8(0.30*r + 0.59*g + 0.11*b)
        img[i] = Pixel(v,v,v)
    end
end

Keno Fischer

Jun 12, 2014, 1:21:14 AM
to julia...@googlegroups.com
In my opinion, looking at that example, this is way too magical. Plus the general consensus is that you should only add syntax if it is something extremely special that requires compiler support. For the non-mutable case, that's not the case here. The mutable case has me worried. It introduces the possibility that an assignment (a simple one, not a setfield or getindex) actually has effects outside of the function, which doesn't happen anywhere else in julia.

Andrew Simper

Jun 12, 2014, 1:30:16 AM
to julia...@googlegroups.com
Are namespaces going to be supported in julia? It would be the same mechanism as that, an order of preference to choose what a particular name is referring to, no more. So if julia is not going to support "using" on a namespace then I completely understand not wanting to support it on variables as I have suggested.

I don't follow what you mean by "The mutable case has me worried. It introduces the possibility that an assignment (a simple one, not a setfield or getindex) actually has effects outside of the function, which doesn't happen anywhere else in julia."

Can you please provide an example to illustrate what you are worried about?

Keno Fischer

Jun 12, 2014, 1:39:19 AM
to julia...@googlegroups.com
There are no plans for namespace support other than what's already in with modules.

I'll try to explain. Say you have a mutable 
type foo
    a::Int
    b::Int
end

Then to modify it in a function you have to explicitly say
function bar(f::foo)
f.a = 2
f.b = 3
end

etc., for it to modify my argument foo. I think it's way too easy to refactor a function say

function bar(f::foo)
... lots of stuff ...
a = 2*f.a
... lots of stuff ...
f.b*10*a
end

into 

function bar(f::foo)
using f
... lots of stuff ...
a = 2*f.a
... lots of stuff ...
b*10*a
end

because you want to immediately access b, but forgot that foo also has an a field. In general in julia an assignment like

var = value

will almost never have effects outside its current scope (var[idx] = and var.field = are different). The only exception to this is if you write

function test()
global var
var = value
end

but as you can see that annotation is very explicit and limited. I would be similarly opposed to a language feature that causes all variables in a function to implicitly be global.

As I said this isn't a concern in the immutable case because the subsequent assignment would always override the original value and simple assignments will not have effects outside the current scope. This can be implemented with macros though.

Hope that clarifies it,
Keno
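
The distinction Keno draws can be checked directly. Below is a minimal sketch in current Julia (1.x) syntax, which assumes `mutable struct` in place of `type`; the function names `rebinding` and `mutating!` are made up for illustration. A plain `a = ...` only creates or rebinds a local, while `f.a = ...` is the explicit form that changes the caller's object.

```julia
mutable struct Foo
    a::Int
    b::Int
end

# Plain assignment: `a` is a new local variable; the field f.a is untouched.
function rebinding(f::Foo)
    a = 2 * f.a
    return f.b * 10 * a
end

# Field assignment: explicitly mutates the object the caller passed in.
function mutating!(f::Foo)
    f.a = 2 * f.a
    return f.b * 10 * f.a
end
```

Under the proposed `using f`, the body of `rebinding` would silently start behaving like `mutating!`, even though the assignment line itself had not changed.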

Andrew Simper

Jun 12, 2014, 1:59:56 AM
to julia...@googlegroups.com
Yep, that makes sense, thanks for showing the example. I already have another thread going on sorting out a "fetch" macro to introduce a local copy of the names in a type.

Out of interest, what is the "local" keyword for?

Andrew Simper

Jun 12, 2014, 2:16:30 AM
to julia...@googlegroups.com
It seems that the local keyword is a bit of a language kludge to me, since it is implied in most cases, apart from stating the new scope in the form of a for loop etc. It would seem more natural and consistent to me to add the local keyword in front of all variables you want to be local in scope, and everything else is global. This line of reasoning I'm sure has already been argued to death, and obviously having an implicit local was decided to be best.

An ideal use case to me would be to be able to write an algorithm at the REPL using globals, but then easily wrap that up in a function that takes a single type that contains all the globals that were used in the algorithm. You could copy and paste the same global code, then stick a "using this" in front of it and it would all work great.

Oh well, perhaps julia isn't for me, I'm just too used to not being forced to use dots everywhere to access stuff in "this".  

Andrew Simper

Jun 12, 2014, 3:21:42 AM
to julia...@googlegroups.com
On Thursday, June 12, 2014 2:16:30 PM UTC+8, Andrew Simper wrote:
It seems that the local keyword is a bit of a language kludge to me, since it is implied in most cases, apart from stating the new scope in the form of a for loop etc. It would seem more natural and consistent to me to add the local keyword in front of all variables you want to be local in scope, and everything else is global. This line of reasoning I'm sure has already been argued to death, and obviously having an implicit local was decided to be best.

Having the local keyword like it is makes most sense to me, but I suppose it isn't a big deal to me that if you don't explicitly specify local you could be referring to something outside the current scope, which is the case with for loops. 

Mike Innes

Jun 12, 2014, 4:21:58 AM
to julia...@googlegroups.com
FWIW – putting to one side the question of whether or not this is a good idea – it would be possible to do this without new language syntax. However, you'd have to either pass a type hint or be explicit about the variables you want:

e.g. 

function tick(state::SvfSinOsc, coef::SvfSinOscCoef)
  @with state::SvfSinOsc, coef::SvfSinOscCoef
  # or
  @with state (ic1eq, ic2eq) coef (g0, g1)
  
  v1 = g0*ic1eq - g1*ic2eq
  v2 = g1*ic1eq + g0*ic2eq
  SvfSinOsc(2*v1 - ic1eq, 2*v2 - ic2eq)
end

This would work in the non-mutating case by calling names() on the type and making appropriate variable declarations.

You could then go further and implement

function tick(state::SvfSinOsc, coef::SvfSinOscCoef)
  @with state (ic1eq, ic2eq) coef (g0, g1) begin
  
    v1 = g0*ic1eq - g1*ic2eq
    v2 = g1*ic1eq + g0*ic2eq
    SvfSinOsc(2*v1 - ic1eq, 2*v2 - ic2eq)
  end
end

Which would walk over the expression, replacing `a` with `Foo.a`. However, it would be tricky to implement this correctly since you'd have to be aware of variable scoping within the expression.

I may implement the non-mutating version of this at some point – it seems like it could be useful.
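
To make the type-hint variant concrete, here is a minimal sketch for current Julia (1.x). It is not Mike's implementation: the macro name `@unpack_all` is hypothetical, `fieldnames` is used where the post mentions `names()`, and the type is looked up with `Core.eval(__module__, T)`, which assumes the type is already defined when the macro is expanded.

```julia
# Hypothetical non-mutating unpack: `@unpack_all obj::T` expands to one
# assignment `field = obj.field` per field of T.
macro unpack_all(ex)
    (ex isa Expr && ex.head == :(::) && length(ex.args) == 2) ||
        error("usage: @unpack_all obj::TypeName")
    obj, T = ex.args
    # Resolve the type name at macro-expansion time to list its fields.
    fields = fieldnames(Core.eval(__module__, T))
    assigns = [Expr(:(=), f, Expr(:., obj, QuoteNode(f))) for f in fields]
    return esc(Expr(:block, assigns...))
end

struct SvfSinOscCoef
    g0::Float64
    g1::Float64
end

struct SvfSinOsc
    ic1eq::Float64
    ic2eq::Float64
end

function tick(state::SvfSinOsc, coef::SvfSinOscCoef)
    @unpack_all state::SvfSinOsc
    @unpack_all coef::SvfSinOscCoef
    v1 = g0*ic1eq - g1*ic2eq
    v2 = g1*ic1eq + g0*ic2eq
    return SvfSinOsc(2*v1 - ic1eq, 2*v2 - ic2eq)
end
```

Since the locals are copies, nothing is written back to `state` or `coef`; the cost is that every field name of the type becomes a local in the function, whether or not the body uses it.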

Mike Innes

Jun 12, 2014, 4:27:14 AM
to julia...@googlegroups.com
Actually, the mutating case is easier than that, you'd just transform the code to:

function foo(a::Foo)
  a = Foo.a
  # do stuff with a
  Foo.a = a
end

So long as you don't access Foo.a directly this would work fine.

At some point I'm going to make a repository of "Frequently Asked Macros" – just to show off the things you can do with Julia even though you really shouldn't.
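
The fetch/compute/store transform described in this message can be sketched as a block macro. The version below is an illustration for current Julia (1.x), not the code Mike later posted: the name `@with_fields` and its explicit field list are assumptions, and it wraps the body in a `let` so the temporaries do not leak into the enclosing function.

```julia
# Hypothetical mutating block macro:
#   @with_fields obj (a, b) begin ... end
# loads obj.a and obj.b into locals, runs the body, then stores them back.
macro with_fields(obj, fields, body)
    names = fields isa Symbol ? [fields] : fields.args
    loads  = [Expr(:local, Expr(:(=), f, Expr(:., obj, QuoteNode(f)))) for f in names]
    stores = [Expr(:(=), Expr(:., obj, QuoteNode(f)), f) for f in names]
    result = gensym("result")   # avoids clashing with a user variable
    return esc(quote
        let
            $(loads...)
            $result = $body
            $(stores...)        # skipped if the body returns or throws early
            $result
        end
    end)
end

mutable struct Osc
    ic1eq::Float64
    ic2eq::Float64
end

function tick!(state::Osc, g0, g1)
    @with_fields state (ic1eq, ic2eq) begin
        v1 = g0*ic1eq - g1*ic2eq
        v2 = g1*ic1eq + g0*ic2eq
        ic1eq = 2v1 - ic1eq
        ic2eq = 2v2 - ic2eq
    end
    return state
end
```

As the message says, this only behaves sensibly if the body does not also touch `state.ic1eq` directly, because the store at the end overwrites the field with the local copy.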

Andrew Simper

Jun 12, 2014, 4:44:30 AM
to julia...@googlegroups.com
Brilliant Mike! This is exactly what I was after, I just want a way to write shorthand names for things within a scope, and the @with macro does just that :) In the example I posted I split the coefficients away from the state so that only the state needs to be returned, I think this is good for efficiency. I'll have a play with @with and see how I go. Passing in names (Typename), isn't a problem, since when a new name is added there is no duplication in doing this.

Keno, sorry for not understanding that this is probably what you meant when you said this would be best off done by using macros, I didn't think of enclosing the entire algorithm in a macro.

Mike Innes

Jun 12, 2014, 6:14:15 AM
to julia...@googlegroups.com
Ok, managed to have a quick go at this – source with some examples:


Currently it does nothing to avoid the issue Keno pointed out, but in principle you could throw an error when the mutating version is used without explicit types.

If there's any interest in having this in Base you're welcome to it, otherwise I'll probably just clean it up and store it in Lazy.jl.

Jameson Nash

Jun 12, 2014, 7:41:11 AM
to julia...@googlegroups.com
> Having the local keyword like it is makes most sense to me, but I suppose it isn't a big deal to me that if you don't explicitly specify local you could be referring to something outside the current scope, which is the case with for loops. 

Javascript does this. It also has the "using" block that you describe (see "with"). They are probably the worst (mis)features in the entire language. In the latest version of javascript, it has finally been removed, to the relief of javascript programmers everywhere: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions_and_function_scope/Strict_mode.

I would be interested in having a language feature akin to visual basic's with block, since it is only a trivial source transform to annotate each `.` with the result of the expression in the nearest `with` block.
function f(A)
  with A
    .c = .a + .b
  end
end
However, since this only saves one character per access, it really isn't worthwhile.
function f(A)
  Z = A
  Z.c = Z.a + Z.b
end

Carlo Baldassi

Jun 12, 2014, 9:04:09 AM
to julia...@googlegroups.com
Sorry, I haven't had the time to read the whole discussion, but it seems you may be interested in the @extract macro which can be found at https://github.com/carlobaldassi/MacroUtils.jl#extract and which I have written for this exact reason of reducing 1) code obfuscation 2) overhead 3) typing

The macro should be pretty solid and also flexible to some extent, as shown in the examples.

Best
Carlo

Andrew Simper

Jun 12, 2014, 10:03:10 AM
to julia...@googlegroups.com
Mike, you rule!

That is a seriously cool macro, thank you so much for taking the time to write this!! I like dot notation sometimes: when you have two things like Coordinates / Points / Complex etc. it makes perfect sense to be able to see which one you are talking about. But for crunching numbers on a set of states it just makes things very ugly and obfuscates what is actually going on.

Can I please double check something with you? Apart from the one-off overhead of parsing through the code and prefixing the names, will this run just as fast as if I had typed all the names in full? (I am 99.9% sure the answer is yes, but I want to be certain!)

Mike Innes

Jun 12, 2014, 10:09:04 AM
to julia...@googlegroups.com
No problem!

The non-mutating version of this is exactly equivalent to writing a = Foo.a etc., so there should be zero overhead. The mutating version uses a let binding, which I think has a very small additional overhead, but I would just benchmark it to make sure.

Andrew Simper

Jun 12, 2014, 10:09:59 AM
to julia...@googlegroups.com
I'm all for keeping people from making stupid mistakes, so I'm happy to keep the use of local / global exactly how it is. With the help of Mike I now have exactly what I wanted to achieve, which is perfect for me, so the macro solution is great since I don't make those kinds of mistakes, and if I do then I can handle it and fix the typo since I know exactly what is going on.

Andrew Simper

Jun 12, 2014, 10:11:38 AM
to julia...@googlegroups.com
Thanks Carlo, that is almost what I wanted, but that macro doesn't quite fit in this case. Thanks for pointing me towards your utilities, I'll have a look through them and see if there is some other stuff I like.

Andrew Simper

Jun 12, 2014, 10:16:56 AM
to julia...@googlegroups.com
Hi Keno,

I said this in another post, but in case you missed it I now understand what you mean by the phrase "but this might be nice in a macro". I didn't realise until Mike pointed out that you could enclose the entire code block inside the macro to do what I wanted. Mike has written an @with macro that does exactly what I was after, brilliant! 

I agree that you need to keep the language consistent so please ignore my request about the using keywords, I am still getting my head around what is possible in Julia, so thanks for your time and for pointing out the problems with what I suggested.

I strongly suggest the @with macro be added to some standard package, since all the c++ hacks like me will want this functionality, and for everyone else it won't matter.



Andrew Simper

Jun 12, 2014, 10:18:08 AM
to julia...@googlegroups.com

On Thursday, June 12, 2014 6:14:15 PM UTC+8, Mike Innes wrote:
If there's any interest in having this in Base you're welcome to it, otherwise I'll probably just clean it up and store it in Lazy.jl.


Yes please, having this in Base would be brilliant for all the c++ hacks like me! 

David Moon

Jun 12, 2014, 9:01:28 PM
to julia...@googlegroups.com
Mike Innes' @with macro is good, but it would be better if it would iterate over the AST for its last argument and replace each occurrence of "field" with "obj.field".  That way there wouldn't be any unexpected assignments to fields which were not actually changed, and in general no wasted motion at run time.  The macro would be a little more complex to write but it should not be very difficult.

Mike Innes

Jun 13, 2014, 3:57:07 AM
to julia...@googlegroups.com
That was my first thought, too – and it's fine in principle, but remember for that macro to be correct you'd have to handle let bindings, quoting, local variable declarations and expanding any macros that might result in these, then test all of those things carefully to make sure it's working correctly.

Plus, if the overhead of unnecessary writes is an issue, so will be that of the let binding used by the mutating version of the macro. This again is probably solvable, but for all that effort you could just use the faster non-mutating version and store a couple of changes by hand.

That's not meant to put off anyone who wants to have a go at this, just warning that it wouldn't be as trivial as it sounds.

David Moon

Jun 13, 2014, 9:59:30 AM
to julia...@googlegroups.com
[I can't get this damned thing not to include a quote of all previous messages.  I guess it only works in Google Chrome; what a pain.  So sorry about the unnecessarily long post.]

In the argument to a macro all nested macro calls are already expanded, I think.  It's certainly true that for complete correctness you would need to handle shadowing of the bindings introduced by @with by local bindings of the same name.  It's even more true that Julia does not provide any assistance in processing Exprs and other AST objects, nor even much documentation, so far as I know.

I don't understand your comment about the overhead of the let binding used by the mutating version of the macro.  What extra overhead is that?

Mike Innes

Jun 13, 2014, 10:17:39 AM
to julia...@googlegroups.com
Not by default, but it should be simple enough (and correct, I think) to just call macroexpand on macro calls.

macro test(expr)
  (expr,)
end

(@test @foo x) == (:(@foo x),)

All I meant about the let binding is that the mutating version expands to:

let a = Foo.a
  # code
  Foo.a = a
end

AFAIK let bindings have a small overhead (compared to a normal declaration), so if a redundant assignment is a significant overhead in your code then using the let binding will be prohibitive anyway. I haven't particularly tested that, though, so the situation could have changed recently.
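
The point about nested macro calls can be tried out with a small sketch in current Julia (1.x). The macro names `@inc` and `@capture` are made up for the illustration. An outer macro receives the inner call as an unexpanded `:macrocall` expression and has to call `macroexpand` itself if it wants to see the result; inside a macro body the calling module is available as `__module__`, while at top level `@__MODULE__` does the same job.

```julia
# A small inner macro: `@inc x` expands to `x + 1`.
macro inc(x)
    :($(esc(x)) + 1)
end

# An outer macro that returns its argument as data, to show what it received.
macro capture(ex)
    QuoteNode(ex)
end

x = 41
captured = @capture @inc x                       # still a macro call, not `x + 1`
expanded = macroexpand(@__MODULE__, captured)    # now the inner macro is expanded
value = eval(expanded)
```

Here `captured.head` is `:macrocall`, and only `expanded` contains the `+ 1` call, so a field-rewriting macro that wants to see through nested macros would need this extra expansion step before walking the expression.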

Stefan Karpinski

Jun 13, 2014, 10:26:26 AM
to Julia Users
Keno's example showed how a simple error like forgetting that you had assigned to `a` would cause problems, but that's just a matter of making an error about the current state of the program. It's worse than that: if someone adds a field to a type that is used *anywhere* with such a `using` construct, that code can silently become incorrect. The fundamental problem with this construct is that it makes it impossible to reason locally about what it means when you access or assign a binding. Let's say I have this code:

function foo(x::Bar)
  using x
  a = b + 1
end

What does this mean? b could be a global variable or a field of x; a could be a local variable or a field of x. Without knowing the structure of Bar, we can't know. In Julia, we never do this: you can always tell the meaning of code from purely local syntactic analysis. The code may be wrong (you might try to access or assign a field that doesn't exist), but there is only one thing your code could mean. For the same exact reasons, I wouldn't accept a macro that simulates this into Base: it has the exact same problems. The only way to make this acceptable is if it can be locally disambiguated what the code means. You could, for example, do something like this:

function foo(x::Bar)
  using x: a, b
  a = b + 1
end

Now you can immediately tell that `a` and `b` are both fields of `x` and not global or local variables (not by accident, this is independent of the definition of Bar).

Tom Short

Jun 13, 2014, 10:39:02 AM
to julia...@googlegroups.com
This "obfuscation" is also tedious with DataFrames.

I've been playing around with an `@with` macro to use symbols to
reference DataFrame columns. I extended that idea to several macros to
ease data manipulation:

https://github.com/JuliaStats/DataFramesMeta.jl

The tricky part with DataFrames and associative types is that the
types are not known ahead of time. Wrapping the code in the block in a
pseudoanonymous function gets around that issue. For DataFrames and
associative types, this also gives speed advantages because indexing
these is slow. I'm not sure if this approach could also cover mutable
and immutable objects.

Here is an example:

df = DataFrame(x = 1:3, y = [2, 1, 2])
x = [2, 1, 0]

@with(df, :y + 1)
@with(df, :x + x) # the two x's are different

x = @with df begin
    res = 0.0
    for i in 1:length(:x)
        res += :x[i] * :y[i]
    end
    res
end

Mike Innes

Jun 13, 2014, 10:53:33 AM
to julia...@googlegroups.com
Absolutely – I didn't want to get into correct/incorrect when I implemented @with but I definitely think the

@with Foo::(a, b)

syntax is preferable. I think I'll disable the type syntax, add support for indexing and then send a PR to see if there's any chance of having it in Base.

(Personally I think the type syntax could be acceptable if used very sparingly and only where it has significant benefit – e.g. for a very large config object that's used in many places. But I do agree that such a construct wouldn't be best placed in Base.)

Carlo Baldassi

Jun 13, 2014, 4:01:10 PM
to julia...@googlegroups.com
Sorry for spamming, but after reading the discussion it seems like a (slightly polished) version of the @extract macro I mentioned above (I know the name is not great) already basically does that and a bit more, IIUC the discussion here. Some more examples and documentation are in the code at https://github.com/carlobaldassi/MacroUtils.jl/blob/master/src/Extract.jl

The nice thing (I think) about it is that if you have two objects of the same kind you can assign different names to the variables you use to refer to their fields, and even apply functions, e.g. say you have

type Foo
  a::Int
  b::Vector{Float64}
end

you can do things like:

function bar(f1::Foo, f2::Foo)
  @extract f1 a1=a b1=b
  @extract f2 a2=a b2=b

  # do stuff with a1,b1,a2,b2 instead of f1.a,f1.b,f2.a,f2.b
end

or even:

function baz(f::Foo, i::Int)
  @extract f a bi=b[i]

  # do stuff with a and bi instead of f.a and f.b[i]

  @extract f b=unsafe(b)

  # do risky stuff with b
end

etc. (It may be even somewhat too fancy).

Anyway, I happen to use that macro a lot in my code.
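
For readers who want to see how the plain renaming forms could work, here is a simplified sketch for current Julia (1.x). It is not the MacroUtils.jl implementation: the name `@extract_sketch` is hypothetical, and it only handles `name` and `newname=field`, not the `b[i]` or `unsafe(b)` forms shown above.

```julia
# Hypothetical, simplified take on the idea:
#   @extract_sketch obj a b1=b   expands to   a = obj.a; b1 = obj.b
macro extract_sketch(obj, specs...)
    assigns = map(specs) do s
        if s isa Symbol
            # bare name: local has the same name as the field
            Expr(:(=), s, Expr(:., obj, QuoteNode(s)))
        elseif s isa Expr && s.head == :(=) && s.args[2] isa Symbol
            # newname=field: local gets a different name
            Expr(:(=), s.args[1], Expr(:., obj, QuoteNode(s.args[2])))
        else
            error("unsupported @extract_sketch argument: $s")
        end
    end
    return esc(Expr(:block, assigns...))
end

struct Foo
    a::Int
    b::Vector{Float64}
end

function bar(f1::Foo, f2::Foo)
    @extract_sketch f1 a1=a b1=b
    @extract_sketch f2 a2=a b2=b
    # work with a1, b1, a2, b2 instead of f1.a, f1.b, f2.a, f2.b
    return (a1 + a2) * sum(b1 .* b2)
end
```

Every name that ends up in scope is spelled out at the call site, so the meaning of the body can still be read off locally.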

Stefan Karpinski

Jun 13, 2014, 4:45:44 PM
to julia...@googlegroups.com
How is this different than just assigning fields to local variables?

Carlo Baldassi

Jun 13, 2014, 5:29:22 PM
to julia...@googlegroups.com
The point would be that it's not different, just more concise, so you're still in control of what you're doing and don't risk breaking code when changing a type etc.; e.g. this is one typical function from some code I'm using:

function foo(network::Network, i::Int)
    @extract network N H0 lambda state=current_state Ji=unsafe(J[i])
    @extract state   S s=unsafe(s)
    return dot(Ji, s) - H0 - lambda * (S - 0.5 * N)
end

which would be:

function foo(network::Network, i::Int)
    N = network.N
    H0 = network.H0
    lambda = network.lambda
    state = network.current_state
    Ji = unsafe(network.J[i])
    S = state.S
    s = unsafe(state.s)
   
    return dot(Ji, s) - H0 - lambda * (S - 0.5 * N)
end

I just think the @extract version is clearer and more maintainable. Repeated across lots and lots of small functions, it means a consistent reduction in lines of code, and I think it lets you focus visually on the actual functionality rather than on the boring stuff. Also, adding more fields at the end of an @extract line is nicer when you change the fields of a type a lot (another typical situation I have found myself in a lot is having a composite type which is just used to pass around tons of parameters, and having to extract them all whenever I use them).

Anyway, so far I never actually thought of proposing inclusion into Base or even registering a package, but in the span of two days this thread came up and a coworker asked me for exactly this kind of thing (since he also was becoming annoyed by the long lists of assignments at the beginning of each function), so if someone likes the syntactic sugar the functionality is basically there.

Stefan Karpinski

Jun 13, 2014, 6:50:10 PM
to Julia Users
Does your @extract macro somehow assign values back to fields at the end of the scope?

Carlo Baldassi

Jun 13, 2014, 7:04:47 PM
to julia...@googlegroups.com
No, that's the main annoyance admittedly, even though I find that most of the time I need to just read the values (or get a pointer to a mutable container which gets updated anyway). For that, one would either need to enclose everything in a block as in the @with macro and do some more magic (but this has problems of its own, and I'm particularly wary of unnecessary performance losses) or more simply have a mirror/sister macro to add at the end, I suppose. Up to now, I haven't really felt the need for that and therefore haven't implemented it.

Stefan Karpinski

Jun 13, 2014, 7:08:31 PM
to Julia Users
Using a block seems like the right way to handle that.

Stefan Karpinski

Jun 13, 2014, 7:09:10 PM
to Julia Users
But wait – if you don't assign things back at the end, how is this different than just assigning to a local variable?

Carlo Baldassi

Jun 13, 2014, 7:13:37 PM
to julia...@googlegroups.com


On Saturday, June 14, 2014 1:09:10 AM UTC+2, Stefan Karpinski wrote:
But wait – if you don't assign things back at the end, how is this different than just assigning to a local variable?


Er, I think I'm having a deja-vu here :)


>> How is this different than just assigning fields to local variables?

Stefan Karpinski

Jun 13, 2014, 7:23:41 PM
to Julia Users
Oh, right. Sorry.

Andrew Simper

Jun 14, 2014, 1:18:14 AM
to julia...@googlegroups.com
Hi Stefan,

I agree with Keno that a macro is the best way to tackle this problem. I just didn't realise that the entire body of the code could easily be inside the macro, and in the end core language features are handled in a similar way. So if you are happy with the @with var (name1, name2) syntax, I think adding that to Base would be a good start, since it will be a useful addition when the number of names is small.

Now let's consider something like this:

type CircuitModel
    v1::Float64
    v5::Float64
    v7::Float64
    v2::Float64
    v3::Float64
    v4::Float64
    v9::Float64
    v12::Float64
    v15::Float64
    v18::Float64
    v19::Float64
    v20::Float64
    v21::Float64
    v8::Float64
    v22::Float64
    v23::Float64
    v24::Float64
    i1::Float64
    i2::Float64
    i3::Float64
    i4::Float64
    i5::Float64
    i6::Float64
    i7::Float64
    ic1eq::Float64
    ic2eq::Float64
    ic3eq::Float64
    ic4eq::Float64
    ic5eq::Float64
    ic6eq::Float64
    ic7eq::Float64
    ic8eq::Float64
    sr::Float64
    srinv::Float64
    pi::Float64
    gmin::Float64
    is1::Float64
    nvtf1::Float64
    is2::Float64
    nvtf2::Float64
    nvtinvf1::Float64
    vcritf1::Float64
    nvtinvf2::Float64
    vcritf2::Float64
    gc1::Float64
    gr1::Float64
    gr2::Float64
    itxr2::Float64
    gc2::Float64
    gc3::Float64
    gc4::Float64
    gc5::Float64
    gr7::Float64
    gc6::Float64
    gr3::Float64
    itxr3::Float64
    gc7::Float64
    gr4::Float64
    gc8::Float64
    gr5::Float64
    vpos::Float64
    vneg::Float64
    gin::Float64
    gininv::Float64
    vposcap::Float64
    vnegcap::Float64
    ginotacore::Float64
    ginotares::Float64
    ginotacoreinv::Float64
    ginotaresinv::Float64
    vc3lo::Float64
    vc3hi::Float64
    a4a4c::Float64
    a5a5c::Float64
    a6a6c::Float64
    a14a14c::Float64
    a16a16c::Float64
    a17a17c::Float64
    a17a17nrmc::Float64
    a12a17c::Float64
    a16a16nrmc::Float64
    a15a16c::Float64
    a14a14nrmc::Float64
    a13a14c::Float64
    a6a14c::Float64
    a13a6c::Float64
    a5a5nrmc::Float64
    a4a5c::Float64
    a2a5c::Float64
    a2a4c::Float64
    a4a4nrmc::Float64
    a1a4c::Float64
    v5c::Float64
    v7c::Float64
end

Now if I have a bunch of different functions that process on these named fields I would have something like:

function process1 (state::CircuitModel, input::Float64)
    @with state (v1, v5, v7, v2, v3, v4, v9, v12, v15, v18, v19, v20, v21, v8, v22, v23, v24, i1, i2, i3, i4, i5, i6, i7, ic1eq, ic2eq, ic3eq, ic4eq, ic5eq, ic6eq, ic7eq, ic8eq, sr, srinv, pi, gmin, is1, nvtf1, is2, nvtf2, nvtinvf1, vcritf1, nvtinvf2, vcritf2, gc1, gr1, gr2, itxr2, gc2, gc3, gc4, gc5, gr7, gc6, gr3, itxr3, gc7, gr4, gc8, gr5, vpos, vneg, gin, gininv, vposcap, vnegcap, ginotacore, ginotares, ginotacoreinv, ginotaresinv, vc3lo, vc3hi, a4a4c, a5a5c, a6a6c, a14a14c, a16a16c, a17a17c, a17a17nrmc, a12a17c, a16a16nrmc, a15a16c, a14a14nrmc, a13a14c, a6a14c, a13a6c, a5a5nrmc, a4a5c, a2a5c, a2a4c, a4a4nrmc, a1a4c, v5c, v7c)
    # some in depth numerical processing on state and input that updates the state
end

function process2 (state::CircuitModel, input::Float64)
    @with state (v1, v5, v7, v2, v3, v4, v9, v12, v15, v18, v19, v20, v21, v8, v22, v23, v24, i1, i2, i3, i4, i5, i6, i7, ic1eq, ic2eq, ic3eq, ic4eq, ic5eq, ic6eq, ic7eq, ic8eq, sr, srinv, pi, gmin, is1, nvtf1, is2, nvtf2, nvtinvf1, vcritf1, nvtinvf2, vcritf2, gc1, gr1, gr2, itxr2, gc2, gc3, gc4, gc5, gr7, gc6, gr3, itxr3, gc7, gr4, gc8, gr5, vpos, vneg, gin, gininv, vposcap, vnegcap, ginotacore, ginotares, ginotacoreinv, ginotaresinv, vc3lo, vc3hi, a4a4c, a5a5c, a6a6c, a14a14c, a16a16c, a17a17c, a17a17nrmc, a12a17c, a16a16nrmc, a15a16c, a14a14nrmc, a13a14c, a6a14c, a13a6c, a5a5nrmc, a4a5c, a2a5c, a2a4c, a4a4nrmc, a1a4c, v5c, v7c)
    # some in depth numerical processing on state and input that updates the state
end

process3, process4, etc. are defined similarly.

If I add a single new name (variable) into the CircuitModel type I will have to go and find everywhere I use @with and insert that variable into the right place inside the list, which to me seems pretty brittle and needlessly repetitive. Is this the best way to handle this? 
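One way the repeated list could be avoided, as a minimal sketch under stated assumptions: a macro can ask the type for its field names with `fieldnames` at expansion time and generate the "fetch to local, compute, store back" code itself. The names `@withall` and `Osc` are hypothetical (this is not Mike's `@with`), and the code is written in current Julia syntax (`mutable struct` rather than `type`).

```julia
# Hypothetical two-field state, standing in for a large type like CircuitModel.
mutable struct Osc
    ic1eq::Float64
    ic2eq::Float64
end

# Hypothetical macro: `@withall obj::T body` loads every field of T into a
# local of the same name, runs `body`, then writes every local back.
macro withall(decl, body)
    obj, T = decl.args                          # `s::Osc` -> (:s, :Osc)
    fields = fieldnames(Core.eval(__module__, T))
    loads  = [Expr(:(=), f, Expr(:., obj, QuoteNode(f))) for f in fields]
    stores = [Expr(:(=), Expr(:., obj, QuoteNode(f)), f) for f in fields]
    esc(Expr(:block, loads..., body, stores...))
end

function step!(s::Osc, g0::Float64, g1::Float64)
    @withall s::Osc begin
        v1 = g0*ic1eq - g1*ic2eq
        v2 = g1*ic1eq + g0*ic2eq
        ic1eq = 2v1 - ic1eq
        ic2eq = 2v2 - ic2eq
    end
    return s
end

s = step!(Osc(1.0, 0.0), 0.5, 0.5)
```

Adding a field to the type then needs no change at the call sites. The caveats of this sketch: the type must already be defined when the macro is expanded, every field becomes a local whether it is used or not, and an early `return` inside the body would skip the write-back.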

Ok, so there may be problems working out which variable is being referred to, but that is a consequence of the convenience of the language in other respects, e.g.:

a = 2 # could possibly add a new local variable, or update an existing local or global variable


So the problems you are pointing out seem mostly to be with ambiguity of scope. In this situation I know exactly what I want access to, and I'm happy to specify it completely, since otherwise the code becomes horrible. So how about a new "locked down scope" block that stops the introduction of implicit locals and also blocks access to globals or anything else other than what is stated at the start of the block? This would look something like:

function process5 (state::CircuitModel, input::Float64)
    @withonly state::CircuitModel, input begin
        v1 = 1      # fine since "v1" is a name of "CircuitModel"
        v5 = input  # fine since "v5" is a name of "CircuitModel", and "input" is allowed as well
        a = 2       # error, "a" is not a name of "CircuitModel" and is not "input"
        local b = 3 # ok, since this is specifically a new local variable "b"
        gr5 = b     # ok since "b" is the new local variable and "gr5" is a name of "CircuitModel"
    end
end

So this addresses your points about not being able to reason locally about what is going on, since any variable must come from one of the block's arguments (a name clash would be a compiler error). I would even be happy to limit use so that only the first argument could be a composite type, so if a variable isn't listed specifically or defined as a local, you know it must be a member of the composite type or you get a compiler error. This would allow all the C++ hacks like me to have their "this" in a safe, unambiguous manner and not have to obfuscate our code by typing "this" everywhere.
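The proposed rules can be approximated with an ordinary macro. What follows is only a rough sketch, assuming current Julia syntax; `ToyModel`, `lockdown` and this `@withonly` are hypothetical names, not anything in Base. It rewrites bare field names to `state.field` and rejects assignments to names that are neither fields, listed names, nor declared locals. Unlike the proposal, it does not block reads of globals and does not check updating operators such as `+=`.

```julia
mutable struct ToyModel          # hypothetical stand-in for CircuitModel
    v1::Float64
    v5::Float64
    gr5::Float64
end

# Walk `ex` in order: field names become `obj.field`, `local` declarations
# extend the allowed set, and assignments to unknown names are rejected.
function lockdown(ex, obj, fields, allowed)
    if ex isa Symbol
        return ex in fields ? Expr(:., obj, QuoteNode(ex)) : ex
    elseif ex isa Expr && ex.head === :local
        decl = ex.args[1]
        name = decl isa Expr ? decl.args[1] : decl
        name isa Symbol || error("unsupported local declaration: $decl")
        name in fields && error("local `$name` clashes with a field name")
        push!(allowed, name)
        decl isa Symbol && return ex
        rhs = lockdown(decl.args[2], obj, fields, allowed)
        return Expr(:local, Expr(decl.head, name, rhs))
    elseif ex isa Expr && ex.head === :(=) && ex.args[1] isa Symbol
        lhs = ex.args[1]
        lhs in fields || lhs in allowed ||
            error("`$lhs` is not a field, a listed name, or a declared local")
        return Expr(:(=), lockdown(lhs, obj, fields, allowed),
                    lockdown(ex.args[2], obj, fields, allowed))
    elseif ex isa Expr
        return Expr(ex.head, map(a -> lockdown(a, obj, fields, allowed), ex.args)...)
    end
    return ex                     # literals, line numbers, quoted names
end

# Usage: @withonly obj::T name1 name2 ... begin body end
macro withonly(decl, rest...)
    obj, T = decl.args
    fields = fieldnames(Core.eval(__module__, T))
    allowed = Set{Symbol}(rest[1:end-1])
    isempty(intersect(allowed, fields)) || error("listed name clashes with a field")
    esc(lockdown(rest[end], obj, fields, allowed))
end

function process5!(state::ToyModel, input::Float64)
    @withonly state::ToyModel input begin
        v1 = 1.0          # rewritten to state.v1 = 1.0
        v5 = input        # rewritten to state.v5 = input
        local b = 3.0     # explicit local, allowed from here on
        gr5 = b + v1      # rewritten to state.gr5 = b + state.v1
    end
    return state
end

m = process5!(ToyModel(0.0, 0.0, 0.0), 2.5)
```

In this sketch an assignment such as `a = 2` inside the block fails while the macro is being expanded, before the function runs, which is roughly the "compiler error" behaviour described above.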


So here is an example of part of one of the process blocks if you have to put a "state." in front of everything, which is wonderfully explicit, but also completely useless since I can't read the code!


function process6 (state::CircuitModel, input::Float64)
    # skipped a large chunk of code, this is actually a section of an inner loop
    state.v1 = input;
    state.v2 = state.z3-(state.v1*state.a1a4c+state.v5c*state.a2a4c);
    state.v3 = state.z4-(state.v5c*state.a2a5c+state.v2*state.a4a5c);
    state.v4 = state.z5-state.v1*state.a1a6-(state.v7c*state.a3a6+state.v3*state.a5a6);
    state.v9 = state.z6+state.v3*state.a6a7-state.v4*state.a6a7;
    state.v12 = state.z7-state.v9*state.a7a8;
    state.v15 = state.z8-state.v12*state.a8a9;
    state.v18 = state.z9-state.v15*state.a9a10;
    state.v19 = state.v18;
    state.v20 = state.ia1eq-state.v19*state.ga1;
    state.v21 = state.v20*state.k-state.v1*state.k;
    state.v8 = state.z13-(state.v4*state.a6a14c+state.v21*state.a13a14c);
    state.v22 = -(state.v12*2);
    state.v23 = state.z15-state.v22*state.a15a16c;
    state.v24 = state.z16-state.v20*state.a12a17c;
    state.i1 = state.v2*state.gr1-state.v1*state.gr1;
    state.i2 = state.z1+state.v3*state.gr2+state.v4*state.gr3-(state.v5c*state.gr2+state.v7c*state.gr3);
    state.i3 = state.itr3+state.v7c*state.gr3-state.v4*state.gr3;
    state.i5 = state.v24*state.gc8-(state.ic8eq+state.v20*state.gc8);
    state.i6 = state.v8*state.gr7-state.v21*state.gr7;
    state.i7 = state.v23*state.gc7-(state.ic7eq+state.v22*state.gc7);
end



And here is a snippet using a "locked down" block. The code still isn't easy to understand, but I can spot possible errors and fix them much more easily, since the variable names stand out:

function process7 (state::CircuitModel, input::Float64)
    @withonly state::CircuitModel, input begin
        # skipped a large chunk of code, this is actually a section of an inner loop
        v1 = input;
        v2 = z3-(v1*a1a4c+v5c*a2a4c);
        v3 = z4-(v5c*a2a5c+v2*a4a5c);
        v4 = z5-v1*a1a6-(v7c*a3a6+v3*a5a6);
        v9 = z6+v3*a6a7-v4*a6a7;
        v12 = z7-v9*a7a8;
        v15 = z8-v12*a8a9;
        v18 = z9-v15*a9a10;
        v19 = v18;
        v20 = ia1eq-v19*ga1;
        v21 = v20*k-v1*k;
        v8 = z13-(v4*a6a14c+v21*a13a14c);
        v22 = -(v12*2);
        v23 = z15-v22*a15a16c;
        v24 = z16-v20*a12a17c;
        i1 = v2*gr1-v1*gr1;
        i2 = z1+v3*gr2+v4*gr3-(v5c*gr2+v7c*gr3);
        i3 = itr3+v7c*gr3-v4*gr3;
        i5 = v24*gc8-(ic8eq+v20*gc8);
        i6 = v8*gr7-v21*gr7;
        i7 = v23*gc7-(ic7eq+v22*gc7);
    end
end



Stefan Karpinski

Jun 14, 2014, 3:06:46 AM
to Julia Users
On Sat, Jun 14, 2014 at 1:18 AM, Andrew Simper <andrew...@gmail.com> wrote:
function process5 (state::CircuitModel, input::Float64)
    @withonly state::CircuitModel, input begin
        v1 = 1      # fine since "v1" is a name of "CircuitModel"
        v5 = input  # fine since "v5" is a name of "CircuitModel", and "input" is allowed as well
        a = 2       # error, "a" is not a name of "CircuitModel" and is not "input"
        local b = 3 # ok, since this is specifically a new local variable "b"
        gr5 = b     # ok since "b" is the new local variable and "gr5" is a name of "CircuitModel"
    end
end

So this addresses your points about not being able to reason locally about what is going on, since any variable must come from one of the block's arguments

Since the meaning of the code depends on the definition of CircuitModel, which is defined elsewhere, that isn't reasoning locally.

Andrew Simper

Jun 14, 2014, 3:32:02 AM
to julia...@googlegroups.com
But it is! Give me the name of any variable in the above block and I can tell you where it comes from without ambiguity. It has to be either locally defined with "local blah", a specifically named variable passed to @withonly, or a name of the one and only allowable composite type, which is passed as the first argument. Otherwise there will be a compiler error.

Andrew Simper

Jun 14, 2014, 3:41:16 AM
to julia...@googlegroups.com
Ok, I think I get it: if by "reason locally" you mean "this is the only code you will ever see", then sure. But then you won't be able to tell whether this is valid either:

function process8 (state::CircuitModel, input::Float64)
    state.a = 5 # this should be an error but you won't know until you try and compile it
end
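For what it's worth, here is a small sketch of when that error shows up, assuming current Julia (1.2 or later for `hasfield`) and a hypothetical `TinyModel` standing in for `CircuitModel`. The method definition itself is accepted, and the failure appears only when the assignment executes.

```julia
mutable struct TinyModel         # hypothetical one-field stand-in
    v1::Float64
end

# Defining the method is accepted even though `a` is not a field.
function process8(state::TinyModel, input::Float64)
    state.a = 5
end

# The type can be queried up front...
has_a = hasfield(TinyModel, :a)

# ...but the assignment only fails once it actually runs.
failed = try
    process8(TinyModel(0.0), 1.0)
    false
catch
    true
end
```

Here `has_a` is `false` and `failed` is `true`, so the mistake is only caught at run time unless the field list is consulted explicitly.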

Andrew Simper

Jun 14, 2014, 3:45:02 AM
to julia...@googlegroups.com
Following your logic through to completion, every time we use a composite type we should specify all the names it contains, otherwise we can't "reason locally".

Mike Innes

Jun 14, 2014, 5:22:23 AM
to julia...@googlegroups.com
function process8 (state::CircuitModel, input::Float64)
    state.a = 5
end

You're right that you can't tell whether the code above is correct without knowing about CircuitModel, but it is obvious where all the variables are coming from and what's happening to them – and that's really valuable if/when things do go wrong.

You're also absolutely right that @with f::Foo is statically resolved, so it's nowhere near as bad as the JS version – but nevertheless, needing extra information about Foo means that the function no longer stands for itself. That's what we mean by local reasoning.

That's why I also say that this ability might be OK for a config object – if it's something you're only using a couple of times in the same file as the object's definition, that IMO is "local enough". But using it for anything more than that will obfuscate your code more, not make it clearer.

Andrew Simper

Jun 14, 2014, 5:46:01 AM
to julia...@googlegroups.com
But the local reasoning is identical in the case I have proposed, since there are no implicit local variables and you have to explicitly tell the block scope everything it is going to access. So if you do this:

function process8 (state::CircuitModel, input::Float64)
    @withonly state::CircuitModel, input begin
        a = 5
    end
end

you know that since "a" is not "input", what you have written can be interpreted in only one way:
state.a = 5

you would need to specify:
local b

to introduce any new names into the scope, or specify them at the start of the block; otherwise they must come from inside the (one and only possible) composite type CircuitModel.

Now I understand that this is possibly too ugly a proposition, but I feel it does hold up to the "local reasoning" assertion just as well as using explicit dot notation.

Mike Innes

Jun 14, 2014, 6:47:17 AM
to julia...@googlegroups.com
Ok, I see what you mean – make the global scope explicit so that the local scope can be implicit. This is actually a really interesting idea and could make for a neat solution, but it also has problems of its own and would be tricky to implement well. I'll definitely think about it some more when I have time, though.

Andrew Simper

Jun 14, 2014, 9:51:35 AM
to julia...@googlegroups.com
No problem Mike, thanks for taking the time to consider it!

The @with macro you have already written means I can get on and do exactly what I want. I'll only use the @with f::Foo syntax in very specific cases like the ones I've shown, which basically means processing a complicated algorithm with lots of state, where the efficiency lies in the internals of the algorithm and not in optimising how the state is updated.

I'm really loving the whole design and philosophy of Julia. Thanks to everyone who has spent all the time and effort to make it happen! My intention is to make some specific tasks less obfuscated, but I by no means want to destroy the beauty or power of the language, and I really appreciate that Stefan is keeping tight reins to prevent bad things happening, and I am happy to work around things with macros to get what I want.

Stefan Karpinski

Jun 14, 2014, 10:50:09 AM
to julia...@googlegroups.com
This is an issue that DataFrames has struggled with and it's not an easy one. It would be good to figure out a convenient way to deal with it, but it's a tough design problem.