How to write a macro that can substitute variable values into an expression


Walking Sparrow

Feb 1, 2014, 11:52:30 AM
to julia...@googlegroups.com
Please forgive me if this is a stupid question. Suppose I have an expression

:(sin(x) + cos(y) * sin(z))

and the values of x, y, z.

How can I write a macro that substitutes the values of x, y, z into the above expression? The number of values to substitute depends on the use case, so it is not known in advance.

I wrote a function that can do this

function substitute(expr::Expr, vals::Array{Expr,1})
    for i = 1:length(vals)
        @eval $(vals[i])
    end
    @eval $expr
end

x = 10
y = 23

substitute(:(x+y), [:(x = 2), :(y = 3)])

x
y

But if you run the above code, you will see that the values of the global x and y are changed, which is not what I intend. This is because "eval" evaluates in the global scope. Besides, I think using eval is a bad coding pattern, and it is slow.

It would be better if this could be done with a macro, but I have no idea how to do that.
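
One way to avoid the global side effect is to rewrite the expression tree itself, replacing each symbol with its value before evaluating, so that no assignment ever runs. The following is a minimal sketch in current Julia syntax (an assumption, since this thread dates from 2014); the name `substitute_values` and the `Dict` interface are hypothetical, not part of any library.

```julia
# Anything that is not a Symbol or Expr (numbers, strings, ...) is left alone.
substitute_values(ex, vals) = ex
# A symbol is replaced by its value if one was supplied, otherwise kept as is.
substitute_values(s::Symbol, vals) = get(vals, s, s)
# An expression is rebuilt with every argument substituted recursively.
substitute_values(ex::Expr, vals) =
    Expr(ex.head, (substitute_values(a, vals) for a in ex.args)...)

x = 10
y = 23

# Hypothetical interface: a Dict from variable names to values.
new_expr = substitute_values(:(x + y), Dict(:x => 2, :y => 3))  # :(2 + 3)
result = eval(new_expr)                                         # 5
# The globals x and y are untouched: still 10 and 23.
```

This still calls eval once on the finished expression, but only function names such as `+` or `sin` are looked up in global scope; the variables have already been replaced by literal values.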

Walking Sparrow

Feb 1, 2014, 12:04:35 PM
to julia...@googlegroups.com
So the real question is how to generate a code block like this

quote
    x = 2
    y = 3
    .....
    x + y + ....
end

Do I need to embed a for loop inside the macro definition?
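
A block like this can be assembled without a loop by splicing a list of assignment expressions into a quoted block. The sketch below assumes current Julia syntax and uses hypothetical variable names; wrapping the assignments in `let` makes them local, so evaluating the block does not overwrite globals of the same name.

```julia
assignments = [:(x = 2), :(y = 3), :(z = 4)]   # any number of bindings
body = :(x + y + z)

block = quote
    let
        $(assignments...)   # splices x = 2; y = 3; z = 4 into the block
        $body               # the last expression is the value of the block
    end
end

x = 10
value = eval(block)   # 9; the global x is still 10
```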

Jameson Nash

Feb 1, 2014, 5:35:38 PM
to julia...@googlegroups.com
You need to provide more detail on what you are trying to do with this. You seem
to be confusing several concepts involving the usage of expressions,
macros, and functions. I can't tell if you are trying to write special
syntax, or are just unaware of anonymous functions:

Mostly, why is :(sin(x) + cos(y) * sin(z)) an expression, and not a
function? It seems like you perhaps have an R background?

f(x,y,z) = (sin(x) + cos(y) * sin(z))
f(1,2,3)

Walking Sparrow

Feb 1, 2014, 6:42:23 PM
to julia...@googlegroups.com
You are right that I have an R background. What I am trying to do is evaluate a function given by the user. For example,

I want to write a function that can compute the marginal effects of a linear or logistic model. For simplicity, let's just use linear regression. If the user did a linear regression using the following model (I am using the formula syntax from R)

y ~ x + z + sin(x) * sin(z) for the data set my_data, which has three columns x, y, and z

Then the marginal effects at the mean are computed like this: First, compute the first derivative of 1 + x + z + sin(x) * sin(z). This can be done in R using the function "deriv", which returns the expression of the first derivative. Second, substitute the mean values of x and z into the result of the first step. An example of this is the "margins" function in the R package "PivotalR" (http://cran.r-project.org/web/packages/PivotalR/ and https://github.com/gopivotal/PivotalR)

Right now, I have no idea how to do the first step in Julia. But that is OK, because I just started learning Julia.

Now my question is in the second step. The user can use any complex expressions in the linear regression like y ~ x + x*z + log(sin(x) + 2) * log(cos(z) + 2), and the data set my_data and formula can have any number of variables like x1, x2, ...., x1000. So when you write the code for the value substitution in the second step, you cannot know which function and what variables you will have.

So in Julia or R, I need a function or macro F(f, [...]) that does this: given a function f, whose form is supplied by the user, and a set of variable values [...], whose number and names are also supplied by the user, F(f, [...]) returns the value of f evaluated at those values. For example, the user inputs

f = 1 + z + cos(x)*log(2+cos(z))/(2+sin(x))

and [x = 2.3, z = 1.4],

F should return the value of f evaluated at x = 2.3 and z = 1.4.

This can be done in R; see the "margins" function in PivotalR, which actually does big data computation in-database. The question is how to do the same thing in Julia.

Hope my explanation makes my question clearer.

John Myles White

Feb 1, 2014, 7:31:47 PM
to julia...@googlegroups.com
If you want to do this, the easiest way is to define your own implementation of the @~ macro that the latest version of Julia uses to parse expressions that look like R’s formulas.

That will give you access to the quoted expressions you’d need to manipulate to do your analysis.

Given those quoted expressions, you’ll need to define a symbolic differentiation tool that’s rich enough to handle the inputs you want to process. The Calculus package handles symbolic differentiation for a good chunk of functions, but you may need to extend it to your use case.

It may be worth noting that your example makes very heavy usage of R’s non-standard evaluation functionality, which is something that the Julia community has not invested much time into developing yet. Most Julia programmers tend to avoid operating on symbolic expressions.

 — John

Mauro

Feb 2, 2014, 7:09:05 AM
to julia...@googlegroups.com
I don't quite comprehend your problem, so maybe this doesn't help. But
as far as I can tell, there is no need for macros:

Just define your function as a normal function, which can be evaluated
for any (x,z):

julia> f(x,z) = x + x*z + log(sin(x) + 2) * log(cos(z) + 2)
f (generic function with 1 method)
julia> x0 = 2.3; z0 = 1.4;
julia> f(x0,z0)
6.302488546391614

For the first step: use automatic differentiation on the function in
question. The package https://github.com/scidom/DualNumbers.jl can do
this:

julia> using DualNumbers
julia> xdu = dual(x0,1)
2.3 + 1.0du

# this gives (f(x0,z0), \partial f / \partial x at (x0,z0)) :
julia> f(xdu,z0)
6.302488546391614 + 2.2120074784516013du

julia> zdu = dual(z0,1)
1.4 + 1.0du
# this gives (f(x0,z0), \partial f / \partial z at (x0,z0)) :
julia> f(x0,zdu)
6.302488546391614 + 1.8413102782559223du

(double check that the derivatives are right but I think that is how it
should work)
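
To see why evaluating f on a dual number yields a derivative, here is a minimal hand-rolled sketch in current Julia syntax. It is an illustration only, not the DualNumbers.jl implementation: the `Dual` type and its field names are hypothetical, and only the operations that f needs are defined.

```julia
# A dual number carries a value together with the derivative of that value.
struct Dual
    val::Float64
    der::Float64
end

# Sum and product rules, plus mixed Dual/Real methods.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:+(a::Dual, b::Real) = Dual(a.val + b, a.der)
Base.:+(a::Real, b::Dual) = b + a
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.:*(a::Dual, b::Real) = Dual(a.val * b, a.der * b)
Base.:*(a::Real, b::Dual) = b * a

# Chain rule for the elementary functions used in f.
Base.sin(a::Dual) = Dual(sin(a.val), cos(a.val) * a.der)
Base.cos(a::Dual) = Dual(cos(a.val), -sin(a.val) * a.der)
Base.log(a::Dual) = Dual(log(a.val), a.der / a.val)

f(x, z) = x + x*z + log(sin(x) + 2) * log(cos(z) + 2)
x0 = 2.3; z0 = 1.4

dfdx = f(Dual(x0, 1.0), z0).der   # seed x with derivative 1
dfdz = f(x0, Dual(z0, 1.0)).der   # seed z with derivative 1
```

Under these assumptions, `dfdx` and `dfdz` should agree with the `du` parts printed in the session above.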

Walking Sparrow

Feb 2, 2014, 10:41:36 AM
to julia...@googlegroups.com
John, thank you for the suggestions. I learned a lot by reading your blogs.

Walking Sparrow

Feb 2, 2014, 10:47:36 AM
to julia...@googlegroups.com
Let me clarify a little bit. My question is actually the following:

In R, one can do something like

> f <- function(x1, x2, x3, x4, x5, x6) { some expressions that you like to use }
> evaluate.at <- list(x1 = 2, x2 = 2.3, x3 = 2, x4 = 1.2, x5 = 3.4, x6 = 5.6)
> do.call(f, evaluate.at) # get the value


"do.call" can accept any valid function, and a list of variable values. There is no restriction to the function or the number of variables. Sometimes this is very useful.

How do we do similar things in Julia? Does Julia have a function or macro similar to "do.call"?
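
For comparison, a rough analogue of do.call in current Julia (an assumption: named tuples and required keyword arguments did not exist when this thread was written) is to splat a named tuple into keyword arguments, or a plain tuple into positional arguments. The function and variable names below are hypothetical.

```julia
# Keyword-only arguments: values are matched by name, so order does not matter.
f(; x1, x2, x3) = x1 + x2 * x3
evaluate_at = (x3 = 4, x1 = 2, x2 = 3)   # named tuple, any order
by_name = f(; evaluate_at...)            # 2 + 3 * 4

# Positional arguments: values are matched by position only.
g(x1, x2, x3) = x1 + x2 * x3
args = (2, 3, 4)
by_position = g(args...)                 # 2 + 3 * 4
```

The keyword form rejects names the function does not declare, so a named tuple with extra entries would have to be filtered first.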

Johan Sigfrids

Feb 2, 2014, 10:59:59 AM
to julia...@googlegroups.com
Can't you just do this with apply? Something like this:

f = (x, y, z) -> x + y + z^2
let x = 3, y = 4, z = 5
    apply(f, x, y, z)
end

Keno Fischer

Feb 2, 2014, 11:11:34 AM
to julia...@googlegroups.com
Or you could just call the function directly:

f = (x,y,z)->x+y+z^2
let x=3, y=4, z=5
   f(x,y,z)
end

or 

f((x,y,z)...)

or 

f((1,2,3)...)

Walking Sparrow

Feb 2, 2014, 11:49:52 AM
to julia...@googlegroups.com
I guess "apply" and "let" can do some work here. But I do not know the variable names and number that the user would use.

So now I need a macro that can construct the let-apply block with the variable number undetermined. The macro should be able to accept any number of variables.

Suppose that the user inputs

func(x, y) = x+2y and x = 1, y = 2

@my_macro func (x=1, y=2) would be expanded to

let x = 1, y = 2
    apply(func, 1, 2)
end

And if the user inputs

func(a, b, c, d) = a + b + c + d, and a = 1, b = 2, c = 3, d=4

@my_macro func (a = 1, b = 2, c = 3, d = 4) would expand to

let a = 1, b = 2, c = 3, d = 4
    apply(func, 1, 2, 3, 4)
end

How can I write a macro like this?

If I knew the function and the variables, of course I could directly call the function or use let-apply, but the problem is that these are the inputs of the user, which I cannot know beforehand.
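
A macro of this shape can take a variable number of arguments by declaring a trailing `assignments...` parameter. The sketch below assumes current Julia syntax, in which `apply(f, args...)` is written as a plain call `f(args...)` and the bindings are passed as space-separated `name=value` pairs rather than a parenthesized list. It passes the values in the order given and does not match names against the function's own argument names.

```julia
# Hypothetical macro: a function name followed by any number of `name = value` pairs.
macro my_macro(func, assignments...)
    names = [a.args[1] for a in assignments]   # :x, :y, ...
    call = Expr(:call, func, names...)         # func(x, y, ...)
    bindings = Expr(:block, assignments...)    # x = 1, y = 2, ...
    esc(Expr(:let, bindings, call))            # let x = 1, y = 2; func(x, y) end
end

func(x, y) = x + 2y
r1 = @my_macro func x=1 y=2            # 1 + 2 * 2

func4(a, b, c, d) = a + b + c + d
r2 = @my_macro func4 a=1 b=2 c=3 d=4   # 1 + 2 + 3 + 4
```

Because a macro only sees the source text of its arguments, this works when the bindings are written literally at the call site; values that arrive at run time (for example from a data frame) call for an ordinary function instead.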

Johan Sigfrids

Feb 2, 2014, 12:03:21 PM
to julia...@googlegroups.com
Thinking about this, the whole let or macro might be overkill. If the user provides both the function and the arguments, the user should be able to provide the arguments in the correct form for the function, in which case you need neither let nor macros. You could just call apply directly on those two:

user_function(x,y,z) = x + y + z^2
user_arguments = (3, 4, 5)

apply(user_function, user_arguments...)

Walking Sparrow

Feb 2, 2014, 12:35:00 PM
to julia...@googlegroups.com
This is a solution, but it would be better if there were no restriction on the order of the variables. As I have described in one of the above posts, what I really want to do is this:

User inputs a function, and a data.frame which contains all the variables that appear in the function. I will need to substitute the mean values of the variables into the function. (Actually for computing the marginal effects, one also needs to compute the average of the function values evaluated at all rows of the data.frame).

So in order to use apply, I would need to extract the argument names and their order from the user-defined function. This is because the function might be func(x,y,z) while the data.frame has the columns z, x, y, a, b, c, d: it usually has more columns than the function needs, and in a different order.

So John Myles White's opinion is that this is very hard to do in the current Julia (see his post above).
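
One interface that sidesteps argument order and extra columns, sketched here in current Julia syntax (an assumption) with a named tuple of vectors standing in for the data frame, is to have the user-supplied function take a single row-like argument and read the variables it needs by name. All names below are hypothetical.

```julia
using Statistics: mean

# Stand-in for a data frame: a named tuple of columns (more columns than f needs).
data = (z = [1.0, 3.0], x = [2.0, 4.0], y = [0.0, 1.0], a = [5.0, 5.0])

# The user-supplied function picks out the variables it uses by name.
f(row) = row.x + row.z + row.x * row.z

# Column means, keeping the column names: (z = 2.0, x = 3.0, y = 0.5, a = 5.0)
col_means = map(mean, data)
at_means = f(col_means)   # f evaluated at the column means

# Average of f over all rows, as needed for marginal effects.
rows = [map(col -> col[i], data) for i in 1:length(data.x)]
avg_over_rows = mean(f, rows)
```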

Mauro

Feb 2, 2014, 3:35:14 PM
to julia...@googlegroups.com
On Sun, 2014-02-02 at 17:35, hq...@gopivotal.com wrote:
> User inputs a function, and a data.frame which contains all the variables
> that appear in the function. I will need to substitute the mean values of
> the variables into the function. (Actually for computing the marginal
> effects, one also needs to compute the average of the function values
> evaluated at all rows of the data.frame).

I think you're making hacking-life more complicated than it already is!
You'll only need macros if you insist that the names of the function
arguments are automatically matched against the column names of the
dataframe. But I don't think that is a good idea: names of function
arguments are there to refer to values inside the function and not
outside of it. Nor is it, I think, a particularly Julian way of coding.

I suggest instead something like this:

user supplies
- a function:
f(a,b,c,d) = ...
- a DataFrame:
df
- a tuple/list of column names, in the order in which they need to be
inserted into f; i.e. this is a mapping from column names to function
argument positions. E.g.:
(:height, :width, :d, :x)

you provide a function like so:
function F(userfn, datafr, fields)
    # take mean of dataframe columns
    colmeans = [mean(datafr[fl]) for fl in fields]
    # maybe do some more stuff:

    # call user function; the three dots splat colmeans into separate arguments
    return userfn(colmeans...)
end

Then the user can call it like so:
F(f, df, (:height, :width, :d, :x))


I reckon you ought to give this kind of user interface a try and see
whether that works for you.

John Myles White

Feb 3, 2014, 10:18:50 AM
to julia...@googlegroups.com
To make sure everyone’s on the same page, Walking Sparrow’s approach is completely standard for R. The way that R treats certain DataFrames as an additional scope in which to search for variable bindings is something R users have been taught to expect, even though it is an extremely un-Julian way of coding.

All that said, in DataFrames, our current solution is to completely avoid this kind of scoping until we’re confident that we can make it work efficiently. We may come back to it in the future, but there are other priorities to work on now.

— John