.+ and broadcasting

Toivo Henningsson · Apr 24, 2012, 2:13:41 AM · to julia-dev

From what I've gathered, it seems likely that pointwise operators
like .* will be getting broadcast behavior in Julia a la bsxfun/numpy.
(Please correct me if I'm wrong.)

This is a great feature which I really like about numpy, but
occasionally it's a little more magic than you asked for.
Suppose I'm writing some plain linear algebra code (perhaps for 3d
graphics) that manipulates scalars, vectors, and matrices. In that
case, I would really like to get an error message if I, e.g., try to
add a matrix to a scalar or a vector.

There's also distributivity (with .+ acting as broadcasting +; v is a 3-vector):

    (eye(3) .+ 1)*v = v + ones(3,3)*v
                   != eye(3)*v .+ 1*v = 2v

So, my question:
How do people feel about having .+ as broadcasting plus (distributive
over .*), and + as plain plus (distributive over *)?

Excursion:
From the distributivity point of view, I guess ones(3,3) + 1 should be
ones(3,3) + eye(3). But that's probably unexpected enough that it
would be better to throw an error.
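The distributivity failure above is easy to check numerically. Here is a minimal pure-Python sketch (the helpers `matmul_vec` and `broadcast_add` are made up for illustration; `broadcast_add` stands in for the proposed broadcasting .+):

```python
def matmul_vec(A, v):
    # Matrix-vector product, i.e. Julia's A*v, on nested lists.
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def broadcast_add(A, s):
    # The proposed broadcasting .+ with a scalar: add s to every element.
    return [[x + s for x in row] for row in A]

eye3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
v = [1.0, 2.0, 3.0]

lhs = matmul_vec(broadcast_add(eye3, 1), v)  # (eye(3) .+ 1) * v
rhs = [2 * x for x in v]                     # eye(3)*v .+ 1*v == 2v
print(lhs)  # [7.0, 8.0, 9.0], i.e. v + ones(3,3)*v
print(rhs)  # [2.0, 4.0, 6.0] -- broadcasting breaks distributivity
```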

david tweed · Apr 24, 2012, 6:14:08 AM · to julia-dev

On Apr 24, 7:13 am, Toivo Henningsson <toivo....@gmail.com> wrote:
> From what I've gathered, it seems likely that pointwise operators
> like .* will be getting broadcast behavior in Julia a la bsxfun/numpy.
> (Please correct me if I'm wrong.)
>
> This is a great feature which I really like about numpy, but
> occasionally it's a little more magic than you asked for.
> Suppose I'm writing some plain linear algebra code (perhaps for 3d
> graphics) that manipulates scalars, vectors and matrices. In that
> case, I would really like to have an error message if I e g try to add
> a matrix to a scalar or a vector.
>
> There's also distributivity: (.+ being +, v is a 3-vector)
>
>         (eye(3) .+ 1)*v = v + ones(3,3)*v
>     != eye(3) *v .+ 1*v = 2v
>
> So, my question:
> How do people feel about having .+ as broadcasting plus, (distributive
> over .*)
> and + as plain plus? (distributive over *)

In some ways this is an easy case since + is essentially equivalent
to .+ . Let's consider multiplication, where * has a distinctly
different meaning to .* . Since conventions are relatively malleable
at this point, is it worth having an additional modifier prefix symbol
(maybe ~, as in ~+, ~*, etc.) with the essential meaning of Matlab's
bsxfun?

> Excursion:
> From the distributivity point of view, I guess ones(3,3) + 1 should be
> ones(3,3) + eye(3). But that's probably unexpected enough that it
> would be better to throw an error.

I wouldn't say unexpectedness is the biggest problem so much as that
the overwhelming programmer usage is to add the same value to all
elements of a matrix. (The only case I can think of for wanting to add
a constant to the main diagonal is regularisation.)

Toivo Henningsson · Apr 24, 2012, 8:03:08 AM · to julia-dev

On Apr 24, 12:14 pm, david tweed <david.tw...@gmail.com> wrote:
> In some ways this is an easy case since + is essentially equivalent
> to .+ . Let's consider multiplication, where * has a distinctly
> different meaning to .* . Since conventions are relatively malleable
> at this point, is it worth having an additional modifier prefix symbol
> (maybe ~ as in ~+, ~*, etc) with the essential meaning of Matlabs
> bsxfun?

I thought that . was that modifier; there are already .*, ./, and .^
in Julia (although they don't do any broadcasting currently).
But sure, if .+ is not the syntax to use, I would be fine with
something like ~+.

I personally think that + and .+ are also different enough operations
to warrant different operators;
my question is whether other people feel the same.
In domains where broadcasting is not natural, I think restricting
broadcasting to .+ would help catch a number of otherwise nasty bugs,
instead of Julia trying to be helpful by doing the wrong thing.
Writing .+ explicitly should increase the chances that you are aware
of what you are doing.

> > Excursion:
> > From the distributivity point of view, I guess ones(3,3) + 1 should be
> > ones(3,3) + eye(3). But that's probably unexpected enough that it
> > would be better to throw an error.
>
> I wouldn't say unexpectedness is the biggest problem so much as
> overwhelming programmer usage being to add the same value to all
> elements of a matrix. (The only case I can think of for wanting to add
> a main diagonal constant is regularisation.)

Yes, I agree that adding to the diagonal of a (square) matrix is not
too common an operation; I'd be more comfortable doing it explicitly.
So if you write A+1, you would get an error message saying that maybe
A.+1 is what you want, you change it, and hopefully it becomes a habit
soon enough.
Do you think I'm too optimistic?

david tweed · Apr 24, 2012, 8:51:07 AM · to julia-dev

On Apr 24, 1:03 pm, Toivo Henningsson <toivo....@gmail.com> wrote:
> On Apr 24, 12:14 pm, david tweed <david.tw...@gmail.com> wrote:
>
> > In some ways this is an easy case since + is essentially equivalent
> > to .+ . Let's consider multiplication, where * has a distinctly
> > different meaning to .* . Since conventions are relatively malleable
> > at this point, is it worth having an additional modifier prefix symbol
> > (maybe ~ as in ~+, ~*, etc) with the essential meaning of Matlabs
> > bsxfun?
>
> I thought that . was that modifier -- there are already .* ./ and .^
> in julia (although they don't do any broadcasting currently).
> But sure, if .+ is not the syntax to use, I would be fine with
> something like ~+.

My understanding is that Julia follows MATLAB, so that

- plain op is the linear-algebra operator (e.g., * is matrix multiplication)
- .op is the operator applied elementwise (I think this is what you
  refer to as broadcasting) to conformably shaped matrices (e.g., .* is
  multiplication of elements in the same positions in each array)

It just happens that the desired behaviour of + is actually the same
as .+. All this is independent of bsxfun which, from memory, takes a
binary operator and two arrays, one of which has fewer dimensions than
the other, and (conceptually) duplicates the smaller-dimensional array
along the missing dimensions so that the operator can then be applied
to the resulting conformably shaped arrays. (In principle these could
have any number of dimensions, but in practice it tends to be limited
to 2 dimensions.) A typical example would be if you've got a (samples x
vector-dimension) (e.g., 2000 x 3) set of samples: subtracting off the
mean (a 3-element 1-dimensional array) using something like
bsxfun(-, samples, meanVec), whilst samples - meanVec gives a dimensions
error. This kind of behaviour is useful, but as you say it could also
mask simple errors, so it might be useful to have a separate operator
in addition to op and .op that signifies "magical array expansion" is
potentially being used.
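The bsxfun pattern described above can be sketched in pure Python. Here `bsxfun` is a toy reimplementation covering only the 2-D row-vs-vector case (Matlab's real bsxfun handles general singleton expansion), and the sample numbers are made up:

```python
import operator

def bsxfun(op, samples, vec):
    # Toy 2-D bsxfun: apply a binary op between each row of the
    # (n x d) `samples` array and a d-element vector, conceptually
    # duplicating `vec` along the missing (row) dimension.
    return [[op(x, y) for x, y in zip(row, vec)] for row in samples]

samples = [[1.0, 2.0, 3.0],
           [3.0, 4.0, 5.0]]
mean_vec = [2.0, 3.0, 4.0]  # column-wise mean of `samples`

centered = bsxfun(operator.sub, samples, mean_vec)
print(centered)  # [[-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]]
```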

> I personally think that also + and .+ are different enough operations
> to warrant different operators:
> my question is if other people feel the same.
> In domains where broadcasting is not natural, I think restricting
> broadcasting to .+ would help catch a number of otherwise nasty bugs,
> instead of julia trying to be helpful by doing the wrong thing.
> Writing .+ explicitly should increase the chances that you are aware
> of what you are doing.
>
> > > Excursion:
> > > From the distributivity point of view, I guess ones(3,3) + 1 should be
> > > ones(3,3) + eye(3). But that's probably unexpected enough that it
> > > would be better to throw an error.
>
> > I wouldn't say unexpectedness is the biggest problem so much as
> > overwhelming programmer usage being to add the same value to all
> > elements of a matrix. (The only case I can think of for wanting to add
> > a main diagonal constant is regularisation.)
>
> Yes, I agree that adding to the diagonal of a (square) matrix is not a
> too common operation, I'd be more comfortable doing it explicitly.
> So if you write A+1, you would get an error message saying that maybe
> A.+1 is what you want, you change it, and hopefully it becomes a habit
> soon enough.
> Do you think I'm too optimistic?

I think I've confused two separate things in this thread: bsxfun and
the native/elementwise distinction. So the natural question would be
what should be done in a case like (A is an nxm 2-D array, v is a 1-D
m-element vector) A.+v? Should that do an implicit "bsxfun", or should
doing that require a different operator?

Stefan Karpinski · Apr 24, 2012, 12:12:05 PM · to juli...@googlegroups.com

On Tue, Apr 24, 2012 at 8:51 AM, david tweed <david...@gmail.com> wrote:
So the natural question would be what should be done in a case like (A is an nxm 2-D array, v is a 1-D m-element vector) A.+v? Should that do an implicit "bsxfun", or should doing that require a different operator?

I've advocated before that it ought to automatically broadcast the singleton dimensions so that A+v broadcasts the elements of v along the rows of A and A'+v' broadcasts the elements of v' along the columns of A'. These sorts of things are very common when, e.g., normalizing matrices.
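Both orientations can be sketched in pure Python (`add_bcast` and `transpose` are hypothetical helpers standing in for the proposed automatic broadcasting; the arrays are made-up examples):

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def add_bcast(A, v):
    # Broadcast the m-length vector v along the rows of the m x n
    # matrix A: element v[i] is added to every entry of row i.
    return [[x + v[i] for x in row] for i, row in enumerate(A)]

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # 3 x 2
v = [10.0, 20.0, 30.0]                     # length 3

by_rows = add_bcast(A, v)                  # the A+v case: v down the rows
by_cols = transpose(add_bcast(transpose(A), [1.0, 2.0]))  # the A'+v' case

print(by_rows)  # [[11.0, 12.0], [23.0, 24.0], [35.0, 36.0]]
print(by_cols)  # [[2.0, 4.0], [4.0, 6.0], [6.0, 8.0]]
```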

Toivo Henningsson · Apr 24, 2012, 12:22:11 PM · to julia-dev

On Apr 24, 6:12 pm, Stefan Karpinski <ste...@karpinski.org> wrote:
So how do you feel about having one + operator that does broadcasting
and one that doesn't?

Stefan Karpinski · Apr 24, 2012, 12:38:28 PM · to juli...@googlegroups.com

The problem would be that * and .* already have meanings, as do / and ./, so we'd be in a situation where we'd need a whole different set of operators. I'd prefer to either always do broadcasting for element-wise operators or never do broadcasting and require a function like bsxfun to broadcast. 

Toivo Henningsson · Apr 24, 2012, 1:10:39 PM · to julia-dev

On Apr 24, 6:38 pm, Stefan Karpinski <ste...@karpinski.org> wrote:
> The problem would be that * and .* already have meanings, as do / and ./,
> so we'd be in a situation where we'd need a whole different set of
> operators.

But I think that having .+ be broadcasting (a la bsxfun) would be
perfectly consistent with the current meaning of .* and ./, provided
they do broadcasting, of course. I've just assumed that when you
advocated broadcasting, .* and ./ would do it; maybe I
misunderstood?

> I'd prefer to either always do broadcasting for element-wise
> operators or never do broadcasting and require a function like bsxfun to
> broadcast.

Ok, so how do you feel about having .+ .- .* ./ etc. do broadcasting,
while + - * / don't?

Now, in case .* ./ .^ wouldn't broadcast, I think it would be really
helpful to have a set of elementwise operators that do; this would
make broadcasting-style code much more readable. ~+ ~- ~* ~/ ? Not
ideal :)

I guess we should be careful with the ASCII real estate :) There aren't
many really good operator names left. Of course, overspending on ASCII
now will only aggravate the situation further on...

Stefan Karpinski · Apr 24, 2012, 1:37:34 PM · to juli...@googlegroups.com

You seem to be missing the fact that * and / already have non-element-wise meanings: matrix multiplication and division. We can't just redefine them to mean something else.

david tweed · Apr 24, 2012, 2:33:32 PM · to julia-dev

On Apr 24, 5:38 pm, Stefan Karpinski <ste...@karpinski.org> wrote:
> The problem would be that * and .* already have meanings, as do / and ./,
> so we'd be in a situation where we'd need a whole different set of
> operators. I'd prefer to either always do broadcasting for element-wise
> operators or never do broadcasting and require a function like bsxfun to
> broadcast.

I'd be comfortable with either case for interactive use; you'd notice
pretty quickly that things weren't doing what you thought. What
would be more important is the use of functions from libraries (even
libraries the programmer themselves wrote and has now forgotten the
details of): it'd be unfortunate if, in cases where bsxfun/broadcasting
would never make sense, the automatic usage made incorrect
function calls "silently produce wrong results". Maybe Julia could
require explicit action like Matlab, but find some more streamlined
syntax, e.g., something like A bsx(+) y. (I suspect the shorter bs(+)
won't be popular with Americans who need to discuss code with their
colleagues :-) )

david tweed · Apr 24, 2012, 2:51:29 PM · to julia-dev

On Apr 24, 7:33 pm, david tweed <david.tw...@gmail.com> wrote:
> [...] it'd be unfortunate if in cases where bsxfun/broadcasting
> would never make sense, the automatic usage would make incorrect
> function calls "silently produce wrong results".

Maybe I haven't fully made the mental transition from Matlab to Julia.
Since, unlike Matlab, you've got typed function prototypes, those would
probably catch most of the errors without the need for tedious user
code checking that arguments which must match exactly (not just after
bsxfun'ing) actually do.

Toivo Henningsson · Apr 24, 2012, 4:19:20 PM · to julia-dev

On Apr 24, 7:37 pm, Stefan Karpinski <ste...@karpinski.org> wrote:
> You seem to be missing the fact that * and / already have non-element-wise
> meanings: matrix multiplication and division. We can't just redefine them
> to mean something else.
>
> On Tue, Apr 24, 2012 at 1:10 PM, Toivo Henningsson <toivo....@gmail.com> wrote:
> > Ok, so how do you feel about having .+ .- .* ./ etc do broadcasting,
> > while + - * / don't?

Oh, I realize that might have sounded like * and / should be
elementwise; sorry. I definitely want to keep them as matrix multiply
and divide.
Under the premise that the pointwise .* and ./ be made to broadcast,
all I'm suggesting is to have one plus that broadcasts and one that
doesn't. That's the only system I can think of that would be
consistent with "all pointwise operators broadcast, all others don't":

The way I think about it, * and / work on matrices, i.e. linear
operators. The natural definition of + that goes with this is the one
that makes * distribute over +

    A*x + B*x = (A + B)*x   # A, B linear operators, operating on column vector x

i.e. + adds linear operators. It just happens that you do that by
adding the corresponding matrix elements (but it doesn't make sense to
broadcast).

Now .* and ./ are elementwise. Suppose that they do broadcast.
To have .* distribute over .+, we need e.g. that

    R.*C .+ A.*C = (R .+ A).*C   # R = row array, C = column array, A = 2d array

i.e. since .* broadcasts, .+ must do so too. It is pointwise plus.
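The distributivity requirement in this argument can be verified numerically. A pure-Python sketch, where `bcast` is a toy stand-in for broadcasting elementwise operators on 2-D nested lists (it only handles singleton dimensions of length 1, which suffices here):

```python
from operator import add, mul

def bcast(op, X, Y):
    # Toy elementwise op on 2-D nested lists, with bsxfun-style
    # expansion of singleton (length-1) rows or columns.
    rows = max(len(X), len(Y))
    cols = max(len(X[0]), len(Y[0]))
    return [[op(X[i % len(X)][j % len(X[0])],
                Y[i % len(Y)][j % len(Y[0])])
             for j in range(cols)] for i in range(rows)]

R = [[1.0, 2.0]]              # row array (1 x 2)
C = [[3.0], [4.0]]            # column array (2 x 1)
A = [[5.0, 6.0], [7.0, 8.0]]  # 2d array (2 x 2)

lhs = bcast(add, bcast(mul, R, C), bcast(mul, A, C))  # R.*C .+ A.*C
rhs = bcast(mul, bcast(add, R, A), C)                 # (R .+ A).*C
print(lhs == rhs)  # True: since .* broadcasts, .+ must broadcast too
```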

Harlan Harris · Apr 24, 2012, 4:28:23 PM · to juli...@googlegroups.com

I vote for broadcasting elementwise operators, as it makes REPL work much simpler. But add the Julian equivalent of a "use strict" pragma that throws an error if an operation ends up broadcasting implicitly. You'd set that pragma in production code that you know should not be broadcasting implicitly. (You could still do bsxfun manually, of course.)

 -Harlan

Stefan Karpinski · Apr 24, 2012, 4:32:38 PM · to juli...@googlegroups.com

If it's good enough for the repl, it's good enough for production, imo. I think we should just do autobroadcasting of singleton dimensions when doing element-wise tensor operations. I really doubt that this will be a source of unexpected bugs.

Viral Shah · Apr 24, 2012, 4:36:02 PM · to juli...@googlegroups.com

Yes, I think it is worth trying out having pointwise operators implement broadcasting. Anything that helps reduce temporary creation is worth serious consideration!

-viral

Toivo Henningsson · Apr 24, 2012, 4:39:27 PM · to julia-dev

On Apr 24, 10:32 pm, Stefan Karpinski <ste...@karpinski.org> wrote:
> If it's good enough for the repl, it's good enough for production, imo. I
> think we should just do autobroadcasting of singleton dimensions when doing
> element-wise tensor operations. I really doubt that this will be a source
> of unexpected bugs.

Sorry if I'm being a pain, but I'd really appreciate getting your
opinion on what is, to me, the bottom line of this discussion: do you
think there is such a thing as a non-elementwise plus?

Stefan Karpinski · Apr 24, 2012, 4:43:13 PM · to juli...@googlegroups.com

On Tue, Apr 24, 2012 at 4:19 PM, Toivo Henningsson <toiv...@gmail.com> wrote:

The way I think about it, * and / work on matrices, i.e. linear
operators. The natural definition of + that goes with this is the one
that makes * distribute over +

    A*x + B*x = (A + B)*x   # A, B linear operators, operating on column vector x

i.e. + adds linear operators. It just happens that you do that by
adding the corresponding matrix elements (but it doesn't make sense to
broadcast).

Now .* and ./ are elementwise. Suppose that they do broadcast.
To have .* distribute over .+, we need e.g. that

    R.*C .+ A.*C = (R .+ A).*C   # R = row array, C = column array, A = 2d array

i.e. since .* broadcasts, .+ must do so too. It is pointwise plus.

Ok, now this argument makes perfect sense to me. (I was an algebraist in a former life.)

This brings up a rather interesting point, however: scalars. By this argument, doing A+1 should not work — instead you would require A.+1 for scalar broadcasting, which, if you think about it, is just the most extreme case of broadcasting singleton dimensions. And indeed, (A+1)*B does not distribute — i.e. it is not the same as A*B+1*B.

However, this does bring up an interesting possibility: what if Matrix+Scalar did something different and more algebraically correct? The algebraically correct behavior for this would be:

+(A::Matrix, x::Number) = A+x*one(A)

With that behavior * distributes over + even for matrices:

(A+1)*B == A*B+1*B

This would be a big win for generic programming — e.g. polynomial expressions would apply trivially to matrices without needing any special care. The cost, of course, is a bit of incompatibility with Matlab: most usages of Matlab's + operator should be changed to .+ instead. That's a really easy search-and-replace, however.
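The proposed definition is easy to sanity-check numerically. A pure-Python sketch (`matmul`, `matadd`, and `plus_scalar` are toy helpers made up here; `plus_scalar` mimics the A+x*one(A) behavior, i.e. adding x on the diagonal):

```python
def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def plus_scalar(A, x):
    # The proposed Matrix + Number: A + x*one(A), i.e. add x on the diagonal.
    return [[A[i][j] + (x if i == j else 0) for j in range(len(A))]
            for i in range(len(A))]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

lhs = matmul(plus_scalar(A, 1), B)  # (A + 1) * B
rhs = matadd(matmul(A, B), B)       # A*B + 1*B
print(lhs == rhs)  # True: * distributes over + under this definition
```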

Stefan Karpinski · Apr 24, 2012, 5:06:50 PM · to juli...@googlegroups.com

The fact that scalars already do broadcast in element-wise operations seems to me to be a pretty powerful argument for all singleton dimensions broadcasting in element-wise operations. Otherwise, how do you justify the scalar special case where no other broadcasting is done automatically? It also indicates that broadcasting in element-wise operations is probably not dangerous — if it was, it would already cause problems for scalar broadcasts.

Viral Shah · Apr 24, 2012, 5:07:10 PM · to juli...@googlegroups.com

This would be a big win for generic programming — e.g. polynomial expressions would apply trivially to matrices without needing any special care. The cost, of course, is a bit of incompatibility with Matlab: most usages of Matlab's + operator should be changed to .+ instead. That's a really easy search-and-replace, however.

I like the improved consistency and better code readability, but we may get far too many questions from newcomers. We could always add matrix + scalar back at a later point if that happens.

This would really clean up the meaning of *, making it a pure matrix multiplication operator.

-viral

Toivo Henningsson · Apr 24, 2012, 5:08:51 PM · to julia-dev

On Apr 24, 10:43 pm, Stefan Karpinski <ste...@karpinski.org> wrote:
> Ok, now *this* argument makes perfect sense to me. (I was an algebraist in
> a former life.)
> [...]
> This would be a big win for generic programming — e.g. polynomial
> expressions would apply trivially to matrices without needing any special
> care. The cost, of course, is a bit of incompatibility with Matlab: most
> usages of Matlab's + operator should be changed to .+ instead. That's a
> really easy search-and-replace, however.

I think we're finally on the same page! :)
And yes, I also think matrix + scalar should be done only with .+

Stefan Karpinski · Apr 24, 2012, 5:23:03 PM · to juli...@googlegroups.com

The idea that * + / - are all consistently the correct linear operations on matrices has tremendous appeal. I would argue that it would actually prevent bugs if the intentional distinction between + and .+ was made explicit like this. Did you mean to broadcast that scalar? Being able to just write A+1 for A+one(A) — which is even more difficult to write in Matlab — is kind of a big win. Of course it might be best to leave that out for now, just for safety.

I don't think this scheme is hard to explain at all: dotless operators are pure linear operators; dotted operators are element-wise operations with autobroadcast of singleton dimensions. Done. It's the mixed up behaviors of Matlab that are hard to explain.

Jeff Bezanson · Apr 24, 2012, 5:28:03 PM · to juli...@googlegroups.com

I like it. It's going to be possibly annoying to port our code, and
for people to port Matlab code, but I do like it.
My current bsxfun is quite slow, and we will need a serious implementation.

Stefan Karpinski · Apr 24, 2012, 5:34:25 PM · to juli...@googlegroups.com

Basic porting of Matlab code will just require s/\.\+/+/g; s/\+/.+/g. If you want to be more careful, you can go through and actually think about whether each operation should be a matrix op or a broadcast op.

We might also want to insist on the same actual shape for the dotless ops, whereas the dotted ops would broadcast singleton dimensions, including implied trailing ones. That would make adding a vector and a column matrix with + an error, whereas using .+ the result would be a column matrix (in fact, the result would be a matrix regardless of whether the original matrix was a single column or not).

Viral Shah · Apr 24, 2012, 10:49:16 PM · to juli...@googlegroups.com

We should also extend this to the comparison and boolean operators. .&& does not look as bad as I once used to think, but I am not so sure about .!=

APL, here we come :-)

-viral

Stefan Karpinski · Apr 24, 2012, 10:54:45 PM · to juli...@googlegroups.com

I'm not convinced about that part.

Viral Shah · Apr 24, 2012, 11:05:09 PM · to juli...@googlegroups.com

Would be nice to hear counter-arguments from other long-time Matlab users on the whole idea.

What should the broadcast rules be? Which direction to broadcast in? How do I do it along a different direction than the default one? Perhaps we will continue to have functions to do this in a more general way. Would A + b broadcast differently than b + A, where b is a vector and A is an array?


-viral

Stefan Karpinski · Apr 24, 2012, 11:30:51 PM · to juli...@googlegroups.com

The broadcast is simple: you broadcast any singleton dimension along the corresponding non-singleton dimension of the other array, considering absent trailing dimensions as singletons. Suppose A is an mxn matrix and v is an m-length vector. Then A.+v is equivalent to, but more efficient than, A+v*ones(1,n). Likewise, A'.+v' is equivalent to, but more efficient than, A'+ones(n,1)*v'.

Having A+v be different than v+A would be insane.
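This rule for a matrix plus a vector can be sketched in pure Python (`dot_plus` is a hypothetical stand-in for the proposed .+, hard-coding only the m-vector-plus-mxn-matrix case; the arrays are made-up examples):

```python
def dot_plus(A, v):
    # Sketch of the proposed rule for A .+ v: the m-length vector v has
    # implied shape (m, 1); its absent trailing dimension is treated as a
    # singleton and broadcast along the n columns of the m x n matrix A,
    # so A .+ v behaves like A + v*ones(1, n).
    return [[A[i][j] + v[i] for j in range(len(A[0]))]
            for i in range(len(A))]

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]      # 2 x 3
v = [10.0, 20.0]           # length 2

print(dot_plus(A, v))  # [[11.0, 12.0, 13.0], [24.0, 25.0, 26.0]]
```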

Jeff Bezanson · Apr 24, 2012, 11:43:08 PM · to juli...@googlegroups.com

See our existing bsxfun() in abstractarray.jl for a "reference implementation".

Viral Shah · Apr 25, 2012, 12:29:24 AM · to juli...@googlegroups.com

On 25-Apr-2012, at 9:00 AM, Stefan Karpinski <ste...@karpinski.org> wrote:

The broadcast is simple: you broadcast any singleton dimension along the corresponding non-singleton dimension of the other array, considering absent trailing dimensions as singletons. Suppose A is an mxn matrix and v is an m-length vector. Then A.+v is equivalent to, but more efficient than, A+v*ones(1,n). Likewise, A'.+v' is equivalent to, but more efficient than, A'+ones(n,1)*v'.

What if A is square or an n-d cube?

Having A+v be different than v+A would be insane.

I know, but just threw it into the mix.

-viral

Tim Holy · Apr 25, 2012, 4:28:19 AM · to juli...@googlegroups.com

On Wednesday, April 25, 2012 08:35:09 AM Viral Shah wrote:
> Would be nice to hear counter arguments from other long time matlab users on
> the whole idea.

I'm a long-time Matlab user whose code contains, I'd guess, something like at
least one call to bsxfun every 50 or so lines (across all types of projects,
and a much higher density on some). Replacing it with broadcasting on '.'
operators seems likely to be an improvement to me, and quite intuitive given
what .* already means.

I would have more reservations about broadcasting simple '+', but that is not
what is being proposed here. So I guess I'd say I am entirely in favor.

Best,
--Tim

Toivo Henningsson · Apr 25, 2012, 5:13:13 AM · to julia-dev

On Apr 24, 11:23 pm, Stefan Karpinski <ste...@karpinski.org> wrote:
> The idea that * + / - are all consistently the correct linear operations on
> matrices has tremendous appeal. I would argue that it would actually
> prevent bugs if the intentional distinction between + and .+ was made
> explicit like this. Did you mean to broadcast that scalar? Being able to
> just write A+1 for A+one(A) — which is even more difficult to write in
> Matlab — is kind of a big win. Of course it might be best to leave that out
> for now, just for safety.

And just as polynomial evaluation would generalize to matrix
polynomials, insisting on .+ for

    column vector .+ scalar    # (1)

makes that generalize to

    2d-array .+ row matrix,

i.e. when you stack a number of the operations in (1) rowwise.

I think this is a case in point that explicitly encoding intent in the
source code (+ vs. .+ in this case) aids generalization (by
overloading generalizations rather than intents).
I think this also illustrates the fact that mathematics is consistent,
but mathematicians aren't. They use all kinds of different notation,
where the intent/interpretation must be read just as much from the
context as from the actual formulas. Now that's a bit harder for a
compiler :)

There are so many beautiful mathematical structures out there, and I
think you could make a number of them generalize in the same way in a
programming language as they do in mathematical theory. We've just
become so used to programming languages not being as consistent as
math that we don't really see that there could be another way.
So I think you should think twice before overloading different
behaviors, corresponding to different mathematical concepts, on the
same operator. They will probably generalize differently, like + and .+.

> I don't think this scheme is hard to explain at all: dotless operators are
> pure linear operators; dotted operators are element-wise operations with
> autobroadcast of singleton dimensions. Done. It's the mixed up behaviors of
> Matlab that are hard to explain.

Agreed!

Toivo Henningsson · Apr 25, 2012, 5:24:46 AM · to julia-dev

You mean to require .* for scalar .* matrix? I like it!
Of course,

    column vector * scalar
    scalar * row vector

would still work with *, since that's consistent with matrix
multiplication.

Toivo Henningsson · Apr 25, 2012, 3:09:35 PM · to julia-dev

On Apr 24, 11:34 pm, Stefan Karpinski <ste...@karpinski.org> wrote:
> Basic porting of Matlab code will just require s/\.\+/+/g; s/\+/.+/g. If
> you want to be more careful, you can go through and actually think about
> whether each operation should be a matrix op or a broadcast op.
>
> We might also want to insist on same actual shape for the dotless ops
> whereas the dotted ops would broadcast singleton dimensions, including
> implied trailing ones. That would make adding a vector and a column matrix
> with + an error, whereas using .+ the result would be a column matrix (in
> fact, the result would be a matrix regardless of whether the original
> matrix was a single column or not).

How about making

    ndims(X+Y) = min(ndims(X), ndims(Y)) ?

Since covectors and other ideas to make x'' == x were deemed too
complex, I think you should acknowledge the fact that matrices
sometimes get unintended singleton dimensions added. (And that scalars
sometimes become single-element matrices.) By writing the linear
operator sum X+Y you assert that they were meant to have the same
ndims, which couldn't be more than either ndims(X) or ndims(Y)!

This would probably be able to remove a lot of the spurious dimensions
that show up in matrix computations, without any singleton dependence
or other nasty inconsistency.
E.g. x'*y + 1 would be a scalar (x, y vectors),
and x'' + y would be a vector.

I think ones(1,1) + 1 should still produce a 1x1 matrix, to be
consistent with the A+1 = A+one(A) rule. But I think the only
reasonable intent for
vector + matrix is to produce a vector, and for
scalar + vector to produce a scalar.

Maybe the A+1 = A+one(A) rule is a special case of some more general
operation to raise the rank of a tensor by 2. Is there anyone here
with tensor experience who can say something about this?
But at least if ndims(X) and ndims(Y) differ by one, I think the
smaller one should be the reasonable choice for X+Y.

I guess this is a little more complicated than I thought...

Stefan Karpinski · Apr 25, 2012, 3:14:34 PM · to juli...@googlegroups.com

I think that for now the most conservative behavior is the best choice, which is to just error out if two things aren't exactly the same shape for something like the + operator. If that turns out to be too annoying, then at least we'll have some idea what behavior would be better. At the moment we're just guessing.

Toivo Henningsson · Apr 25, 2012, 3:58:42 PM · to julia-dev

On Apr 25, 9:14 pm, Stefan Karpinski <ste...@karpinski.org> wrote:
> I think that for now the most conservative behavior is the best choice,
> which is to just error out if two things aren't exactly the same shape for
> something like the + operator. If that turns out to be too annoying, then
> at least we'll have some idea what behavior would be better. At the moment
> we're just guessing.

Good point :)

Toivo Henningsson

unread,
Apr 26, 2012, 5:58:54 AM4/26/12
to julia-dev
Come to think about it, I think multiplying an actual scalar (non-
Array) and an Array with * should still work, as this corresponds to
multiplication of linear operators.

Kevin Squire

unread,
Apr 26, 2012, 1:31:51 PM4/26/12
to juli...@googlegroups.com
Hi all,

To express a contrary point of view, in all of Matlab/Octave, numpy, and R, (A+1) does the same thing.  As a user familiar with all of those, I kind of expect the A+1 broadcast behavior to just work, even if it's not mathematically sound.  I especially expect 2*A to work (although Toivo expressed that he thought this should still work--would it?).  

Stefan suggested that 

   +(A::Matrix, x::Number) = A+x*one(A)

in order to make distribution over matrix multiplication work.  I'm going to suggest that that's a bad idea, because it silently produces a different behavior from what a new/naive (and former Matlab) programmer would expect.  If a user really wants to add to the diagonal, she should explicitly use one() or eye():

   (A + one(A))*B

A+1 should either work as in Matlab/numpy/R or generate an error.
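For concreteness, the two candidate meanings give different answers in every off-diagonal entry. A quick numpy illustration (numpy's + implements the broadcast reading; np.eye stands in for one(A)):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Broadcast reading (Matlab/numpy/R): add 1 to every entry.
broadcast_sum = A + 1
print(broadcast_sum.tolist())   # [[2.0, 3.0], [4.0, 5.0]]

# Operator reading (A + 1*one(A)): add 1 on the diagonal only.
operator_sum = A + np.eye(2)
print(operator_sum.tolist())    # [[2.0, 2.0], [3.0, 5.0]]
```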

While I don't think that users' habits and notational convenience are the only consideration, keeping things familiar and convenient (for the user) makes it that much easier to adopt a new language.  I had to overcome a bit of inertia to transition from Matlab to numpy, because it seemed too verbose to me (e.g., requiring commas in matrix rows, explicitly specifying array() or matrix(), etc.).  A similar thing happened when I tried scalala.  Small differences like this can be annoying when trying a new language.  Not insurmountable, just annoying.

Perhaps instead of making attempted broadcast on + an error, it could be made a warning, with an option to be silenced?  At the minimum, even if it's made an error, a friendly suggestion to the user would be helpful:

julia> A = rand(4,4)
4x4 Float64 Array:
 0.840078  0.424518   0.0314841  0.6444  
 0.161809  0.0230639  0.0210142  0.915143
 0.624939  0.339493   0.248527   0.829788
 0.440358  0.980172   0.0333061  0.207798

julia> A+1
matrix + scalar not allowed, sorry.  Try "A .+ 1".

(Still, I have to say that I find it annoying when I know the program can figure out what I mean, and choose not to do it.  I want it to "just work".)

Kevin

P.S. I just joined the group and the conversation, and despite any contrariness in my message, julia looks fantastic (even if this change is made).  Thanks!

Kevin Squire

unread,
Apr 26, 2012, 1:44:34 PM4/26/12
to juli...@googlegroups.com
Replying to my own post...

On the other hand, adding a scalar to a matrix is not something I remember doing very often.  If anything, I might have done 

   A+eps

I'm sure I have done it elsewhere, but does anyone have any good use cases?  

Kevin

Stefan Karpinski

unread,
Apr 26, 2012, 1:50:36 PM4/26/12
to juli...@googlegroups.com
You make fair points, but I suspect, as you say, it's just not that common. An interesting question is how many cases where someone does matrix+scalar are actually bugs that would be uncovered by forcing the use of .+ to express this.

Elliot Saba

unread,
Apr 26, 2012, 3:25:17 PM4/26/12
to juli...@googlegroups.com
What would (A + 1) do when A is not square, say a 2x5 matrix? Throw an error?

Adding/subtracting scalars is common when demeaning a timeseries, and in signal processing it's common to have a timeseries with more than one dimension (multichannel audio, for example).  I still think requiring (A .- mean(A)) is a good idea, but I guarantee that newbies (and anyone that has left julia for a while and comes back to it later) will write (A - mean(A)) multiple times before it sticks.
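The demeaning case in numpy terms (assuming channels along the first axis; keepdims is what makes the per-channel subtraction broadcast):

```python
import numpy as np

# Two-channel "audio": shape (2 channels, 4 samples).
A = np.array([[ 1.0,  2.0,  3.0,  4.0],
              [10.0, 20.0, 30.0, 40.0]])

# Subtract each channel's own mean; keepdims keeps the means as a
# (2, 1) column so the subtraction broadcasts across the samples.
demeaned = A - A.mean(axis=1, keepdims=True)
print(demeaned[0].tolist())   # [-1.5, -0.5, 0.5, 1.5]
print(demeaned[1].tolist())   # [-15.0, -5.0, 5.0, 15.0]
```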
-E

Toivo Henningsson

unread,
Apr 28, 2012, 8:24:50 AM4/28/12
to julia-dev


On Apr 26, 9:25 pm, Elliot Saba <staticfl...@gmail.com> wrote:
> What would (A + 1) do when A is not square, say a 2x5 matrix? Throw an
> error?

Yes.