Return type of eye()

351 views
Skip to first unread message

Júlio Hoffimann

unread,
Aug 28, 2016, 3:39:20 PM8/28/16
to julia-users
Hi,

I wanna revive a discussion around the return type of eye(). Currently this function always returns a dense matrix, which is not desired considering that the identity acts as a simple scalar in every single place it is used.

By exploiting the type system, one would ideally write algorithms with eye() and get specialized function calls for a "ConstantMatrix" type saving memory and unnecessary calculations. People have raised the existence of this variable "I" defined in Base that acts as identity, but I still think that this is not robust given that 1) I can easily overwrite "I" in my user code and 2) 90% of the users won't be aware of these internal details in their day-to-day coding.

Also, can someone explain what is the use of eye(m,n) with m != n? If that case is relevant for some applications, I would still expect to get a SparseMatrix instead.

Can you please share your opinions on this matter?

-Júlio


Andreas Noack

unread,
Aug 29, 2016, 8:44:16 AM8/29/16
to julia-users
You can also overwrite eye

Could you elaborate on the "90% of the users won't be aware of these internal details in their day-to-day coding" part? If we ignore the name for a while, why is `I` not what you want here? It is as efficient as it can possibly be.

Júlio Hoffimann

unread,
Aug 29, 2016, 9:17:34 AM8/29/16
to Julia Users

Hi Andreas,

As a user I would like to write

B = eye(10000) * A

and have the performance of

B = A

90% of the users won't be aware of this 1-character variable "I" defined in Base nor use it. Also, I can guarantee that "I" is much easier to overwrite than a well known function name.

-Júlio

Andreas Noack

unread,
Aug 29, 2016, 9:24:57 AM8/29/16
to julia...@googlegroups.com
If we could somehow make `I` more visible, wouldn't you think that

B = I*A

is better than 

B = eye(10000)*A

?

Small side note: the best we can hope for is probably performance similarly to B = copy(A) because it wouldn't be okay to alias A and B when B has been constructed from *.

Júlio Hoffimann

unread,
Aug 29, 2016, 9:29:56 AM8/29/16
to Julia Users

I still think that having a "global variable" named "I" is not robust. I've read so many scripts in matlab that do I = eye(n). This approach is not gonna work.

-Júlio

Andreas Noack

unread,
Aug 29, 2016, 9:32:38 AM8/29/16
to julia...@googlegroups.com
We could deprecate eye. Then the users would get a warning directing them to use `I` instead.

Júlio Hoffimann

unread,
Aug 29, 2016, 9:39:19 AM8/29/16
to Julia Users

I'd like to understand the existence of eye() in Julia, it is still not clear to me. Is it because one wants type stability when updating a matrix iteratively? Is this possibly a limitation from the design of the language?

-Júlio

Evan Fields

unread,
Aug 29, 2016, 9:49:41 AM8/29/16
to julia-users


On Monday, August 29, 2016 at 9:39:19 AM UTC-4, Júlio Hoffimann wrote:

I'd like to understand the existence of eye() in Julia, it is still not clear to me. Is it because one wants type stability when updating a matrix iteratively? Is this possibly a limitation from the design of the language?

-Júlio


I've used it for things like
m = m + lambda*eye(m)

 

Andreas Noack

unread,
Aug 29, 2016, 9:53:34 AM8/29/16
to julia...@googlegroups.com
Julia is developing over time. Originally, eye was probably implemented to mimic Matlab. Later we realized that the type system allowed us to define the much nicer UniformScaling which has the special case

const I = UniformScaling(1)

which is almost alway better to use unless you plan to modify some of the elements of the diagonal matrix after construction. See also 


and maybe the rest of that discussion.

Andreas Noack

unread,
Aug 29, 2016, 10:02:22 AM8/29/16
to julia...@googlegroups.com
Evan, this is exactly where you should use I, i.e.

m = m + λ*I

The reason is that eye(m) will first allocate a dense matrix of size(m,1)^2 elements. Then * will do size(m,1)^2 multiplications of lambda and allocate a new size(m,1)^2 matrix for the result. Finally, size(m,1)^2 additions will be computed for the elements in m and lambda*eye(m).

In contrast λ*I will only make a single multiplication, store the result in a tiny stack allocated scalar and only add this to the diagonal of m, i.e. size(m,1) additions.

Chris Rackauckas

unread,
Aug 29, 2016, 10:08:36 AM8/29/16
to julia-users
Isn't `I` better here?

Júlio Hoffimann

unread,
Aug 29, 2016, 10:21:27 AM8/29/16
to Julia Users

Andreas, is there a way to get the best of both worlds? Let's say eye() is deprecated, can we somehow set off-diagonal terms in a type that is smart like UniformScaling and supports indexing with operator []?

-Júlio

Kristoffer Carlsson

unread,
Aug 29, 2016, 10:25:49 AM8/29/16
to julia-users
You mean a sparse matrix?

Júlio Hoffimann

unread,
Aug 29, 2016, 10:31:21 AM8/29/16
to Julia Users

It is more than sparse, it acts as scalar at first, maybe the operator [] modifies the type to sparse?

-Júlio

Chris Rackauckas

unread,
Aug 29, 2016, 10:32:34 AM8/29/16
to julia-users
But you don't want a sparse matrix. It would not be an efficient way to actually use it since sparse matrices have a bit of overhead due to their table structure. Even better would be a Diagonal since it's just an array with dispatches to act like a diagonal matrix. But best would be to use the UniformScaling operator `I` which acts like eye in cases like A-λI but isn't actually saving a matrix and doesn't actually require iterating over two arrays at once. I think that amount of confusion in this thread is a good indicator that eye should be changed.

On Monday, August 29, 2016 at 7:25:49 AM UTC-7, Kristoffer Carlsson wrote:
You mean a sparse matrix?

Kristoffer Carlsson

unread,
Aug 29, 2016, 11:31:13 AM8/29/16
to julia-users
What is the confusion? Use eye(n) if you want a dense identity matrix. Use I if you want something that acts like an identity element. Use Diagonal(ones(n)) if you want a diagonal identity matrix. I see no reason at all why eye should be changed.

Júlio Hoffimann

unread,
Aug 29, 2016, 11:40:14 AM8/29/16
to Julia Users

Why would one want dense identity matrix?

Chris Rackauckas

unread,
Aug 29, 2016, 12:04:15 PM8/29/16
to julia-users
As Julio asks, why should the default be a dense identity matrix? Who actually wants to use that? I agree `I` should be in more visible in the docs and in most cases it's the better option. But if you actually want a matrix as an array instead of just an operator (this is reasonable, because maybe you're going to modify it), then a Diagonal matrix is sufficient for storing any identity matrix and should be the only thing used in most cases. If you really want a dense matrix from it, you could always call full(A). But there's almost no reason to have a dense identity matrix, and so Julia shouldn't be defaulting to it because it's just a performance trap for new users coming from MATLAB.

Chris Rackauckas

unread,
Aug 29, 2016, 12:06:57 PM8/29/16
to julia-users
The confusion is pretty clear. Someone suggested code like

m = m + lambda*eye(m)

when eye(m) shouldn't be used there. And if lambda is a vector of eigenvalues, eye(m) still shouldn't be used there and instead the Diagonal should be used. So if you shouldn't be using this command, why should it still exist? Or why should it not default to one of the commands you should use?

Tim Holy

unread,
Aug 29, 2016, 12:11:12 PM8/29/16
to julia...@googlegroups.com
On Monday, August 29, 2016 10:40:10 AM CDT Júlio Hoffimann wrote:
> Why would one want dense identity matrix?

Because you might want it to serve as an initialization for an iterative
optimization routine (e.g., ICA) that updates the solution in place, and which
assumes a dense matrix?

We could theoretically get rid of `eye` and just use `convert(Array, I)`. I
like that idea except for one detail: `I` doesn't actually have a size.

--Tim

Tobias Knopp

unread,
Aug 29, 2016, 12:28:40 PM8/29/16
to julia-users
But if it does not makes sense from your point to give a full matrix, why does it make more sense to return a diagonal matrix? It is still lots of heap allocated memory. So you replace a large performance trap with a smaller one. What has been gained? We would still have to teach people the eye should not be used because there is a faster alternative.

Chris Rackauckas

unread,
Aug 29, 2016, 12:35:18 PM8/29/16
to julia-users
The reason is because `I` doesn't have a size, so you can't `full(I)`. I think the most reasonable thing would be for `I` to have an optional size (and then act like an identity operator of that size, so error on the wrong size multiplications, etc.), and then have sparse(I), Diagonal(I), and full(I) do the sensible things when the size is declared (but nothing if it's not). In which case eye(n) would just make an `I` of size `n`.

Júlio Hoffimann

unread,
Aug 29, 2016, 1:29:19 PM8/29/16
to Julia Users

Tim,

Would it make sense to have "I" as an object that acts like UniformScaling and doesn't require any memory allocation, but is only transformed into sparse matrix via the [] operator? Maybe something similar to arrays -> subarrays?

-Júlio

Evan Fields

unread,
Aug 29, 2016, 1:57:26 PM8/29/16
to julia-users

On Monday, August 29, 2016 at 10:02:22 AM UTC-4, Andreas Noack wrote:
Evan, this is exactly where you should use I, i.e.

m = m + λ*I

The reason is that eye(m) will first allocate a dense matrix of size(m,1)^2 elements. Then * will do size(m,1)^2 multiplications of lambda and allocate a new size(m,1)^2 matrix for the result. Finally, size(m,1)^2 additions will be computed for the elements in m and lambda*eye(m).

In contrast λ*I will only make a single multiplication, store the result in a tiny stack allocated scalar and only add this to the diagonal of m, i.e. size(m,1) additions.


Good to know - learn something new ever day! :) 

Christoph Ortner

unread,
Aug 29, 2016, 4:22:45 PM8/29/16
to julia-users
Two give my two cents: 

I think this is an inconsistency in the design of the standard library.
* while we have both I and eye available for the Identity matrix,
* linspace was replaced with a lazy data-structure.

Personally I would like to keep both I and eye; but I also would have liked to keep the old linspace behaviour.



Sheehan Olver

unread,
Aug 29, 2016, 6:38:39 PM8/29/16
to julia-users
I think the ArrayFire.jl (https://github.com/JuliaComputing/ArrayFire.jl) syntax for matrices (e.g. rand(AFArray{Float64},100,100)) should be adopted in Base to allow multiple eye commands that return different types:

   eye(Diagonal{Float64},10)
   eye(SparseMatrix,10)   # default to Float?
   eye(Matrix{Int},10)

Whether eye(Int,10), eye(Float64,10), etc. get deprecated and what eye(10) should default to are debatable.

Júlio Hoffimann

unread,
Aug 29, 2016, 6:43:31 PM8/29/16
to Julia Users

Christoph,

Can you elaborate on why you want to have both? Also, why you wanted the non-lazy linspace? If I understood correctly, you're talking about returning a array with allocated memory versus a range object, isn't the latter always preferred? I don't understand.

-Júlio

Tim Holy

unread,
Aug 29, 2016, 7:38:03 PM8/29/16
to julia...@googlegroups.com
I confess I'm not quite sure what the right answer is here. It would seem
overkill to have both `I` and something that's the same thing, except sized.
OTOH, I see the attraction.

Maybe if it were part of a general mechanism, e.g., SizedArrayOperator{O,N}
(for performing an operation `O` on array with `N` dimensions of specified
size) it wouldn't be so bad. That would rely on finding other uses for this
concept, however.

Best,
--Tim

Christoph Ortner

unread,
Aug 29, 2016, 9:41:54 PM8/29/16
to julia-users

(a) teaching: it is easy to explain to beginners what an array is, not so much abstract range objects

(b) occasionally I actually need the vector and am just too lazy to write collect  

(c) compatibility with history (i.e. Matlab)

I appreciate that all of these are to some extent personal preference.

Júlio Hoffimann

unread,
Aug 29, 2016, 9:48:46 PM8/29/16
to Julia Users

Yes, they are all personal preference. None of the reasons you mentioned is strong for the existence of both.

Furthermore, I'm glad Julia doesn't need to be backward compatible with MATLAB. This is the exact reason I quited GNU Octave, no point in repeating mistakes.

-Júlio

Chris Rackauckas

unread,
Aug 29, 2016, 10:13:25 PM8/29/16
to julia-users
I just think the moment you start mixing default behaviors, you teaching becomes exponentially harder. I found range objects not that hard to teach, just "it uses an abstract version of an array to not really build the array, but know what the value would be every time you want one. If you want the array, just use collect()". I think it should just standardize like that: in every case where possible (linspace, range, eye, etc), the default option is the abstract version which gives better performance/memory usage, but at any time with any of them, you can collect() them into an array. However, if some are arrays and some are not, that's when confusion happens (especially when there's two "standard" ways of doing it, when one isn't supposed to be standard because it's not performant?).

I don't think that being too lazy to write collect() is a good reason to fill Base with extra lower performing methods, making a minefield for newcomers in the docs. Instead, the docs for something like `I` should just say it's an operator which uses dispatch to achieve what it's doing, but if you need the array, use `collect()` (though `I` doesn't work here because of the size issue, but you get the point). This gives a way to be compatible with MATLAB while at the same time showing the better way to achieve the same goal.

Sheehan Olver

unread,
Aug 29, 2016, 10:39:06 PM8/29/16
to julia...@googlegroups.com
I’d also propose adding

linspace(Vector{Float64},0,100)
linspace(Range,0,100)
linspace(LinSpace{Float64},0,100)

as constructors that return the corresponding type.  That way Christoph can use the linspace(Vector{Float64},0,100) version in his teaching.

Júlio Hoffimann

unread,
Aug 29, 2016, 10:54:02 PM8/29/16
to Julia Users

Why is it so important to have all this machinery around linspace and eye? collect is more than enough in my opinion and all the proposals for keeping both versions and pass a type as a parameter are diluting the main issue here: the need for a smart mechanism that handles things efficiently and seamlessly for the user. Teaching is a completely separate matter, let's not enter in this territory, which is much more delicate and hard to be done right.

Recaping the issue:

B = I*A should be as efficient as B = copy(A). Right now it is not.

-Júlio

Sheehan Olver

unread,
Aug 29, 2016, 11:01:00 PM8/29/16
to julia...@googlegroups.com
I doesn't have a dimension so you can't do collect(I)

Sent from my iPhone

Júlio Hoffimann

unread,
Aug 29, 2016, 11:02:31 PM8/29/16
to Julia Users

So maybe add a dimension or create a type that makes more sense for the application?

-Júlio

Sheehan Olver

unread,
Aug 29, 2016, 11:21:47 PM8/29/16
to julia...@googlegroups.com
But eye and linspace are not the only culprits: there's rand, zeros, etc.  so unless everything is a special type, ArrayFire will still need constructors like

rand(AFArray{Float64},5,5)

Sent from my iPhone

Christoph Ortner

unread,
Aug 30, 2016, 12:02:06 AM8/30/16
to julia-users
Personal opinion again: I think it is not good to underestimate the importance of teaching. Mathematics students in particular tend to stick with the language they learn first. It is part of why Matlab is so successful in the applied mathematics community. 

P.S.: Not sure why such a combative tone is needed here.

Júlio Hoffimann

unread,
Aug 30, 2016, 12:19:38 AM8/30/16
to Julia Users

Sorry for the combative tone Christoph. I thought it was necessary in order to not deviate too much from the core issue. Thank you for your participation and for raising your personal opinions about the topic.

-Júlio

Sheehan Olver

unread,
Aug 30, 2016, 1:39:40 AM8/30/16
to julia-users
But the core issue isn't 'eye' specific, it's "what should the default type for functions that create matrices be?"    Christoph's comments do not deviate from this question.

The answer to this question affects 'rand', 'zeros', 'ones', 'linspace', etc. just as much as eye.  ArrayFire means that this might want to be created on the GPU which should be taken into account: the syntax 'AFArray(eye(10))' cannot work unless eye(10) returns a special type.

Christoph Ortner

unread,
Aug 30, 2016, 12:19:31 PM8/30/16
to julia-users

I agree with Sheehan that this affect a number of functions in Base, not just eye - that was the point I was trying to make, sorry I wasn't clear.

I raised this in a discussion in a formal issue somewhere, which I can't find now. Somebody (Steven Johnson?) argued that `zeros` and `ones` are different because they are used for allocating arrays.

With every discussion like this one (previously linspace) targeted at "cleaning up" the language and disallowing "bad practices" I always feel that Julia is becoming more and more a language for Computer Scientists while casual programmers are pushed away. In fact this doesn't just affect casual programmers: for almost every new project I first cook up a little toy-model where I really just brainstorm - often I write bad code, but this is ok. I like that fact that I can do that in Julia (for now) without getting punished.  In those little toy models it is usual irrelevant that linspace and even eye return arrays instead of lazy data structures - in fact it allows me to think less about what are the objects I am manipulating. Every small distraction like that breaks the flow of thoughts.  I can always go back later and refactor my code (or usually rewrite it) with good structure and performance in mind. 

Christoph

Júlio Hoffimann

unread,
Aug 30, 2016, 12:32:37 PM8/30/16
to Julia Users

I don't think there is anything like pushing the language to computer scientists, it's the exact opposite, making it seamlessly fast without forcing the user to manipulate types. Again, you write B = I*A and get B = copy(A) performance. That is the original proposal.

Most of us follow the same development strategy while doing science, we write something quick and profile later. The fact that Julia can be fast from the beginning is the best thing about it, no need for rewriting code.

-Júlio

Christoph Ortner

unread,
Aug 30, 2016, 1:15:08 PM8/30/16
to julia-users

If this is your only use-case of I, then you don't need it anyways. Just write 1.0 * A instead; same effect, independent of what type of array A is.

But what if I use I in a different way? Suppose I want to A[1:5,1:5] = eye(5); I can't do that with I. Of course we could give it another type parameters, and then do the whole collect thing again. But then we are back to what I said before above about needless distractions.

I don't understand why there is such a resistance to providing both the explicit arrays and the lazy functionalities in Julia.

Christoph

Chris Rackauckas

unread,
Aug 30, 2016, 1:26:22 PM8/30/16
to julia-users
Even then, creating the 5x5 dense matrix to then copy it into A[1:5,1:5] is not what you'd want to do. Ideally would just have eye(5) return something like I which has a size, and just sets A[i,j]=1 if i=j 0 otherwise, with checks that it's the right size. Actually, the current I would do it if you defined a dispatch on setindex!. You still don't need the dense matrix there...

Tim Holy had the only example where you need the dense matrix, and it's when you want to make a dense matrix to start with and them modify the off diagonals of that same matrix. Even then, collect(eye(5)) where eye returned some smart type instead of a matrix wouldn't be bad syntax to do so, and it would be consistent with the rest of the library.

I agree that there can be a teaching problem with Julia. A lot of things happen like fast magic to "mathematicians not trained in CS". I don't think the right way to go is to fill Base with methods that shouldn't be used in most cases: someone could make their own "helper libraries" if they really think it's an issue (another example is "EasyFloats" which are not type stable, so they will just turn into complex numbers when needed like in MATLAB. It might help new users from MATLAB see less errors, but I'd never want to see it in Base.) Instead I think it comes down to documentation and extensive tutorials to fix the problem (and filling up StackOverflow with answers).

Christoph Ortner

unread,
Aug 30, 2016, 1:59:57 PM8/30/16
to julia-users
> I agree that there can be a teaching problem with Julia. A lot of things happen like fast magic to "mathematicians not trained in CS". 

yes, this is the issue. FWIW in other respects it is nice to use Julia for teaching, e.g., the fact that there are no classes is great!


> I don't think the right way to go is to fill Base with methods that shouldn't be used in most cases: someone could make their own "helper libraries" if they really think it's an issue

This is always a possibility but external packages quickly become incompatible with the language. It is just another little barrier.
Message has been deleted

Sheehan Olver

unread,
Aug 30, 2016, 7:14:30 PM8/30/16
to julia-users
I agree with Chris, though I prefer Matrix(eye(5)) to collect(eye(5)).

I've found teaching-wise that saying "Matlab has Dense and Sparse.  Julia has a lot more special types."  worked pretty well to explain the situation.  

The issue though is with rand:  it seems like overkill for rand(5,5) to return a special type, but sometimes you want to create it on a GPU.

Christoph Ortner

unread,
Aug 30, 2016, 10:56:39 PM8/30/16
to julia-users


On Wednesday, 31 August 2016 00:14:30 UTC+1, Sheehan Olver wrote:
I agree with Chris, though I prefer Matrix(eye(5)) to collect(eye(5)).


Matrix(eye(5)) looks reasonably pleasing to the eyes :).  (but still 8 keystrokes more than needed)
Reply all
Reply to author
Forward
0 new messages