Issue Replacing NaN with 0

Alex Hollingsworth

Aug 31, 2014, 1:52:32 PM
to julia...@googlegroups.com
Hi Everyone, 

I cannot figure out if there is an error in Julia or (more likely) in my code. I have a matrix A that contains some NaN values, and I would like to create a copy of it that is the same except that the NaN values are replaced with 0's. I would also like to do this without altering the original matrix. I have tried two different approaches, both of which have the same wacky result: no matter what I do, the original values in A seem to be altered as well. My results would ideally look like:

a =[1 2 3; 4 5 NaN] and x=[1 2 3; 4 5 0]

Please let me know where my error is or if this is some oddity of Julia's handling of NaN's. The same logic works perfectly in Matlab, so I'm really confused as to what the error is.

Thanks!!

Method 1:
a=[1 2 3; 4 5 NaN]
x=a

for m=1:size(x,1)
for l=1:size(x,2)
  isnan(x[m,l]) ? x[m,l]=0 :  x[m,l]=x[m,l] 
end
end

Result:

julia> a
2x3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  0.0

julia> x
2x3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  0.0


Method 2:

julia> a=[1 2 3; 4 5 NaN]
2x3 Array{Float64,2}:
 1.0  2.0    3.0
 4.0  5.0  NaN

julia> x=a
2x3 Array{Float64,2}:
 1.0  2.0    3.0
 4.0  5.0  NaN

julia> x[isnan(x)]=0
0

julia> x
2x3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  0.0

julia> a
2x3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  0.0

Keno Fischer

Aug 31, 2014, 1:58:10 PM
to julia...@googlegroups.com
Try x=copy(a). Matlab automatically copies the array if it's written to.
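
For readers following along, here is a minimal sketch of the difference between `x = a` and `copy(a)`. It is written in current Julia (1.x) syntax, which is an assumption on top of the thread: the `isnan.(y)` and `.=` forms did not exist in the 0.3 release discussed here, where the equivalent was `y[isnan(y)] = 0`. The name `y` is introduced only for illustration.

```julia
a = [1 2 3; 4 5 NaN]

x = a               # no copy: x and a are two names for the same array
y = copy(a)         # an independent array with the same contents

# Julia 1.x broadcasting syntax (assumption; the thread-era form was y[isnan(y)] = 0)
y[isnan.(y)] .= 0

println(x === a)        # are x and a the very same object?
println(isnan(a[2, 3])) # is the NaN still present in the original?
println(y[2, 3])        # value in the cleaned copy
```

An alternative that never mutates anything is `map(v -> isnan(v) ? 0.0 : v, a)`, which builds the cleaned matrix directly.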

Alex Hollingsworth

Aug 31, 2014, 2:01:48 PM
to julia...@googlegroups.com
Thanks!!! As a newcomer to Julia, I did not realize that x=a linked them in a way that when I changed x, I was also changing a. Thank you so much!

Ethan Anderes

Aug 31, 2014, 2:27:28 PM
to julia...@googlegroups.com
I've come to wish that in cases like this (and in vec, reshape and soon-to-be slicing) the resulting type clearly showed the user that it is an ArrayView, SubArray or something like AliasArray. I've never liked the invisible fusing of variables in Python, and since Julia's type system is so expressive I figure it's the perfect way to illustrate some of the virtues of Julia's design (i.e. inviting to beginners but with deep power).

Anyhoo, sorry for the rant. Love the language. Cheers.

John Myles White

Aug 31, 2014, 2:31:30 PM
to julia...@googlegroups.com
I don’t think this example had any views. Both bindings had an equal right to be considered the true binding.

I think we’re better off doing more education to teach people to distinguish bindings and values.

— John

Ethan Anderes

Aug 31, 2014, 2:45:32 PM
to julia...@googlegroups.com
Yeah, I can see your point John. It's probably not reasonable to make a new AliasedArray type.

For me, I think the education would address the difference between, for example, vec used inside another function, e.g. x = sin(vec(a)), and the memory-overlap case, e.g. x = vec(a). This stung me at one point. When ArrayViews lands as slicing in Base I should take a shot at a PR for the docs.

Cheers

John Myles White

Aug 31, 2014, 3:58:43 PM
to julia...@googlegroups.com
I think there’s a broad issue that needs resolution: how do you know when a function’s output takes control of the memory used by its arguments?

— John

Ethan Anderes

Sep 1, 2014, 1:31:40 PM
to julia...@googlegroups.com

Right, that’s a succinct way to put it. I would just add that I hope we can do this in a way that allows the beginner to reason about things like getindex, vec, reshape, transpose, vcat, etc. without requiring him/her to understand pointers and the subtleties of memory layout. In a way, I’m being selfish: I want to be able to teach my intro stats students Julia.

For example, how does one tell that vec(a) does something different than getindex(a,1:length(a)) without testing it? I guess testing it isn’t that big of a deal, but if you’re a beginner you’ll end up testing everything. In particular, if x = {"jill", 4} and y = {"bob", 2} it doesn’t seem a priori clear why z = vcat(x,y) doesn’t share memory with x and y. Once you open up these possibilities, even things like x = sin(y), to a beginner, could have some weird lazy evaluation interpretation.
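
The "testing it" mentioned above can be sketched as follows: mutate the result and see whether the change shows up in the input. This assumes current Julia (1.x), where `Any["jill", 4]` replaces the `{"jill", 4}` literal used in the thread; the names `v` and `g` are introduced only for illustration.

```julia
a = [1.0 2.0; 3.0 4.0]

v = vec(a)              # for a plain Array, vec reshapes without copying
g = a[1:length(a)]      # indexing with a range builds a new array

v[1] = 10.0             # visible through a, since v and a share memory
g[2] = 20.0             # not visible through a, since g is a copy

x = Any["jill", 4]      # Julia 1.x spelling of the thread's {"jill", 4}
y = Any["bob", 2]
z = vcat(x, y)          # concatenation allocates a fresh array
z[1] = "sue"            # x is unaffected

println(a)
println(x)
```

Writing through `v` changes `a[1, 1]`, while writing through `g` or `z` leaves the inputs alone, which is exactly the distinction a beginner cannot see from the call syntax.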

Another example, why does x=a do something different than x = a[:,:] (currently the latter returns a copy, but I guess it might return an ArrayView in the future)? Is x=a just one of those special cases where I tell my students x=a doesn’t mean what you think…it means that x and a are now names for the same thing.
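
A small sketch of the three cases, assuming current Julia (1.x): there `a[:, :]` still returns a copy, and a view has to be requested explicitly with `view` (or the `@view` macro). The names `c` and `w` are introduced only for illustration.

```julia
a = [1.0 2.0; 3.0 4.0]

x = a               # binding: x is another name for the same array
c = a[:, :]         # slicing: a new array holding the same values (Julia 1.x)
w = view(a, :, :)   # explicit view: a SubArray backed by a's memory

a[1, 1] = -1.0      # change the original after the fact

println(x[1, 1])    # follows a, because x is a
println(c[1, 1])    # keeps the old value, because c is a copy
println(w[1, 1])    # follows a, because w reads a's memory
println(typeof(w))  # the type itself reveals that w is a view
```

In the view case the type does signal the aliasing, which is close to what was wished for earlier in the thread; plain assignment gives no such signal because no new object is created at all.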

Anyhoo, my hope is to be able to explain how to identify the different behavior to a beginner.

-Ethan

Stefan Karpinski

Sep 1, 2014, 10:02:48 PM
to Julia Users
This will get significantly simpler in 0.4 since anything that can avoid making a copy will.

John Myles White

Sep 1, 2014, 11:52:40 PM
to julia...@googlegroups.com
That’s a good point. But what will end up being the way to recognize functions that copy? Surely all the vectorized math functions will still generate copies, while functions like convert will hopefully make fewer copies.

 — John
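
As a rough illustration of the copy question, object identity (`===`) can be checked directly. This sketch assumes current Julia (1.x), where vectorized math is written with broadcasting (`sin.(y)`) rather than the `sin(y)` of the thread; the names `s`, `same` and `wide` are introduced only for illustration.

```julia
y = [0.0, 1.0, 2.0]

s = sin.(y)                                  # broadcasted math allocates a new array
same = convert(Vector{Float64}, y)           # already the target type: returned as is
wide = convert(Vector{Float64}, [1, 2, 3])   # Int to Float64 needs a new array

println(s === y)       # false: a fresh result array
println(same === y)    # true: no copy was made
println(wide)
```

An identity check like this answers the question for one call at a time; it does not replace a documented convention for which functions copy.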