vectorized comparisons


Greg Harris

Feb 12, 2013, 2:54:41 PM
to julia...@googlegroups.com
Hey all,

I've just finished a large named-entity recognition school project using Julia.  It was really great.  Working with strings was fast and easy, and the parallel processing saved me a lot of time.

One thing I did miss, though, (coming from a MATLAB background) was vectorized comparisons.  In MATLAB, if one argument is a vector, and one a singleton, then the comparison is made between each element and the singleton, and a boolean vector is returned.  So, MyVector < 3, will give a vector the size of MyVector indicating which elements were less than 3.  I also like using vectorized string comparisons with strcmp.  I can say:  ix = find(strcmp(MyCellArrayOfStrings, 'juniper'));  to find all the cells that contain the word 'juniper'.

Has such functionality already been discussed and decided against for Julia?  If not, maybe it should be considered.

Thanks,

Greg

John Myles White

Feb 12, 2013, 2:57:31 PM
to julia...@googlegroups.com
It exists. It used to be v < 3, but is almost certainly now v .< 3.

-- John

Stefan Karpinski

Feb 12, 2013, 2:57:42 PM
to Julia Users
The operators for that are called .< .<= etc. The normal comparison operators always return booleans instead.
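
For anyone arriving from MATLAB, here is a minimal sketch of the two cases in the original question. It assumes a current Julia 1.x release, where findall plays the role of MATLAB's find; the variable names are made up for illustration.

```julia
v = [1, 5, 2, 7]
words = ["pine", "juniper", "oak", "juniper"]

# Elementwise comparison against a scalar gives one Bool per element
mask = v .< 3

# Indices where the mask is true, like MATLAB's find(v < 3)
small_ix = findall(mask)

# Elementwise string comparison, like find(strcmp(words, 'juniper'))
juniper_ix = findall(words .== "juniper")

# Undotted == compares the two arrays as a whole and returns a single Bool
same = (v == [1, 5, 2, 7])
```

The dotted forms produce a Bool array of the same shape as the input, while the undotted == stays a single Bool.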

Greg Harris

Feb 12, 2013, 3:13:42 PM
to julia...@googlegroups.com
That's perfect.  I'm happy to hear it.

-Greg

Gabor

Feb 12, 2013, 3:52:32 PM
to julia...@googlegroups.com

#1.  Dot-comparison operators work like this:

julia> [1,2,3].<[4,5,6]
3-element BitArray:
 true
 true
 true

#2.  Simple comparison operators work like this:

julia> [1,2,3]<[4,5,6]
true

which is a shorthand for this:

julia> all([1,2,3].<[4,5,6])
true

In my practice the second idiom is mostly useless.
I ALWAYS have to write the dotted, elementwise operators,
which are not very readable, and their combination is even worse.

E.g.
(h.!=0) & (k.>=0) & (l.>=0) & (A.>=0.05*max(A))

Then why not stick to the Fortran/Matlab/Python convention?
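
For what it's worth, a sketch of how a combined mask like the one above might be written on a current Julia 1.x release. It assumes the @. macro (which dots every operator in the expression) and maximum for the array reduction. The data values are invented for illustration.

```julia
h = [0, 1, 2, 3]
k = [1, -1, 0, 2]
l = [0, 0, 1, 1]
A = [10.0, 20.0, 0.1, 5.0]

# Compute the scalar threshold first so it is not caught up in the dotting
cutoff = 0.05 * maximum(A)

# @. turns !=, >=, and & into their elementwise forms in one go
mask = @. (h != 0) & (k >= 0) & (l >= 0) & (A >= cutoff)

# The fully spelled-out equivalent with explicit dots
mask_explicit = (h .!= 0) .& (k .>= 0) .& (l .>= 0) .& (A .>= cutoff)
```

Both spellings build the same Bool array; the macro form just keeps the dots out of the expression.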

Gabor

Feb 12, 2013, 3:55:49 PM
to julia...@googlegroups.com
Considering v0.1 this comment may come at the wrong time,
but I can not suppress this opinion.

Stefan Karpinski

Feb 12, 2013, 5:02:12 PM
to Julia Users
We actually used to do this, and it eventually became painfully clear to everyone that it was a mistake. Consider equality. It is very common to write generic code that checks whether two numerical things, be they scalars or vectors, are numerically equal, like so: x == y. If x == y returns a boolean some of the time and an array of booleans other times, it breaks all generic code that tries to check for numeric equality. Matlab deals with this problem by making an if statement check whether all values in a boolean array are true, but that's at best a hack and only a partial solution.
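
A small sketch of the kind of generic code meant here, assuming a current Julia 1.x release. The helper name is_fixed_point is hypothetical and only serves the illustration.

```julia
# Hypothetical helper: the same if-test serves scalars and arrays,
# because == always yields a single Bool.
function is_fixed_point(f, x)
    if f(x) == x
        return true
    end
    return false
end

scalar_case = is_fixed_point(abs, 3)
array_case = is_fixed_point(v -> abs.(v), [1, 2, 3])
changed_case = is_fixed_point(v -> abs.(v), [1, -2, 3])
```

If == returned an array for array arguments, the if-test inside the helper would need a special case for every container type.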

Gabor

Feb 13, 2013, 2:01:29 PM
to julia...@googlegroups.com

I appreciate that universal equality is high on the priority list,
and - as you earlier said - design decisions are always compromises.

Just for the record.
Fortran and Matlab comparison operators both default to elementwise:

Fortran:
print *, [1,2,3]==[1,2,3]
 T  T  T

Matlab:
>> [1,2,3]==[1,2,3]
ans =
     1     1     1
    
Array comparison is a separate construct. For Matlab:
isequal(a,b)
ans =
     1
    
For Fortran one must use:
print *, all([1,2,3]==[1,2,3])
  T 
Because it is compiled, I assume it does not have to calculate
all elements of the inner elementwise comparison either.


Julia's design decision is obviously different in this respect:
elementwise comparisons are subordinate to whole-array comparisons.

Here I am only recording that not everybody is happy with this choice.
In my practice elementwise comparisons are at a higher priority, and
would be preferred without the clumsy dot-elementwise etc. notation.

John Myles White

Feb 13, 2013, 3:13:58 PM
to julia...@googlegroups.com
The problem with your suggestion, which is definitely otherwise reasonable, is that often you want to do these comparisons in an if-statement. But a vector of true's is not a Boolean and can't be used in an if-statement because of type restrictions.

 -- John

Gabor

Feb 13, 2013, 3:23:20 PM
to julia...@googlegroups.com
There are any() and all() to transform a vector of Booleans
to Bool  to be used in an if-statement. 
 
Also very readable.
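
A short sketch of both points, assuming a current Julia 1.x release: all() and any() reduce a mask to a Bool for use in an if-statement, while a bare mask is rejected with a TypeError. The values are made up.

```julia
v = [0.5, 2.0, 3.5]

# all() and any() reduce an elementwise mask to a single Bool
if all(v .> 0)
    status = "all positive"
else
    status = "some non-positive"
end

has_large = any(v .> 3)

# A bare mask is rejected as a condition: Julia raises a TypeError
mask_rejected = try
    if v .> 0
        true
    end
    false
catch err
    err isa TypeError
end
```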

Jason Knight

Feb 13, 2013, 4:48:24 PM
to julia...@googlegroups.com
I personally prefer the consistency of the (.) family operators, but you could always embrace the power of Julia and define it as you like. Just place the following in your $HOME/.juliarc.jl file:

import Base.isless
isless{T,N}(l::AbstractArray{T,N}, r::T) = l .< r
isless{T,N}(l::T, r::AbstractArray{T,N}) = l .< r

It seemed to work with some cursory testing (although there was a warning about having the isless(AbstractArray, AbstractArray) version defined first, even though this is in abstractarray.jl). 

Cameron McBride

Feb 13, 2013, 8:00:15 PM
to julia-users
On Wed, Feb 13, 2013 at 3:23 PM, Gabor <g...@szfki.hu> wrote:
> There are any() and all() to transform a vector of Booleans
> to Bool to be used in an if-statement.
>
> Also very readable.

FWIW (please read as not much at this point): I think any() and all() combined with default elementwise comparison is both the cleanest and the clearest option.  It also seems to match a larger number of other languages, no?

I haven't used it yet, but a period added to the comparison operators is easy to overlook and can cause confusion.  (In PDL, the Perl Data Language, they overrode the ".=" operator for strings to make an in-place value change.  I ran into many issues, as it confused people who tried to read or edit the code I created.)

Cameron 

Toivo Henningsson

Feb 14, 2013, 2:11:45 PM
to julia...@googlegroups.com


On Wednesday, 13 February 2013 22:48:24 UTC+1, Jason Knight wrote:
> I personally prefer the consistency of the (.) family operators, but you could always embrace the power of Julia and define it as you like. Just place the following in your $HOME/.juliarc.jl file:
>
> import Base.isless
> isless{T,N}(l::AbstractArray{T,N}, r::T) = l .< r
> isless{T,N}(l::T, r::AbstractArray{T,N}) = l .< r
>
> It seemed to work with some cursory testing (although there was a warning about having the isless(AbstractArray, AbstractArray) version defined first, even though this is in abstractarray.jl).

You should never monkey patch functions in Base like that to behave differently for types that Base knows about. Also, isless is meant to be a total order. If you really want to have this behavior in your own code, you should use something like

global < # prevent the default import of < from Base
<(x,y) = applicable(Base.(:.<), x, y) ? Base.(:.<)(x,y) : Base.(:<)(x,y)


and the same for >, <=, >=, == and !=.
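
The Base.(:.<) spelling above is, as far as I know, no longer valid on current releases. A rough sketch of the same idea, assuming Julia 1.x and a made-up module name ElementwiseOps: the module defines its own < as a brand-new function before using it, so Base and every other package keep the Bool-valued <.

```julia
module ElementwiseOps

# Defining < here, before any use of it in this module, creates a new
# function local to the module instead of extending Base's <.
<(x, y) = Base.:<(x, y)
<(x::AbstractArray, y) = broadcast(Base.:<, x, y)
<(x, y::AbstractArray) = broadcast(Base.:<, x, y)
<(x::AbstractArray, y::AbstractArray) = broadcast(Base.:<, x, y)

# Hypothetical helper that uses the module-local <
small(v, cutoff) = v < cutoff

end # module

elementwise = ElementwiseOps.small([1, 2, 3], 2)
scalar = ElementwiseOps.small(1, 2)
```

Code outside the module is unaffected, which is the point of not touching Base.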

Jason

Feb 14, 2013, 2:16:18 PM
to julia...@googlegroups.com
> You should never monkey patch functions in Base like that to behave differently for types that Base knows about.

Wouldn't it be okay in this case though since ``isless`` is undefined for AbstractArray{T,N} and T values? I definitely agree with you that redefining things in general can be very bad for other code and libraries that you may be using.
 

Toivo Henningsson

Feb 14, 2013, 2:20:20 PM
to julia...@googlegroups.com

I think that the distinction between boolean-valued < and possibly anything-valued .< is more important in Julia than in most languages, because of the focus on generic programming. Many functions should not be written to a specific set of types, but to a general abstraction. Imagine how many generic functions would break for things that do support a partial order, just because < had been redefined to do something other than evaluate that ordering relation! I really appreciate that the Julia devs are as strict as they are in keeping the abstractions in the base language together -- because generic programming is only as strong as the abstractions that you build to use it with.
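
To make that concrete, a minimal sketch assuming a current Julia 1.x release. The helper largest is hypothetical; it is written against the abstraction "things ordered by <" and relies only on < returning a Bool.

```julia
# Hypothetical generic helper: relies only on < returning a Bool
function largest(xs)
    best = first(xs)
    for x in xs
        if best < x
            best = x
        end
    end
    return best
end

largest_int = largest([3, 9, 4])                   # integers
largest_str = largest(["oak", "pine", "juniper"])  # strings, lexicographic order
largest_rat = largest([1//2, 3//4, 2//3])          # rationals
```

The same function body serves all three element types, and would stop working for any type whose < returned an array.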
