On Friday, September 9, 2016 1:39:39 AM CDT Data Pulverizer wrote:
> It would seem from this post
> (http://www.johnmyleswhite.com/notebook/2015/11/28/why-julias-dataframes-are-still-slow/)
> that the Any type is problematic in Julia. Perhaps there needs
> to be room for breaking atomicity in arrays, or for the compiler to
> recognise heterogeneous arrays and label them as some mixed-type array
> instead of type hiding with Any. Any type should be less about type hiding
> and more about type genericism. Tuples in Julia do not provide array
> operations; they are essentially immutable. So even if you supply a fix for
> dataframes you are still essentially stuck with the same problem when you
> need type heterogeneity or recognition in another situation.
"Hiding" and "genericism" are the same thing. CPUs have one set of
instructions for adding two Float64s, and another for adding two Int64s. Thus,
if you're iterating over a container where you *know* that every element is a
Float64, your loop can be compiled down to calls that just involve Float64; if
in contrast you have a mixed container, then your inner loop needs to compile
down to something that conceptually looks like this:
    # x is the latest element you've pulled out of the container
    if isa(x, Float64)
        # CPU does the Float64 version
    elseif isa(x, Int64)
        # CPU does the Int64 version
    end
This is true for any language, from assembly to the slowest interpreted
language you can come up with. Mostly it's a question of just how big a hit
you take from code like this (and it's fair to say that Julia could do more to
optimize how it handles this case).
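
To make the contrast above concrete, here is a minimal sketch. The function name `mysum` and the two sample arrays are made up for illustration. The same loop runs over a `Vector{Float64}` and over a `Vector{Any}`: both give the same answer, but only in the first case does the compiler know the element type ahead of time.

```julia
# Hypothetical helper: a plain accumulation loop over any container.
function mysum(xs)
    s = 0.0
    for x in xs
        # For Vector{Float64} this is a single Float64 add.
        # For Vector{Any} the type of x must be checked at run time
        # before the right add can be chosen.
        s += x
    end
    return s
end

floats = [1.0, 2.0, 3.0]    # element type is Float64
mixed  = Any[1.0, 2, 3.0]   # element type is Any (Float64 and Int mixed)

println(eltype(floats), " => ", mysum(floats))
println(eltype(mixed), " => ", mysum(mixed))
```

Timing the two calls on large arrays (for example with `@time`) should show the size of the hit on your own machine; no numbers are claimed here.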
> > Except that julia has simple tools to show you exactly that, from the
> > REPL.
>
> I guess this means that people that want to fully optimise code will have
> to learn assembly.
Out of curiosity, why do you think that? You should check out @code_warntype,
which does not involve assembly at all.
Moreover, why do you try to twist a good thing---the availability of
introspection tools---into a bad one?
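
As a rough illustration of what checking with @code_warntype looks like, here is a small sketch. The functions `unstable` and `stable` are invented for this example, and the `using InteractiveUtils` line assumes a recent Julia version where the macro lives in that standard library (at the REPL it is loaded already).

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL

# Hypothetical example: returns an Int for positive input, a Float64 otherwise,
# so the return type depends on the *value* of x, not just its type.
unstable(x::Int) = x > 0 ? x : 0.0

# Same logic, but both branches return the same type as x.
stable(x::Int) = x > 0 ? x : zero(x)

# The report for `unstable` should flag a Union return type;
# the report for `stable` should show a single concrete type.
@code_warntype unstable(1)
@code_warntype stable(1)
```

The output is a typed listing of the function body in Julia's own intermediate form, not machine code, so reading it does not require knowing assembly.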
Methinks you folks greatly underestimate just how good Julia is at generating
optimized code, and over what a wide range of constructs it produces
fully-optimized code. It's fair to say that over time I've learned a few tips about
writing efficient Julia code, but for reference: I can often write a half day's
worth of code, without having to check the introspection tools once, and not
have a single type problem that matters for performance. And the performance
is awesome. :-) (As Jeff has said, it's really nice to be able to write type-
unstable code sometimes when it doesn't hurt performance and not get yelled at
by the compiler, and I take advantage of that freedom when I want/need to.)
Best,
--Tim