Automatic Differentiation

Jonas Rauch

Mar 14, 2012, 11:55:46 AM
to juli...@googlegroups.com
Hi Everyone. 

First of all: thank you for julia. I found it on Monday and fell in love immediately :-)

As a first exercise I had a go at implementing automatic differentiation (forward mode only so far). I'm sure neither the style nor the functionality is optimal right now, so I would be happy if someone with a little more Julia experience could have a look at it. Basically, I create an AD type that is based on a Real type and has a type parameter specifying the number of parameters w.r.t. which we wish to differentiate. The rest is a matter of defining the common operations for the type. After a bit of messing around, everything seems to extend to Arrays fairly well.

As an example (ad_test.jl), it can already be used to compute forward sensitivity for an ODE. The ode45 methods only work if the small patch (ode.patch) is applied, which changes the allocation of the array for intermediate values. (Btw: Isn't this better anyway, because it assumes less about the input?) 
However, the difference in performance compared to a "normal" solution is a little too high for my taste. It should be about 3x because of the two parameters in this case, but I see between 5x and 20x. I guess a speedup could come from "native" support for arrays, where the values and the derivatives are each stored consecutively.

I would be happy to improve this, but I definitely need help. Where would be the best way to discuss this? Should I make a branch in my repository and open a pull request?

Regards,
Jonas
ode.patch
ad.jl
ad_test.jl

Patrick O'Leary

Mar 14, 2012, 12:48:21 PM
to juli...@googlegroups.com
Could you put a pull request in for the ode45() patches so I can pick them up in my next rebase? The two ode45() functions need some work anyways (including deduplication), but I'm not likely to get to them until next week given my schedule.

Patrick

Jonas Rauch

Mar 14, 2012, 12:55:37 PM
to julia-dev
Done. It's only two lines anyway :-)

Viral Shah

Mar 14, 2012, 1:00:00 PM
to juli...@googlegroups.com
I have heard talks on AD, but don't know enough about it to say anything about the functionality. The style looks largely fine to me. Is x also a vector in ADForward, and does d represent the derivative?

Can you create a pull request for the ode45 patch? Apart from translating the code from octave to julia, I have not done much more with it at this point. Perhaps we should move the ode stuff into examples for now, and let it bake there before it is moved back into jl/.

I'm not sure I fully understand the performance bit. Are you saying that the ODE solver has poor performance, or is it the AD code that performs slower than you expect? If both x and d are vectors, you could interleave them by storing into an Nx2 Matrix. The performance of an Array of user defined types is not going to be very good right now - otherwise that would have been the way to go.

The general discussion about style and performance can probably happen on the list. But, issues related to AD could be filed, and a basic design could be created on the wiki. Your code can go into examples/ for now. Would also love to hear what the folks working on ODE stuff have to say on the topic and how to push forward.

-viral

> <ode.patch><ad.jl><ad_test.jl>

Patrick O'Leary

Mar 14, 2012, 1:10:34 PM
to juli...@googlegroups.com
Since ode.jl isn't built into the system image and seems like the sort of thing you'd want in the standard library, I'm not sure if there's a good reason to move it only to move it back later, but if we want to do that we can. I'm not sure what kind of timeline we're working to for release ATM so I can't say for sure whether my current plans (and free time for Julia work) fit in.

I don't really know much about AD--it's on the list of "interesting things I've not had a use for so haven't thoroughly researched."

Jonas Rauch

Mar 14, 2012, 1:28:50 PM
to juli...@googlegroups.com
By the way: Is there a way to tell the compiler that an Array has a fixed size? Here we basically have something like

type MyArray{T,n}
  x::Array{T,1}
end

where then for every instance x would have length n (much like for tuples, if I'm not mistaken). If the compiler knew this, this should speed things up a lot in some cases, right? Maybe it would make sense to introduce a type FixedArray{T,n,m} to cover this, where then the product over the sizes for each dimension would have to be equal to m.

Jonas Rauch

Mar 14, 2012, 1:38:57 PM
to juli...@googlegroups.com
Yes, x is the original value (not an array) and d is the derivative w.r.t. multiple parameters. Operations then basically just follow the chain rule.

If there are n parameters, the whole object has size n+1. Basically in this kind of forward AD, every operation has about (n+1)x the computational cost of the original computation. When I apply the ODE solvers to the AD types however, they slow down more than that. So I feel the AD part is slow, ODE is fine imho.
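A minimal sketch of the scheme described here, written in current Julia syntax (`struct` rather than the 2012 `type` keyword). The name `Dual` and the tuple storage are assumptions made for illustration; this is not the code from the attached ad.jl.

```julia
# Forward-mode AD sketch in current Julia syntax.
# `Dual` is a hypothetical name, not the ADForward type from ad.jl.
struct Dual{T<:Real,N}
    x::T             # the value itself
    d::NTuple{N,T}   # partial derivatives w.r.t. N parameters
end

# Sum rule: derivatives add componentwise.
Base.:+(a::Dual{T,N}, b::Dual{T,N}) where {T,N} =
    Dual{T,N}(a.x + b.x, a.d .+ b.d)

# Product rule: (uv)' = u v' + v u'.
Base.:*(a::Dual{T,N}, b::Dual{T,N}) where {T,N} =
    Dual{T,N}(a.x * b.x, a.x .* b.d .+ b.x .* a.d)

# Chain rule for one elementary function.
Base.sin(a::Dual{T,N}) where {T,N} =
    Dual{T,N}(sin(a.x), cos(a.x) .* a.d)

# Differentiate f(p, q) = p*q + sin(p) w.r.t. both parameters at (2, 3).
p = Dual{Float64,2}(2.0, (1.0, 0.0))   # seed dp/dp = 1
q = Dual{Float64,2}(3.0, (0.0, 1.0))   # seed dq/dq = 1
r = p * q + sin(p)
```

Here `r.d` holds the two partial derivatives (q + cos(p), p) evaluated at the seed point. Every operation touches the value plus N derivative slots, which is where the (n+1)x cost estimate comes from.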

What would the compiler do if I make the following type?
type Mytype{T}
  x::Array{T,1}
end
Would it just store the type like it stores an Array, or is there overhead?

I feel my performance problems come from Arrays of the kind
Array{ADForward{T,n},1}
which store my custom type. If I can somehow tell the compiler that the type has fixed length (n+1), it could store it consecutively, right?

Patrick O'Leary

Mar 14, 2012, 2:37:51 PM
to juli...@googlegroups.com
I think we need immutable composites support to get that kind of efficient storage, if I read correctly.

https://groups.google.com/forum/?fromgroups#!topic/julia-dev/_L9H6yxzcGE
https://github.com/JuliaLang/julia/issues/13

Jonas Rauch

Mar 14, 2012, 3:19:45 PM
to juli...@googlegroups.com
I think I could live with just fixed size Arrays, since I could just store everything in a vector of size n+1.

On a similar note: Is there an elegant way to give something a new name or subtype it, so that functions can be specialized differently?
For example (useless, just an illustration), if I wanted to have a different Rational type, that uses the same kind of storage but acts differently, I would love to be able to do something like

MyRational{T}<:Rational{T}

+(x::MyRational, y::MyRational) = ...

This kind of subtyping is impossible, at least for parameterized types, correct?

Patrick O'Leary

Mar 14, 2012, 3:26:15 PM
to juli...@googlegroups.com
On Wednesday, March 14, 2012 2:19:45 PM UTC-5, jrauch wrote:
> On a similar note: Is there an elegant way to give something a new name or subtype it, so that functions can be specialized differently?
> For example (useless, just an illustration), if I wanted to have a different Rational type, that uses the same kind of storage but acts differently, I would love to be able to do something like
>
> MyRational{T}<:Rational{T}
>
> +(x::MyRational, y::MyRational) = ...
>
> This kind of subtyping is impossible, at least for parameterized types, correct?

I believe that is correct. Go to http://julialang.org/manual/types/ and search for "invariant" on the page.

Jonas Rauch

Mar 14, 2012, 3:40:22 PM
to juli...@googlegroups.com
No, that's not what I meant. I'm not talking about relationships between the parameters carrying over. Instead I want to create a related family of Types. It looks like for this to work one would have to define a parametric abstract type and then derive multiple parametric types from this abstract one, while defining the common functionality for the abstract "parent type". I guess for what I want to make sense, we need actual inheritance in some fashion, because common functionality sometimes does need common assumptions on storage.

Avik Sengupta

Mar 14, 2012, 4:00:56 PM
to juli...@googlegroups.com
In Julia you cannot subtype a concrete type, only an abstract type. Hence, as you say, you'll need to create a hierarchy with abstract types. 

Patrick O'Leary

Mar 14, 2012, 4:26:09 PM
to juli...@googlegroups.com
Yeah, I'm nowhere near my quota for misunderstandings on this project yet. :) Yes, as Avik says you'll have to do this with abstract types--you can look at the implementations of AbstractVector and AbstractMatrix and their subtypes for a bit of the flavor.

Stefan Karpinski

Mar 14, 2012, 7:16:12 PM
to juli...@googlegroups.com
You can't inherit from a concrete type in Julia and you won't ever be able to. This is an explicit design feature: it prevents different concrete implementations of abstractions from being coupled to each other. Inheriting common fields is a weird bogus way of saving keystrokes at the cost of tightly coupling two implementations to each other. All the examples I've ever seen where someone tacks a few fields onto another type are pretty contrived (think chapter three of an intro to some OO language book). If two types happen to share a few fields, just define them so that they have the same fields. If you later decide that the implementation of one of them should change, the other one is completely unaffected. Looked at another way, the fields (or bits, or whatever) of a type are a completely inconsequential implementation detail: what matters is the behavior, i.e. what functions do to them.

[Also: Any concrete type that can be extended with additional fields can never be stored compactly in a C-style array — it can only be heap-allocated and put into an array that points to the heap-allocated object, because you can never know what the actual size of the objects that get put into the array are.]

If there's a case to be made for "Rationality" being a more generically useful concept than our Rational type implementation, then that really ought to be captured by the existence of an abstract supertype to which that generic behavior can be programmed. The standard Rational{T<:Integer} type and whatever other implementations like MyRational{T<:Integer}, would then subtype that abstraction and provide specific implementations. That case would have to be made first though. This is exactly what we did for Complex — see jl/complex.jl. However, there was a strong reason to do this: we wanted Complex{T<:Real} to work for any real type, but for C-array compatibility (and the ability to call FFT libraries), we needed complex values with Float64 components to be represented differently. They are, but it's completely transparent. (This situation, however, will actually go away once we implement immutable composites with inline storage.)
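As a rough illustration of the pattern described here, the following sketch in current Julia syntax programs the generic behavior against an abstract supertype and lets two concrete types supply the storage. All names (`AbstractRational`, `PairRational`, `TupleRational`, `num`, `den`, `tofloat`) are hypothetical and are not part of Julia's Rational implementation.

```julia
# Hypothetical abstraction; Julia's own Rational has no such supertype.
abstract type AbstractRational{T<:Integer} end

# Interface that every implementation is expected to provide.
function num end
function den end

# Generic behavior, written once against the abstraction.
tofloat(r::AbstractRational) = num(r) / den(r)

# First implementation: two separate fields.
struct PairRational{T<:Integer} <: AbstractRational{T}
    n::T
    d::T
end
num(r::PairRational) = r.n
den(r::PairRational) = r.d

# Second implementation: different storage, no shared fields.
struct TupleRational{T<:Integer} <: AbstractRational{T}
    nd::Tuple{T,T}
end
num(r::TupleRational) = r.nd[1]
den(r::TupleRational) = r.nd[2]

x = tofloat(PairRational(1, 2))
y = tofloat(TupleRational((3, 4)))
```

Neither concrete type knows about the other's fields, so either one can change its storage without affecting `tofloat` or the other implementation.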

Note: it's considered a best practice in OO languages to only extend classes that are intended to be extended. These typically end up being abstract base classes. For performance reasons, the concrete classes that are used are then made final. So, in effect, what standard OO languages have adopted as best practices are precisely what Julia's all-concrete-types-are-final rule enforces.

Stefan Karpinski

Mar 14, 2012, 8:35:52 PM
to juli...@googlegroups.com
Hope this didn't come off as too ranty. Just wanted to attempt to explain and justify this fairly significant design decision.

Viral Shah

Mar 14, 2012, 11:05:44 PM
to juli...@googlegroups.com
We can leave it in jl/ and build it into the system image when we believe that the implementation is mature enough.

-viral

Jonas Rauch

Mar 15, 2012, 1:50:37 AM
to juli...@googlegroups.com
Everything you said makes sense. Already learned something today, great! Thank you for the detailed answer. I'm a mathematician, not a computer scientist, so I'm not that well versed in the abstract concepts of CS. This could actually be one of the reasons why many of the other scientific computing languages have all the flaws they do: Some mathematician, physicist, engineer, etc. feels he needs to solve a problem that none of the established languages is well suited for and decides to make their own. Chaos ensues. Even with my limited understanding of CS, looking at the internals of R, for example, is horrifying (everything breaks if you try to link a multi-threaded library, etc.).

Coming back to the original question and pushing the idea a little further, it seems natural to me to always have an abstract type, which contains the semantics, and a concrete type that realizes it. However, this is probably why there are a lot of C++ libraries that are cluttered with interfaces everywhere, and most of them end up having only one implementation anyway :-) So I guess it is a trade-off between generality and pragmatism. Would it make sense to associate an abstract type to every concrete type and make a distinction in the methods? This would of course only make sense if there is some kind of canonical implementation to the idea. For example

type NumericalMatrix
  data::Array{Float64, 2}
end

ref(X::NumericalMatrix, i, j)=X.data[i,j]

@abstract trace(X::NumericalMatrix)=begin
    s=0.0
    n=arraysize(X.data)[1]
    m=arraysize(X.data)[2]
    if n!=m
       error("not square")
    end
    for i=1:n
       s+=ref(X,i,i)
    end
    s
end

Now one could do sth. like

type SparseMatrix<:AbstractType{NumericalMatrix}
   ...
end

ref(X::SparseMatrix, i, j) = ...

and keep the trace functionality of the other class. Sorry for coming up with so many (probably stupid) questions and examples :)

PS: Please don't change the Rational type (unless someone else needs that)! Like I said, I have no intentions of extending it, I was just curious about how one would do something like this in general.

Jonas Rauch

Mar 15, 2012, 2:26:13 AM
to juli...@googlegroups.com
> This could actually be one of the reasons why many of the other scientific computing languages have all the flaws they do: Some mathematician, physicist, engineer, etc. feels he needs to solve a problem that none of the established languages is well suited for and decides to make their own. Chaos ensues. Even with my limited understanding of CS, looking at the internals of R, for example, is horrifying (everything breaks if you try to link a multi-threaded library, etc.).
After Stefan's (private) answer I feel I should make clear that the above is not meant as an insult to the people who wrote these languages. All of them are still great at what they do, that's why I've used R with great success and fun for quite a while. What I meant to say is: A statistician cannot know all the CS stuff you need to write conceptually perfect software, just as the computer scientist cannot know all the math that should be included in the final software to make it useful for the researcher. However, it is much easier to add nice math to a conceptually sound framework than to add structure and generality to a program that started out as a very specialized tool for a certain task (e.g. statistics for R, matrix ops for matlab, etc.). So I think that the "Julia approach" is the right one. But for this to happen, someone had to decide to make a new language without the reason being "I need a tool to do X", which takes a good deal of idealism. Kudos!

Stefan Karpinski

Mar 15, 2012, 2:45:16 AM
to juli...@googlegroups.com
On Thu, Mar 15, 2012 at 1:50 AM, Jonas Rauch <jonas...@googlemail.com> wrote:
> Coming back to the original question and pushing the idea a little further, it seems natural to me to always have an abstract type, which contains the semantics, and a concrete type that realizes it. However, this is probably why there are a lot of C++ libraries that are cluttered with interfaces everywhere, and most of them end up having only one implementation anyway :-) So I guess it is a trade-off between generality and pragmatism. Would it make sense to associate an abstract type to every concrete type and make a distinction in the methods? This would of course only make sense if there is some kind of canonical implementation to the idea. For example
>
> type NumericalMatrix
>   data::Array{Float64, 2}
> end
>
> ref(X::NumericalMatrix, i, j)=X.data[i,j]
>
> @abstract trace(X::NumericalMatrix)=begin
>     s=0.0
>     n=arraysize(X.data)[1]
>     m=arraysize(X.data)[2]
>     if n!=m
>        error("not square")
>     end
>     for i=1:n
>        s+=ref(X,i,i)
>     end
>     s
> end
>
> Now one could do sth. like
>
> type SparseMatrix<:AbstractType{NumericalMatrix}
>    ...
> end
>
> ref(X::SparseMatrix, i, j) = ...
>
> and keep the trace functionality of the other class. Sorry for coming up with so many (probably stupid) questions and examples :)

This is a really interesting idea. Jeff and I actually spent a good chunk of time talking about it — very interesting conversation. One problem is that if someone writes

type A <: B
  # stuff
end

you want to stick Abstract{A} between A and B, giving this relation:

A <: Abstract{A} <: B

Likewise, if C <: D you want

C <: Abstract{C} <: D
 
If B and D are disjoint types, this poses a big problem for Abstract because now it has "fragments" all over the place, and our type system, which is already complicated enough, doesn't allow anything like that. One way you could potentially make it work would be with multiple inheritance, but we also don't have that.

Another problem is that it doesn't really give you what you want in a lot of cases. If you declare v::Vector{A}, then you can't put B object into that vector. You'd have to declare it to be v::Vector{Abstract{A}}, in which case you can't have compact C-style array storage.

And I guess my ultimate objection is that it's just complicated — it doesn't "feel" right. That's clearly a very subjective judgement call, but that's how you've got to feel your way through this programming language design space. Oh, and it would also probably slow the language down significantly. But still, it's a very cool idea.

> PS: Please don't change the Rational type (unless someone else needs that)! Like I said, I have no intentions of extending it, I was just curious about how one would do something like this in general.

Don't worry, we won't. Well, unless there's a compelling reason for it. Then we will.

Stefan Karpinski

Mar 15, 2012, 2:55:20 AM
to juli...@googlegroups.com
> Another problem is that it doesn't really give you what you want in a lot of cases. If you declare v::Vector{A}, then you can't put B object into that vector. You'd have to declare it to be v::Vector{Abstract{A}}, in which case you can't have compact C-style array storage.

Sorry, I meant that if you declare v::Vector{B}, you can't put a B object into v; you have to declare it as v::Vector{Abstract{B}}.

Stefan Karpinski

Mar 15, 2012, 2:56:58 AM
to juli...@googlegroups.com
Ah. I'm too tired. This is what I meant:

If you declare v::Vector{B}, you can't put an A object into v; you have to declare it as v::Vector{Abstract{B}}.

Jonas Rauch

Mar 15, 2012, 3:15:42 AM
to juli...@googlegroups.com
I don't agree completely. I think instead of
A <: Abstract{A} <: B
you would rather have:
A <: Abstract{A} <: Abstract{B}
B <: Abstract{B}
But this would make A and B unrelated, which leads exactly to the problem you pointed out: you can't stick an A in a Vector{B}, only in a Vector{Abstract{B}}. That's a very good point. But when you use abstract types explicitly, as in the sparse matrix case for example, you have the same problem, right? You can only stick a SparseMatrixCSC object in an Array{AbstractMatrix} but not in an Array{Matrix}.
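The container restriction mentioned here can be reproduced with a small sketch in current Julia syntax. The types `AbstractItem`, `ItemA` and `ItemB` are hypothetical stand-ins for the abstract parent and the two unrelated concrete types discussed above.

```julia
# Hypothetical stand-ins for an abstract parent and two sibling types.
abstract type AbstractItem end
struct ItemA <: AbstractItem
    v::Float64
end
struct ItemB <: AbstractItem
    v::Float64
end

concrete = ItemB[ItemB(1.0)]          # Vector{ItemB}
mixed    = AbstractItem[ItemB(1.0)]   # Vector{AbstractItem}

push!(mixed, ItemA(2.0))              # fine: ItemA <: AbstractItem

# Storing an ItemA in a Vector{ItemB} needs a conversion that does not exist.
rejected = try
    push!(concrete, ItemA(2.0))
    false
catch err
    err isa MethodError
end

# Type parameters are invariant: a Vector{ItemB} is not a Vector{AbstractItem}.
invariant = !(Vector{ItemB} <: Vector{AbstractItem})
```

Only the container declared with the abstract element type accepts both siblings, which is the same situation as Array{AbstractMatrix} versus Array{Matrix}.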

So this leads to yet another question for you guys: If I declare something as x::Vector{AbstractA}, but I only stick in things of type A<:AbstractA, can the compiler figure this out and store x compactly if A has such a representation? I am amazed by how much of the high level language features just go away during compilation because of correct specialization. To me it looks like figuring out what types could be stored in x is basically the same task as figuring out which types could be stuck into methods, right?

Coming back to the original idea: If the above can be done, one could make every Array{B} actually be an Array{Abstract{B}} without losing anything and then freely stick in an A. The added cost would only occur if A and B are mixed, in which case nothing can be done anyway.

Jeff Bezanson

Mar 15, 2012, 3:58:12 AM
to juli...@googlegroups.com
Anything is possible with enough work :)
The difficulty with arrays is that since they are mutable, the
compiler can't always see all assignments to them. For example:

a = Array(AbstractA,10)
f(a)
...

We don't know in general whether f will only store "A"s into the
array. Yes, in some cases you could see all uses and represent the
array more efficiently with that knowledge. I suspect those cases
would be rare for us.
You could also combine run-time techniques, such as switching the
representation if something other than an "A" is stored. But then
you'd be in a situation where looking at the type of an array doesn't
tell you its representation, so there would be some unavoidable
overhead.
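A small sketch of the situation, in current Julia syntax with hypothetical types `A` and `OtherA` and a hypothetical callee `f!`: the declared element type stays abstract no matter what is stored, and a callee is free to store any subtype.

```julia
abstract type AbstractA end
struct A <: AbstractA
    v::Float64
end
struct OtherA <: AbstractA   # hypothetical second implementation
    v::Int
end

a = Vector{AbstractA}(undef, 2)
a[1] = A(1.0)
a[2] = A(2.0)

# The element type stays abstract even though only A's are stored.
# An explicit copy can narrow it when the caller knows the contents:
compact = Vector{A}(a)

# A callee is free to store any subtype, which the caller cannot rule out.
f!(arr) = (arr[1] = OtherA(7); arr)
f!(a)
```

The narrowing copy `Vector{A}(a)` is something the programmer requests explicitly; the point above is that the compiler cannot in general make that choice on its own, because it cannot see what `f!` will store.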

Konrad Hinsen

Mar 15, 2012, 4:04:46 AM
to juli...@googlegroups.com
Stefan Karpinski writes:

> Hope this didn't come off as too ranty. Just wanted to attempt to explain and justify
> this fairly significant design decision.

On the contrary, I suggest adding explanations along these lines to the documentation,
as the question will likely come up again.

> Inheriting common fields is a weird bogus way of saving
> keystrokes at the cost of tightly coupling two implementations
> to each other. All the examples I've ever seen where someone
> tacks a few fields onto another type are pretty contrived
> (think chapter three of an intro to some oo language book).

I can provide a few real-life examples if you are interested but I
agree that they are not nearly as important or frequent as OO
textbooks tend to suggest, and that they are not worth the trouble
that the most popular solution (allowing type extensions with new
fields) produces.

The day I have to tackle such a situation in Julia, I'll probably make
a macro that defines the fields of the base type plus additional types
provided as an argument to the macro. Then I can define as many
extended types as I want and still modify their common fields in a
single place if needed. That macro could even add "convert" methods to
do the equivalent of downcasting.


However, the original question was about a somewhat different situation:

> On Wednesday, March 14, 2012 2:19:45 PM UTC-5, jrauch wrote:
>
> On a similar note: Is there an elegant way to give something
> a new name or subtype it, so that functions can be
> specialized differently? For example (useless, just an
> illustration), if I wanted to have a different Rational
> type, that uses the same kind of storage but acts
> differently, I would love to be able to do something like

What he seems to ask for is a typealias that creates a distinct type
(not a subtype) which has different behavior from the original type
but shares its implementation.

I see this type of construction all the time in Haskell code, where
types are the universal weapon for everything, including program
structure. In Haskell, people create a new algebraic data type with a
single field holding the value of the original type they want to
wrap. They also define the required conversion functions to work with
the original type through the interface of the new one. In Haskell,
one can rely on the compiler to optimize away the wrappers, so this is
an acceptable solution performance-wise, even though it clutters the
code significantly.
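For comparison, the same single-field wrapper idea can be sketched in current Julia syntax. The name `MediantRational`, the helper `unwrap`, and the choice of the mediant as the "different behavior" are all assumptions made for illustration.

```julia
# Wrapper sketch: same storage as Rational{Int}, different `+`.
# `MediantRational` is a hypothetical name used only for illustration.
struct MediantRational
    r::Rational{Int}
end

# Conversions to and from the wrapped type.
MediantRational(n::Integer, d::Integer) = MediantRational(n // d)
unwrap(m::MediantRational) = m.r

# Alternate behavior: "+" forms the mediant (a+c)/(b+d) instead of the sum.
Base.:+(a::MediantRational, b::MediantRational) =
    MediantRational((numerator(a.r) + numerator(b.r)) //
                    (denominator(a.r) + denominator(b.r)))

s = MediantRational(1, 2) + MediantRational(1, 3)
```

The wrapper is a distinct type, so methods can dispatch on it separately, while the stored data is exactly one `Rational{Int}`. Every operation that should carry over from the wrapped type still has to be forwarded by hand, which is the clutter mentioned above.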

I haven't used Julia enough to say if there is a need for some such
construction, nor whether the Haskell approach would be efficient.

One potential application that has been recently discussed here is the
different views of arrays. With different types that share a common
implementation, one could have Array as an APL-style container with
elementwise operations, Matrix for a Matlab-style interface, and Frame
for statisticians coming from R.

Konrad.

Jeff Bezanson

Mar 15, 2012, 4:23:52 AM
to juli...@googlegroups.com
I believe abstract types handle this --- for example we have many
functions defined on AbstractArray, which are likely to work for
anything array-like.

And as Stefan said, Complex numbers were defined this way --- Complex
is abstract, and there are Complex128, Complex64, and ComplexPair
concrete implementations. All complex arithmetic is defined on
Complex, so those functions work on all concrete complex numbers.

Is this what you're referring to?

Jonas Rauch

Mar 15, 2012, 4:28:19 AM
to juli...@googlegroups.com
That makes perfect sense. I think runtime handling is bad, because of overhead. But could promotion work? For example consider again the hierarchy
A <: Abstract{A} <: Abstract{B}
B <: Abstract{B}

one could make an implicit promotion rule
promote_rule(::Type{Vector{Abstract{A}}}, ::Type{Vector{B}}) = Vector{Abstract{B}}

in this case, the function
@abstract store(a::Vector{Abstract{B}}, x::Abstract{B}, i) = begin
  a[i] = x
end

would specialize to methods
store(Vector{A}, A, Int)
store(Vector{B}, B, Int)
store(Vector{B}, A, Int)
where the last one would return a Vector{Abstract{B}} and the others Arrays of the respective concrete types.

Jonas Rauch

Mar 15, 2012, 4:30:35 AM
to juli...@googlegroups.com
Yes, exactly. But if you want to extend someone else's type, and they did not design it with this in mind (i.e. by making an abstract parent type containing the functionality), you cannot simply extend it. So the idea was to associate an abstract type with each type.

Jonas Rauch

Mar 15, 2012, 4:33:22 AM
to juli...@googlegroups.com
Quick addition: The function

@abstract store(a::Vector{Abstract{B}}, x::Abstract{B}, i)
could even be defined as
@abstract store(a::Vector{B}, x::B, i)
because the @abstract keyword implies that it only acts on the concept of B rather than the implementation B. Does this make sense?

Jonas Rauch

Mar 15, 2012, 4:39:03 AM
to juli...@googlegroups.com
Ah, sorry, forget all that. I was picturing changing a copy of the array and returning the result, which is of course something completely different. Sorry. So you really would need a runtime component, which makes things hard.

Konrad Hinsen

Mar 15, 2012, 6:17:10 AM
to juli...@googlegroups.com
Jeff Bezanson writes:

> I believe abstract types handle this --- for example we have many
> functions defined on AbstractArray, which are likely to work for
> anything array-like.

The problem with using abstract types here is that you need to modify
the subtype graph, which also means modifying the implementation of
the original type.

To stick to the original somewhat contrived example: suppose I want to
provide alternate behavior for type Rational. That type is part of the
Julia standard library, so I'd have to introduce a new abstract
type above Rational into the standard library. That's not a workable
approach in practice.

BTW, I wonder how typealias works and what it is good for. I haven't yet found
any difference in result between

typealias Foo Int32

and

Foo = Int32

Is there one?

Konrad.

Stefan Karpinski

Mar 15, 2012, 11:09:53 AM
to juli...@googlegroups.com
> BTW, I wonder how typealias works and what it is good for. I haven't yet found
> any difference in result between
>
>    typealias Foo Int32
>
> and
>
>    Foo = Int32
>
> Is there one?
>
> Konrad.

For non-parametric types

typealias Foo Int32

is equivalent to

const Foo = Int32

For parametric types, you can do things like this:

typealias Vector{T} Array{T,1}

You can't do this with assignment because of the quantification over the T type parameter.
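For readers on current Julia: the `typealias` keyword was removed in later versions, and both forms are now written with `const`. The sketch below assumes a recent Julia version; `Foo`, `MyVector` and `firstelem` are illustrative names (Base already defines `Vector`).

```julia
# Current-Julia equivalents of the two `typealias` forms (an assumption about
# the reader's Julia version; the thread itself predates this syntax).
const Foo = Int32                 # non-parametric alias
const MyVector{T} = Array{T,1}    # parametric alias, quantified over T

# The alias can be used for dispatch like the type it names.
firstelem(v::MyVector{T}) where {T} = v[1]

e = firstelem([1.5, 2.5])
```

The parametric form is the one that a plain `Foo = ...` assignment could not express in 2012, because the right-hand side has to be quantified over `T`.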

Stefan Karpinski

Mar 15, 2012, 12:57:38 PM
to juli...@googlegroups.com
Regarding the proposal of automatically sticking abstract types "above" every concrete type, some kind of implementation like this might be possible. It seems like the proposal boils down to automatically sticking an abstract A' just above every concrete type A and then actually using abstract A' everywhere A is used for dispatch; for other uses of A, like in type parameters for arrays, A just means A. Something like that could probably be made to work (it's reminiscent of how singleton classes work in Ruby).

However, I'm going to argue against having this ability from a completely social standpoint. Since Julia is  dynamic and open source, if you need an abstract supertype of some concrete type, you can always just patch your copy of the definition of Rational and explicitly stick an abstract supertype in there. When this happens, for code maintenance sanity, you're going to want to push that change upstream to whoever maintains the official version of the Rational type. When you do that, you start a conversation with that person about why you feel that Rational needs a supertype. They can either persuade you that it's a bad idea and provide a better solution, or you can persuade them that it is a good idea and get them to change the official version. In the latter case, the official Rational really needed an abstract supertype all along, and now it gets one. Everyone benefits! Regardless of how the conversation goes, the fact that the conversation happened is a good thing.

Moreover, the Rational code most probably ought to be refactored to some extent to be used as an abstraction. Maybe you can use it that way even when that wasn't intended, but it's probably going to be suboptimal and maybe even ugly. For example, significant refactoring was needed when we changed Complex from a single concrete implementation into an abstraction with three implementations (ComplexPair, Complex64, and Complex128). If anyone can just jack a new implementation under something that was not intended to be an abstraction, then there's a huge temptation to just do that instead of trying to get the official version patched. The conversation with the maintainer never happens, the patch never gets submitted, and the next person who needs to use Rational as an abstraction ends up also doing something suboptimal that kind of works but wasn't intended.

I think the idea that you need to inherit from types that may or may not have been intended to be inherited from is an example of "closed-source thinking" and only makes sense for statically compiled languages. That ability is essential for a language like Java where the developer may not have access to the source code of a class they're extending and thus may not be able to change its behavior. In a dynamic, open source language like Julia, where you always have access to the source and can change anything, this is not necessary. Furthermore, I would argue that it is detrimental to the language ecosystem as a whole if people aren't strongly "encouraged" to push back changes to maintainers when some type doesn't anticipate some need.

Jonas Rauch

Mar 15, 2012, 5:03:29 PM
to juli...@googlegroups.com
The social aspect is an interesting argument that I definitely had not heard before: "How does language design affect people's incentive to contribute?"
I think the idea of an automatic abstract type, while interesting, could probably make things very confusing. So I agree that it probably shouldn't be done.


Konrad Hinsen

Mar 16, 2012, 5:05:58 AM3/16/12
to juli...@googlegroups.com
Stefan Karpinski writes:

> However, I'm going to argue against having this ability from a
> completely social standpoint. Since Julia is dynamic and open
> source, if you need an abstract supertype of some concrete
> type, you can always just patch your copy of the definition of
> Rational and explicitly stick an abstract supertype in there. When

That's an interesting argument, though I won't believe it works
until I see it working. Scientists are sceptics by nature or by
training ;-)

What you are advocating is that everyone using Julia should work on
the bleeding edge, pulling in upstream changes regularly in order to
be able to use the latest features, in particular those that the user
has asked for himself. I don't quite see this happening any time soon,
judging from the prehistoric versions of many programs I see on my
colleagues' computers. Most computational scientists are very
reluctant to change anything in their work environment, partly out of
laziness, but partly out of the justified fear that some update will
break their code. There are also institutional barriers to updates,
in particular in high-security environments such as defense.

In the long run, what you describe could work out and in fact I hope it
will. I just wonder if it will happen before my retirement.

One ingredient missing in today's software development toolchain is an
integrated version control and dependency management system. I should
be able to say that my program X depends on a specific version of
Julia (referenced by the commit number). That way I could be sure that
working code remains working, even if I pull in updates and use the new
features in other projects. The Nix package manager
(http://nixos.org/nix/) works along these principles, but is not yet
solid enough for daily use.

The other ingredient that would have to change is the level of
sophistication of scientist-programmers. Most of my colleagues are
just beginning to see the point of version control, but don't yet
use it for their daily work.

Konrad.

Stefan Karpinski

Mar 16, 2012, 1:22:13 PM3/16/12
to juli...@googlegroups.com
Skepticism is totally fair. However, this does already happen with C++ libraries a fair amount. There are all sorts of requests to make methods virtual which are then discussed (briefly) and either do or don't happen. I think that's way too granular, myself, since having these discussions on a per-method level seems like a bit of a waste of time. The discussion of whether a concrete implementation in a library ought to have an abstraction over it is a far meatier and more worthwhile discussion, imo.

Regarding package management, I completely agree that integration with version control is a great idea. I've proposed this myself in issue #432 and elsewhere (some email thread?). I favor tight integration with git, since it has become the de facto open source version control standard and it's what we use. I also had the idea of versioning snapshots of collections of packages by making the package directory a git repo with installed packages as submodules. That way entire collections of packages can be tracked using git, a single git version number tells you all of the package versions, and you can get them all on a new system with a single git checkout. And of course, you get a full version history of every collection of packages you've ever used.

I definitely don't expect everyone to live on the bleeding edge. It should be easy to have a collection of patches that one uses and keeps rebasing on top of package upgrades. It should also be very easy, e.g. via GitHub, to share such patches, have people publicly review them and, if they're appropriate, get them accepted upstream. If you upgrade and one of your patches conflicts, you'll know immediately because you get a merge conflict. Git is sufficiently good about not accidentally merging things that won't work that people will hopefully be comfortable with doing upgrades.

We also intend to adhere strictly to the semantic versioning specification: point versions only fix bugs; minor versions can add features but only in a backwards-compatible way; major versions can break backwards compatibility. This is a lot of discipline, but I think it's very important and we should set the tone for the entire Julia community (which at this point is just this mailing list, I guess, but it's a really good start :-). This is why we're being very careful to make sure that after we tag v1.0, we won't have to make any breaking changes in the foreseeable future. (v1.0 will not be feature-complete, but features added in v1.1 will not break code that worked in v1.0.) Fear of upgrading is a social issue as much as a technological one.
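
The versioning rules above can be sketched with Julia's built-in `VersionNumber` type and `v"..."` literals. The two helper functions, `compatible` and `release_kind`, are hypothetical names introduced only for illustration; they are not part of Julia or of any package manager, and the sketch ignores the special treatment that the semantic versioning specification gives to 0.x releases.

```julia
# Hypothetical helper: may code written against `required` run on `installed`?
# Under the rules above, any equal-or-newer release within the same major
# series should work, and a different major series may break it.
function compatible(installed::VersionNumber, required::VersionNumber)
    installed.major == required.major || return false
    return installed >= required
end

# Hypothetical helper: classify the step between two releases.
function release_kind(from::VersionNumber, to::VersionNumber)
    to > from || throw(ArgumentError("target version must be newer"))
    to.major > from.major && return :major   # may break backwards compatibility
    to.minor > from.minor && return :minor   # new features, still compatible
    return :patch                            # bug fixes only
end
```

For example, `compatible(v"1.1.0", v"1.0.0")` holds, which is the promise made above that features added in v1.1 will not break code that worked in v1.0, while `compatible(v"2.0.0", v"1.0.0")` does not.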

Konrad Hinsen

Mar 19, 2012, 7:57:45 AM3/19/12
to juli...@googlegroups.com
Stefan Karpinski writes:

> Regarding package management, I completely agree that integration
> with version control is a great idea. I've proposed this myself
> in issue #432 and elsewhere (some email thread?). I favor tight
> integration with git, since it has become the de facto open source

Fine, as long as mere users of some library don't need to learn git
first.

> version control standard and it's what we use. I also had the idea
> of versioning snapshots of collections of packages by making the
> package directory a git repo with installed packages as
> submodules. That way entire collections of packages can be tracked
> using git and a single git version number tells you all of the
> package versions, and you can get them all on a new system with a

That sounds very nice, but it requires power-user knowledge of git.

Of course some more user-friendly package manager could be built
on top of raw git and make all that goodness more accessible.

My personal long-term vision for integrated version control and dependency
management goes even further to include scientific data, mixed-language
software implementations, and scientific publishing:

http://dirac.cnrs-orleans.fr/plone/software/activepapers

Unfortunately my proof-of-concept implementation is based on the JVM,
which pretty much no one uses in computational science. I hope to find
the time to explore other possible bases, e.g. LLVM.

Konrad.

Stefan Karpinski

Mar 19, 2012, 9:04:51 AM3/19/12
to juli...@googlegroups.com
Konrad Hinsen writes:

> My personal long-term vision for integrated version control and dependency
> management goes even further to include scientific data, mixed-language
> software implementations, and scientific publishing:
>
> http://dirac.cnrs-orleans.fr/plone/software/activepapers
>
> Unfortunately my proof-of-concept implementation is based on the JVM,
> which pretty much no one uses in computational science. I hope to find
> the time to explore other possible bases, e.g. LLVM.

That's very cool. I had similar ambitions, although my thoughts on the subject are generally more along the lines of "github for science": a service where you can access both the data and the code that were used in the same place, and replicate results, possibly on virtualized hardware, so that an identical experimental setting can be reproduced relatively easily. Of course, for both ideas, one of the hardest problems is buy-in: how do you get people to use it?

Konrad Hinsen

Mar 19, 2012, 10:35:56 AM3/19/12
to juli...@googlegroups.com
Stefan Karpinski writes:

> That's very cool. I had similar ambitions, although my thoughts on the subject are
> generally more along the lines of "github for science": a service where you can access
> both the data and the code that were used in the same place, and replicate results —
> possibly on virtualized hardware, so that an identical experimental setting can be
> relatively easily reproduced.

That sounds a bit similar to SHARE:

http://is.ieis.tue.nl/staff/pvgorp/share/

The main difference is that SHARE doesn't have anything similar to a GitHub fork,
meaning it's difficult to re-use code and data from others in a different context.

> Of course, for both ideas, one of the hardest problems is buy in:
> how do you get people to use it?

By making it easy to use, and by integrating it into the tools that
scientists are already familiar with. At least that's my theory ;-)

Konrad.
