Online Database + IDE for Julia


Jannis Eichborn

Apr 9, 2014, 11:59:34 AM
to julia...@googlegroups.com

Hi, 

As previously announced by Simon Danisch, I want to present our online library concept for Julia.

It's an online library built on a graph database, meant to support a rich feature set: uploading and versioning code, searching code and documentation, automated tests, etc.

For starters, it will be more of an advanced interface for searching packages and documentation.
This might be a little hard to justify, as the package system combined with Github already does a great job.

But I think the idea behind it becomes clearer when you think about scaling up the feature set in the future.
You could probably implement every feature you would like independently of the others, say as separate libraries or tools.
For example, you could build a separate interface for an image library, a library for statistical data sets, a library for binary dependencies, a library for tutorials and examples, a forum for discussions/bug reports, etc.

But I think it's extremely useful to have all these different kinds of "libraries" in one big library, because the resources I listed all need to be treated very similarly.
They all might have tags, ratings, attached opinions/discussions and dependencies, they are part of a hierarchy, and the search interface for them should be very similar.
In a graph library, nodes gain a higher-level semantic meaning very easily, just by building connections.
In the end, this would also prevent information about the code from being fragmented across different systems or libraries.
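
As a rough illustration of nodes getting their meaning from connections, here is a minimal sketch in Julia (written for Julia 1.x). It is not the prototype's data model: `Node`, `ResourceGraph`, `add_node!`, `link!`, `incoming`, and the example resources and relation labels are all hypothetical names made up for this example.

```julia
# Minimal sketch of a heterogeneous resource graph (hypothetical names throughout).
struct Node
    id::Int
    kind::Symbol          # e.g. :function, :datatype, :tutorial, :dataset
    name::String
    tags::Set{String}
end

struct ResourceGraph
    nodes::Vector{Node}
    edges::Vector{Tuple{Int,Symbol,Int}}   # (from id, relation, to id)
end

ResourceGraph() = ResourceGraph(Node[], Tuple{Int,Symbol,Int}[])

# Add a node and return it; ids are just positions in the node list.
function add_node!(g::ResourceGraph, kind::Symbol, name::AbstractString, tags=String[])
    node = Node(length(g.nodes) + 1, kind, String(name), Set{String}(tags))
    push!(g.nodes, node)
    return node
end

# Connect two nodes with a labelled relation such as :returns or :explains.
link!(g::ResourceGraph, from::Node, rel::Symbol, to::Node) =
    push!(g.edges, (from.id, rel, to.id))

# All nodes that point at `target`, optionally restricted to one relation.
function incoming(g::ResourceGraph, target::Node; rel::Union{Symbol,Nothing}=nothing)
    return [g.nodes[from] for (from, r, to) in g.edges
            if to == target.id && (rel === nothing || r == rel)]
end

g = ResourceGraph()
image    = add_node!(g, :datatype, "Image", ["graphics"])
imread   = add_node!(g, :function, "imread", ["io", "graphics"])
tutorial = add_node!(g, :tutorial, "Loading images", ["beginner"])
link!(g, imread, :returns, image)
link!(g, tutorial, :explains, image)

# Everything connected to the Image datatype, regardless of resource kind.
related = [n.name for n in incoming(g, image)]
```

Functions, tutorials and data sets all live in the same structure here, so one query over the edges of `Image` finds code and documentation alike.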

Here is one example of how things could be represented in a heterogeneous graph-like structure.
This is just a mock-up, as things are not consistent so far and might be represented differently in the final library:

You can search for the English word "Image" and already get a first impression of what one can do with Julia and images.

This would be especially cool if it were deeply integrated into an IDE.
You could just click on a particular datatype and see which functions you could use on that type, or query example material directly.
Similarly, you could look up documentation or code samples for functions, and so forth. I hope you get the big picture!

Another interesting use case is to combine the library with a good testing library, so that you can define tests for every function.
That way you can extend the library in an easy and controlled way.
Someone can just upload a new version of a function, and if it passes the tests for that function, it gets accepted.
Together with benchmarks, one could even find out whether the new implementation is better and promote it to the new standard implementation.
This is far off for sure, as it needs a feature set similar to GitHub's and beyond.
But at some point, this might turn into a version control system that makes it very easy and conflict-free to extend the functionality of Julia.
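
As a rough illustration of that acceptance workflow, here is a minimal Julia sketch (written for Julia 1.x). `FunctionEntry`, `best_time` and `submit!` are hypothetical names invented for this example, not part of the prototype, and the timing helper is only a crude stand-in for a real benchmark suite.

```julia
# Hypothetical registry entry: the accepted implementation plus the tests it must pass.
mutable struct FunctionEntry
    name::Symbol
    current::Function
    tests::Vector{Function}    # each test takes an implementation and returns true/false
end

# Rough timing helper: best of a few runs, to reduce noise (not a real benchmark suite).
function best_time(f, input; runs=5)
    f(input)                   # warm up so compilation is not measured
    best = Inf
    for _ in 1:runs
        t = @elapsed f(input)
        best = min(best, t)
    end
    return best
end

# Accept `candidate` only if it passes every test and is faster on the benchmark input.
function submit!(entry::FunctionEntry, candidate::Function, bench_input)
    all(t(candidate) for t in entry.tests) || return :rejected_tests
    best_time(candidate, bench_input) < best_time(entry.current, bench_input) ||
        return :rejected_slower
    entry.current = candidate
    return :accepted
end

slow_total(xs)  = (sleep(0.01); sum(xs))   # deliberately slow reference implementation
fast_total(xs)  = sum(xs)
wrong_total(xs) = 0.0                      # fast, but gives the wrong answer

entry = FunctionEntry(:total, slow_total, Function[f -> f([1.0, 2.0, 3.0]) == 6.0])
r1 = submit!(entry, wrong_total, rand(1000))   # fails the test, so it is rejected
r2 = submit!(entry, fast_total, rand(1000))    # passes the test and is faster
```

The tests act as the gate and the benchmark as the tie-breaker, so a wrong but fast upload never replaces the current implementation.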

As mentioned above, the first step will be to create an interface that lets you insert code and search for it intelligently, based on partially given information. Right from the start, the system should allow for multiple competing implementations of one symbol and also keep track of versions of the same function.

We are currently developing the first prototype of the database together with a simple IDE to connect to it. This happens in the scope of our Bachelor’s Thesis. The database is conceptually independent from the IDE but of course one does not make much sense without the other.

So if you have any kind of feedback, ideas, wishes for features, inherent flaws you want to point out or just your two cents worth, let us know! We want to know what people working with/on Julia need.

Thanks and cheers,
Jannis & Simon

Jacques Rioux

Apr 9, 2014, 12:50:01 PM
to julia...@googlegroups.com
Did I miss a link in your post? 

Is there any place to interact with or see your progress online now?

Good luck in your endeavour

Kevin Squire

Apr 9, 2014, 1:54:07 PM
to julia...@googlegroups.com
I think he's referring to this original post:


which links to


I was going to suggest to Simon that he add a short explanation, as the original post was pretty sparse, and doesn't really suggest to anyone why they would want to look at that package.  Jannis more than makes up for that in his post. ;-)

Cheers!
   Kevin

Jacques Rioux

Apr 9, 2014, 2:00:35 PM
to julia...@googlegroups.com

I did see that post after but I still fail to see the connection between the two.

Kevin Squire

Apr 9, 2014, 2:06:19 PM
to julia...@googlegroups.com
You're right.  The message talking about this was actually on julia-dev.


Cheers!  Kevin

Jacques Rioux

Apr 9, 2014, 2:23:24 PM
to julia...@googlegroups.com
Ah! Ok, now I get it.

I was starting to wonder "Am I really that thick?" 

I may be but we will have to wait a little longer for confirmation... ;-)

Thanks 

Simon Danisch

Apr 10, 2014, 5:22:03 AM
to julia...@googlegroups.com
"Announced by me" is a big word here, but thanks to Kevin the correct link found its way to this post! =)

Sorry about the confusion, we're still trying to figure out how to get things across on the mailing list!
And it's quite a complex topic, so it's a bit difficult to get a good summary across without writing 10 pages of description.
That's also why we kept things a little separate, so that we can have different levels of detail in the discussion.
This post is intended to get more general feedback on the concept, so that we don't implement things that already exist or have proved to be foolish.
Posts by me about Events.jl or the OpenGL package are more on an implementation level.
I hope this strategy enables us to talk about the implementation and the concept separately, making things more constructive.
Please feel free to also criticize the way we present things, so that we can show our project in the best way possible and get the most feedback/help!

After all, we're still rookies, shooting for a fairly ambitious project.

Best,
Simon

Mike Innes

Apr 10, 2014, 6:33:22 AM
to julia...@googlegroups.com
It would definitely be good to have something like clojuredocs.org. You could include the docs for functions in all packages and Base, make everything searchable. If you could build meaningful connections between packages, even better.

But – and please forgive me if I've missed the point here – it does sound like you're trying to reinvent a lot of wheels. I'm not saying that projects like the OpenGL IDE are a bad thing, but they seem pretty tangential to this idea of having a knowledge-graph of sorts for Julia's ecosystem. If you think you can solve all of the issues around that, that's exactly what you should be doing, you know?

What stops you from implementing automated testing and benchmarking for GitHub, or providing access to documentation within an existing IDE? It doesn't preclude them working well together, since they're pretty orthogonal. And if nothing else, rebuilding all these things is several lifetimes' worth of work.

Anyway, nothing wrong with being ambitious. Good luck to you both!

Simon Danisch

Apr 12, 2014, 7:46:40 AM
to julia...@googlegroups.com
Well, the answer to this is pretty long...
I'd really like to keep things short, but I'm afraid I can't explain in a short summary why I think this project is relevant.

I'm not naive enough to think that two people could pull off something as big as this.
But that's why you do things Open Source, right? If the project is interesting enough, other people will eventually join and help along the way.
I just hope we can offer a cool prototype soon, so that it gets interesting enough for other people to join.
But there are certainly things that I feel more uncomfortable re-implementing.
One of them is GitHub.
GitHub + git are so advanced and well established that it might seem really, really stupid to re-implement them.
But if you think about it, GitHub is just not up to the task.
What we want to implement is something very closely tailored to the language.
And that is not just because I randomly like Julia.
My ideas for this started at a point where I was very frustrated with Open Source libraries.
I have a long list of things that I'm quite frustrated about:
  1. It's always a big pain to find well-documented libraries
  2. It's very difficult to compare similar libraries in terms of performance and functionality in a broader scope
  3. You do most of this research with Google, Stack Overflow, or GitHub, none of which is really specialized for this task
  4. Mostly things don't work well together, because everyone re-implements basic types, or exchange formats are not available or just suck
  5. It is very hard to extend libraries, as the structure is often quite rigid
  6. You always need to decide between languages, sacrificing either performance or development speed
  7. In the end, after you went through all these struggles, you mostly end up with missing dependencies and re-implementing a lot of things
I must admit, I'm starting to feel less strongly about a few of these points, as things get easier with experience.
But that still means that the bar is quite high for beginners.
 
So I started thinking about the root of these problems.
My first conclusion was that most languages are not a good fit for Open Source development.
Object-oriented languages especially seem to make things hard.
Introducing classes and inheritance means introducing a lot of rigidity and complexity, which makes it hard to extend anything without breaking other people's code.
Or things are so intertwined that you just can't figure out where to start changing things.
Well, to keep things short, I concluded that what one needs from a language are six things:
  1. Speed! Nothing is a bigger turn-off than being constantly reminded that the implementation would be 500x faster in another language
  2. A very easy but concise type system that just offers datatypes but isn't bound to any functionality
  3. Functions that are basically just symbols, where the right implementation gets called via pattern matching
  4. Good ways to easily define new syntax, to keep the language up to date
  5. Very good low-level support, to keep things up to date with new hardware
  6. High-level features: obvious, if you don't want to scare everyone away
I searched for a long time for a language fulfilling this and finally settled on Julia.
Multiple dispatch is not exactly what I had in mind, but it works pretty well and is probably a lot more performant than what I wanted to have.
I wanted to have arrays and dicts, and then do deep pattern matching on these to find the right implementation.
With this, a type would be just a pattern matching rule. (type Person MATCH [:name => String, :surname => String])
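
To make the "type as a pattern matching rule" idea a bit more tangible, here is a minimal sketch in Julia (written for Julia 1.x) that emulates it with plain dicts. This is not how Julia's dispatch works and not a proposed syntax; `Pattern`, `matches`, `describe`, `PERSON` and `EMPLOYEE` are hypothetical names for illustration only.

```julia
# Sketch: a "type" is just a pattern, i.e. required field names and their value types.
const Pattern = Dict{Symbol,DataType}

# A value matches a pattern if it has every required field with a value of the right type.
# Extra fields are allowed, so richer values still match older, smaller patterns.
matches(value::AbstractDict, pattern::Pattern) =
    all(haskey(value, k) && value[k] isa T for (k, T) in pattern)

PERSON   = Pattern(:name => String, :surname => String)
EMPLOYEE = Pattern(:name => String, :surname => String, :salary => Float64)

alice = Dict{Symbol,Any}(:name => "Alice", :surname => "Smith", :salary => 4200.0)
bob   = Dict{Symbol,Any}(:name => "Bob", :surname => "Jones")

# "Dispatch" by picking the first rule whose pattern matches, most specific first.
function describe(value)
    matches(value, EMPLOYEE) && return "employee $(value[:name])"
    matches(value, PERSON)   && return "person $(value[:name])"
    return "unknown"
end
```

Here `alice` matches both `EMPLOYEE` and `PERSON`, while `bob` only matches `PERSON`, so code written against the smaller pattern keeps working when a value carries additional fields.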

Well, having a language that fulfills these requirements still doesn't settle the problems that I have with Open Source.
But a performant high-level language already does one big thing: it doesn't turn off high-performance nerds, but also doesn't scare away beginners.
This is a huge win for an Open Source user base.
The other thing, that gets introduced by multiple dispatch is, that reusing and extending code becomes a lot easier.
You don't need to inherit from any class, you don't need to hack into anything, you simply create a new function and suddenly you have extended some package without making things awkward.
This doesn't work too well for types in Julia yet, as you can't really extend anything without breaking the whole thing and/or producing redundant code.
That is where a more advanced multiple dispatch would work a lot better.
For example, with what I had in mind, you could extend a type by just supplying a new pattern matching rule.
The old functions would still work, as they can find all the old fields, and a new function needing the more specific type would just require one more field.
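
The point about extending a package through multiple dispatch can be sketched as follows, in Julia 1.x syntax. The module `Shapes` and the names `Circle`, `Square`, `area` and `total_area` are hypothetical and stand in for "some package" and "user code".

```julia
# A "package" defines a type and a generic function (hypothetical names).
module Shapes
    export Circle, area
    struct Circle
        radius::Float64
    end
    area(c::Circle) = pi * c.radius^2
end

using .Shapes

# User code extends the package from the outside: a new type plus a new method
# for the existing function, with no subclassing and no edits to Shapes.
struct Square
    side::Float64
end
Shapes.area(s::Square) = s.side^2

# Code written against `area` now works for both types.
total_area(shapes) = sum(area(s) for s in shapes)

total = total_area([Circle(1.0), Square(2.0)])
```

Adding behaviour is just adding a method. Adding fields to `Circle` itself is a different matter, which is the limitation described above.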

But besides these problems, I think Julia is a good fit for finally settling a lot of the problems that many Open Source libraries have today.
Now one just needs a good system for organizing the resources produced by the Open Source community, to fully harvest Julia's advantages.

I must admit that GitHub is quite a good fit for doing this, and I'm always impressed by how sophisticated GitHub already is.
But I came to the conclusion that I want something more tailored to multiple dispatch and moving away from source files.
I want to have all resources, not just the code, in one connected graph.
And I want to do global queries like:
  • give me functions that have the arguments (::Image, ::Matrix)
  • search ALL the docs for: "xxxxx"
  • is there example code for this function? (or where is this function used)
  • etc...
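
The first query in the list above can be approximated today with Julia's own reflection. The following is a minimal sketch for Julia 1.x, assuming a hand-picked list of candidate functions rather than a real index over all packages; `Image`, `blend`, `invert`, `rescale` and `accepts` are hypothetical names, while `hasmethod` and `nameof` are standard Base functions.

```julia
# Hypothetical image type and a few functions to search through.
struct Image
    pixels::Matrix{Float64}
end

blend(img::Image, mask::Matrix) = Image(img.pixels .* mask)
invert(img::Image) = Image(1 .- img.pixels)
rescale(x::Number, k::Number) = x * k

# Candidate functions to search; a real index would cover every package.
candidates = [blend, invert, rescale]

# "Give me functions that have the arguments (::Image, ::Matrix)".
accepts(f, argtypes...) = hasmethod(f, Tuple{argtypes...})
found = [nameof(f) for f in candidates if accepts(f, Image, Matrix{Float64})]
```

In this sketch only `blend` is found. A graph database would answer the same question without loading any of the code, which is the difference from doing it by reflection in a running session.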
Also, I want to have things more atomic.
It's not really practical to have functions in a file or a class.
You always need to justify where to put a function, and then you need to make sure that it's available in other places.
Sometimes it is very difficult to justify where to put a certain function.
Just look at the fft function, or the voronoi function.
They are functions that are best described by their signature plus a few tags, but that belong in a lot of places.
Putting them into a certain module/class/file just can't represent their many use cases.
And suddenly you need to include a completely unrelated module in your program, just because it's the only one that has the right fft function.

Also, if you treat functions more like entities of their own, I hope it will become a lot more common to write tests per function and not per module.
This is obviously not always trivial.
The best example is the OpenGL functions, which rely on a previously created context, and where the effect of calling a gl function is very difficult to test.
But I think that shouldn't stop us from pursuing a more atomic treatment of libraries, as you gain one big advantage.
It becomes very easy to re-implement a function.
You just upload a newer version, and if it survives the tests for the function and is, for example, faster, voilà, you can accept it as a newer version.
Then you can discuss and rate the function.
And all this without having to deal with any forks, merges, the rest of the module, etc.
You don't really have a place for that on GitHub, as all the functionality is defined for repositories.
The same goes for bug reports. If functions are entities themselves in the system, you can just file bug reports for single functions, which keeps things focused on the location where the error really happens.
Also, when all the resources are connected in one big graph, it becomes trivial to, for example, build a minimal version of Julia.
You just query all the dependencies (which could be any type of data) for your function calls.
This surely only works well if Julia reduces its C dependencies, or makes them more atomic as well.
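
Querying all the dependencies of a set of function calls amounts to a reachability search over the graph. Here is a minimal sketch in Julia 1.x, assuming the dependency edges are available as a plain dict; the resource names and the helper `required` are hypothetical.

```julia
# Hypothetical dependency graph: each resource maps to the resources it needs directly.
deps = Dict(
    :plot_image => [:imread, :colormap],
    :imread     => [:libpng],
    :colormap   => Symbol[],
    :libpng     => Symbol[],
    :fft        => [:libfftw],
    :libfftw    => Symbol[],
)

# Collect everything reachable from the entry points (depth-first traversal).
function required(deps::AbstractDict, roots)
    needed = Set{Symbol}()
    stack = collect(Symbol, roots)
    while !isempty(stack)
        node = pop!(stack)
        node in needed && continue       # already visited
        push!(needed, node)
        append!(stack, get(deps, node, Symbol[]))
    end
    return needed
end

minimal = required(deps, [:plot_image])
```

A program that only calls `plot_image` pulls in `imread`, `colormap` and `libpng`, but neither `fft` nor `libfftw`, which is the "minimal version" idea in miniature.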

I don't want to go into even greater detail here. So let's just assume that you have all this magic running.
It then becomes obvious that a good IDE is needed to make use of all this functionality.
I could do this with existing IDEs, that's definitely true.
But I think IDEs need to change a lot in the future.
I'd just feel very uncomfortable extending an existing IDE, because I would constantly think: well, this needs to be completely redone soon.
I'm just very frustrated with how you browse code, packages, documentation, example code, etc.
And the biggest frustration is: why don't IDEs make more use of the dynamic features of the languages?
I want to write code like in, for example, Eclipse, and then run the program like in a REPL, having all the functions of the language available at runtime.
Then I could just manipulate and display variables at runtime, like I'm used to.
This would make debugging so much easier. Especially high-level debugging, where you just don't get a mathematical equation right but don't get any error either.

This is basically the reason why I want to build an IDE using the functionality of a native graphics library.
So I imagine the IDE more as a bundle of functions for displaying and manipulating the Julia data types, together with a sophisticated search interface, which itself just returns Julia data types (which can also be binary streams or whatnot).
Just look at all the questions from beginners who don't get how to create a multi-dimensional array in Julia.
With a good interactive view of an Array, this could become much easier.

Well, I think I shouldn't make this email much longer, so let's come to a conclusion:
I'm not very afraid of re-implementing things, as I think there are a lot of things that are in need of improvement.
I'm much more afraid that we don't have enough experience to judge things correctly, or to lay the necessary groundwork.
That's why we are here on the mailing list, to get feedback from more experienced developers.
Also, the thing that overlaps most with what I have in mind is actually the new Wolfram Language, which tackles most of the things I have mentioned.
When it got announced, I actually thought about just dropping all my plans. But it seems that the license is quite restrictive, things rely on their cloud, and the core isn't really open to developers.
So I hope that our efforts are still relevant!

Best,
Simon

Páll Haraldsson

Mar 27, 2015, 1:43:06 PM
to julia...@googlegroups.com

[I just want to say good luck with your system and it seems it would be useful. Still, I do not think Julia's success depends on it. It would only help it to be that much more powerful than other languages. Maybe it would also work with other languages?]

What you say is also my feeling (and compared to OO limitations, as you say):

"The other thing, that gets introduced by multiple dispatch is, that reusing and extending code becomes a lot easier."

I have just wondered whether my feeling might be wrong, as I haven't written anything big in Julia. I wonder if there are cons as well as pros (vs. OO or (pure) functional). Or if something even better exists.

This language is built for scientific computation (and I understand how the "multiple" part of dispatch relates to operators, making Julia a better fit), but it seems just powerful in general, and even big non-scientific codebases would benefit, maybe even more.

Con?: I'm not sure many people miss (C++'s) (multiple) inheritance or choosing it over composition. I also want to get my head around how Julia interfaces with this old (C++) world (Keno's Cxx).

-- 
Palli.
