Julia's documentation system

Mike Innes

Apr 8, 2014, 2:40:51 PM

to juli...@googlegroups.com

I’ve been thinking about how Julia’s documentation system could work, particularly in terms of editor support. There are probably already ideas floating around about how to do this, but I’ll throw in mine:

Doc strings are written as triple-quoted strings placed immediately before the relevant function/module. They are written as markdown, so that they can be richly displayed in interactive environments (Light Table, IJulia) and in the terminal (e.g. by using bold, underline and colours) when help(f) or ?f is called.

What this would mean is that inline docs can be displayed richly as you edit code. For example, you open up a .jl file in Light Table and it looks like this, with the docs automatically rendered:

module TestDoc

"""
# Docs Test

This text will be displayed as (a subset of) markdown as it is edited.
Fitting the text to a reasonable width is taken care of, so you can let the
docs flow *without* worrying about **formatting**.

You can use code blocks too, of course:

    2+2 = 4
"""

"""
Square a number.

    square(2) = 4
"""
function square(x) # A regular comment
  return x*x #| A formatted one.
end

end # module TestDoc

(And of course, if you open up in a normal editor it’s just plain text – but luckily plain markdown looks pretty good anyway)
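
As a rough illustration of the "fitting the text to a reasonable width" step mentioned in the example, here is a minimal sketch of a greedy re-wrap. The function name `reflow` and the width of 40 columns are assumptions made for illustration; nothing here describes how Light Table actually lays out text.

```julia
# Hypothetical helper: greedily re-wrap a paragraph to at most `width` columns.
# A single word longer than `width` is placed on a line of its own.
function reflow(paragraph::AbstractString, width::Integer)
    lines = String[]
    current = ""
    for word in split(paragraph)          # split on any whitespace, including newlines
        if isempty(current)
            current = String(word)
        elseif length(current) + 1 + length(word) <= width
            current *= " " * word         # the word still fits on the current line
        else
            push!(lines, current)         # start a new line with this word
            current = String(word)
        end
    end
    isempty(current) || push!(lines, current)
    return join(lines, "\n")
end

text = "Fitting the text to a reasonable width is taken care of, so you let the\ndocs flow *without* worrying about **formatting**."
wrapped = reflow(text, 40)
println(wrapped)
```

The point of the sketch is only that hard line breaks in the source need not survive into the display: the author writes flowing text and the viewer decides where lines end.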

This would be fairly simple to implement in Light Table, at least. Another way to signify docs might be to have a line of comments rather than strings, marked out as docs like this:

#| # Docs Test
#| This text will blah…
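
To make the comment-based variant concrete, here is a minimal sketch of how a tool might collect `#|` lines and attach them to the function that follows. The helper name `doc_comments` and the rule that a blank line ends a doc block are assumptions for illustration, not part of the proposal.

```julia
# Hypothetical helper: gather consecutive `#|` lines that directly precede
# a `function` definition and map the function name to the joined text.
function doc_comments(source::AbstractString)
    docs = Dict{String,String}()
    pending = String[]   # doc lines collected since the last ordinary line
    for line in split(source, '\n')
        stripped = strip(line)
        if startswith(stripped, "#|")
            # Drop the marker and at most one space after it.
            push!(pending, replace(stripped, r"^#\| ?" => ""))
        else
            m = match(r"^function\s+(\w+)", stripped)
            if m !== nothing && !isempty(pending)
                docs[String(m[1])] = join(pending, "\n")
            end
            # Any other line, including a blank one, ends the doc block.
            empty!(pending)
        end
    end
    return docs
end

src = """
#| # Docs Test
#| Square a number.
function square(x)
    return x*x #| A formatted one.
end
"""

docs = doc_comments(src)
println(docs["square"])
```

This sketch ignores trailing `#|` comments such as the one on the `return` line; handling those would need a separate rule.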

I’ll probably implement something like this in LT anyway, but it’s so much better if it’s directly supported by the language.

Any thoughts on this?

Mike Innes

Apr 8, 2014, 2:44:44 PM

to juli...@googlegroups.com

Unfortunately, google groups has messed up my formatting :/ Sorry about that.

Try here for a better example: https://docs.google.com/document/d/1_SKpl2T8JZz6BRZLshj3YNDy--wYRbGZgNvop5zXe90/edit

Stefan Karpinski

Apr 8, 2014, 2:56:29 PM

to juli...@googlegroups.com

I like this proposal. I had considered using a comment leading up to a function or module, but this might be better since a string is easily distinguishable and a string by itself is not useful. One advantage of having it in the top-level context is that you can actually allow interpolation (although editors might not be able to handle that), which could be quite useful for generated bits of documentation.
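
A minimal sketch of what interpolation could buy for generated documentation, using only ordinary string interpolation in a triple-quoted string. The function `scaling_doc` and its arguments are hypothetical; the sketch builds the doc text and does not attach it to anything.

```julia
# Hypothetical generator: build the doc text for a family of similar
# functions by interpolating the parts that differ.
function scaling_doc(name, factor)
    return """
        $name(x)

    Multiply `x` by $factor and return the result.
    """
end

println(scaling_doc("double", 2))
println(scaling_doc("triple", 3))
```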

Although I strongly prefer markdown over other markup styles, I'm not certain that docstrings should be written in markup. To some extent, you want a format that conveys meaning, not format. You want a way of expressing what things are and what they mean – formatting is secondary. Of course, when you have a body of text, you inevitably want to format it appropriately, so it may be a mix. I think that markdown with conventions about meaning – these things are argument names, this is the return type, etc. could work well. That way you can either just render it as markdown or parse it for meaning or both (or just leave it as a string literal).
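
As a sketch of "markdown with conventions about meaning", suppose (purely as an assumption) that argument descriptions live under a `# Arguments` heading as bullet items of the form "- `name`: description". A tool could then either render the whole string as markdown or pull the structured part out:

```julia
# Hypothetical convention: a "# Arguments" heading followed by bullet
# items written as "- `name`: description".
function argument_docs(doc::AbstractString)
    args = Pair{String,String}[]
    in_arguments = false
    for line in split(doc, '\n')
        stripped = strip(line)
        if startswith(stripped, "#")
            # A heading starts a new section; only "Arguments" is of interest.
            in_arguments = strip(lstrip(stripped, '#')) == "Arguments"
        elseif in_arguments
            m = match(r"^[-*]\s+`(\w+)`:\s*(.*)$", stripped)
            m === nothing || push!(args, String(m[1]) => String(m[2]))
        end
    end
    return args
end

doc = """
Square a number.

# Arguments
- `x`: the number to square

# Returns
The value `x*x`.
"""

println(argument_docs(doc))
```

The same string stays readable as plain text and as rendered markdown; the convention only adds something a parser can rely on.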

John Myles White

Apr 8, 2014, 2:59:40 PM

to juli...@googlegroups.com

There is one type of formatting that the documentation system needs to support: LaTeX. The question is how much of LaTeX you need to support in order to provide interpretable documentation.

-- John

Stefan Karpinski

Apr 8, 2014, 3:10:36 PM

to juli...@googlegroups.com

IPython's flavor of markdown supports embedded LaTeX. I think we should match IPython here as much as we can.

John Myles White

Apr 8, 2014, 3:11:48 PM

to juli...@googlegroups.com

That seems like a good idea, but we probably need to commit to a specific Markdown renderer then, since they vary so much in their support for LaTeX and other things. Not sure which one IPython is using.

-- John

Mike Innes

Apr 8, 2014, 3:13:56 PM

to juli...@googlegroups.com

If you mean LaTeX formatted equations, that's definitely doable. CodeMirror should be able to handle syntax highlighting with interpolation, too, and given the nature of LT you could even embed the results live.

I often find that, even when docs are plain text, I end up writing in a subset of markdown anyway: backticks to quote code/variables, for example, and occasionally asterisks for emphasis. Arguably those are to do with meaning rather than formatting; the formatting is just a bonus that helps convey the meaning better.

In any case, I definitely like the idea of markdown + conventions for structured information, as opposed to creating a syntax.

Rahul Dave

Apr 8, 2014, 3:16:38 PM

to juli...@googlegroups.com, Mike Innes

Just checked: it doesn't seem that IPython (about 2.0) supports LaTeX in the docstring yet (i.e. define a function in cell 1, try to use it in cell 2), but it shouldn’t be super hard to support...

--
Rahul Dave
Sent with Airmail

Stefan Karpinski

Apr 8, 2014, 4:30:12 PM

to Julia Dev, Mike Innes

One issue with LaTeX in a doc string and interpolation is that they both use $.
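
A small sketch of the clash: inside an ordinary Julia string `$` starts interpolation, so a LaTeX-style `$...$` formula has to be escaped. Raw string literals (`raw"..."`) are a feature of current Julia releases and are not mentioned in this thread; they are shown here only as one way the two uses of `$` can be kept apart.

```julia
x = 3

# An unescaped `$` interpolates the variable into the string.
interpolated = "The value is $x"

# To keep a literal `$` for a LaTeX-style formula, escape it ...
escaped = "The formula is \$x^2\$"

# ... or use a raw string literal, which performs no interpolation.
unprocessed = raw"The formula is $x^2$"

println(interpolated)
println(escaped)
println(escaped == unprocessed)
```

The cost is visible: a doc string cannot use both unescaped interpolation and unescaped `$`-delimited math at the same time.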

Mike Innes

Apr 8, 2014, 4:39:25 PM

to Rahul Dave, juli...@googlegroups.com

I think that, ideally, we'd want a pure-Julia markdown parser – especially if it should support multiple back ends (i.e. both HTML and the terminal), we should be able to parse into a data structure and then emit as a separate step.
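
The "parse into a data structure, then emit as a separate step" idea can be sketched in a few lines. Everything below is an assumption for illustration: the node types, the function names and the tiny subset of markdown (headings, paragraphs, indented code) are invented here, and inline markup such as bold is not handled.

```julia
# Node types for a deliberately tiny markdown subset (hypothetical names).
abstract type Block end
struct Heading <: Block
    level::Int
    text::String
end
struct Paragraph <: Block
    text::String
end
struct CodeBlock <: Block
    code::String
end

# Step 1: parse. Blocks are separated by blank lines.
function parse_blocks(text::AbstractString)
    blocks = Block[]
    for chunk in split(text, r"\n\s*\n")
        isempty(strip(chunk)) && continue
        lines = split(strip(chunk, '\n'), '\n')
        heading = match(r"^(#+)\s+(.*)$", lines[1])
        if length(lines) == 1 && heading !== nothing
            push!(blocks, Heading(length(heading[1]), String(heading[2])))
        elseif all(l -> startswith(l, "    "), lines)
            # Four leading spaces mark code; remove them from every line.
            push!(blocks, CodeBlock(join((l[5:end] for l in lines), "\n")))
        else
            push!(blocks, Paragraph(join(strip.(lines), " ")))
        end
    end
    return blocks
end

escape_html(s) = replace(replace(replace(s, "&" => "&amp;"), "<" => "&lt;"), ">" => "&gt;")

# Step 2a: HTML back end.
to_html(b::Heading) = "<h$(b.level)>$(escape_html(b.text))</h$(b.level)>"
to_html(b::Paragraph) = "<p>$(escape_html(b.text))</p>"
to_html(b::CodeBlock) = "<pre><code>$(escape_html(b.code))</code></pre>"

# Step 2b: terminal back end, using ANSI escape codes for bold headings.
to_terminal(b::Heading) = "\e[1m$(b.text)\e[0m"
to_terminal(b::Paragraph) = b.text
to_terminal(b::CodeBlock) = join(("    " * l for l in split(b.code, '\n')), "\n")

# The emitter is just a function argument, so back ends stay independent.
render(emit, blocks) = join(map(emit, blocks), "\n\n")

blocks = parse_blocks("""
# Docs Test

Square a number.

    square(2) = 4
""")

println(render(to_html, blocks))
println(render(to_terminal, blocks))
```

Because the parser knows nothing about output, adding another back end means adding one method per node type and leaving the parser alone.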

I know that sounds like it's overcomplicating things and making more work. But actually, in the short term, we wouldn't need markdown parsing at all – we just need to agree to bear in mind that docs might be richly displayed in principle. That would mean, for example, avoiding the assumption that paragraph text will be displayed in fixed width, delimiting code with backticks or indentation, and using consistent notation for bullet points, headings etc.

The advantages of establishing these conventions are twofold: (a) the consistency makes things clearer even if docs are only ever plain text, and (b) it means we don't shut the door on having rich display in the future.

In short, what I'm saying is that we don't have to solve all the problems right now, but soon-ish we will need to decide whether we want to keep the door open.

Stefan Karpinski

Apr 8, 2014, 4:59:01 PM

to Julia Dev, Rahul Dave

Seems sane to me.

Rahul Dave

Apr 8, 2014, 5:07:38 PM

to Julia Dev, Stefan Karpinski

Maybe wrap this? (not the most extensible but 3 files==portable, simpler)

https://github.com/vmg/sundown

--
Rahul Dave

John Myles White

Apr 8, 2014, 5:09:42 PM

to juli...@googlegroups.com

I started doing that, but didn't have time to get it clean:

https://github.com/johnmyleswhite/Markdown.jl

-- John

Stefan Karpinski

Apr 8, 2014, 5:16:14 PM

to Rahul Dave, Julia Dev

It's a good reference point, but parsers written in C are not known for being brief or easy to use. I just glanced at it though, so I'd have to look at it some more to make an informed judgement, of course.

John Myles White

Apr 8, 2014, 5:17:57 PM

to juli...@googlegroups.com

Another problem: Sundown is deprecated and there's no clear sign of what the future standard parser will be.

-- John

Rahul Dave

Apr 8, 2014, 5:24:14 PM

to juli...@googlegroups.com, John Myles White

Another reason for wanting this:

http://ashkenas.com/literate-coffeescript/

I suppose one can use dexy for command line scripts, but mixed with a custom module loader one could use this for ‘non-main-function’ code, with appropriate wrapping into a non-global scope (thus fast). Kinda like the IJulia notebook, but different, more like knitr.

--
Rahul Dave

Simon Danisch

Apr 8, 2014, 5:26:24 PM

to juli...@googlegroups.com, Jannis E

Hi,

It sounds like this issue overlaps heavily with the bachelor thesis that a friend and I are working on.

We are working on an online graph database for code snippets, example materials, media and documentation.

The first milestone is to parse the Julia packages and the documentation and feed them into the library.

My side is the visualization of the queries to the database, which is part of an experimental IDE that I'm working on.

Our exchange format is basically just the AST put into XML, flavored with some additional information.
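
To give a feel for "the AST put into XML", here is a minimal sketch that serialises a Julia expression. The element names (`expr`, `symbol`, `literal`) and the `head` attribute are invented for illustration; the actual exchange format is not shown in this thread.

```julia
# Escape the characters that are special in XML text.
xml_escape(s) = replace(replace(replace(string(s), "&" => "&amp;"), "<" => "&lt;"), ">" => "&gt;")

# Hypothetical element names: <expr head="..."> for Expr nodes,
# <symbol> for symbols and <literal> for every other leaf.
to_xml(s::Symbol) = "<symbol>$(xml_escape(s))</symbol>"
to_xml(x) = "<literal>$(xml_escape(repr(x)))</literal>"
function to_xml(ex::Expr)
    children = join(map(to_xml, ex.args))
    return "<expr head=\"$(xml_escape(ex.head))\">$children</expr>"
end

println(to_xml(Meta.parse("x * x")))
println(to_xml(Meta.parse("x + 1")))
```

For `x * x` this yields a `call` expression whose children are the symbols `*`, `x` and `x`, mirroring the structure of the parsed expression.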

As I want to have everything in native OpenGL, things are moving slowly, but I hope it pays off in the end.

Having the viewer implemented in OpenGL has some nice properties:

1. The OpenGL package profits from the efforts, as things need to work not just for people with good OpenGL knowledge

2. All efforts spent on visualizing code, images, plots, layouts, etc in a nice way are directly usable from Julia

3. One can use all the render commands, which will be needed for other scientific visualizations.

=> This makes it possible to visualize complex data in documentation and example code, in the way people are used to from the native OpenGL visualization functions.

4. Things can be fully interactive and editable, making it easy to push corrections back to the online library.

This of course assumes that there is an OpenGL visualization package.

Well, I don't want to hijack this thread, but it seemed like a perfect fit!

I wanted to open a thread about our project soon anyways, so I guess I should just do so and things can be discussed from there!