Does Julia have something similar to Python's documentation string?

Xiao FENG

Aug 23, 2014, 10:53:33 PM
to julia...@googlegroups.com

Thanks.

Ivar Nesje

Aug 24, 2014, 5:07:30 PM
to julia...@googlegroups.com
Not yet.

Job van der Zwan

Aug 24, 2014, 6:03:04 PM
to julia...@googlegroups.com
Any plans? Discussions on Github worth reading through?

I personally am really charmed by the godoc approach - could something like that work for Julia? (so figuring out a sensible idiomatic way to document functions and modules that makes the documentation easy to read in plaintext, but also easy to be turned into pretty formatted documentation by tools)

On Sunday, 24 August 2014 23:07:30 UTC+2, Ivar Nesje wrote:
> Not yet.

Jason Knight

Aug 24, 2014, 7:23:26 PM
to julia...@googlegroups.com
Happy reading: https://github.com/JuliaLang/julia/issues/3988 :)

Stefan Karpinski

Aug 24, 2014, 7:43:11 PM
to julia...@googlegroups.com
I really like godoc – that's basically what I want plus a convention that the doc strings are markdown.

Job van der Zwan

Aug 25, 2014, 6:01:44 PM
to julia...@googlegroups.com
On Monday, 25 August 2014 01:23:26 UTC+2, Jason Knight wrote:
> Happy reading: https://github.com/JuliaLang/julia/issues/3988 :)

Thanks, that was indeed interesting :)


On Monday, 25 August 2014 01:43:11 UTC+2, Stefan Karpinski wrote:
> I really like godoc – that's basically what I want plus a convention that the doc strings are markdown.

From what I understand of the discussion linked above, the suggested approach is a @doc macro followed by a string, making documentation part of compiling the code, correct? The godoc approach is different in two ways: documentation is not part of the runtime but a separate tool that parses Go source files, and it extracts documentation from the comments, based on where they are placed.

The former part of the difference is just a consequence of how Go and Julia are used differently, so probably not that relevant, but Go's approach of using comments to indicate documentation sounds more sensible to me - documentation is what comments are for, are they not? Then why not suggest an idiomatic way to use the comments, and make a tool/the Julia runtime capable of extracting documentation information from that structure?

Mind you, I don't use Python so perhaps this is also a personal matter of not being used to docstrings.

John Myles White

Aug 25, 2014, 6:04:41 PM
to julia...@googlegroups.com
The issue is that you want to have all code documentation show up in REPL. In the GoDoc approach, this might require an explicit "build" step -- which is a non-trivial cost in usability.

 -- John

Job van der Zwan

Aug 26, 2014, 3:44:05 AM
to julia...@googlegroups.com
On Tuesday, 26 August 2014 00:04:41 UTC+2, John Myles White wrote:
> The issue is that you want to have all code documentation show up in REPL. In the GoDoc approach, this might require an explicit "build" step -- which is a non-trivial cost in usability.
>
>  -- John

I assume you are talking about GoDoc as a tool?

In case you are referring to comments as the source of documentation instead of docstrings: I assume comments are now simply discarded during compilation, making it impossible to use them for documentation, but if that could be changed they should be just as valid as the format for documentation, right?

John Myles White

Aug 26, 2014, 11:32:26 AM
to julia...@googlegroups.com
No, I was talking about what I understood to be a design principle of GoDoc: doc generation and parsing occurs at doc-gen time, not at run-time.

Yes, you would have to make comments non-ignorable to get this to work.

 — John

Stefan Karpinski

Aug 26, 2014, 11:34:24 AM
to Julia Users
To clarify – I meant that I like the style of GoDoc, not the fact that you run the tool as a separate pass. That doesn't strike me as completely out of the question, but wouldn't be optimal.

Xiao FENG

Aug 27, 2014, 8:15:56 AM
to julia...@googlegroups.com
Glad to see this discussion.  Thank you all -- especially to Jason for the link.

Job van der Zwan

Aug 27, 2014, 5:54:44 PM
to julia...@googlegroups.com
Right, that's what I meant with GoDoc being a separate tool: Go is statically compiled and does not have something like a REPL or runtime evaluation, so being a separate tool is only logical. In that sense it's not a comparable situation.

The comments-as-documentation and the conventions used to make it work might still be worth looking into.

I personally feel that from the point of view of people using Julia it's a better option than introducing docstrings - comments are already the source-level form of documentation-for-humans after all. Introducing docstrings feels like creating two different options for the same role, except one is ignored by tools and the other isn't. That just feels inelegant to me (not the strongest argument, I know), and I worry that code with both would become visually more noisy.

I just googled for possible reasons for having both docstrings and comments, and the only argument I found is that one describes the what and the other the how. GoDoc only counts comments above the package/variable/function definition as documentation, and ignores comments inside a function body or struct definition. Since the former typically document the what and the latter the how anyway, that distinction automatically emerges through convention.

Of course, if "not discarding comments during compilation" would require a major overhaul to the compiler and docstrings are technically much easier to introduce I can understand if that option is more appealing - a less elegant feasible solution is better than an elegant infeasible one. And perhaps there are other arguments in favour of having both docstrings and comments that I'm not aware of?

John Myles White

Aug 27, 2014, 5:57:14 PM
to julia...@googlegroups.com
Ok, thanks for clarifying. I also like the idea of strategically placed comments as automatic documentation.

 -- John

Leah Hanson

Aug 27, 2014, 6:07:05 PM
to julia...@googlegroups.com
I like OCaml's approach, which uses comments above functions/etc, and has a neat plugin system to output them in various formats.

John Myles White

Aug 27, 2014, 6:09:24 PM
to julia...@googlegroups.com

Jason

Aug 27, 2014, 6:22:57 PM
to julia...@googlegroups.com
Haskell also has a similar approach with Haddock, which allows the generation of documentation for the entire package ecosystem at one location: http://hackage.haskell.org/

Steven G. Johnson

Aug 28, 2014, 6:31:19 AM
to julia...@googlegroups.com, ja...@jasonknight.us
Another problem with using comments, besides the requirement that Stefan pointed out of a separate processing pass (as opposed to automatic availability of metadata at runtime like in Python) is that then the metadata is not Julia: we lose the flexibility of the metadata being arbitrary Julia objects, as opposed to just strings.

My proposal (in the abovementioned issue) is that the metadata be any Julia object, which then gets converted into output formats by using writemime methods. For example, if the object has writemime(io, "text/markdown", x), then you can get output in Markdown format (probably facilitated by an md"...." constructor). But if at some later point in time you want to attach SVG documentation, or some future documentation format, then you can easily do that by defining an appropriate container object. And since the objects can be queried to find out what MIME formats they support, you can perform automated translation between different formats (e.g. conversion of markdown and plain-text docstrings to HTML or LaTeX output).

Of course, with comment-based documentation then we could theoretically embed a little programming language (TeX?) in the comments to achieve the same thing, but since we already have a perfectly good programming language (we hope!), this seems silly.   It is more likely that a comment-based documentation implementation would use a fixed format, something we are then stuck with for all time.
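
The idea of documentation objects that advertise which MIME types they can render to can be sketched in present-day Julia. This is an illustration under assumptions, not the proposal's implementation: `writemime` was later replaced by `show(io, mime, x)`, which the sketch uses instead, and the `MarkdownDoc` type is a hypothetical name invented for the example.

```julia
# Hypothetical container type for illustration; not part of Julia or any package.
struct MarkdownDoc
    text::String
end

# Plain-text rendering: emit the Markdown source unchanged.
Base.show(io::IO, ::MIME"text/plain", d::MarkdownDoc) = print(io, d.text)

# Markdown rendering: the same source, advertised under the text/markdown MIME type.
Base.show(io::IO, ::MIME"text/markdown", d::MarkdownDoc) = print(io, d.text)

d = MarkdownDoc("Compute `f(x)`.")

# A presentation layer can query which formats the object supports.
can_markdown = showable(MIME("text/markdown"), d)
can_svg = showable(MIME("image/svg+xml"), d)

# Render to a string in a supported format.
rendered = repr(MIME("text/markdown"), d)
```

Supporting a new output format later would mean adding one more `show` method for that MIME type, without touching the code that stores or looks up the documentation object.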

Job van der Zwan

Aug 28, 2014, 9:16:27 AM
to julia...@googlegroups.com, ja...@jasonknight.us
Could we not have both, in a way? A sensible convention for comment-based documentation using markdown, which I expect covers the vast majority of usecases (being human-readable plaintext that converts to rich text). During compilation that documentation is converted and added to the global dictionary of metadata you propose.

So in the following case:

# documentation of function f
# foo
function f()

f() would be the key in the global dictionary, and the preceding comments would be converted to markdown format and associated with that key.

I'm sure that would cover the majority of the usecases, and lead to prettier, well documented plain text source code. At the same time the machinery you suggested can be added to it for the more complicated cases that need it.
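
A rough sketch of that association step, written in present-day Julia and working on source text rather than inside the parser. Everything here is an assumption made for illustration: the `comment_docs` name, the rule that only `#` lines directly above a `function` line count, and the use of the bare function name as the dictionary key.

```julia
# Collect the run of `#` comment lines directly above each `function` line.
# Returns a Dict mapping the function name to the joined comment text.
# Block comments (`#= ... =#`) and short-form definitions are not handled.
function comment_docs(source::AbstractString)
    docs = Dict{String,String}()
    pending = String[]              # comment lines seen since the last non-comment line
    for line in split(source, '\n')
        stripped = strip(line)
        if startswith(stripped, "#")
            # Drop the leading '#' characters and surrounding whitespace.
            push!(pending, strip(lstrip(stripped, '#')))
        else
            m = match(r"^function\s+([A-Za-z_]\w*)", stripped)
            if m !== nothing && !isempty(pending)
                docs[String(m.captures[1])] = join(pending, "\n")
            end
            empty!(pending)         # any other line breaks the association
        end
    end
    return docs
end

src = """
# documentation of function f
# foo
function f()
end
"""

docs = comment_docs(src)
```

Because a blank line resets `pending`, a comment separated from the definition by an empty line is treated as an ordinary comment, which mirrors the godoc placement convention discussed earlier in the thread.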

Jake Bolewski

Aug 28, 2014, 11:15:27 AM
to julia...@googlegroups.com, ja...@jasonknight.us
I really like Steven Johnson's proposal. I've often imagined a similar system, although I never considered using writemime as a generalization mechanism. It seems like consensus is slowly building around embedding compiler specific metadata directly into the AST with a metadata node. It would seem natural to use this for documentation as well. Comments would similarly be embedded into the AST instead of being skipped. That way documentation lives with a julia expression, and since the AST could contain arbitrary Julia values (if we have a pure Julia frontend at some point) we get the full generality Steven outlined above. Since documentation is now first class, the reflection / metaprogramming mechanisms we use in Julia could be extended to generating documentation. Documentation would also have access to the literal values of the runtime system instead of just symbolic values. Documentation is dynamic and introspective, it has full access to the julia source code and parsed AST it references. It would certainly be a more powerful system than anything out there I am aware of.

Steven G. Johnson

Aug 28, 2014, 4:20:46 PM
to julia...@googlegroups.com, ja...@jasonknight.us
On Thursday, August 28, 2014 9:16:27 AM UTC-4, Job van der Zwan wrote:
> Could we not have both, in a way? A sensible convention for comment-based documentation using markdown, which I expect covers the vast majority of usecases (being human-readable plaintext that converts to rich text). During compilation that documentation is converted and added to the global dictionary of metadata you propose.

 I was thinking more along the lines of:

doc md""" ... markdown docs for specific method foo(...) ... """
function foo(...)
   ...
end

doc md""" ... markdown docs for foo Function general ... """ foo

which would require some parser support (though it should be easy to implement), but is much more flexible than embedding things in comments.  e.g. you can use arbitrary Julia code to evaluate/generate the documentation object.   It also keeps comments "pure" ... comments should not be part of the language or have any format that Julia cares about.

You could also extend it to add other metadata with keywords: doc section="Foo Functions" author="SGJ" md""" ... """.

Stefan Karpinski

Aug 28, 2014, 4:50:12 PM
to Julia Users, ja...@jasonknight.us
Oh, man. I'm sorry but that makes my eyes bleed. Why can't we just associate the literal content of the comments with the appropriate function objects and other bindings and then leave the interpretation of those blocks of text to the presentation layer? I don't see much to be gained by making this completely programmable.

Leah Hanson

Aug 28, 2014, 4:50:46 PM
to julia...@googlegroups.com
Yes, that's it. I've partially written a custom generator for it in the past; plugging into the API was pretty easy, the hard part was understanding some of the internal type-representation data structures that it exposed. The API exposes the AST to the plugin, which allows the plugin to be more powerful than just a list of comments. Like Go, OCaml uses a separate tool for building documentation; however, this is not the cool part that I wanted to point out. The interesting idea to me was that it exposes the AST and metadata to any interested OCaml plugin, and that it was pretty straight-forward to get a basic plugin working.

It's important to me that there's an API for accessing the metadata & associated AST in a way that allows for more creative displays (i.e. Haskell's fancy type-aware search engine Hoogle (built using Haddock) or some other visualization or fancy IDE integration). This would also be an API that a package for building documentation websites/manuals could use.

I like Steven Johnson's proposal, and I think it already allows/has this -- since you can look up the metadata for a Function/Method/Module/etc, and you can already get those within Julia (all functions/etc in a Module, etc), this should "just work".

(I was initially expecting a comment-based approach, which I've seen work well in languages where I've used it (OCaml, Java) with a notation (@author, etc) for metadata embedded in comments. However, I think Julia has much cooler things it can do with special string parsers (md"") and writemime, and the flexibility of documentation format should really benefit from that. A potential oddness is that we might end up with "dependencies only for documentation" the way we have dependencies only for testing, since you might have a package that defines your special documentation string, but isn't otherwise used in your code.)

Stefan Karpinski

Aug 28, 2014, 4:53:59 PM
to Julia Users
We don't need flexible documentation – we need one simple documentation system that works.

ron...@gmail.com

Aug 28, 2014, 4:56:21 PM
to julia...@googlegroups.com, ja...@jasonknight.us
In the meantime, is it possible to write a julia function that can parse the text of a previously defined/included function, and echo the first comment block, in the same way that the help command works in Matlab?

John Myles White

Aug 28, 2014, 5:05:42 PM
to julia...@googlegroups.com
As we're starting to get better ideas for a documentation system, two questions I have are how we do two things:

(1) Handle documentation of generic functions and their specialized methods without requiring documentation of all specialized methods.

(2) Handle documentation of functions that are being generated by macros.

Both of these come up as soon as you start writing documentation for things like getindex. We definitely don't want to require writing a comment block for every method of getindex.

 -- John

Stefan Karpinski

Aug 28, 2014, 5:30:00 PM
to Julia Users
For that, I think it would suffice to have a programmatic way of manipulating the associated data. The comments desugar to doing that, but you can also just do it directly.

Steven G. Johnson

Aug 29, 2014, 6:54:04 AM
to julia...@googlegroups.com
On Thursday, August 28, 2014 5:05:42 PM UTC-4, John Myles White wrote:
> As we're starting to get better ideas for a documentation system, two questions I have are how we do two things:
>
> (1) Handle documentation of generic functions and their specialized methods without requiring documentation of all specialized methods.

Note that my proposal includes this.  If you have generic documentation that applies to all methods, then you attach the documentation to the Function object.  If you have method-specific docs, you attach it to the Method object.

(Note that I think it will be annoying to implement this functionality by parsing comments.)
 

> (2) Handle documentation of functions that are being generated by macros.

As above.  Writing

     doc ....docs.... getindex

will attach docs to the getindex Function, i.e. it is not method-specific.
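
For comparison, the docstring syntax that Julia later shipped (a string literal placed directly before a definition) supports the same split between function-level and method-level documentation. A minimal sketch in that later syntax, not the `doc` keyword proposed here; `area` is a made-up example function.

```julia
# Attached to the generic function as a whole: `function area end`
# declares the function without defining any method.
"Generic documentation that applies to every method of `area`."
function area end

# Attached to one specific method (signature `area(::Real)`).
"Area of a circle with radius `r` (documentation for this method only)."
area(r::Real) = pi * r^2

# Attached to a different method (signature `area(::Real, ::Real)`).
"Area of a rectangle with sides `w` and `h` (documentation for this method only)."
area(w::Real, h::Real) = w * h

circle = area(2.0)
rectangle = area(2, 3)
```

Typing `?area` at the REPL then lists the generic text together with the per-method entries, so methods without their own docstring still inherit the function-level description.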

Steven G. Johnson

Aug 29, 2014, 6:59:21 AM
to julia...@googlegroups.com


On Thursday, August 28, 2014 4:53:59 PM UTC-4, Stefan Karpinski wrote:
> We don't need flexible documentation – we need one simple documentation system that works.

I don't think that simplicity needs to come at the price of flexibility.

And if you sacrifice flexibility, you might easily end up with something that works for now, but is an annoyance in a few years.  (It is really hard to add more structured information into Python docstrings, for example.)

I think putting documentation in something that is not syntactically meaningful -- comments -- will get more and more annoying, because you'll end up inventing more and more ad-hoc rules for how to associate which documentation with which object (e.g. how do you associate the docstring comment with a constant?  with a Function as opposed to a specific method) etc.  Not to mention that the documentation system will have to decide which comments are docstrings and which comments are just internal comments by programmers who didn't intend for the comment to be an end-user doc.

Job van der Zwan

Aug 29, 2014, 8:41:06 AM
to julia...@googlegroups.com
On Friday, 29 August 2014 12:59:21 UTC+2, Steven G. Johnson wrote:
> And if you sacrifice flexibility, you might easily end up with something that works for now, but is an annoyance in a few years.  (It is really hard to add more structured information into Python docstrings, for example.)

If you include programmability and flexibility when it's not needed, you invite abuse. Especially if the target audience isn't primarily trained as programmers. We could choose to have different multiline comments count as different documentation blocks, and allow declaring a mimetype per comment block:

#=
  no mimetype, defaults to markdown
=#
#= image/svg+xml
  [svg data, or link to source]
=#
function f()


That should give enough flexibility and room for growth, I would say.


> I think putting documentation in something that is not syntactically meaningful -- comments -- will get more and more annoying, because you'll end up inventing more and more ad-hoc rules for how to associate which documentation with which object

First, if we design a proper convention for comments to have them work as a form of documentation, then don't they by definition become syntactically meaningful? It's a bit of a false premise. Also, the association works the same way it works for code: by placing the right tokens next to each other, following a certain standard.
 
> e.g. how do you associate the docstring comment with a constant?

Is the constant publicly visible? Then it's probably top level, so a comment directly above it works. That's what Go does.

> with a Function as opposed to a specific method

Ok, I admit this one is a bit trickier. But if we still have the store-in-a-dictionary approach you proposed (which as far as I can see does not conflict with comment-based documentation) one can directly add documentation to the dictionary in the same way one would for any dictionary.
 
> Not to mention that the documentation system will have to decide which comments are docstrings and which comments are just internal comments by programmers who didn't intend for the comment to be an end-user doc.

That is essentially arguing using a convention is a bad idea because some people won't be aware of it and accidentally will show comments not relevant to the outside world, yet I have seen no examples of this confusion in the Go community.

Job van der Zwan

Aug 29, 2014, 8:42:57 AM
to julia...@googlegroups.com
As an interaction designer I think there could also be another way to approach this problem - from the user's point of view.

I'd say there's two users to keep in mind here:

- the person who writes the documentation
- the person who reads the documentation

We need to ask: when will the users want to do this, and how?

In the reader's case, there are plenty of places. When looking up what a function does in IJulia, nice formatting and searchability a la Hoogle seem valuable. Reading the source code of a .jl file directly is also an option. There we want a convention that ensures the documentation is nicely laid out among the code, so comment-based documentation with a convention that guides good code commenting (a la Go) makes more sense.

In the writer's case, perhaps in the REPL? Then a macro attaching documentation to an existing function makes sense. In a .jl source file? In that case, I think it makes more sense to have the documentation right next to the thing being documented, following a sensible convention for readability.

Andreas Lobinger

Aug 29, 2014, 9:07:53 AM
to julia...@googlegroups.com
Hello colleague,

thank you for your post. I was about to write something similar, as I see the discussion (again) turning to technical topics and things that might happen in the future.

Instead (imho), we need to write down a list of problems the documentation system should address and then find a method to put information into or near code with low or zero overhead for the person writing it.


Steven G. Johnson

Aug 29, 2014, 9:15:43 AM
to julia...@googlegroups.com
 
> Is the constant publicly visible? Then it's probably top level, so a comment directly above it works. That's what Go does.

This would require a separate comment-processing pass, which is possible but suboptimal (because unlike Go, Julia is a dynamic language and generally has no discrete "compilation pass").   Stefan's suggestion of just storing the comments in the AST for the function doesn't apply here, because constants don't store an AST.

Magnus Lie Hetland

Sep 9, 2014, 6:53:40 AM
to julia...@googlegroups.com, ja...@jasonknight.us
I have some level of eye-bleed from this and several other suggestions, too. The look of comments is part of the language design, and they are (IMO) unobtrusive yet visually distinct -- and (for the single-line ones, at least) highly unsurprising and conventional. All of which I think is good. I'd *very much* prefer a solution that simply used the last comment before a method as the documentation for it, and having the convention of using Markdown in them, as Stefan argues.

There's talk about using Julia instead of some other language for more complex comment stuff. I guess that depends on what you want to use them for (or if you really want general metadata, rather than documentation). For marking up documentation text, I think a markup language is a good choice. For documentation comments, I think comments are a good choice ;-)

However, if one wants more programmability, would it be possible to treat comments as a special form of string literals in themselves (like docstrings), using the existing syntax? I'm assuming they'd just be eliminated from the compiled code, but would be available in the AST. Then one could use the existing Julia syntax for substituting values into the documentation, like:

# This is a comment. 1 + 2 = $(1 + 2)

I'm not sure I'd have any use for the extra programmability, and it doesn't mean that the comment/string could end up as anything other than a string. (There are, I guess, lots of suggestions for handling the latter issue already.)

But, yeah, I wholeheartedly agree with Stefan in that we "don't need flexible documentation – we need one simple documentation system that works."

Ross Boylan

Sep 9, 2014, 1:36:08 PM
to julia...@googlegroups.com
How would documentation handle type information for the arguments to a
method? There are 3 possible sources: the comments, the text of the
function arguments (e.g. someArg::FooType), and the compiler.

The ::FooType notation will not always be present. The comments could
just be wrong. So it seems there's an argument for getting this info
from the compiler, which is perhaps an argument in favor of the
"comments as AST metadata" approach.

Also, even if the argument is declared ::FooType it may be that only
some subtypes are permitted because of the way the argument is used in
the body of the function.

For various purposes one might be interested in abstract types,
concrete types, or both.

An even messier question is which concrete types could actually be
used, or are actually used in a particular run.

Ross Boylan

Steven G. Johnson

Sep 11, 2014, 10:24:38 AM
to julia...@googlegroups.com


On Tuesday, September 9, 2014 1:36:08 PM UTC-4, Ross Boylan wrote:
> How would documentation handle type information for the arguments to a
> method?  There are 3 possible sources: the comments, the text of the
> function arguments (e.g. someArg::FooType), and the compiler.

Also handled by my proposal.    Documentation that is specific to the types of the arguments would be attached to the Method object in Julia, and Method objects include information about the argument types etc.    (Documentation that is generic to all methods of a particular function would be attached to the Function object).

I really think that embedding documentation in comments is a mistake.  Documentation needs to be a part of the language, included in a semantically meaningful way, not an add-on that is semantically meaningless (comments). 

Rafael Fourquet

Sep 11, 2014, 12:12:44 PM
to julia...@googlegroups.com
> Then one could use the existing Julia syntax for substituting values into the documentation, like:
>
> # This is a comment. 1 + 2 = $(1 + 2)

I don't have a strong opinion on this topic, but I really don't understand why this is better than using directly:

@doc "This is a doc string: 1 + 2 = $(1+2)"

What is lost when using a string compared to comments?

Ivar Nesje

Sep 11, 2014, 12:23:54 PM
to julia...@googlegroups.com
If code in a comment has side effects, the whole thing seems like a very bad idea, because code that was commented out might be executed.

Jason

Sep 11, 2014, 12:38:52 PM
to julia...@googlegroups.com
> I don't have a strong opinion on this topic, but I really don't understand why this is better than using directly:
>
> @doc "This is a doc string: 1 + 2 = $(1+2)"
>
> What is lost when using a string compared to comments?

The primary difference to me is the ease of expansion that comes along with comments. Ie, if there is already one line of comments, I just hit newline in my editor, and it automatically adds a new comment leader and I can add an additional thought to the comment. Whereas with strings and markup, I have to go and tidy up the ending quotations.

But if we're proposing modifying the parser anyways, then we could get the best of both worlds with a doc keyword that is line terminated or block terminated just like the let keyword:

doc begin
  lots of documentation, blah
  blah
end

also one lined:

doc Foo is a function with ....

and with optional arguments for specifying documentation type:

doc rst Using some restructuredtext here.

Steven G. Johnson

Sep 11, 2014, 2:00:41 PM
to julia...@googlegroups.com, ja...@jasonknight.us

> The primary difference to me is the ease of expansion that comes along with comments. Ie, if there is already one line of comments, I just hit newline in my editor, and it automatically adds a new comment leader and I can add an additional thought to the comment. Whereas with strings and markup, I have to go and tidy up the ending quotations.

Given triple-quoted string literals, I don't see what the problem is.   Why is it hard to insert a new line in:

   doc """
       blah blah
        blah
   """
   function foo(...)
     ...
   end

 
> But if we're proposing modifying the parser anyways, then we could get the best of both worlds with a doc keyword that is line terminated or block terminated just like the let keyword:
>
> doc begin
>   lots of documentation, blah
>   blah
> end

Why is begin...end better than """....""" ?
 

Jason

Sep 11, 2014, 4:10:25 PM
to julia...@googlegroups.com
> Why is begin...end better than """....""" ?

For block documentation they are equivalent, but the triple quotes are heavy for lots of single line comments. Eg: look at the average comment length of this randomly chosen Haskell source file: http://hackage.haskell.org/package/vector-0.6.0.2/docs/src/Data-Vector.html

But in the end, it's just bikeshedding over style at this point. It looks like most of us are in agreement about:
  1. Coupling the documentation to the AST
  2. Documentation being markup agnostic/flexible 
Now it's just a matter of syntax. Which ironically can sometimes derail entire language features for years at a time. Eg: better record syntax (https://ghc.haskell.org/trac/ghc/wiki/Records) in Haskell has been in the need of the right syntax (although semantics are a hangup there as well) for many years.

Perhaps we need a temporary BDFL (http://www.wikiwand.com/en/Benevolent_dictator_for_life) for Julia to just make an arbitrary (good) decision and get us all into the glorious days of documented packages.

Leah Hanson

Sep 11, 2014, 4:18:12 PM
to julia...@googlegroups.com
If I understand correctly, Docile.jl is a macro-based implementation of SGJ's suggestion, right? So if we're in agreement about non-comment-based documentation, we could start using that now, and later switch from "@doc" to the keyword "doc" when it's implemented.

Are any packages documented with Docile? That would be a good illustration of how well this works.

-- Leah

Tim Holy

Sep 11, 2014, 4:56:35 PM
to julia...@googlegroups.com
I hadn't looked at Docile in a long time, and from the commit history clearly
there has been a lot of recent development.

Based on a very brief look, I'd say it's so much better than what we (don't)
have that, as long as Michael says he's committed to continuing its
development, I'd favor merging it to base in the next 30 seconds or so.

Seriously. This has dragged on so long, let's go for it. Docile looks very
good, and as we discover we need more features or changes in behavior, we can
do it.

--Tim



On Thursday, September 11, 2014 03:17:45 PM Leah Hanson wrote:
> If I understand correctly, Docile.jl is a macro-based implementation of
> SGJ's suggestion, right? So if we're in agreement about non-comment-based
> documentation, we could start using that now, and later switch from "@doc"
> to the keyword "doc" when it's implemented.
>
> Are any packages documented with Docile? That would be a good illustration
> of how well this works.
>
> -- Leah
>
> On Thu, Sep 11, 2014 at 3:10 PM, Jason <Ja...@jasonknight.us> wrote:
> > Why is begin...end better than """....""" ?
> >
> >
> > For block documentation they are equivalent, but the triple quotes are
> > heavy for lots of single line comments. Eg: look at the average comment
> > length of this randomly chosen Haskell source file
> > <http://hackage.haskell.org/package/vector-0.6.0.2/docs/src/Data-Vector.html>.
> >
> > But in the end, it's just bikeshedding over style at this point. It looks
> > like most of us are in agreement about:
> > 1. Coupling the documentation to the AST
> > 2. Documentation being markup agnostic/flexible
> >
> > Now it's just a matter of syntax. Which ironically can sometimes derail
> > entire language features for years at a time. Eg: better record syntax
> > <https://ghc.haskell.org/trac/ghc/wiki/Records> in Haskell has been in
> > the need of the right syntax (although semantics are a hangup there as
> > well) for many years.
> >
> > Perhaps we need a temporary BDFL
> > <http://www.wikiwand.com/en/Benevolent_dictator_for_life> for Julia to

Michael Hatherly

Sep 11, 2014, 5:19:59 PM
to julia...@googlegroups.com
Docile.jl author here,

When I began writing it I had some of Steven's ideas in mind from one of the earlier discussions here (or perhaps the GitHub issues list).

I had initially thought of following Go's use of comments above code objects to document them, but that doesn't allow for interpolating data from the module into the docstrings, which I believe Stefan had suggested at some point. Doing this allows you to programmatically generate docstrings such as when generating functions using `for` and `@eval` loops. You wouldn't be able to do this with comments in their current form and I'd think it wise to just leave comments as they are.
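
A small sketch of what generating docstrings alongside functions in a `for`/`@eval` loop can look like. It assumes the `@doc` macro that present-day Julia provides in Base rather than Docile's own macro, and the function names `double` and `triple` are invented for the example.

```julia
for (name, k) in ((:double, 2), (:triple, 3))
    # Build the docstring from loop data before splicing it into the definition.
    docstring = string("`", name, "(x)` multiplies `x` by ", k, ".")
    @eval begin
        # Generated definition, e.g. double(x) = 2 * x
        $name(x) = $k * x
        # Attach the generated text to the generated function.
        @doc $docstring $name
    end
end

results = (double(4), triple(4))
```

The docstring is an ordinary Julia value computed at definition time, which is the part that has no direct counterpart in a comment-only scheme.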

I'd be very much in favour of a `doc` keyword rather than the current macro I'm using, but it's got me surprisingly far.

Docile is, for the most part, self-documenting. Unfortunately, `@doc` itself can't be documented using `@doc`, though perhaps there's some way around that. Spencer Lyon mentioned recently his interest in using it for one of his own packages.

-- Mike

Michael Hatherly

Sep 11, 2014, 5:28:39 PM9/11/14
to julia...@googlegroups.com
I am committed to continuing work on this, though other work can limit the amount of time I have. There are still some rough edges, and I'm not sure how to overcome some difficulties such as `@doc` not being able to document itself.

-- Mike

Leah Hanson

Sep 11, 2014, 5:37:38 PM9/11/14
to julia...@googlegroups.com
Could you manually add the `@doc` documentation to the `__METADATA__` object? The macro edits a variable, which you should be able to do outside the macro as well, right?

-- Leah

Michael Hatherly

Sep 11, 2014, 5:58:49 PM9/11/14
to julia...@googlegroups.com
I'm doing it using an internal macro `@docref` [1] to track the line number and then in `src/doc.jl` I store the documentation in `__METADATA__` along with the line and source file found using `@docref`. A bit hacky, but it's only for a couple of docs.

[1] https://github.com/MichaelHatherly/Docile.jl/blob/c82675d4a39932d1d378e954844018cefc091858/src/Docile.jl#L14-L17
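
A rough sketch of the idea discussed above, not Docile's actual code: a macro that stores a docstring together with the file and line of its call site, plus a manual entry written without the macro. The names `REGISTRY`, `store!` and `@record` are hypothetical, and the sketch assumes a Julia version in which macros can read their call site through `__source__`:

```julia
# Hypothetical registry; Docile's real `__METADATA__` layout may differ.
const REGISTRY = Dict{Symbol,Any}()

# Store one entry: the docstring plus where it was written.
function store!(name::Symbol, text, file, line)
    REGISTRY[name] = (text = text, file = file, line = line)
end

# Record a docstring for `name`, capturing the call site of the macro.
macro record(name, text)
    file = string(__source__.file)
    line = __source__.line
    return :(store!($(QuoteNode(name)), $(esc(text)), $file, $line))
end

@record square "Return `x^2`."

# The same kind of entry can be added by hand, without going through the macro,
# which is one way to document the documentation macro itself.
store!(:record, "Record a docstring for a name.", "unknown", 0)
```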

Leah Hanson

Sep 11, 2014, 6:06:56 PM9/11/14
to julia...@googlegroups.com
Oh, I missed that. That's totally the approach I would take, and I don't really see it as a problem to use a separate channel to document the documentation functions/macros. It seems like a messiness related more to bootstrapping (documenting using the system you're writing) rather than a design problem.

I guess the need to document @doc goes away if it becomes the keyword doc, since you would need some separate way to document keywords (if you were going to do that) anyway.

-- Leah

Michael Hatherly

Sep 11, 2014, 6:35:42 PM9/11/14
to julia...@googlegroups.com
Yeah, that's how I had been rationalising it to myself. I'm glad it wasn't just me.

-- Mike

Francesco Bonazzi

Sep 12, 2014, 11:29:36 AM9/12/14
to julia...@googlegroups.com
Given my experience with Python's docstrings, I wish there could be an easy way to execute the docstring examples inside the REPL, especially when they are of the form:

"""
    >>> some-code
    return-value
"""

If the documentation is an object, and not a string, the return-value could be generated together with the documentation, or otherwise have its correctness tested.

Another great headache in Python is debugging the documentation examples, since the code-to-string-to-code conversions lose the trace of the debugger. I would prefer the documentation (especially the documentation examples) to be part of the program's AST, rather than a string.

Anyways, it would be nice to be able to execute the documentation examples from the REPL, and possibly also get the AST of the documentation examples inside the REPL.
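
A minimal sketch of what checking such examples could look like, assuming examples are written with a `julia> ` prompt followed by one line of expected output. The helper `check_examples` is hypothetical and is not part of Julia or any package mentioned here; it parses each example into an AST and evaluates it:

```julia
# Hypothetical helper: run every `julia> ` example found in a docstring and
# compare the printed result with the line that follows it.
function check_examples(doc::AbstractString, mod::Module = Main)
    prompt = "julia> "
    lines = split(doc, '\n')
    results = Bool[]
    for i in 1:length(lines)-1
        line = strip(lines[i])
        startswith(line, prompt) || continue
        code = line[length(prompt)+1:end]   # drop the prompt
        expected = strip(lines[i + 1])      # the next line holds the expected output
        value = Core.eval(mod, Meta.parse(code))
        push!(results, repr(value) == expected)
    end
    return results
end

doc = """
    julia> 1 + 2
    3

    julia> 2^2
    5
"""

# One Bool per example; the second expectation is deliberately wrong.
check_examples(doc)
```

The sketch still goes through a string, so it shares the code-to-string-to-code problem described above; it only shows that the examples can be turned back into expressions with `Meta.parse`.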

Michael Hatherly

Sep 12, 2014, 8:26:18 PM9/12/14
to julia...@googlegroups.com
Hi Francesco,

Docile.jl partially covers what I think you're wanting out of your docstrings, namely testing examples for correctness. I've been thinking about exporting the docstrings to ijulia notebooks which might provide a more interactive experience in some cases.

Running examples from the REPL can be done with a quick copy/paste from the docs that show up when using `@query`.

I've not used Python much, or its docstrings, so if you've got ideas that you'd like to see here, feel free to email me or open issues at https://github.com/MichaelHatherly/Docile.jl.

-- Mike

Stefan Karpinski

Sep 13, 2014, 5:18:54 AM9/13/14
to Julia Users
On Thu, Sep 11, 2014 at 4:24 PM, Steven G. Johnson <steve...@gmail.com> wrote:
I really think that embedding documentation in comments is a mistake. Documentation needs to be a part of the language, included in a semantically meaningful way, not an add-on that is semantically meaningless (comments).

Either the documentation affects the meaning of the code or it doesn't. Type annotations give us a lot of functional documentation, with the advantage that it can't be out of sync with the code since it *is* the code. I'm in favor of having as much of this kind of "documentation" as possible.

There will always, however, be more documentation that doesn't affect the behavior of the code. I'm not even sure what it means for that kind of documentation to be "semantically meaningful" – you mean that it has a specified format that means something? I'm not sure why that has anything to do with whether this is in comments or not. How is a string that you parse and associate with a function more or less meaningful than a comment that you parse and associate with a function? To me the only difference is that I really don't want to write

@doc """
commentary
"""
function ...

whereas I already write things along the lines of

# commentary
function ...

all the time. All the extra syntax makes this kind of documentation feel heavy and awkward instead of light and natural. In my experience, heavy, awkward things tend not to get used while light natural things tend to get used.

Rafael Fourquet

Sep 13, 2014, 6:03:34 AM9/13/14
to julia...@googlegroups.com
> To me the only difference is that I really don't want to write
>
> @doc """
> commentary
> """
> function ...
>
>
> whereas I already write things along the lines of
>
> # commentary
> function ...

doc "function doc"
function ...

is already better, and then let's get rid of even the doc keyword. It would also be less of a breaking change, as currently comments are mainly written for developers' consumption and are not meant for documenting the public API, so they would all need to be fixed at once. As both developer comments and API documentation are needed, I find it useful to have two distinct means: comments and strings.

Mauro

Sep 13, 2014, 9:11:14 AM9/13/14
to julia...@googlegroups.com
How about using a colon at the end of a doc string? It would signify
that the string belongs to the object following and it is light on the
eye.

This would look like:

"Function documentation, blah":
f(x) = 2x

and

"""
A longer function documentation.
- blah
- blah
""":
f(x) = 2x

(Also, maybe the syntax could require for the documented object to
follow without an empty line.)
--

Mauro

Sep 13, 2014, 9:24:29 AM9/13/14
to julia...@googlegroups.com
And a colon in front of the string might work too for one-liners:

x = 5 :"this is #5"
f(x) = 2x :"double me"

Others will have to comment on whether that would not conflict with
other colon syntax.

Francesco Bonazzi

Sep 15, 2014, 3:29:08 AM9/15/14
to julia...@googlegroups.com


On Saturday, September 13, 2014 2:26:18 AM UTC+2, Michael Hatherly wrote:
Hi Francesco,

Docile.jl partially covers what I think you're wanting out of your docstrings, namely testing examples for correctness. I've been thinking about exporting the docstrings to ijulia notebooks which might provide a more interactive experience in some cases.


Docile.jl looks great, but I think that the API should be made into comments. One of Julia's goals is to have a simple syntax that even people who are not acquainted with programming can easily understand.

I still think that the best solution would be to create a new AST object to handle blocks of comments tagged as documentation. Maybe blocks of comments starting with # followed by a special character. I also think that documentation examples/tests should also be parsed by Julia's parser as valid AST objects, that is blocks containing input code and expected answers.

I believe that a tagged comment is much more readable than a block introduced by @doc or doc.

Possible examples:
##
# docstring
#
# Metadata {
# key1 => value1
# }
#
# Examples
# ========
#
# julia> f(3)
# 9
#
function f(x)
    x^2
end

or another idea:

#=
docstring
for square number [references.wikip]

.metadata {
  key1 => value1,
  key2 => value2,
  ...
}

Examples
========

julia> f(3)
9

.references (
  wikip: https://en.wikipedia.org/wiki/Square_number
)
=#
function f(x)
    x^2
end




Rafael Fourquet

Sep 15, 2014, 3:39:19 AM9/15/14
to julia...@googlegroups.com
Docile.jl looks great, but I think that the API should be made into comments. One of Julia's goals is to have a simple syntax that even people who are not acquainted with programming can easily understand.

Python, despite using docstrings, is a great example of a language having "a simple syntax that ...  understand"
 
I believe that a tagged comment is much more readable than a block introduced by @doc or doc.

"Much more readable" is maybe a bit exaggerated, can you explain why you believe so?

Francesco Bonazzi

Sep 15, 2014, 4:48:52 AM9/15/14
to julia...@googlegroups.com


On Monday, September 15, 2014 9:39:19 AM UTC+2, Rafael Fourquet wrote:
Docile.jl looks great, but I think that the API should be made into comments. One of Julia's goals is to have a simple syntax that even people who are not acquainted with programming can easily understand.

Python, despite using docstrings, is a great example of a language having "a simple syntax that ...  understand"

The problems of Python docstrings:
  • no standardization on formatting (Markdown vs others).
  • loss of debugger trace while testing doctests and/or docexamples (this is due to the code-to-string-to-code conversion).
  • documentation examples are usually not recognized as code by IDEs (e.g. PyDev in Eclipse), so you don't get completions and code analysis.
  • I would prefer the documentation to precede the function declaration.
 
I believe that a tagged comment is much more readable than a block introduced by @doc or doc.

"Much more readable" is maybe a bit exaggerated, can you explain why you believe so?

Take an example from Docile.jl

module PackageName

using Docile
@docstrings # Call before any `@doc` uses. Creates module's `__METADATA__` object.

@doc """
Markdown formatted text appears here...

"""
{
    # metadata section
    :section => "Main section",
    :tags    => ["foo", "bar", "baz"]
    # ... other (Symbol => Any) pairs
} ->
function myfunc(x, y)
    # ...
end

@doc "A short docstring." ->
foo(x) = x

end

The @doc keyword looks confusing to me. If you get used to it, it's OK, but such a syntax steepens the learning curve to get into Julia, as a new user (maybe who's not very acquainted with programming) may find it difficult at first to distinguish the documentation code from the algorithmic code. I believe that documentation should be a clearly bounded block, possibly resembling a comment, yet it should contain substructures corresponding to special AST objects.

In Docile.jl, the @doc macro does not define a "clearly visible bounded block". I would prefer Julia to introduce a new kind of block formatting for documentation.
 

Michael Hatherly

Sep 15, 2014, 11:02:49 AM9/15/14
to julia...@googlegroups.com

Thanks for having a look at Docile.jl. I’ll try to explain some of
my decisions regarding its design.

Introducing a new AST object:

Having something built into the language would be great and I’d
definitely support that. There’s been some discussion, I think, about
doing that - probably in this thread or another from earlier, perhaps on
GitHub.

I needed something that works now and wouldn’t need changes to any
internals, hence the use of macros for this.

Using comments rather than strings:

A much earlier version of Docile.jl did do this, in part, but what I
found was that since comments get discarded during evaluation of the
code this required a separate run to capture documentation. I may have
missed a few tricks to get that to work though.

Another gripe I have about using comments: unless you use some special
syntax/tag (such as the extra # character) to denote what is actually
documentation and what isn’t then commenting out code temporarily
could cause problems if docstrings are parsed by Julia “as valid AST
objects”. Using a docstring “tag” to avoid this would work I think, so
long as it visually distinguishes plain comments from docstring
comments.

Metadata in the docstring/comment:

##
# docstring
#
# Metadata {
# key1 => value1
# }
#
# Examples
# ========

An earlier version I had did embed the metadata directly into the
docstring using the YAML.jl package. Switching to an actual Dict
simplified the code and also makes it easier to generate metadata
programmatically if necessary.

Readability of @doc:

I think that this probably just comes down to personal preference for me - I’ve not done an extensive comparison between different syntaxes.

@doc introduces a docstring and seems pretty straightforward to me. It
explicitly states that what follows is documentation. That example from
Docile.jl could probably do with some simplifications since that metadata
section looks terrible if I’m honest. Something like the following might be
better as an initial example:

module PackageName

using Docile
@docstrings # must appear before any `@doc` calls

@doc """

Markdown formatted text goes here...

""" ->
function myfunc(x, y)
    x + y
end

end

And then leave introducing metadata until after this since I’ve found
metadata to not be needed for every docstring I write.

I’m not sure about the “clearly visible bounded block” though, what in
particular could be clearer? I’m asking since I’ve been staring at these
for a while now and have become quite accustomed to them.

— Mike

Steven G. Johnson

Sep 15, 2014, 12:19:33 PM9/15/14
to julia...@googlegroups.com
On Saturday, September 13, 2014 5:18:54 AM UTC-4, Stefan Karpinski wrote:
There will always, however, be more documentation that doesn't affect the behavior of the code. I'm not even sure what it means for that kind of documentation to be "semantically meaningful" – you mean that it has a specified format that means something?

If you can access the documentation from the code that creates it, without running any separate documentation-processing step, then it is semantically meaningful (and can affect the behavior of the code if the code so chooses).  Also, our documentation will necessarily be tied to the semantics of the language -- for example, it can be tied to a Function in general or to a Method (as opposed to semantics-unaware documentation systems where functions can only be indexed by name), and I still have yet to see a clean way to do this with comments.
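
As a small illustration of tying documentation to a Method rather than to a name, here is a sketch that keys a store by function and argument signature. The `METHOD_DOCS` dictionary and the `area` function are hypothetical and are not taken from any existing package:

```julia
# Hypothetical store keyed by (function, argument signature), so each method
# of a generic function can carry its own documentation.
const METHOD_DOCS = Dict{Tuple{Function,Type},String}()

area(r::Real) = pi * r^2
area(w::Real, h::Real) = w * h

METHOD_DOCS[(area, Tuple{Real})] = "Area of a circle with radius `r`."
METHOD_DOCS[(area, Tuple{Real,Real})] = "Area of a `w` by `h` rectangle."

# The documentation is ordinary data, reachable from the code that created it.
METHOD_DOCS[(area, Tuple{Real,Real})]
```

A system that indexes only by the name `area` would have to merge these two entries or pick one of them.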

Furthermore, I like having a clean separation between code comments (meant for programmers reading the source) and documentation (meant for users not looking at the source).

But my biggest problem with using documentation in comments remains the lack of flexibility in formatting, metadata, etcetera.  You're going to end up with either an inflexible system that is not easily extensible later to include richer information, or you're going to end up inventing your own mini-language for the docstrings (see the multiple PEPs on docstring formatting and metadata).  We already have a pretty good language; why not use it?

Would you want your choices of what can be represented in the documentation/metadata system right now to be dictated by formatting tools and environments circa 2004?  circa 1994?  How do you think Julia users will feel in 2024?  In 2034?  20 years is not that long of a time for a living programming language; 10 years is about the time it takes to become mainstream.

Gray Calhoun

Sep 15, 2014, 1:36:03 PM9/15/14
to julia...@googlegroups.com
Just to engage in some bikeshedding.... is @doc better than defining doc_str or d_str? The triple quote notation seems like an unnecessary pythonism. doc_str gives:
doc"
Markdown formatted text goes here...
" ->
function myfunc(x, y)
    x + y
end

Gray Calhoun

Sep 15, 2014, 1:37:30 PM9/15/14
to julia...@googlegroups.com
I should add that I'm excited to try out the package as is and successfully document my functions.

Leah Hanson

Sep 15, 2014, 1:48:37 PM9/15/14
to julia...@googlegroups.com
The @doc macro lets you do things that the doc_str can't:

1) Attach to the following method/function/etc. The string just sits there; the macro can do the work to put the string into the documentation. (the doc_str wouldn't be able to see the context around the string it's parsing)
2) Add a metadata dictionary after the doc string.
3) Allow other formats of documentation string (rst, asciidoc, whatever) as long as they implement some interface of functions (likely some writemime methods). Something like `@doc rst" ...rst formatted text"`, where using `doc" text"` would remove the possibility of format tagging via rst_str.

-- Leah

Michael Hatherly

Sep 15, 2014, 2:00:29 PM9/15/14
to julia...@googlegroups.com

Well Leah’s already answered this while I was fighting with my formatting, but here’s mine anyway :)

Welcome to the shed :) I really like that syntax, and if it’s possible to get it to work that would be really nice. The problem is that a @doc_str macro wouldn’t capture the Expr that is being documented. See the dumps below:

julia> dump(quote

       doc"
       Markdown formatted text goes here...
       " ->
       function myfunc(x, y)
           x + y
       end

       end)
Expr
  head: Symbol block
  args: Array(Any,(2,))
    1: Expr
      head: Symbol line
      args: Array(Any,(2,))
        1: Int64 3
        2: Symbol none
      typ: Any
    2: Expr
      head: Symbol ->
      args: Array(Any,(2,))
        1: Expr
          head: Symbol macrocall
          args: Array(Any,(2,))
          typ: Any
        2: Expr
          head: Symbol block
          args: Array(Any,(2,))
          typ: Any
      typ: Any
  typ: Any
julia> dump(quote

       @doc """
       Markdown formatted text goes here...
       """ ->
       function myfunc(x, y)
           x + y
       end

       end)
Expr
  head: Symbol block
  args: Array(Any,(2,))
    1: Expr
      head: Symbol line
      args: Array(Any,(2,))
        1: Int64 3
        2: Symbol none
      typ: Any
    2: Expr
      head: Symbol macrocall
      args: Array(Any,(2,))
        1: Symbol @doc
        2: Expr
          head: Symbol ->
          args: Array(Any,(2,))
          typ: Any
      typ: Any
  typ: Any

@doc_str takes the contents of the string in as an argument (you can pass some flags in as well, see the Regex syntax for examples) and not the Expr appearing after. The @doc macro is defined with varargs, as macro doc(args...), and so can capture everything after it. The trick is the -> which does “line continuation” (or something like that).

The -> also allows Docile to capture line number information of things other than method definitions. If you’ve looked at the generated docs for Docile you’ll see that everything has file and line number information provided.
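
The difference can be sketched with two toy macros. Both `@note_str` and `@capture` are hypothetical names used only for illustration, and `g` is never defined because the macros only inspect their arguments without evaluating them:

```julia
# Hypothetical string macro: it receives only the text between the quotes.
macro note_str(text)
    return string(typeof(text))
end

# Hypothetical ordinary macro: it receives every expression that follows it.
macro capture(args...)
    kinds = [a isa Expr ? string(a.head) : string(typeof(a)) for a in args]
    return join(kinds, ", ")
end

note"some text"               # the macro saw a single String and nothing else
@capture "some text" g(x)     # the macro saw a String and a call expression
@capture "some text" -> g(x)  # the macro saw one `->` expression wrapping both
```

In the last call the `->` joins the string and the expression into a single argument, which is why the documented code can also start on the following line.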

Glad you’re excited. Give me a shout if you run into any issues.

Michael Hatherly

Sep 15, 2014, 2:02:02 PM9/15/14
to julia...@googlegroups.com
Yes, this covers it quite well.

-- Mike

Gray Calhoun

Sep 15, 2014, 5:45:20 PM9/15/14
to julia...@googlegroups.com
Hi Leah, thanks for the explanation. That makes a lot of sense.