Starting the AsciiDoc Specification Journey


Dan Allen

Jan 7, 2019, 4:36:20 PM
to asci...@googlegroups.com
I'm excited to share with you some much-anticipated news.

After numerous calls for an AsciiDoc specification over the past year, it's very clear the community is ready for AsciiDoc to take this step. As we established in previous threads, Lex and I are in agreement. I also reached out to Stuart and we have his support as well. So now's the perfect time to pursue it.

== A new home at the Eclipse Foundation

We all want AsciiDoc to have a strong future and the resources it needs to evolve and grow. To achieve this, I'm planning to submit a proposal for an AsciiDoc language specification to the Eclipse Foundation. The Eclipse Foundation provides a home for developing specifications and is committed to transparency and open source, values that align well with AsciiDoc and its community. Specifically, the Eclipse Foundation Specification Process (EFSP) provides a clear, yet customizable structure that reduces the risk of the process stalling and ensures the outcome will be usable in the real world. The process is public, vendor neutral, and all source materials and final artifacts are open source.

== What will it mean for AsciiDoc to become a specification?

The specification for the AsciiDoc language (which will live at asciidoc.org) will include an open source specification document, which defines all required and optional API definitions, semantic behaviors, data formats, and protocols, as well as an open source Technology Compatibility Kit (TCK) that developers can use to develop and test compatible implementations. (Based on past experience with specifications, I consider an open source TCK to be a hard requirement.) A compatible implementation, as defined by the EFSP, must fully implement all non-optional elements of a specification version, must fulfill all the requirements of the corresponding TCK, and must not alter the specified API.

For users and developers alike, the AsciiDoc specification will mean a clear, working definition of what AsciiDoc is and how it should be interpreted. Developers will be able to build implementations, tools, and services around AsciiDoc without risk of diluting its meaning or splintering it. In turn, users will have more options, greater document portability, and the assurance that compatible implementations and tools will handle their AsciiDoc documents according to a versioned specification.

== What's next?

The next step in creating the AsciiDoc specification is to propose it as a specification project to the Eclipse Foundation. The proposal, which I plan to submit shortly, will be reviewed by the Eclipse Management Organization, then posted for community review and comment. If accepted, the process to define it will begin.

To learn more about the specification process, I encourage you to check out Wayne Beaton's posts
https://blogs.eclipse.org/post/wayne-beaton/eclipse-foundation-specification-process-part-ii-efsp (Part II: the EFSP) and https://blogs.eclipse.org/post/wayne-beaton/eclipse-foundation-specification-process-part-iii-creation (Part III: Creation). You can also find the EFSP documentation at https://www.eclipse.org/projects/efsp/.

With a specification process that can be adapted to suit the needs of the AsciiDoc community, I believe the language will evolve in a sustainable and substantive manner that keeps pace with the community's needs, now and into the future. I'm really excited to get started, and I hope you'll join me on this journey to make AsciiDoc a specification!

Feedback welcome.

Cheers,

-Dan

See related post on the Asciidoctor blog: https://asciidoctor.org/news/2019/01/07/asciidoc-spec-proposal/

--
Dan Allen | @mojavelinux | https://twitter.com/mojavelinux

Lex Trotman

Jan 8, 2019, 2:04:49 AM
to asci...@googlegroups.com
Hi Dan,

Thanks for starting the process.

The idea of developing the specification under the auspices of an
independent entity is a very good one. It will help to make the
process obviously open and independent and allow it to have a life
beyond its initial creators. I have lots of detailed queries about
that, which I will address in another email, because I want to query
something very fundamental that your mail raises and I don't want it
buried in the details.

You mention the TCK, and I agree an unambiguous and preferably
automated way of allowing implementations to demonstrate compliance is
a good thing, but...

It raises the rather basic question of "what are we standardising?"

1. Asciidoc source format, definitely, but how do you test syntax
acceptance in a markup language? In a markup language very
little is illegal; if it's not recognised as markup it's just text, so
nearly any document content is likely to be accepted by the standard,
unlike programming language syntax.

2. Asciidoc markup semantics, definitely, but at what level? E.g. is
*foo* specified as "must be styled bold", to be emphasised (in Docbook
speak), to be strong (in HTML speak), or some other wording that would
tend to indicate that, or is it just a grouping of characters to be
styled in "some" way? In particular, statements like "must be bold"
are presentational and not semantic, and would also mean any styling
customisation is "not standard".

3. Asciidoc output, I see that as having several problems:

3a. which output? html4, xhtml, HTML5, docbook, pdf, epub, man etc,
any output used in the specification means an implementation must
provide that format to demonstrate compliance, or if the standard
tried to specify all outputs it is preventing new targets from being
"standards compliant". For example an implementation that produces
only wonderful pdf books should not have to artificially produce HTML
as well, just to be able to demonstrate compliance which allows it to
say it's compliant.

The CommonMark spec uses HTML, and has just this issue; it simply
doesn't address anything else. But being less formal than a
specification with trademarks managed by an entity like Eclipse
Foundation, that may not matter for it. But a formal entity like a
foundation may find it hard to accept as "compliant and therefore
allowed to use the trademark" something that doesn't pass an automated
test, and that makes it hard to include all outputs that are not
standardised.

3b. Computer language specifications like Java, C and C++ don't
specify the machine code compilers must produce, and an ARM compiler
doesn't have to work for x86 as well, just to claim compliance.

3c. I'm not convinced any of the current implementations have any
output that would be classed as "best practice" for that output, so
that it would be worthy of being standardised. (I am relying on Dan's
GitHub comments about Asciidoctor output; I haven't examined it in
detail, but understand it is similar to Asciidoc Python, which isn't
terribly "best practice".)

And designing a new "standard output" during the standardisation
process without an implementation is risky.

3d. Even for an output that was specified, that output may be used in
different environments that impose constraints, eg HTML could render
to a blog or as part of a site managed by a framework, places where
there may be requirements or limitations on how the HTML is
structured. It doesn't seem sensible to prevent such uses from being
able to claim standards compliance if they accept the whole language.
If such an implementation cannot be standards compliant, there is no
incentive to implement all of Asciidoc, and no incentive to not add
some new markups just to suit their use-case. That way just leads to
fragmentation.

3e. Implementations that use follow-up toolchains may not have the
level of control over the output to exactly match examples in the
specification. Even if they accept the full Asciidoc source, are those
to be condemned as not standards compliant?

3f. What about generated content that is not a direct transcode of
input, such as tables of contents, indexes, section numbers? Which
organisation of those is to be standardised?

To me the point of the standardisation process is to ensure that
markup in a document is interpreted in the same way in all
implementations, and the semantics of that markup are the same, not
its presentation. That's the core of the "Asciidoc is a semantic
markup, not a presentational markup" statement.

So it seems to me that we need to standardise the syntax of Asciidoc
markup and to some extent the semantics, but not the output, and that
unfortunately makes it difficult to generate automated tests. However,
I'm happy to hear solutions.

I guess we better get these ducks in a row[1] before we propose
anything to an organisation like Eclipse.

Cheers
Lex

[1] https://dictionary.cambridge.org/dictionary/english/get-have-your-ducks-in-a-row

Dan Allen

Jan 8, 2019, 7:25:27 AM
to asci...@googlegroups.com
Lex,

Thank you for your support.

As always, you raise thought-provoking questions. If we had answers to all these questions, we'd already be at the end of the specification process (or at least well into it). Proposing a specification isn't about showing up with answers to all these questions. Rather, it's about initiating a process where these questions can be answered. While these are fantastic questions that serve as a springboard for discussion, we're getting ahead of ourselves thinking we need answers first.

The proposal itself is surprisingly simple. It's an intent. It gets the process started and reserves the space and resources for the process to be conducted. Here's an example of one such proposal: https://projects.eclipse.org/proposals/jakarta-ee-nosql

I can directly address the question about what we're standardizing. My intention is to standardize the AsciiDoc language, first and foremost. That is, defining the structure and syntax in which information is being encoded under the name "AsciiDoc". I've always viewed AsciiDoc as having a very well-defined structure, albeit with some gray areas I'm confident we can address. No matter how far we take this, we at least need a language definition, so it's step one.

I'll pull something up from the bottom of your post that fits here:

> To me the point of the standardisation process is to ensure that markup in a document is interpreted in the same way in all implementations, and the semantics of that markup are the same, not its presentation. That's the core of the "AsciiDoc is a semantic markup, not a presentational markup" statement.

100%. This would serve as great raw material for the proposal. We may consider certain outputs to be optional parts of the spec, as I explain below.

I will *speculate* about the answers to some of your other questions as a thought experiment. But nothing I'm about to say here should be construed as an official position until we're into the specification process.

> any output used in the specification means an implementation must provide that format to demonstrate compliance, or if the standard tried to specify all outputs it is preventing new targets from being "standards compliant".
> ...
>
> that unfortunately makes it difficult to generate automated tests

I don't think so.

I believe we can validate the parsing of the language by defining an intermediary, output-agnostic data format (an object notation). If you look at Asciidoctor, this is what the AST is (though too amorphous in its current state). Once you've proven that you've parsed the document and captured all the information, I believe you can claim compliance (at least to interpretation of the language). And that's what the TCK should focus on.
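
As a rough illustration of that idea (not anything the spec or Asciidoctor defines), a TCK check could compare the tree an implementation produces against an expected tree stored as plain data. The node shapes (`type`, `value`, `children`) and the `conforms` helper below are hypothetical, chosen only for this sketch:

```python
import json

# Hypothetical expected tree for the source text "Hello, *world*!".
# The node names and keys are illustrative, not taken from any specification.
EXPECTED = {
    "type": "paragraph",
    "children": [
        {"type": "text", "value": "Hello, "},
        {"type": "strong", "children": [{"type": "text", "value": "world"}]},
        {"type": "text", "value": "!"},
    ],
}


def normalize(node):
    """Serialize with sorted keys so that key order cannot cause a mismatch."""
    return json.dumps(node, sort_keys=True)


def conforms(actual, expected):
    """Return True if the parsed tree carries exactly the expected information."""
    return normalize(actual) == normalize(expected)
```

Because the comparison is on data rather than on HTML or DocBook, an implementation with no converters at all (an indexer, say) could still run such a check.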

That being said, converters to certain output formats are a great example of an optional requirement (at least for the language portion of the spec). We can imagine an implementation that only cares about parsing, such as for use by an indexer. So there's no need to mandate anything more than that.

But it might still be important to specify built-in converters to HTML, DocBook, and perhaps DITA to make the specification functional. I think we can do it in such a way that allows for plenty of experimentation. In other words, the converter would be defined as an API you can implement however you like. An implementation can provide the (optional) built-in converters, or it can just go wild and make its own (with the understanding that those outputs are custom).


> also would mean any styling customisation is "not standard".

Personally, I think roles have already solved this problem. AsciiDoc has proven this simple abstraction allows for tremendous styling customization. I think that will give us a lot of flexibility. (Another idea is to support data- attributes as a direct mapping to the output, which would really open things up.)


> And designing a new "standard output" during the standardisation process without an implementation is risky.

On this matter, I can be specific.

Eclipse doesn't allow a standard to be defined without an implementation. It's a "come with code" specification process, as Wayne details in his posts. So this is not a risk.

Now, back to speculating.

> If such an implementation (output may be used in different environments) cannot be standards compliant, there is no incentive to implement all of Asciidoc, and no incentive to not add some new markups just to suit their use-case.  That way just leads to fragmentation.

Based on what has been said so far, this concern seems contrived.

We get to define the specification in our way, and the Eclipse Specification Process allows for this. So we'll define it in a way that will afford plenty of output flexibility. That's certainly a key requirement to make it useful in the real world.

It's safe to say we're all keenly aware of the risk of fragmentation (having learned from Markdown), so we'll take necessary measures to prevent that situation. This specification is about building a common understanding, not about creating a tool to exclude ideas and innovation.


> What about generated content that is not a direct transcode of input, such as tables of contents, indexes, section numbers. Which organisation of those is to be standardised?

This is likely going to be a follow-up specification that deals more with publishing concerns. I don't see a table of contents and index output as part of the language (certainly the structures they're built from, but not what's generated). This has more to do with how the language gets used. Perhaps an addendum specification down the road can address it.

I want to finish by clarifying that I don't make light of any of these concerns and questions. We just don't have to (and shouldn't) solve them today. I'm targeting the Eclipse Foundation Specification Process because I believe it will provide the platform to facilitate the discussions around these questions and help us address these concerns. AsciiDoc certainly deserves that foundation.

Cheers,

-Dan

Marco Ciampa

Jan 9, 2019, 1:43:36 AM
to asci...@googlegroups.com
Hi people!
On Tue, Jan 08, 2019 at 05:04:35PM +1000, Lex Trotman wrote:
> Hi Dan,
>
> Thanks for starting the process.
>
[...]
>
> To me the point of the standardisation process is to ensure that
> markup in a document is interpreted in the same way in all
> implementations, and the semantics of that markup are the same, not
> its presentation. Thats the core of the "Asciidoc is a semantic
> markup, not a presentational markup" statement.
>
> So it seems to me that we need to standardise the syntax of Asciidoc
> markup and to some extent the semantics, but not the output, and that
> unfortunately makes it difficult to generate automated tests, however
> I'm happy to hear solutions.

This is paramount for a number of reasons and I can add one:

po4a supports asciidoc and a specification of the format (not of the
output) is fundamental for the parsing and extracting of the .po strings
process...

TIA for this GREAT effort!

--


Marco Ciampa

I know a joke about UDP, but you might not get it.

------------------------

GNU/Linux User #78271
FSFE fellow #364

------------------------

Dan Allen

Jan 9, 2019, 2:19:54 AM
to asci...@googlegroups.com
Marco,
 
> po4a supports asciidoc and a specification of the format (not of the
> output) is fundamental for the parsing and extracting of the .po strings
> process...

That's a great point. It's important to have these use cases to think about, because it helps us see the language from different perspectives.

> TIA for this GREAT effort!

Thanks! It's just the beginning, but the most important part of any journey is the first step.

Cheers,

-Dan

--
You received this message because you are subscribed to the Google Groups "asciidoc" group.
To unsubscribe from this group and stop receiving emails from it, send an email to asciidoc+u...@googlegroups.com.
To post to this group, send email to asci...@googlegroups.com.
Visit this group at https://groups.google.com/group/asciidoc.
For more options, visit https://groups.google.com/d/optout.

Eric Raymond

Jan 9, 2019, 10:55:09 AM
to asciidoc
On Tuesday, January 8, 2019 at 2:04:49 AM UTC-5, Lex Trotman wrote:
> It raises the rather basic question of "what are we standardising?"
 
This part is easy.

asciidoc generates XML-DocBook as a lossless translation.  You are standardizing that translation.

Therefore, conformance will be defined by a set of test pairs, one of which is asciidoc source and one of which is XML-DocBook output.  That test set will be accompanied by a narrative description of intent (the standards document itself).

This philosophy has two huge advantages:

1. It is a crisp, well-defined test.

2. It sidesteps all the downstream issues about rendering and presentation. You don't instantly land in trouble when there's a new back end.
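
A minimal sketch of how one such test pair might be checked, assuming the expected and actual DocBook are both available as strings. The `same_docbook` helper and the sample XML are made up for illustration; the comparison uses Python's standard `xml.etree.ElementTree.canonicalize` so that attribute order and indentation don't cause false failures:

```python
from xml.etree.ElementTree import canonicalize


def same_docbook(actual_xml: str, expected_xml: str) -> bool:
    """Compare two XML strings after C14N 2.0 canonicalization.

    strip_text=True drops leading/trailing whitespace in text nodes, which
    ignores indentation but would also hide whitespace differences in mixed
    content; a real test suite might need something stricter.
    """
    return (canonicalize(actual_xml, strip_text=True)
            == canonicalize(expected_xml, strip_text=True))


# Hypothetical expected output for some AsciiDoc source (not from any spec).
expected = '<section id="intro" role="lead"><title>Intro</title><para>Hello</para></section>'

# The same document as an implementation might emit it: different attribute
# order and pretty-printed, but carrying the same information.
actual = """<section role="lead" id="intro">
  <title>Intro</title>
  <para>Hello</para>
</section>"""
```

Here `same_docbook(actual, expected)` treats the two strings as the same document, while a pair whose text content differs would fail the check.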




Jaime Tarrasa

Jan 12, 2019, 5:21:34 AM
to asciidoc
Hi, that is the right path; every format needs a specification.

About submitting it to the Eclipse Foundation: that is a good idea, but I think that it is just the finishing touch, not the goal.

Before dealing with the bureaucracy of the process of the Eclipse Foundation, or ISO, or whatever institution, I would write a specification and test it.
That is, you write a specification, and developers have doubts and ask questions. Then you revise the specification: Isn't that point clear enough? Should I add examples? How should this ambiguity be solved? Maybe this was not as evident and obvious as I thought.

Some day, there will barely be any serious doubts or questions, and all of them will be clearly resolved by pointing to a section of the specification. Then you can begin to think about the process of submitting to any institution if you want to, but it is not necessary if there is an authoritative site where a clear specification can be read. E.g. the YAML format is defined here: https://yaml.org/spec/ and didn't need the Eclipse Foundation; that would be just a bonus.

Lex Trotman

Jan 12, 2019, 6:30:29 AM
to asci...@googlegroups.com
Dan and all,

I have taken some time to consider. Whilst there are many advantages
to operating under the auspices of an organisation like Eclipse, I see
the following practical problems with that particular one:

1. Eclipse specifications must be under the auspices of a working group,
but which one? Those listed at https://www.eclipse.org/org/workinggroups/explore.php
do not seem appropriate, and creating a new one is an added workload on
a community with limited resources.

2. The Eclipse standards process requires an implementation and
automated tests, appropriate for the Java-related APIs that they mostly
standardise, but less so for a markup language. Many of the questions
I asked in my previous reply show that defining limited fixed
translations constrains rather than encourages the growth of
implementations. And the suggested dumped format for the
parsed tree just adds another (otherwise useless) output format that
implementations must provide.

3. For any contribution to be accepted an ECA (Eclipse Contributor
Agreement) must be provided by the contributor. This is understandable
for managing contributions from
competing corporations that Eclipse normally wrangles, but is an
unacceptable impediment to the process of developing the specification
for a small community that isn't heavily related to Eclipse.

Even if we use GitHub (which I believe we must to get maximum
contributions; few Asciidoc users are watching Eclipse projects) the
Eclipse GitHub hooks will complain about merging pull requests from
contributors without ECAs. However an ECA is a signed document and
this is an unacceptable impediment to contribution by many people,
especially writers, the actual people we want contributions from.

It is inappropriate to place barriers in the way of contribution to the
initial specification process. And it is also an issue for the
contributions to the TCK.

These problems I believe make it unsuitable to develop the
specification under Eclipse.

However the post by Jaime Tarrasa provides a possible answer: initial
development should take place outside the Eclipse Foundation, and then
it can be made available to the Eclipse formalization process.

Or a different organisation should be investigated.

Cheers
Lex

BTW I notice a "News" item on the Eclipse website that points to the
Asciidoctor blog item on the specification. As there is not yet any
agreement about using that organisation, it is premature for them
to be unilaterally posting news articles that suggest it's a fait accompli.

Jaime Tarrasa

Jan 12, 2019, 6:46:39 AM
to asciidoc
That is the point I wanted to state.

Just register a domain, e.g. asciidoc-spec.org or something similar, and start writing the specification, even just a small introduction pointing to a GitHub repository with the specification. You, the people who are involved in AsciiDoc (i.e. Asciidoctor and asciidoc-py): if you want to register a legal foundation in the USA, or Germany, or wherever you are more comfortable, do it. That foundation is a good idea and I think that, in the long term, it is necessary. But now, the first step is just a place to write a draft specification in a more or less informal way and start the process of improving and polishing it.

Thomas Beale

Apr 20, 2020, 8:07:00 AM
to asciidoc
Interested to know the current status of this effort. Having worked in standards for over 20y, and also 4y ago converted all our specifications (which are used as de facto standards) to Asciidoctor, and having had some passing experience with Eclipse Foundation, I may be able to contribute an idea or two.

- thomas

Dan Allen

Apr 29, 2020, 7:02:24 PM
to asci...@googlegroups.com
thomas,

That's great to hear. We welcome your participation!

Both the charter for the working group and the language specification project have been proposed (following the EFSP). You can find them here:



To follow along with this effort as it gets underway, I encourage you to join the working group mailing list at https://accounts.eclipse.org/mailing-list/asciidoc-wg. Once the specification process is approved, it will get its own mailing list as well, which is where the technical discussions will happen.

Best Regards,

-Dan


Thomas Beale

May 1, 2020, 1:41:18 PM
to asciidoc
Hi Dan,
I gave up trying to get into the Eclipse site (couldn't manage to create a password combination that worked - tried in two different browsers in two different OSs... ) but anyway, I skimmed the group mission and saw this:

To execute this mission, the Working Group will:

Foster the design and development of the AsciiDoc language specification, which includes the syntax, rules, built-in attributes, Abstract Semantic Graph (ASG), DOM (Document Object Model), API and options, conversion model, referencing system, extension SPI, and runtime-agnostic technology compatibility kit (TCK), all to ensure interoperability and portability of information encoded in AsciiDoc.

We do a lot of exactly this, e.g. see these two specs
You'll see that we do a lot of syntax, diagrams, UML (in the second one), for which we have an OS extractor for MagicDraw. Plus bibtex. You have probably already worked out what you are doing, but feel free to plunder any stylistic, presentation or other ideas. We also manage all these specs (there are about 30) as adoc source + UML + draw.io diags in GitHub with Jira projects to manage change requests in the long term. We publish on the server via GitHub webhooks and scripts. Some governance ideas here.

Again, you've probably got everything worked out, but as we've been doing this for 20y (and 4y with the Asciidoctor publishing set up) I thought I'd at least point to it in case anyone was looking for ideas.

best

- thomas

Dan Allen

May 6, 2020, 4:15:45 PM
to asci...@googlegroups.com
Hi thomas!

We're definitely looking for ideas. We welcome both your participation and experience. Speaking of which...

> I gave up trying to get into the Eclipse site (couldn't manage to create a password combination that worked - tried in two different browsers in two different OSs... )

That's concerning to me. I'll reach out privately to find out what's going on there, and to make sure you can get your account activated and subscribed.


> We do a lot of exactly this, e.g. see these two specs

Those are precisely the resources (and expertise) we're looking for. Once the mailing list for the spec project is created (upon approval of the spec proposal), I encourage you to post this information there too.
 
> You have probably already worked out what you are doing

Not by a long shot. There's still loads to figure out. It will no doubt be a learning process for me and others.

You may have noticed that the spec proposal doesn't include a specification draft. That's because, per the Eclipse Foundation guidelines, all assets must be created within the process in an open manner. I consider this a very good thing because we maintain full transparency in the development of the specification, and everyone who wants to take part has the opportunity to do so. The Eclipse Foundation is extremely adamant about following an open process (their reputation depends on it, as does ours).
 
> but feel free to plunder any stylistic, presentation or other ideas.

We're happy to use these as references, but will be even more so if you yourself bring them forward. With your experience, your participation could prove to be crucial to the success of the group and mission.

Best Regards,

-Dan
