Non-Twine HTML: a bug, and a treaty change

Dan Fabulich

Dec 27, 2021, 8:53:01 PM
to Babel-IF
Hi! I'm Dan Fabulich, maintainer of ChoiceScript, an HTML-based IF system. I'd like to begin considering what it would be like to integrate ChoiceScript with Babel.

I wanted to raise two issues here: first, I wanted to call your attention to a bug, which I propose to fix, and I invite your feedback on how to do that. Second, I want to propose a change in the Treaty to support non-Twine HTML, and I invite your feedback on how to handle that.

1. Twine format bug: format "twine" should be format "html"

The babel tool added Twine support in October of 2020. Twine was added as a signatory in revision 10 of the Treaty of Babel in Jan 2021. I figured I'd follow in Twine's footsteps, since Twine generates playable HTML files, but I ran into an issue, which I filed on the babel-tool repository. https://github.com/iftechfoundation/babel-tool/issues/24

According to the Treaty https://babel.ifarchive.org/babel_rev10.html#format, there are 13 valid <format> values:

> zcode, glulx, tads2, tads3, hugo, alan, adrift, level9, agt, magscrolls, advsys, html, executable

But when you run the "babel" tool on a Twine file, it returns the format "twine," which isn't on this list.

I think it may have been the intention of the signatories to add twine to that list. (Was it?) But I think that's not actually desirable.

HTML, like Z-code, is an output format, which multiple dev systems can generate. ZIL, Inform, Dialog, etc. can all make Z-code files. Similarly, Twine can make HTML, but so can ChoiceScript, Texture, Undum, and Gruescript.

Instead, I think that the Babel tool should be updated to return "html" as the format of Twine files. (I'd be happy to file a PR to that effect.)

Does that sound right to y'all?

2. Adding support for non-Twine HTML by adding metadata to HTML

Now, supposing I did that, I'd then want there to be a way for the babel tool to support HTML files in general, rather than just Twine files specifically.

The way the babel tool currently detects Twine files is by looking for distinctive HTML in each file. Twine 2.x files always include a particular custom element, <tw-storydata>, and the Twine HTML generator adds an "ifid" attribute to that element. 

I think it would be more typical for HTML to self-identify using HTML-based metadata standards. For example, we could standardize on a <meta> tag, like this:

<meta property="ifiction:ifid" content="247603CE-784D-4404-90F5-5CE6356963F7">

I'm happy to file a PR (maybe combining it with my bugfix PR) to make an "html.c" format detector, which would look for a <meta> tag like the one above for the purposes of detecting IFIDs. (It would continue to search for the <tw-storydata ifid> attribute, for backwards compatibility.)

For non-Twine HTML that doesn't include the <meta> tag, running "babel -format" currently says "Format: executable (non-authoritative)," and returns the MD5 as the IFID. I think I'd update it to say "Format: html (non-authoritative)" for HTML that doesn't contain the <meta> tag.

I'd also file a PR (is that possible?) to amend the Treaty to recommend the <meta> tag for HTML.

Does that sound right to y'all?

3. We could use JSON-LD, but I propose we don't

Alternatively, if we wanted to get fancy, we could include rich iFiction-like metadata right there in the HTML. schema.org defines a "VideoGame" schema https://schema.org/VideoGame and I think it would be straightforward to standardize on a JSON-LD adaptation of iFiction based on that schema. Something like this would appear in the <head> of HTML files that support the Treaty of Babel:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "VideoGame",
  "identifier": {
    "@type": "PropertyValue",
    "propertyID": "IFID",
    "value": "EB12BAF9-37F9-459B-B65B-B32B96DBF254"
  },
  "name": "Birdland",
  "image": "Cover/birdlandcover.jpg",
  "genre": "Teens",
  "description": "Fourteen-year-old Bridget's summer camp experience takes a turn for the bizarre when her otherworldly bird dreams start bleeding into reality.",
  "inLanguage": "en",
  "author": {
    "@type": "Person",
    "name": "Brendan Patrick Hennessy"
  },
  "gamePlatform": "HTML",
  "datePublished": "2015-10-01"
}
</script>

I'm… hesitant to file a PR to do this, though. The babel tool currently parses Twine HTML by regex, which is a small and lightweight implementation, but it is a "code smell," as they say.

Parsing JSON-LD (or RDFa or microdata or whatever) would really require a full-blown HTML parser (to find the JSON-LD) and a JSON parser to parse the JSON.
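To make the cost concrete, here is a rough Python sketch of what JSON-LD extraction might involve. It is not babel-tool code (the tool is written in C), and the names JsonLdCollector and find_jsonld_ifid are hypothetical. It assumes the IFID lives in an "identifier" object with "propertyID": "IFID", as in the example above. Note that it needs both an HTML parser and a JSON parser, even with a standard library that supplies them.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script content can arrive in several pieces, so accumulate it.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._chunks))


def find_jsonld_ifid(html_text):
    """Return the IFID from the first usable JSON-LD block, or None."""
    collector = JsonLdCollector()
    collector.feed(html_text)
    collector.close()
    for block in collector.blocks:
        try:
            doc = json.loads(block)
        except json.JSONDecodeError:
            continue  # strict JSON: e.g. a trailing comma makes the block unusable
        ident = doc.get("identifier") if isinstance(doc, dict) else None
        if isinstance(ident, dict) and ident.get("propertyID") == "IFID":
            return ident.get("value")
    return None
```

In C, without a standard library that bundles these parsers, each of the two steps would mean either a new dependency or a hand-written parser.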

A user @Uzume replied to Zarf's Twine PR last year, suggesting a full-blown HTML parser, and Zarf understandably pushed back:

> Babel-tool is intended to be a no-hassle component, so we are really trying to avoid pulling in dependencies. Yes, it is a truism that hand-parsing HTML is stupid. Even so.

So, I propose just adding support for an IFID <meta> tag, and to find it by regex, the same way the code currently finds the <tw-storydata> element.

Does that sound agreeable?

Dan Fabulich

Dec 28, 2021, 4:58:08 PM
to Babel-IF
Just for discussion purposes, I’ve filed these two PRs.

This PR makes the babel tool return “html” as the format for Twine games. I claim that it’s a bug fix to make the babel tool follow the Treaty. https://github.com/iftechfoundation/babel-tool/pull/25

This PR, stacked atop the other PR, adds support for detecting non-Twine HTML games and for detecting <meta property="ifiction:ifid"> in HTML. This is how I’m proposing to amend the Treaty to support non-Twine HTML. https://github.com/dfabulich/babel-tool/pull/1/files

-Dan

Andrew Plotkin

Dec 28, 2021, 6:53:58 PM
to Babel-IF
On Tue, 28 Dec 2021, Dan Fabulich wrote:

> Just for discussion purposes, I’ve filed these two PRs.
>
> This PR makes the babel tool return “html” as the format for Twine games. I
> claim that it’s a bug fix to make the babel tool follow the Treaty.
> https://github.com/iftechfoundation/babel-tool/pull/25
>
> This PR, stacked atop the other PR, adds support for detecting non-Twine
> HTML games and to detect <meta property="ifiction:ifid"> in HTML. This is
> how I’m proposing to amend the Treaty to support non-Twine HTML.
> https://github.com/dfabulich/babel-tool/pull/1/files

Hey, thanks for bringing this up, and welcome to the list.

I agree with your basic analysis that the format for Twine games should be
"html". The doc is clear that the <format> tag (and what "babel -format"
returns) represents the format of the file, not what tool generated it.
This is HTML which is meant to be run by a web browser.

Fixing it is kind of messy. The structure of the babel source sort of
assumes that a source file, a module, a format, and an IFID generator all
go together. Your PR is just changing the "twine.c" file to "html.c" and
renaming the module. But it's still looking for the <tw-storydata> tag,
and it's generating a "TWINE-{MD5}" IFID when no IFID is recognized.

> I think it would be more typical for HTML to self-identify using
> HTML-based metadata standards. For example, we could standardize on a
> <meta> tag, like this:
>
> <meta property="ifiction:ifid" content="247603CE-784D-4404-90F5-5CE6356963F7">

Seems sensible, but it will be a while before Twine catches up.

(Also, isn't it supposed to be <meta name="x" content="y">? No, I guess
both exist... Oh, and you're really supposed to have
xmlns:ifiction="http://babel.ifarchive.org/protocol/iFiction/" in the
header in order to use ifiction as a prefix. Well that's good to know.)

How about this logic:

- Rename twine module to html as proposed.
- If an HTML file contains <meta property="ifiction:ifid">, use that. If
not, look for <tw-storydata> as the current code does.
- If no IFID is found, generate "HTML-{MD5}" instead of "TWINE-{MD5}".
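
As a rough illustration only, the lookup order in the list above might be sketched like this in Python. The real module is C; the function name and both regular expressions here are hypothetical, the patterns assume the attributes appear in exactly this order with double quotes, and the uppercase MD5 hex digits are an assumption rather than something the Treaty text quoted here specifies.

```python
import hashlib
import re

# Hypothetical patterns for illustration; not the babel tool's actual C code.
META_RE = re.compile(
    rb'<meta\s+property="ifiction:ifid"\s+content="([^"]+)"', re.IGNORECASE)
TWINE_RE = re.compile(
    rb'<tw-storydata\b[^>]*?\bifid="([^"]+)"', re.IGNORECASE)


def find_html_ifid(data: bytes) -> str:
    """Try the <meta> tag first, then <tw-storydata>, then fall back to MD5."""
    for pattern in (META_RE, TWINE_RE):
        match = pattern.search(data)
        if match:
            return match.group(1).decode("ascii", "replace")
    # No IFID found: generate "HTML-{MD5}" (letter case is an assumption).
    return "HTML-" + hashlib.md5(data).hexdigest().upper()
```

A file carrying both the <meta> tag and a <tw-storydata> element would report the <meta> value under this ordering.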

--Z

--
"And Aholibamah bare Jeush, and Jaalam, and Korah: these were the borogoves..."
*

kli...@gmail.com

Dec 28, 2021, 7:20:13 PM
to Babel-IF
I'm on board with the general idea that we should align tools that produce HTML output, but I have two questions.

1. I think it's true that ChoiceScript games are single HTML files like Twine games. But for web-based works that are split up among several HTML files, what should happen? Should each file have a meta tag with the same IFID value? Just the entrypoint?

2. There are two kinds of Twine files: single stories with a single <tw-storydata> element, and archives which have multiple. The idea behind archives is to allow people to import/export stories easier as well as for backup purposes. It's important that the IFIDs are kept intact in the archive format. In that case, what would you suggest should happen? <meta> is only legal in the <head> element, I think.

A footnote that there are several tools out there that compile Twine stories (tweego, extwee, twine-utils, etc) and if we want behavior to change uniformly, we'll want to update the spec here (and folks who maintain these tools may have differing opinions about this proposal): https://github.com/iftechfoundation/twine-specs/blob/master/twine-2-htmloutput-spec.md

Dan Fabulich

Dec 28, 2021, 7:40:41 PM
to Babel-IF
I've merged my stacked PR into the main PR, and I believe I've incorporated the changes as you recommend.

https://github.com/iftechfoundation/babel-tool/pull/25

Note that this will change the default generated IFID for legacy Twine 1.x games, from TWINE-{MD5} to HTML-{MD5}. (But that seems wise and correct to me.)

> Also, isn't it supposed to be <meta name="x" content="y">? 

"name" is standardized in HTML. "property" comes from RDFa; browsers ignore it. Open Graph uses "property", and it seems pretty straightforward, so I'm doing the same. https://ogp.me/ 

I've attached a draft candidate for Treaty revision 11, and a diff.

-Dan

draft.diff
babel_draft11.md

Dan Fabulich

Dec 28, 2021, 8:04:14 PM
to Babel-IF
On Tuesday, December 28, 2021 at 4:20:13 PM UTC-8 kli...@gmail.com wrote:
> 1. I think it's true that ChoiceScript games are single HTML files like Twine games. But for web-based works that are split up among several HTML files, what should happen? Should each file have a meta tag with the same IFID value? Just the entrypoint?

Indeed, ChoiceScript games are distributed as single HTML files (perhaps with images). I wasn't intending to solve this problem for games spanning multiple HTML files.

This is kinda hard, because the babel tool assumes that one "game" = one "story file." The story file could be a blorb containing other files, but it's still just one file.

(It might be nice to have a "zip" format, akin to "blorb," but that still wouldn't solve the problem of running the babel tool on multiple uncompressed HTML files.)

I think it wouldn't work great to just use the same IFID in a multi-file game. The babel tool would say that all of those files would have the same IFID, but it wouldn't be able to distinguish between a game with multiple files and multiple revisions of a single-file game.

If I were to speculate as to how to solve that problem "properly," I'd guess that we'd want something like:

    <meta property="ifiction:ifid" content="link:./entrypoint.html">

That would allow the other HTML files to declare that they're all part of one HTML game, and that the entrypoint would be the primary source of the IFID. (But this sounds like much more work than I'd want to do today.)
 
> 2. There are two kinds of Twine files: single stories with a single <tw-storydata> element, and archives which have multiple. The idea behind archives is to allow people to import/export stories easier as well as for backup purposes. It's important that the IFIDs are kept intact in the archive format. In that case, what would you suggest should happen? <meta> is only legal in the <head> element, I think.

Currently, the babel tool looks for the first <tw-storydata> element in a file to find an IFID; my PR didn't change that. If archives are intended to contain multiple games with multiple IFIDs, then that's a pre-existing bug in babel, and someone would need to sort that out somehow. (Not me, I think?) 😅

IMO, it would be a substantial change in babel's functionality to have it return an array of IFIDs for a single file. (But I suppose something like that would be necessary if we wanted to process zips containing multiple story files, e.g. if one zip contained other zips containing blorbs and multiple HTML files.)
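
To show the difference in shape between the two behaviors, here is a small Python sketch. It is not babel-tool code, and the function names and the regular expression are hypothetical. It contrasts taking the first <tw-storydata> IFID with collecting every one in an archive.

```python
import re

# Hypothetical pattern: assumes a double-quoted ifid attribute.
TWINE_IFID_RE = re.compile(
    r'<tw-storydata\b[^>]*?\bifid="([^"]+)"', re.IGNORECASE)


def first_twine_ifid(html_text):
    """One file, one IFID: only the first <tw-storydata> is consulted."""
    match = TWINE_IFID_RE.search(html_text)
    return match.group(1) if match else None


def all_twine_ifids(html_text):
    """What an archive-aware tool would need: every IFID, in file order."""
    return TWINE_IFID_RE.findall(html_text)
```

The second function returns a list, which is exactly the change in return type that the rest of the tool would have to absorb.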
 
> A footnote that there are several tools out there that compile Twine stories (tweego, extwee, twine-utils, etc) and if we want behavior to change uniformly, we'll want to update the spec here (and folks who maintain these tools may have differing opinions about this proposal): https://github.com/iftechfoundation/twine-specs/blob/master/twine-2-htmloutput-spec.md

I don't think anything on Twine's side needs to change, assuming the existing babel-tool functionality works OK today. Twine 2/tweego/etc. can continue to generate an "ifid" attribute on the <tw-storydata> element, or switch to using a <meta> tag whenever (if ever) you like.
 

Andrew Plotkin

Dec 29, 2021, 3:20:15 PM
to Babel-IF
On Tue, 28 Dec 2021, Dan Fabulich wrote:

> On Tuesday, December 28, 2021 at 4:20:13 PM UTC-8 kli...@gmail.com wrote:
>
>> 1. I think it's true that ChoiceScript games are single HTML files like
>> Twine games. But for web-based works that are split up among several HTML
>> files, what should happen? Should each file have a meta tag with the same
>> IFID value? Just the entrypoint?
>>
>
> Indeed, ChoiceScript games are distributed as single HTML files (perhaps
> with images). I wasn't intending to solve this problem for games spanning
> multiple HTML files.

Putting the meta tag in the entry HTML file is the sensible course. You
could add it to other HTML files, but I don't see much advantage.

As for archives with multiple <tw-storydata> tags -- or any kind of
package containing multiple games -- the babel tool just doesn't support
that. Someone is welcome to volunteer to figure that out but it's not a
pressing need.

>> A footnote that there are several tools out there that compile Twine
>> stories (tweego, extwee, twine-utils, etc) and if we want behavior to
>> change uniformly, we'll want to update the spec here (and folks who
>> maintain these tools may have differing opinions about this proposal):
>> https://github.com/iftechfoundation/twine-specs/blob/master/twine-2-htmloutput-spec.md
>
> I don't think anything on Twine's side needs to change, assuming the
> existing babel-tool functionality works OK today. Twine 2/tweego/etc. can
> continue to generate an "ifid" attribute on the <tw-storydata> element, or
> switch to using a <meta> tag whenever (if ever) you like.

We're going to have to support <tw-storydata> forever. You might as well
keep including the ifid="..." attribute in <tw-storydata> forever. Some
tool out there would be sad if it went away.

It would be nice if Twine *also* generated a meta tag. (For
single-Twine-game HTML files.) This can be added incrementally into tools
as you get to it.

kli...@gmail.com

Dec 30, 2021, 2:51:42 PM
to Babel-IF
On Wednesday, December 29, 2021 at 3:20:15 PM UTC-5 zgo...@eblong.com wrote:
> As for archives with multiple <tw-storydata> tags -- or any kind of
> package containing multiple games -- the babel tool just doesn't support
> that. Someone is welcome to volunteer to figure that out but it's not a
> pressing need.

Agreed--my question was because I took the discussion here as a move to deprecate the ifid attribute on <tw-storydata>, when that doesn't sound like it's actually the case. An archive file is not playable—it just needs to retain the IFID for each story in some way.

It looks like there's also a thread about this on intfiction.org—should we move further discussion there?

> It would be nice if Twine *also* generated a meta tag. (For
> single-Twine-game HTML files.) This can be added incrementally into tools
> as you get to it.

This seems reasonable to me. 

Dan Fabulich

Jan 3, 2022, 3:10:44 AM
to Babel-IF
FYI, there's a public thread around this on the intfiction forum. https://intfiction.org/t/babel-for-html/53945

There, I learned that there was already an alternative unofficial approach to IFIDs in HTML, to embed a "UUID://...//" string in the HTML, e.g. in a comment.

So, in that thread, I just proposed amending my proposal so that the content of the ifiction:ifid <meta> tag uses the UUID://...// syntax.

<meta property="ifiction:ifid" content="UUID://448E73DF-2D2F-47E7-A494-A46B40D4CFB3//">

This would allow the Babel tool to avoid parsing the HTML by regex, and just search for UUID://...// and use what it finds.
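
For illustration, such a search could be as small as the Python sketch below. It is a hypothetical stand-in, not the babel tool's C code; the function name is invented, and it assumes the IFID itself contains no "/" character.

```python
import re

# Matches the text between "UUID://" and the closing "//".
UUID_RE = re.compile(rb"UUID://([^/]+)//")


def find_uuid_marker(data: bytes):
    """Return the first UUID://...// payload anywhere in the file, or None."""
    match = UUID_RE.search(data)
    return match.group(1).decode("ascii", "replace") if match else None
```

Because the search ignores markup entirely, it finds the marker whether it sits in a <meta> tag, an HTML comment, or a string literal.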

There are now three places to discuss this proposal:

1. Here
3. The intfiction thread

I'm going to aim to discuss this mostly on intfiction, but I'll bring the discussion back here when we've reached more consensus.

-Dan

Dan Fabulich

Jan 12, 2022, 5:01:45 PM
to Babel-IF
I think we have a good candidate draft for revision 11 of the Treaty of Babel, based on conversations on the intfiction forum. https://intfiction.org/t/babel-for-html/53945

You can read the entire current proposed draft here, and view its history: https://github.com/iftechfoundation/ifarchive-if-specs/blob/master/Babel-Treaty.md

The Babel tool has been updated to match the proposed behavior: https://github.com/iftechfoundation/babel-tool

Please review the changes to the Treaty. If you're satisfied, reply with a "+1" approval.

What's changed?

There are two new sections. First, there's a new section, "The IFID for an HTML story file," and later, there's a new section, "The IFID for a legacy HTML story file," that replaces the section "The IFID for a legacy Twine story file."

#### The IFID for an HTML story file

A number of design systems generate output in HTML format, including
Twine, ChoiceScript, Adventuron, Ink, Texture, and others.

Design systems may integrate an IFID into the output HTML by adding a
`<meta>` tag to the `<head>` section of the output:

<meta property="ifiction:ifid" content="448E73DF-2D2F-47E7-A494-A46B40D4CFB3">

(If the game comprises several HTML files, apply this to the start file.)

You should include an RDFa `prefix` attribute with the standard iFiction
URI, either on the `<meta>` tag or one of its parents. This ensures that
your HTML will be valid RDFa. Some examples of this (other arrangements
are possible):

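<head>
<meta prefix="ifiction: http://babel.ifarchive.org/protocol/iFiction/" property="ifiction:ifid" content="448E73DF-2D2F-47E7-A494-A46B40D4CFB3">
</head>

<head prefix="ifiction: http://babel.ifarchive.org/protocol/iFiction/">
<meta property="ifiction:ifid" content="448E73DF-2D2F-47E7-A494-A46B40D4CFB3">
</head>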

(Note that the `babel` tool does not check for this prefix; it just looks
for the `property="ifiction:ifid"` attribute. However, other web-based
metadata tools require the prefix. Consider using [this validator][rdfavalid]
to verify that your file validates without warnings.)


##### The IFID for a legacy HTML story file

HTML games that lack the `<meta>` tag described above may include the
text `UUID://...//` in a literal string or comment in the HTML.

Older Twine games may incorporate an IFID in a `<tw-storydata>` tag in
the HTML:

<tw-storydata name="Title" creator="Twine" ifid="8665FC08-15CD-4BEC-B15A-7F72E34F4F51" ...>

Otherwise, the IFID for a legacy HTML story file is "HTML-" followed by
the MD5 checksum of the file.

Andrew Plotkin

Jan 12, 2022, 5:32:32 PM
to Babel-IF
On Wed, 12 Jan 2022, Dan Fabulich wrote:

> I think we have a good candidate draft for revision 11 of the Treaty of
> Babel, based on conversations on the intfiction forum.
> https://intfiction.org/t/babel-for-html/53945
>
> You can read the entire current proposed draft here, and view its
> history: https://github.com/iftechfoundation/ifarchive-if-specs/blob/master/Babel-Treaty.md
>
> The Babel tool has been updated to match the proposed
> behavior: https://github.com/iftechfoundation/babel-tool

I wrangled these changes in, based on Dan's proposal, so I'm satisfied
with them. By definition. :)

The only dangling issue is the idea that, if all else fails, you should
scan the file for the literal text "UUID://...//". Several existing
formats work this way (Z-code, Glulx, Alan, HTML as a fallback) but it's
never discussed as a general rule.

However, I am not sufficiently energized to propose it *as* a general
rule, or to update the babel tool to support "UUID://" as a universal
fallback. Unless someone else is more energized, I'll just shove it onto
the pile of future suggestions and let the current draft spec stand.

Andrew Plotkin

Jan 18, 2022, 8:27:09 PM
to Babel-IF
On Wed, 12 Jan 2022, Andrew Plotkin wrote:

> On Wed, 12 Jan 2022, Dan Fabulich wrote:
>
>> I think we have a good candidate draft for revision 11 of the Treaty of
>> Babel, based on conversations on the intfiction forum.
>> https://intfiction.org/t/babel-for-html/53945
>>
>> You can read the entire current proposed draft here, and view its
>> history:
>> https://github.com/iftechfoundation/ifarchive-if-specs/blob/master/Babel-Treaty.md
>>
>> The Babel tool has been updated to match the proposed
>> behavior: https://github.com/iftechfoundation/babel-tool
>
> I wrangled these changes in, based on Dan's proposal, so I'm satisfied with
> them. By definition. :)

Nobody else has spoken up, so I'm pushing this out as "revision 11".