WebM Community,
Hello. I wonder whether there is any interest in an open standard for state-of-the-art interactive videos. By interactive videos [1], I mean videos like the popular “Black Mirror: Bandersnatch”, “Puss in Book: Trapped in an Epic Tale”, and “Minecraft: Story Mode”.
One can envision interactive videos that present viewers with menu options and that use JavaScript scripting environments provided by interactive video player applications. An interactive video could then branch to different portions of its content depending on viewers’ menu selections, random number generators, JavaScript program logic, users’ settings and configurations, and other data persisted and made available through the scripting environment.
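As a minimal sketch of the branching described above, and not a proposed format, the following TypeScript models a video as a set of segments with menu choices. Every name here (`Segment`, `Choice`, `nextSegment`, the `hasKey` setting) is hypothetical and exists only for illustration. A choice may be guarded by persisted state, and a random number generator picks a branch when the viewer makes no selection.

```typescript
// Hypothetical data model for illustration only; not part of any existing format.
type State = Record<string, string | number | boolean>;

interface Choice {
  label: string;                      // menu text shown to the viewer
  target: string;                     // id of the segment to branch to
  when?: (state: State) => boolean;   // optional guard on persisted state
}

interface Segment {
  id: string;
  startSec: number;
  endSec: number;
  choices: Choice[];                  // empty means the story ends here
}

// Menu options that are currently offered, given the persisted state.
function availableChoices(segment: Segment, state: State): Choice[] {
  return segment.choices.filter((c) => c.when === undefined || c.when(state));
}

// Returns the segment to play next, or undefined at an ending.
// `pick` is the viewer's menu index; when omitted, `rng` chooses a branch.
function nextSegment(
  video: Map<string, Segment>,
  currentId: string,
  state: State,
  pick?: number,
  rng: () => number = Math.random,
): Segment | undefined {
  const current = video.get(currentId);
  if (current === undefined) throw new Error(`unknown segment: ${currentId}`);
  const options = availableChoices(current, state);
  if (options.length === 0) return undefined;
  const index = pick ?? Math.floor(rng() * options.length);
  const choice = options[index];
  if (choice === undefined) throw new Error(`invalid menu index: ${index}`);
  const target = video.get(choice.target);
  if (target === undefined) throw new Error(`unknown target: ${choice.target}`);
  return target;
}

// Example content: the second option is offered only if the viewer holds a key.
const demo = new Map<string, Segment>([
  ["intro", {
    id: "intro", startSec: 0, endSec: 30,
    choices: [
      { label: "Open the door", target: "hall" },
      { label: "Use the key", target: "vault", when: (s) => s.hasKey === true },
    ],
  }],
  ["hall", { id: "hall", startSec: 30, endSec: 60, choices: [] }],
  ["vault", { id: "vault", startSec: 60, endSec: 90, choices: [] }],
]);
```

A player would call `nextSegment` when the current segment finishes and then seek to the returned segment's `startSec`. Passing the random number generator as a parameter keeps the branching logic deterministic under test.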
I am interested in advancing interactive video technologies and in open standards. I am hoping to propose and to discuss a “WebIM” interactive video format. If there is some interest, perhaps a group of us could put together a fuller proposal?
Best regards,
Adam Sobieski
[1] https://en.wikipedia.org/wiki/Interactive_film
Thank you.
In addition to entertainment scenarios, I think about educational applications of interactive video. More broadly, I think about educational applications of choose-your-own-adventure stories, interactive films, and serious games.
With respect to interactive storytelling and branching narratives, there exists an “authorial bottleneck”: the amount of narrative content required to create truly branching storylines grows exponentially with the number of choice points. In (Stefnisson & Thue, 2018), the authors indicate that “authoring in the context of interactive storytelling is inherently difficult and there is a need for authoring tools that both enable and assist authors in the creation of new content.” New software tools can facilitate authoring interactive stories, visualizing their structure, writing screenplays for interactive films, organizing their production, editing in post-production, and outputting the resultant content to files. Authoring interactive stories and producing interactive films are difficult; with new software tools, we could expect more content.
Flash was deprecated in 2017 and reached end-of-life last year in 2020. There is a niche for new interactive video formats. I see your point about a framework or ecosystem with multiple codecs for interactive video; basing one on WebM and Matroska seems reasonable.
In these regards, there are some exciting Web scenarios to consider. Web browsers could play interactive videos, could provide the videos with secure JavaScript scripting environments (environments provided by the same engines which power WWW JavaScript scenarios), could sandbox them, could provide users with intuitive and familiar user permissions systems, and could securely handle origin/domain-based access to users’ configurations, settings, and other data persisted across interactive video viewing sessions. Perhaps there could be an <ivideo> HTML element someday.
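To make the origin-based access idea concrete, here is a minimal sketch, assuming a hypothetical player-side store (`OriginScopedStore` and `forOrigin` are invented names, not an existing API). Data persisted by a video is visible only to videos served from the same origin, much like `localStorage` in Web browsers. The sketch keeps everything in memory; a real player would persist it and layer user permissions on top.

```typescript
// Hypothetical sketch: a player-side store that isolates persisted data by origin.
interface ScopedStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

class OriginScopedStore {
  // One bucket of key/value pairs per origin (scheme + host + port).
  private buckets = new Map<string, Map<string, string>>();

  // Returns a storage view limited to the origin the video was loaded from.
  forOrigin(videoUrl: string): ScopedStorage {
    const origin = new URL(videoUrl).origin;
    let bucket = this.buckets.get(origin);
    if (bucket === undefined) {
      bucket = new Map<string, string>();
      this.buckets.set(origin, bucket);
    }
    const scoped = bucket;
    return {
      getItem: (key) => scoped.get(key) ?? null,
      setItem: (key, value) => { scoped.set(key, value); },
    };
  }
}
```

The player, not the video's script, decides which view to hand out, so a script cannot name another origin's bucket.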
It's an interesting set of technical topics.
Best regards,
Adam
Stefnisson, Ingibergur, and David Thue. "Mimisbrunnur: AI-assisted authoring for interactive storytelling." In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 14, no. 1. 2018.
Yes, I want to keep the design abstract so that Web browsers would be but one of many places where the new interactive video format(s) could run. As envisioned, any video player application that runs interactive videos would need to provide a JavaScript scripting environment (perhaps using an engine like V8).
Advantages of popular new format(s) would also include enabling the processing of collections of interactive videos, e.g. with artificial intelligence tools, for various research projects [1][2]. To process large collections of interactive videos, the interactive videos would best be stored using standard format(s).
In addition to entertainment and educational uses of interactive stories, perhaps AI systems could, someday, be trained using choose-your-own-adventure stories, interactive videos, and serious games as they are trained, today, using classic video games (see also: [3][4][5]).
Best regards,
Adam
[1] Partlan, Nathan, Elin Carstensdottir, Sam Snodgrass, Erica Kleinman, Gillian Smith, Casper Harteveld, and Magy Seif El-Nasr. "Exploratory automated analysis of structural features of interactive narrative." In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 14, no. 1. 2018.
[2] Carstensdottir, Elin. Automated Structural Analysis of Interactive Narratives. Northeastern University, 2020.
[3] Ammanabrolu, Prithviraj, and Mark O. Riedl. "Playing text-adventure games with graph-based deep reinforcement learning." arXiv preprint arXiv:1812.01628 (2018).
[4] Yin, Xusen, and Jonathan May. "Comprehensible context-driven text game playing." In 2019 IEEE Conference on Games (CoG), pp. 1-8. IEEE, 2019.
[5] Hausknecht, Matthew, Ricky Loynd, Greg Yang, Adith Swaminathan, and Jason D. Williams. "NAIL: A general interactive fiction agent." arXiv preprint arXiv:1902.04259 (2019).
I would like to clarify the Flash/SWF comparison and to describe a cool scenario for interactive video.
Firstly, when considering interactive video and films, of primary interest is Flash/SWF’s ability to include scripts in videos and to go to – or to branch to – frames, segments, or chapters of videos, e.g. based on menu selections. Not under consideration for a new video format are Flash/SWF features for: shapes, gradients, bitmaps, shape morphing, fonts and text, sounds, sprites, 3D graphics, and so forth.
The cool scenario: with standard format(s) for interactive videos, some chatbots and dialogue systems could be stored as interactive videos. This could be useful for instructional materials, how-to videos, and automated call center systems. In addition to static interactive video files, new standards could support streaming scenarios such that video content could be dynamically generated on servers.
Interestingly, interactive video formats could support pronunciation lexicons and speech recognition grammars, and/or interoperate with remote speech recognition services, so that users could select menu options using spoken natural language. In Web browsers, this functionality could be achieved via the Web Speech API (https://wicg.github.io/speech-api/).
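As a rough sketch of spoken menu selection, the function below maps a recognized transcript to one of the menu labels by counting shared words. The function names and the word-overlap heuristic are my own assumptions for illustration, and the normalization handles English text only; a real system would more likely use the grammars and lexicons mentioned above. In a browser, the transcript could come from a Web Speech API `SpeechRecognition` result (`event.results[0][0].transcript`), though the code itself does not depend on that API.

```typescript
// Hypothetical helper: lowercases text and splits it into plain words.
function normalize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0);
}

// Returns the index of the menu label sharing the most words with the
// transcript, or -1 when nothing matches or the best score is tied.
function matchSpokenChoice(transcript: string, labels: string[]): number {
  const spoken = new Set(normalize(transcript));
  let best = -1;
  let bestScore = 0;
  let tie = false;
  labels.forEach((label, i) => {
    const score = normalize(label).filter((w) => spoken.has(w)).length;
    if (score > bestScore) {
      best = i;
      bestScore = score;
      tie = false;
    } else if (score > 0 && score === bestScore) {
      tie = true; // ambiguous: two labels match equally well
    }
  });
  return tie ? -1 : best;
}
```

Returning -1 on a tie lets the player ask the viewer to repeat the selection instead of guessing between two equally plausible options.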
I will circle back in a few months with a more detailed letter. Thank you for the interesting discussion.
Best regards,
Adam