DAISY Pipeline - Proposal for the next phase (January 2014 to December 2015)
The DAISY Pipeline 2 project, launched in June 2010, is a long-term investment in building an enterprise-grade production system for accessible content.
The project has already delivered a powerful and extensible engine, along with a wide range of processing features. The initial focus has been on supporting the production of EPUB 3 content and the processing of the new DAISY Authoring and Interchange Framework for Adaptive XML Publishing (ANSI/NISO Z39.98-2012).
This document presents how we envision extending the objectives and activities of the Pipeline 2 project in a possible next phase, spanning January 2014 to December 2015.
Work on processing functionality can be focused on the following areas:
- EPUB 3: providing support for the automated production of accessible EPUB 3 content has been, and will remain, one of the primary objectives of the project. Existing converters will be consolidated and can be made more configurable. New converters can be developed to accept other input formats (e.g. office documents such as OpenOffice/LibreOffice's ODT or Microsoft Word's OOXML).
- TTS-based Production: the development of a new TTS-based "Narrator" converter is on the roadmap of the current phase of the project. The next phase will follow up on this initial, ongoing work, notably by proposing more advanced XML pre-processing, adapters for a variety of TTS engines, and parallelization of TTS calls and audio processing.
- Braille: the LibLouis-based Braille production system developed at SBS is giving promising results and is already highly capable. This system can be further developed in the next phase; additionally, we welcome other contributors who wish to adapt the system to other Braille translation or formatting engines, or to other Braille codes.
- DAISY formats: while the primary focus of the project so far has been support for the new DAISY Authoring and Interchange Framework (ANSI/NISO Z39.98-2012) and EPUB 3, previous DAISY formats (DAISY 2.02 and DAISY 3) are still heavily used by member organizations. The Pipeline 2 already provides support for migrating DAISY 2.02 and DAISY 3 content to EPUB 3, but we also want to be able to produce DAISY 2.02 and DAISY 3 output. The new TTS-based "Narrator" functionality can be adapted to produce DAISY 2.02 and DAISY 3, and some of the scripts of the Pipeline 1 (e.g. DTBook to DAISY 3 text-only) can be ported to the Pipeline 2 framework.
- STEM and Educational Content: Most of the converters can be improved to better support the conversion of STEM and other educational content. We notably want to provide first-class support for MathML content and for long descriptions of images (e.g. DIAGRAM content).
- Validation: The Pipeline 2 already provides some validation functionality. We want to extend the validation offering and adapt the user interface to simple validation use cases. Additionally, we need to make sure that the results of the EPUB 3 Preflight project integrate well with the automated workflows of the Pipeline 2.
In addition, we will continue working on the internal Pipeline 2 engine and user interfaces, notably in the following areas:
- Usability: The Pipeline 2 web-based user interface was primarily developed for server-side deployment, but we have found that there is strong demand for desktop usage. The "desktop" distribution of the web-based user interface was a low-cost workaround to meet this demand, but it comes with many limitations (poor installation experience, poor OS integration). We want to improve the overall user experience on the desktop, e.g. by providing real installers and improving the "native" feel of the application. In parallel, the existing web user interface will continue to be developed and improved.
- Quality: The internal Pipeline 2 engine will continue to be gradually improved to extend its feature set and increase its performance and stability. We also want to do further work on developer tools such as testing frameworks, to make sure that our code is thoroughly tested and results in high-quality converters.
The outcome will benefit all DAISY Consortium members involved in automated content production, as well as commercial companies wishing to take advantage of the open source and liberally licensed deliverables.