Craig,
First of all, I apologise for taking ten days to reply!
On 03/10/2016 17:48, craig....@gmail.com wrote:
> Hi Mark,
>
> I also worry about a "XKCD 927", and I spent a lot of time evaluating each of the existing formats because of that.
>
> Overall, none of them address the security, accessibility, or the ease with which the documents can be created/distributed.
>
> And while I did discuss this with the W3C a while ago, they were too focused on creating their own "new standards" (e.g. PWP), all of which have their uses, but didn't actually address the problems in PDF.
>
> I also had an interesting discussion with someone at Adobe (under NDA), and they are looking at something very similar to this (but also more complicated).
>
> The justification about this file format is at the following URL:
>
> https://github.com/craigfrancis/wdoc
I accept that no existing format addresses the exact use case that you
have in mind for wdoc, especially taking into consideration the specific
drawbacks that you list of other formats in this context. I also
appreciate that the format you propose has the advantage of relative
simplicity and that most web browsers or mail clients could probably
fairly easily be updated to support it.
One complication would be choosing how wdoc would archive web pages that
make heavy use of JavaScript for dynamically placing content. Such pages
often cause print-to-PDF/save-as-PDF and save-as-MHT extensions great
difficulty: the saved result is often quite dissimilar to the original,
dynamically-generated page. As far as I can see, the wdoc format would
have the same difficulty since, if I understand correctly, it would want
to exclude the JavaScript from the saved version of the page (or at
least prevent the JavaScript from requesting data from outside the wdoc
file). Unless I'm missing something, there'd be no easy solution to this.
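To make the concern concrete, here is a rough Python sketch. It is not
how wdoc is specified to work: it assumes a hypothetical archiver that
simply drops <script> elements from the page source, and the names
ScriptStripper and strip_scripts are invented purely for illustration.
It also ignores comments and doctypes for brevity.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Re-emit HTML while dropping <script> elements.

    Hypothetical archiver step, assumed for illustration only.
    """

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            return
        # Reproduce the start tag exactly as it appeared in the source.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are passed through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Script bodies arrive here as data, so skip them while in a script.
        if not self.in_script:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_scripts(html):
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# A page whose visible content is only placed by JavaScript at load time.
page = (
    '<html><body><h1>Report</h1>'
    '<div id="content"></div>'
    '<script>document.getElementById("content").textContent = "Loaded later";</script>'
    '</body></html>'
)

snapshot = strip_scripts(page)
print(snapshot)
```

The saved snapshot keeps the heading but the "content" div stays empty,
because the text it should hold only ever existed once the script ran.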
Despite the possible issue with saving faithful renderings of
JavaScript-heavy web pages, I would probably use the wdoc format if the
support were there. I already use print-to-PDF print drivers and the
UnMHT extension to save archives of web pages and I can tolerate the
JavaScript-related rendering issues that these experience, so such
problems should not put me off using wdoc.
Personally, my main concern in terms of "support" would be the existence
of an IFilter
<https://msdn.microsoft.com/en-us/library/bb331575(v=vs.85).aspx#adding_new_file_format>
for the wdoc format so that it could be properly indexed (both content
and metadata) in the Windows Search system. Both PDFs and MHT files are
automatically indexed (both text and metadata) by Windows Search.
Good luck with gaining mindshare for the wdoc format. Although it seems
as if it should be relatively simple to implement, I think that its use
case is similar enough to other pre-existing formats to cause people to
reject it to begin with, as I originally did.
If you could write a proof-of-concept Firefox extension then perhaps
that would help a lot. (It might even be possible to do it in the new
WebExtensions system.)
--
Mark Rousell