Hi, Chris,
thanks a lot for the quick response and the detailed information.
Please move the conversation to the google group.
Thanks and regards,
Andrea
Andrea Heidrich
Studium und Lehre | Content Management
Georg Thieme Verlag KG
Rüdigerstraße 14 | 70469 Stuttgart
Phone +49[0]711/8931-962
From: Chris Maloney [mailto:vold...@gmail.com]
Sent: Wednesday, January 15, 2014 03:32
To: Heidrich, Andrea
Subject: Re: dtdanalyzer parameter entities
Hi, Andrea,
First, do you mind if I move this conversation to the dtdanalyzer google group
(https://groups.google.com/forum/#!forum/dtdanalyzer)? There are a few others on that
list that might have an interest.
I've gotten started digesting your problem, but haven't had a chance to look at your
DTD yet. You are definitely on the bleeding edge here -- you're the first one to try
to use this tool so thoroughly. I apologize that the linking and escaping mechanism
for pandoc is a bit hacky -- we really didn't have time to try to make it robust at
the time we wrote it.
Anyway, from an initial perusal, I still think pandoc might be a good choice. I've
answered point-by-point below.
> We run it from the split-example directory using pandoc like this:
> ..\..\dtdanalyzer.bat --system split-example.dtd split-mockup.daz.xml -m
> ..\..\dtddocumentor.bat -d split-instance.xml -m
>
> &fleegle-pic; is a link referencing the page for the general entity, but even following
> the link, you never get to the picture. That’s not what is intended, is it?
That's a good point, and I hadn't considered it. By default, the documentor does not produce
documentation for general and parameter entities. That was a design decision based on the
idea that most *users* of the DTD wouldn't be interested in them, that they'd really be for
designers of customizations. But the problem is that it is faithfully rendering the markup
of the home page, which includes a link to the entity page. If you run it with
..\..\dtddocumentor.bat -d split-instance.xml -m -e
instead, it should create documentation without any broken links.
> With pandoc switched off, we still get a “not valid XML”-error message.
Right, the "split" example documentation is written in Markdown format, so it requires
pandoc. To get the split example to work without pandoc, you'd have to rewrite it in
XHTML. So, for example, instead of
> Here is how you make links to other documentation pages:
>
> * Element tags must be preceded with a backtick: \`\<split> -> `<split>.
> ...
you'd have:
> <p>Here is how you make links to other documentation pages:</p>
> <ul>
> <li>Element tags must be preceded with a backtick: \`\<split> -> `<split>.</li>
> ...
But I think you probably already know this ... it sounds like you've gotten this far.
> We’d like to avoid pandoc. The reason for this is that we have rather complex
> example sections, and, as far as we can see, pandoc doesn’t open up the
> opportunity to include comments, processing instructions, simple text ...
I'm not sure what you mean by "include comments, processing instructions, ...". You can
certainly include quoted text, including markup, using pandoc. As I checked this, I do
see that there's a problem including quoted comments, since a comment inside a DTD can't
contain the string "--". So the following *almost* works as desired:
> Here is some quoted markup:
>
> ```xml
> <?processing-instruction foo="bar"?>
> <!-\- comment comment comment -\->
> <splits>fun</splits>
> ```
But the backslashes separating the dashes in the comment delimiters remain, instead of being
removed. That should be an easy fix, though. The "xml" introducing the text block produces
a syntax-highlighted block of preformatted text. To get the syntax highlighting to show up,
you have to use a css file. For example, the default one from pandoc appears to be this
one: https://github.com/jaspervdj/hakyll/blob/master/web/css/syntax.css (but there might be
better ones, see https://benjeffrey.com/posts/pandoc-syntax-highlighting-css). Save that to the
`split-example` directory, and then cause it to be loaded in the documentation by adding the
`--include ../syntax.css` argument.
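
The backslash removal mentioned above could look roughly like this. It is only a sketch in Python, not the documentor's actual code, and the function name `unescape_comment_delimiters` is made up for illustration. It assumes the only escaped forms that matter are `<!-\-` and `-\->`.

```python
import re

# Hypothetical helper, not part of DtdAnalyzer: remove the backslash that was
# added to keep "--" out of a DTD comment, but only in comment delimiters.
_OPEN = re.compile(r"<!-\\-")
_CLOSE = re.compile(r"-\\->")

def unescape_comment_delimiters(text: str) -> str:
    """Rewrite the escaped comment delimiters and leave everything else alone."""
    text = _OPEN.sub("<!--", text)
    return _CLOSE.sub("-->", text)

sample = r"<!-\- comment comment comment -\->"
print(unescape_comment_delimiters(sample))
```

Restricting the replacement to the two delimiter patterns means that backslashes used elsewhere, for example to suppress auto-linking, are not touched.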
> ... or even
> to use color and emphasis within the examples section. If we’re wrong here,
> please let us know.
> First we thought about trying another pandoc flavor, but
> couldn’t find out how to make extensions work. We think that even if the extensions are
> available, pandoc is still not an option for us, however.
One thing you might not know is that pandoc lets you mix in XHTML pretty freely. For example,
you could write most of your text in Markdown (which is much more readable than XHTML, IMO)
and switch into XHTML for the complex examples, something like this:
> Here is an example, written in XHTML, that has color and emphasis:
>
> <pre>&lt;split>
> <span style='color: red;'>&lt;banana instrument='guitar'>Fleegle&lt;/banana></span>
> <strong><em>&lt;banana instrument='drums'>Bingo&lt;/banana></em></strong>
> &lt;/split></pre>
This *almost* works perfectly, except that the document processor is getting confused by the
`&lt;` escapes: it treats them as entity references and hyperlinks them. I see that this is
exactly what you're complaining about with reference to the numeric character references like
`&#xFC;` (ü), and that your fix to the dtddocumentor.xsl should take care of that.
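
One way a processor could tell references to DTD-declared entities apart from predefined entities and numeric character references is sketched below. This is a Python illustration under my own assumptions, not how dtddocumentor.xsl works: the name `should_autolink` is hypothetical, and the name pattern only covers ASCII entity names.

```python
import re

# The five entities predefined by the XML specification.
PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}

# Decimal (&#252;) or hexadecimal (&#xFC;) numeric character reference.
_NCR = re.compile(r"&#(?:[0-9]+|x[0-9A-Fa-f]+);")
# Named reference such as &fleegle-pic; (ASCII names only, a simplification).
_NAMED = re.compile(r"&([A-Za-z_:][A-Za-z0-9._:-]*);")

def should_autolink(ref: str) -> bool:
    """Return True only for references that could name a DTD-declared entity."""
    if _NCR.fullmatch(ref):
        return False
    match = _NAMED.fullmatch(ref)
    return bool(match) and match.group(1) not in PREDEFINED

for ref in ["&fleegle-pic;", "&lt;", "&#xFC;", "&#252;"]:
    print(ref, should_autolink(ref))
```

With a check like this, only `&fleegle-pic;` would be a candidate for a hyperlink; the other three would pass through untouched.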
That is as far as I've gotten. I will take a look at your DTD tomorrow.
Best,
Chris
On Tue, Jan 14, 2014 at 9:57 AM, Chris Maloney <vold...@gmail.com> wrote:
Hi, I will take a look at this today, but I have a lot of other things on my plate -- might not respond till this evening (my time).
Chris
On Tue, Jan 14, 2014 at 7:14 AM, Heidrich, Andrea <Andrea....@thieme.de> wrote:
Dear Chris,
we've tried the split-example.dtd in the DtdAnalyzer-0.4 directory, which seems to be identical to the one in the master directory except for one opening bracket in banana.ent:
<!--~~ banana.ent
This module defines the `<banana> element, and all the slippery things associated
with it.
~~-->
In the master directory, this reads “<banana>” instead.
We run it from the split-example directory using pandoc like this:
..\..\dtdanalyzer.bat --system split-example.dtd split-mockup.daz.xml -m
..\..\dtddocumentor.bat -d split-instance.xml -m
&fleegle-pic; is a link referencing the page for the general entity, but even following the link, you never get to the picture. That’s not what is intended, is it?
With pandoc switched off, we still get a “not valid XML”-error message.
We’d like to avoid pandoc. The reason for this is that we have rather complex example sections, and, as far as we can see, pandoc doesn’t open up the opportunity to include comments, processing instructions, simple text or even to use color and emphasis within the examples section. If we’re wrong here, please let us know. First we thought about trying another pandoc flavor, but couldn’t find out how to make extensions work. We think that even if the extensions are available, pandoc is still not an option for us, however.
Our DTD documentation is written in German, which means that it includes characters from Latin-1, and we don’t want a link to be created for e.g. &#xFC; (ü, Latin small letter u with diaeresis). Moreover, our notes sections contain opening/closing/empty tags, for which we don’t want a link in this place. Therefore we modified dtddocumentor.xsl; this is the preliminary version:
  <!-- Redo the hyperlink with a new href, but preserve other attributes and content -->
  <xsl:choose>
    <xsl:when test="@href[starts-with(.,'#p=ge-')]">
      <xsl:value-of select="."/>
    </xsl:when>
    <xsl:otherwise>
      <a>
        <xsl:apply-templates select='@* except @href' mode='content'/>
        <xsl:attribute name='href'>
          <xsl:text>#p=</xsl:text>
          <xsl:call-template name="makeSlug">
            <xsl:with-param name='name' select='$name'/>
            <xsl:with-param name="type" select='$type'/>
          </xsl:call-template>
        </xsl:attribute>
        <xsl:apply-templates select='node()' mode='content'/>
      </a>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
Before making this modification, we tried our own DTD with a general entity with pandoc switched off and got a “not valid XML” message, too. So general entities don’t seem to work with pandoc switched off. At the moment, it looks like we don’t necessarily need declared general entities, but we’re not sure yet. Our first intention was to use general entities (like &fleegle-pic;) to include images in our notes or examples section, but including an image element directly seems to work fine in this case.
You find a stripped-down version of our DTD attached. If you write &fleegle; or &fish; somewhere into the notes section it will fail with pandoc switched off.
Thanks for helping. We might have a bunch of follow-up questions about the attributes section and attribute pages.
Regards,
Andrea
Hi, Andrea
I got a chance to look at your stripped-down DTD, and I think I got it working the way you would like, and am attaching a new version, along with generated documentation, in a zip file.
Here’s what I came up with:
<!--~~ <excursus>
~~ notes
Der Exkurs ist eine \</erweiterte\> \<Sonderumgebung/\>, die eingeführt ...
Here's a reference to \&fleegle;
text
<div>
<img src="..\fish.jpg" title="fish fish fish" alt="a fish"/>
</div>
Here's a preformatted example section that has a lot of weirdness:
<pre>\<splits>
<em><strong>\<fleegle/></strong></em>
<span style='color: green;'>\<drooper/></span>
<span style='color: red;'>Rhababerbarbarabarbarbarenbärte</span>
\</splits></pre>
Notes:
· As mentioned in the documentation on the wiki (https://github.com/NCBITools/DtdAnalyzer/wiki/Overview-of-DTD-annotations), in order to disable auto-linking to entities, you just precede them with a backslash. The documentor should be smart enough to know that &lt;, &gt;, etc., and numeric character references are not DTD entities, but it is not. So for now, precede those with a backslash. I opened a new issue for this, https://github.com/NCBITools/DtdAnalyzer/issues/45, and hopefully someone will be able to get to it soon.
· Rather than using numeric character references like “&#xFC;”, if your DTD is encoded in UTF-8, you could just type the “ü” directly. But if that doesn’t work, go ahead and use the NCR, but just remember to precede it with a backslash: “\&#xFC;”.
· If you want the actual text “&fleegle;” to show up in the documentation, you have to xml-escape the ampersand. Remember, when you turn off Markdown, the text is interpreted as XHTML, and “&fleegle;”, unescaped, is not valid. That’s probably the problem you described in your email.
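
The last point can be checked outside the documentor with a generic XML parser. The sketch below uses Python's standard library purely as an illustration; `is_well_formed` is a name I made up, and wrapping the text in a `<div>` is an assumption about how a fragment might be tested, not what the tool does internally.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def is_well_formed(fragment: str) -> bool:
    """Wrap the fragment in a dummy element and try to parse it as XML."""
    try:
        ET.fromstring(f"<div>{fragment}</div>")
        return True
    except ET.ParseError:
        return False

raw = "Here's a reference to &fleegle;"
print(is_well_formed(raw))          # reference to an undeclared entity
print(is_well_formed(escape(raw)))  # ampersand written as &amp;
print(escape(raw))
```

The unescaped text fails because `fleegle` is not declared anywhere the parser can see, while the escaped version is ordinary character data.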
Let me know if (when) you have more questions.
Chris Maloney
NIH/NLM/NCBI (Contractor)
Building 45, 5AN.24D-22