Re: special characters, recent browsers and attribute annotations

Chris Maloney

Feb 7, 2014, 8:28:12 AM

to Heidrich, Andrea, dtdan...@googlegroups.com

[Responding on the list; see Andrea's descriptions of her problems below]

Hi, Andrea,

It would help me to diagnose your problem if I had a sample file that was failing. It's very hard for me to know what's going wrong without that, especially with regard to the character encoding issue. I think that it should work to include the character as-is (ü), but usually, with this kind of thing, one app or another is getting confused about the encoding of a file (which should always be UTF-8), and there are a number of different places where that might be happening.
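One way to test the "should always be UTF-8" assumption on the input DTD is a strict decode. The following is only a sketch in Python, not part of DtdAnalyzer; the function name is made up for illustration. It reports the first line of a file that is not valid UTF-8, which is usually the line holding an umlaut saved in some other encoding.

```python
from pathlib import Path


def first_non_utf8_line(path):
    """Return the 1-based number of the first line that is not valid UTF-8, or None."""
    data = Path(path).read_bytes()
    # Splitting on the newline byte is safe: 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            return number
    return None
```

If this returns a line number for the DTD, the file was most likely saved as Latin-1 or Windows-1252, and re-saving it as UTF-8 in the editor would be the first thing to try.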

> What does work for us at the moment is inserting the NCR into the input DTD and suppressing the display of a link no (backslash needed).

Could you remind me, how do you do this? Is it with a change to the XSLT stylesheet?

> Our solution works perfectly fine with the antique version 8.0 of the Internet Explorer we are using. Trying it with a more modern version results in umlauts like “ü” being displayed totally messed up.

This means that IE is confused about the proper encoding of the output file. In HTML, there are a number of places where that can show up. I see in my copy of the doc output for the split example, it has this:

It seems to be working for me in my version of IE11, so I don't know. Again, a sample file would be helpful. And, let me make sure I have this straight: it fails in *recent* versions of Firefox and Chrome? That would lead me to believe that there really might be an encoding problem with the file -- for example, that the actual encoding doesn't match the declaration. It is strange.
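To check whether the actual encoding of an output page matches its declaration, something like the following Python sketch could be run over the page's bytes. It is an illustration only: the function names are invented here, and the regular expression is a rough approximation of how a browser finds the `charset` in a `<meta>` tag.

```python
import re

# Matches both <meta charset="..."> and
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_-]+)', re.IGNORECASE
)


def declared_charset(html):
    """Return the charset named in the first <meta> declaration, or None."""
    match = _META_CHARSET.search(html)
    return match.group(1).decode("ascii").lower() if match else None


def decodes_as_declared(html):
    """True if the page bytes can be decoded with the charset the page declares."""
    charset = declared_charset(html)
    if charset is None:
        return False
    try:
        html.decode(charset)
    except (UnicodeDecodeError, LookupError):
        return False
    return True


def mojibake(text):
    """Show what UTF-8 text looks like when it is wrongly read as Latin-1."""
    return text.encode("utf-8").decode("latin-1")
```

The check only catches one direction of the mismatch: a page that declares UTF-8 but contains Latin-1 bytes fails to decode, whereas a page that declares Latin-1 but contains UTF-8 bytes decodes without error and simply displays garbage. `mojibake("ü")` gives "Ã¼", which is the typical look of that second case.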

> We’re not sure whether the issue could be settled by way of including a special character list into the dtdanalyzer source code in the first place, or if another solution has to be found.

I would hate to go that route -- we should try to fix the root cause.

> What is more disturbing about recent browsers (e.g. Chrome version 32.0.1700.102) concerns the index.html page. They all get stuck in the ”Loading...” sidebar, never displaying the entry results and a functioning search field

Umm, there might be one thing that I forgot to mention. When viewing the documentation, unfortunately you have to look at it through a web server. It won't work if you just open a file from your filesystem in the browser. So if you see "file://..." in the address bar, that's probably the reason for this. It is a security feature of modern browsers: they don't allow AJAX calls from JavaScript that is running from the local filesystem. I just Googled, and found a way around this. If you can't work in a directory that is served by an HTTP server, then you can get Chrome to work by starting it with the command line option "--allow-file-access-from-files". Let me know if you need help figuring out how to do this.
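If no web server is at hand, a throwaway one is enough for viewing the documentation locally. Here is a minimal sketch using Python's standard library (Python 3.7 or later); the function name and the default port are arbitrary choices for illustration, and nothing in DtdAnalyzer requires this particular server.

```python
import functools
import http.server


def make_server(directory, port=8000):
    """Create an HTTP server that serves `directory` to this machine only."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Binding to 127.0.0.1 keeps the server unreachable from other machines.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server(docs_dir).serve_forever()`, where `docs_dir` is the generated documentation directory, and then opening http://127.0.0.1:8000/index.html should get past the "Loading..." sidebar, since the page is then loaded over HTTP rather than from "file://...". The same thing is available from the command line as `python -m http.server --bind 127.0.0.1 --directory DIR`.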

> the same attribute has different annotations depending on the element it is used with

Yes, I know this is a pretty bad limitation with our DtdAnalyzer. It was originally designed for JATS, which is defined in such a way that attribute names are not overloaded. It would be nice to fix it. I haven't reviewed your fixes in detail, but I'd like to ask, is there any chance you could push your changes up to GitHub? I could walk you through the process. It is very easy, and it makes it much easier to review and track changes. Since it looks like you're trying to tackle this problem with the DtdAnalyzer, it would be nice if we could incorporate your changes into the originals.

Hope this helps. Please send samples illustrating your encoding problems.

Chris

----------------------------------------

On Wed, Feb 5, 2014 at 8:23 AM, Heidrich, Andrea <Andrea....@thieme.de> wrote:

Dear Chris,

sorry, it took us some time to get back to you.

We are still having trouble with special characters in our input DTD for the dtdanalyzer.

Escaping them with a backslash, e.g. “\ü”, doesn’t work, and neither does escaping the NCR (a backslash followed by the numeric character reference for “ü”, using markdown). Inserting the “ü” character (backslashed or not) into an annotation section results in a dtdanalyzer output with an empty annotation section. It would be most comfortable for us to be able to insert the “ü” character directly. Can you think of any way to include a list of special characters at some place in the dtdanalyzer source code (probably in SComment.java) so that all items in this list can be processed by it?

What does work for us at the moment is inserting the NCR into the input DTD and suppressing the display of a link no (backslash needed). This solution seems to be the most sustainable, as it even offers a workaround for comments: inserting the NCR for “-” instead of the literal “-” character is processed by the dtdanalyzer and doesn’t raise a backslash issue for the display. However, we’ve run into another problem: Our solution works perfectly fine with the antique version 8.0 of the Internet Explorer we are using. Trying it with a more modern version results in umlauts like “ü” being displayed totally messed up. The same goes for a recent version of Mozilla Firefox and Chrome version 32.0.1700.102. We’re not sure whether the issue could be settled by way of including a special character list into the dtdanalyzer source code in the first place, or if another solution has to be found.

What is more disturbing about recent browsers (e.g. Chrome version 32.0.1700.102) concerns the index.html page. They all get stuck in the “Loading...” sidebar, never displaying the entry results and a functioning search field. We guess you have already run across that problem and there is an easy fix for it (possibly also fixing our messed-up umlauts issue), but we haven’t figured out what to fix to make this run. Can you help with it?

Feedback for huge DTDs:

Apart from all of this, we have adapted dtddocumentor.xsl a little so that it meets our needs more closely. As our DTD is rather huge, it happens that the same attribute has different annotations depending on the element it is used with. So the declaration:

<!--~~ @attribute
annotation
~~-->

is not very helpful for our purpose. Therefore we added a table both to the element and the attributes pages that provides a column for attribute usage. This column receives the content of an ~~attributename section that is added to the annotation sections of the element in question.

      <xsl:for-each select="../../attributes/attribute[attributeDeclaration/@element=$e-name]">
        <xsl:sort select="@name"/>
        <xsl:variable name="a-name" select="@name"/>
        <table>
          <tr>
            <th>Attribut</th>
            <th>Gebrauch</th>
          </tr>
          <tr>
            <td>
              <xsl:call-template name="makeLink">
                <xsl:with-param name="name" select="@name"/>
                <xsl:with-param name="type" select="'attr'"/>
              </xsl:call-template>
            </td>
            <td>
              <xsl:value-of select="../../elements/element[@name=$e-name]/annotations/annotation[@type=$a-name]"/>
            </td>
          </tr>
        </table>
      </xsl:for-each>
    </ul>
  </xsl:if>

  <xsl:variable name="a-name" select="@name"/>
  <xsl:choose>
    <xsl:when test="attributeDeclaration[not(not(pmc:included(@element)) or @element=//element[@reachable='false']/@name)]">
      <table>
        <tr>
          <th><h3>Im Element</h3></th>
          <th><h3>Werteliste</h3></th>
          <th><h3>Gebrauch</h3></th>
        </tr>
        <xsl:for-each select="attributeDeclaration[not(not(pmc:included(@element)) or @element=//element[@reachable='false']/@name)]">
          <xsl:sort select="@element"/>
          <xsl:variable name="e-name" select="@element"/>
          <tr>
            <td>
              <xsl:value-of select="concat('&lt;', @element, '&gt;')"/>
            </td>
            <td>
              <xsl:value-of select="@type"/>
            </td>
            <td>
              <xsl:value-of select="../../../elements/element[@name=$e-name]/annotations/annotation[@type=$a-name]"/>
            </td>
          </tr>
        </xsl:for-each>
      </table>
    </xsl:when>
    <xsl:otherwise>
      <xsl:variable name="e-name" select="../../../elements/element[annotations/annotation[@type=$a-name]]"/>
      <p class="bold">Werteliste: <span class="attvalue">
          <xsl:value-of select="attributeDeclaration[not(not(pmc:included(@element)) or @element=//element[@reachable='false']/@name)][1]/@type"/>
        </span>
      </p>
    </xsl:otherwise>
  </xsl:choose>
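For readers who want to try the same lookup outside XSLT, the per-element attribute table can be approximated in a few lines of Python. This is a sketch under assumptions: it reads the element and attribute structure implied by the XPath expressions above (`elements/element/annotations/annotation` and `attributes/attribute/attributeDeclaration`), the function name is invented, and it leaves out the `pmc:included` and `@reachable` filtering that the stylesheet applies.

```python
import xml.etree.ElementTree as ET


def attribute_usage(root, attr_name):
    """Return (element, value type, usage note) rows for one attribute, sorted by element."""
    # Collect the per-element usage notes: annotations whose @type is the attribute name.
    notes = {}
    for element in root.iterfind("elements/element"):
        for annotation in element.iterfind("annotations/annotation"):
            if annotation.get("type") == attr_name:
                notes[element.get("name")] = "".join(annotation.itertext()).strip()

    # One row per element that declares the attribute, with no grouping by value list.
    rows = []
    for attribute in root.iterfind("attributes/attribute"):
        if attribute.get("name") != attr_name:
            continue
        for declaration in attribute.iterfind("attributeDeclaration"):
            element_name = declaration.get("element")
            rows.append(
                (element_name, declaration.get("type"), notes.get(element_name, ""))
            )
    return sorted(rows)
```

Given a root parsed with `ET.parse(...).getroot()`, each returned row corresponds to one line of the "Im Element / Werteliste / Gebrauch" table, with an empty usage note where the element carries no annotation section for that attribute.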

What comes in handy for us, too, is not grouping elements by the same attribute values but displaying one line for each element in ascending order, allowing multiple occurrences of the same attribute value list. This is easier to read when you have some twenty elements that use the attribute in question. One thing we still have to fix is that we also get rather long attribute value lists, which have to be broken onto a new line at some point so that they do not overflow the display vertically. But we can fix this for ourselves.

Thanks for your help and advice.

Andrea

Andrea Heidrich
