Adding a new <modules> section

Chris Maloney

Sep 23, 2012, 4:05:44 PM
to dtdan...@googlegroups.com
I just created a new issue, #14, https://github.com/NCBITools/DtdAnalyzer/issues/14, for implementing the <modules> section.  Demian, Audrey and I talked about this a little bit last week.  The description in the ticket should hopefully get you up-to-date on our discussion.  What do you think?  Do you have any feedback or suggestions?

Chris

Demian Hess

Sep 23, 2012, 4:55:22 PM
to dtdan...@googlegroups.com, dtdan...@googlegroups.com
I suspect that the name of the module needs to be the relative path (relative to the main DTD that started the parse); this would keep the names unique in case there are files with the same name in different folders.

--
You received this message because you are subscribed to the Google Groups "DtdAnalyzer" group.
To post to this group, send email to dtdan...@googlegroups.com.
To unsubscribe from this group, send email to dtdanalyzer...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/dtdanalyzer?hl=en.

Audrey Hamelers

Sep 24, 2012, 10:37:03 AM
to dtdan...@googlegroups.com
I don't really think that the modules section is necessary, as it seems like all the modules are already listed as entities. Or am I missing something?

Chris Maloney

Sep 24, 2012, 11:07:37 AM
to dtdan...@googlegroups.com
Dag, you are right, of course ... I hadn't thought of that!  A module is just an external parameter entity, so you could find all the modules (except the top-level dtd) with //parameterEntities/entity[@systemId != ""].
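
As a rough illustration of that XPath, here is a minimal Python sketch that pulls the external parameter entities out of the analyzer's XML. The sample document is hypothetical: it assumes a <declarations> root and uses only the element and attribute names mentioned in this thread, and the entity names "fruit.class" and "local.att" are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of the analyzer's XML output. Only the element and
# attribute names that appear in this thread are used; the values are made up.
SAMPLE = """
<declarations>
  <parameterEntities>
    <entity name="banana.ent" systemId="file:///dtds/banana.ent"/>
    <entity name="inline.ent" systemId="file:///dtds/inline.ent"/>
    <entity name="fruit.class" systemId=""/>
    <entity name="local.att"/>
  </parameterEntities>
</declarations>
"""


def find_modules(root):
    """Return the parameter entities that have a non-empty systemId.

    This mirrors //parameterEntities/entity[@systemId != ""]: an entity with
    a missing or empty systemId is internal, so it is not a module.
    """
    entities = root.findall(".//parameterEntities/entity")
    return [e for e in entities if e.get("systemId")]


root = ET.fromstring(SAMPLE)
module_names = [e.get("name") for e in find_modules(root)]
print(module_names)
```

The filter is done in Python rather than in the path expression because ElementTree supports only a subset of XPath; a full XPath engine could evaluate the original expression directly.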

So, I could move all the annotations to there, but there's a conceptual problem now.  There are two places where annotations for a module could occur:
  1. in the module file itself, tagged with (as I have it now) anything like "<!--~~ my fantastic module", and
  2. in the invoking module, tagged with "<!--~~ %banana.ent;".

And these are different, and both are important, I think.  (1) documents what the module *is*, and how it is supposed to be used.  (2) documents how it happens to be used in this particular DTD.

Likewise, there's a similar problem with names:
  1.  The name of the file, relativized from the system id of the dtd, like "../common/banana.ent",
  2.  The name of the parameter entity in the invoking module, like "banana.ent".
In our toy example, they always match, but there's no reason that they have to.
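
To make the first kind of name concrete, here is a small Python sketch of relativizing a module's system id against the system id of the main DTD. The function name rel_sys_id and the file:// URLs are assumptions for illustration, not what the analyzer itself does.

```python
import posixpath
from urllib.parse import urlparse


def rel_sys_id(dtd_system_id, module_system_id):
    """Return the module's path relative to the directory of the main DTD.

    Both arguments are assumed to be absolute file:// URLs; only the path
    component is compared.
    """
    dtd_dir = posixpath.dirname(urlparse(dtd_system_id).path)
    module_path = urlparse(module_system_id).path
    return posixpath.relpath(module_path, start=dtd_dir)


# Hypothetical locations: the DTD lives in main/, the module in common/.
print(rel_sys_id("file:///work/dtds/main/fruit.dtd",
                 "file:///work/dtds/common/banana.ent"))
```

With these inputs the result is "../common/banana.ent", while the parameter entity in the invoking module could still be called "banana.ent" or anything else.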

And, of course, we also need a place for the top-level DTD annotations.

Let me think about this and try to come up with a clean way to represent these.  Any suggestions are welcome, of course!

Chris

Audrey Hamelers

Sep 24, 2012, 1:01:13 PM
to dtdan...@googlegroups.com
I like the way you are currently doing top-level DTD annotations. For the documentor, I've just been turning that into the index page.

Chris Maloney

Sep 25, 2012, 1:28:01 AM
to dtdan...@googlegroups.com
I changed the mockup; you can see the new version here:  https://github.com/NCBITools/DtdAnalyzer/blob/annotations/test/split-example/split-mockup.daz.xml.

Highlights:
* The very-top-level <annotations> got wrapped in a <dtd> element, which has the system and public ids of the main module, the dtd itself.
* It also gets a @relSysId attribute, which gives its relative system id (what we could use as the module name)
* Each external parameter entity now will also get a @relSysId.
* Each of those now also gets *two* sets of <annotations>.  I know this is confusing, but if you can think of a better way, let me know!  These two are distinguished by @level='reference' and @level='module'.  The 'reference' annotations are those from the including DTD/module.  The 'module' annotations are the ones internal to the module itself.

Suggestions welcome!  If I don't hear anything, then I'll fix the code to implement this new scheme.

Chris

Audrey Hamelers

Sep 25, 2012, 4:36:07 PM
to dtdan...@googlegroups.com
I am not clear on the difference between a module and an entity, and if any particular item should be either a module or an entity, or both.

Maloney, Christopher (NIH/NLM/NCBI) [C]

Sep 25, 2012, 4:51:07 PM
to dtdan...@googlegroups.com

There is clearly overlap between the concepts.  As you pointed out, a module is nothing but an external parameter entity.

 

Here is XQuery to get from what I have in the mockup now, to what we had before, for the modules part:

 

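    (: every external parameter entity (non-empty @systemId) is a module :)
    for $parament in /declarations/parameterEntities/entity[@systemId!=""]
    return
      <module name='{$parament/@relSysId}'
              systemId='{$parament/@systemId}'
              publicId='{$parament/@publicId}'>{
        (: keep only the annotations written inside the module file itself :)
        $parament/annotations[@level='module']
      }</module>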

 

It results in this:

 

    <module name="banana.ent" publicId="-//NLM//external ent dummy public id 2//EN"
            systemId="file:///home/maloneyc/git/NCBITools/DtdAnalyzer/test/banana.ent">
      <annotations level="module">
        <annotation type="note">
          <p>This module defines the &lt;banana&gt; element, and all the slippery things associated
            with it.</p>
        </annotation>
      </annotations>
    </module>
    <module name="inline.ent" publicId="-//NLM//external ent dummy public id 1//EN"
            systemId="file:///home/maloneyc/git/NCBITools/DtdAnalyzer/test/inline.ent">
      <annotations level="module">
        <annotation type="note">
          <p>This module defines the elements that can be used inline in a description.</p>
        </annotation>
      </annotations>
    </module>

 

In the “parameter entities” section of the documentation, you should drop the annotations that are at the “module” level, and just use those that are at the “reference” level.  And use @name for the name.
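
For anyone consuming the XML outside of XQuery, the same split can be sketched in Python. This is only an illustration under assumptions: the sample entity below is hypothetical (its annotation text is invented), and the helper name annotation_text is not part of the analyzer.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity carrying both sets of annotations, following the
# @level='reference' / @level='module' scheme described in this thread.
SAMPLE = """
<declarations>
  <parameterEntities>
    <entity name="banana.ent" relSysId="../common/banana.ent"
            systemId="file:///dtds/common/banana.ent">
      <annotations level="reference">
        <annotation type="note"><p>Pulled in for the fruit elements.</p></annotation>
      </annotations>
      <annotations level="module">
        <annotation type="note"><p>Defines the banana element.</p></annotation>
      </annotations>
    </entity>
  </parameterEntities>
</declarations>
"""


def annotation_text(entity, level):
    """Collect paragraph text from the entity's annotations at one level."""
    texts = []
    for annotations in entity.findall("annotations"):
        if annotations.get("level") == level:
            texts.extend("".join(p.itertext()).strip()
                         for p in annotations.iter("p"))
    return texts


root = ET.fromstring(SAMPLE)
entity = root.find("./parameterEntities/entity")

# Modules view: named by @relSysId, showing the module-level annotations.
module_view = {"name": entity.get("relSysId"),
               "notes": annotation_text(entity, "module")}

# Parameter entities view: named by @name, showing the reference-level ones.
entity_view = {"name": entity.get("name"),
               "notes": annotation_text(entity, "reference")}

print(module_view)
print(entity_view)
```

The two dictionaries stand in for the two documentation sections: the same entity appears in both, but with a different name and a different set of annotations.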

 

 

Chris Maloney

NIH/NLM/NCBI (Contractor)

Building 45, 5AN.24D-22

301-594-2842

Audrey Hamelers

Sep 25, 2012, 4:59:39 PM
to dtdan...@googlegroups.com, malo...@ncbi.nlm.nih.gov
Just like it seemed superfluous to list these as <modules> and <parameterEntities> in the xml, I don't see the point of listing these entities as both modules and parameter entities in the documentation. They have no <value>, so based on what parameter entity information I'm displaying now, there isn't enough information to be worth a whole page as parameter entities. Here is what I propose:

  • List parameter entities that are modules just as modules. Post all annotations regardless of @level.
  • Create a list of items declared in the module, with links.
  • Link back to the module on the page of each item declared in the module.

If there is some problem with listing these entities as modules only, let me know.

Maloney, Christopher (NIH/NLM/NCBI) [C]

Sep 25, 2012, 5:34:04 PM
to Hamelers, Audrey (NIH/NLM/NCBI) [C], dtdan...@googlegroups.com

This is reference documentation, so I don’t see any big problem with redundant information, especially since it’s auto-generated and therefore guaranteed to be consistent.  The big reason for the DRY principle is that duplicate information can get out of sync, and that doesn’t apply here.

 

I see two different use-cases for these.  For modules, somebody wants to view the overall structure of how the DTD is assembled, from, say, HTML, MathML, JATS core, etc.  This would be more of interest to somebody who is customizing or extending the DTD.  In that case, the existing JATS documentation, which doesn’t have this list, would be hard to use, because he wouldn’t be able to find which of the parameter entities are external.

 

For the parameter entities list, I imagine somebody reading the DTD, and seeing a reference to a parameter entity (and from the reference alone, she would have no idea if it is internal or external) and wanting to look it up.  I think it would be vexing if the parameter entity weren’t listed in the parameter entity section.  That’s why I’d prefer to keep it there – just for completeness’ sake.

 

You can see that they are included in the existing JATS documentation, for example, http://jats.nlm.nih.gov/archiving/tag-library/1.0/n-bia2.html.

 

Your second and third bullets, I really like.

Audrey Hamelers

Sep 25, 2012, 7:11:55 PM
to dtdan...@googlegroups.com, Hamelers, Audrey (NIH/NLM/NCBI) [C], malo...@ncbi.nlm.nih.gov
You've described a potential use case for a developer. This documentation generator was created to give developers an easy way to create documentation for their users. Until we have a user story showing a need for documentation of modules, I don't think they should be included in the documentor. Developers who need those details will already have access to the source dtd.

Chris Maloney

Sep 25, 2012, 10:00:47 PM
to dtdan...@googlegroups.com
Hi, Audrey,

Then I'm confused about what you're proposing.  In two emails previous, it sounded like you were suggesting just listing parameter entities as modules, and not including them in the parameter entities section.  I agree that the <modules> section would be of more use to a "developer" than a "user" (although, I think the lines between those, while clear-cut here at PMC, in other environments, could be more blurred), so why would you want to move these parameter entities to <modules>?  The very existence of separate modules is an implementation detail that users would not care about.

I'd also argue that the documentation wasn't *only* created for users.  Documentation of this sort certainly can be of use to developers, even though they have access to the source DTD modules.  Reading comments in source modules, for DTDs like JATS that can consist of tens of different modules, can really be tedious and difficult.  I like to think that an important use-case (Jeff might not agree) is to facilitate developers extending and customizing JATS -- which I think is still way too hard, in part because the way the modules hook together is difficult to understand.

We could make it optional, and perhaps make the "developer" option a "maybe/someday" ticket, if you don't think it's worth the effort to do it now.  I'd still like to try to nail down what the "user" option is that you are proposing.  I would think it would be not to have a modules section at all, and to include all the module-level documentation in the parameter-entities section, pretty much the way the current official JATS documentation is now.

Cheers!
Chris

Audrey Hamelers

Sep 26, 2012, 11:18:38 AM
to dtdan...@googlegroups.com
Chris,


> I would think it would be not to have a modules section at all, and to include all the module-level documentation in the parameter-entities section, pretty much the way the current official JATS documentation is now.

Yes, that's exactly what I'd like to do. Previously, I was trying to find a compromise, because it seemed like you wanted the modules section in there, but I do think it should be a "someday/maybe" feature, possibly for when we or someone else on GitHub has the time and inclination to make a "developer" version and a "consumer" version.

I spoke with Jeff yesterday, and he'd like us to make sure we've covered the use-case we have (the frontmatter group, who only want "reachable" items in their documentation) before working on additional use-cases.
 