Utility for label addition?

Sean Fitzpatrick

May 9, 2024, 5:42:26 PM
to PreTeXt support
My understanding (based on the announcement about better, faster automatic identifiers) is that if a <latex-image> or <asymptote> does not have an @label, the pre-processor will create one by looking up the tree for an xml:id.

Or something like that.

But ultimately I should provide these labels myself, and that leaves me wondering if there is (or could be) some XSL utility to do it for me.

I am pretty sure that every <image> in APEX has an xml:id. Now, I would like to do exactly what the pre-processor does:

If I have

<image xml:id="foo">
  <description>
    <p>
      Potentially a whole bunch of text over a variable number of lines,
      and it may not be present yet depending on how far into the book we are...
    </p>
  </description>
  <shortdescription>100 very descriptive characters or fewer</shortdescription>
  <latex-image>
... and all that follows...

I would like to grab the xml:id="foo" from the image element, and write a label="foo" into the latex-image element.

I *might* be able to do this with regex in VSCode, but I'm not confident I'll get everything, and I suspect that XSL would be a better tool for the job in any case.

Do we have a tool for this? If not, would it be hard (for me, not Rob) to make one?
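
For readers following along, the transformation asked for above can be sketched in a few lines. This is not the XSL utility the thread goes on to build; it is a hedged Python stand-in using only the standard library, and it assumes the <latex-image> or <asymptote> is a direct child of an <image> that carries an xml:id. The function name add_labels and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree stores xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def add_labels(root):
    """Copy an image's xml:id into @label on child latex-image/asymptote
    elements that do not already have a label. Returns the number added."""
    changed = 0
    for image in root.iter("image"):
        image_id = image.get(XML_ID)
        if image_id is None:
            continue  # nothing to copy from
        for child in image:
            if child.tag in ("latex-image", "asymptote") and child.get("label") is None:
                child.set("label", image_id)
                changed += 1
    return changed


# Hypothetical sample, modeled on the snippet in the post above.
SAMPLE = """<figure>
  <image xml:id="foo">
    <shortdescription>100 very descriptive characters or fewer</shortdescription>
    <latex-image>\\draw (0,0) -- (1,1);</latex-image>
  </image>
</figure>"""

root = ET.fromstring(SAMPLE)
count = add_labels(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

Note that ElementTree discards comments on parse by default and reserializes the whole tree, so this is an illustration of the logic, not something to run over real source without checking the diff.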

Rob Beezer

May 9, 2024, 7:55:26 PM
to pretext...@googlegroups.com
Nothing like this exists. Yet.

I think it is a perfect exercise for you, and then we can make it available for everybody. Especially since I don't think I can get myself wound-up to do it. I'd thought about it at the time.

I believe the pre-processor is already doing exactly this. But easier standalone and then your source (everybody's source) can be neat and tidy.

Can you come to second hour of Drop-In tomorrow? If so, I'd think lurkers interested in XSL would be welcome.

Rob

Sean Fitzpatrick

May 9, 2024, 8:13:06 PM
to pretext...@googlegroups.com
Sure, I'll try to be there tomorrow!

Rob Beezer

May 9, 2024, 8:37:45 PM
to pretext...@googlegroups.com
Great! Make a barren stylesheet that xsltproc doesn't complain about and we will add templates. Done right, it might just produce nothing, which is fine, and maybe even correct.

Sean Fitzpatrick

May 10, 2024, 7:56:39 PM
to pretext...@googlegroups.com
I've found two unexpected effects of the stylesheet we put together: in the <?xml version="1.0" encoding="UTF-8"?>  line at the top of each file, it stripped the encoding part. And anywhere there was a tag with no contents, like <cell></cell>, it was changed to the self-closing version: <cell/>

Sean Fitzpatrick

May 10, 2024, 8:02:16 PM
to pretext...@googlegroups.com
Oh, and it also stripped excess whitespace (<mdash /> becomes <mdash/>, plus blank lines are removed) and it escaped some characters (e.g. '>' was replaced by &gt;)

Rob Beezer

May 10, 2024, 10:52:47 PM
to pretext...@googlegroups.com
Yes, I should have warned you about all that. It's been a while since I did exactly this sort of thing.

It is also possible the order of attributes could change. I'm surprised blank lines go missing. Everything else produces equivalent source (except maybe the encoding bit), but you have to decide if you like it that way. A huge change is that CDATA sections see lots of changes to escaped characters and the CDATA goes away.

Rob
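
The side effects described above are typical of any parse-then-serialize round trip, not only of xsltproc. As a hedged illustration, the Python standard library shows the same kinds of change on a made-up fragment: the empty element is collapsed, the CDATA wrapper disappears, and '>' comes back escaped. The exact spelling differs between serializers (ElementTree writes a space before the slash in empty elements, where the thread reports <cell/>), so treat this as a sketch of the behaviour, not a reproduction of the stylesheet's output.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with an empty element pair, a self-closing
# element written with a space, and a CDATA section containing '>'.
SOURCE = "<row><cell></cell><mdash /><c><![CDATA[a > b]]></c></row>"

root = ET.fromstring(SOURCE)

# Serializing the parsed tree does not preserve the original spelling:
# the parser only keeps the information content, not the markup choices.
round_tripped = ET.tostring(root, encoding="unicode")
print(round_tripped)
```

The two forms are equivalent as XML, which is why the choice comes down to whether the rewritten source is acceptable to its author.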

Sean Fitzpatrick

May 11, 2024, 12:01:13 AM
to pretext...@googlegroups.com
I think it shouldn't be an issue. Easy find+replace to pop the UTF-8 encoding bit back in.
And maybe the only blank lines that went missing were the ones right under that initial xml identifier at the top of each file.
There were no CDATA sections anymore, so I think replacing a few '>' is just making up for some laziness in transcribing TikZ code.

But the notable bit: building APEX HTML, with regeneration of all assets, took just over an hour.
A few years ago, on the same laptop, that took the better part of a day.

Sean Fitzpatrick

May 11, 2024, 10:11:24 AM
to pretext...@googlegroups.com
Admittedly, the timing without adding labels to each latex-image and asymptote is about the same.
Which I guess isn't surprising, since using XSL to add all the labels took less than 10 seconds.