XSLT and regexp


D. Brian Walton

Aug 28, 2025, 11:55:32 AM
to pretext-dev
Andrew and I have been discussing how fillin questions that have text answers break down when regular expression special characters are present in the correct answer text. This is because, behind the scenes in the Runestone Component, text answers are compared using regular expressions.

I am working to resolve this and plan to do so by creating an escaped version of the correct answer text, putting a backslash in front of any offending characters. This could be done with a simple regexp search/replace.

Currently, the PreTeXt XSL code does not use the regexp extension to XSLT 1.0. Should I avoid introducing a new extension? If we want to avoid that, I could mimic the JSON escaping mechanism, which specifies individual substitutions and then applies them with the str:replace function.
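The two escaping strategies mentioned above can be compared in a small sketch. This is Python rather than XSLT, purely to illustrate the logic; the character list `SPECIAL_CHARS` and both function names are hypothetical, and the exact set of characters to escape depends on the regex dialect used by the checker.

```python
import re

# Hypothetical list of characters to escape; a real implementation would
# need to match the regex dialect used by the answer checker.
SPECIAL_CHARS = "\\^$.|?*+()[]{}"


def escape_by_substitution(text: str) -> str:
    """Escape one character at a time, like a chain of string replacements."""
    # The backslash comes first in SPECIAL_CHARS so that backslashes added
    # for later characters are not escaped a second time.
    for ch in SPECIAL_CHARS:
        text = text.replace(ch, "\\" + ch)
    return text


def escape_by_regex(text: str) -> str:
    """Escape every special character with one regex search/replace."""
    return re.sub(r"([\\^$.|?*+()\[\]{}])", r"\\\1", text)


print(escape_by_substitution("foo()"))  # foo\(\)
print(escape_by_regex("2+5"))           # 2\+5
```

Both functions give the same result for the same input; the substitution version only works if the backslash is replaced before any other character.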

Thanks,
Brian

Charilaos Skiadas

Aug 28, 2025, 1:13:28 PM
to prete...@googlegroups.com
I don’t have any particular suggestions about how to implement this, but I would ask that we don’t break the ability for the answers to be regexes. To me at least, the ability to specify the correct answer as a regular expression seems useful, and I would rather see it better documented (it is documented, but I think the whole fill-in-the-blanks section of the guide could use some cleaning, expanding, and examples). I ran into this issue this term with some of our problems (students were asked to write out a function call as the answer, so parentheses were naturally involved), and I simply changed the expected string into a regular expression by adding backslashes, which was not hard to do. So changing the behavior would certainly break those tests of ours. That is not a big deal and is easy to fix if this does go through, but I like having the regex behavior, and the change may also break other people’s code.

How will we determine whether the answer is presented as a regular expression to begin with, in which case escaping things will break the regex? The guide says "<strcmp> that does not include the @use-answer will have content that defines a matching string or more generally a matching regular-expression." From that reading, the intent is that regular expressions are allowed. If we also want to allow plain strings that just happen to contain regular expression special characters, we need some way to distinguish the two cases, maybe via an attribute. Probably you are already there.
I would suggest making regex the default in that case.

Charilaos Skiadas
Department of Mathematics
Hanover College



Andrew Scholer

Aug 28, 2025, 1:59:14 PM
to prete...@googlegroups.com
There are two cases to worry about: the @answer that gets used with <strcmp use-answer="yes"/>, and <strcmp>pattern</strcmp>. Currently, both are treated as regexes. The <strcmp>pattern</strcmp> version seems relatively easy to add an attribute to (or to provide a <regexcmp> alternative for). The @answer version is trickier.

Brian has made elsewhere the excellent point that @answer on the fillin is used for static renderings of the answer. Thus it really is not appropriate to write <fillin answer="if\s*\(\s*x\s*=\s*5\s*\)"/>, even if that is the regex you want to use to verify possible versions of "if(x = 5)". A <strcmp>pattern</strcmp> is where the regex should go. So I think heading in the direction of @answer always being a plain string, and restricting regexes to explicit <strcmp>'s, is the right call.

Unfortunately, there are likely books out there that rely on the current regex nature of @answer. (If you start from a Runestone book that used RST, where using regexes for fillin answers was common, the natural-seeming translation to PTX would put those regexes in the @answer.) I know I had started down that road before a conversation with Brian.

I don't see an easy way to avoid breaking old books while providing new behavior for @answer + use-answer="yes", short of making the new, correct behavior opt-in, which would be awful in a different way.

Andrew Scholer


D. Brian Walton

Aug 28, 2025, 2:05:42 PM
to prete...@googlegroups.com
An "answer" that is provided to a fillin should **not** be a regular expression. It is used separately for static representations as the text that IS the correct answer. Only the provided answer will be escaped, and only when it is used with @use-answer. [The guide is incorrect in implying that @use-answer="yes" avoids using regular expressions. This is what needs to be fixed.] The checker for an answer CAN be a regular expression, but it needs to be provided separately in the <evaluation> block.

Andrew gave a good example: if the correct answer is "foo()", then the checker sees the parentheses as an empty group, so it would *not* match "foo()" as submitted text but would match "foo". Or suppose the answer is the literal string "2+5". A regular expression reads that as one or more 2s followed by a 5, so it matches 25, 225, 2225, 22225, etc., but does not match the literal submission "2+5".
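The two failure cases above can be reproduced in a few lines. This sketch uses Python's `re` module as a stand-in for the checker, and assumes the whole submission must match the pattern; the Runestone Component itself runs in JavaScript, so details of its matching may differ. The helper name `is_match` is hypothetical.

```python
import re


def is_match(pattern: str, submitted: str) -> bool:
    # Assumption: the entire submission must match the pattern.
    return re.fullmatch(pattern, submitted) is not None


# "()" is an empty group, so the pattern matches "foo" but not "foo()".
print(is_match("foo()", "foo"))    # True
print(is_match("foo()", "foo()"))  # False

# "2+" means "one or more 2s", so "2+5" matches 25, 225, ... but not "2+5".
print(is_match("2+5", "225"))      # True
print(is_match("2+5", "2+5"))      # False

# Escaping the answer first restores the literal comparison.
print(is_match(re.escape("foo()"), "foo()"))  # True
print(is_match(re.escape("2+5"), "2+5"))      # True
```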

I do want to add an attribute on <strcmp>, say @literal="yes", that also treats the submitted comparison text as a literal string instead of a regular expression and would escape that text as well. For example,
<fillin mode="string" name="my_blank" answer="foo()"/>
would be paired with
<evaluate name="my_blank">
  <test correct="yes"> 
    <strcmp use-answer="yes"/> <!-- escape the text -->
  </test>
  <test>
    <strcmp literal="yes">foo[]</strcmp>
    <feedback>Functions don't use square brackets.</feedback>
  </test>
</evaluate>
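The rule proposed in this message can be modeled as a small decision function. This is only a sketch in Python, not the PreTeXt XSL: the function `checker_pattern` and its parameters are hypothetical names, and `re.escape` stands in for whatever escaping the stylesheet would perform.

```python
import re
from typing import Optional


def checker_pattern(strcmp_text: Optional[str], answer: str,
                    use_answer: bool = False, literal: bool = False) -> str:
    """Return the regex source that would be handed to the checker."""
    if use_answer:
        # @answer is always a plain string, so it is always escaped.
        return re.escape(answer)
    if strcmp_text is None:
        raise ValueError("a <strcmp> without use-answer needs content")
    # @literal="yes" escapes the content; otherwise it is used as a regex.
    return re.escape(strcmp_text) if literal else strcmp_text


# The two tests from the example above:
correct = checker_pattern(None, "foo()", use_answer=True)
brackets = checker_pattern("foo[]", "foo()", literal=True)
print(re.fullmatch(correct, "foo()") is not None)   # True
print(re.fullmatch(brackets, "foo[]") is not None)  # True
```

A <strcmp> with neither attribute passes through unchanged, which keeps the existing regex behavior for explicit patterns.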

And I just saw Andrew's response as well.

I would not be changing the Runestone Component behavior itself, so this is only going to affect new books authored with PreTeXt.

- Brian

Rob Beezer

Aug 28, 2025, 10:26:38 PM
to prete...@googlegroups.com
On my phone, so this may not be complete.

I've tried to find documentation at the EXSLT site that the regexp extension is fully supported by xsltproc and the Python lxml module (I think they are based on the same library). I thought it was obviously listed somewhere.

Have you tried using each of the three functions? I saw a "function-available()", or similar, utility that might give a definite answer.

We are not opposed to extension functions that are fully supported by our processors.

Rob