WeBWorK Work

Rob Beezer

Nov 18, 2022, 2:03:13 PM
to prete...@googlegroups.com
For me, today ends WWW: WeBWorK Week (when I wasn't running a workshop for
disability services staff in Denver).

Lots of

* consolidation: WW problem generation and representations all in one place in
the pre-processor.

* clarity around identifiers, paving the way for improvements in other places.

* upgrades to error-checking for WW processing.

A no-change refactor for projects set up correctly for WW (and everything else).
Well, not exactly. Some very minor additions of whitespace in the search
documents supporting new search function. Like 3 extra space characters in four
or five places in the sample article. I suspect footnotes, but did not confirm.

Changes in behavior for error reporting and messages. Definitely *more*.
Hopefully no false reports. I think I could provoke one seemingly false report
in a convoluted situation, but it would not affect processing.

More for Alex in a minute. Website updated just now.

Thanks to those who read all the traffic here, and stay alert for adverse
changes. Especially these -dev updates that should not concern most authors. I
know who many of you are. ;-) I appreciate you being my eyes on all the
possible inputs and outputs. Pay attention to your next WW runs, once you get a
CLI upgrade, etc., and please report here. You know the drill.

Thanks,
Rob

Rob Beezer

Nov 18, 2022, 2:19:44 PM
to prete...@googlegroups.com
Dear Alex,

I hope you can put some WW processing through its paces with changes just
announced (no rush, and you'll need a nightly). Sample WW chapter has been hit
pretty hard with continual diffs through the week, so I'm pretty confident.

Some observations, which we can discuss at Drop-In. I'm feeling much better
about WW, and I don't think any of this is urgent.

* Sometimes diffs, of medium scale, are mighty confusing. I blame the
dictionaries in the Python code, since it happens when I mess with identifiers.
I understand the attraction of indexing into dictionaries with strings, but
maybe we can accomplish the same thing with lists (preserve order of appearance
from source) without too much trouble?

* I could cosmetically move @webwork-id in the XSL to the @ww-id that
appears in the Python. I did not look too closely at the specifics. Your call.
If it ain't broke...

* "effective_publisher_file" in the Python. I *think* this drill happens in
the interface? Maybe the CLI does it too? If my suspicions are right, the
switch always behaves one particular way? (See the "del" operation in the
interface code.)

* Suppose a @copy is busted. Say mis-spelled? We make a vanilla copy
anyway, to preserve the existence of a "webwork" element in that location. It
would be a bad thing to put a @copied-from on the vanilla version? Totally
mess up how the PG archive does not use many duplicates of each problem? (I
didn't do it.)

* There is an image thing (@syntax="PGtikz") in the sample chapter that raises
a deprecation warning. I like folks new to PreTeXt to be able to build samples
without errors. Can we trash this? There is a huge mess sample that has bad
stuff to test deprecations, but I have been less and less careful about using it.
Not sure I want it to be a multi-step WW build.

Until the next feature request,

Thanks,
Rob

Alex Jordan

Nov 18, 2022, 6:58:29 PM
to prete...@googlegroups.com
We discussed things in drop-in. Summarizing:

> I hope you can put some WW processing through its paces with changes just
> announced (no rush, and you'll need a nightly).

Will do.

> * Sometimes diffs, of medium scale, are mighty confusing. I blame the
> dictionaries in the Python code, since it happens when I mess with identifiers.
> I understand the attraction of indexing into dictionaries with strings, but
> maybe we can accomplish the same thing with lists (preserve order of appearance
> from source) without too much trouble?

I think the least invasive thing to do is remove the "sorted" from:
for problem in sorted(origin):
and current python will leave the dictionary in its original order.
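
A small sketch of why dropping "sorted" helps. The name "origin" comes from the line above, but the keys and values here are made up for illustration. Since Python 3.7, a dict iterates in insertion order, so plain iteration follows the order in which problems were added:

```python
# Hypothetical stand-in for the real dictionary; keys are inserted in
# the order the problems appear in the source.
origin = {
    "webwork-10": "problem C",
    "webwork-2": "problem A",
    "webwork-1": "problem B",
}

# sorted() orders the string keys lexicographically, so "webwork-10"
# lands between "webwork-1" and "webwork-2".
sorted_order = [problem for problem in sorted(origin)]

# Plain iteration follows insertion order (guaranteed since Python 3.7).
source_order = [problem for problem in origin]

print(sorted_order)  # ['webwork-1', 'webwork-10', 'webwork-2']
print(source_order)  # ['webwork-10', 'webwork-2', 'webwork-1']
```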


> * I could cosmetically move @webwork-id in the XSL to the @ww-id that
> appears in the Python. I did not look too closely at the specifics. Your call.
> If it ain't broke...

Not opposed to trying it, for the sake of reducing clutter.


> * "effective_publisher_file" in the Python. I *think* this drill happens in
> the interface? Maybe the CLI does it too? If my suspicions are right, the
> switch always behaves one particular way? (See the "del" operation in the
> interface code.)

We decided the interface file should check for when a user had both:
-p my_publisher_file.xml
and
-x publisher my_other-publisher_file.xml
and in that case, it should warn the user, and also do a:
del stringparams["publisher"]
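
A minimal sketch of that check, not the actual interface code. The function name and the "publisher_file" argument are hypothetical, and "stringparams" is assumed to be a dict built from the -x options:

```python
import warnings


def resolve_publisher_conflict(publisher_file, stringparams):
    """Warn and drop a "publisher" string parameter if -p was also given.

    Both argument names are hypothetical; stringparams is assumed to be
    a dict of the -x key/value pairs.
    """
    if publisher_file is not None and "publisher" in stringparams:
        warnings.warn(
            'both -p and -x publisher were given; ignoring the "publisher" '
            "string parameter in favor of " + publisher_file
        )
        del stringparams["publisher"]
    return stringparams
```

With only one of the two options present, the dictionary is returned unchanged and no warning is issued.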


> * Suppose a @copy is busted. Say mis-spelled? We make a vanilla copy
> anyway, to preserve the existence of a "webwork" element in that location. It
> would be a bad thing to put a @copied-from on the vanilla version? Totally
> mess up how the PG archive does not use many duplicates of each problem? (I
> didn't do it.)

I don't think we want a @copied-from on that. It would just cause trouble.


> * There is an image thing (@syntax="PGtikz") in the sample chapter that raises
> a deprecation warning. I like folks new to PreTeXt to be able to build samples
> without errors. Can we trash this? There is a huge mess sample that has bad
> stuff to test deprecations, but I have been less and less careful about using it.
> Not sure I want it to be a multi-step WW build.

It can go away.


Of the things above that need code editing, go ahead and do whichever
ones you want to clear from the docket. And I can do the rest this
weekend.

Alex

Rob Beezer

Nov 18, 2022, 9:06:46 PM
to prete...@googlegroups.com
Thanks, Alex, for the summary.

"sorted()" gone, example with deprecation gone, @webwork-id gone, and
publication file conflict gone.

Website update happening now. For the record, five commits, starting with
24d2eca35b574c2ffbc5789437e174bb1a96a695.

Rob

Alex Jordan

Nov 20, 2022, 3:14:51 PM
to prete...@googlegroups.com
The only testing I have done is to try making ORCCA representations
and HTML, with --restrict to the first section. When I use
pretext/pretext/pretext for that, it looks like it's working. Mind
you that is a surface level visual inspection, not a diff check.

And it feels a lot faster. And using xsltproc --profile (with content
commented out instead of using --restrict) shows much less activity.
So that's a success!

When I try the same restricted build with the nightly CLI, something
is not working. It successfully builds the representations file,
stopping where it should at webwork-95 given the restriction. But then
if I build the HTML, it's clear that the restriction is no longer
being respected. (It was when I used the CLI a few days ago.) For
example, there is an error about not finding webwork-96 in the
representations file. And then there are about 6000 more such errors.

Did you foresee something about these recent changes affecting the
CLI? I can't quite recall.
> --
> You received this message because you are subscribed to the Google Groups "PreTeXt development" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to pretext-dev...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/pretext-dev/0f748afa-52bb-a863-c590-cb6ce51a739b%40ups.edu.

Rob Beezer

Nov 20, 2022, 3:40:00 PM
to prete...@googlegroups.com
Thanks, Alex.

I don't think the pre-processor *ever* respects the subtree argument. It *must*
function globally for lots of things, such as numbering.

So (shooting from the hip) I think the second pass is trying to augment the
*entire* source with WW representations and not finding them in the
*abbreviated* representations file.

Once HTML processing takes over, then the subtree should be respected again.

I feared that the more comprehensive error-reporting I added might give some
false positives. (Not 6000 of them!)

Can you say/investigate if the eventual HTML produced is (a) the right subtree,
and (b) has all the functional WW it should have?

Perhaps I can restrict the second pass to the subtree via some sort of id
correspondence...(dang I thought I had this all sorted).

Or maybe the error-checking should be a single more global message when there is
a subtree active.

Rob

Alex Jordan

Nov 20, 2022, 3:50:40 PM
to prete...@googlegroups.com
OK, I get it. The CLI uses -v, and I was using pretext/pretext without
-v. So that is why I saw errors with the CLI but not with
pretext/pretext. When I use -v with pretext/pretext, that also gives
me the ~6K errors.

I will follow up with this, but possibly not today.

Rob Beezer

Nov 20, 2022, 4:26:07 PM
to prete...@googlegroups.com
Thanks for the further investigation.

I suspect your errors (you didn't show me any!!!) are at (my current) line 432
of pretext-assembly.xsl:

<xsl:when test="not($the-webwork-rep)"> ...

because just above:

@ww-id != $ww-id

($ww-id is not present in the representations file)

I thought that message would be for *every* problem or for *no* problems.
Obviously overkill. Obviously not thinking about subtree generation of the
representations file.

I'm trying to avoid any global processing/warnings in the representations file.
Not sure why, just general hygiene.

Maybe the first pass (the generation) can throw up a big warning: "when subtrees
are in use, some error-checks are off". Then in the second pass, we squelch the
6000 errors.

You can get really wild circular errors when a variable/tree in the
pre-processor has to "wait" on some "later" variable. So the above will depend
on when/how the subtree formulation happens. Be careful.

Rob

Rob Beezer

Nov 20, 2022, 5:26:36 PM
to prete...@googlegroups.com
On 11/20/22 12:14, Alex Jordan wrote:
> And it feels a lot faster. And using xsltproc --profile (with content
> commented out instead of using --restrict) shows much less activity.
> So that's a success!

Thanks for the enthusiasm for a week's worth of work. ;-) BUT, you've not
conducted a fair test.

When you "comment out" content, the computation of "webwork-NNN" becomes faster
since there is less content to traverse. I think it is O(n^2) in the size of
the source. "webwork-6032" takes a long time on all of ORCCA. "webwork-53" on
Chapter 1 does not.
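
To make the quadratic growth concrete, here is a rough Python sketch. It is a simplification, not how xsl:number is implemented: the document shape and both function names are invented, and "rescanning" just means walking the source from the top for every "webwork" element.

```python
import xml.etree.ElementTree as ET


def build_source(n):
    """Hypothetical flat document with n "webwork" elements."""
    root = ET.Element("book")
    for _ in range(n):
        ET.SubElement(ET.SubElement(root, "exercise"), "webwork")
    return root


def number_by_rescanning(root):
    """For each webwork, walk the document from the top and count the
    webwork elements up to and including it."""
    numbers, visited = {}, 0
    for target in root.iter("webwork"):
        count = 0
        for node in root.iter():
            visited += 1
            if node.tag == "webwork":
                count += 1
            if node is target:
                break
        numbers[target] = count
    return numbers, visited


def number_in_one_pass(root):
    """Carry a running count through a single walk of the document."""
    numbers, visited, count = {}, 0, 0
    for node in root.iter():
        visited += 1
        if node.tag == "webwork":
            count += 1
            numbers[node] = count
    return numbers, visited
```

Both functions assign the same numbers. The first visits on the order of n^2 nodes for n problems, the second visits each node once, which is why shrinking the source (by commenting out content) speeds up the first approach so dramatically.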

(We've got to get away from the idea of a book as a program. Or as LaTeX. XML
is static and we need all of it, all the time. There is no such thing as
"commenting out".)

Have you noticed that if you restrict to Chapter 2 the division numbers are
correct (I suspect!)? Because the pre-processor computes them now, looking at
all the source.

All my efforts last week were to get to a place where we could replace
"webwork-6032" by something faster to compute. But I thought this would require
every project to re-generate their representation file. A distraction I was
going to save for winter break, at best.

BUT your message made me think. I believe I can easily count "webwork" (both
kinds) in the same recursive traversal we have anyway for the "webwork" pass of
the pre-processor. Yes! Super-fast and identical to current identifier.
Brittle, but only with respect to "webwork" changes.

This whole approach of traversing the entire tree and passing information along
to update and employ reminds me of Illinois governor Rod Blagojevich in 2008,

"I've got this thing, and it's [bleeping] golden."

Rob

Rob Beezer

Nov 20, 2022, 8:11:47 PM
to prete...@googlegroups.com
On 11/20/22 14:26, Rob Beezer wrote:
> I believe I can easily count "webwork" (both
> kinds) in the same recursive traversal we have anyway for the "webwork" pass of
> the pre-processor.  Yes!

My turn for premature enthusiasm. Same mistake I made when you and I were
talking the other day. But the computational graph-theorist in me has not given
up. There must be a way to capture a value and propagate it *back up* the tree
during the backtracking step...

Rob

Alex Jordan

Nov 20, 2022, 9:03:31 PM
to prete...@googlegroups.com
Is there any way to use something like xsltproc's --profile using pretext/pretext? Or is there a way to use --restrict with xsltproc? Maybe a stringparam? If so for either thing, I am not aware of it.

So I use xsltproc's --profile to study things. And yet I still don't want to process all of orcca for  simple testing. So I comment out a lot. And the speed now is still noticeably faster than last week. 


Rob Beezer

Nov 20, 2022, 10:46:07 PM
to prete...@googlegroups.com
On 11/20/22 18:03, Alex Jordan wrote:
> Is there any way to use something like xsltproc's --profile using
> pretext/pretext? Or is there a way to use --restrict with xsltproc? Maybe a
> stringparam? If so for either thing, I am not aware of it.

grep

name="subtree"

to find two xsl:param.

Then grep

$subtree

(careful with the dollar!) to find lots of things, some relevant, like
$subtree-root. Some not.

Then write some documentation for the Developer's Part explaining how to use
--profile??? ;-)

Rob


Alex Jordan

Nov 20, 2022, 11:42:58 PM
to prete...@googlegroups.com
> There must be a way to capture a value and propagate it *back up* the tree
during the backtracking step...

Challenge accepted. I hope this works. Logic is:
* go to the first child
* but if there is no child, go to the first following-sibling
* but if there aren't any, now you have to go back up the tree to a
parent or ancestor, looking for its next following-sibling. I call
this "uncle".

<!-- Visit the first child, else the next sibling, else look for an "uncle". -->
<xsl:template match="*">
    <xsl:param name="count" select="1"/>
    <xsl:choose>
        <xsl:when test="child::*">
            <xsl:apply-templates select="child::*[1]">
                <xsl:with-param name="count" select="$count + 1"/>
            </xsl:apply-templates>
        </xsl:when>
        <xsl:when test="following-sibling::*">
            <xsl:apply-templates select="following-sibling::*[1]">
                <xsl:with-param name="count" select="$count + 1"/>
            </xsl:apply-templates>
        </xsl:when>
        <xsl:otherwise>
            <xsl:apply-templates select="." mode="uncle">
                <xsl:with-param name="count" select="$count"/>
            </xsl:apply-templates>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

<!-- Climb until some ancestor has a following sibling, then resume there. -->
<xsl:template match="*" mode="uncle">
    <xsl:param name="count"/>
    <xsl:choose>
        <xsl:when test="parent::*/following-sibling::*">
            <xsl:apply-templates select="parent::*/following-sibling::*[1]">
                <xsl:with-param name="count" select="$count + 1"/>
            </xsl:apply-templates>
        </xsl:when>
        <xsl:otherwise>
            <xsl:apply-templates select="parent::*" mode="uncle">
                <xsl:with-param name="count" select="$count"/>
            </xsl:apply-templates>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

Now if you have a tree:
A
B
C
D
E
F
G
H

and you enter applying templates at A,
you will get
A (1)
B (2)
C (3)
D (4)
E (5)
F (6)
G (7)
H (8)

Anyway, I think it's a way to walk the tree hitting every single node
in document order, and keeping track of how many nodes you've already
passed.
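
For anyone who wants to experiment outside XSLT, the same three rules can be sketched in Python. This is an illustration only: it uses a loop rather than template recursion, the function name is invented, and since ElementTree has no parent or sibling axes the two lookup tables are built up front.

```python
import xml.etree.ElementTree as ET


def walk_counting(root):
    """Visit every element in document order, carrying a running count.

    Rules: first child, else next sibling, else the next sibling of the
    nearest ancestor that has one (the "uncle").
    """
    parent = {child: elem for elem in root.iter() for child in elem}
    next_sibling = {}
    for elem in root.iter():
        children = list(elem)
        for left, right in zip(children, children[1:]):
            next_sibling[left] = right

    visits = []
    node, count = root, 1
    while node is not None:
        visits.append((node.tag, count))
        if len(node):                    # has a child
            node = node[0]
        elif node in next_sibling:       # has a following sibling
            node = next_sibling[node]
        else:                            # climb until an uncle exists
            while node is not None and node not in next_sibling:
                node = parent.get(node)
            node = next_sibling[node] if node is not None else None
        count += 1
    return visits


# A hypothetical tree with eight elements (the shape is an assumption).
tree = ET.fromstring("<A><B><C/><D/></B><E><F><G/></F></E><H/></A>")
for tag, count in walk_counting(tree):
    print(f"{tag} ({count})")
```

On this tree the elements A through H come out numbered 1 through 8 in document order.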

Rob Beezer

Nov 21, 2022, 12:01:28 AM
to prete...@googlegroups.com
Thanks, Alex! I didn't realize I'd laid down a challenge. ;-)

I'll take a closer look tomorrow when I'm thinking clearer and can "run" these templates.

Idle speculation: I wonder if the "uncle" template can be factored out? It looks a lot like the non-modal template. Could the non-modal template be passed a parent, or a parent+first-following-sibling, or similar? That *is* a challenge!

Rob

Alex Jordan

Nov 21, 2022, 12:29:43 AM
to prete...@googlegroups.com
The problem is that when you need an uncle, maybe it's actually a great uncle or a great great uncle that you need. Etc. So some kind of recursion is needed there.

The main template could identify the uncle using the modal. But then the main could actually be the one to apply templates to it.

Alex Jordan

Nov 21, 2022, 12:31:35 AM
to prete...@googlegroups.com
In hindsight, I think the key is only applying templates to the one "next up" element. No select="*" allowed or you necessarily have to lose information when it goes back up the tree.

Rob Beezer

Nov 21, 2022, 11:30:30 AM
to prete...@googlegroups.com
Maybe just one template, but a second parameter is a boolean for descending v.
ascending.

First half of the template is left-most child or next sibling, while the other
half is next sibling or parent. The boolean gets flipped by each half at the
right time.

You only increment $count while descending.

A twist is that we need to xerox all of the tree, so we can *add* the
information from the counting. Maybe that is "easy" at the same places where
the count is incremented.
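
A rough Python rendering of the single-template idea, under stated assumptions: the flag is called "descending", the function name and message wording are illustrative, and a loop stands in for the template recursion. It only reports the walk; it does not attempt the copying.

```python
import xml.etree.ElementTree as ET


def dfs_messages(root):
    """Walk the tree with one routine and a descending/ascending flag.

    The count is incremented only on a descending step.
    """
    parent = {c: p for p in root.iter() for c in p}
    next_sibling = {}
    for p in root.iter():
        kids = list(p)
        next_sibling.update(zip(kids, kids[1:]))

    messages = []
    node, count, descending = root, 1, True
    while node is not None:
        if descending:
            messages.append(f"AT: {node.tag} COUNT: {count}")
            if len(node):                       # left-most child
                node, count = node[0], count + 1
            elif node in next_sibling:          # next sibling
                node, count = next_sibling[node], count + 1
            else:                               # flip: start ascending
                node, descending = parent.get(node), False
        else:
            messages.append(f"ASCEND TO: {node.tag}")
            if node in next_sibling:            # flip: descend again
                node, count, descending = next_sibling[node], count + 1, True
            else:
                node = parent.get(node)
    return messages


# Hypothetical five-element tree.
for line in dfs_messages(ET.fromstring("<A><B><C/><D/></B><E/></A>")):
    print(line)
```

Every interior node is reported twice (once going down, once coming back up), so the number of steps is larger than the number of elements.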

All speculation.

Rob

On 11/20/22 21:31, Alex Jordan wrote:
> In hindsight, I think the key is only applying templates to the one "next up"
> element. No select="*" allowed or you necessarily have to lose information when
> it goes back up the tree.

Alex Jordan

Nov 21, 2022, 11:45:59 AM
to prete...@googlegroups.com
There is a complication when you have a tree like

A
    B
        C
            D
        E
F

A needs to pass an F reference to B as the next sibling. B has no next sibling so it passes the F reference along to C. Now C would pass an E reference to D, overwriting the F reference. After E is processed, nothing takes us to F.

Solution: these references could stack up as a space delimited list of references and we pop one off when it is used.
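
A Python sketch of the stacking idea, assuming a reference can be held as a (parent, index) pair. That representation and the function name are inventions for illustration. XML needs a single root, so a hypothetical "root" element wraps the tree above.

```python
import xml.etree.ElementTree as ET


def walk_with_pending_stack(root):
    """Document-order walk that stacks "next sibling" references.

    Each visited node pushes a reference to its following sibling (if
    any). When a node has no children, the most recent reference is
    popped and used.
    """
    visits = []
    pending = []                      # stack of (parent, index) references
    node, parent, index = root, None, 0
    count = 1
    while True:
        visits.append((node.tag, count))
        if parent is not None and index + 1 < len(parent):
            pending.append((parent, index + 1))
        if len(node):                 # descend to the first child
            parent, index = node, 0
        elif pending:                 # pop the innermost pending sibling
            parent, index = pending.pop()
        else:
            return visits
        node = parent[index]
        count += 1


tree = ET.fromstring("<root><A><B><C><D/></C><E/></B></A><F/></root>")
print(walk_with_pending_stack(tree))
```

While D is being visited the stack holds the F reference underneath the E reference, so E is popped first and F is still there afterwards.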





Alex Jordan

Nov 21, 2022, 12:56:01 PM
to prete...@googlegroups.com
Er, wait. This is all happening before we have id's. So a
space-delimited string is out of reach.

Can we pass a node set as a param? And be able to predict which node
is top of the stack ("next up")?

Rob Beezer

Nov 21, 2022, 1:19:28 PM
to prete...@googlegroups.com
Hmmmmm. Yes, you can pass a whole node-set as a parameter. That would be an
XSL-thonic thing to do. There are places in the code where I strip off one at a
time (front-end, back-end) and pass the remainder in the recursion.

Make a node-set of all "webwork" once before recursion goes, should be in
document-order. Pass it around (un-altered) in the usual recursion (xeroxing
template, with exceptions). The "number" of any given "webwork" is its
position() in this node-set (a technique which I use somewhere else). Or strip
the front-end on each discovery of a "webwork" in the source and compute number
based on original length versus remainder of node-set.
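
The position-in-a-node-set technique, sketched in Python with an invented function name and a made-up sample document. One caveat: the dictionary lookup is a Python convenience, and in XSLT 1.0 finding a node's position in a node-set would take some kind of scan.

```python
import xml.etree.ElementTree as ET


def webwork_identifiers(root):
    """Map each "webwork" element to "webwork-N", where N is its
    position (from 1) in the document-order list of all of them."""
    all_webwork = list(root.iter("webwork"))   # the one scan of the document
    return {ww: f"webwork-{n}" for n, ww in enumerate(all_webwork, start=1)}


# Hypothetical source with three problems spread over two chapters.
source = ET.fromstring(
    "<book><chapter><exercise><webwork/></exercise>"
    "<exercise><webwork/></exercise></chapter>"
    "<chapter><exercise><webwork/></exercise></chapter></book>"
)
identifiers = webwork_identifiers(source)
print(list(identifiers.values()))  # ['webwork-1', 'webwork-2', 'webwork-3']
```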

Rob

Alex Jordan

Nov 21, 2022, 1:40:43 PM
to prete...@googlegroups.com
If $nodeset is a node set that includes a node $node, can you do
something like:

select="$nodeset[not $node]"

I'm not suggesting that is valid syntax, but is that an idea that can
be captured by XSLT?

Rob Beezer

Nov 21, 2022, 2:16:06 PM
to prete...@googlegroups.com
I think you want a filter like

$nodeset[count(.|$node) = 2]

The context of the filter runs over all of $nodeset. "." is that current
element/context. The count of the union ("|") is either 1 (the same node) or 2
(different nodes), so this filter keeps every node of $nodeset except $node.
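
The same identity trick as a Python analogy, with an invented function name. A node-set holds each node at most once, so the set built here plays the role of the union and its size plays the role of count():

```python
import xml.etree.ElementTree as ET


def without_node(nodeset, node):
    """Keep the members of nodeset that are not `node`, by identity."""
    # {id(x), id(node)} stands in for the union of "." and $node:
    # one member if they are the same node, two if they differ.
    return [x for x in nodeset if len({id(x), id(node)}) == 2]


# Three elements that look alike but are distinct nodes.
first, second, third = ET.Element("p"), ET.Element("p"), ET.Element("p")
remaining = without_node([first, second, third], second)
print(len(remaining))  # 2
```

Testing for a size of 1 instead gives the equality test, which selects just the matching node.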

But any idea of stripping nodes and passing the result through recursion has the
same problem of it being difficult to propagate back up the tree.

~~~~~~~~~~~

I've got a global variable, with *all* webwork. That is the only scan of the
entire document.

Then at each individual WW I can "loop" through that global variable until an
equality test (ala above) yields a match. At that point position() yields its
location in the variable/list. So maybe for ORCCA

6000*5999/2

equality tests rather than 6000 "xsl:number" that each peruses half (on average)
of all of the 8 billion nodes in the whole document?

~~~~~~~~~~~

My grand scheme for numbering is like this. Make subtrees of say, all blocks,
at the depth where numbering resets. Then for any given block, locate its
position in the nodeset that is the subtree (which is in document order) to get
a serial number.

Rob

Alex Jordan

Nov 21, 2022, 2:44:22 PM
to prete...@googlegroups.com
> But any idea of stripping nodes and passing the result through recursion has the
same problem of it being difficult to propagate back up the tree.

Famous last words, but I think that is the issue that is solved by
this thread. We identify the "next up" element by using a node set.
(Not the one you are describing below though.) I'll revise my drafted
templates a bit later to show something that is cleaned up with the
additional ideas that have come out in the last few messages. It's one
of those things I find easier to describe with code than in words.

> I've got a global variable, with *all* webwork.
This is a different idea that is better than the status quo. I suspect
all those equality tests and use of position() make it slower than
what I have in mind, but maybe negligibly so.

Day job duties are calling me, but I hope to get back with a draft this evening.

Rob Beezer

Nov 21, 2022, 2:46:45 PM
to prete...@googlegroups.com
And it "works". Zero change to representation file. Backwards-compatible.
Hopefully faster! Pushed, rebuilding website.

Can you test on *all of* ORCCA? Profile?

If anything is amiss, it'll be easy to revert. See the short commit if you want
precise details on approach.

I suspect there will be a wholesale change of identifiers soon, and everybody
will need to regenerate everything. :-( But this stopgap will get us there and
it can be all-at-once for everybody.

Thanks for all the discussion, good for our collective XSL-fu.

Rob

Rob Beezer

Nov 21, 2022, 9:18:20 PM
to prete...@googlegroups.com
Way more fun than publisher switches. ;-)

Attached produces messages:

AT: A COUNT: 1
AT: B COUNT: 2
AT: C COUNT: 3
AT: D COUNT: 4
AT: E COUNT: 5
ASCEND TO: C
AT: F COUNT: 6
AT: G COUNT: 7
AT: H COUNT: 8
AT: I COUNT: 9
ASCEND TO: F
ASCEND TO: B
AT: J COUNT: 10
AT: K COUNT: 11
AT: L COUNT: 12
ASCEND TO: J
ASCEND TO: A

You can try to draw the tree based on that? Without peeking?

I have *no idea* how to reconstruct the XML. I want the output to be the same
XML structure, but "adorn" each element with an attribute having its count.

I've tried putting "xsl:copy" in all sorts of places and always come sorta
close, but I'm really just guessing. Feels like the "apply-templates" that pass
"$count + 1" must have something to do with it. Maybe it'll come to me overnight?

Rob
dfs.xsl
tree-one.xml

Alex Jordan

Nov 22, 2022, 12:24:51 AM
to prete...@googlegroups.com
Here's my version. Apply to your tree file, and get:

AT: A COUNT: 1 NEXT UP: B
AT: B COUNT: 2 NEXT UP: C
AT: C COUNT: 3 NEXT UP: D
AT: D COUNT: 4 NEXT UP: E
AT: E COUNT: 5 NEXT UP: F
AT: F COUNT: 6 NEXT UP: G
AT: G COUNT: 7 NEXT UP: H
AT: H COUNT: 8 NEXT UP: I
AT: I COUNT: 9 NEXT UP: J
AT: J COUNT: 10 NEXT UP: K
AT: K COUNT: 11 NEXT UP: L
AT: L COUNT: 12 NEXT UP:


With profiling, yours has 17 calls to the template. This version has only 12.
The difference is yours hits nodes on the way up.

Yours passes booleans, mine passes (small) node sets.

Bad news. Neither scales up for a fundamental reason. (I applied both
to ORCCA.) Both stylesheets ask us to recurse down/sideways/up through
every single element. So you hit the recursion depth real fast and "A
potential infinite template recursion was detected." Mine makes it to
the 3000th element before stopping. I think 3000 may be a default max
depth. Yours makes it to the 1990th element because it uses up some
recursion depth ascending.
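
The failure mode has a close analogue in Python, which also caps nesting depth. The sketch below is only an analogy to the xsltproc behavior, with invented function names. The limit of 3000 is set explicitly to mirror the default mentioned in this thread; Python's own default is usually lower.

```python
import sys


def count_recursively(nodes, index=0, count=0):
    """One nested call per node, like handing $count to the next
    element with each apply-templates."""
    if index == len(nodes):
        return count
    return count_recursively(nodes, index + 1, count + 1)


def count_iteratively(nodes):
    """The same count with a loop: constant nesting depth."""
    count = 0
    for _ in nodes:
        count += 1
    return count


def try_recursive(n, limit=3000):
    """Return the count, or None if the recursion limit is hit."""
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        return count_recursively(list(range(n)))
    except RecursionError:
        return None
    finally:
        sys.setrecursionlimit(old)


print(try_recursive(1000))        # small input: fits under the limit
print(try_recursive(10000))       # more nodes than the limit allows
print(count_iteratively(range(10000)))
```

The depth needed grows with the number of elements, not with the depth of the tree, which is why a document of ORCCA's size runs out so quickly.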

Is this the death of this line of investigation?
uncles.xsl

Alex Jordan

Nov 22, 2022, 12:31:15 AM
to prete...@googlegroups.com
With xsltproc, I added:
--maxdepth 60000 --maxvars 60000
(up from defaults of 3000 and 15000)
just to see what happens.

The uncles stylesheet reaches:
AT: p COUNT: 10274
and then I get:
Segmentation fault: 11

which Google says means I used up all my memory :(

Alex Jordan

Nov 22, 2022, 12:35:22 AM
to prete...@googlegroups.com
And sorry. I see that the uncles.xsl file I uploaded had some small
edits from when I ran it to produce the claimed output. If it matters,
here is the right one that includes the "NEXT UP" bit in the messages.
uncles.xsl

Rob Beezer

Nov 22, 2022, 12:37:24 AM
to prete...@googlegroups.com
Very good! ;-)

More in the AM. -Rob

Rob Beezer

Nov 22, 2022, 12:40:47 PM
to prete...@googlegroups.com
That's a nice solution, too!

I had meant to run mine on something large, but was distracted in other
directions. Mine visits every node twice and never unwraps the recursion until
the end, when it just exits out the root.

> Is this the death of this line of investigation?

Well certainly the recursion limits/capacity is a deal-breaker. I have to
double the "max-depth" to 6000 in order to do some low-level line-by-line
processing when making the "View Source" knowls for the annotated version (which
reminds me to continue pursuing the lxml method for changing this).

An unspoken premature enthusiasm was my thought that we could run DFS and run
counters (ala LaTeX) for numbering. But we'd need to recreate the original
source. I finally got the leaves of the tree out at the right place, but
interior nodes would need to be opened in one stanza, and closed in another.
You can do this with a text version of elements, but then attributes are harder
to add, and there are namespaces, and .... So not workable. And it looks like
"uncles" has the same problem.

Short answer: yes, I think this is dead. My "webwork" solution is much like my
perpetually-delayed numbering solution, so that gives me more confidence that is
a good way to go.

Thanks for all the investigation!

Rob

Alex Jordan

Nov 23, 2022, 11:55:39 AM
to prete...@googlegroups.com
My closing thoughts on this. The other big issue (where the first big
issue is the recursion depth) is how to actually write a new XML tree
that would have these things as label attributes. I know from the
thread that you already get the difficulty of that. I cleared it up
for myself thinking as follows. If the input tree is:
A
    B
        C
    D

Then you start building the output tree. In the output tree, A' opens.
B' opens inside it. C' opens inside that.
Output:
A'
    B'
        C'
            cursor...template processing C is still active

Now what we want to do is start processing the *input* node D after
passing it parameters (e.g. the count) from our processing of C. But
we don't want to make this:
A'
    B'
        C'
            D'
or this:
A'
    B'
        C'
        D'

Somehow we need to close C' and B', then begin building D' while we
are still processing C. But this seems impossible unless there is some
XSLT wizardry that I haven't learned yet. Say you close C' while you
are still processing C. I could imagine at this point creating a
following-sibling for C' while still processing C. But I just can't
think of any way you could make a following-sibling to B' while in the
middle of processing C.

So that's my "proof" that passing information back up the tree in a
way that helps compose an enhanced version of the original XML tree is
not possible.

Rob Beezer

Nov 23, 2022, 12:53:58 PM
to prete...@googlegroups.com
Exactly.

> But this seems impossible unless there is some
> XSLT wizardry

"xsl:copy", "xsl:element", "xsl:attribute" still just produce text, as part of a
"result tree fragment" which can then be made into a real node-set. Those are
great conveniences. Instead, you can manage the text at a lower-level. Open B
before dealing with C, and close B after dealing with C. I think that may be
easier without the $uncles variable, you close on the second visit to a node,
the backtracking steps, the ascents. It seems inadvisable to do this (I did it
some in the early days, when making an HTML index, iirc, but "fixed" it since).
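
To make the open-on-descent, close-on-ascent idea concrete, here is a minimal sketch. It is written in Python rather than XSLT, purely as a model of the traversal, and it is not how PreTeXt does anything. The function name `relabel` and the attribute name `label` are made up for illustration. It assumes no namespaced names, comments, or processing instructions in the input.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def relabel(xml_text):
    """Serialize a copy of the tree, adding a document-order 'label' to every element.

    Tags are written as plain text: the start tag on the first visit to a node
    (the descent) and the end tag on the second visit (the ascent).
    """
    root = ET.fromstring(xml_text)
    out = []
    count = 0
    # An explicit stack of (element, already_opened) pairs replaces recursion,
    # so the depth of the tree never consumes call-stack frames.
    stack = [(root, False)]
    while stack:
        node, opened = stack.pop()
        if opened:
            # Second visit: close the tag, then emit any text that follows it.
            out.append(f"</{node.tag}>")
            out.append(escape(node.tail or ""))
            continue
        # First visit: the running count is carried forward in document order.
        count += 1
        attrs = "".join(f" {k}={quoteattr(v)}" for k, v in node.attrib.items())
        out.append(f'<{node.tag}{attrs} label="{count}">')
        out.append(escape(node.text or ""))
        # Schedule the close, then the children (reversed so the first child pops first).
        stack.append((node, True))
        for child in reversed(list(node)):
            stack.append((child, False))
    return "".join(out)


print(relabel("<A><B><C/></B><D/></A>"))
```

With the A/B/C/D tree from earlier in the thread, D' comes out as a following sibling of B' with the count continuing from C', because B' is only closed once the traversal backs out of it.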

One of my deep-seated fears is that we trade space for time and we run out of
memory on large projects. Sound familiar? ;-) So that is really my main
reason for not pursuing this.

Rob

Alex Jordan

Nov 23, 2022, 1:25:53 PM11/23/22
to prete...@googlegroups.com
> Open B
> before dealing with C, and close B after dealing with C.

Hmm, I might look into this then, if only for the sake of my XSLT-fu.

I did not mention that I had success with ORCCA and getting around the
recursion depth issue. (I had success printing out IDs for all
elements in ~1 second, not success building an XML tree.) The key is
that there is no recursion when you simply apply-templates to
something. It only counts towards the recursion depth when you use a
with-param. So the question becomes, where can we find places to stop
using with-param? Well each element you reach that has an xml:id
already is a good place to "reset".

So I got it to spit out all these labels like:
AT: m LABEL: webwork-use-quadratic-formula-irrational-roots==19
(using == as a separator).

Meaning, if you took the subtree starting at
"webwork-use-quadratic-formula-irrational-roots" and pruned away all
descendants that have an xml:id, then this "m" element was 19th in
document order of that pruned subtree.
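
A small model of that labeling rule, sketched in Python rather than in the XSLT actually used, may make the definition easier to check. The function name `pruned_labels` is hypothetical. It assumes that the element carrying the xml:id is not itself counted, so the first element below it is number 1, and that elements with no xml:id ancestor get no label at all.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the xml:id attribute under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def pruned_labels(root):
    """Return (tag, 'anchor==n') pairs for every element lacking an xml:id.

    'anchor' is the nearest ancestor-or-self xml:id; n is the element's position,
    in document order, within that anchor's subtree once all descendants that
    have their own xml:id (and everything below them) are pruned away.
    """
    labels = []
    counters = {}              # one running count per anchor id
    stack = [(root, None)]     # (element, current anchor id)
    while stack:
        node, anchor = stack.pop()
        own_id = node.get(XML_ID)
        if own_id is not None:
            # An element with an xml:id is a "reset" point: start a fresh count.
            anchor = own_id
            counters[anchor] = 0
        elif anchor is not None:
            counters[anchor] += 1
            labels.append((node.tag, f"{anchor}=={counters[anchor]}"))
        # Push children reversed so they pop in document order.
        for child in reversed(list(node)):
            stack.append((child, anchor))
    return labels


sample = ET.fromstring(
    '<section xml:id="sec">'
    "<p><m/></p>"
    '<exercise xml:id="ex"><p><m/></p></exercise>'
    "<p><m/></p>"
    "</section>"
)
for tag, label in pruned_labels(sample):
    print("AT:", tag, "LABEL:", label)
```

In this sample the count for "sec" resumes at 3 after the exercise, since the exercise and everything inside it belong to the "ex" anchor instead.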

ORCCA took 1 second. APEX took 2.

I still had occasional counts reaching the 1700s, too close to 3000
for comfort. But if there was a first pass that gave automated
xml:id~s to all division and specialized division elements, that
potential obstacle could go away.

If I understand this thing better about separately opening and closing
tags in the output, and couple that with the approach I used above,
maybe there is something interesting there. I will leave it at that:
"interesting". It still may not be *useful* for us in the end.

Rob Beezer

Nov 23, 2022, 9:41:10 PM11/23/22
to prete...@googlegroups.com
On 11/23/22 10:25, Alex Jordan wrote:
> I did not mention that I had success with ORCCA and getting around the
> recursion depth issue. (I had success printing out IDs for all
> elements in ~1 second, not success building an XML tree.) The key is
> that there is no recursion when you simply apply-templates to
> something. It only counts towards the recursion depth when you use a
> with-param.

Right, hadn't thought about that.

> So the question becomes, where can we find places to stop
> using with-param? Well each element you reach that has an xml:id
> already is a good place to "reset".

Two templates - one "regular", another matching on having an @xml:id, say, which
does not accept a parameter.

> So I got it to spit out all these labels like:
> AT: m LABEL: webwork-use-quadratic-formula-irrational-roots==19
> (using == as a separator).
>
> Meaning, if you took the subtree starting at
> "webwork-use-quadratic-formula-irrational-roots" and pruned away all
> descendants that have an xml:id, then this "m" element was 19th in
> document order of that pruned subtree.
>
> ORCCA took 1 second. APEX took 2.
>
> I still had occasional counts reaching the 1700s, too close to 3000
> for comfort. But if there was a first pass that gave automated
> xml:id~s to all division and specialized division elements, that
> potential obstacle could go away.
>
> If I understand this thing better about separately opening and closing
> tags in the output, and couple that with the approach I used above,
> maybe there is something interesting there. I will leave it at that:
> "interesting". It still may not be *useful* for us in the end.

I'm most interested in getting serial numbers. Counters reset at a certain
depth. So a default value of 0 for the receiving parameter might mean breaking
the recursion as the routine crosses into a new subtree (at the right "level" in
PTX-speak). Maybe.
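
As a rough illustration of serial numbers that reset on entering a new subtree, here is a Python sketch (a model only, not XSLT and not the PreTeXt implementation). The function name `serial_numbers` and the element names `figure` and `chapter` are placeholders for whatever is counted and whatever level triggers the reset.

```python
import xml.etree.ElementTree as ET


def serial_numbers(root, counted="figure", reset_at="chapter"):
    """Return the serial number of each `counted` element, in document order.

    The running count drops back to 0 whenever the walk enters a `reset_at`
    element, which plays the role of the default parameter value of 0.
    """
    serials = []
    count = 0
    for node in root.iter():   # iter() yields elements in document order
        if node.tag == reset_at:
            count = 0
        elif node.tag == counted:
            count += 1
            serials.append(count)
    return serials


book = ET.fromstring(
    "<book>"
    "<chapter><figure/><section><figure/></section></chapter>"
    "<chapter><figure/></chapter>"
    "</book>"
)
print(serial_numbers(book))
```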

Rob