a new use case, overhaul, compatibility


David Farmer

Aug 13, 2022, 7:11:43 PM
to prete...@googlegroups.com

At MAA Mathfest in Philadelphia last week, I spoke with a
publisher of journal articles. The topic was the fact that
PreTeXt HTML looks good, has many options for interactivity,
and allows embedded computations. And none of those statements
are true for the HTML articles from the publisher.

So we started wondering whether PreTeXt could fit into the
publisher's workflow. (See "The Workflow" below, for context.)

It may or may not work out for this specific publisher to
do something with PreTeXt. But that is an exciting use case
and we should go to some effort to make that a possibility.

An issue is the JavaScript, in particular the React framework.
React is a great solution for our current main use cases:
standalone textbook or a textbook in Runestone.

But if the output is the HTML version of a research paper,
and that is going on the publisher's website, we need the
JS (and CSS) for the paper to be compatible with that of
the publisher. A section of the paper, for example, will
be inside a larger page with publisher branding and styling.

My understanding is that this is not compatible with using
React, unless (as happened with Runestone) the publisher is
willing to do some work and let the PTX JS be in control.

The conclusion I am suggesting is that we need to maintain a
standalone non-React version of the JavaScript, which can
more easily slip into a larger system.

I don't really understand the technical issues, so please
offer comments.

My goal is to make sure that we do not make any choices which
would cut off some possible use cases for PreTeXt.



The Workflow.

In the past, publishers would take the LaTeX of your submitted
paper and ship it off to India where it would be "fixed".
Here "fixed" means: do absurd hacks which make the LaTeX even
worse, but result in something which fits the publisher's style.
(I can provide details, if you ask me when we see each other in person.)

These days the larger publishers convert the LaTeX to XML, but to
channel our fearless leader, it is "crappy XML", meaning that it
is not actually structured much better than the original LaTeX.
(And I suspect there are several different proprietary formats.)

If the publisher could adopt PreTeXt as their XML format, maybe
it would not require a great disturbance in their workflow.

And if they provided guidance to help authors write good LaTeX,
maybe it would actually cost less than their current methods.

Jason Siefken

Aug 13, 2022, 9:30:42 PM
to prete...@googlegroups.com, David Farmer
I think this requires some investigation into what publishers would accept. Right now, Javascript controls
 1) knowls and knowl-likes (e.g., footnotes),
 2) TOC,
 3) image zooming, and
 4) interactive exercises.

I assume that journals would not want 4 and that they have their own code for 2 and 3. I have no idea what they would think about 1.

So, I guess what I'm saying is that, based on what I see on current journal websites, they would not use *any* PreTeXt JavaScript at all. Shipping no JavaScript would certainly be the easiest thing to do!

The current React overhaul is focused on being a UI for pretext content. I see the TOC and the nav buttons, etc. as a big part of that UI. However, the code could be split so that the UI is a separate project that interacts with "Pretext Javascript plugins" that control what happens inside of main.ptx-main. If the code were split like this, someone could load the main.ptx-main plugin without the rest of the UI and an API could be developed for embedding into other pages.
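A rough sketch of what such a split might look like, in TypeScript. Every name here (PretextPlugin, registerPlugin, attachAll) is hypothetical, invented purely for illustration, and the content root is modeled as a plain object rather than a real DOM element:

```typescript
// Hypothetical sketch only: none of these names exist in the actual
// PreTeXt or React codebase.

// A "plugin" controls one behavior inside the content region
// (e.g. knowls, image zooming), independent of the surrounding UI.
interface PretextPlugin {
  name: string;
  // Bind handlers inside the given content root (a stand-in here for
  // the main.ptx-main element).
  attach(root: { id: string }): string;
}

const registry: PretextPlugin[] = [];

function registerPlugin(plugin: PretextPlugin): void {
  registry.push(plugin);
}

// A host page (ours or a publisher's) activates only the plugins it wants:
function attachAll(root: { id: string }): string[] {
  return registry.map((p) => p.attach(root));
}

// Example: a knowl plugin loaded without the TOC/nav UI.
registerPlugin({
  name: "knowl",
  attach: (root) => `knowl handlers bound inside #${root.id}`,
});

console.log(attachAll({ id: "ptx-main" }));
```

With a split like this, a publisher's page could call attachAll against its own container and skip the TOC, nav buttons, and the rest of the UI entirely.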


Steven Clontz

Aug 13, 2022, 9:52:24 PM
to PreTeXt development
I suspect that we could offer a service where we (1) help publishers adopt [a subset of] PreTeXt as their XML vocabulary to build whatever they currently do with their proprietary XML, or (2) help publishers create a stylesheet that converts [a subset of] PreTeXt to the crappy proprietary XML they already use. Then we don't have to maintain anything: they probably already like their current HTML builds from their proprietary XML, so they can continue to maintain that while Rob does what he really wants to do: maintain the schema. And then they could let authors submit PreTeXt source instead of LaTeX source upon publication.

To emphasize the point: I think the way less brittle solution is to get research publishers using PreTeXt *source* directly somehow, rather than our HTML output.

Alex Jordan

Aug 14, 2022, 12:58:04 AM
to prete...@googlegroups.com
> I assume that journals would not want 4

I would think some journals would want to have 4 if they do not yet
have it. Especially something where there is a large data set. Let the
reader directly run some calculation or other exploration of the data.
Let them validate the data directly. Etc. Is the current standard to
just link to some data set on the internet somewhere? And then only
the truly dedicated will follow through and look at it or play with it.

Rob Beezer

Aug 14, 2022, 10:59:18 AM
to prete...@googlegroups.com
Let's presume two sets of JS+CSS is a good idea (I think it is).

And I'm assuming we then have only one flavor of HTML.

Question: can we adjust the current JS+CSS to work with the new HTML on the
"overhaul" branch? Then we would have new-HTML-plus-current-JS for present use,
and React can be developed in parallel (not on a branch). At some point we can
flip a switch between what is the default for most cases (React) and what is
the alternate (for the situation David F describes).

I know current JS+CSS has grown organically and is hard to work on. I also know
that the overhaul branch has been in the way of other (seemingly unrelated)
development work for quite a long time.

Thanks, David, for your work on this one!


David Farmer

Aug 14, 2022, 8:47:06 PM
to prete...@googlegroups.com

I think (the lack of) interactive computation is one of the big
shortcomings of journal articles -- articles which use data to draw
a conclusion, anyway.

The NSF and others are pouring a lot of money into projects designed
to preserve scientific data. They seem to be unaware of the shortcomings
of the current approach. That approach is to host data on a centralized
website, with a DOI or some other unique identifier, in a specified
format, referenced from the published paper. Everything is perfect
because the reader can see where to get the data, download it, and then
examine it using their preferred tools.

But it is not perfect. Here is an actual use case which happened to me.
The published paper included some summary statistics about some
numbers. It does not matter what the numbers represent. What matters
is that the paper did not include a histogram of the data, which I
considered to be a serious omission.

So, I went to the repository and downloaded the data, which was a plain
text file with one decimal number on each line. My preferred way to
examine such data is in Mathematica. So I had to change the format
into a comma separated list delimited by curly brackets, and I had
to put "mynameforthelist =" at the beginning. Then I read that file
into Mathematica and I had the data in a named list.

For me it took maybe 15 seconds to add all the commas and put in
the curly brackets, because I know how to use an editor. But even so,
there was a lot of tedium and jumping through hoops. (If Mathematica
has a way to directly read in such a file and interpret it as a list,
please do not tell me, because I already have a method that works for
me and is faster than looking up that documentation.)

I have encountered other examples where the data format was difficult
to understand or difficult to parse. What I describe above is the
best possible case within the current approach. And as Alex noted,
there is a barrier to going through those steps, particularly if the
reader is a domain expert but is not technically savvy.

It would have been better if the paper had a Sage/R cell loaded with the
data, so I could just process it immediately. It took me 5 seconds
to find that hist() is how you make a histogram in R.
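For what it's worth, the reformatting step described above is a few lines in any scripting language. Here is a sketch in TypeScript, with the raw string standing in for the contents of the downloaded one-number-per-line file (the values are made up):

```typescript
// Sketch: `raw` stands in for the downloaded repository file,
// one decimal number per line, possibly with blank lines.
const raw = "1.5\n2.0\n\n3.25\n";

// Parse into a named array -- the step done by hand in the editor above.
const values = raw
  .split("\n")
  .filter((line) => line.trim().length > 0)
  .map(Number);

console.log(values); // [ 1.5, 2, 3.25 ]
```

A Sage/R cell embedded in the paper could do the equivalent in place, so the reader never touches the raw file at all.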

David Farmer

Aug 14, 2022, 9:02:53 PM
to prete...@googlegroups.com

I imagine the journal use case would omit our TOC, navigation
menu, header, and footer.

If it got to that point, the journal would probably host the
CSS and JS, presumably basing it off our non-React version.

Sage cells and other interactivity will require some thought.

I suspect it is the journal editors and other interested parties
who would provide the motivation. The publisher, who may care
more about money than scholarship, would do it in response to
outside pressure, possibly motivated by the chance that some
other publisher would do it first.

Initially, the only realistic expectation is that authors would
submit papers in LaTeX. Ideally well-written LaTeX. That could
be converted to PreTeXt. Then it is a question of how that XML
can fit into their pipeline which expects different XML.

How to mark up Sage cells and other features which do not
appear in current research papers is a trickier question.

Sean Fitzpatrick

Aug 14, 2022, 9:22:57 PM
to PreTeXt development
This is well known, but probably worth repeating: this exact problem of publishing data along with results is one of the reasons Fernando Perez came up with Jupyter notebooks :-)
