Guillemets


Rob Beezer

Feb 11, 2026, 3:19:22 PM
to prete...@googlegroups.com
Something for the French, Italians, French-Canadians, Swiss, Senegalese,...

I want to support guillemets (<<, >>) for quotations, as is done in French, and
other languages.

You are currently able to place an @xml:lang on the #pretext element (i.e.
document-wide choice) AND place @xml:lang on more interior elements, like a
division, a block, or a #p, as an interior override.

So you could mark a #p in an English document as being French, and a contained
quotation would render in the French style.
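The override described above amounts to "nearest @xml:lang wins". A minimal sketch of that lookup follows, assuming a toy document and a hypothetical helper named effective_lang; it is not the PreTeXt implementation, which lives in XSL.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Toy source (an assumption for illustration): an English document
# with one paragraph marked as French.
SOURCE = """
<pretext xml:lang="en-US">
  <article>
    <p>She said <q>hello</q>.</p>
    <p xml:lang="fr-FR">Elle a dit <q>bonjour</q>.</p>
  </article>
</pretext>
"""

def effective_lang(element, parents, default="en-US"):
    """Return the nearest xml:lang on the element or one of its ancestors."""
    node = element
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            return lang
        node = parents.get(node)  # None once we walk past the root
    return default

root = ET.fromstring(SOURCE)
# ElementTree has no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}
langs = [effective_lang(q, parents) for q in root.iter("q")]
print(langs)
```

The first #q inherits the document-wide choice and the second picks up the interior override on its #p.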

Implementation:

* The localization files would have attributes that specified the type of
quotation marks, say "english" and "guillemet" initially.

* The #q element would render as <<...>> if the type were "guillemet".

* The #sq element would render as <...> if the type were "guillemet".

* The French put in a thin space and the Swiss don't? So there would be
attributes in the localization files for this variation.

* The outer can be guillemets, with the inner English: << blah 'blah' blah >>?

* A little bit of work suggests this is all feasible, with four attributes
(outer, inner, outer spacing, inner spacing).
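The four attributes in the list above could be modeled roughly as follows. This is only a sketch: the names QuoteStyle, STYLES, render_q and render_sq are hypothetical, the per-language values are assumptions drawn from the questions in this thread rather than settled rules, and the choice of a narrow no-break space for the "thin space" is also an assumption.

```python
from dataclasses import dataclass

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE, assumed here for the "thin space"

# Left/right marks per style name, for outer (#q) and inner (#sq) use.
MARKS = {
    "english":   {"outer": ("\u201c", "\u201d"), "inner": ("\u2018", "\u2019")},
    "guillemet": {"outer": ("\u00ab", "\u00bb"), "inner": ("\u2039", "\u203a")},
}

@dataclass(frozen=True)
class QuoteStyle:
    outer: str               # style name used by #q
    inner: str               # style name used by #sq
    outer_spacing: str = ""  # placed just inside the outer marks
    inner_spacing: str = ""  # placed just inside the inner marks

# Hypothetical stand-in for the localization files; values are illustrative.
STYLES = {
    "en-US": QuoteStyle("english", "english"),
    "fr-FR": QuoteStyle("guillemet", "english", outer_spacing=NNBSP),
    "demo-nospace": QuoteStyle("guillemet", "guillemet"),  # made-up variant
}

def render_q(text, lang):
    """Wrap text in the outer marks for the given language."""
    style = STYLES[lang]
    left, right = MARKS[style.outer]["outer"]
    pad = style.outer_spacing
    return f"{left}{pad}{text}{pad}{right}"

def render_sq(text, lang):
    """Wrap text in the inner marks for the given language."""
    style = STYLES[lang]
    left, right = MARKS[style.inner]["inner"]
    pad = style.inner_spacing
    return f"{left}{pad}{text}{pad}{right}"

nested = render_q("blah " + render_sq("blah", "fr-FR") + " blah", "fr-FR")
print(nested)
```

Keeping marks and spacing as separate fields is what lets a spaced and an unspaced variant share the same "guillemet" marks.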


First Round of Questions

1. Are the rules for these universal for each of the relevant languages we have
(French, French Canadian, Italian)? I would like to perhaps avoid consulting
#docinfo for overrides on a per-language basis.

2. Would anybody's project (in an affected language) be adversely impacted if
English quotes suddenly became guillemet quotes? Maybe folks are hard-coding
Unicode symbols for guillemets already and they'd perhaps want to move to #q?
I'll ask more broadly eventually, but just hitting -dev right now.


Please correct me (gently!) where I do not have a complete understanding of the
typographical customs. And even if you are not affected, this will be the first
feature of this type (variant typography on a non-document-wide basis), so
comments on implementation are welcome.

Rob

Jean-Sébastien Turcotte

Feb 16, 2026, 10:48:17 PM
to PreTeXt development
I'll only speak for my own French Canadian experience.

1. We put a non-breaking thin space (espace insécable) after the opening << and before the closing >>. When nesting quotes, we'd use the English marks, I'd say " first then ', but don't "quote" me on that. If you'd ever want to nest more than 3 levels of quotes... Maybe don't!
2. I've resorted to hard-coding them when I cared enough, or used the <q> tag. I don't think anything would break in my use case. Perhaps some sort of warning would be useful, one that could be turned off when some new parameter is set?
3. In French, "guillemet" is the all-around word for the punctuation that encloses a quote. The French guillemets are actually called "chevron(s)".

Rob Beezer

Feb 17, 2026, 12:19:43 PM
to prete...@googlegroups.com
Dear Jean-Sébastien,

All very useful. Thanks very much. Some comments below.

On 2/16/26 19:48, Jean-Sébastien Turcotte wrote:
> I'll only speak for my own French Canadian experience.
>
> 1. We put a non-breaking thin space (espace insécable) after the opening << and
> before the closing >>. When nesting quotes, we'd use the English marks, I'd say
> " first then ', but don't "quote" me on that. If you'd ever want to nest more
> than 3 levels of quotes... Maybe don't!

I'm on top of most of this. I'm doing fr-FR first as a test case as I work on the
generalization of the code. So maybe fr-CA can just be identical. I've not
really thought about third-level (just don't do it!).

> 2. I've resorted to hard code them when I cared enough or used the <q> tag. I
> don't think anything would break in my use case. Perhaps some sort of warning
> would be useful, that could turn off when some new parameter is set up?

Well, the rub here is that the "variant" typography is just going to happen once
merged. There is no author/publisher control, opt-in, or upgrade path. So I'm
hoping/expecting most authors are in a similar situation as you. (For isolated
small constructs, one could specify an alternate language to get a desired
treatment.)

> 3. In French, "guillemet" is the all around word for the punctuation that
> encloses a quote. The French guillemets are actually called "chevron(s)".

Aah, thanks for that distinction. I've settled on "angle", mostly motivated by
a comprehensive Wikipedia table (that is not without some inaccuracies).

Rob

Rob Beezer

Feb 17, 2026, 7:08:39 PM
to prete...@googlegroups.com
A small update as part of this project. I have reworked the LaTeX output for
"regular" "english" quotation marks. Gone are backticks and apostrophes, and in
are (precise) macros. Another move for more accurate end products (PDF) at the
expense of readability of the raw LaTeX.

Much cleaner XSL, too. We jumped through a lot of hoops to write out {`}``, but
only when necessary.

Please report any surprises. Thanks.

Rob Beezer

Feb 17, 2026, 7:29:58 PM
to prete...@googlegroups.com
Almost surely just for Alex.

xsl/extract-pg.xsl says:

> PGML is content with "dumb" quotes and will do
> the right thing in a conversion to "smart" quotes
> in various WW output formats

Would it be just as happy with Unicode smart quotes?

And if so, how about a whole range of symbols used in other languages, expressed
in Unicode?

Do we want to allow/handle WW problems in other languages, or with bits and
pieces of them in other languages?

If the answer to the second question is "yes", I could recycle some tedious code.

Or, we can leave things as they are, and #q and #sq will not be language-aware
inside #webwork (I think).

Rob

Alex Jordan

Feb 18, 2026, 2:27:16 PM
to prete...@googlegroups.com
>  Would it be just as happy with Unicode smart quotes?

Not presently "happy". Somewhere closer to "reluctantly accepting".

The one thing that occurs to me is that if you put Unicode quotes here, PG will leave them alone. That is, it's not currently going to do anything to convert them back to apostrophes. And so if that PG problem lands in a WeBWorK server and is used in a WeBWorK course problem set, a student might want to make a PDF hardcopy of the set. And behind the scenes, there will be a .tex file with those Unicode quotes in it. And I haven't tried, but I expect pdflatex to either choke or do something bad with that character, like represent it as a "?". (Some webwork2 servers use pdflatex, some use xelatex.)

Over in the PG developers group, I could propose making PGML recognize raw smart quotes and convert them to something else for PG's TeX output. Not sure if that would happen, but if it did, it would not be in production until this summer.


--
You received this message because you are subscribed to the Google Groups "PreTeXt development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pretext-dev...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/pretext-dev/MTAwMDAwNy5iZWV6ZXI.1771374596%40pnsh.

Rob Beezer

Feb 18, 2026, 2:50:35 PM
to prete...@googlegroups.com
On 2/18/26 11:27, Alex Jordan wrote:
> > Would it be just as happy with Unicode smart quotes?
>
> Not presently "happy". Somewhere closer to "reluctantly accepting".

<audible-chuckle/>

> The one thing that occurs to me is that if you put unicode quotes here, PG will
> leave them alone. That is, it's not currently going to do anything to convert
> them back to apostrophes. And so if that PG problem lands in a WeBWorK server
> and is used in a WeBWorK course problem set, a student might want to make a PDF
> hardcopy of the set. And behind the scenes, there will be a .tex file with those
> unicode quotes in it. And I haven't tried, but I expect pdflatex to either choke
> or do something bad with that character like represent it as a "?". (Some
> webwork2 servers use pdflatex, some use xelatex.)

Sort of as I expected. Thanks. We will just leave things as they are (which is
what I was about to do anyway). But now I can drop an informed comment in the
code, instead of saying we are punting.

A twist of sorts. A WW problem goes to the server and comes back in a static
(legal PreTeXt) form which includes a #q element (say). I think that must be a
real possibility. Now an author's overall document is marked as French (say) so
they get English quotes in the HTML interactive problem and guillemets in the
PDF version.

__ No problem, I like it.

__ We can have the server put an @xml:lang="en-US" on the #q element (and
cousins).

__ The pre-processor can match on "webwork/../q" and add @xml:lang="en-US" to
static versions.
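The third option might look something like the sketch below, which pins quotation elements inside a #webwork to en-US. The function name pin_webwork_quotes and the sample structure are assumptions for illustration; the real pre-processor is XSL and the actual shape of the static version may differ.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
QUOTE_TAGS = ("q", "sq")  # "#q and cousins"; this set is an assumption

# Toy French document with one static WeBWorK problem (structure assumed).
SOURCE = """
<pretext xml:lang="fr-FR">
  <p>Un <q>mot</q>.</p>
  <exercise>
    <webwork>
      <statement><p>Enter <q>yes</q> or <sq>no</sq>.</p></statement>
    </webwork>
  </exercise>
</pretext>
"""

def pin_webwork_quotes(root, lang="en-US"):
    """Add xml:lang to quote elements below any webwork element.

    Existing xml:lang attributes are left alone. Returns the number
    of elements that were changed.
    """
    changed = 0
    for webwork in root.iter("webwork"):
        for tag in QUOTE_TAGS:
            for quote in webwork.iter(tag):
                if quote.get(XML_LANG) is None:
                    quote.set(XML_LANG, lang)
                    changed += 1
    return changed

root = ET.fromstring(SOURCE)
changed = pin_webwork_quotes(root)
print(changed)
print(ET.tostring(root, encoding="unicode"))
```

The #q outside the #webwork is untouched, so it would still follow the document-wide French setting.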

> Over in the PG developers group, I could propose making PGML recognize raw smart
> quotes and convert them to something else for PG's TeX output. Not sure if that
> would happen, but if it did, it would not be in production until this summer.

Your call. I don't think we want WW devs to have to go down this rabbit hole
unprepared. I've enjoyed it, but I've been saving it for the right moment.
(See the oldest outstanding PR!) Likely depends on your reaction to the
non-exclusive-or listed above.

Just an aside. There are LaTeX macros for the four English styles of quotes
(left/right, single/double). Consider that three backticks may not render the
way it was intended. Rare, but still another gotcha for LaTeX. Perhaps WW
might like to output those, just for openers.
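As a rough illustration of emitting those macros, here is a sketch that maps the four Unicode English quotation marks to the standard LaTeX text macros. The function name quotes_to_latex is hypothetical, and this is not what PreTeXt or PG actually does; it only shows the idea of avoiding both raw Unicode and backtick ligatures in the .tex file.

```python
# The four English marks and their standard LaTeX text macros.
# The trailing {} keeps a macro from swallowing a following space.
LATEX_QUOTES = {
    "\u201c": r"\textquotedblleft{}",   # left double
    "\u201d": r"\textquotedblright{}",  # right double
    "\u2018": r"\textquoteleft{}",      # left single
    "\u2019": r"\textquoteright{}",     # right single
}

_TABLE = str.maketrans(LATEX_QUOTES)

def quotes_to_latex(text):
    """Replace Unicode English quotation marks with LaTeX macros."""
    return text.translate(_TABLE)

sample = "\u201cIt\u2019s \u2018fine\u2019\u201d"
converted = quotes_to_latex(sample)
print(converted)
```

Because each mark becomes its own macro, a double quote followed directly by a single quote no longer depends on how LaTeX groups three consecutive backticks.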

Thanks for the assist.

Rob

Alex Jordan

Feb 18, 2026, 3:44:23 PM
to prete...@googlegroups.com
I guess the PG file should have a lang attribute somewhere. This would be a new thing for PG to parse out of PG files. And then it's PG's problem to render the exercise in a way that respects the appropriate language localization. For now, the live HTML would continue to use English quotes instead of chevrons, and it's just a shortcoming of PG. Perhaps to be overcome later by PG developers. But PTX will have done all that it can do.


Rob Beezer

Feb 25, 2026, 6:38:08 PM
to prete...@googlegroups.com
Update: I have rounded up (and implemented) Unicode characters to support
"angle" (chevron, no space) quotation marks, and "down-up" (baseline left,
German style) quotation marks.

They have been tested in the lab, but are not yet available as an automatic
reaction to a language choice. But it is now a simple matter for language
maintainers to choose a style and specify it in their language's localization file.

So I guess this is a shout-out to those maintainers to see if they can "turn on"
a new/better style for their language. Discussion can happen here if there are
subtleties I'm not aware of.

CJK ("corner brackets") and Tibetan (etc.) ("arrow brackets") will require more
care on the LaTeX side, so those are not yet available.

Rob
