(apologies if this message makes it twice to the list; I erred with google groups)
Hi Sphinx developers
I am moving to this list a discussion about Docutils smart quotes
which I started at
https://sourceforge.net/p/docutils/mailman/message/35861133/
It is related to
https://github.com/sphinx-doc/sphinx/issues/3788
and
https://github.com/sphinx-doc/sphinx/issues/3806
and
https://github.com/sphinx-doc/sphinx/pull/3808
and
https://github.com/sphinx-doc/sphinx/pull/3811
About the Docutils warnings reported initially at #3788,
1. the first one is due to settings['language_code'] being
set to a value for which Docutils has no localization available
2. the numerous other ones about smart quotes should be only
one per document; this issue will be fixed upstream in Docutils:
https://sourceforge.net/p/docutils/mailman/message/35862092/
The rest of this message quotes Günter, who describes the whole
context well, with one reply of mine to one paragraph.
Dear Jean-François,
how about moving this discussion to the sphinx-devel mail list? (Feel
free to quote me there. No need to send a copy, I am subscribed there.)
On 26.05.17, jfbu wrote:
.. and only a little with Sphinx ;-) by the way it looks as if
with Sphinx before 1.6.1, calls to Docutils ``get_language()``
always were with ``'en'``, as far as I understand. So the issue
did not arise so far.
This would be a true bug.
Your tests revealed that it is rather an "implementation detail" --
Sphinx bypasses the Docutils language support mechanism.
I don't know why it is done this way, maybe it seemed easier than fixing
Docutils or moving the Docutils-lang support to "*.po" files.
In any case it has some disadvantages.
...
on a project with language Turkish containing
a ``.. caution::`` directive. The html output contains Uyarı which
I assume is Turkish for Caution ;-)
OK, so Sphinx expanded the language support.
...
According to your earlier explanations this might be explained if
Sphinx relies entirely on its own localization, not on Docutils's one.
Yes, this seems to be the case.
How about localized directive names?
With Docutils, I can write ::
.. astuce::
Some directives can be given in the document's language, too!
translate with ``rst2html5 --language=fr`` and I get:
<div class="admonition tip">
<p class="admonition-title">Astuce</p>
<p>Some directives can be given in the document’s language, too!</p>
</div>
The localisations are in "docutils/parsers/rst/languages/fr.py".
This does not work with Sphinx:
make -e SPHINXOPTS="-D language='fr'" html
3788smartquotes/index.rst:21: ERROR: Unknown directive type "astuce".
.. astuce::
c'est du français
Also, the header says::
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" lang="fr">
Is the correct language given in Sphinx-exported documents? (I think so,
but it would be nice to know for sure.)
for html:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="fr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
By the way, Firefox displays the DOCTYPE tag in red; it complains about it.
This is with 1.5.6. I get the same with 1.6.1 + PR #3811.
The correct lang="fr" appears in html despite Docutils having been
used with ``language_code`` set to ``'en'`` in the document settings
And testing with French I see the Caution is translated into Prudence,
(Avertissement is used for Warning, according to the fr/sphinx.po
file), despite the Docutils get_language having been called
only with 'en'.
This means that the same French rST document will be converted differently
by Docutils vs. Sphinx:
docutils/languages/fr.py contains the translations
u'caution': u'Avertissement!',
...
u'warning': u'Avis',
It would be good to "harmonize" the translations between Docutils and Sphinx.
However, this goal may clash with compatibility considerations :-(
The Sphinx message catalogs are under
https://github.com/sphinx-doc/sphinx/tree/master/sphinx/locale
But translations are managed at transifex and merged into master prior
to major releases (as far as I understand...)
...
I tend to deduce from your precise explanations and the experiment
reported above that it should indeed be possible to access the Docutils
smart quotes facilities without restricting the language to those
for which Docutils already has translations available.
I think so.
Assumptions:
* Old Sphinx bypassed the Docutils language framework by passing "en" to
"docutils.languages.get_language()".
* Now, in order to get "localized" smart-quotes, the document language
must be known to Docutils.
Currently this has the side-effect that it is also passed to
"docutils.languages.get_language()"
which leads to "missing language support" warnings for languages not fully
supported by Docutils.
Currently the fix mentioned above drops smart quotes if get_language()
reports that the language has no translations provided by Docutils.
Alternatives:
a) silence all warnings by setting the "report-level" Docutils setting.
+1 fast and easy workaround for affected users
-1 suppresses also warnings the user might better see :-(
b) define and use smart-quotes for a supported "mock language" ("en", say).
-1 dirty hack
c) add full support for the respective languages to Docutils
+1 helps Docutils evolve
-1 takes time to help the Sphinx end user
d) let Sphinx overwrite docutils.languages.get_language() to always
return the "en" module regardless of input and without warning.
+1 functionally equivalent to the previous behaviour (passing "en" to
get_language).
+1 selective silencing of a warning that in Sphinx is a false positive
+1 Docutils still has the correct document language setting.
-1 deepens the split between Docutils and Sphinx language
support.
-1 no incentive to provide translations to Docutils.
Günter
Thanks again, Günter, for the time spent on your explanations.
best,
Jean-François