Attributes set on a node during the EnvironmentCollector phase vanish

Yves Chevallier

Aug 9, 2020, 3:35:44 PM
to sphinx-users
I have noticed a lot of complexity in Sphinx that stems from the fact that nodes cannot be altered during the EnvironmentCollector phase. However, I don't understand why it works that way.

For example, in the code below I would like to *tag* each `nodes.title` with an attribute, but because the doctree the writer sees is not the same instance, the added information is lost somewhere.

Does anybody know the proper way of passing information from the EnvironmentCollector to the Builder, and then to the Writer?

```python
from docutils import nodes
from sphinx.writers.latex import LaTeXTranslator
from sphinx import addnodes
from sphinx.environment.collectors import EnvironmentCollector


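def depart_title(self, node):
    # node.get() returns None for a missing attribute; node['foobar']
    # would raise KeyError before the check below could run.
    if not node.get('foobar'):
        raise ValueError('Why?')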


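class TitleCollector(EnvironmentCollector):
    def get_updated_docs(self, app, env):
        def traverse_all(app, env, docname):
            # env.get_doctree() unpickles a fresh copy of the stored doctree,
            # so changes made to this copy are not saved anywhere.
            doctree = env.get_doctree(docname)

            # Recurse into every document listed in a toctree.
            for toc in doctree.traverse(addnodes.toctree):
                for _, subdocname in toc['entries']:
                    traverse_all(app, env, subdocname)

            # Tag each title; this is the information that gets lost.
            for node in doctree.traverse(nodes.title):
                node['foobar'] = 42

        traverse_all(app, env, env.config.master_doc)
        return []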

    def clear_doc(self, app, env, docname):
        pass

    def process_doc(self, app, doctree):
        pass

    def merge_other(self, app, env, docnames, other):
        pass

def setup(app):
    app.add_env_collector(TitleCollector)
    LaTeXTranslator.depart_title = depart_title

    return {
        'version': '0.1',
        'parallel_read_safe': True,
        'parallel_write_safe': True,
    }
```

Komiya Takeshi

Aug 9, 2020, 9:55:08 PM
to sphinx...@googlegroups.com
Hi,

collector.get_updated_docs() is called to detect output files that
need to be regenerated. It is called just after the reading phase,
which means the event is designed to be read-only. For example, it is
useful for detecting files that are affected by a change in ToC numbers
when a new chapter is inserted at the top.

There are two ways to apply changes to the doctree:

* Use a transform component. Its main purpose is to modify the doctree
during the reading phase, so it is a good choice for your purpose.
* Store the doctree back to its file after modifying it in
get_updated_docs(). That is a bit hacky, though, because it goes beyond
the intended purpose of the event.

As its name suggests, the EnvironmentCollector is designed to collect
metadata into the environment object (a metadata database shared
across the documents). So it is not a good place to modify the doctree.

Thanks,
Takeshi KOMIYA


Daniel Scott

Nov 25, 2020, 10:35:31 PM
to sphinx...@googlegroups.com