Two pass doctree transforms

Michael Jones

Apr 7, 2014, 6:16:11 AM

to sphin...@googlegroups.com

Hi,

I'm the maintainer of the Breathe extension for Sphinx. Breathe provides a way to get doxygen information into the Sphinx output. It works with the doxygen xml and normally expects the xml files to have been generated prior to the Sphinx build.

However, we're now attempting to implement a system in which Breathe manages the doxygen xml generation itself. To do this, we'd like to work out which C++ source files are referenced by Breathe directives and therefore need doxygen xml generated for them, run doxygen on those files automatically, and then process the resulting xml into the Sphinx output.

It seems to me that I need to scan over all the rst source files once to collect the information, and then go over them again to process them properly with all the information at hand. My initial approach used docutils Transforms, but they seem to be applied to each file in isolation rather than across the whole project.
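To make the two-pass idea concrete, here is a minimal sketch in plain Python. It does not use the Sphinx or docutils APIs: the `cppsource` directive name, the function names, and the fake doxygen callable are all hypothetical stand-ins, chosen only to show the ordering (collect across every document first, then process each one).

```python
import re

# Hypothetical directive name, for illustration only; not a real Breathe directive.
DIRECTIVE = re.compile(r"^\.\. cppsource:: *(\S+)", re.MULTILINE)


def collect_sources(documents):
    """Pass 1: gather every referenced source file across all documents."""
    sources = set()
    for text in documents.values():
        sources.update(DIRECTIVE.findall(text))
    return sources


def process_all(documents, generate_xml):
    """Pass 2: process each document with the complete set of xml available."""
    xml = generate_xml(collect_sources(documents))  # generator runs once, up front
    return {
        name: [xml[path] for path in DIRECTIVE.findall(text)]
        for name, text in documents.items()
    }


# Two stand-in rst documents; b.rst refers to a file that a.rst also uses.
docs = {
    "a.rst": "Intro\n\n.. cppsource:: src/a.cpp\n",
    "b.rst": ".. cppsource:: src/b.cpp\n\n.. cppsource:: src/a.cpp\n",
}


def fake_doxygen(paths):
    """Stand-in for a doxygen run: maps each path to a placeholder string."""
    return {path: "<xml for %s>" % path for path in paths}


result = process_all(docs, fake_doxygen)
```

The point of the sketch is only that `generate_xml` is called once with the full set of files before any single document is processed, which is the step a per-document transform cannot do.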

Can you recommend an approach I might take for this? I'm struggling to see my best way forward with the current events and API. I thought maybe an "all-sources-read" type event, which would allow me to load up the doctrees and edit them before the final reference resolving is complete. I welcome suggestions though.

Thanks,

Michael

ps. I've posted this on the Sphinx issue tracker, but after doing that realised it would have been better here. Sorry for the repeat postings.

Robert Lehmann

Apr 9, 2014, 2:42:12 AM

to sphin...@googlegroups.com

I believe you can access the BuildEnvironment, which is shared among the whole project, from a document's settings in a Transform:

    def apply(self):
        env = self.document.settings.env

You can achieve the same, I guess, in a Directive's run method through self.state.document….

So, you could collect all referenced doxygen documents in Directive.run, run Breathe/doxygen (maybe as an immediate transform) and then run your Transform as you'd planned.
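A minimal sketch of that collection step, assuming only the `document.settings.env` access path shown above. The helper name `record_doxygen_source` and the attribute `breathe_doxygen_sources` are hypothetical, and the documents and environment are simulated with `SimpleNamespace` so the snippet runs without Sphinx installed.

```python
from types import SimpleNamespace


def record_doxygen_source(document, source_path):
    """Remember a C++ source file on the project-wide environment."""
    env = document.settings.env  # same access path as in the Transform above
    # 'breathe_doxygen_sources' is a hypothetical attribute name.
    if not hasattr(env, "breathe_doxygen_sources"):
        env.breathe_doxygen_sources = set()
    env.breathe_doxygen_sources.add(source_path)
    return env.breathe_doxygen_sources


# Stand-ins for two documents that share one environment object.
shared_env = SimpleNamespace()
doc_a = SimpleNamespace(settings=SimpleNamespace(env=shared_env))
doc_b = SimpleNamespace(settings=SimpleNamespace(env=shared_env))

record_doxygen_source(doc_a, "src/a.cpp")
record_doxygen_source(doc_b, "src/b.cpp")
collected = sorted(shared_env.breathe_doxygen_sources)
```

In a real directive the call would be made from `run` with `self.state.document`; because the environment is shared, entries recorded while reading one document are still there when the next one is read.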

Hope that helps,

Robert

Michael Jones

Apr 9, 2014, 2:11:45 PM

to sphin...@googlegroups.com

Thank you for the response and for pointing out the availability of the additional environment information. There certainly might be enough there for me to achieve what I need.

Unfortunately, I am still not sure how best to approach the situation, as my issue is that each rst document seems to be processed in isolation. Each document is read, its directives are expanded and the resulting nodes are transformed without reference to other documents. So if I want to collect information from documents A and B before going back and transforming the doctree for document A, I am in trouble, as the process is completely finished with A before it starts on B.

I feel that with the extra environment information, I might be able to collect information as each document is processed and then figure out when my code is being called to process the last document and whilst that is going on load up the doctrees for all the other documents and make adjustments as I see fit.

That however would be a complete hack. I don't see any other way of approaching the situation as there doesn't seem to be a mechanism to revisit documents once all the documents have been processed.

I'm not sure how clear I am being. Do you feel there is something I am missing here?

Cheers,

Michael

Michael Jones

Apr 20, 2014, 8:35:17 AM

to sphin...@googlegroups.com

Hi,

This is just a notice to say that, after looking more closely at the Sphinx code, I don't believe I can achieve what I want and I would be far from confident about trying to introduce the kind of hooks I'd like to see into Sphinx.

I think that in an ideal world each transform would be run against all the documents before moving to the next transform so that the sort of approach I'd like would be possible. However I imagine that would be considerably slower than the current approach as I think Sphinx would have to read/write the doctree for each document from/to disk numerous times.

I'm going to try to change the Breathe setup so that users have to declare the necessary information in the conf.py file instead of trying to collect it from the document contents.