ENB: Improving the python importer

42 views
Skip to first unread message

Edward K. Ream

unread,
Aug 27, 2023, 6:44:23 AM8/27/23
to leo-editor

PR #3515 contains preliminary work improving Leo's python importer. This work has uncovered several related issues. This Engineering Notebook discusses the way forward.


Background


The files in leo/plugins/importers contain all of Leo's importers. Only ic.createOutline instantiates the importer classes. 


The RecursiveImportController (RIC) class calls ic.createOutline to instantiate and run the importers. At present, the RIC runs a postpass after the importers complete. This postpass performs importer-independent adjustments, such as deleting empty nodes and generating @path statements.


Packaging


Moving the RIC postposs into the base Importer class would clarify the code. It's too easy to forget that the RIC postpass exists!


Alas, the RIC postpass knows the base path of each imported file. The importer classes do not. Passing such data to the importer classes might constitute a breaking api change.


@path directives


#3328 (in Leo 6.7.4) suggested using relative paths in @path directives. The revised code is dubious. The generated @path directives are relative to the root directory (the dir_ kwarg passed to c.recursiveImport).


Let the outline directory be the directory containing the .leo file.


Oops: The generated @path directives will be wrong unless dir_ is either the outline directory or a subdirectory. Let's call this the directory constraint.


No relative @path directives are possible unless the directory constraint holds.


My workflow often violates this constraint temporarily:


- Execute c.recursiveImport in some random directory.

- Copy/Paste the resulting nodes to a new file.

- Save the new file in the proper directory.


Strategy


c.recursiveImport probably should require relative @path directives. c.recursiveImport will refuse to import files if the directory constraint does not hold.


With this policy, importers can assume that the root directory is the outline directory. The RIC postpass could move into the Importer code!


Summary


Experiments will show whether enforcing the directory constraint makes sense. It appears logically essential. The unpleasant alternative would be to allow absolute paths in @path directives.


Given the directory constraint, the RIC post-pass can move into the Importer class without changing Leo's api.


All questions and comments are welcome.


Edward

Edward K. Ream

unread,
Aug 28, 2023, 4:30:31 AM8/28/23
to leo-editor
On Sunday, August 27, 2023 at 5:44:23 AM UTC-5 Edward K. Ream wrote:

> Oops: The generated @path directives will be wrong unless dir_ is either the outline directory or a subdirectory. Let's call this the directory constraint.


os.path.relpath works in all common situations. The output from this test script:


from os.path import relpath
   
for sep in '\\', '/':
    base = fr'C:{sep}Users{sep}Dev'
    path = fr"C:{sep}Repos{sep}leo-editor"
    print('base:', base)
    print('path:', path)
    print('relpath', relpath(path, base))
    print('')


is:


base: C:\Users\Dev
path: C:\Repos\leo-editor
relpath ..\..\Repos\leo-editor

base: C:/Users/Dev
path: C:/Repos/leo-editor
relpath ..\..\Repos\leo-editor


I'm not sure recursive imports should allow such "distant" @path directives.


Edward

Edward K. Ream

unread,
Aug 28, 2023, 6:48:33 AM8/28/23
to leo-e...@googlegroups.com
On Mon, Aug 28, 2023 at 3:30 AM Edward K. Ream <edre...@gmail.com> wrote:

> os.path.relpath works in all common situations.

> I'm not sure recursive imports should allow such "distant" @path directives.


Recent revs avoid os.path.relpath entirely.


Instead, the path to each imported file must start with the path to the present outline.


Imo this policy is reasonable. It's also the simplest thing that could possibly work.


Edward

Edward K. Ream

unread,
Aug 29, 2023, 6:06:51 PM8/29/23
to leo-editor
On Sunday, August 27, 2023 at 5:44:23 AM UTC-5 Edward K. Ream wrote:

PR #3515 contains preliminary work improving Leo's python importer.


This PR has now been merged into devel.
 
Packaging


Moving the RIC postposs into the base Importer class would clarify the code. It's too easy to forget that the RIC postpass exists!


I have chosen to leave RIC.postpass where it is because it contains language-independent code.

Alas, the RIC postpass knows the base path of each imported file. The importer classes do not. Passing such data to the importer classes might constitute a breaking api change.


Aha: everything should be relative to the outline's directory, that is os.path.dirname(c.fileName()). This information is available to each importer class, but it seemed cleaner to leave the postpass in the RIC class.

However, the PR does add Importer.postpass as a hook for language-dependent adjustments. The Python_Importer class overrides i.postpass to move docstrings to more favorable location. The docstring for i.postpass reminds readers that RIC.postpass happens last.
 

Strategy


c.recursiveImport probably should require relative @path directives.


Yes, it does. Moreover, the 'dir_' kwarg to c.recursiveImport can be a relative path.
 
Summary


Experiments will show whether enforcing the directory constraint makes sense.


Being clear that all directories must be sub-directories of the outline directory (or the outline directory itself) clarified many issues. The PR also does a lot more careful testing of assumptions.

The PR is a step forward. If I find other problems I'll create new issues.

Edward
Reply all
Reply to author
Forward
0 new messages