
ast.parse, ast.dump, but with comment preservation?


samue...@gmail.com

Dec 15, 2021, 10:37:39 PM
I wrote a little open-source tool to expose internal constructs in OpenAPI. Along the way, I added related functionality to:
- Generate/update a function prototype to/from a class
- JSON schema
- Automatically add type annotations to all function arguments, class attributes, declarations, and assignments

alongside a bunch of other features. All implemented using just the builtin modules (plus astor on Python < 3.9; and optionally black).

Now I'm almost at the point where I can run it—without issue—against, e.g., the entire TensorFlow codebase. Unfortunately this is causing huge `diff`s because the comments aren't preserved (and there are some whitespace issues… but I should be able to resolve the latter).

Is the only viable solution to rewrite around redbaron or libcst? I don't need to parse the comments, just emit them unedited wherever they're found…
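To illustrate the problem, here is a minimal sketch using only the standard library. It shows that the tree from `ast.parse` carries no comment nodes, while `tokenize` still reports each comment with its position. The helper name `collect_comments` and the sample source are assumptions made for illustration.

```python
import ast
import io
import tokenize

# Hypothetical sample input with a leading and a trailing comment.
SOURCE = """\
# leading comment
x = 1  # trailing comment
"""


def collect_comments(source):
    """Return (line, column, text) for every comment token in source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        (tok.start[0], tok.start[1], tok.string)
        for tok in tokens
        if tok.type == tokenize.COMMENT
    ]


# The AST has no node for either comment ...
tree = ast.parse(SOURCE)
# ... but the token stream keeps both, with 1-based lines and 0-based columns.
comments = collect_comments(SOURCE)
```

Any tool that regenerates source from the tree alone therefore has nothing to emit for the comments; they would have to be carried separately, for example as the position list above.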

Thanks for any suggestions

PS: Library is https://github.com/SamuelMarks/cdd-python (might relicense with CC0… anyway too early for others to use; wait for the 0.1.0 release ;])

Chris Angelico

Dec 16, 2021, 1:44:52 AM
I haven't actually used it, but what you may want to try is lib2to3.
It's capable of full text reconstruction like you're trying to do.

Otherwise: Every AST node contains line and column information, so you
could possibly work the other way: keep the source code as well as the
AST, and make changes line by line as you have need.
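A minimal sketch of that second approach, assuming Python 3.8+ (for `end_lineno` and `end_col_offset`) and a node that sits on a single line. The helper name `replace_node_text` and the sample source are hypothetical. Since `col_offset` counts UTF-8 bytes, the line is sliced as bytes.

```python
import ast


def replace_node_text(source, node, new_text):
    """Replace only the text covered by a single-line node; leave the rest as-is."""
    if node.lineno != node.end_lineno:
        raise ValueError("this sketch only handles single-line nodes")
    lines = source.splitlines(keepends=True)
    row = node.lineno - 1  # AST line numbers are 1-based
    # AST column offsets are UTF-8 byte offsets, so slice the encoded line.
    raw = lines[row].encode("utf-8")
    patched_line = (
        raw[: node.col_offset] + new_text.encode("utf-8") + raw[node.end_col_offset :]
    )
    lines[row] = patched_line.decode("utf-8")
    return "".join(lines)


# Hypothetical sample input; both comments should survive the edit.
SOURCE = "# config\ntimeout = 30  # seconds\n"
tree = ast.parse(SOURCE)
assign = tree.body[0]  # the comment line produces no node, so this is the Assign
patched = replace_node_text(SOURCE, assign.value, "60")
```

Because only the span of the chosen node is rewritten, comments and whitespace elsewhere in the file are never regenerated. When making several edits to one file, applying them from the bottom up keeps earlier offsets valid.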

ChrisA

Barry

Dec 16, 2021, 3:42:50 AM


> On 16 Dec 2021, at 03:49, samue...@gmail.com <samue...@gmail.com> wrote:
>
> I wrote a little open-source tool to expose internal constructs in OpenAPI. Along the way, I added related functionality to:
> - Generate/update a function prototype to/from a class
> - JSON schema
> - Automatically add type annotations to all function arguments, class attributes, declarations, and assignments
>
> alongside a bunch of other features. All implemented using just the builtin modules (plus astor on Python < 3.9; and optionally black).
>
> Now I'm almost at the point where I can run it—without issue—against, e.g., the entire TensorFlow codebase. Unfortunately this is causing huge `diff`s because the comments aren't preserved (and there are some whitespace issues… but I should be able to resolve the latter).
>
> Is the only viable solution to rewrite around redbaron or libcst? I don't need to parse the comments, just emit them unedited wherever they're found…
>
> Thanks for any suggestions

Have a look at the code that is used by https://github.com/asottile/pyupgrade
It uses a couple of libraries that do what I think you want to do.

Barry

>
> PS: Library is https://github.com/SamuelMarks/cdd-python (might relicense with CC0… anyway too early for others to use; wait for the 0.1.0 release ;])

lucas

Dec 16, 2021, 5:56:51 AM
Hi!

Maybe RedBaron could help you?

https://github.com/PyCQA/redbaron

IIRC, it aims to preserve the exact representation of the source
code, including comments and empty lines.

--lucas

samue...@gmail.com

Jan 11, 2022, 12:30:51 AM
Ended up writing my own CST and added it to that library of mine (link above).

My target is adding, removing, and changing: docstrings, function return types, function arguments, and Assign/AnnAssign. All but the last are now implemented.

I was careful not to replace code elsewhere in my codebase: the new CST code lives in its own files, and everything else still works exclusively with the builtin `ast` module as before.
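For readers wondering what one such edit can look like, here is a rough sketch of adding a function return type without touching comments. This is not the CST code in the library, just an illustration built on the standard `tokenize` module. The name `add_return_annotation` is hypothetical, and the sketch assumes the function has no return annotation yet.

```python
import io
import tokenize


def add_return_annotation(source, func_name, annotation):
    """Insert ' -> annotation' after the closing paren of `def func_name(...)`."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    for i, tok in enumerate(tokens[:-1]):
        is_def = tok.type == tokenize.NAME and tok.string == "def"
        if not (is_def and tokens[i + 1].string == func_name):
            continue
        depth = 0
        # Walk the parameter list, tracking bracket depth to find its end.
        for t in tokens[i + 2 :]:
            if t.type != tokenize.OP:
                continue
            if t.string in ("(", "[", "{"):
                depth += 1
            elif t.string in (")", "]", "}"):
                depth -= 1
                if depth == 0:
                    # tokenize reports 1-based rows and character columns.
                    row, col = t.end
                    lines = source.splitlines(keepends=True)
                    line = lines[row - 1]
                    lines[row - 1] = line[:col] + " -> " + annotation + line[col:]
                    return "".join(lines)
    raise ValueError("function %r not found" % func_name)


# Hypothetical sample input with comments that must be preserved.
SOURCE = "def area(w, h):  # width, height\n    # multiply\n    return w * h\n"
patched = add_return_annotation(SOURCE, "area", "float")
```

The edit is a single text insertion at a position taken from the token stream, so the trailing comment on the `def` line and the comment in the body come through unchanged.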

samue...@gmail.com

Jan 11, 2022, 11:25:18 AM
> > PS: Library is https://github.com/SamuelMarks/cdd-python (might relicense with CC0… anyway too early for others to use; wait for the 0.1.0 release ;])