[Python-Dev] Compiling of ast.Module in Python 3.10 and co_firstlineno behavior

Fabio Zadrozny

Feb 17, 2022, 12:59:38 PM
to pytho...@python.org
Hi all,

I've run into an issue where the co_firstlineno behavior changed from Python 3.9 to Python 3.10, and I was wondering whether this was intentional.

That is, whenever code is compiled in Python 3.10, `code.co_firstlineno` is now always 1, whereas previously it was equal to the line of the first statement.

Also, does anyone know if there is any way to restore the old behavior in Python 3.10? I tried setting the `module.lineno` but it didn't really make any difference...

As an example, given the code below:

import dis
import sys
from ast import Module, PyCF_ONLY_AST

source = '''
print(1)

print(2)
'''

# Parse only; with PyCF_ONLY_AST, compile() returns an ast.Module instead of a code object.
initial_module = compile(source, '<nofilename>', 'exec', PyCF_ONLY_AST, 1)

print(sys.version)

for i in range(2):
    # Compile each top-level statement as its own single-statement module.
    module = Module([initial_module.body[i]], [])
    module_code = compile(module, '<no filename>', 'exec')
    print(' --> First lineno:', module_code.co_firstlineno)
    print(' --> Line starts :', list(lineno for offset, lineno in dis.findlinestarts(module_code)))
    print('---- dis ---')
    dis.dis(module_code)


I get the following output for Python 3.9 and Python 3.10:

3.9.6 (default, Jul 30 2021, 11:42:22) [MSC v.1916 64 bit (AMD64)]
 --> First lineno: 2
 --> Line starts : [2]
---- dis ---
  2           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (1)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
 --> First lineno: 4
 --> Line starts : [4]
---- dis ---
  4           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (2)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE


3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)]
 --> First lineno: 1
 --> Line starts : [2]
---- dis ---
  2           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (1)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
 --> First lineno: 1
 --> Line starts : [4]
---- dis ---
  4           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (2)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
Thanks,

Fabio

Mark Shannon

Feb 17, 2022, 2:09:50 PM
to Fabio Zadrozny, Python Dev
Hi Fabio,

This happened as part of implementing PEP 626.
The previous behavior isn't very robust w.r.t doc strings and
compiler optimizations.

OOI, why would you want to revert to the old behavior?

Cheers,
Mark.
> _______________________________________________
> Python-Dev mailing list -- pytho...@python.org
> To unsubscribe send an email to python-d...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at https://mail.python.org/archives/list/pytho...@python.org/message/VXW3TVHVYOMXDQIQBJNZ4BTLXFT4EPQZ/
> Code of Conduct: http://python.org/psf/codeofconduct/

Fabio Zadrozny

Feb 17, 2022, 2:35:20 PM
to Mark Shannon, Python Dev
On Thu, Feb 17, 2022 at 16:05, Mark Shannon <ma...@hotpy.org> wrote:
> Hi Fabio,
>
> This happened as part of implementing PEP 626.
> The previous behavior isn't very robust w.r.t doc strings and
> compiler optimizations.
>
> OOI, why would you want to revert to the old behavior?


Hi Mark,

The issue I'm facing is that IPython obtains the AST of the code to be executed and then executes it node by node.

When running under the debugger, the debugger caches some information keyed on (co_firstlineno, co_name, co_filename) so that it is preserved across multiple calls to the same function. This works in general because each function in a given Python file has its own co_firstlineno. In this specific case, however, it takes a single block of code and recompiles it statement by statement, so every resulting code object has the same co_filename (<cell>) and the same co_name (<module>). Previously the co_firstlineno still differed, because each statement sits on a different line, but with Python 3.10 that assumption fails, as even co_firstlineno is now the same...
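A minimal sketch of the collision described above, assuming a cache keyed exactly on that tuple. The names `compile_statement` and `cache_key` are made up for illustration and are not IPython or debugger APIs:

```python
import ast


def compile_statement(source, index, filename="<cell>"):
    """Compile only the index-th top-level statement, keeping its original line number."""
    tree = ast.parse(source, filename)
    single = ast.Module([tree.body[index]], [])
    return compile(single, filename, "exec")


def cache_key(code):
    # Hypothetical key, mirroring the tuple described in the message.
    return (code.co_firstlineno, code.co_name, code.co_filename)


source = "\nx = 1\n\ny = 2\n"
key_first = cache_key(compile_statement(source, 0))
key_second = cache_key(compile_statement(source, 1))

print(key_first)
print(key_second)
# Whether the two keys collide depends on the interpreter version.
print("keys collide:", key_first == key_second)
```

The name and filename components are identical for both statements, so co_firstlineno is the only part of the key that can tell them apart.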


After tinkering a bit, it seems possible to create a new code object from an existing one with `code.replace` (re-assembling co_lnotab/co_firstlineno), so I'm going to propose that as a fix to IPython. Still, I found it really strange that this changed in Python 3.10 in the first place, as the old behavior seemed reasonable to me: with the new behavior, a user can compile a single statement located on line 99 and yet the resulting code object will have co_firstlineno == 1.
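A rough sketch of the `code.replace` idea, under the assumption that only co_firstlineno is overridden. It does not re-assemble the line table, which is the part a real fix would need to handle, so it prints the line starts before and after the change to show whether they stay put. `compile_statement` is a hypothetical helper:

```python
import ast
import dis


def compile_statement(source, index, filename="<cell>"):
    """Return (statement node, code object) for the index-th top-level statement."""
    tree = ast.parse(source, filename)
    stmt = tree.body[index]
    return stmt, compile(ast.Module([stmt], []), filename, "exec")


stmt, code = compile_statement("\nprint(1)\n\nprint(2)\n", 1)

# Override only co_firstlineno; the line table itself is left untouched.
patched = code.replace(co_firstlineno=stmt.lineno)

for label, obj in (("original", code), ("patched", patched)):
    starts = [line for _offset, line in dis.findlinestarts(obj)]
    print(label, "co_firstlineno:", obj.co_firstlineno, "line starts:", starts)
```

If the reported line starts move after the change, the line table is stored relative to co_firstlineno on that interpreter and has to be rebuilt as well.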

-- note: I also couldn't find any mention of this in the changelog, so, I thought this could've happened by mistake.

Best regards,

Fabio

Gabriele

Feb 17, 2022, 3:59:43 PM
to Fabio Zadrozny, Python Dev
Hi Fabio

Does the actual function object get re-created as well during the
recompilation process that you have described? Perhaps it might help
to note that the __code__ attribute of a function object f can be
mutated and that f is hashable?
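As a small illustration of that point (the function names are made up): assigning to `__code__` changes what the function does, while the function object, and therefore its hash, stays the same.

```python
def cell_function():
    return "old body"


def _replacement():
    return "new body"


hash_before = hash(cell_function)

# Swap in the replacement's code object; the function object itself is unchanged.
cell_function.__code__ = _replacement.__code__

result = cell_function()
hash_after = hash(cell_function)
print(result, hash_before == hash_after)
```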

Cheers,
Gabriele
--
"It is written in the language of mathematics, and its characters are triangles,
circles, and other geometric figures,
without which it is humanly impossible to understand a word of it;
without these, one wanders about in vain through a dark labyrinth."

-- G. Galilei, Il saggiatore.

Fabio Zadrozny

Feb 18, 2022, 5:36:31 AM
to Gabriele, Python Dev
On Thu, Feb 17, 2022 at 17:55, Gabriele <phoen...@gmail.com> wrote:
> Hi Fabio
>
> Does the actual function object get re-created as well during the
> recompilation process that you have described? Perhaps it might help
> to note that the __code__ attribute of a function object f can be
> mutated and that f is hashable?

Thank you for the reminder... Right now, the way it works in IPython is that the code object is recreated each time and then executed directly (which kind of makes sense, since cells are expected to change between evaluations).

I had previously considered keying the debugger's cache on the code object itself, but as code objects can be created during regular execution, the debugger could end up creating a huge leak.

Best regards,

Fabio

Mark Shannon

Feb 18, 2022, 6:11:37 AM
to Fabio Zadrozny, Python Dev
Hi Fabio,

On 17/02/2022 7:30 pm, Fabio Zadrozny wrote:
>
> On Thu, Feb 17, 2022 at 16:05, Mark Shannon <ma...@hotpy.org> wrote:
>
> Hi Fabio,
>
> This happened as part of implementing PEP 626.
> The previous behavior isn't very robust w.r.t doc strings and
> compiler optimizations.
>
> OOI, why would you want to revert to the old behavior?
>
>
> Hi Mark,
>
> The issue I'm facing is that IPython obtains the AST of the code to be executed and then executes it node by node.
>
> When running under the debugger, the debugger caches some information keyed on (co_firstlineno, co_name, co_filename) so that it is preserved across multiple calls to the same function. This works in general because each function in a given Python file has its own co_firstlineno. In this specific case, however, it takes a single block of code and recompiles it statement by statement, so every resulting code object has the same co_filename (<cell>) and the same co_name (<module>). Previously the co_firstlineno still differed, because each statement sits on a different line, but with Python 3.10 that assumption fails, as even co_firstlineno is now the same...

A bit off topic, but why not use a different name for each cell?
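One possible reading of that suggestion, sketched under the assumption that the name is set per compiled statement via `code.replace(co_name=...)`. The naming scheme and the helper `compile_statements` are hypothetical:

```python
import ast


def compile_statements(source, filename="<cell>"):
    """Compile each top-level statement separately, giving each code object its own name."""
    tree = ast.parse(source, filename)
    codes = []
    for stmt in tree.body:
        code = compile(ast.Module([stmt], []), filename, "exec")
        # Hypothetical naming scheme; any string unique per statement would do.
        codes.append(code.replace(co_name=f"<cell line {stmt.lineno}>"))
    return codes


codes = compile_statements("\nx = 1\n\ny = 2\n")
names = [code.co_name for code in codes]
print(names)

# The renamed code objects still execute normally.
namespace = {}
for code in codes:
    exec(code, namespace)
```

With distinct names, a key built from (co_firstlineno, co_name, co_filename) no longer depends on co_firstlineno to tell the statements apart.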

>
> You can see the actual issues at: https://github.com/microsoft/vscode-jupyter/issues/8803 / https://github.com/ipython/ipykernel/issues/841/ https://github.com/microsoft/debugpy/issues/844
>
> After tinkering a bit, it seems possible to create a new code object from an existing one with `code.replace` (re-assembling co_lnotab/co_firstlineno), so I'm going to propose that as a fix to IPython. Still, I found it really strange that this changed in Python 3.10 in the first place, as the old behavior seemed reasonable to me: with the new behavior, a user can compile a single statement located on line 99 and yet the resulting code object will have co_firstlineno == 1.

That's the behavior for functions. If I define a function on line 10, but the first line of code in that function is on line 100, then `func.__code__.co_firstlineno == 10`, not 100. Modules start on line 1, by definition.

You can find the first line of actual code using the `co_lines()` iterator.

firstline = next(mod.__code__.co_lines())[2]

Cheers,
Mark.