Hello,
I was working on fixing [1] when I realized that there are some deeper
problems with the way that the introspection tool (`meson introspect
meson.build ...` and `meson rewrite ...`) works. Since the website [2]
says "contact the mailing list before embarking on large scale projects
to avoid wasted effort", I'm writing this mail to propose two changes:
1. Dataflow as a DAG of AST nodes:
Currently, the introspection interpreter works by going through
meson.build line by line and keeping a dictionary of variable
assignments. So if the meson.build file contains
foo = 'abc'
bar = 'def'
executable(foo, 'input.c')
other = '123'
and the interpreter currently reads the 3rd line, then
self.assignments == {'foo': StringNode('abc'), 'bar': StringNode('def')}
This would work fine for code in SSA form [3], but meson.build files
are not in SSA form, so this method is *fundamentally* broken. If
meson.build contains
foo = 'abc'
bar = foo
foo = 'def'
then bar is 'abc' and not 'def' as the introspection tool currently
thinks, which results in issue #11763.
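To make the failure concrete, here is a minimal Python sketch of the
lookup strategy described above. It is not the actual Meson code:
`StringNode` and `IdNode` are simplified stand-ins, `assign` and
`resolve` are hypothetical helpers, and it assumes that variable
references are resolved through the dictionary only after the whole
file has been walked.

```python
from dataclasses import dataclass


@dataclass
class StringNode:
    value: str


@dataclass
class IdNode:
    name: str  # a reference to a variable, e.g. the `foo` in `bar = foo`


# Hypothetical stand-in for the interpreter state: name -> last assigned node.
assignments = {}


def assign(name, node):
    # A later assignment to the same name overwrites the earlier one.
    assignments[name] = node


def resolve(node):
    # Follow identifier references through the dictionary until a literal is reached.
    while isinstance(node, IdNode):
        node = assignments[node.name]
    return node.value


assign('foo', StringNode('abc'))  # foo = 'abc'
assign('bar', IdNode('foo'))      # bar = foo
assign('foo', StringNode('def'))  # foo = 'def'

naive_bar = resolve(assignments['bar'])
print(naive_bar)  # prints 'def', although evaluating the file gives bar == 'abc'
```

The dictionary only remembers the latest assignment per name, so the
reference stored for `bar` is resolved against the wrong `foo`.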
My proposed solution is for the introspection interpreter to execute
a.value_at_that_time = b
when it parses the 2nd line ('bar = foo'), where a is the AST Node
corresponding to 'foo' in the 2nd line and b is the AST Node
corresponding to "foo = 'abc'" in the 1st line. That way we have all the
information we need if we later want to analyse the data flow or find
the value of a variable.
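A rough Python sketch of this idea follows. Only the attribute name
`value_at_that_time` comes from the proposal above; the node classes
are simplified stand-ins, and `walk` and `resolve` are hypothetical
helpers, so please read it as an illustration and not as a finished
design.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class StringNode:
    value: str


@dataclass
class AssignmentNode:
    var_name: str
    value: Union['StringNode', 'IdNode']


@dataclass
class IdNode:
    name: str
    # Filled in while walking the file: the assignment that was live
    # at the point where this identifier was used.
    value_at_that_time: Optional[AssignmentNode] = None


def walk(statements):
    current = {}  # name -> most recent AssignmentNode seen so far
    for stmt in statements:
        if isinstance(stmt.value, IdNode):
            # Link the use to the assignment that reaches it right now.
            stmt.value.value_at_that_time = current[stmt.value.name]
        current[stmt.var_name] = stmt


def resolve(node):
    # Follow the recorded links instead of a global name -> node dictionary.
    while isinstance(node, IdNode):
        node = node.value_at_that_time.value
    return node.value


program = [
    AssignmentNode('foo', StringNode('abc')),  # foo = 'abc'
    AssignmentNode('bar', IdNode('foo')),      # bar = foo
    AssignmentNode('foo', StringNode('def')),  # foo = 'def'
]
walk(program)
bar_value = resolve(program[1].value)
print(bar_value)  # prints 'abc'
```

Because the link is stored on the use site, the later reassignment of
foo no longer affects how bar is resolved.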
This scheme cannot properly handle some cases of variable assignments
inside a foreach loop, but neither can the current implementation and I
can't figure out a better solution. Example of a tricky case that won't
be handled properly:
var = 'foo'
foreach x : ['a.c', 'b.c']
    executable(var, x)
    var = 'bar'
endforeach
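For reference, a plain Python analog of that loop shows what a full
evaluation produces. The `executable` function here is only a stand-in
that records its arguments. The use of `var` inside the loop is reached
by a different assignment in each iteration, which, as far as I can
see, a single `value_at_that_time` link cannot express.

```python
targets = []


def executable(name, source):
    # Stand-in for Meson's executable(); it only records the call.
    targets.append((name, source))


var = 'foo'
for x in ['a.c', 'b.c']:
    executable(var, x)  # first iteration sees 'foo', second sees 'bar'
    var = 'bar'

print(targets)  # [('foo', 'a.c'), ('bar', 'b.c')]
```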
2. Add proper error handling:
The introspection tool has some limitations and will always have some
limitations, i.e. not all information can be statically deduced. For
example, if meson.build contains
executable(run_command('command').stdout(), 'foo.c', 'bar.c')
the introspection tool *definitely* cannot say what the name of this
executable will be. So how does the introspection tool currently handle
this?
$ meson introspect --targets meson.build
[{"name": "foo.c", "id": "foo.c@exe", "type": "executable",
"defined_in": "meson.build", "filename": ["foo.c"], "build_by_default":
true, "target_sources": [{"language": "unknown", "compiler": [],
"parameters": [], "sources":
["/home/volker/Documents/mesoncases/11763/foo.c",
"/home/volker/Documents/mesoncases/11763/bar.c"], "generated_sources":
[]}], "extra_files": [], "subproject": null, "installed": false}]
It thinks the executable is named "foo.c". If we look at the code we see
that this was not a conscious decision, but an accident:
(All line numbers are for commit adaea4136fdbf24933f243400f7771128f74deed)
mesonbuild/interpreterbase/interpreterbase.py:509 is
(h_posargs, h_kwargs) = self.reduce_arguments(node.args)
and here, node.args.arguments is [MethodNode(...), StringNode(...),
StringNode(...)] but h_posargs is ['foo.c', 'bar.c']. In other words,
self.reduce_arguments just ignores arguments it cannot evaluate, which
is not good error handling (does anyone disagree?).
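The effect can be reproduced with a few lines of Python. This is a
simplified sketch, not the code from interpreterbase.py:
`reduce_arguments_sketch` is a hypothetical function that merely
mimics the observed behaviour, and the node classes are stand-ins.

```python
from dataclasses import dataclass


@dataclass
class StringNode:
    value: str


@dataclass
class MethodNode:
    description: str  # e.g. "run_command('command').stdout()"


def reduce_arguments_sketch(arguments):
    # Keep only what can be evaluated statically; everything else
    # silently disappears and the remaining positions shift left.
    return [arg.value for arg in arguments if isinstance(arg, StringNode)]


arguments = [
    MethodNode("run_command('command').stdout()"),
    StringNode('foo.c'),
    StringNode('bar.c'),
]
h_posargs = reduce_arguments_sketch(arguments)
print(h_posargs)  # ['foo.c', 'bar.c']
```

Three arguments go in and two come out, with nothing left to indicate
that the first position was dropped.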
mesonbuild/ast/introspection.py:227 is
name = args[0]
because the first argument to `executable` is its name. Since this
"args" here is the "h_posargs" from above, it thinks that 'foo.c' is the
name of the executable. A better way to handle variables with a value
that cannot be statically deduced would be to introduce a class named
`UnknownValue` and make h_posargs be [UnknownValue(...), 'foo.c', 'bar.c'].
Many parts of the introspection tool ignore errors in a similar way; I
want them to produce UnknownValue objects too.
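A minimal sketch of what I have in mind, in Python. Only the class
name `UnknownValue` is part of the proposal; its `reason` field, the
`reduce_arguments_sketch` function, and the node classes are
assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class StringNode:
    value: str


@dataclass
class MethodNode:
    description: str


@dataclass
class UnknownValue:
    reason: str  # assumed field: why the value could not be deduced


def reduce_arguments_sketch(arguments):
    result = []
    for arg in arguments:
        if isinstance(arg, StringNode):
            result.append(arg.value)
        else:
            # Keep the position and record that the value is unknown.
            result.append(UnknownValue(f'cannot statically evaluate {type(arg).__name__}'))
    return result


h_posargs = reduce_arguments_sketch([
    MethodNode("run_command('command').stdout()"),
    StringNode('foo.c'),
    StringNode('bar.c'),
])

name = h_posargs[0]
if isinstance(name, UnknownValue):
    print('target name is unknown:', name.reason)
```

Callers such as the code that reads `args[0]` could then check for
UnknownValue explicitly instead of picking up a shifted argument.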
Greetings
Volker
[1] https://github.com/mesonbuild/meson/issues/11763
[2] https://mesonbuild.com/Contributing.html
[3] https://en.wikipedia.org/wiki/Static_single-assignment_form