You may want to do this in two steps: first go over all the C files and
generate a data file with syscalls and corresponding conditionals, and
then have a separate script that reads that data file and an object file
and generates the config snippet. That way, you only do the source
analysis once, and you can hand-edit the results if you need to clean
them up.
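A rough sketch of how that two-step split could look. The JSON format, the
file name, and both function names are assumptions made for illustration;
nothing in the patch defines them, and the CONFIG symbols are only sample
data. Whether the snippet should enable the needed options or disable the
unneeded ones is a separate decision; this version just lists the needed ones.

```python
import json
import os
import tempfile


def save_syscall_map(syscall_map, path):
    """Step 1 output: write {syscall name: [CONFIG symbols]} to a data file."""
    with open(path, "w") as f:
        json.dump(syscall_map, f, indent=2, sort_keys=True)


def config_snippet(path, symbols):
    """Step 2: read the data file and list the CONFIG symbols the given
    symbols depend on. Symbols that are not in the data file are ignored."""
    with open(path) as f:
        syscall_map = json.load(f)
    needed = set()
    for name in symbols:
        needed.update(syscall_map.get(name, []))
    return ["%s=y" % config for config in sorted(needed)]


# Sample data only; the real map would come from the source analysis.
example = {"timerfd_create": ["CONFIG_TIMERFD"], "epoll_wait": ["CONFIG_EPOLL"]}
data_file = os.path.join(tempfile.mkdtemp(), "syscalls.json")
save_syscall_map(example, data_file)
snippet = config_snippet(data_file, ["epoll_wait", "printf"])
```

Because the data file is plain JSON, it can be hand-edited between the two
steps without touching either script.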
> Signed-off-by: Iulia Manda <iulia....@gmail.com>
> ---
> scripts/compile_syscalls.py | 194 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 194 insertions(+)
> create mode 100755 scripts/compile_syscalls.py
>
> diff --git a/scripts/compile_syscalls.py b/scripts/compile_syscalls.py
> new file mode 100755
> index 0000000..e573686
> --- /dev/null
> +++ b/scripts/compile_syscalls.py
> @@ -0,0 +1,194 @@
> +#!/usr/bin/python
> +
> +import re, sys, os, fileinput
You don't appear to be using fileinput here.
> +import pprint
You're not using pprint except for some commented-out debugging output;
you could import it next to that debugging output, also commented out.
> +
> +if len(sys.argv) < 3:
> +    sys.stderr.write("usage: %s object_file syscalls-optional source_files\n"
> +                     % sys.argv[0])
> +    sys.exit(-1)
Conventionally, Python scripts define a function "main(args)", and then at the
bottom of the script, they have:
    if __name__ == "__main__":
        sys.exit(main(sys.argv))
You can then refer to main's argument instead of sys.argv, and use
return instead of sys.exit.
Also, -1 isn't a valid program return code; return codes go from 0-255,
with 0 being success. I'd suggest using 1 instead.
> +
> +# Find what syscalls a userspace uses
> +def get_userspace_syscalls(file):
Since this one just gets all glibc symbol names, I'd suggest calling it
"get_symbols"; later on you'll determine whether they're present in your
list of syscalls.
> +    sym = []
> +    lines = iter(os.popen("nm " + file).readlines())
You don't need to call readlines() and then call iter() on the result.
You can just do "for l in os.popen(...):". Iteration by line is the
default behavior you get if you iterate over a file or file-like object.
Also, nm works on an object file, but if you ran this on an executable,
you'd need "nm -D" instead. I don't know how to handle both cases
transparently, other than trying one first, and if you see a line ending
in "no symbols" as you loop through, trying the other one. Or just
always running both.
Finally, you can pass --undefined-only to nm to only show the symbols
that the object or program wants from some other library, rather than
the symbols the object or program defines.
> +    for l in lines:
> +        if not '@@GLIBC' in l:
Python has a "not in" operator, which allows you to write this more
naturally as:
    if "@@GLIBC" not in l:
> +            continue
> +        words = l.split()
> +        for e in words:
You can just use "for e in l.split():".
Or, better yet, since the output of nm with --undefined-only should
always consist of two columns, the first just containing "U", and the
second containing a symbol name, you can just write:
    t, n = l.split()
Then check if t == 'U', and if so, check n, without looping.
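Putting the nm suggestions together, a minimal sketch might look like the
following. It assumes Python 3.7 or later (for subprocess.run with
capture_output), and the function names and the injectable "run" parameter
are hypothetical. Falling back to "nm -D" whenever the first run yields no
undefined symbols is a simplification of checking for a "no symbols" message.

```python
import subprocess


def parse_undefined(lines):
    """Return symbol names from `nm --undefined-only` output lines."""
    names = []
    for line in lines:
        fields = line.split()
        # Expect exactly two columns: the type "U" and the symbol name.
        if len(fields) != 2 or fields[0] != "U":
            continue
        # Drop a version suffix such as "@@GLIBC_2.2.5" or "@GLIBC_2.2.5".
        names.append(fields[1].partition("@")[0])
    return names


def run_nm(path, dynamic=False):
    """Run nm on path and return its standard output as a list of lines."""
    cmd = ["nm", "--undefined-only"]
    if dynamic:
        cmd.append("-D")
    cmd.append(path)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout.splitlines()


def get_symbols(path, run=run_nm):
    """Try the regular symbol table first, then the dynamic one."""
    names = parse_undefined(run(path))
    if not names:
        names = parse_undefined(run(path, dynamic=True))
    return names
```

Keeping the parsing separate from the call to nm makes it possible to check
the parser against canned output without needing an object file at hand.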
> +            if '@@GLIBC' in e:
> +                sym.append(re.split("@@GLIBC", e)[0])
> +                break
> +    return sym
> +
> +
> +# Find which syscalls from userspace can be optionally compiled in the kernel
> +def get_optional_syscalls(file):
> +    cnf = []
> +    # Run this on the object file of the application
> +    sym = get_userspace_syscalls(sys.argv[1])
> +    for e in sym:
> +        with open(file) as f:
> +            lines = f.read().splitlines()
> +            i = "sys_" + e
> +            if i in lines:
> +                cnf.append(e)
> +    return cnf
This function is reading every line of f for every symbol.
Instead, I'd suggest reading the file once, putting the results into a
dictionary, and looking up each of the symbols in the dictionary.
On top of that, get_optional_syscalls shouldn't be looking at
sys.argv[1] or calling get_userspace_syscalls; you should process the
optional syscall list and return a dictionary, and the caller can
evaluate the userspace syscalls against that dictionary.
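As a sketch of that split, assuming the optional-syscalls file has one
"sys_<name>" entry per line (which is what the lookup in the patch implies).
The function names here are hypothetical.

```python
def load_optional_syscalls(lines):
    """Read the optional-syscall list once and return a dictionary mapping
    each bare syscall name to its "sys_"-prefixed entry."""
    optional = {}
    for line in lines:
        entry = line.strip()
        if entry.startswith("sys_"):
            optional[entry[len("sys_"):]] = entry
    return optional


def filter_optional(symbols, optional):
    """Keep only the symbols that appear in the optional-syscall dictionary."""
    return [s for s in symbols if s in optional]
```

The caller would open the file, pass it to load_optional_syscalls once, and
then run the symbols from the object file through filter_optional, so
neither function needs to look at sys.argv.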
Also, a few lines there have trailing spaces. You should check for
those in the whole file, and drop any trailing spaces.
> +
> +
> +def c_to_o(file):
> +    f = re.split('/', file)[-1]
This is os.path.basename(file).
> +    name, ext = os.path.splitext(f)
> +    return " " + name + ".o"
Prepending a space seems very odd here. If that's needed as part of the
search in the makefile, the caller should do that. More importantly,
including that space will miss some cases, since it's perfectly legal to
write:
    obj-$(CONFIG_SOMETHING):=something.o
without spaces.
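One possible way to match the object name without relying on a leading
space, sketched below. The helper "mentions_object" and its regular
expression are assumptions for illustration, not something from the patch;
the pattern only requires that the name is not part of a longer word.

```python
import os
import re


def c_to_o(path):
    """Map a C source path to the object file name a Makefile would list."""
    name, _ext = os.path.splitext(os.path.basename(path))
    return name + ".o"


def mentions_object(line, obj):
    """True if the Makefile line lists obj as a whole name, whether or not
    it is preceded by a space."""
    pattern = r"(?<![\w-])" + re.escape(obj) + r"(?![\w.])"
    return re.search(pattern, line) is not None
```

With this, "obj-$(CONFIG_SOMETHING):=something.o" matches "something.o",
while a line that only lists "notsomething.o" does not.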
> +
> +
> +def add_to_dictionary(dict1, key, value):
> +    if key in dict1:
> +        dict1[key].extend(value)
> +    else:
> +        dict1[key] = value
There's a simpler pattern for this:
    dict1.setdefault(key, []).extend(value)
setdefault looks up the key in the dictionary, sets it to the passed
default value if not already set, and then returns it.
That's a common enough pattern that I'd suggest just inlining it into
the caller rather than having a function for it.
Also, be careful about the difference between "append" and "extend".
"extend" adds each item in an iterable to the list, while "append" adds
its argument to the list. And since a string is iterable, if you pass
it to "extend" you add all the characters rather than the whole string.
If you're adding a single value, I'd suggest using append and passing
that value, rather than using extend and passing the single value
wrapped in a list. (On the other hand, if you actually do have a list
of values to add, definitely use extend.)
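A short illustration of the difference. The dictionary names and the
CONFIG_A and CONFIG_B values are made up for the example.

```python
conditions = {}

# A single value: use append.
conditions.setdefault("epoll_wait", []).append("CONFIG_EPOLL")

# A list of values: use extend.
conditions.setdefault("epoll_wait", []).extend(["CONFIG_A", "CONFIG_B"])

# The pitfall: extend with a bare string adds one entry per character.
oops = {}
oops.setdefault("epoll_wait", []).extend("CONFIG_EPOLL")
```

After this runs, conditions["epoll_wait"] holds three strings, while
oops["epoll_wait"] holds twelve single-character strings.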
Nice. This took me a minute to understand, but it looks like you're
handling lines that end in a backslash and thus get continued in the
next line. This deserves a comment.
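For reference, one way to write continuation handling with a comment that
explains it. This is only a sketch under the assumption that the goal is to
merge each backslash-continued run into one logical line; it is not the code
from the patch, and the function name is hypothetical.

```python
def join_continued_lines(lines):
    """Merge Makefile lines that end in a backslash with the line that
    follows, so each returned entry is one logical line."""
    joined = []
    pending = ""
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\\"):
            # Continued on the next line: drop the backslash and keep
            # accumulating.
            pending += line[:-1] + " "
            continue
        joined.append(pending + line)
        pending = ""
    if pending:
        # The file ended in the middle of a continuation.
        joined.append(pending.rstrip())
    return joined
```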
> + yes = '\n'.join([l for l in lines if search_for in l])
> + if re.search('.*-\$\(CONFIG.*\)', yes):
> + value = re.split('[$()]', yes)[2]
> + for e in sys_list:
> + if not e in map_sys:
> + map_sys[e] = []
> + map_sys[e].append(value)
This is that same pattern mentioned above, where you can just use
setdefault. Also, since sys_list is a list, you can use extend rather
than a loop over calls to append.
> + # Check if a file is compiled under ifdefs
> + for l in lines:
> + if re.search('^ifdef', l):
As mentioned on our previous call, you may also need to handle certain
cases of "ifeq" or "ifneq". At least, if any syscalls depend on those;
if not, don't worry about it.
> + name = get_ifdef_symbols(l)
> + new = Node(parent=curr, name=name)
> + curr.add_child(new)
> + curr = new
> + elif search_for in l:
> + if curr.name:
> + for e in sys_list:
> + add_to_dictionary(map_sys, e, curr.name)
> + elif re.search('^endif', l):
> + if curr.parent is not None:
> + curr = curr.parent
> + except:
> + pass
There's one additional case you might have to handle: a Makefile in a
higher-level directory might configure out the entire directory. I'm not
sure if there are any cases of that happening for syscalls, though.
Also, the Node mechanism may be more complex than you need; it looks
like you only ever have a single child for each node, so instead of a
tree-like structure, you can just use a stack (implemented as a list).
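A minimal sketch of the stack approach, with several simplifying
assumptions: only "ifdef" contributes a symbol, while "ifndef", "ifeq" and
"ifneq" are pushed as uninterpreted placeholders so that each "endif" still
pops the right level, and "else" branches are ignored. The function name is
hypothetical.

```python
import re


def conditions_for(lines, search_for):
    """Return the ifdef symbols enclosing each line that mentions search_for."""
    stack = []   # one entry per open conditional; None means "not interpreted"
    found = []
    for line in lines:
        stripped = line.strip()
        m = re.match(r"ifdef\s+(\S+)", stripped)
        if m:
            stack.append(m.group(1))
        elif re.match(r"(ifndef|ifeq|ifneq)\b", stripped):
            stack.append(None)
        elif re.match(r"endif\b", stripped):
            if stack:
                stack.pop()
        elif search_for in stripped:
            found.extend(s for s in stack if s is not None)
    return found
```

The list plays the role of the parent chain: the last element is the
innermost open conditional, and popping it on "endif" is the equivalent of
moving back to curr.parent.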