[PATCH RFC] scripts: Detect what syscalls a userspace uses

Iulia Manda

Feb 6, 2015, 11:09:10 AM
to jo...@joshtriplett.org, opw-k...@googlegroups.com
This is the first part of a patchset that should find out what syscalls a
specific userspace uses and, in the end, compile only the needed
implementations in the kernel.

It searches for values smaller than 360 (in most cases, a bigger
value would actually be an address) that are stored in an arch-specific
register (mentioned in the documentation). Then, it creates a list of numbers
corresponding to the syscalls used by that binary.
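
To illustrate the heuristic described above, here is a minimal sketch that
works on already-captured objdump lines. It is not the script in this patch:
the helper name candidate_syscalls, the regular expression, and the sample
disassembly lines are assumptions made for the example.

```python
import re

# Hypothetical pattern: an AT&T-syntax mov of a hex immediate into a register,
# e.g. "mov    $0x3c,%eax". Group 1 is the immediate, group 2 the register.
MOV_IMM = re.compile(r'\bmov[a-z]*\s+\$0x([0-9a-fA-F]+),%(\w+)')

def candidate_syscalls(lines, reg='eax', limit=360):
    """Collect immediates below `limit` that are moved into `reg`."""
    found = set()
    for line in lines:
        m = MOV_IMM.search(line)
        if m and m.group(2) == reg:
            value = int(m.group(1), 16)
            # Larger values are most probably addresses, not syscall numbers.
            if value < limit:
                found.add(value)
    return sorted(found)

# Made-up disassembly lines in the style of objdump output.
sample = [
    "  400080:\tb8 3c 00 00 00       \tmov    $0x3c,%eax",
    "  400085:\tbf 00 00 00 00       \tmov    $0x0,%edi",
    "  40008a:\tb8 00 10 60 00       \tmov    $0x601000,%eax",
    "  40008f:\t0f 05                \tsyscall",
]
print(candidate_syscalls(sample))
```

On the sample, only the first mov qualifies: the second targets a different
register and the third loads a value above the limit.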

I tested this on libc builds for i386 and x86_64.

For an arbitrary userspace, let's say a C program that only uses 3 libc
wrappers, one solution would be to find which symbols are defined and search
for calls to the syscall wrappers,
e.g. callq ... <read@plt>
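
A rough sketch of that wrapper-based idea, again on captured objdump lines.
The helper name plt_calls, the regular expression, and the sample lines are
hypothetical, and mapping a wrapper name to a syscall number is left out.

```python
import re

# Hypothetical pattern for a call through the PLT, e.g.
# "callq  400400 <read@plt>". Group 1 is the wrapper name.
PLT_CALL = re.compile(
    r'\bcall[a-z]*\s+[0-9a-fA-F]+\s+<([A-Za-z_][A-Za-z0-9_]*)@plt')

def plt_calls(lines):
    """Return the sorted, de-duplicated names of functions called via the PLT."""
    names = set()
    for line in lines:
        m = PLT_CALL.search(line)
        if m:
            names.add(m.group(1))
    return sorted(names)

# Made-up disassembly lines in the style of objdump output.
sample = [
    "  40052d:\te8 ce fe ff ff       \tcallq  400400 <read@plt>",
    "  400540:\te8 ab fe ff ff       \tcallq  4003f0 <write@plt>",
    "  400555:\te8 a6 fe ff ff       \tcallq  400400 <read@plt>",
    "  40055a:\tc3                   \tretq",
]
print(plt_calls(sample))
```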

Signed-off-by: Iulia Manda <iulia....@gmail.com>
---
scripts/syscall_list.py | 48 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)
create mode 100755 scripts/syscall_list.py

diff --git a/scripts/syscall_list.py b/scripts/syscall_list.py
new file mode 100755
index 0000000..557bb01
--- /dev/null
+++ b/scripts/syscall_list.py
@@ -0,0 +1,48 @@
+#!/usr/bin/python
+
+import sys, os, re
+
+ARCH_SYSCALL_REG = {
+    'x86_64': 'eax'
+    #TODO add more
+}
+
+def print_usage():
+    # we need to know the arch in order to know in which register
+    # the number of the syscall is set:
+    sys.stderr.write("Usage: %s object_file arch\n" % sys.argv[0])
+    sys.stderr.write("Please select arch as one of the:\n")
+    for arch in ARCH_SYSCALL_REG.keys():
+        sys.stderr.write("\t%s\n" % arch)
+    sys.exit(-1)
+
+if len(sys.argv) != 3:
+    print_usage()
+
+ARCH = ARCH_SYSCALL_REG.get(sys.argv[2])
+
+if ARCH is None:
+    print_usage()
+
+def get_sys_no(file):
+    sym = []
+    lines = os.popen("objdump -lD " + file).readlines()
+    for l in lines:
+        l1 = l.strip().split()
+        if 'mov' not in l1:
+            continue
+        for e in l1:
+            # TODO use ARCH variable
+            if 'eax' in e and e.split(',')[1] == '%eax':
+                test = (e.split(',')[0])[1:].split("(")[0]
+                try:
+                    value = int(test, 16)
+                    # if value is larger, it is most probably an address
+                    if value < 360 and value not in sym:
+                        sym.append(value)
+                except Exception as e:
+                    pass
+    sym.sort()
+    print sym
+
+get_sys_no(sys.argv[1])
--
1.7.10.4