[PATCH] scripts: upgrade to python 3 - part 1

Waldemar Kozaczuk

Feb 17, 2020, 1:22:50 AM2/17/20
to osv...@googlegroups.com, Waldemar Kozaczuk, Matt Bass
This patch incorporates changes from the original patch
submitted by Matt Bass, along with extra changes made
by Waldemar Kozaczuk, to upgrade the Python scripts from
version 2 to version 3.

Some trivial changes are the result of automated upgrade tools like
the "future" module or 2to3. Others, involving proper handling of binary
and string data (especially when using the subprocess module), have been
made manually. The same goes for any code that compared non-uniform
data, about which Python 3 is stricter.
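The binary-versus-string handling mentioned above boils down to one recurring pattern (a minimal sketch for illustration, not taken verbatim from the patch): in Python 3, subprocess.check_output() returns bytes, which must be decoded before mixing with str operations such as split() or substring searches.

```python
import subprocess

# In Python 3, check_output() returns bytes (b'hello\n'), so string
# operations fail until the output is decoded. The recurring fix:
raw = subprocess.check_output(['echo', 'hello'])
text = raw.decode('utf-8')
first_line = text.split('\n')[0]
print(first_line)
```

This is the same transformation applied throughout the diff below, e.g. to the `java -version`, `ip tuntap`, and `curl` invocations.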

Most of the scripts have been tested, but there may be scenarios/flows
that have been missed.

Ref #1056

Signed-off-by: Matt Bass <bassma...@gmail.com>
Signed-off-by: Waldemar Kozaczuk <jwkoz...@gmail.com>
---
modules/openjdk8-from-host/module.py | 4 +-
scripts/export_manifest.py | 14 +--
scripts/firecracker.py | 10 +--
scripts/gen-rofs-img.py | 28 +++---
scripts/imgedit.py | 11 +--
scripts/libosv.py | 4 +-
scripts/loader.py | 54 ++++++------
scripts/manifest_common.py | 2 +-
scripts/metadata.py | 6 +-
scripts/mkbootfs.py | 9 +-
scripts/module.py | 4 +-
scripts/nbd_client.py | 2 +-
scripts/osv/client.py | 2 +-
scripts/osv/debug.py | 3 +-
scripts/osv/modules/filemap.py | 17 +++-
scripts/osv/prof.py | 20 +++--
scripts/osv/trace.py | 45 ++++++----
scripts/osv/tree.py | 4 +-
scripts/run.py | 4 +-
scripts/setup.py | 10 +--
scripts/test.py | 10 +--
scripts/tests/test_app.py | 3 +-
scripts/tests/test_app_with_test_script.py | 2 +-
.../tests/test_http_app_with_curl_and_ab.py | 12 +--
scripts/tests/testing.py | 4 +-
scripts/trace.py | 88 ++++++++++---------
scripts/upload_manifest.py | 15 +---
27 files changed, 202 insertions(+), 185 deletions(-)

diff --git a/modules/openjdk8-from-host/module.py b/modules/openjdk8-from-host/module.py
index a55c78e9..faab8c36 100644
--- a/modules/openjdk8-from-host/module.py
+++ b/modules/openjdk8-from-host/module.py
@@ -11,12 +11,12 @@ if subprocess.call(['which', 'javac']) != 0:
print('Could not find any jdk on the host. Please install openjdk8!')
os.exit(-1)

-java_version = subprocess.check_output(['java', '-version'], stderr=subprocess.STDOUT)
+java_version = subprocess.check_output(['java', '-version'], stderr=subprocess.STDOUT).decode('utf-8')
if not 'openjdk version "1.8.0' in java_version:
print('Could not find openjdk version 8 on the host. Please install openjdk8!')
os.exit(-1)

-javac_path = subprocess.check_output(['which', 'javac']).split('\n')[0]
+javac_path = subprocess.check_output(['which', 'javac']).decode('utf-8').split('\n')[0]
javac_real_path = os.path.realpath(javac_path)
jdk_path = os.path.dirname(os.path.dirname(javac_real_path))

diff --git a/scripts/export_manifest.py b/scripts/export_manifest.py
index 5e8fb4e7..370b1f29 100755
--- a/scripts/export_manifest.py
+++ b/scripts/export_manifest.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python3

import optparse, os, shutil
from manifest_common import add_var, expand, unsymlink, read_manifest, defines, strip_file
@@ -8,7 +8,7 @@ from manifest_common import add_var, expand, unsymlink, read_manifest, defines,
# support for links in OSv, e.g., /etc/mnttab: ->/proc/mounts.
def export_package(manifest, dest):
abs_dest = os.path.abspath(dest)
- print "[INFO] exporting into directory %s" % abs_dest
+ print("[INFO] exporting into directory %s" % abs_dest)

# Remove and create the base directory where we are going to put all package files.
if os.path.exists(abs_dest):
@@ -39,7 +39,7 @@ def export_package(manifest, dest):
os.makedirs(target_dir)

os.symlink(link_source, name)
- print "[INFO] added link %s -> %s" % (name, link_source)
+ print("[INFO] added link %s -> %s" % (name, link_source))

else:
# If it is a symlink, then resolve it add to the list of host symlinks to be created later
@@ -58,23 +58,23 @@ def export_package(manifest, dest):
hostname = strip_file(hostname)

shutil.copy(hostname, name)
- print "[INFO] exported %s" % name
+ print("[INFO] exported %s" % name)
elif os.path.isdir(hostname):
# If hostname is a dir, it is only a request to create the folder on guest. Nothing to copy.
if not os.path.exists(name):
os.makedirs(name)
- print "[INFO] created dir %s" % name
+ print("[INFO] created dir %s" % name)

else:
# Inform the user that the rule cannot be applied. For example, this happens for links in OSv.
- print "[ERR] unable to export %s" % hostname
+ print("[ERR] unable to export %s" % hostname)

for link_source, name in host_symlinks:
target_dir = os.path.dirname(name)
if not os.path.exists(target_dir):
os.makedirs(target_dir)
os.symlink(link_source, name)
- print "[INFO] added link %s -> %s" % (name, link_source)
+ print("[INFO] added link %s -> %s" % (name, link_source))

def main():
make_option = optparse.make_option
diff --git a/scripts/firecracker.py b/scripts/firecracker.py
index 3f21081d..7d57dffe 100755
--- a/scripts/firecracker.py
+++ b/scripts/firecracker.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
#
# pip install requests-unixsocket
import sys
@@ -152,14 +152,14 @@ def print_time(msg):
def setup_tap_interface(mode, tap_interface_name, tap_ip=None, physical_nic=None, bridge_name=None):
# Setup tun tap interface if does not exist
# sudo ip link delete fc_tap0 - this deletes the tap device
- tuntap_interfaces = subprocess.check_output(['ip', 'tuntap'])
+ tuntap_interfaces = subprocess.check_output(['ip', 'tuntap']).decode('utf-8')
if tuntap_interfaces.find(tap_interface_name) < 0:
print("The tap interface %s not found -> needs to set it up!" % tap_interface_name)
dirname = os.path.dirname(os.path.abspath(__file__))
setup_networking_script = os.path.join(dirname, 'setup_fc_networking.sh')
# Check if the bridge exists if user specified it
if mode == 'bridged' and bridge_name:
- bridges = subprocess.check_output(['brctl', 'show'])
+ bridges = subprocess.check_output(['brctl', 'show']).decode('utf-8')
if bridges.find(bridge_name) < 0:
print("The bridge %s does not exist per brctl. Please create one!" % bridge_name)
exit(-1)
@@ -182,7 +182,7 @@ def find_firecracker(dirname):
if not os.path.exists(firecracker_path):
url_base = 'https://github.com/firecracker-microvm/firecracker/releases/download'
download_url = '%s/%s/firecracker-%s' % (url_base, firecracker_version, firecracker_version)
- answer = raw_input("Firecracker executable has not been found under %s. "
+ answer = input("Firecracker executable has not been found under %s. "
"Would you like to download it from %s and place it under %s? [y|n]" %
(firecracker_path, download_url, firecracker_path))
if answer.capitalize() != 'Y':
@@ -233,7 +233,7 @@ def start_firecracker(firecracker_path, socket_path):
def start_firecracker_with_no_api(firecracker_path, firecracker_config_json):
# Start firecracker process and pass configuration JSON as a file
api_file = tempfile.NamedTemporaryFile(delete=False)
- api_file.write(firecracker_config_json)
+ api_file.write(bytes(firecracker_config_json, 'utf-8'))
stty_save()
return subprocess.Popen([firecracker_path, "--no-api", "--config-file", api_file.name],
stdout=sys.stdout, stderr=subprocess.STDOUT), api_file.name
diff --git a/scripts/gen-rofs-img.py b/scripts/gen-rofs-img.py
index e825c639..2614a45a 100755
--- a/scripts/gen-rofs-img.py
+++ b/scripts/gen-rofs-img.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python3

#
# Copyright (c) 2015 Carnegie Mellon University.
@@ -93,9 +93,7 @@ class DirectoryEntry(object):
pos = fp.tell()
fp.write(c_ulonglong(self.inode_no))
fp.write(c_ushort(len(self.filename)))
- for c in self.filename:
- fp.write(c_char(c))
-
+ fp.write(bytes(self.filename,'utf-8'))
return fp.tell() - pos

class SymbolicLink(object):
@@ -105,9 +103,7 @@ class SymbolicLink(object):
def write(self,fp):
pos = fp.tell()
fp.write(c_ushort(len(self.path)))
- for c in self.path:
- fp.write(c_char(c))
-
+ fp.write(bytes(self.path,'utf-8'))
return fp.tell() - pos

directory_entries = []
@@ -151,7 +147,7 @@ def next_inode():
return inode

def pad(fp, size):
- fp.write('\0' * size)
+ fp.write(b'\0' * size)
return size

def write_initial_superblock(fp):
@@ -218,13 +214,13 @@ def write_dir(fp, manifest, dirpath, parent_dir):
inode.data_offset = symlinks_count
inode.count = 1
next_symlink(val[2:],manifest)
- print 'Link %s to %s' % (dirpath + '/' + entry, val[2:])
+ print('Link %s to %s' % (dirpath + '/' + entry, val[2:]))
else: #file
inode.mode = REG_MODE
global block
inode.data_offset = block
inode.count = write_file(fp, val)
- print 'Adding %s' % (dirpath + '/' + entry)
+ print('Adding %s' % (dirpath + '/' + entry))

# This needs to be added so that later we can walk the tree
# when fining symlinks
@@ -264,7 +260,7 @@ def write_fs(fp, manifest):
return (block_no, bytes_written)

def gen_image(out, manifest):
- print 'Writing image'
+ print('Writing image')
fp = open(out, 'wb')

# write the initial superblock
@@ -272,7 +268,7 @@ def gen_image(out, manifest):

system_structure_block, bytes_written = write_fs(fp, manifest)
structure_info_last_block_bytes = bytes_written % OSV_BLOCK_SIZE
- structure_info_blocks_count = bytes_written / OSV_BLOCK_SIZE + (1 if structure_info_last_block_bytes > 0 else 0)
+ structure_info_blocks_count = bytes_written // OSV_BLOCK_SIZE + (1 if structure_info_last_block_bytes > 0 else 0)

pad(fp,OSV_BLOCK_SIZE - structure_info_last_block_bytes)

@@ -290,10 +286,10 @@ def gen_image(out, manifest):
sb.symlinks_count = len(symlinks)
sb.inodes_count = len(inodes)

- print 'First block: %d, blocks count: %d' % (sb.structure_info_first_block, sb.structure_info_blocks_count)
- print 'Directory entries count %d' % sb.directory_entries_count
- print 'Symlinks count %d' % sb.symlinks_count
- print 'Inodes count %d' % sb.inodes_count
+ print('First block: %d, blocks count: %d' % (sb.structure_info_first_block, sb.structure_info_blocks_count))
+ print('Directory entries count %d' % sb.directory_entries_count)
+ print('Symlinks count %d' % sb.symlinks_count)
+ print('Inodes count %d' % sb.inodes_count)

fp.seek(0)
fp.write(sb)
diff --git a/scripts/imgedit.py b/scripts/imgedit.py
index 3fa08e75..d314d8cf 100755
--- a/scripts/imgedit.py
+++ b/scripts/imgedit.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python3

import sys, struct
import subprocess
@@ -31,14 +31,15 @@ def chs(x):
return c, h, s

def read_chars_up_to_null(file):
- while True:
+ keep_reading = True
+ while keep_reading:
try:
- c = file.read(1)
+ c = file.read(1).decode()
if c == '\0':
- raise StopIteration
+ keep_reading = False
yield c
except ValueError:
- raise StopIteration
+ keep_reading = False

def read_cstr(file):
return ''.join(read_chars_up_to_null(file))
diff --git a/scripts/libosv.py b/scripts/libosv.py
index 5e5aa8cc..e10ef21c 100755
--- a/scripts/libosv.py
+++ b/scripts/libosv.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python3
import sys
import re

@@ -30,7 +30,7 @@ for line in f.readlines():
try:
value, tp, bnd, sym = elf.match(line).groups()
out.write("%s = 0x%s;\n"%(sym, value))
- print (".global %s\n.type %s,@%s"%(sym, sym, asm_type[tp]))
+ print(".global %s\n.type %s,@%s"%(sym, sym, asm_type[tp]))
except AttributeError:
pass

diff --git a/scripts/loader.py b/scripts/loader.py
index 500d864a..b7ddc88b 100644
--- a/scripts/loader.py
+++ b/scripts/loader.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python2
+#!/usr/bin/python3

import gdb
import re
@@ -37,8 +37,8 @@ def phys_cast(addr, type):

def values(_dict):
if hasattr(_dict, 'viewvalues'):
- return _dict.viewvalues()
- return _dict.values()
+ return _dict.values()
+ return list(_dict.values())

def read_vector(v):
impl = v['_M_impl']
@@ -426,19 +426,19 @@ class osv_zfs(gdb.Command):

print ("\n:: ARC SIZES ::")
print ("\tCurrent size: %d (%d MB)" %
- (arc_size, arc_size / 1024 / 1024))
+ (arc_size, arc_size // 1024 // 1024))
print ("\tTarget size: %d (%d MB)" %
- (arc_target_size, arc_target_size / 1024 / 1024))
+ (arc_target_size, arc_target_size // 1024 // 1024))
print ("\tMin target size: %d (%d MB)" %
- (arc_min_size, arc_min_size / 1024 / 1024))
+ (arc_min_size, arc_min_size // 1024 // 1024))
print ("\tMax target size: %d (%d MB)" %
- (arc_max_size, arc_max_size / 1024 / 1024))
+ (arc_max_size, arc_max_size // 1024 // 1024))

print ("\n:: ARC SIZE BREAKDOWN ::")
print ("\tMost recently used cache size: %d (%d MB) (%.2f%%)" %
- (arc_mru_size, arc_mru_size / 1024 / 1024, arc_mru_perc))
+ (arc_mru_size, arc_mru_size // 1024 // 1024, arc_mru_perc))
print ("\tMost frequently used cache size: %d (%d MB) (%.2f%%)" %
- (arc_mfu_size, arc_mfu_size / 1024 / 1024, arc_mfu_perc))
+ (arc_mfu_size, arc_mfu_size // 1024 // 1024, arc_mfu_perc))

# Cache efficiency
arc_hits = get_stat_by_name(arc_stats_struct, arc_stats_cast, 'arcstat_hits')
@@ -618,7 +618,7 @@ class osv_mmap(gdb.Command):
end = ulong(vma['_range']['_end'])
flags = flagstr(ulong(vma['_flags']))
perm = permstr(ulong(vma['_perm']))
- size = '{:<16}'.format('[%s kB]' % (ulong(end - start)/1024))
+ size = '{:<16}'.format('[%s kB]' % (ulong(end - start)//1024))

if 'F' in flags:
file_vma = vma.cast(gdb.lookup_type('mmu::file_vma').pointer())
@@ -648,7 +648,7 @@ class osv_vma_find(gdb.Command):
if start <= addr and end > addr:
flags = flagstr(ulong(vma['_flags']))
perm = permstr(ulong(vma['_perm']))
- size = '{:<16}'.format('[%s kB]' % (ulong(end - start)/1024))
+ size = '{:<16}'.format('[%s kB]' % (ulong(end - start)//1024))
print('0x%016x -> vma 0x%016x' % (addr, vma_addr))
print('0x%016x 0x%016x %s flags=%s perm=%s' % (start, end, size, flags, perm))
break
@@ -671,7 +671,7 @@ def ulong(x):
def to_int(gdb_value):
if hasattr(globals()['__builtins__'], 'long'):
# For GDB with python2
- return long(gdb_value)
+ return int(gdb_value)
return int(gdb_value)

class osv_syms(gdb.Command):
@@ -751,7 +751,7 @@ def get_base_class_offset(gdb_type, base_class_name):
name_pattern = re.escape(base_class_name) + "(<.*>)?$"
for field in gdb_type.fields():
if field.is_base_class and re.match(name_pattern, field.name):
- return field.bitpos / 8
+ return field.bitpos // 8

def derived_from(type, base_class):
return len([x for x in type.fields()
@@ -808,11 +808,8 @@ class intrusive_list:
yield node_ptr.cast(self.node_type.pointer()).dereference()
hook = hook['next_']

- def __nonzero__(self):
- return self.root['next_'] != self.root.address
-
def __bool__(self):
- return self.__nonzero__()
+ return self.root['next_'] != self.root.address

class vmstate(object):
def __init__(self):
@@ -832,7 +829,7 @@ class vmstate(object):
self.cpu_list = cpu_list

def load_thread_list(self):
- threads = map(gdb.Value.dereference, unordered_map(gdb.lookup_global_symbol('sched::thread_map').value()))
+ threads = list(map(gdb.Value.dereference, unordered_map(gdb.lookup_global_symbol('sched::thread_map').value())))
self.thread_list = sorted(threads, key=lambda x: int(x["_id"]))

def cpu_from_thread(self, thread):
@@ -896,7 +893,7 @@ def show_thread_timers(t):
gdb.write(' timers:')
for timer in timer_list:
expired = '*' if timer['_state'] == timer_state_expired else ''
- expiration = int(timer['_time']['__d']['__r']) / 1.0e9
+ expiration = int(timer['_time']['__d']['__r']) // 1.0e9
gdb.write(' %11.9f%s' % (expiration, expired))
gdb.write('\n')

@@ -911,7 +908,7 @@ class ResolvedFrame:
self.frame = frame
self.file_name = file_name
self.line = line
- self.func_name = func_name
+ self.__name__ = func_name

def traverse_resolved_frames(frame):
while frame:
@@ -989,14 +986,14 @@ class osv_info_threads(gdb.Command):
function_whitelist = [sched_thread_join]

def is_interesting(resolved_frame):
- is_whitelisted = resolved_frame.func_name in function_whitelist
+ is_whitelisted = resolved_frame.__name__ in function_whitelist
is_blacklisted = os.path.basename(resolved_frame.file_name) in file_blacklist
return is_whitelisted or not is_blacklisted

fr = find_or_give_last(is_interesting, traverse_resolved_frames(newest_frame))

if fr:
- location = '%s at %s:%s' % (fr.func_name, strip_dotdot(fr.file_name), fr.line)
+ location = '%s at %s:%s' % (fr.__name__, strip_dotdot(fr.file_name), fr.line)
else:
location = '??'

@@ -1009,7 +1006,7 @@ class osv_info_threads(gdb.Command):
)
)

- if fr and fr.func_name == sched_thread_join:
+ if fr and fr.__name__ == sched_thread_join:
gdb.write("\tjoining on %s\n" % fr.frame.read_var("this"))

show_thread_timers(t)
@@ -1176,6 +1173,7 @@ def all_traces():
max_trace = ulong(trace_buffer['_size'])

if not trace_log_base:
+ print('!!! Could not find any trace data! Make sure "--trace" option matches some tracepoints.')
raise StopIteration

trace_log = inf.read_memory(trace_log_base, max_trace)
@@ -1214,7 +1212,7 @@ def all_traces():
unpacker.align_up(8)
yield Trace(tp, Thread(thread, thread_name), time, cpu, data, backtrace=backtrace)

- iters = map(lambda cpu: one_cpu_trace(cpu), values(state.cpu_list))
+ iters = [one_cpu_trace(cpu) for cpu in values(state.cpu_list)]
return heapq.merge(*iters)

def save_traces_to_file(filename):
@@ -1281,7 +1279,7 @@ def show_leak():
gdb.flush()
allocs = []
for i in range(size_allocations):
- newpercent = '%2d%%' % round(100.0*i/(size_allocations-1))
+ newpercent = '%2d%%' % round(100.0*i//(size_allocations-1))
if newpercent != percent:
percent = newpercent
gdb.write('\b\b\b%s' % newpercent)
@@ -1343,10 +1341,10 @@ def show_leak():
allocations=cur_n,
minsize=cur_min_size,
maxsize=cur_max_size,
- avgsize=cur_total_size/cur_n,
+ avgsize=cur_total_size//cur_n,
minbirth=cur_first_seq,
maxbirth=cur_last_seq,
- avgbirth=cur_total_seq/cur_n,
+ avgbirth=cur_total_seq//cur_n,
callchain=callchain)
records.append(r)
cur_n = 0
@@ -1538,7 +1536,7 @@ class osv_percpu(gdb.Command):
gdb.write('%s\n'%e)
return
percpu_addr = percpu.address
- for cpu in vmstate().cpu_list.values():
+ for cpu in list(vmstate().cpu_list.values()):
gdb.write("CPU %d:\n" % cpu.id)
base = cpu.obj['percpu_base']
addr = base+to_int(percpu_addr)
diff --git a/scripts/manifest_common.py b/scripts/manifest_common.py
index 58b0794b..0eeb6d42 100644
--- a/scripts/manifest_common.py
+++ b/scripts/manifest_common.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python3

import os, io, re, subprocess

diff --git a/scripts/metadata.py b/scripts/metadata.py
index 761a927f..37a53323 100644
--- a/scripts/metadata.py
+++ b/scripts/metadata.py
@@ -1,7 +1,7 @@
-import SimpleHTTPServer as http
-import SocketServer
import subprocess
import os
+import http.server as http
+import socketserver

METADATA_IP = '169.254.169.254'
port = 80
@@ -32,7 +32,7 @@ def start_server(path):
try:
os.chdir(path)
handler = http.SimpleHTTPRequestHandler
- server = SocketServer.TCPServer(("", port), handler, False)
+ server = socketserver.TCPServer(("", port), handler, False)
server.allow_reuse_address = True
server.server_bind()
server.server_activate()
diff --git a/scripts/mkbootfs.py b/scripts/mkbootfs.py
index 134fb7b0..fc57bf31 100755
--- a/scripts/mkbootfs.py
+++ b/scripts/mkbootfs.py
@@ -1,10 +1,7 @@
-#!/usr/bin/python
+#!/usr/bin/python3

import os, struct, optparse, io
-try:
- import configparser
-except ImportError:
- import ConfigParser as configparser
+import configparser
from manifest_common import add_var, expand, unsymlink, read_manifest, defines, strip_file

def main():
@@ -80,7 +77,7 @@ def main():
if hostname.startswith("->"):
link = hostname[2:]
out.write(link.encode())
- out.write('\0')
+ out.write(b'\0')
else:
out.write(open(hostname, 'rb').read())

diff --git a/scripts/module.py b/scripts/module.py
index 548dd0c7..c253968c 100755
--- a/scripts/module.py
+++ b/scripts/module.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python3

import re
import os
@@ -226,7 +226,7 @@ def build(args):
else:
print(prefix)

- for module, run_config_name in modules_to_run.items():
+ for module, run_config_name in list(modules_to_run.items()):
run_config = resolve.get_run_config(module, run_config_name)
if run_config:
run_list.append(run_config)
diff --git a/scripts/nbd_client.py b/scripts/nbd_client.py
index 02c99d07..89a1158c 100644
--- a/scripts/nbd_client.py
+++ b/scripts/nbd_client.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# Copyright (C) 2013 Nodalink, SARL.
diff --git a/scripts/osv/client.py b/scripts/osv/client.py
index 0126fe39..435ff621 100644
--- a/scripts/osv/client.py
+++ b/scripts/osv/client.py
@@ -2,7 +2,7 @@ import os
try:
from urllib.parse import urlparse
except ImportError:
- from urlparse import urlparse
+ from urllib.parse import urlparse

_osv_base = '.'
_default_cert_base = os.path.join(_osv_base, 'modules', 'certs', 'build')
diff --git a/scripts/osv/debug.py b/scripts/osv/debug.py
index fe42be60..83372ada 100644
--- a/scripts/osv/debug.py
+++ b/scripts/osv/debug.py
@@ -38,7 +38,7 @@ class SymbolResolver(object):
if show_inline:
flags += 'i'
self.addr2line = subprocess.Popen(['addr2line', '-e', object_path, flags],
- stdin=subprocess.PIPE, stdout=subprocess.PIPE)
+ stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
self.cache = {}

def next_line(self):
@@ -82,6 +82,7 @@ class SymbolResolver(object):
if self.show_inline:
self.addr2line.stdin.write('0\n')

+ self.addr2line.stdin.flush()
result = self.parse_line(addr, self.next_line())

if self.show_inline:
diff --git a/scripts/osv/modules/filemap.py b/scripts/osv/modules/filemap.py
index 8121ba74..fc55673f 100644
--- a/scripts/osv/modules/filemap.py
+++ b/scripts/osv/modules/filemap.py
@@ -19,15 +19,26 @@ def _pattern_to_regex(path):
% bad_token.group(1))

path = '^' + re.escape(path) + '$'
+ # Python 3.7 and above does NOT escape '/' so we need to detect it and handle accordingly
+ slash_escaped = '\\/' in path

# Normalize path
- path = re.sub(r'(\\/)+', '\/', path)
+ if slash_escaped:
+ path = re.sub(r'(\\/)+', '\/', path)
+ else:
+ path = re.sub(r'(/)+', '/', path)

# Merge consecutive **/ components
- path = _reduce_path(path, r'\\\*\\\*\\/\\\*\\\*', '\*\*')
+ if slash_escaped:
+ path = _reduce_path(path, r'\\\*\\\*\\/\\\*\\\*', '\*\*')
+ else:
+ path = _reduce_path(path, r'\\\*\\\*/\\\*\\\*', '\*\*')

# Transform ** component
- path = re.sub(r'(\^|\\/)\\\*\\\*(\\/|\$)', '((^|\/).*($|\/|^))', path)
+ if slash_escaped:
+ path = re.sub(r'(\^|\\/)\\\*\\\*(\\/|\$)', '((^|\/).*($|\/|^))', path)
+ else:
+ path = re.sub(r'(\^|/)\\\*\\\*(/|\$)', '((^|/).*($|\|^))', path)

path = path.replace('\\*', '[^/]*')
path = path.replace('\\?', '.')
diff --git a/scripts/osv/prof.py b/scripts/osv/prof.py
index 95db15b8..465a223f 100644
--- a/scripts/osv/prof.py
+++ b/scripts/osv/prof.py
@@ -51,7 +51,7 @@ time_units = [
]

def parse_time_as_nanos(text, default_unit='ns'):
- for level, name in sorted(time_units, key=lambda (level, name): -len(name)):
+ for level, name in sorted(time_units, key=lambda level_name: -len(level_name[1])):
if text.endswith(name):
return float(text.rstrip(name)) * level
for level, name in time_units:
@@ -60,7 +60,7 @@ def parse_time_as_nanos(text, default_unit='ns'):
raise Exception('Unknown unit: ' + default_unit)

def format_time(time, format="%.2f %s"):
- for level, name in sorted(time_units, key=lambda (level, name): -level):
+ for level, name in sorted(time_units, key=lambda level_name1: -level_name1[0]):
if time >= level:
return format % (float(time) / level, name)
return str(time)
@@ -207,10 +207,16 @@ class timed_trace_producer(object):
self.last_time = None

def __call__(self, sample):
+ if not sample.time:
+ return
+
if not sample.cpu in self.earliest_trace_per_cpu:
self.earliest_trace_per_cpu[sample.cpu] = sample

- self.last_time = max(self.last_time, sample.time)
+ if not self.last_time:
+ self.last_time = sample.time
+ else:
+ self.last_time = max(self.last_time, sample.time)

matcher = self.matcher_by_name.get(sample.name, None)
if not matcher:
@@ -239,7 +245,7 @@ class timed_trace_producer(object):
return trace.TimedTrace(entry_trace, duration)

def finish(self):
- for sample in self.open_samples.itervalues():
+ for sample in self.open_samples.values():
duration = self.last_time - sample.time
yield trace.TimedTrace(sample, duration)

@@ -275,7 +281,7 @@ def get_idle_profile(traces):

def trim_samples(cpu, end_time):
if cpu.idle:
- for w in cpu.waits.values():
+ for w in list(cpu.waits.values()):
begin = max(w.time, cpu.idle.time)
yield ProfSample(begin, w.cpu, w.thread, w.backtrace, resident_time=end_time - begin)

@@ -295,7 +301,7 @@ def get_idle_profile(traces):

last = t

- for cpu in cpus.values():
+ for cpu in list(cpus.values()):
for s in trim_samples(cpu, t.time):
yield s

@@ -402,7 +408,7 @@ def print_profile(samples, symbol_resolver, caller_oriented=False,
if not order:
order = lambda node: (-node.resident_time, -node.hit_count)

- for group, tree_root in sorted(groups.iteritems(), key=lambda (thread, node): order(node)):
+ for group, tree_root in sorted(iter(groups.items()), key=lambda thread_node: order(thread_node[1])):
collapse_similar(tree_root)

if max_levels:
diff --git a/scripts/osv/trace.py b/scripts/osv/trace.py
index 2c28582b..636ea0a4 100644
--- a/scripts/osv/trace.py
+++ b/scripts/osv/trace.py
@@ -65,7 +65,12 @@ class TimeRange(object):
return self.end - self.begin

def intersection(self, other):
- begin = max(self.begin, other.begin)
+ if not self.begin:
+ begin = other.begin
+ elif not other.begin:
+ begin = self.begin
+ else:
+ begin = max(self.begin, other.begin)

if self.end is None:
end = other.end
@@ -143,11 +148,11 @@ class Trace:
class TimedTrace:
def __init__(self, trace, duration=None):
self.trace = trace
- self.duration = duration
+ self.duration_ = duration

@property
def duration(self):
- return self.duration
+ return self.duration_

@property
def time(self):
@@ -183,6 +188,8 @@ def do_split_format(format_str):

_split_cache = {}
def split_format(format_str):
+ if not format_str:
+ return []
result = _split_cache.get(format_str, None)
if not result:
result = list(do_split_format(format_str))
@@ -190,7 +197,7 @@ def split_format(format_str):
return result

formatters = {
- '*': lambda bytes: '{' + ' '.join('%02x' % ord(b) for b in bytes) + '}'
+ '*': lambda bytes: '{' + ' '.join('%02x' % b for b in bytes) + '}'
}

def get_alignment_of(fmt):
@@ -238,16 +245,15 @@ class SlidingUnpacker:
size = struct.calcsize(fmt)
val, = struct.unpack_from(fmt, self.buffer[self.offset:self.offset+size])
self.offset += size
- values.append(val)
+ if fmt.startswith('50p'):
+ values.append(val.decode('utf-8'))
+ else:
+ values.append(val)

return tuple(values)

- def __nonzero__(self):
- return self.offset < len(self.buffer)
-
- # Python3
def __bool__(self):
- return self.__nonzero__()
+ return self.offset < len(self.buffer)

class WritingPacker:
def __init__(self, writer):
@@ -270,7 +276,10 @@ class WritingPacker:
if fmt == '*':
self.pack_blob(arg)
else:
- self.writer(struct.pack(fmt, arg))
+ if fmt == '50p':
+ self.writer(struct.pack(fmt, arg.encode('utf-8')))
+ else:
+ self.writer(struct.pack(fmt, arg))
self.offset += struct.calcsize(fmt)

def pack_blob(self, arg):
@@ -298,7 +307,7 @@ class TraceDumpReaderBase :
self.endian = '<'
self.file = open(filename, 'rb')
try:
- tag = self.file.read(4)
+ tag = self.file.read(4).decode()
if tag == "OSVT":
endian = '>'
elif tag != "TVSO":
@@ -347,7 +356,7 @@ class TraceDumpReaderBase :

def readString(self):
len = self.read('H')
- return self.file.read(len)
+ return self.file.read(len).decode()

class TraceDumpReader(TraceDumpReaderBase) :
def __init__(self, filename):
@@ -378,7 +387,7 @@ class TraceDumpReader(TraceDumpReaderBase) :
sig = ""
for j in range(0, n_args):
arg_name = self.readString()
- arg_sig = self.file.read(1)
+ arg_sig = self.file.read(1).decode()
if arg_sig == 'p':
arg_sig = '50p'
sig += arg_sig
@@ -405,7 +414,7 @@ class TraceDumpReader(TraceDumpReaderBase) :

backtrace = None
if flags & 1:
- backtrace = filter(None, unpacker.unpack('Q' * self.backtrace_len))
+ backtrace = [_f for _f in unpacker.unpack('Q' * self.backtrace_len) if _f]

data = unpacker.unpack(tp.signature)
unpacker.align_up(8)
@@ -414,7 +423,7 @@ class TraceDumpReader(TraceDumpReaderBase) :
yield last_trace

def traces(self):
- iters = map(lambda data: self.oneTrace(data), self.trace_buffers)
+ iters = [self.oneTrace(data) for data in self.trace_buffers]
return heapq.merge(*iters)


@@ -523,7 +532,7 @@ def read(buffer_view):

while unpacker:
tp_key, thread_ptr, thread_name, time, cpu = unpacker.unpack('QQ16sQI')
- thread_name = thread_name.rstrip('\0')
+ thread_name = thread_name.rstrip(b'\0').decode('utf-8')
tp = tracepoints[tp_key]

backtrace = []
@@ -551,7 +560,7 @@ def write(traces, writer):
trace.time, trace.cpu)

if trace.backtrace:
- for frame in filter(None, trace.backtrace):
+ for frame in [_f for _f in trace.backtrace if _f]:
packer.pack('Q', frame)
packer.pack('Q', 0)

diff --git a/scripts/osv/tree.py b/scripts/osv/tree.py
index 594b00e2..86345157 100644
--- a/scripts/osv/tree.py
+++ b/scripts/osv/tree.py
@@ -18,11 +18,11 @@ class TreeNode(object):

def squash_child(self):
assert self.has_only_one_child()
- self.children_by_key = next(self.children_by_key.itervalues()).children_by_key
+ self.children_by_key = next(iter(self.children_by_key.values())).children_by_key

@property
def children(self):
- return self.children_by_key.itervalues()
+ return iter(self.children_by_key.values())

def has_only_one_child(self):
return len(self.children_by_key) == 1
diff --git a/scripts/run.py b/scripts/run.py
index 9ab8d86f..f4452345 100755
--- a/scripts/run.py
+++ b/scripts/run.py
@@ -1,5 +1,5 @@
-#!/usr/bin/env python
-from __future__ import print_function
+#!/usr/bin/env python3
+
import subprocess
import sys
import argparse
diff --git a/scripts/setup.py b/scripts/setup.py
index 2a7f1c18..a58132ab 100755
--- a/scripts/setup.py
+++ b/scripts/setup.py
@@ -1,8 +1,8 @@
-#!/usr/bin/python2
+#!/usr/bin/python3

# set up a development environment for OSv. Run as root.

-import sys, platform, argparse
+import sys, distro, argparse
import subprocess

standard_ec2_packages = ['python-pip', 'wget']
@@ -319,11 +319,11 @@ parser.add_argument("-t", "--test", action="store_true",
help="install packages required by testing tools")
cmdargs = parser.parse_args()

-(name, version, id) = platform.linux_distribution()
+(name, version, id) = distro.linux_distribution()

for distro in distros:
if type(distro.name) == type([]):
- dname = filter(lambda n: name.startswith(n), distro.name)
+ dname = [n for n in distro.name if name.startswith(n)]
if len(dname):
distro.name = dname[0]
else:
@@ -349,5 +349,5 @@ for distro in distros:
print ('Your distribution %s version %s is not supported by this script' % (name, version))
sys.exit(1)

-print 'Your distribution is not supported by this script.'
+print('Your distribution is not supported by this script.')
sys.exit(2)
diff --git a/scripts/test.py b/scripts/test.py
index 02eb4b55..0f9c07c1 100755
--- a/scripts/test.py
+++ b/scripts/test.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
import atexit
import subprocess
import argparse
@@ -81,7 +81,7 @@ def is_not_skipped(test):
return test.name not in blacklist

def run_tests_in_single_instance():
- run(filter(lambda test: not isinstance(test, TestRunnerTest), tests))
+ run([test for test in tests if not isinstance(test, TestRunnerTest)])

blacklist_tests = ' '.join(blacklist)
args = run_py_args + ["-s", "-e", "/testrunner.so -b %s" % (blacklist_tests)]
@@ -103,7 +103,7 @@ def pluralize(word, count):

def make_export_and_conf():
export_dir = tempfile.mkdtemp(prefix='share')
- os.chmod(export_dir, 0777)
+ os.chmod(export_dir, 0o777)
(conf_fd, conf_path) = tempfile.mkstemp(prefix='export')
conf = os.fdopen(conf_fd, "w")
conf.write("%s 127.0.0.1(insecure,rw)\n" % export_dir)
@@ -155,12 +155,12 @@ def run_tests():
"/tst-nfs.so --server 192.168.122.1 --share %s" %
export_dir) ]

- line = proc.stdout.readline()
+ line = proc.stdout.readline().decode()
while line:
print(line)
if "/tmp" in line:
break
- line = proc.stdout.readline()
+ line = proc.stdout.readline().decode()


run(tests_to_run)
diff --git a/scripts/tests/test_app.py b/scripts/tests/test_app.py
index 27112ff7..1738e755 100755
--- a/scripts/tests/test_app.py
+++ b/scripts/tests/test_app.py
@@ -1,7 +1,6 @@
-#!/usr/bin/python
+#!/usr/bin/python3
from testing import *
import argparse
-import subprocess
from time import sleep

def run(command, hypervisor_name, image_path=None, line=None, guest_port=None, host_port=None, input_lines=[], kill_app=False):
diff --git a/scripts/tests/test_app_with_test_script.py b/scripts/tests/test_app_with_test_script.py
index 2ab1b731..8a2c7295 100755
--- a/scripts/tests/test_app_with_test_script.py
+++ b/scripts/tests/test_app_with_test_script.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python3
from testing import *
import argparse
import runpy
diff --git a/scripts/tests/test_http_app_with_curl_and_ab.py b/scripts/tests/test_http_app_with_curl_and_ab.py
index 067fcc83..9d613d01 100755
--- a/scripts/tests/test_http_app_with_curl_and_ab.py
+++ b/scripts/tests/test_http_app_with_curl_and_ab.py
@@ -1,11 +1,11 @@
-#!/usr/bin/python
+#!/usr/bin/python3
from testing import *
import argparse
import subprocess
from time import sleep

def check_with_curl(url, expected_http_line):
- output = subprocess.check_output(["curl", "-s", url])
+ output = subprocess.check_output(["curl", "-s", url]).decode('utf-8')
print(output)
if expected_http_line not in output:
print("FAILED curl: wrong output")
@@ -39,9 +39,9 @@ def run(command, hypervisor_name, host_port, guest_port, http_path, expected_htt
check_with_curl(app_url, expected_http_line)

if no_keep_alive:
- output = subprocess.check_output(["ab", "-l", "-c", str(concurrency), "-n", str(count), app_url]).split('\n')
+ output = subprocess.check_output(["ab", "-l", "-c", str(concurrency), "-n", str(count), app_url]).decode('utf-8').split('\n')
else:
- output = subprocess.check_output(["ab", "-l", "-k", "-c", str(concurrency), "-n", str(count), app_url]).split('\n')
+ output = subprocess.check_output(["ab", "-l", "-k", "-c", str(concurrency), "-n", str(count), app_url]).decode('utf-8').split('\n')

failed_requests = 1
complete_requests = 0
@@ -74,11 +74,11 @@ def run(command, hypervisor_name, host_port, guest_port, http_path, expected_htt
success = False

if failed_requests > 0:
- print("FAILED ab - encountered failed requests: %d" % failed_requests)
+ print("FAILED ab - encountered failed requests: %d" % failed_requests)
success = False

if complete_requests < count:
- print("FAILED ab - too few complete requests : %d ? %d" % (complete_requests, count))
+ print("FAILED ab - too few complete requests : %d ? %d" % (complete_requests, count))
success = False

if success:
diff --git a/scripts/tests/testing.py b/scripts/tests/testing.py
index c5249753..c3aaf218 100644
--- a/scripts/tests/testing.py
+++ b/scripts/tests/testing.py
@@ -133,7 +133,7 @@ class SupervisedProcess:
self.cv.release()

line = ''
- ch_bytes = ''
+ ch_bytes = bytes()
while True:
ch_bytes = ch_bytes + self.process.stdout.read(1)
try:
@@ -144,7 +144,7 @@ class SupervisedProcess:
if ch == '\n':
append_line(line)
line = ''
- ch_bytes = ''
+ ch_bytes = bytes()
except UnicodeError:
continue

diff --git a/scripts/trace.py b/scripts/trace.py
index 34cfb2ab..71dc47d5 100755
--- a/scripts/trace.py
+++ b/scripts/trace.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python2
+#!/usr/bin/env python3
import sys
import errno
import argparse
@@ -13,6 +13,7 @@ from collections import defaultdict
from osv import trace, debug, prof
from osv.client import Client
import memory_analyzer
+from functools import reduce

class InvalidArgumentsException(Exception):
def __init__(self, message):
@@ -114,7 +115,7 @@ def list_trace(args):
with get_trace_reader(args) as reader:
for t in reader.get_traces():
if t.time in time_range:
- print t.format(backtrace_formatter, data_formatter=data_formatter)
+ print(t.format(backtrace_formatter, data_formatter=data_formatter))

def mem_analys(args):
mallocs = {}
@@ -164,7 +165,7 @@ def add_profile_options(parser):
add_time_slicing_options(parser)
group = parser.add_argument_group('profile options')
group.add_argument("-r", "--caller-oriented", action='store_true', help="change orientation to caller-based; reverses order of frames")
- group.add_argument("-g", "--group-by", choices=groupers.keys(), default='none', help="group samples by given criteria")
+ group.add_argument("-g", "--group-by", choices=list(groupers.keys()), default='none', help="group samples by given criteria")
group.add_argument("--function", action='store', help="use given function as tree root")
group.add_argument("--min-duration", action='store', help="show only nodes with resident time not shorter than this, eg: 200ms")
add_backtrace_options(group)
@@ -236,7 +237,7 @@ def show_profile(args, sample_producer):

def node_filter(*args):
for filter in node_filters:
- if not filter(*args):
+ if not list(filter(*args)):
return False
return True

@@ -276,7 +277,7 @@ def extract(args):
stderr=subprocess.STDOUT)
_stdout, _ = proc.communicate()
if proc.returncode or not os.path.exists(args.tracefile):
- print(_stdout)
+ print(_stdout.decode())
sys.exit(1)
else:
print("error: %s not found" % (elf_path))
@@ -332,8 +333,10 @@ def write_sample_to_pcap(sample, pcap_writer):
}

pkt = dpkt.ethernet.Ethernet()
- pkt.data = sample.data[1]
pkt.type = eth_types[proto]
+ pkt.src = b''
+ pkt.dst = b''
+ pkt.data = sample.data[1]
pcap_writer.writepkt(pkt, ts=ts)

def format_packet_sample(sample):
@@ -343,7 +346,7 @@ def format_packet_sample(sample):
pcap = dpkt.pcap.Writer(proc.stdin)
write_sample_to_pcap(sample, pcap)
pcap.close()
- assert(proc.stdout.readline() == "reading from file -, link-type EN10MB (Ethernet)\n")
+ assert(proc.stdout.readline().decode() == "reading from file -, link-type EN10MB (Ethernet)\n")
packet_line = proc.stdout.readline().rstrip()
proc.wait()
return packet_line
@@ -361,7 +364,7 @@ def pcap_dump(args, target=None):
needs_dpkt()

if not target:
- target = sys.stdout
+ target = sys.stdout.buffer

pcap_file = dpkt.pcap.Writer(target)
try:
@@ -439,7 +442,10 @@ def print_summary(args, printer=sys.stdout.write):
else:
min_time = min(min_time, t.time)

- max_time = max(max_time, t.time)
+ if not max_time:
+ max_time = t.time
+ else:
+ max_time = max(max_time, t.time)

if args.timed:
timed = timed_producer(t)
@@ -450,42 +456,42 @@ def print_summary(args, printer=sys.stdout.write):
timed_samples.extend((timed_producer.finish()))

if count == 0:
- print "No samples"
+ print("No samples")
return

- print "Collected %d samples spanning %s" % (count, prof.format_time(max_time - min_time))
+ print("Collected %d samples spanning %s" % (count, prof.format_time(max_time - min_time)))

- print "\nTime ranges:\n"
- for cpu, r in sorted(cpu_time_ranges.items(), key=lambda (c, r): r.min):
- print " CPU 0x%02d: %s - %s = %10s" % (cpu,
+ print("\nTime ranges:\n")
+ for cpu, r in sorted(list(cpu_time_ranges.items()), key=lambda c_r: c_r[1].min):
+ print(" CPU 0x%02d: %s - %s = %10s" % (cpu,
trace.format_time(r.min),
trace.format_time(r.max),
- prof.format_time(r.max - r.min))
+ prof.format_time(r.max - r.min)))

- max_name_len = reduce(max, map(lambda tp: len(tp.name), count_per_tp.iterkeys()))
+ max_name_len = reduce(max, [len(tp.name) for tp in iter(count_per_tp.keys())])
format = " %%-%ds %%8s" % (max_name_len)
- print "\nTracepoint statistics:\n"
- print format % ("name", "count")
- print format % ("----", "-----")
+ print("\nTracepoint statistics:\n")
+ print(format % ("name", "count"))
+ print(format % ("----", "-----"))

- for tp, count in sorted(count_per_tp.iteritems(), key=lambda (tp, count): tp.name):
- print format % (tp.name, count)
+ for tp, count in sorted(iter(count_per_tp.items()), key=lambda tp_count: tp_count[0].name):
+ print(format % (tp.name, count))

if args.timed:
format = " %-20s %8s %8s %8s %8s %8s %8s %8s %15s"
- print "\nTimed tracepoints [ms]:\n"
+ print("\nTimed tracepoints [ms]:\n")

- timed_samples = filter(lambda t: t.time_range.intersection(time_range), timed_samples)
+ timed_samples = [t for t in timed_samples if t.time_range.intersection(time_range)]

if not timed_samples:
- print " None"
+ print(" None")
else:
- print format % ("name", "count", "min", "50%", "90%", "99%", "99.9%", "max", "total")
- print format % ("----", "-----", "---", "---", "---", "---", "-----", "---", "-----")
+ print(format % ("name", "count", "min", "50%", "90%", "99%", "99.9%", "max", "total"))
+ print(format % ("----", "-----", "---", "---", "---", "---", "-----", "---", "-----"))

- for name, traces in get_timed_traces_per_function(timed_samples).iteritems():
+ for name, traces in get_timed_traces_per_function(timed_samples).items():
samples = sorted(list((t.time_range.intersection(time_range).length() for t in traces)))
- print format % (
+ print(format % (
name,
len(samples),
format_duration(get_percentile(samples, 0)),
@@ -494,9 +500,9 @@ def print_summary(args, printer=sys.stdout.write):
format_duration(get_percentile(samples, 0.99)),
format_duration(get_percentile(samples, 0.999)),
format_duration(get_percentile(samples, 1)),
- format_duration(sum(samples)))
+ format_duration(sum(samples))))

- print
+ print()

def list_cpu_load(args):
load_per_cpu = {}
@@ -550,7 +556,7 @@ def list_timed(args):

for timed in timed_traces:
t = timed.trace
- print '0x%016x %-15s %2d %20s %7s %-20s %s%s' % (
+ print('0x%016x %-15s %2d %20s %7s %-20s %s%s' % (
t.thread.ptr,
t.thread.name,
t.cpu,
@@ -558,7 +564,7 @@ def list_timed(args):
trace.format_duration(timed.duration),
t.name,
trace.Trace.format_data(t),
- bt_formatter(t.backtrace))
+ bt_formatter(t.backtrace)))

def list_wakeup_latency(args):
bt_formatter = get_backtrace_formatter(args)
@@ -575,9 +581,9 @@ def list_wakeup_latency(args):
return "%4.6f" % (float(nanos) / 1e6)

if not args.no_header:
- print '%-18s %-15s %3s %20s %13s %9s %s' % (
+ print('%-18s %-15s %3s %20s %13s %9s %s' % (
"THREAD", "THREAD-NAME", "CPU", "TIMESTAMP[s]", "WAKEUP[ms]", "WAIT[ms]", "BACKTRACE"
- )
+ ))

with get_trace_reader(args) as reader:
for t in reader.get_traces():
@@ -594,14 +600,14 @@ def list_wakeup_latency(args):
if t.cpu == waiting_thread.wait.cpu:
wakeup_delay = t.time - waiting_thread.wake.time
wait_time = t.time - waiting_thread.wait.time
- print '0x%016x %-15s %3d %20s %13s %9s %s' % (
+ print('0x%016x %-15s %3d %20s %13s %9s %s' % (
t.thread.ptr,
t.thread.name,
t.cpu,
trace.format_time(t.time),
format_wakeup_latency(wakeup_delay),
trace.format_duration(wait_time),
- bt_formatter(t.backtrace))
+ bt_formatter(t.backtrace)))

def add_trace_listing_options(parser):
add_time_slicing_options(parser)
@@ -615,7 +621,7 @@ def convert_dump(args):
if os.path.exists(args.tracefile):
os.remove(args.tracefile)
assert(not os.path.exists(args.tracefile))
- print "Converting dump %s -> %s" % (args.dumpfile, args.tracefile)
+ print("Converting dump %s -> %s" % (args.dumpfile, args.tracefile))
td = trace.TraceDumpReader(args.dumpfile)
trace.write_to_file(args.tracefile, list(td.traces()))
else:
@@ -631,7 +637,7 @@ def download_dump(args):
client = Client(args)
url = client.get_url() + "/trace/buffers"

- print "Downloading %s -> %s" % (url, file)
+ print("Downloading %s -> %s" % (url, file))

r = requests.get(url, stream=True, **client.get_request_kwargs())
size = int(r.headers['content-length'])
@@ -641,7 +647,7 @@ def download_dump(args):
for chunk in r.iter_content(8192):
out_file.write(chunk)
current += len(chunk)
- sys.stdout.write("[{0:8d} / {1:8d} k] {3} {2:.2f}%\r".format(current/1024, size/1024, 100.0*current/size, ('='*32*(current/size)) + '>'))
+ sys.stdout.write("[{0:8d} / {1:8d} k] {3} {2:.2f}%\r".format(current//1024, size//1024, 100.0*current//size, ('='*32*(current//size)) + '>'))
if current >= size:
sys.stdout.write("\n")
sys.stdout.flush()
@@ -789,7 +795,7 @@ if __name__ == "__main__":
args = parser.parse_args()

if getattr(args, 'paginate', False):
- less_process = subprocess.Popen(['less', '-FX'], stdin=subprocess.PIPE)
+ less_process = subprocess.Popen(['less', '-FX'], stdin=subprocess.PIPE, text=True)
sys.stdout = less_process.stdin
else:
less_process = None
@@ -797,7 +803,7 @@ if __name__ == "__main__":
try:
args.func(args)
except InvalidArgumentsException as e:
- print "Invalid arguments:", e.message
+ print("Invalid arguments:", e.message)
except IOError as e:
if e.errno != errno.EPIPE:
raise
diff --git a/scripts/upload_manifest.py b/scripts/upload_manifest.py
index 42033203..f036cbb9 100755
--- a/scripts/upload_manifest.py
+++ b/scripts/upload_manifest.py
@@ -1,17 +1,11 @@
-#!/usr/bin/python
+#!/usr/bin/python3

import optparse, os, subprocess, socket, threading, stat, sys
from manifest_common import add_var, expand, unsymlink, read_manifest, defines, strip_file
from contextlib import closing

-try:
- import StringIO
- # This works on Python 2
- StringIO = StringIO.StringIO
-except ImportError:
- import io
- # This works on Python 3
- StringIO = io.StringIO
+import io
+StringIO = io.StringIO

def find_free_port():
with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as s:
@@ -53,8 +47,7 @@ def upload(osv, manifest, depends, port):
def cpio_field(number, length):
return ("%.*x" % (length, number)).encode()
def cpio_header(filename, mode, filesize):
- if sys.version_info >= (3, 0, 0):
- filename = filename.encode("utf-8")
+ filename = filename.encode("utf-8")
return (b"070701" # magic
+ cpio_field(0, 8) # inode
+ cpio_field(mode, 8) # mode
--
2.20.1

Nadav Har'El

Feb 18, 2020, 5:34:58 AM2/18/20
to Waldemar Kozaczuk, Osv Dev, Matt Bass
Thanks. I commented with a few concerns below. I'm only really worried about loader.py - the gdb script, which was already supposed to be working for both Python 2 and Python 3 (although we probably didn't test it recently), and I'm worried these changes are breaking it, rather than improving it - and it's difficult to test because these changes change all sorts of "osv" commands in gdb which I guess you didn't test individually.

I ask that you please at least try to run the affected scripts in Python3 before "fixing" them at all. In particular, some scripts that had "/usr/bin/python" at the top (and not /usr/bin/python2) already worked correctly for Python 3 (and Python 2) because some people already had python 3 as their default python (although, it is quite likely that we haven't tested this in a while so things broke).

--
Nadav Har'El
n...@scylladb.com


On Mon, Feb 17, 2020 at 8:22 AM Waldemar Kozaczuk <jwkoz...@gmail.com> wrote:
--- a/scripts/loader.py
+++ b/scripts/loader.py

Please make sure to test this file a bit, because it isn't part of the standard build or run.
This is the script loaded automatically when you use gdb OSv.

Also, I was always under the impression that this script already worked on both Python 2 and
Python 3, because we had no idea what gdb runs for us (the "#!" at the top of this script isn't relevant, right?).

@@ -1,4 +1,4 @@
-#!/usr/bin/python2
+#!/usr/bin/python3

 import gdb
 import re
@@ -37,8 +37,8 @@ def phys_cast(addr, type):

 def values(_dict):
     if hasattr(_dict, 'viewvalues'):
-        return _dict.viewvalues()
-    return _dict.values()
+        return _dict.values()

This doesn't look right - you check if there is a viewvalues() function and then call values()?
Maybe the whole "hasattr" check isn't needed on Python 2 any more?

+    return list(_dict.values())

I wonder why you need to convert it to a list - what's wrong with a generator which isn't officially a list? Did this solve a real problem, or some automatic converter suggested it?
Again, this change is wrong, was it automated?
The whole point of this code was to support *both* python 2
and python 3, and now we go breaking the old python 2 code,
but not even removing the if :-)

     return int(gdb_value)

 class osv_syms(gdb.Command):
@@ -751,7 +751,7 @@ def get_base_class_offset(gdb_type, base_class_name):
     name_pattern = re.escape(base_class_name) + "(<.*>)?$"
     for field in gdb_type.fields():
         if field.is_base_class and re.match(name_pattern, field.name):
-            return field.bitpos / 8
+            return field.bitpos // 8

 def derived_from(type, base_class):
     return len([x for x in type.fields()
@@ -808,11 +808,8 @@ class intrusive_list:
             yield node_ptr.cast(self.node_type.pointer()).dereference()
             hook = hook['next_']

-    def __nonzero__(self):
-        return self.root['next_'] != self.root.address
-

Any reason to remove this? (I don't know why it was needed, but why remove?)
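For context, `__nonzero__` is simply the Python 2 name of the truth-value hook that Python 3 renamed to `__bool__`; a script meant to run on both versions can keep a single implementation and alias the old name to it. A minimal illustrative sketch (not the real loader.py intrusive_list, just the dual-version pattern):

```python
class IntrusiveListSketch:
    """Illustrative only - stands in for loader.py's intrusive_list."""
    def __init__(self, empty):
        self._empty = empty

    def __bool__(self):        # truth-value hook used by Python 3
        return not self._empty

    __nonzero__ = __bool__     # Python 2 name; a harmless extra attribute on Python 3

print(bool(IntrusiveListSketch(empty=True)))   # False
print(bool(IntrusiveListSketch(empty=False)))  # True
```

With the alias in place, deleting either name breaks one interpreter but not the other, which is why removing `__nonzero__` alone is suspicious.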

     def __bool__(self):
-        return self.__nonzero__()
+        return self.root['next_'] != self.root.address

 class vmstate(object):
     def __init__(self):
@@ -832,7 +829,7 @@ class vmstate(object):
         self.cpu_list = cpu_list

     def load_thread_list(self):
-        threads = map(gdb.Value.dereference, unordered_map(gdb.lookup_global_symbol('sched::thread_map').value()))
+        threads = list(map(gdb.Value.dereference, unordered_map(gdb.lookup_global_symbol('sched::thread_map').value())))

I don't understand why this change is needed... sorted() should work just fine on an iterable, it doesn't need a list as far as I know.
Did something now work without this change?

         self.thread_list = sorted(threads, key=lambda x: int(x["_id"]))

     def cpu_from_thread(self, thread):
@@ -896,7 +893,7 @@ def show_thread_timers(t):
         gdb.write('  timers:')
         for timer in timer_list:
             expired = '*' if timer['_state'] == timer_state_expired else ''
-            expiration = int(timer['_time']['__d']['__r']) / 1.0e9
+            expiration = int(timer['_time']['__d']['__r']) // 1.0e9
             gdb.write(' %11.9f%s' % (expiration, expired))
         gdb.write('\n')

@@ -911,7 +908,7 @@ class ResolvedFrame:
         self.frame = frame
         self.file_name = file_name
         self.line = line
-        self.func_name = func_name
+        self.__name__ = func_name

Why rename this field?? 
It seems like *all* divisions in this file were converted to integer divisions. Are you sure this was the intent in all of them?
Here in particular, it's pretty clear by the 100.0 that floating point division was intended!
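To make the reviewer's point concrete: in Python 3, `/` is always true (float) division and `//` floors, so mechanically switching operators silently changes results wherever float division was intended. A quick sketch, using the timer-expiration computation from loader.py as the model:

```python
# Python 3: "/" always produces true (float) division; "//" floors.
expiration_ns = 1_500_000_000          # 1.5 seconds expressed in nanoseconds

true_div = expiration_ns / 1.0e9       # 1.5 - the fractional seconds the display wants
floor_div = expiration_ns // 1.0e9     # 1.0 - fractional part silently lost

print(true_div, floor_div)             # 1.5 1.0
```

Only divisions that previously relied on Python 2's integer `/` semantics (int operands, truncated result expected) should become `//`; the ones with float operands were already correct.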

 
diff --git a/scripts/module.py b/scripts/module.py
index 548dd0c7..c253968c 100755
--- a/scripts/module.py
+++ b/scripts/module.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python3

 import re
 import os
@@ -226,7 +226,7 @@ def build(args):
         else:
             print(prefix)

-    for module, run_config_name in modules_to_run.items():
+    for module, run_config_name in list(modules_to_run.items()):

This smells like an automated change - I don't see any reason why this list is needed!

Can you please review all these changes? Better yet, just run the code with python3 and see if anything *really* doesn't work...

 
diff --git a/scripts/osv/prof.py b/scripts/osv/prof.py
index 95db15b8..465a223f 100644
--- a/scripts/osv/prof.py
+++ b/scripts/osv/prof.py
@@ -51,7 +51,7 @@ time_units = [
 ]

 def parse_time_as_nanos(text, default_unit='ns'):
-    for level, name in sorted(time_units, key=lambda (level, name): -len(name)):
+    for level, name in sorted(time_units, key=lambda level_name: -len(level_name[1])):

I don't understand the changes in this file, they don't look like simple syntax changes... Did you review them?


Hmm, doesn't ord(b) work in Python3 any more?

--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/20200217062240.21903-1-jwkozaczuk%40gmail.com.

Waldek Kozaczuk

Feb 18, 2020, 8:42:47 AM2/18/20
to OSv Development
Hey,

Thanks for the review.

Let me first explain what my process was. At first, I just tried to submit Matt's changes with proper signoff. But then I decided to test it a bit and discovered some things were breaking. Now I do not think it was because of any problems in his original script, but mostly due to the fact that his changes were intertwined with other scripts and I had to change those. Also, modules/openjdk8-from-host/module.py (which was not there yet when Matt made his changes) did not work. And finally trace.py somehow came along. So here we go.

As far as mechanics go, I retained most of Matt's patch as is, and I believe he used the "Future" module. I used the 2to3 tool (more specifically 2to3-2.7 in my case) first and then manually fixed any problems I discovered. Most painful to fix were trace.py and all the related files (scripts/osv/*.py), which I tested quite thoroughly by running trace.py (I learned more about it, which is great, but probably way more than I wanted to know at this point ;-)). I also tested all the other scripts more implicitly by running scripts/build with some variations of image= (java is good as it exercises a lot) and fs=.

But you are right that I could not thoroughly test loader.py (did many commands but not all). Shall we exclude it from this patch?

Lastly, I am probably not the right person to do this upgrade exercise. I do not use Python daily, so I am not an expert. The lack of a compiler (aka 'complainer') did not make me very confident, especially with those larger scripts. But I did not want Matt's work to go to waste. So here we go :-)

On Tuesday, February 18, 2020 at 5:34:58 AM UTC-5, Nadav Har'El wrote:
Thanks. I commented with a few concerns below. I'm only really worried about loader.py - the gdb script, which was already supposed to be working for both Python 2 and Python 3 (although we probably didn't test it recently), and I'm worried these changes are breaking it, rather than improving it - and it's difficult to test because these changes change all sorts of "osv" commands in gdb which I guess you didn't test individually.
Shall we leave it out? 

I ask that you please at least try to run the affected scripts in Python3 before "fixing" them at all. In particular, some scripts that had "/usr/bin/python" at the top (and not /usr/bin/python2) at the top already worked correctly for Python 3 (and Python 2) because some people already had python 3 as their default python (although, it is quite likely that we haven't tested this in a while so things broke).
In most cases, I first ran 2to3 and then applied manual changes if I found any problems (like the bytes vs str type of issues - really painful to find and fix). Indeed, 2to3 is somewhat suspicious as it sometimes added extra parentheses where there already were some (I manually removed those changes). Not sure about the list-related changes - those were all done by 2to3.

--
Nadav Har'El
n...@scylladb.com


On Mon, Feb 17, 2020 at 8:22 AM Waldemar Kozaczuk <jwkoz...@gmail.com> wrote:
--- a/scripts/loader.py
+++ b/scripts/loader.py

Please make sure to test this file a bit, because it isn't part of the standard build or run.
This is the script loaded automatically when you use gdb OSv.

Also, I was always under the impression that this script already worked on both Python 2 and
Python 3, because we had no idea what gdb runs for us (the "#!" in the top of this script isn't relevant, right?)?

Shall we change at least this '#!/usr/bin/python2' to '#!/usr/bin/python' then?
 
@@ -1,4 +1,4 @@
-#!/usr/bin/python2
+#!/usr/bin/python3

 import gdb
 import re
@@ -37,8 +37,8 @@ def phys_cast(addr, type):

 def values(_dict):
     if hasattr(_dict, 'viewvalues'):
-        return _dict.viewvalues()
-    return _dict.values()
+        return _dict.values()

This doesn't look right - you check if there is a viewvalues() function and then call values()?
Maybe the whole "hasattr" check isn't needed on Python 2 any more?

+    return list(_dict.values())

I wonder why you need to convert it to a list - what's wrong with a generator which isn't officially a list? Did this solve a real problem, or some automatic converter suggested it?
Changed by 2to3. 
Changed by 2to3. 

     def __bool__(self):
-        return self.__nonzero__()
+        return self.root['next_'] != self.root.address

 class vmstate(object):
     def __init__(self):
@@ -832,7 +829,7 @@ class vmstate(object):
         self.cpu_list = cpu_list

     def load_thread_list(self):
-        threads = map(gdb.Value.dereference, unordered_map(gdb.lookup_global_symbol('sched::thread_map').value()))
+        threads = list(map(gdb.Value.dereference, unordered_map(gdb.lookup_global_symbol('sched::thread_map').value())))

I don't understand why this change is needed... sorted() should work just fine on an interable, it doesn't need a list as far as I know.
Did something now work without this change?
Changed by 2to3. 

         self.thread_list = sorted(threads, key=lambda x: int(x["_id"]))

     def cpu_from_thread(self, thread):
@@ -896,7 +893,7 @@ def show_thread_timers(t):
         gdb.write('  timers:')
         for timer in timer_list:
             expired = '*' if timer['_state'] == timer_state_expired else ''
-            expiration = int(timer['_time']['__d']['__r']) / 1.0e9
+            expiration = int(timer['_time']['__d']['__r']) // 1.0e9
             gdb.write(' %11.9f%s' % (expiration, expired))
         gdb.write('\n')

@@ -911,7 +908,7 @@ class ResolvedFrame:
         self.frame = frame
         self.file_name = file_name
         self.line = line
-        self.func_name = func_name
+        self.__name__ = func_name

Why rename this field?? 
Changed by 2to3. 
Changed by 2to3. But you are right, this does not look correct. 
Will do. 

 
diff --git a/scripts/osv/prof.py b/scripts/osv/prof.py
index 95db15b8..465a223f 100644
--- a/scripts/osv/prof.py
+++ b/scripts/osv/prof.py
@@ -51,7 +51,7 @@ time_units = [
 ]

 def parse_time_as_nanos(text, default_unit='ns'):
-    for level, name in sorted(time_units, key=lambda (level, name): -len(name)):
+    for level, name in sorted(time_units, key=lambda level_name: -len(level_name[1])):

I don't understand the changes in this file, they don't look like simple syntax changes... Did you review them?
2to3. I may not have tested trace.py with the min_duration argument.
Python 2:
>>> f = lambda bytes: '{' + ' '.join('%02x' % ord(b) for b in bytes) + '}'
>>> f(b'test')
'{74 65 73 74}'

Python 3:
>>> f = lambda bytes: '{' + ' '.join('%02x' % ord(b) for b in bytes) + '}'
>>> f(b'test')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
  File "<stdin>", line 1, in <genexpr>
TypeError: ord() expected string of length 1, but int found

f2 = lambda bytes: '{' + ' '.join('%02x' % b for b in bytes) + '}'
>>> f2(b'test')
'{74 65 73 74}'

ord() itself still exists, but iterating over bytes yields ints in Python 3, so this lambda breaks.
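Should a single expression need to behave identically on both versions, one option is to iterate a bytearray, which yields ints on Python 2 as well - a small sketch of that variant:

```python
data = b'test'

# bytearray yields ints when iterated on both Python 2 and Python 3,
# so the same formatting expression works unchanged on either version
hex_dump = '{' + ' '.join('%02x' % b for b in bytearray(data)) + '}'
print(hex_dump)  # {74 65 73 74}
```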
...

Waldek Kozaczuk

Feb 18, 2020, 8:54:28 AM2/18/20
to OSv Development
This - https://stackoverflow.com/questions/27476079/why-does-2to3-change-mydict-keys-to-listmydict-keys - might explain these "list"-related changes by 2to3. 
...

Nadav Har'El

Feb 18, 2020, 11:53:19 AM2/18/20
to Waldek Kozaczuk, OSv Development
On Tue, Feb 18, 2020 at 3:54 PM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:
This - https://stackoverflow.com/questions/27476079/why-does-2to3-change-mydict-keys-to-listmydict-keys - might explain this "list" related changes by 2to3. 

Yet it doesn't explain the change - it explains why it is *not* actually needed ;-)
I know why it's needed in some specific cases where a list is required and .keys() is no longer a list. But it's not needed here (in a for loop) and in most other locations where this list() was added :-(
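The one case where the added `list()` genuinely matters is mutating the dict while iterating over it - Python 3 raises `RuntimeError: dictionary changed size during iteration` without the snapshot. A small sketch of both situations:

```python
d = {'a': 1, 'b': 0, 'c': 2}

# Plain read-only iteration needs no list() in Python 3:
for k in d.keys():
    pass

# But deleting entries while iterating does need the list() snapshot:
for k in list(d.keys()):
    if d[k] == 0:
        del d[k]    # would raise RuntimeError mid-loop without list()

print(sorted(d))  # ['a', 'c']
```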


Waldek Kozaczuk

Feb 18, 2020, 12:04:36 PM2/18/20
to OSv Development
Let me split this patch into some smaller ones and take your approach of first checking if anything breaks with python3.

I think that loader.py needs to be treated separately. Is it true that gdb comes with its own python runtime, and these days it should be version 3 (we are not relying on externals anymore, right)? The problem with loader.py is that it relies in some places on the trace-related scripts:

from osv.trace import (Trace, Thread, TracePoint, BacktraceFormatter,
from osv import trace, debug

Does it mean we want to make those trace related scripts be 2 and 3 compatible?
...

Waldek Kozaczuk

Feb 18, 2020, 5:52:54 PM2/18/20
to OSv Development
So I think I have a small subset of this patch ready that encompasses everything from the original patch except loader.py and the trace-related files.

But I just realized a problem with the updated setup.py. It now depends on the distro package - https://pypi.org/project/distro/ - which requires pip to install. This means we now require python3-pip to be installed before running setup.py - a chicken-or-egg kind of problem. This is not an issue with docker, as we simply have docker install it first. How do we solve it in a non-docker setup? 

Ideas?
Waldek

Fotis Xenakis

Feb 19, 2020, 5:15:53 PM2/19/20
to OSv Development
Since there is no easy and robust way to determine the distro with python >=3.8 (when platform.linux_distribution() was removed), I see mostly two ways forward:
  • Just require pip for running setup.py, which I personally find reasonable (python is an established requirement anyway).
  • Allow the user to specify the distro to setup.py, to avoid the need for the distro package.
These two could be combined of course, i.e. provide an optional distro flag and if that's not specified, then try pip.
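Sketched with argparse, the combined approach could look roughly like this (the `--distro` flag name and the `name:version` format are made up for illustration - not anything setup.py currently accepts):

```python
import argparse

parser = argparse.ArgumentParser()
# hypothetical flag; lets the user bypass automatic detection entirely
parser.add_argument('--distro', help='override detection, e.g. "fedora:31"')
args = parser.parse_args(['--distro', 'fedora:31'])   # demo input

if args.distro:
    # user-specified: split "name:version" on the first colon
    name, _, version = args.distro.partition(':')
else:
    # fall back to automatic detection (distro package, /etc/os-release, ...)
    name, version = '', ''

print(name, version)  # fedora 31
```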
...

Nadav Har'El

Feb 19, 2020, 5:28:51 PM2/19/20
to Waldek Kozaczuk, OSv Development
On Wed, Feb 19, 2020 at 12:52 AM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:
So I think I have a small subset of this patch ready that encompassed everything from the original path but loader.py and trace related files.

But I just realized a problem with updates setup.py. It now depends on distro package - https://pypi.org/project/distro/ - which requires pip to install. Which means we now require python3-pip to be installed first before running setup.py. Chicken or egg kind of problem. This is not an issue with docker as we simply have docker install it first. How do we solve it in non-docker setup? 

My answer would be: *don't* depend on some random package that doesn't come preinstalled on 99% of the installations and cause people grief.
What I would have done personally is to just parse /etc/os-release with a few lines of hand-written code. Not everything needs to come from a package... Something like (untested)

with open("/etc/os-release") as f:
    distro = {}
    for line in f:
        k,v = line.rstrip().split("=")
        distro[k] = v
distro[k] = v.strip('"')

Now distro['NAME'] is Fedora, distro['VERSION_ID'] is 31, for example.
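Fleshed out slightly - the hand-written snippet above would trip on blank lines, comments, and values containing '=' - a sketch of the same idea:

```python
def parse_os_release(path='/etc/os-release'):
    # Hand-rolled parser along the lines suggested above: skips blank
    # lines and comments, splits on the first '=' only, strips quotes.
    info = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#') or '=' not in line:
                continue
            key, _, value = line.partition('=')
            info[key] = value.strip('"\'')
    return info
```

On a Fedora 31 box this should give `info['NAME'] == 'Fedora'` and `info['VERSION_ID'] == '31'`.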



Nadav Har'El

Feb 19, 2020, 5:40:15 PM2/19/20
to Waldek Kozaczuk, OSv Development
On Thu, Feb 20, 2020 at 12:28 AM Nadav Har'El <n...@scylladb.com> wrote:
On Wed, Feb 19, 2020 at 12:52 AM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:
So I think I have a small subset of this patch ready that encompassed everything from the original path but loader.py and trace related files.

But I just realized a problem with updates setup.py. It now depends on distro package - https://pypi.org/project/distro/ - which requires pip to install. Which means we now require python3-pip to be installed first before running setup.py. Chicken or egg kind of problem. This is not an issue with docker as we simply have docker install it first. How do we solve it in non-docker setup? 

My answer would be: *don't* depend on some random package that doesn't come preinstalled on 99% of the installations and cause people grief.
What I would have done personally is to just parse /etc/os-release with a few lines of hand-written code. Not everything needs to come from a package... Something like (untested)

with open("/etc/os-release") as f:
    distro = {}
    for line in f:
        k,v = line.rstrip().split("=")
        distro[k] = v

This line is of course not needed.

Waldek Kozaczuk

Feb 19, 2020, 5:48:02 PM2/19/20
to OSv Development
I like the idea, and this is by the way what we use in this shell script - https://github.com/cloudius-systems/osv/blob/69486729cde2a991335ad4d155d274b32a966231/scripts/build-capstan-mpm-packages#L93-L101.

Now how standard is /etc/os-release? Can we rely on it being present on all linux distributions, at least those supported by setup.py?
...

Nadav Har'El

Feb 20, 2020, 4:29:36 AM2/20/20
to Waldek Kozaczuk, OSv Development
On Thu, Feb 20, 2020 at 12:48 AM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:
I like the idea and this by the was what we use in this shell script - https://github.com/cloudius-systems/osv/blob/69486729cde2a991335ad4d155d274b32a966231/scripts/build-capstan-mpm-packages#L93-L101.

Now how standard is /etc/os-release? Can we rely on it be present on all linux distributions, at least those supported by setup.py?

I don't know. I think it is on the distributions already supported on setup.py (Fedora, RHEL, Ubuntu), but don't know how to make sure.
I assume that's what the various Python packages use, because there is no old-style Unix mechanism (a la uname(1)) to get this information... 

We can always change this if someone complains...
