In commit fde393b, I have introduced a mechanism that should dramatically reduce one of the main problems with Brython so far: the slowness of module imports.
Before that, each time a page was loaded, importing a module meant that its Python source code was translated into Javascript by the Brython engine. For standard library modules, the source code is found either by an Ajax call or in the file brython_stdlib.js. Pre-compiling is technically possible and has been suggested many times, but the generated Javascript is something like 10 times bigger than the Python source code, so the whole standard library would be around 30 MB.
The solution introduced in the commit works under 2 conditions:
- the browser must support the indexedDB database engine (most of them do, including on smartphones)
- the Brython page must use brython_stdlib.js, or the reduced version brython_modules.js generated by the CPython brython module
The main idea is to store the Javascript translation of stdlib modules in an indexedDB database: the translation is done only once for each new version of Brython; the generated Javascript is stored on the client side, not sent over the network, and indexedDB can easily handle a few MB of data.
Unfortunately, indexedDB works asynchronously, while import is blocking. With this code:
import datetime
print(datetime.datetime.now())
using indexedDB at runtime to get the datetime module is not possible, because the code that follows the import statement is not in a callback function that could be called when the asynchronous indexedDB request completes.
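The problem can be reproduced in plain CPython with a simulated event loop. This is only a sketch: the `pending` list, `async_get` and `script` are hypothetical names standing in for the browser event loop, an indexedDB request and the user script; none of them is part of Brython or of the indexedDB API.

```python
pending = []   # callbacks queued for the simulated event loop
loaded = {}    # what a completed request would make available


def async_get(name):
    """Mimic an asynchronous request: the result arrives on a later turn."""
    pending.append(lambda: loaded.setdefault(name, "<code of %s>" % name))


def script():
    async_get("datetime")
    # This line plays the role of the statement that follows the import
    return "datetime" in loaded


available_inside_script = script()

# The event loop only delivers results once the script has returned
while pending:
    pending.pop(0)()

available_afterwards = "datetime" in loaded
print(available_inside_script, available_afterwards)  # False True
```

The module only becomes available after the script has finished, which is too late for the statement that follows the import.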
The solution is to scan the script at translation time. For each import statement in the source code, the name of the module to import is stored in a list. When the translation is finished, the Brython engine enters an execution loop (defined in function loop() in py2js.js) that uses a tasks stack. The possible tasks are:
- call function inImported(), which checks whether the module is already among the imported modules. If so, control returns to loop()
- if not, add a task to the stack: a call to function idb_get(), which makes a request to the indexedDB database to see if the Javascript version of the Python module is already stored; when the task is added, control returns to loop()
- in the callback of this request (function idb_load()):
  - if the Javascript version exists in the database, it is stored in a Brython variable (__BRYTHON__.precompiled) and control returns to loop()
  - otherwise, the Python source for the module (found in brython_stdlib.js) is translated and another task is added to the stack: a request to store the Javascript code in the indexedDB database. The callback of this request adds another task: a new call to idb_get(), which is sure to succeed this time
- the last task on the stack is the execution of the original script
At run time, there is a change in py_import.js: when a module in the standard library is imported, the Javascript translation stored in __BRYTHON__.precompiled is executed, since the Python to Javascript translation has already been done.
Cache update
The indexedDB database is associated with the browser and persists between browser requests, when the browser is closed, when the PC is restarted, etc. The process described above must define a way to update the Javascript version stored in the database when the Python source code in the stdlib is changed, or when the translation engine changes.
To achieve this, cache update relies on a timestamp. Each version of Brython is marked with a timestamp, updated by the script make_dist.py. When a script in the stdlib is precompiled and stored in the indexedDB database, the record in the database has a timestamp field set to this Brython timestamp. If a new version of Brython is used in the HTML page, it has a different timestamp, so idb_load() treats the stored record as outdated and a new translation is performed.
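The timestamp check can be sketched in plain Python, with a dictionary standing in for the indexedDB object store. The names `get_precompiled` and `fake_translate`, and the exact record layout, are assumptions made for illustration; only the comparison on the `timestamp` field mirrors the mechanism described above.

```python
def get_precompiled(cache, name, source, engine_timestamp, translate):
    """Return (js, was_translated) for a module, refreshing stale records."""
    record = cache.get(name)
    if record is not None and record["timestamp"] == engine_timestamp:
        # The record was produced by the same Brython version: reuse it
        return record["content"], False
    # Missing or outdated record: translate again and overwrite it
    js = translate(source)
    cache[name] = {"name": name, "content": js, "timestamp": engine_timestamp}
    return js, True


def fake_translate(source):
    # Placeholder for the Python to Javascript translation
    return "/* js */ " + source


cache = {}
first = get_precompiled(cache, "datetime", "x = 1", 1000, fake_translate)
second = get_precompiled(cache, "datetime", "x = 1", 1000, fake_translate)
upgraded = get_precompiled(cache, "datetime", "x = 1", 2000, fake_translate)
print(first[1], second[1], upgraded[1])  # True False True
```

The third call simulates a page that loads a newer Brython version: the stored record carries the old timestamp, so the module is translated again and the record is replaced.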
Limitations
The detection of the modules to import is made by a static code analysis, relying on "import moduleX" or "from moduleY import foo". It cannot work for imports performed with the built-in function __import__(), or for code passed to exec(). In these cases, the previous solution of on-the-fly compilation at each page load is used.
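This limitation can be illustrated in CPython with the standard ast module. The sketch below is not Brython's scanner (which is part of its Javascript translation engine); `find_static_imports` is a hypothetical helper that only shows which imports a static analysis can see.

```python
import ast


def find_static_imports(source):
    """Return the module names found in import statements.

    Hypothetical helper for illustration only.
    """
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.append(node.module)
    return names


script = '''
import datetime
from random import choice
json = __import__("json")
exec("import re")
'''

found = find_static_imports(script)
print(found)  # ['datetime', 'random']
```

The modules json and re are not reported: the first is loaded through a function call and the second is hidden inside a string, so neither appears as an import statement in the syntax tree.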
The mechanism is only implemented for modules in the standard library. Using it for modules in site-packages or in the application directory is not implemented at the moment.
Pseudo-code
Below is a simplified version of the cache implementation, written in Python-like pseudo-code.
def brython():
    <get Brython scripts in the page>
    for script in scripts:
        # Translate Python script source to Javascript
        root = __BRYTHON__.py2js(script.src)
        js = root.to_js()
        if hasattr(__BRYTHON__, "VFS") and __BRYTHON__.has_indexedDB:
            # If brython_stdlib.js is included in the page, the __BRYTHON__
            # object has an attribute VFS (Virtual File System)
            for module in root.imports:
                tasks.append([inImported, module])
        tasks.append(["execute", js])
    loop()

def inImported(module_name):
    if module_name in imported:
        pass
    elif module_name in stdlib:
        tasks.insert(0, [idb_get, module_name])
    loop()

def idb_get(module_name):
    request = database.get(module_name)
    request.bind("success",
        lambda evt: idb_load(evt, module_name))

def idb_load(evt, module_name):
    result = evt.target.result
    if result and result.timestamp == __BRYTHON__.timestamp:
        __BRYTHON__.precompiled[module_name] = result.content
        for subimport in result.imports:
            tasks.insert(0, [inImported, subimport])
    else:
        # Not found or outdated: precompile source code found
        # in __BRYTHON__.VFS
        root = __BRYTHON__.py2js(__BRYTHON__.VFS[module_name])
        tasks.insert(0, [store_precompiled, module_name, root.to_js(),
            root.imports])
    loop()

def store_precompiled(module, js, imports):
    """Store precompiled Javascript in the database."""
    # The timestamp and imports fields are the ones read back by idb_load()
    request = database.put({"content": js, "name": module,
        "imports": imports, "timestamp": __BRYTHON__.timestamp})

    def restart(evt):
        """When the code is inserted, add a new request to idb_get (this time
        we are sure it will find the precompiled code) and call loop()."""
        tasks.insert(0, [idb_get, module])
        loop()

    request.bind("success", restart)

def loop():
    """Pop the first item of the tasks stack and run it with its arguments."""
    if not tasks:
        return
    func, *args = tasks.pop(0)
    if func == "execute":
        js_script = args[0]
        <execute js_script>
        # Go on with the tasks of the next script, if any
        loop()
    else:
        func(*args)
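To experiment with this control flow outside the browser, the pseudo-code can be turned into a small CPython simulation. It is a sketch under assumptions, not Brython's implementation: a dictionary replaces the indexedDB store, the `events` list replaces the browser event loop, `translate` is a dummy, and the `VFS` and `IMPORTS` tables are made-up data standing in for the stdlib sources and for the imports detected during translation.

```python
TIMESTAMP = 1            # stands in for __BRYTHON__.timestamp
VFS = {"datetime": "import time", "time": ""}   # module name -> source
IMPORTS = {"datetime": ["time"], "time": []}    # imports found by the scan
database = {}            # stands in for the indexedDB object store
precompiled = {}         # stands in for __BRYTHON__.precompiled
imported = set()         # modules already executed (left empty here)
tasks = []
events = []              # callbacks waiting for the simulated event loop
trace = []


def translate(source):
    return "/* js */ " + source


def in_imported(name):
    if name not in imported and name in VFS:
        tasks.insert(0, [idb_get, name])
    loop()


def idb_get(name):
    # Asynchronous request: the callback runs on a later event loop turn
    events.append(lambda: idb_load(database.get(name), name))


def idb_load(record, name):
    if record and record["timestamp"] == TIMESTAMP:
        trace.append("hit " + name)
        precompiled[name] = record["content"]
        for sub in record["imports"]:
            tasks.insert(0, [in_imported, sub])
    else:
        trace.append("miss " + name)
        tasks.insert(0, [store_precompiled, name, translate(VFS[name])])
    loop()


def store_precompiled(name, js):
    record = {"name": name, "content": js,
              "imports": IMPORTS[name], "timestamp": TIMESTAMP}

    def on_success():
        database[name] = record
        tasks.insert(0, [idb_get, name])
        loop()

    events.append(on_success)


def loop():
    if not tasks:
        return
    func, *args = tasks.pop(0)
    if func == "execute":
        trace.append("execute")
        loop()
    else:
        func(*args)


def run_page(script_imports):
    for name in script_imports:
        tasks.append([in_imported, name])
    tasks.append(["execute"])
    loop()
    while events:        # deliver the asynchronous callbacks in order
        events.pop(0)()


run_page(["datetime"])
first_load = list(trace)
trace.clear()
precompiled.clear()      # a new page load starts without in-memory modules
run_page(["datetime"])
second_load = list(trace)
print(first_load)
print(second_load)
```

On the first simulated page load, each module is missed once, stored, then found; on the second load the database already holds both records, so only hits are recorded before the script is executed.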