Kaleido M2 Pro

Kayla Munl

Aug 5, 2024, 5:25:44 AM

to ncestuwerlia

Kaleido is a cross-platform library for generating static images (e.g. png, svg, pdf, etc.) for web-based visualization libraries, with a particular focus on eliminating external dependencies. The project's initial focus is on the export of plotly.js images from Python for use by plotly.py, but it is designed to be relatively straightforward to extend to other web-based visualization libraries and other programming languages. The primary focus of Kaleido (at least initially) is to serve as a dependency of web-based visualization libraries like plotly.py. As such, the focus is on providing a programmatic-friendly, rather than user-friendly, API.

The kaleido Python package provides a low-level Python API that is designed to be used by high-level plotting libraries like Plotly. Here is an example of exporting a Plotly figure using the low-level Kaleido API:
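The example that originally followed is missing from this copy. As a rough sketch only, an export through the low-level API might look like the helper below. It assumes a pre-1.0 release of the kaleido package that ships the PlotlyScope class, and it assumes plotly is installed as well. The helper name export_figure_png is hypothetical, and the CDN URL is an assumption about where an online copy of plotly.js can be found.

```python
def export_figure_png(figure, path,
                      plotlyjs="https://cdn.plot.ly/plotly-latest.min.js"):
    """Render a Plotly figure to a PNG file and return the number of bytes written.

    Hypothetical helper. Assumes a pre-1.0 kaleido release that provides
    kaleido.scopes.plotly.PlotlyScope; the default plotlyjs URL is an assumed
    CDN location and needs an internet connection.
    """
    # Imported lazily so that defining this helper does not itself require kaleido.
    from kaleido.scopes.plotly import PlotlyScope

    # The scope starts the Kaleido subprocess and loads plotly.js from `plotlyjs`,
    # which may also be a path to a local copy of the library.
    scope = PlotlyScope(plotlyjs=plotlyjs)

    # transform() returns the encoded image as bytes.
    image_bytes = scope.transform(figure, format="png")
    with open(path, "wb") as f:
        f.write(image_bytes)
    return len(image_bytes)


# Example usage (requires plotly and kaleido to be installed):
#   import plotly.graph_objects as go
#   fig = go.Figure(data=[go.Scatter(y=[1, 3, 2])])
#   export_figure_png(fig, "figure.png")
```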

Note: This particular example uses an online copy of the plotly JavaScript library from a CDN location, so it will not work without an internet connection. When the plotly Python library uses Kaleido, it provides the path to its own local offline copy of plotly.js, so no internet connection is required.

As simple as it sounds, programmatically generating static images (e.g. raster images like PNGs or vector images like SVGs) from web-based visualization libraries (e.g. Plotly.js, Vega-Lite, etc.) is a complex problem. It's a problem that library developers have struggled with for years, and it has delayed the adoption of these libraries among scientific communities that rely on print-based publications for sharing their research. The core difficulty is that web-based visualization libraries don't actually render plots (i.e. color the pixels) on their own; instead, they delegate this work to web technologies like SVG, Canvas, WebGL, etc. Similar to how Matplotlib relies on various backends to display figures, web-based visualization libraries rely on a web browser rendering engine to display figures.

When the figure is displayed in a browser window, it's relatively straightforward for a visualization library to provide an export-image button because it has full access to the browser for rendering. The difficulty arises when trying to export an image programmatically (e.g. from Python) without displaying it in a browser and without user interaction. To accomplish this, the Python portion of the visualization library needs programmatic access to a web browser's rendering engine.

While approaches 1 and 2 can both be installed using conda, they still rely on all of the system dependencies of a complete web browser, even the parts that aren't actually necessary for rendering a visualization. For example, on Linux both require the installation of system libraries related to audio (libasound.so), video (libffmpeg.so), GUI toolkit (libgtk-3.so), screensaver (libXss.so), and X11 (libX11-xcb.so) support. Many of these are not typically included in headless Linux installations such as those found in JupyterHub, Binder, Colab, Azure Notebooks, SageMaker, etc. Also, conda is still not as universally available as the pip package manager, and neither approach is installable using pip packages.

Additionally, both 1 and 2 communicate between the Python process and the web browser over a local network port. While not typically a problem, certain firewall and container configurations can interfere with this local network connection.

The advantage of option 3 is that it introduces no additional system dependencies. The disadvantage is that it only works when running in a notebook, so it can't be used in standalone Python scripts.

The end result is that all of these libraries have in-depth documentation pages on how to get image export working, and how to troubleshoot the inevitable failures and edge cases. While this is a great improvement over the state of affairs just a couple of years ago, and a lot of excellent work has gone into making these approaches work as seamlessly as possible, the fundamental limitations detailed above still result in sub-optimal user experiences. This is especially true when comparing web-based plotting libraries to traditional plotting libraries like matplotlib and ggplot2 where there's never a question of whether image export will work in a particular context.

To accomplish this goal, Kaleido introduces a new approach. The core of Kaleido is a standalone C++ application that embeds the open-source Chromium browser as a library. This architecture allows Kaleido to communicate with the Chromium browser engine using the C++ API rather than requiring a local network connection. A thin Python wrapper runs the Kaleido C++ application as a subprocess and communicates with it by writing image export requests to standard-in and retrieving results by reading from standard-out. Other language wrappers (e.g. R, Julia, Scala, Rust, etc.) can fairly easily be written in the future because the interface relies only on standard-in / standard-out communication using JSON requests.
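The standard-in / standard-out exchange described above can be sketched with nothing but the Python standard library. This is a minimal illustration, not Kaleido's actual wire format: the child process is a small stand-in script rather than the Kaleido C++ application, and the request and response fields ("format", "data", "code", "result") as well as the LineJsonClient class are hypothetical names chosen for the example.

```python
import json
import subprocess
import sys

# Stand-in for the Kaleido executable (an assumption for illustration): a tiny
# child process that reads one JSON request per line from standard-in and
# writes one JSON response per line to standard-out. It renders nothing; it
# only echoes the request back in a response-shaped message.
CHILD_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    response = {
        "code": 0,
        "format": request["format"],
        "result": "rendered:" + request["data"],
    }
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()
"""


class LineJsonClient:
    """Hypothetical thin wrapper: one JSON object per line in each direction."""

    def __init__(self, argv):
        # Pipes replace the local network port used by other approaches.
        self._proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )

    def request(self, payload):
        # Write the request as a single line, then block until the reply line arrives.
        self._proc.stdin.write(json.dumps(payload) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self):
        # Closing standard-in ends the child's read loop so it can exit.
        self._proc.stdin.close()
        self._proc.wait()


client = LineJsonClient([sys.executable, "-c", CHILD_SOURCE])
reply = client.request({"format": "png", "data": "figure-1"})
client.close()
print(reply)
```

Because the exchange is just lines of JSON over pipes, a wrapper in another language would only need a way to spawn a subprocess and a JSON library.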

By compiling Chromium as a library, we have a degree of control over what is included in the Chromium build. In particular, on Linux we can build Chromium in headless mode which eliminates a large number of runtime dependencies, including the audio, video, GUI toolkit, screensaver, and X11 dependencies mentioned above. The remaining dependencies can then be bundled with the library, making it possible to run Kaleido in minimal Linux environments with no additional dependencies required. In this way, Kaleido can be distributed as a self-contained library that plays a similar role to a matplotlib backend.

Kaleido can be used in just about any online notebook service that permits the use of pip to install the kaleido package. These include Colab, SageMaker, Azure Notebooks, Databricks, Kaggle, etc. In addition, Kaleido is compatible with the default Docker image used by Binder.

While this approach has many advantages, the main disadvantage is that building Chromium is not for the faint of heart. Even on powerful workstations, downloading and building the Chromium code base takes 50+ GB of disk space and several hours. On Linux this work can be done once and distributed as a large Docker container, but we don't have a similar shortcut for Windows and macOS.

While motivated by the needs of plotly.py, we made the decision early on to design Kaleido to make it fairly straightforward to add support for additional libraries. Plugins in Kaleido are called "scopes". For more information, see -(Plugin)-Architecture.

While Python is the initial target language for Kaleido, it has been designed to make it fairly straightforward to add support for additional languages. For more information, see -wrapper-architecture.

Instructions for building Kaleido differ slightly across operating systems. All of these approaches assume that the Kaleido repository has been cloned and that the working directory is set to the repository root.

The Linux build relies on the jonmmease/chromium-builder docker image, and the scripts in repos/linux_scripts, to download the chromium source to a local folder and then build it.

Download docker image

Then build the kaleido application to repos/build/kaleido, and bundle shared libraries and fonts. The input source for this application is stored under repos/kaleido/cc/. The build step will also create the Python wheel under repos/kaleido/py/dist/

The chromium-builder container mostly follows the instructions at +/master/docs/linux/build_instructions.md to install depot_tools and run install-build-deps.sh to install the required build dependencies for the appropriate stable version of Chromium. The image is based on Ubuntu 16.04, which is the recommended OS for building Chromium on Linux.

To update the version of Chromium in the future, the docker images will need to be updated. Follow the instructions for the DEPOT_TOOLS_COMMIT and CHROMIUM_TAG environment variables in linux_scripts/Dockerfile.

The CMakeLists.txt file in repos/ is only there to help IDEs like CLion/KDevelop figure out how to index the chromium source tree. It can't be used to actually build chromium. Using this approach, it's possible to get full completion and code navigation from repos/kaleido/cc/kaleido.cc in CLion.

Searching for documentation, it seems that problems with Kaleido have been present for a long time, with plenty of contradictory information found on forums, so I thought perhaps I should try PlotlyLight (not really knowing whether that would change anything with saving figures).

If you use Pluto.jl (a Julia package itself and a remarkable IDE), you should not use PlotlyJS.jl directly; otherwise, you will get the usual Kaleido problems. In Pluto, if you want to plot with the functionalities of PlotlyJS.jl, it is best to use PlotlyBase.jl instead, which is the partner of PlotlyJS that, in fact, renders the plots created by the former. For more details on how to use PlotlyJS.jl inside Pluto, check the thread "Cannot use PlotlyJS".

Finally, if you want to use PlotlyJS and not bother about what IDE you should use, you can use PlutoPlotly.jl, a package developed by @disberd that mimics the functionalities of PlotlyJS and always works out of the box.

You have a lot of packages in the main environment. Maybe some package prevents you from updating PlotlyJS to the latest version. Your version is v0.18.11 while the latest one is v0.18.12. Can you try to update PlotlyJS and see what you get by doing in Julia mode:

Thanks for the suggestions. I ran the command to change access rights, but it made no difference.

I then updated all packages. It did find PlotlyJS 0.18.12 (which is very recent, since I installed the former version 2 days ago!) and installed it. Again, no difference: same error.

I assume this is normal, but out of curiosity I tried to run kaleido.cmd, which is in the artifacts directory, and it gives this error:

If there is a chance it can work by deleting Julia entirely and reinstalling, this time with PlotlyJS 0.18.12 directly (and not upgrading from 0.18.11), I am willing to try, but this will take quite a bit of time, so other suggestions are welcome.