Unicode Greek letters in matplotlib images with nbconvert's python API

Paul Hobson

Feb 7, 2017, 9:31:35 PM

to Project Jupyter

I'm having trouble getting Greek letters to display in matplotlib images when using the nbconvert Python API. Is there a trick to dumping PNGs from the Python API?

Interestingly, the images look fine when dumped from the command line via:

    jupyter nbconvert img_test.ipynb --to rst

My code to dump the images is below (inspired by http://nbconvert.readthedocs.io/en/latest/nbconvert_library.html#Extracting-Figures-using-the-RST-Exporter):

    import os

    import nbformat
    from nbconvert import RSTExporter

    nbfile = 'img_test.ipynb'
    basename, _ = os.path.splitext(nbfile)

    # Read the notebook file into a notebook node
    with open(nbfile, 'r') as nb:
        nbdata = nbformat.reads(nb.read(), as_version=4)

    # Convert to RST; the extracted images come back in the resources dict
    body, images = RSTExporter().from_notebook_node(nbdata)
    with open(basename + '.rst', 'w') as rst_out:
        rst_out.write(body)

    # Write each extracted image (raw bytes) to disk
    for img_name, img_data in images['outputs'].items():
        with open(img_name, 'wb') as img:
            img.write(img_data)

Here's the good image from the command line:

Here's the bad image from the Python function. Notice the difference in the x-axis label:

Here's the notebook spec that generated both of those:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "from matplotlib.pyplot import subplots\n",
    "fig, ax = subplots()\n",
    "ax.set_xlabel('Alpha and Beta (α=0, β=1)')\n",
    "fig.tight_layout()"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}

This is on Windows 10, if that matters.

Cheers,

-Paul

Thomas Kluyver

Feb 8, 2017, 7:15:14 AM

to Project Jupyter

In this line:

On 8 February 2017 at 02:31, Paul Hobson <pmho...@gmail.com> wrote:

    with open(nbfile, 'r') as nb:

Can you try adding an encoding parameter, i.e. open(..., encoding='utf-8')?

I think it's reading the notebook file wrong because of the default encoding on Windows.
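As an illustration of the kind of garbling a wrong default encoding can produce, here is a minimal sketch. It assumes the Windows default is cp1252, which is common but not confirmed for the machine in this thread, and it reuses the axis label from the notebook above. In the Python versions discussed here, text-mode open() without an encoding argument falls back to the locale's preferred encoding.

```python
import locale

# The encoding open() falls back to in text mode when none is given.
# On many Windows setups this is cp1252 rather than UTF-8 (an assumption here).
print(locale.getpreferredencoding(False))

# The axis label from the notebook, as it is stored on disk (UTF-8 bytes).
label = "Alpha and Beta (α=0, β=1)"
utf8_bytes = label.encode("utf-8")

# Decoding with the right codec round-trips the Greek letters.
correct = utf8_bytes.decode("utf-8")

# Decoding the same bytes as cp1252 turns each Greek letter into two characters.
garbled = utf8_bytes.decode("cp1252")

print(correct)
print(garbled)
```

If the x-axis label in the bad image looks like the second printed line, an encoding mismatch on read is a likely suspect.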

You could also use the filename directly:

    nbdata = nbformat.read(nbfile, as_version=4)

Thomas

Paul Hobson

Feb 8, 2017, 6:58:10 PM

to jup...@googlegroups.com

Thanks for the response. By itself, adding the encoding didn't do much, but while I was trimming things down to an SSCCE to demonstrate that, I found the issue.

The crux of it was that I was executing the notebook, writing the result to a new file, then reading that file back in and extracting the images. The solution was simply to keep the executed notebook in memory and extract the images from there. Looking back at the documentation, I should have started with that approach.

For those facing similar issues, here's my final function that reads an un-executed notebook, executes it in memory, writes an RST of the executed notebook, and saves all of the images:

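    import os

    import nbformat
    from nbconvert import RSTExporter
    from nbconvert.preprocessors import ExecutePreprocessor

    def convert(nbfile):
        basename, _ = os.path.splitext(nbfile)
        meta = {'metadata': {'path': '.'}}

        # Read the un-executed notebook, decoding it explicitly as UTF-8
        with open(nbfile, 'r', encoding='utf-8') as nbf:
            nbdata = nbformat.read(nbf, as_version=4, encoding='utf-8')

        # Execute the notebook in memory
        runner = ExecutePreprocessor(timeout=600, kernel_name='probscale')
        runner.preprocess(nbdata, meta)

        # Convert the executed notebook and point image directives at the image folder
        img_folder = basename + '_files'
        body_raw, images = RSTExporter().from_notebook_node(nbdata)
        body_final = body_raw.replace('.. image:: ', '.. image:: {}/'.format(img_folder))
        with open(basename + '.rst', 'w', encoding='utf-8') as rst_out:
            rst_out.write(body_final)

        # Save every extracted image, creating the image folder if it is missing
        os.makedirs(img_folder, exist_ok=True)
        for img_name, img_data in images['outputs'].items():
            img_path = os.path.join(img_folder, img_name)
            with open(img_path, 'wb') as img:
                img.write(img_data)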

Thanks for the nudge!

-paul

Thomas Kluyver

Feb 9, 2017, 5:38:29 AM

to Project Jupyter

On 8 February 2017 at 23:58, Paul Hobson <pmho...@gmail.com> wrote:

    nbdata = nbformat.read(nbf, as_version=4, encoding='utf-8')

For reference, I'm pretty sure that passing encoding= on this line has no effect - it's not passed down to anything that takes an encoding argument. Passing it to open() is the important bit.
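To make the point about open() concrete, here is a small standard-library sketch. It uses a plain JSON file as a stand-in for a notebook, with a hypothetical file name, and simulates a cp1252 default by passing that codec explicitly. It does not call nbformat at all, so it only shows where the decoding happens: in open(), before any JSON parsing.

```python
import json
import os
import tempfile

# A stand-in for a notebook file: JSON text containing non-ASCII characters.
notebook_like = {"source": ["ax.set_xlabel('Alpha and Beta (α=0, β=1)')"]}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "img_test.json")  # hypothetical file name

    # Write the stand-in as UTF-8 with the Greek letters stored literally.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook_like, f, ensure_ascii=False)

    # Explicit encoding on open(): the Greek letters survive.
    with open(path, "r", encoding="utf-8") as f:
        good = json.load(f)

    # Simulated cp1252 default: parsing still succeeds, but the text is garbled.
    with open(path, "r", encoding="cp1252") as f:
        bad = json.load(f)

print(good["source"][0])
print(bad["source"][0])
```

Note that the wrong codec does not raise an error in this sketch; the garbling is silent, which is part of why this class of problem is easy to miss.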

Thomas
