nbdime - cell keys

Tony Hirst

Jun 19, 2017, 3:44:48 PM

to Project Jupyter

Pondering generating notebooks from nbdime again (as per https://github.com/jupyter/nbdime/issues/251 ) I was wondering if there's a simple way of getting numeric cell key values from a notebook?

e.g. if I have a setup like:

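import nbformat
import nbdime.diffing
import nbdime.prettyprint

fn1 = "test1.ipynb"
fn2 = "test2.ipynb"

# Read both notebooks as nbformat v4 structures
a = nbformat.read(fn1, as_version=4)
b = nbformat.read(fn2, as_version=4)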

I seem to be able to get a list of diffs as: nbdime.diffing.diff_notebooks(a, b)

nbdime.diffing.generic.diff_dicts(a, b)

though I don't really understand the data structure (e.g. multiple levels of "diff" inside a "diff", per things like: {'diff': [{'diff': [{'diff': [{'diff': [{'diff': [{'diff': [{'key': 46, } ).
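To make the nesting concrete, here is a minimal sketch that flattens a nested diff into (path, entry) pairs. It assumes the documented nbdime diff format, where each entry is a dict with an 'op' and a 'key', and a 'patch' entry carries a further 'diff' for the value at that key. The function name iter_diff_paths and the example data are hypothetical and hand-written, not real nbdime output.

```python
def iter_diff_paths(diff, path=()):
    """Yield (path, entry) for every non-patch entry in a nested diff.

    Assumes each entry is a dict with "op" and "key", and that a "patch"
    entry holds a nested "diff" for the value found at that key.
    """
    for entry in diff:
        here = path + (entry["key"],)
        if entry["op"] == "patch":
            # Descend one level: the nested diff applies to base[key].
            yield from iter_diff_paths(entry["diff"], here)
        else:
            yield here, entry


# Hand-written example (not real nbdime output): one line inserted
# at line index 2 of the source of the cell at index 3.
example = [
    {"op": "patch", "key": "cells", "diff": [
        {"op": "patch", "key": 3, "diff": [
            {"op": "patch", "key": "source", "diff": [
                {"op": "addrange", "key": 2, "valuelist": ["print('hi')\n"]},
            ]},
        ]},
    ]},
]

for path, entry in iter_diff_paths(example):
    print(path, entry["op"])
```

Read this way, a 'key' is a dict key or a list index at its own nesting level (for instance a cell index under 'cells', or a line index under 'source'), rather than a single notebook-wide line number.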

I assume that the 'key' uniquely identifies a line in notebook a. If so, is there anything that takes a notebook and returns it as a data structure with 'key' elements identified according to the same numbering scheme? Then I could print out the (unchanged) elements from a if the key is not in the diff, or print the diff element if the key is in the diff.

e.g. as a roundabout case in point, I can print notebook a as:

nbdime.prettyprint.pretty_print_notebook(a)

and diff between notebook a and b:

d = nbdime.diffing.diff_notebooks(a, b)

nbdime.prettyprint.pretty_print_notebook_diff(fn1, fn2, a, d)

That gives a display akin to nbdiff-web with "Hide unchanged cells" checked.

But how would I go about doing a pretty print to display the equivalent of nbdiff-web with "Hide unchanged cells" unchecked (i.e. displaying unchanged cells too)?
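One possible approach to the "unchanged cells too" view is to walk the base notebook's cell list alongside the diff entries for the 'cells' key. The following is only a sketch on plain dicts and lists, not part of nbdime's API: iter_cells_with_status is a hypothetical name and the example data is hand-written. It assumes the documented list operations addrange, removerange and patch, with 'key' being an index into the base cell list.

```python
def iter_cells_with_status(cells, cells_diff):
    """Yield (status, cell) for every base cell, plus any added cells.

    `cells` is the base notebook's cell list; `cells_diff` is assumed to be
    the nested diff found under the top-level patch entry with key "cells".
    """
    # Group diff entries by the base-list index they refer to.
    by_index = {}
    for entry in cells_diff:
        by_index.setdefault(entry["key"], []).append(entry)

    remaining_removed = 0  # base cells still covered by a removerange
    for i in range(len(cells) + 1):
        entries = by_index.get(i, [])
        for entry in entries:
            if entry["op"] == "addrange":
                # Inserted cells are emitted before the base cell at index i.
                for cell in entry["valuelist"]:
                    yield "added", cell
            elif entry["op"] == "removerange":
                remaining_removed = entry["length"]
        if i == len(cells):
            break  # index len(cells) only exists for trailing insertions
        if remaining_removed:
            remaining_removed -= 1
            yield "removed", cells[i]
        elif any(e["op"] == "patch" for e in entries):
            yield "modified", cells[i]
        else:
            yield "unchanged", cells[i]


# Hand-written example data (not real nbdime output).
cells = [{"source": "a"}, {"source": "b"}, {"source": "c"}]
cells_diff = [
    {"op": "patch", "key": 0, "diff": [{"op": "patch", "key": "source", "diff": []}]},
    {"op": "addrange", "key": 2, "valuelist": [{"source": "new"}]},
    {"op": "removerange", "key": 2, "length": 1},
]

for status, cell in iter_cells_with_status(cells, cells_diff):
    print(status, cell["source"])
```

Cells whose index never appears in the diff come out as "unchanged", which is the information the hidden-cells view leaves out.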

--tony

PS context is:

What I have in mind to explore is a set of functions equivalent to pretty_print_diff_entry() etc., along the lines of nb_create_diff_entry(), that add metadata-annotated cells to an output notebook rather than colour-coded lines to an output text stream. The cell metadata would identify the state of the cell (added, removed, etc.). I'd also like a switch to (optionally) add unchanged cells to the output notebook, or just those unchanged cells that have a particular metadata element set in the original notebook (so e.g. I could force a cell to always appear in the output diff whether it was changed or not).
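A rough sketch of how such an annotated output notebook might be assembled, given a sequence of (status, cell) pairs. Everything here is an assumption for illustration: the function name build_diff_notebook, the metadata keys "diff_state" and "keep_in_diff", and the status strings are all hypothetical, and the cells are treated as plain notebook-JSON dicts rather than nbformat objects.

```python
import copy


def build_diff_notebook(labelled_cells, include_unchanged=False,
                        keep_flag="keep_in_diff"):
    """Build a notebook dict whose cells record their diff state in metadata.

    `labelled_cells` is an iterable of (status, cell) pairs. Unchanged cells
    are dropped unless `include_unchanged` is set or the cell's own metadata
    carries a truthy `keep_flag` entry (both names are hypothetical).
    """
    out_cells = []
    for status, cell in labelled_cells:
        forced = bool(cell.get("metadata", {}).get(keep_flag))
        if status == "unchanged" and not (include_unchanged or forced):
            continue
        annotated = copy.deepcopy(cell)  # leave the input cells untouched
        annotated.setdefault("metadata", {})["diff_state"] = status
        out_cells.append(annotated)
    return {"cells": out_cells, "metadata": {},
            "nbformat": 4, "nbformat_minor": 2}


# Hand-written example cells.
labelled = [
    ("unchanged", {"cell_type": "markdown", "source": "# Task 1",
                   "metadata": {"keep_in_diff": True}}),
    ("unchanged", {"cell_type": "markdown", "source": "Long instructions",
                   "metadata": {}}),
    ("added", {"cell_type": "code", "source": "x = 1", "metadata": {},
               "outputs": [], "execution_count": None}),
]

nb_out = build_diff_notebook(labelled)
print([c["metadata"]["diff_state"] for c in nb_out["cells"]])
```

With the default settings only the flagged heading and the added cell survive; passing include_unchanged=True keeps all three.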

My use case is trying to explore the space of instructor marking of student notebooks, where students are provided with a template notebook which may contain lots of text. The idea would be to create a diff-notebook that contains just the student-modified cells and some heading cells from the original notebook. (Thinks: actually, I could do that anyway: add a 'stripme' metadata element to cells in the notebook supplied to students, then run the student-returned notebooks through a processor that creates a child notebook without the 'stripme' cells.)
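The 'stripme' idea could be sketched as below, working directly on the notebook JSON with the standard library rather than through nbformat. The function names and file paths are hypothetical; only the 'stripme' metadata key comes from the description above.

```python
import json


def strip_flagged_cells(nb, flag="stripme"):
    """Return a copy of the notebook dict without cells flagged in metadata."""
    kept = [cell for cell in nb["cells"]
            if not cell.get("metadata", {}).get(flag)]
    return {**nb, "cells": kept}


def strip_notebook_file(src_path, dst_path, flag="stripme"):
    """Read a .ipynb file, drop flagged cells, and write the child notebook."""
    with open(src_path, encoding="utf-8") as f:
        nb = json.load(f)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(strip_flagged_cells(nb, flag), f, indent=1)


# Hand-written example notebook.
template = {
    "cells": [
        {"cell_type": "markdown", "source": "Background reading",
         "metadata": {"stripme": True}},
        {"cell_type": "code", "source": "answer = 42", "metadata": {},
         "outputs": [], "execution_count": None},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2,
}

child = strip_flagged_cells(template)
print([c["source"] for c in child["cells"]])
```

This does not need a diff at all, but it only removes cells the instructor flagged in advance; it cannot tell whether a student edited a flagged cell.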

nbgrader doesn't quite work for my setting (and I'm not sure what workflow would). Something around diffing feels right, but I want to try to keep things as diff-revealing Jupyter notebooks rather than diff-revealing text files.

I think I need to walk the dog again to try to clarify my thoughts a bit more!

Vidar Tonaas Fauske

Jun 20, 2017, 10:25:14 AM

to jup...@googlegroups.com

Hi Tony,

You seem to have made decent progress. First off, the diff format
is documented here:
http://nbdime.readthedocs.io/en/latest/diffing.html I hope that
helps in understanding the format.

To help with the most tricky bit, I implemented the iterator function
I alluded to in the issue:
https://gist.github.com/vidartf/2551f2a825e412fb323a8d67e341e75f

Not sure if this is something we will put into nbdime directly.

Let me know if you have any further problems!

Best,
Vidar Tonaas Fauske

Tony Hirst

Jun 21, 2017, 10:49:36 AM

to Project Jupyter

Wonderful - thanks... :-)

Will have another look...

--tony
