Logging notebook activity (redux)

Gary Page-Wood

19 Dec 2018, 15:41:27

to Project Jupyter

Hey folks,

I've been researching options for enabling user activity monitoring in Jupyter notebooks, and I've come across various related topics here in the group and on GitHub over the last year or two [1][2][3][4]. My use case is deploying JupyterHub in an environment where compliance requirements oblige us to record all user activity on each user's notebook server.

gclen's implementation in [4] is the simpler of the approaches listed in last year's discussion [2]: it is purely logging-based, and its only configuration option is a Python logger config file for the kernel messages. There was also mention of what sounded like a more general 'message middleware' approach, where logging would be just one thing you could add to a configurable pipeline of pre/post message processors, enabling much more powerful and far-reaching customisations.
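To make the contrast concrete, here is a minimal sketch of how the two ideas could fit together. It is not gclen's code and not an existing notebook API: `MessagePipeline`, `log_message`, `drop_status` and the `kernel_audit` logger name are all hypothetical, made up for illustration. The only things taken from Jupyter itself are the message fields (`header.msg_type`, `header.username`, `header.session`, `header.msg_id`, `content`) defined by the messaging protocol. The logger is configured with the standard library's `logging.config.dictConfig`, standing in for the logger config file.

```python
import json
import logging
import logging.config
from typing import Any, Callable, Dict, Iterable, Optional

Message = Dict[str, Any]
# A processor returns the (possibly modified) message, or None to drop it.
Processor = Callable[[Message], Optional[Message]]

# Stand-in for a logger config file: in the simple approach this would be
# the only thing an administrator needs to touch.
logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {"audit": {"format": "%(asctime)s %(name)s %(message)s"}},
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "audit"},
    },
    "loggers": {
        # "kernel_audit" is a hypothetical logger name.
        "kernel_audit": {"handlers": ["console"], "level": "INFO", "propagate": False},
    },
})
audit_log = logging.getLogger("kernel_audit")


def log_message(msg: Message) -> Message:
    """Audit processor: record who sent what, then pass the message on unchanged."""
    header = msg.get("header", {})
    audit_log.info(json.dumps({
        "msg_type": header.get("msg_type"),
        "username": header.get("username"),
        "session": header.get("session"),
        "msg_id": header.get("msg_id"),
        "content": msg.get("content", {}),
    }, sort_keys=True, default=str))
    return msg


def drop_status(msg: Message) -> Optional[Message]:
    """Example filter: kernel 'status' chatter is not interesting for auditing."""
    if msg.get("header", {}).get("msg_type") == "status":
        return None
    return msg


class MessagePipeline:
    """Hypothetical middleware chain: run each processor in order."""

    def __init__(self, processors: Iterable[Processor]) -> None:
        self.processors = list(processors)

    def process(self, msg: Message) -> Optional[Message]:
        current: Optional[Message] = msg
        for processor in self.processors:
            current = processor(current)
            if current is None:  # a processor dropped the message
                return None
        return current


if __name__ == "__main__":
    pipeline = MessagePipeline([drop_status, log_message])
    pipeline.process({
        "header": {"msg_type": "execute_request", "username": "gary",
                   "session": "abc", "msg_id": "1"},
        "content": {"code": "print('hello')"},
    })
```

In this picture the simple logging approach is just the `log_message` step on its own, so if a middleware hook did arrive later, the audit logic could in principle be moved into it as one processor among several.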

My question is: before I dive too far into reviving the simple logging approach for the kernel message handler, is there any opposition to taking this route now? Is kernel message-handling middleware something on the horizon that would clash with this approach, or could we go ahead now and simply move the auditing/logging concerns later, should that become a reality?

Cheers

Gary

[1] https://groups.google.com/d/msg/jupyter/bZlWn_Tas1c/WN5w4T6GCwAJ

[2] https://groups.google.com/d/msg/jupyter/sLKCCBwlKEc/CqrvYCvfBwAJ

[3] https://github.com/jupyter/notebook/issues/4136

[4] https://github.com/jupyter/notebook/issues/2251

Tony Hirst

21 Dec 2018, 12:03:26

to Project Jupyter

A recent post [ https://medium.com/adyen/building-our-data-science-platform-with-spark-and-jupyter-1894c33e6dd0 ] describing the adoption of Jupyter notebooks by Adyen mentioned some logging issues:

We added instrumentation across all levels of data analysis workflows — from looking when users looked in and which notebooks were opened to linking the code entered in notebooks with actual files created and accessed on HDFS. Most of the work was related to creating a specific fork of Jupyter protocol client library and making custom Java agent for drivers and executor jobs. This allowed us to create custom events we track for auditing.

I'm not sure whether their repos include any code examples, though.

Gary Page-Wood

22 Dec 2018, 3:03:07

to Project Jupyter

Thanks, I hadn't come across that; I'll take a look...
