cleanup of modin/ray on AWS


Nerdromancer

Apr 30, 2020, 5:04:06 PM
to modin-dev
looking for guidance on how to make sure modin and/or ray are properly cleaned up in a running notebook when i'm done with them.
i want to prevent problems when multiple notebooks use modin

so far I can only see ray.shutdown()
do i force the unload of the modin module, call ray.shutdown(), then gc.collect()?

i believe i'm seeing residual processes running on my AWS Notebook instance...


Devin Petersohn

Apr 30, 2020, 6:20:39 PM
to Nerdromancer, modin-dev
> i believe i'm seeing residual processes running on my AWS Notebook instance...

This is not uncommon. Sometimes it is necessary to run `ray stop` from the command line to kill all relevant processes; I find myself doing this as well. After that, restarting the notebook kernel should work.
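
For anyone who wants to check for leftovers from inside the notebook before reaching for `ray stop`, here is a minimal sketch. It assumes psutil is installed and that Ray's helper processes can be recognized by substrings in their command lines. The marker list and the function names (`looks_like_ray`, `find_ray_processes`, `stop_ray`) are illustrative assumptions, not part of Ray or Modin, and the process names differ between Ray versions.

```python
import subprocess

# Assumed substrings that tend to show up in the command lines of Ray's helper
# processes; the exact names vary between Ray versions, so adjust as needed.
RAY_MARKERS = ("raylet", "plasma_store", "gcs_server", "default_worker.py", "ray::")


def looks_like_ray(cmdline):
    """Return True if a process command line (list of strings) mentions Ray."""
    joined = " ".join(cmdline)
    return any(marker in joined for marker in RAY_MARKERS)


def find_ray_processes():
    """List (pid, command line) for processes that look like Ray leftovers."""
    import psutil  # third-party; imported lazily so looks_like_ray works without it

    leftovers = []
    for proc in psutil.process_iter(["pid", "cmdline"]):
        # cmdline is None when the process cannot be inspected (access denied)
        cmdline = proc.info["cmdline"] or []
        if looks_like_ray(cmdline):
            leftovers.append((proc.info["pid"], " ".join(cmdline)))
    return leftovers


def stop_ray():
    """Run `ray stop`, the same command you would type in a terminal."""
    return subprocess.run(["ray", "stop"], check=False).returncode
```

Calling `find_ray_processes()` after `ray.shutdown()` gives a rough idea of whether anything is still hanging around. If the list is not empty, `stop_ray()` followed by a kernel restart is the blunt fix described above.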

Ray doesn't support multi-tenancy yet, but I believe they are working on it.

Devin

--
You received this message because you are subscribed to the Google Groups "modin-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to modin-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/modin-dev/1e49360d-e5e7-48d8-9470-59a188bd5899%40googlegroups.com.

Nerdromancer

May 6, 2020, 5:54:48 PM
to modin-dev
thanks. not a lot of (reliable) luck yet with deleting the modin/ray packages to force them to unload, but after some more tests i seem to have ray.init() and ray.shutdown() working reliably, as long as both are called within a single notebook and ray is shut down there before it is started in another:

# STARTUP

# one-time startup and modin import
# keep modin and pandas modules separated with aliases

import os

os.environ["MODIN_ENGINE"] = "ray"  # Modin will use Ray
#os.environ["MODIN_ENGINE"] = "dask"  # Modin will use Dask

import psutil
num_cpus = psutil.cpu_count()
# num_cpus = 4 # limit CPUs to try and cut overhead if workloads not that heavy
num_gpus = 0

import ray
# ray.init() with no spec of CPU count will automagically figure out your CPUs
# however, too many CPUs might be an unneeded overhead expense
# try operating with fewer cpus

ray.init(num_cpus=num_cpus, num_gpus=num_gpus, ignore_reinit_error=True)

del num_cpus, num_gpus
import modin.pandas as mpd
import pandas as pd
pandas_impl_module = mpd # change dynamically to experiment w pandas vs modin, write code to this variable

# SHUTDOWN

ray.shutdown()
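
One way to make sure the shutdown half always runs, even when the work in between raises, is to wrap the pair in a context manager. This is only a sketch under assumptions: `ray_session` and its `ray_module` parameter are hypothetical names made up for illustration (the parameter exists so a stand-in object can be passed in for testing), and it relies only on `ray.init()`, `ray.is_initialized()` and `ray.shutdown()`.

```python
from contextlib import contextmanager


@contextmanager
def ray_session(ray_module=None, **init_kwargs):
    """Start Ray on entry and shut it down on exit, even if the body raises.

    ray_module is a hypothetical hook for passing in a stand-in during tests;
    leave it as None to use the real ray package.
    """
    if ray_module is None:
        import ray as ray_module  # imported lazily, only when no stand-in is given

    ray_module.init(ignore_reinit_error=True, **init_kwargs)
    try:
        yield ray_module
    finally:
        # Only shut down if Ray is still up, so a manual shutdown inside the
        # block does not cause a second call here.
        if ray_module.is_initialized():
            ray_module.shutdown()
```

Usage would look like `with ray_session(num_cpus=4): ...` with the modin work inside the block. Since a `with` block cannot span notebook cells, this fits a single-cell job or a script better than a long interactive session, where the explicit STARTUP and SHUTDOWN cells above are still the simpler option.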
