Best way to clean up GPU memory


Christopher Wright

Sep 16, 2015, 5:34:06 PM
to Numba Public Discussion - Public

Hi all,
What is the best way to free GPU memory when using Numba CUDA?
Background:

  1. I have a pair of GTX 970s.
  2. I access these GPUs using Python threading.
  3. My problem, while massively parallel, is very memory intensive, so I split the work between the GPUs based on their free memory; this usually means each GPU gets used several times.
  4. However, when I print their memory usage with cuda.current_context().get_memory_info(), I find that the memory is not freed after the kernels have completed.
  5. Using cuda.current_context().reset() seems to free the memory, but it also gives me error 139 (a segfault) when I try to run the next thread on the GPU.
    For an example, see the printout below (a sketch of the measurement pattern follows the printout). Note that 'i' is the memory before the wrapper call and 'f' after; the first number is the free memory, the second the total. This run uses a single GPU for ease of reading.
    ('i', 4138168320L, 4294770688L)
    ('f', 1194045440L, 4294770688L)
    ('i', 1194045440L, 4294770688L)
    ('f', 1194045440L, 4294770688L)
    [... the identical ('i', 1194045440L, ...) / ('f', 1194045440L, ...) pair repeats for the remaining calls ...]
    ('i', 1194045440L, 4294770688L)
    ('f', 1420013568L, 4294770688L)
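
For concreteness, here is a minimal sketch of the measurement pattern (the kernel and array size are placeholders, not my actual code):

import numpy as np
from numba import cuda

@cuda.jit
def scale(arr, factor):
    # Placeholder kernel: multiply each element in place.
    i = cuda.grid(1)
    if i < arr.size:
        arr[i] *= factor

def run_once(data):
    ctx = cuda.current_context()
    free, total = ctx.get_memory_info()
    print('i', free, total)            # free/total bytes before the work
    d_arr = cuda.to_device(data)       # device allocation
    scale[(data.size + 255) // 256, 256](d_arr, 2.0)
    out = d_arr.copy_to_host()
    del d_arr                          # deallocation is deferred, so...
    free, total = ctx.get_memory_info()
    print('f', free, total)            # ...free memory may not drop here
    return out

run_once(np.ones(10000000))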

Diogo Silva

Oct 8, 2015, 4:14:31 AM
to numba...@continuum.io
Have you found any solution to this? I'm having this problem myself and am in dire need of a good solution.


Andrew Kenny

Oct 8, 2015, 9:55:05 AM
to numba...@continuum.io
As far as I can tell, the CUDA equivalent is described on this page:
http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__MEMORY_gb17fef862d4d1fefb9dba35bd62a187e.html

Is there a Numba equivalent that can be explicitly called?

-- 
Andrew Kenny
Test Engineer
Forwessun
17 Hurricane Court,
International Business Park,
Speke, Liverpool, Merseyside,
L24 8RL

Tel: +44 (0)151 3700 112

Christopher J. Wright

Oct 8, 2015, 11:07:38 AM
to numba...@continuum.io

There is a Numba way to force GPU garbage collection: see the TrashService in the CUDA driver module.

https://github.com/numba/numba/blob/master/numba/cuda/cudadrv/driver.py#L286


Diogo Silva

Oct 8, 2015, 1:44:02 PM
to numba...@continuum.io
I don't see how you can use that class directly. The TrashService is instantiated upon context creation. When the memfree method is called with a device pointer, that pointer is removed from the context's allocations dictionary (which holds pointers to all device arrays allocated within that context), and then the service method of the instantiated TrashService is called; that method is inherited from servicelib.Service. Still, when I tried to use memfree to free allocated memory explicitly, it gave me errors. Can you provide a simple example of how to explicitly free the memory?

What I did for now is manually remove the pointers from the allocations dictionary. This seemed to work, but I'm afraid it might break spectacularly.

Christopher J. Wright

Oct 8, 2015, 1:50:26 PM
to numba...@continuum.io

Yes, that is exactly what I did: remove the data from the allocations dictionary and then use the process method or the clear method of the TrashService to finally release the memory. I haven't used this in a while, since ending the context was able to get rid of all the memory allocations, even if get_memory_info() did not show it. And yes, there is a high likelihood of breaking things, so be careful with explicit removal of memory. Make double certain that you remove the data only once; otherwise odd things happen.
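
Roughly, and pieced together from the private driver internals at the time (Context.allocations, Context.trashing, device_ctypes_pointer; these names are from memory and may differ across versions), the procedure looks like this:

import numpy as np
from numba import cuda

# WARNING: relies on private, 2015-era Numba driver internals;
# a hypothetical reading of the procedure described above, and
# getting it wrong can corrupt the context.
ctx = cuda.current_context()

d_arr = cuda.to_device(np.ones(1000000))
ptr = d_arr.device_ctypes_pointer.value  # raw device pointer (bookkeeping key)

# ... use d_arr in kernels ...

del d_arr                       # drop the Python-side reference first
ctx.allocations.pop(ptr, None)  # remove the pointer from the allocations
                                # dict; exactly once, as cautioned above
ctx.trashing.clear()            # flush the TrashService so the underlying
                                # free actually runs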

Diogo Silva

Oct 8, 2015, 1:55:24 PM
to numba...@continuum.io
I was missing the part about calling the service. I'll play with that. Thanks for the feedback!

Christopher Wright

Nov 11, 2015, 1:25:23 PM
to Numba Public Discussion - Public

In case anyone else has this question: the magic command seems to be
cuda.current_context().trashing.clear()
If you have deleted device arrays and need the memory to actually be released (so that it no longer shows up in get_memory_info()), this seems to truly remove the arrays from device memory.
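
For example, a minimal usage sketch (the array size is a placeholder; trashing belongs to the driver internals of this era, and later Numba releases replaced it, with cuda.current_context().deallocations.clear() as the rough modern equivalent):

import numpy as np
from numba import cuda

ctx = cuda.current_context()

d_arr = cuda.to_device(np.ones(50000000))  # ~400 MB placeholder allocation
print('before del:', ctx.get_memory_info())

del d_arr             # drops the reference; the free is only queued
ctx.trashing.clear()  # flushes the queue so the memory is actually released
print('after clear:', ctx.get_memory_info())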

...

Diogo Silva

Nov 12, 2015, 3:46:52 PM
to numba...@continuum.io
That seems like a much safer way to do it. Thank you for the input!


Srikar Kodavati

Sep 15, 2020, 10:27:04 AM
to Numba Public Discussion - Public, cjwrig...@gmail.com
Hello,
    I didn't understand the solution you posted. Can you please help me?