New issue 17 by ygrekheretix: String.nsplit memory usage
http://code.google.com/p/ocaml-extlib/issues/detail?id=17
What steps will reproduce the problem?
# let st = Gc.quick_stat ();;
val st : Gc.stat =
{Gc.minor_words = 1034985.; Gc.promoted_words = 216549.;
Gc.major_words = 726404.; Gc.minor_collections = 33;
Gc.major_collections = 6; Gc.heap_words = 552960; Gc.heap_chunks = 4;
Gc.live_words = 0; Gc.live_blocks = 0; Gc.free_words = 0;
Gc.free_blocks = 0; Gc.largest_free = 0; Gc.fragments = 0;
Gc.compactions = 1; Gc.top_heap_words = 679936}
# String.nsplit (String.make 10000 ' ') " ";;
- : string list =
[ ... ]
# let st2 = Gc.quick_stat ();;
val st2 : Gc.stat =
{Gc.minor_words = 1897029.; Gc.promoted_words = 553575.;
Gc.major_words = 7064466.; Gc.minor_collections = 236;
Gc.major_collections = 17; Gc.heap_words = 6774784; Gc.heap_chunks = 53;
Gc.live_words = 0; Gc.live_blocks = 0; Gc.free_words = 0;
Gc.free_blocks = 0; Gc.largest_free = 0; Gc.fragments = 0;
Gc.compactions = 1; Gc.top_heap_words = 6774784}
# print_endline & gc_diff st st2;;
allocated 52.4MB, heap 47.5MB, collection 0 11 203
- : unit = ()
# Gc.compact ();;
- : unit = ()
# let st3 = Gc.quick_stat ();;
val st3 : Gc.stat =
{Gc.minor_words = 1950890.; Gc.promoted_words = 559622.;
Gc.major_words = 7073711.; Gc.minor_collections = 238;
Gc.major_collections = 19; Gc.heap_words = 634880; Gc.heap_chunks = 5;
Gc.live_words = 0; Gc.live_blocks = 0; Gc.free_words = 0;
Gc.free_blocks = 0; Gc.largest_free = 0; Gc.fragments = 0;
Gc.compactions = 2; Gc.top_heap_words = 6774784}
# print_endline & gc_diff st2 st3;;
allocated 445KB, heap -46.8MB, collection 1 2 2
- : unit = ()
What is the expected output? What do you see instead?
We can see 52 MB allocated during nsplit of 10K string. Memory consumption
of nsplit is currently quadratic to the length of string, in the
patalogical case (n**2)/2 (50MB ~ 10K/2*10K)
What version of the product are you using? On what operating system?
1.5.1
Please provide any additional information below.
Better solution (linear memory usage) can be easily implemented with
find_from from issue #15