[Haskell-cafe] preventing space leaks?

15 views
Skip to first unread message

Baojun Wang

unread,
Sep 19, 2014, 4:00:31 AM9/19/14
to Haskell-Cafe
Hi Cafe,

I'm very new to Haskell, and I'm confused about haskell space leaks, even I read some wiki pages from haskell.org. Below is the scenario I''m facing at:

When I use listArray to create either IArray or UArray, I found it could lead to space leaks (very high GC time), is there a way to prevent this, and why there is no fusion when create IArray/UArray from the list? I don't need the list anyway after array is created. (I used this few times for DP, based on this link: http://jelv.is/blog/Lazy-Dynamic-Programming/).

Further more, suppose if I create a IArray A from list 1, later I create UArray B based some computations based on A (also I need discard another list 2). After B is generated, I don't need A anymore (so this lead to space leaks, correct)?

Another quetion, by the way, after the lazy IArray is created (say from a DP), is there a way convert it to UArray for better performance (IArray -> UArray)?

How can I improve the space leak problems with above use case?

Best Regards
Baojun

Tom Ellis

unread,
Sep 19, 2014, 4:05:59 AM9/19/14
to haskel...@haskell.org
On Fri, Sep 19, 2014 at 01:00:19AM -0700, Baojun Wang wrote:
> When I use listArray to create either IArray or UArray, I found it could
> lead to space leaks (very high GC time), is there a way to prevent this,
> and why there is no fusion when create IArray/UArray from the list? I don't
> need the list anyway after array is created. (I used this few times for DP,
> based on this link: http://jelv.is/blog/Lazy-Dynamic-Programming/).

Could you post some code?
_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Baojun Wang

unread,
Sep 19, 2014, 4:23:18 AM9/19/14
to Tom Ellis, Haskell-Cafe
I'm trying to solve this issue: http://www.spoj.com/problems/NFACTOR/
I'm not asking how to solve this issue, you could try if you like, I found it's pretty hard with Haskell :p

Basically I'm trying to find numbers of distinct prime factors in [1..10^6], and because the query (begin, end, k) could be huge, so I create a dedicate UArray for each size k, so that for every (begin, end, k) query, I only need to binary search the UArray with given size k, no need to iterate from begin to end.

important steps are:
1) sieve [1..10^6];             (generate UArray a)
2) compute number of distinct prime factors from above result;             (generate IArray b with little bit DP)
3) generate a IArray Int (UArray Int Int) for each given size k                (generate 10 UArray Int Int from 10 lists, and then another UArray Int (UArray Int Int) wrap the 10 UArray to a 2D IArray.

source code is nfactor4.hs from the attachment.

--
Below is the sample input I'm using (/tmp/n2.txt):

15
1 3 1
1 10 2
1 10 3
1 100 3
1 1000 0
1 1000000 1
1 1000000 2
1 1000000 3
1 1000000 4
1 1000000 5
1 1000000 6
1 1000000 7
1 1000000 8
1 1000000 9
1 1000000 10

--

Here is the output:

$ time cat /tmp/n2.txt | ./nfactor4 +RTS -sstderr -p
2
2
0
8
1
78734
288726
379720
208034
42492
2285
8
0
0
0
     923,210,584 bytes allocated in the heap
     996,236,400 bytes copied during GC
     321,372,456 bytes maximum residency (7 sample(s))
       4,069,728 bytes maximum slop
             496 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1450 colls,     0 par    0.74s    0.74s     0.0005s    0.0195s
  Gen  1         7 colls,     0 par    1.00s    1.00s     0.1424s    0.3483s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.62s  (  0.61s elapsed)
  GC      time    1.73s  (  1.73s elapsed)
  RP      time    0.00s  (  0.00s elapsed)
  PROF    time    0.00s  (  0.00s elapsed)
  EXIT    time    0.09s  (  0.09s elapsed)
  Total   time    2.44s  (  2.44s elapsed)

  %GC     time      71.0%  (71.2% elapsed)

  Alloc rate    1,500,811,457 bytes per MUT second

  Productivity  28.9% of total user, 29.0% of total elapsed

[1]+  Done                    emacs nfactor4.hs

real    0m2.445s
user    0m2.095s
sys     0m0.351s

--

From profiling data (nfactor4.prof is the full profiling data), below functions leak hell lot of space:

     buildnfactscache.go        Main                      138     1000000   30.8   54.9    41.3   54.9
      buildnfactorsmemo.memo    Main                      133           1   10.4   26.7    39.0   42.3
       buildnfactorsmemo.go     Main                      139     1000000   24.8   15.6    28.6   15.6



nfactor4.prof
nfactor4.hs
Reply all
Reply to author
Forward
0 new messages