Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[Haskell-beginners] Interpreting profiling results

0 views
Skip to first unread message

Patrick LeBoutillier

unread,
Jan 1, 2010, 10:41:31 AM1/1/10
to beginners
Hi,

Firstly, Happy New Year!

Secondly, I've written (and improved using other solutions I've found
on the net) a simple sudoku solver which I'm trying to profile.
Here is the code:


import Array
import List (transpose, nub, (\\))
import Data.List

data Sudoku = Sudoku { unit :: Int, cells :: Array (Int, Int) Int,
holes :: [(Int, Int)] }

cell :: Sudoku -> (Int, Int) -> Int
cell s i = (cells s) ! i

instance Read Sudoku where
readsPrec _ s = [(Sudoku unit cells holes, "")]
where unit = length . words . head . lines $ s
cells = listArray ((1, 1), (unit, unit)) (map read . words $ s)
holes = [ c | c <- indices cells, (cells ! c) == 0]

instance Show Sudoku where
show s = unlines [unwords [show $ cell s (x,y) | x <- [1 .. unit s]]
| y <- [1 .. unit s]]

genNums :: Sudoku -> (Int, Int) -> [Int]
genNums s c@(i,j) = ([1 .. u] \\) . nub $ used
where
used = (row s u i j) ++ (col s u i j) ++ (square s sq u i j)
u = unit s
sq = truncate . sqrt . fromIntegral $ u

row s u i j = [cell s (i, y) | y <- [1 .. u]]

col s u i j = [cell s (x, j) | x <- [1 .. u]]

square s sq u i j = [cell s (x, y) | y <- [1 .. u], x <- [1 .. u], f x i, f y j]
where f a b = div (a-1) sq == div (b-1) sq

solve :: Sudoku -> [Sudoku]
solve s =
case holes s of
[] -> [s]
(h:hs) -> do
n <- genNums s h
let s' = Sudoku (unit s) ((cells s) // [(h, n)]) hs
solve s'

main = print . head . solve . read =<< getContents


When I compile as such:

$ ghc -O2 --make Sudoku.hs -prof -auto-all -caf-all -fforce-recomp

and run it on the following puzzle:

0 2 3 4
3 4 1 0
2 1 4 0
0 3 2 1

I get the following profiling report:

Fri Jan 1 10:34 2010 Time and Allocation Profiling Report (Final)

Sudoku +RTS -p -RTS

total time = 0.00 secs (0 ticks @ 20 ms)
total alloc = 165,728 bytes (excludes profiling overheads)

COST CENTRE MODULE %time %alloc

CAF GHC.Handle 0.0 10.7
CAF Text.Read.Lex 0.0 2.1
CAF GHC.Read 0.0 1.2
square Main 0.0 2.8
solve Main 0.0 1.3
show_aVx Main 0.0 3.7
readsPrec_aYF Main 0.0 60.6
main Main 0.0 9.6
genNums Main 0.0 5.0
cell Main 0.0 1.2

individual inherited
COST CENTRE MODULE
no. entries %time %alloc %time %alloc

MAIN MAIN
1 0 0.0 0.3 0.0 100.0
main Main
186 1 0.0 9.6 0.0 85.6
show_aVx Main
196 2 0.0 3.7 0.0 3.7
cell Main
197 16 0.0 0.0 0.0 0.0
solve Main
188 5 0.0 1.3 0.0 11.8
genNums Main
189 8 0.0 5.0 0.0 10.4
square Main
194 88 0.0 2.8 0.0 3.2
cell Main
195 16 0.0 0.4 0.0 0.4
col Main
192 4 0.0 0.7 0.0 1.1
cell Main
193 16 0.0 0.4 0.0 0.4
row Main
190 4 0.0 0.7 0.0 1.1
cell Main
191 16 0.0 0.4 0.0 0.4
readsPrec_aYF Main
187 3 0.0 60.6 0.0 60.6
CAF GHC.Read
151 1 0.0 1.2 0.0 1.2
CAF Text.Read.Lex
144 8 0.0 2.1 0.0 2.1
CAF GHC.Handle
128 4 0.0 10.7 0.0 10.7
CAF GHC.Conc
127 1 0.0 0.0 0.0 0.0

Does the column 'entries' represent the number of times the function
was called? If so, I don't understand
how the 'square' function could be called 88 times when it's caller is
only called 8 times. Same thing with
'genNums' (called 8 times, and solve called 5 times)

What am I missing here?

Patrick

--
=====================
Patrick LeBoutillier
Rosem�re, Qu�bec, Canada
_______________________________________________
Beginners mailing list
Begi...@haskell.org
http://www.haskell.org/mailman/listinfo/beginners

Patrick LeBoutillier

unread,
Jan 3, 2010, 8:17:32 PM1/3/10
to Haskell Cafe
Hi,

This question didn't get any replies on the beginners list, I thought
I'd try it here...

� � � � � Sudoku +RTS -p -RTS

Patrick

Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Daniel Fischer

unread,
Jan 4, 2010, 10:07:51 AM1/4/10
to haskel...@haskell.org
Am Montag 04 Januar 2010 02:17:06 schrieb Patrick LeBoutillier:
> Hi,
>
> This question didn't get any replies on the beginners list, I thought
> I'd try it here...

Sorry, been occupied with other things. I already took a look, but hadn't anything
conclusive enough to reply yet.

>
> I've written (and improved using other solutions I've found on the
> net) a simple sudoku solver which I'm trying to profile. Here is the
> code:
>
>
> import Array

Better

import Data.Array.Unboxed

*much* faster

> import List (transpose, nub, (\\))
> import Data.List
>
> data Sudoku = Sudoku { unit :: Int, cells :: Array (Int, Int) Int,

cells :: UArray (Int,Int) Int

> holes :: [(Int, Int)] }
>
> cell :: Sudoku -> (Int, Int) -> Int
> cell s i = (cells s) ! i
>
> instance Read Sudoku where
> �readsPrec _ s = [(Sudoku unit cells holes, "")]
> � �where unit = length . words . head . lines $ s
> � � � � �cells = listArray ((1, 1), (unit, unit)) (map read . words $ s)
> � � � � �holes = [ c | c <- indices cells, (cells ! c) == 0]
>
> instance Show Sudoku where
> �show s = unlines [unwords [show $ cell s (x,y) | x <- [1 .. unit s]]
>
> | y <- [1 .. unit s]]
>
> genNums :: Sudoku -> (Int, Int) -> [Int]
> genNums s c@(i,j) = ([1 .. u] \\) . nub $ used
> �where

nub isn't nice. It's quadratic in the length of the list. Use e.g.

map head . group . sort

or

Data.[Int]Set.toList . Data.[Int]Set.fromList

if the type is in Ord (and you don't need the distinct elements in the order they come
in). That gives an O(n*log n) nub with a sorted result.
And (\\) isn't particularly fast either (O(m*n), where m and n are the lengths of the
lists). If you use one of the above instead of nub, you can use the O(min m n) 'minus' for
sorted lists:

xxs@(x:xs) `minus` yys@(y:ys)
| x < y = x : xs `minus` yys
| x == y = xs `minus` ys
| otherwise = xxs `minus` ys
xs `minus` _ = xs

Here, you can do better:

genNums s c@(i,j) = nums
where
nums = [n | n <- [1 .. u], arr!n]
arr :: [U]Array Int Bool
arr = accumArray (\_ _ -> False) True (0,u) used

> � �used = (row s u i j) ++ (col s u i j) ++ (square s sq u i j)
> � �u = unit s

Not good to calculate sq here. You'll use it many times, calculate once and store it in s.

> � �sq = truncate . sqrt . fromIntegral $ u
>
> row s u i j = [cell s (i, y) | y <- [1 .. u]]
>
> col s u i j = [cell s (x, j) | x <- [1 .. u]]
>
> square s sq u i j = [cell s (x, y) | y <- [1 .. u], x <- [1 .. u], f x i, f
> y j] where f a b = div (a-1) sq == div (b-1) sq

Test for f y j before you generate x to skip early.

square s sq u i j = [cell s (ni+x,nj+y) | x <- [1 .. sq], y <- [1 .. sq]]
where
qi = (i-1) `div` sq
qj = (j-1) `div` sq
ni = qi*sq
nj = qj*sq

>
> solve :: Sudoku -> [Sudoku]
> solve s =
> �case holes s of
> � �[] -> [s]
> � �(h:hs) -> do
> � � �n <- genNums s h
> � � �let s' = Sudoku (unit s) ((cells s) // [(h, n)]) hs
> � � �solve s'
>
> main = print . head . solve . read =<< getContents
>
>
> When I compile as such:
>
> $ ghc -O2 --make Sudoku.hs -prof -auto-all -caf-all -fforce-recomp
>
> and run it on the following puzzle:
>
> 0 2 3 4
> 3 4 1 0
> 2 1 4 0
> 0 3 2 1
>
> I get the following profiling report:
>
> � � � �Fri Jan �1 10:34 2010 Time and Allocation Profiling Report �(Final)
>
> � � � � � Sudoku +RTS -p -RTS
>
> � � � �total time �= � � � �0.00 secs � (0 ticks @ 20 ms)

That means the report is basically useless. Not entirely, because the allocation figures
may already contain useful information. Run on a 9x9 puzzle (a not too hard one, but not
trivial either).

Also, run the profiling with -P instead of -p, you'll get more info about time and
allocation then.

Number of times it was 'entered', not quite the same as the number of times it was called.
I think (Warning: speculation ahead, I don't *know* how the profiles are generated) it's
thus:
Say you call a function returning a list. One call, first entry. It finds the beginning of
the list, the first k elements and hands them to the caller. Caller processes these, asks
"can I have more, or was that it?". Same call, second entry: f looks for more, finds the
next m elements, hands them to caller. Caller processes. Repeat until whatever happens
first, caller doesn't ask whether there's more or callee finds there's nothing more (or
hits bottom).

Patrick LeBoutillier

unread,
Jan 5, 2010, 7:31:56 PM1/5/10
to Daniel Fischer, haskel...@haskell.org
On Mon, Jan 4, 2010 at 10:05 AM, Daniel Fischer
<daniel.i...@web.de> wrote:
> Am Montag 04 Januar 2010 02:17:06 schrieb Patrick LeBoutillier:
>
>> Hi,
>
>>
>
>> This question didn't get any replies on the beginners list, I thought
>
>> I'd try it here...
>
> Sorry, been occupied with other things. I already took a look, but hadn't
> anything conclusive enough to reply yet.

No sweat... I didn't mean to be pushy :)

Thanks a lot for all the pointers, they have speeded up my code a lot.

Patrick

--

Jean-Marie Gaillourdet

unread,
Feb 16, 2010, 9:45:41 AM2/16/10
to Daniel Fischer, haskel...@haskell.org
Hi,

Warning: speculation ahead, but is based on my knowledge on other profilers.

Many profilers work statistically, they interrupt a program at more less
random (or equal) intervals and inspect the stack, whick is of course
quite difficult in Haskell as far as I understand it. I have interpreted
the entries column as an indication for the amount of "profile
interrupts" which happened when a function f was on top of the stack,
whatever that means in Haskell.

The manual of GHC 6.10.4, chapter 5 states:

>The actual meaning of the various columns in the output is:
>
>entries
>
> The number of times this particular point in the call graph was
> entered.


So for me the question remains open, is "entries" a precisely counted
value or a statistically determined one?


Best regards,
Jean

Daniel Fischer

unread,
Feb 16, 2010, 10:20:14 AM2/16/10
to Jean-Marie Gaillourdet, haskel...@haskell.org
Am Dienstag 16 Februar 2010 15:45:38 schrieb Jean-Marie Gaillourdet:
> Warning: speculation ahead, but is based on my knowledge on other
> profilers.
>
> Many profilers work statistically, they interrupt a program at more less
> random (or equal) intervals and inspect the stack, whick is of course
> quite difficult in Haskell as far as I understand it. I have interpreted
> the entries column as an indication for the amount of "profile
> interrupts" which happened when a function f was on top of the stack,
> whatever that means in Haskell.
>
> The manual of GHC 6.10.4, chapter 5 states:
> >The actual meaning of the various columns in the output is:
> >
> >entries
> >
> >    The number of times this particular point in the call graph was
> >    entered.
>
> So for me the question remains open, is "entries" a precisely counted
> value or a statistically determined one?

I have one observation that supports "precisely counted", namely, while the
time spent in each cost centre (number of ticks) varies between profiling
runs of the same code, the number of bytes allocated and the number of
entries remain the same.

It's far from conclusive, though. Anybody willing to dig into the profiler
code? 8-)

>
>
> Best regards,
> Jean

Cheers,
Daniel

Johan Tibell

unread,
Feb 16, 2010, 10:53:57 AM2/16/10
to Jean-Marie Gaillourdet, haskel...@haskell.org


I believe it's the latter. I think the RTS uses a periodic SIGALRM to figure
out which function is currently executing and to record that in the profile.
Simon Marlow would know.

0 new messages