I'm playing a little with Haskell and dynamic programming.
I have implemented a lot of problems, but in Fibonacci's case i'm getting some results that are PC dependants and i would like to confirm.
Assume the following implementations:
1)- List:
memoized_fib_list n = fibTab !! n
where fibTab = map fibm [0 ..]
fibm 0 = 0
fibm 1 = 1
fibm n = fibTab !! (n-2) + fibTab !! (n-1)
2)- Array:
memoized_fib_array n = fibTab ! n
where fibTab = listArray (0, n) [mfib x | x <- [0..n]]
mfib 0 = 0
mfib 1 = 1
mfib x = fibTab ! (x - 1) + fibTab ! (x - 2)
Result (with Criterion):
N = 15.000:
List implementation: 171.5 μs
Array implementation: 8.782 ms
N = 100.000:
List implementation: 2.289 ms
Array implementation: 195.7 ms
N = 130.000:
List implementation: 3.708 ms
Array implementation: 410.4 ms
The tests were run on a Notebook with a Core i7 Skylake, 8gb DDR4 and SSD (Ubuntu).
I was expecting the array implementation to be much better, and this was the only problem where the list implementation is better.
Could it be because of the sequential access? On some hardware with lower specs the list implementation has worse performance.
Note: I'm using the last (edit: latest) version of GHC.
Thanks.
Edit:
benchmark n = defaultMain [
bgroup "fibonacci" [
bench "memoized_fib_list" $ whnf (memoized_fib_list) n
, bench "memoized_fib_array" $ whnf (memoized_fib_array) n
]
]
main = do
{
putStrLn "--------------EJECUTANDO BENCHMARK N=40------------------";
benchmark 40;
putStrLn "--------------EJECUTANDO BENCHMARK N=15000---------------";
benchmark 15000;
putStrLn "--------------EJECUTANDO BENCHMARK N=50000---------------";
benchmark 50000;
putStrLn "--------------EJECUTANDO BENCHMARK N=100000--------------";
benchmark 100000;
putStrLn "--------------EJECUTANDO BENCHMARK N=130000--------------";
benchmark 130000;
}
Edit2: I installed Haskell Platform 8.2.2 on my windows 10 PC and got very similar results.
Intel i5 6600K, 16gb DDR4, SSD.
-------------------EJECUTANDO BENCHMARK N=130000------------------------
benchmarking best algo/memoized_fib_list
time 1.818 ms (1.774 ms .. 1.855 ms)
0.993 R² (0.985 R² .. 0.998 R²)
mean 1.853 ms (1.826 ms .. 1.904 ms)
std dev 119.2 μs (84.15 μs .. 191.3 μs)
variance introduced by outliers: 48% (moderately inflated)
benchmarking best algo/memoized_fib_array
time 139.8 ms (63.05 ms .. 221.8 ms)
0.884 R² (0.623 R² .. 1.000 R²)
mean 287.0 ms (221.4 ms .. 353.0 ms)
std dev 83.83 ms (64.91 ms .. 101.6 ms)
variance introduced by outliers: 78% (severely inflated)
Edit3: Some additional information after running criterion with Linear Regression. All the values correspond to the execution with N = 130000.
-Number of garbage collections:
List implementation:
numGcs: NaN R² (NaN R² .. NaN R²)
iters 0.000 (0.000 .. 0.000)
y 0.000 (0.000 .. 0.000)
Array Implementation:
numGcs: 1.000 R² (1.000 R² .. 1.000 R²)
iters 739.000 (739.000 .. 739.000)
y 2.040e-12 (-3.841e-12 .. 2.130e-12)
-Bytes allocated:
List implementation:
allocated: 0.001 R² (0.000 R² .. 0.089 R²)
iters 1.285 (-9.751 .. 13.730)
y 2344.014 (1748.809 .. 2995.439)
Array Implementation:
allocated: 1.000 R² (1.000 R² .. 1.000 R²)
iters 7.586e8 (7.586e8 .. 7.586e8)
y 1648.000 (1648.000 .. NaN)
-CPU cycles:
List implementation:
cycles: 0.992 R² (0.984 R² .. 0.997 R²)
iters 6759303.406 (6579945.392 .. 6962148.091)
y -141047.582 (-4701325.840 .. 4674847.149)
Array Implementation:
cycles: 1.000 R² (NaN R² .. 1.000 R²)
iters 1.729e9 (1.680e9 .. 1.757e9)
y -3311041.000 (NaN .. 6.513e7)
What's happening here is quite simple: with -O2
, GHC decides to make the memoisation-list in memoized_fib_list
global.
$ ghc -fforce-recomp wtmpf-file4545.hs -O2 -ddump-prep
...
Main.memoizedFib_list :: GHC.Types.Int -> GHC.Integer.Type.Integer
[GblId, Arity=1, Str=<S(S),1*U(U)>, Unf=OtherCon []]
Main.memoizedFib_list
= \ (n_sc61 [Occ=Once] :: GHC.Types.Int) ->
GHC.List.!! @ GHC.Integer.Type.Integer Main.main_fibTab n_sc61
...
Main.main_fibTab :: [GHC.Integer.Type.Integer]
[GblId]
Main.main_fibTab
= case Main.$wgo 0# of
{ (# ww1_sc5M [Occ=Once], ww2_sc5N [Occ=Once] #) ->
GHC.Types.: @ GHC.Integer.Type.Integer ww1_sc5M ww2_sc5N
}
...
That means, your criterion benchmark doesn't actually evaluate the fibonacci function repeatedly – it just performs repeated lookups in the same global list. And averaged over many evaluations, this gives a very good score, which is however not representative of how fast the calculation is.
GHC performs this optimisation in the list implementation because you don't need lists of different length – it's always an infinite list of all Fibonacci number. That's not possible in the array implementation, so this can't keep up here.
The simples way to prevent this globalisation would be to make fibm
explicitly dependent on n
, by just trimming it to the needed finite lenght like the arrays are as well.
memoizedFib_list :: Int -> Integer
memoizedFib_list n = fibTab !! n
where fibTab = map fibm [0 ..]
fibm 0 = 0
fibm 1 = 1
fibm n = fibTab !! (n-2) + fibTab !! (n-1)
With this, the list implementation becomes much slower than the array one, as one would expect seeing memo-lookup is O(n) for lists:
$ ghc -fforce-recomp wtmpf-file4545.hs -O2 && ./wtmpf-file4545
[1 of 1] Compiling Main ( wtmpf-file4545.hs, wtmpf-file4545.o )
Linking wtmpf-file4545 ...
--------------EJECUTANDO BENCHMARK N=40------------------
benchmarking fibonacci/memoizedFib_list
time 10.47 μs (10.42 μs .. 10.51 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 10.40 μs (10.35 μs .. 10.44 μs)
std dev 163.3 ns (122.2 ns .. 225.8 ns)
variance introduced by outliers: 13% (moderately inflated)
benchmarking fibonacci/memoizedFib_array
time 1.618 μs (1.617 μs .. 1.620 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 1.620 μs (1.618 μs .. 1.623 μs)
std dev 7.521 ns (4.079 ns .. 12.48 ns)
benchmarking fibonacci/memoizedFib_vector
time 1.573 μs (1.572 μs .. 1.574 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 1.572 μs (1.571 μs .. 1.573 μs)
std dev 2.351 ns (1.417 ns .. 4.040 ns)
--------------EJECUTANDO BENCHMARK N=1500----------------
benchmarking fibonacci/memoizedFib_list
time 18.52 ms (18.41 ms .. 18.68 ms)
1.000 R² (0.999 R² .. 1.000 R²)
mean 18.65 ms (18.53 ms .. 18.84 ms)
std dev 355.1 μs (204.8 μs .. 592.1 μs)
benchmarking fibonacci/memoizedFib_array
time 135.2 μs (131.2 μs .. 140.1 μs)
0.996 R² (0.991 R² .. 1.000 R²)
mean 132.7 μs (131.9 μs .. 135.0 μs)
std dev 4.463 μs (2.024 μs .. 8.327 μs)
variance introduced by outliers: 32% (moderately inflated)
benchmarking fibonacci/memoizedFib_vector
time 131.8 μs (130.6 μs .. 133.2 μs)
0.999 R² (0.999 R² .. 1.000 R²)
mean 132.5 μs (131.4 μs .. 134.1 μs)
std dev 4.383 μs (3.463 μs .. 5.952 μs)
variance introduced by outliers: 31% (moderately inflated)
Vector
which I also tested here performs yet a bit faster, but not really significantly. I think as soon as you use a container with O(1) lookup, the performance is dominated by the additions of the pretty huge numbers, so you're really benchmarking GMP rather than anything Haskell has to do with.
import qualified Data.Vector as V
memoizedFib_vector :: Int -> Integer
memoizedFib_vector n = fibTab V.! n
where fibTab = V.generate (n+1) mfib
mfib 0 = 0
mfib 1 = 1
mfib x = fibTab V.! (x - 1) + fibTab V.! (x - 2)
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments