I wrote two MIPS assembly programs to evaluate the expression below:
R(n) = SUM(i = 1 to n) { (i+2)/(i+1 - 1/i) - i/(i + 1/i) }
Program 1 computes the whole expression inside a single loop: it evaluates each term (i+2)/(i+1 - 1/i) - i/(i + 1/i) and adds it to a running sum.
Program 2 first sums the first part, (i+2)/(i+1 - 1/i), in one loop, then sums the second part, i/(i + 1/i), in another loop, and finally subtracts the two sums.
Here are the results of the two programs for different values of n:
Program 1:
N Result
-----------
10 5.07170725
100 7.41927338
1000 9.72636795
10000 12.02908134
100000 14.33149338
1000000 16.63462067
Program 2:
N Result
---------
10 5.07170773
100 7.41923523
1000 9.72259521
10000 12.31250000
100000 8.61718750
1000000 6.50000000
Program 1 gives the more accurate results (compared with Wolfram Alpha's values for R(n)). Why does Program 2 give such odd results for large values of n? I suspect this is a floating-point precision issue.
Note: I am using single-precision numbers.
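Say you have un = an - bn and you want sum(un).
an -> 1 when n -> infinity, so the sum of the first P terms is roughly P + cte_a, where cte_a is small compared with P. The same holds for bn: its sum is roughly P + cte_b.
When you take the difference of the two, (P + cte_a) - (P + cte_b), you should mathematically recover sum(un).
But with floating point that's not what happens, because (P + cte_a) is rounded to the nearest float. The bigger P is, the further float(P + cte_a) - float(P) drifts from cte_a: a single-precision float holds only about 7 significant decimal digits, and the digits spent on P are no longer available for cte_a.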
To convince yourself, try evaluating these expressions in single precision:
10.0f+0.1f-10.0f
100.0f+0.1f-100.0f
...
1.0e7f+0.1f-1.0e7f
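For readers without a C or MIPS toolchain at hand, here is a minimal Python sketch of the same experiment. Python floats are doubles, so the helper `f32` (a name made up for this example) rounds each intermediate result to single precision through `struct`. For one addition or subtraction this should match what real single-precision hardware produces. `recovered_increment` is likewise a hypothetical name.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def recovered_increment(big, small=0.1):
    """Evaluate (big + small) - big, rounding every step to single precision."""
    big32 = f32(big)
    small32 = f32(small)
    return f32(f32(big32 + small32) - big32)


if __name__ == "__main__":
    # Mirrors 10.0f+0.1f-10.0f, 100.0f+0.1f-100.0f, ..., 1.0e7f+0.1f-1.0e7f
    for exponent in range(1, 8):
        big = 10.0 ** exponent
        print(f"{big:>12.1f}  ->  {recovered_increment(big):.8f}")
```

At 1.0e7 the spacing between adjacent single-precision values is 1.0, so the 0.1 is absorbed completely and the expression evaluates to exactly 0.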
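un behaves like 1/n when n -> infinity, so the running sum in Program 1 grows only logarithmically and stays small. Each new term is rounded against a much smaller sum than in Program 2, which is why Program 1 does better.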
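The difference between the two summation strategies can be explored with a rough simulation. This is only a sketch under assumptions: the original assembly is not shown, so the order of operations inside `terms` is a guess, and the output is not expected to reproduce the figures in the question digit for digit. The function names (`f32`, `terms`, `fused_sum`, `split_sum`, `reference_sum`) are all made up for this example. Every intermediate result is rounded to single precision, and a double-precision sum serves as the reference.

```python
import math
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def terms(i):
    """Return (a_i, b_i), rounding each operation to single precision.

    The operation order is an assumption; the original MIPS code may differ.
    """
    x = f32(i)
    inv = f32(1.0 / x)
    a = f32(f32(x + 2.0) / f32(f32(x + 1.0) - inv))  # (i+2)/(i+1 - 1/i)
    b = f32(x / f32(x + inv))                         # i/(i + 1/i)
    return a, b


def fused_sum(n):
    """Program 1 style: accumulate a_i - b_i in a single running sum."""
    total = 0.0
    for i in range(1, n + 1):
        a, b = terms(i)
        total = f32(total + f32(a - b))
    return total


def split_sum(n):
    """Program 2 style: accumulate a_i and b_i separately, subtract at the end."""
    sum_a = 0.0
    sum_b = 0.0
    for i in range(1, n + 1):
        a, b = terms(i)
        sum_a = f32(sum_a + a)
        sum_b = f32(sum_b + b)
    return f32(sum_a - sum_b)


def reference_sum(n):
    """Double-precision reference, summed with math.fsum."""
    return math.fsum(
        (i + 2) / (i + 1 - 1 / i) - i / (i + 1 / i) for i in range(1, n + 1)
    )


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        ref = reference_sum(n)
        print(n, fused_sum(n) - ref, split_sum(n) - ref)
```

The printed columns are the errors of the fused and split sums relative to the double-precision reference. In the split version both partial sums grow like n, so once they pass a few thousand, the small 1/i part of each new term falls below the rounding granularity of the running sum and is lost.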