Tuesday, November 29, 2016

Using GMP to Approximate π

Okay, now I have
I guess I can now use the method as a way to show a little about how to use the Gnu Multi-Precision library.


The Gnu Multi-Precision library, GMP:

Here are
If your native language is not English, the wikipedia English wikipedia article has links to wikipedia articles in other languages. It's not much beyond an overview, but it should help a little, and those pages link some non-English resources.

Wish I had time to write this all in Japanese, or even to try to figure out how to annotate what I've written properly.

Once you get used to GMP, you'll probably find MPFR and MPC quite useful, as well. Probably in that order. (MPC builds new functionality onto MPFR which builds new functionality on GMP, if I read the docs correctly.) These add flexibility and methods of dealing with precision issues, and functions for complex numbers.

Note that GMP includes rational arithmetic -- explicit fractions!

Also note that the manual strongly suggests that, for what I'm doing here with multi-precision math, the MPFR library is more complete. If I were attempting a formal discussion of precision, yes, I would go with MPFR. But I'm not, and we need to start somewhere that's not too hard, so I'm starting with GMP.)



First, I converted the program I wrote using C's standard floating point math to the gmp libraries. The result is here:
https://osdn.net/users/reiisi/pastebin/4462

[JMR201705122219:

I should note that the program is not an efficient or fast way to calculate π. It's more of an easy demonstration of overly simplified use of multi-precision arithmetic. And, even there, I don't track the precision very carefully.

]

You can refer to my blog posts above for general explanations of what the program does. And comments in the source code should explain much of the differences between C's standard floating point math and the handling of GMP math, but I should probably add a few notes:

In the comments, I pretty much imply that mpf_t is a pointer. It isn't. Sorry. But having already pasted that into the paste buffer, I'll satisfy myself with an explanation here.

The type is a small structure containing a pointer, as is partially explained in the manual entry on memory management. Either way, you have to initialize it before you use it, as you see in the code. (You do have the code open in a text editor, or in another browser window, don't you?) 

C++ should (I think) allow using the multi-precision math almost the same as the standard floating point and integer types. I haven't tried, there may be some tricky stuff in loop syntax or such. I'm not a fan of C++, or of the idea of attempting to fold all program functionality onto existing syntax. This may have something to do with my coming from a FORTH background, where inventing new syntax on-the-fly doesn't interfere that much with the day-to-day workflow. (But it may create design nightmares down the road, and that's-a-different-story-sort-of.)

So using the GMP variables requires resigning yourself to C's function call syntax. But C makes that rather convenient, so it shouldn't be a problem, really.

The way they have set it up here, it requires the initialization to be separated from the declaration. Thus, all the variables of types provided by GMP that need to be accessible in the global scope of the main function are declared there at the top of the main function.

(I believe that C11 or somewhere says we should be able to declare them close to their first use, but I'm sticking with the most common idiom here.)

With all the globals declared, we are free to start ordinary procedural stuff, so the first thing I do is set the precision. The function call provided sets a precision that gives at least the number of bits of precision specified, including rounding. (MPFR allows you to be a little more precise so that you can control the loss of precision yourself.)
mpf_set_default_prec ( 67 )
ensures that we have at least 67 bits of precision. I want this to mimic (sort of) the default precision of bc when you call it as "bc -l" and load in the default math library.

You should want to know how I chose that 67 bits, so I'll tell you.

One binary digit (also called a "bit" -- binary digit) allows you to represent two numbers, usually interpreted as zero and one. That's not enough to represent decimal numbers.

Two bits give you zero to three, which is great for base four. Three bits give you zero to seven which is great for base eight, or octal. Four bits give you zero to fifteen, great for base sixteen, or hexadecimal.

So, if we wanted twenty digits of base sixteen precision, 20 ✕ 4 or eighty bits would be enough. (And we could set it at eighty without causing too many issues, really.)

Since you can also represent zero to nine in four bits, four bits is enough for one digit of binary coded decimal. And twenty binary coded decimal (BCD) would also take eighty bits.

But GMP is not doing BCD. It's doing binary math. Eighty bits of integer binary gives you the ability to represent from 0 to (280 - 1) That's
00000000000000000000000000000000000000000000000000000000000000000000000000000000 
to
11111111111111111111111111111111111111111111111111111111111111111111111111111111
That is easy to convert to hexadecimal. Each group of four binary 0s is one hexadecimal 0, and each group of four binary 1s is one hexadecimal "F". So, in hexadecimal, that's
0x00000000000000000000
to
0xFFFFFFFFFFFFFFFFFFFF
which probably doesn't look very meaningful to you, unless you play with this stuff for fun like I sometimes do. How can we convert that to decimal numbers that we feel comfortable with? Of course!
me@mymachine:~/games/piArea$ bc -l
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
10^20
100000000000000000000
2^80-1
1208925819614629174706175
so it's from 0 to 1,208,925,819,614,629,174,706,175 (using the US centric notation). So, if we plug in 20 digits of 9s and output that in binary, we can count the number of bits. Uggh. ;-)
obase=2
99999999999999999999
1010110101111000111010111100010110101100011000011111111111111111111
But, wait, there's a better way!

The common logarithm (log base 10, or log10) of a number is strongly related to the number of decimal digits that it takes to write:
log10(10000) == 4
The base two logarithm is strongly related to the number of binary digits it takes to write:

log2(16) == 4
If only bc could give us a way to calculate log10 and log2. :)

bc gives us the natural logarithm (ln) in the form of "l()"! And any logb is related to any other logc like this:
logb(x) = logc(x) / logc(b)
So, let's set the output base back to ten and define log2
obase=10
define l2(x) { return l(x)/l(2); }
l2(16)
4.00000000000000000002
l2(10^20)
66.43856189774724695808
If you're using a reasonably modern bc, you'll probably want to call the function "log2" instead of "l2", since "l2" looks like "12":
define log2(x) { return l(x)/l(2); }
l2(10^20)
66.43856189774724695808 
Having four tenths of a bit is kind of hard to do, so let's just say 67 instead of 66.4 bits, okay? And GMP gives us some extra bits and, by default, takes care of 4/5 rounding for us, which is a little more convenient here than what bc does. (We'll expect some difference in results at some point.)


Okay, now that the default precision is set, let's initialize all our variables. (We could have explicitly declared the precision on each initialization, but that would not have saved us the long explanation.)

We should note here that, if this program were more complex, especially if we were using GMP variables as local variables to functions other than main(), we would need to remember to clear them after use in order to avoid memory leaks. (I probably should have cleared them explicitly here to emphasize that.)

I avoid this allocation/deallocation problem here by declaring xsquared, ysquared, y, and deltaA in the outer block of main() rather than in the block local to the for loop, as I had in the native C code. (More on those, later.)

The initialize from string functions are really convenient:
mpf_init_set_str( mpf_t variable, char value[], int base )
so
mpf_init_set_str(deltaX, "0.025", 10);
initializes deltaX with 1/40 at 67 bits of precision (interpreting 0.025 as a base ten fraction).

When we have the value in one mpf_t variable, we can use that value directly in further initializations, as in
mpf_init_set( x, lastX );
And, if we really don't need to specify a value, just want to set the variables up for later use, we can do that, too:
mpf_init( deltaA );
These variables which I declared without initial values are the ones I'm using in the for loop, by the way.

If there are command line parameters (argc > 1), we have to convert the value for deltaX from the parameter. But the parameter is in a string, so that's straightforward:
mpf_set_str( deltaX, argv[ 1 ], 10 )
It's safer to be sure we got a good conversion when we are using unlimited precision, so I am checking the return value for indication of an error now. In the native C version, I figured there would be no harm in a bad conversion, but it probably would have been better to check.

Moving on to the interesting stuff, the for loop is not impacted as much as we might fear: 
for ( ; mpf_cmp( x, limit ) <= 0; mpf_add( x, x, deltaX ) )
The loop counter initialization here was completed explicitly when the variable was initialized, so I left it out of the loop. (It's just not a good idea to leave variables lying around declared but uninitialized. Unless you enjoy working with hidden bugs.)

The comparison expression
mpf_cmp( x, limit ) <= 0
is similar to saying
x <= limit
with standard C floating point types. mpf_cmp( a, b ) returns
  • an integer value less than zero (negative) if a is less than b, 
  • integer zero if a == b (something we should not expect or depend on),
  • and an integer greater than zero (positive) if a is greater than b.

Thus, the for loop continues until the value of x exceeds the limit.


The function call
mpf_add( x, x, deltaX )
is essentially the same thing as saying
x = x + deltaX
would be for standard C floating point types (except it does not yield the value assigned as a return value), thus, it means "increment x by deltaX".


The function parameters are listed in the order they would be written in our ordinary (modern infix) algebraic expressions, so, for each of add, mul (multiply), sub (subtract), and div (divide), the parameters are the same:
mpf_sub( mpf_t target, mpf_t left_op, mpf_t right_op )
Target gets the result, and the right op is added to, subtracted from, multiplied with, or divided into the left, as if it were written
target = left_op - right_op
in the case of mpf_sub, and the same with the rest. Again, an important difference is that these functions do not return a result, so we can't nest them and we have to deal with intermediate results ourselves.

Now you know why I have xsquared and deltaA as variables in the GMP version. We need some place to put the intermediate values. Thus
ysquared = 1.0 - ( x * x );
becomes
mpf_mul( xsquared, x, x );
mpf_ui_sub( ysquared, 1L, xsquared );
and
area += deltaX * y;
becomes
mpf_mul( deltaA, deltaX, y );
mpf_add( area, area, deltaA );
I'll repeat myself here -- I did not want xsquared, ysquared, y, and deltaA going out of scope in the loop, so I declared and initialized them in the outer block of main().

If I had a really good reason to keep them local to the loop block, I'd have needed to (re-)initialize them at the top of the loop and clear them at the bottom, to be sure the variables would be valid (and non-segment-faulting) and to keep memory from leaking. That takes enough extra time that I'd prefer to avoid it in a loop like this. And, of course, it gives more opportunity for bugs to creep in.

(In practice, I don't know of a compiler that would deliberately move loop local variables from iteration to iteration of a simple loop like this, but depending on that behavior is depending on something that is not promised in the standard. Not a good idea, and you'll forget to do it right in some places it really will matter.)

Finally, we can celebrate that the GMP provides us with a version of printf that handles gmp variables. Making your own output functions is fun, but sometimes you don't need that kind of fun.

And that's about it. Play with the code and have good kinds of fun!

(I may use this as an excuse to show some ways to parse command line parameters, but, if I do, it will be in another post.)

Friday, November 18, 2016

Using π to Check Your Floating Point Accuracy

In my post on using areas of rectangles to approximate π in C, I blithely asserted that standard C floating point math would lose accuracy compared to bc. But I didn't explain how I came to that conclusion.

I'll show you a little bit about how I got there, tiptoeing around some of the steps where it's easy to get lost.

First, refresh your memory about the bc program:
And here's how we'll call the program from the command line:
echo "q=1;m=32;scale=41;" | cat - halfpiArea_by_m.bc | bc -l
The name for the number of rectangles, "m" was rather awkward, but "q" for quiet should be easy to remember.

I suppose you may be wondering how that works.

First, echo just echoes what it is given into the standard output stream.

Second, the output stream of echo is piped to the standard input stream for cat and the command line tells cat to continue with the contents of the source code of the bc program. This way, cat concatenates things so that the result is a program that sets q, m, and scale appropriately before it begins:
q=1;m=32;scale=41;
/* ... */
if ( m < 1 ) { m = 11; }
s=0;
w = 1/m;
Unless we set them first, q, m, and d start at zero, and scale starts at 20 when we give bc the "-l" option.

So cat assembles the lines to set q, m, and scale with the program source and sends it all to the standard output stream.

Then the command line pipes the output of "cat" to "bc -l", and the program starts with scale, q, and m set to 41, 32, and 1.

Setting m changes the number of rectangles to divide the circle into, thus changing the width of the rectangles. We'll change that and examine the resulting attempt to calculate π.

The C program, halfpiArea, accepts a width as a parameter.

(That was not the best choice. You can change it if you want. I'm using the program as it is.)

The width is the inverse of the number of rectangles because the radius of the circle is 1. So we can calculate the width using bc:
echo "1/32" | bc -l
which tell us that one thirty-second is .03125000000000000000 (0.03125). We can select the decimal fraction that bc gives us on the terminal screen and copy it. In a Linux OS shell, we can usually use right-click after selecting the text. In a Mac OS X shell, the usual copy keyboard shortcut works.

In MSWindows, I think you will have to click the application icon in the left of the title bar and select "Edit" from the menu that pops up, then select "Copy" from the submenu. ;-*

Then we can run the C program by typing the program name and pasting the fraction in after as follows:
./halfpiArea .03125000000000000000
Except we want to run the bc program after the C program, so that we can get the output of both on the terminal screen at the same time:



Now we can select the output estimates for π and copy them to a text editor, where we can compare them at leisure.

We will notice that the two programs produce nearly the same results for counts of rectangles that are powers of two. Two rectangles, for example:
jwr@fun:~/work/mathgames/piArea$ ./halfpiArea .5
(             -0.5,0.866025403784439): 0.433012701892219
(                0,                1): 0.933012701892219
(              0.5,0.866025403784439):  1.36602540378444
(                1,                0):  1.36602540378444
Almost Pi:  2.73205080756888
jwr@fun:~/work/mathgames/piArea$ echo "q=1;m=2;scale=41;" | cat - halfpiArea_by_m.bc | bc -l
3: (1.00000000000000000000000000000000000000000, 0): 1.3660254037844\
3864676372317075293618347140
2.73205080756887729352744634150587236694280
These are the same, except for the last digit of the C program, which is actually correctly rounded, rather than just different:
2.73205080756888
2.73205080756887729352744634150587236694280
Similarly, 2048 rectangles:
jwr@fun:~/work/mathgames/piArea$ echo "1/2048" | bc -l
.00048828125000000000
jwr@fun:~/work/mathgames/piArea$ ./halfpiArea .00048828125000000000
(   -0.99951171875,0.0312461850698753): 1.52569263036501e-05
...
(                1,                0):  1.57078998270572
Almost Pi:  3.14157996541144
jwr@fun:~/work/mathgames/piArea$ echo "q=1;m=2048;scale=41;" | cat - halfpiArea_by_m.bc | bc -l
2049: (1.00000000000000000000000000000000000000000, 0): 1.5707899827\
0572260019288624147880158332158
3.14157996541144520038577248295760316664316
which are the same to the last digit of the C output:
3.14157996541144
3.14157996541144520038577248295760316664316
In fact, the C program is well behaved all the way to 216:
jwr@fun:~/work/mathgames/piArea$ echo "2^16" | bc -l
65536
jwr@fun:~/work/mathgames/piArea$ echo "1/(2^16)" | bc -l
.00001525878906250000
jwr@fun:~/work/mathgames/piArea$ ./halfpiArea .00001525878906250000
Almost Pi:  3.14159258349584
jwr@fun:~/work/mathgames/piArea$ echo "q=1;m=2^16;scale=41;" | cat - halfpiArea_by_m.bc | bc -l
65537: (1.00000000000000000000000000000000000000000, 0): 1.570796291\
74791608251698437807118686209362
3.14159258349583216503396875614237372418724
jwr@fun:~/work/mathgames/piArea$ echo "q=1;m=2^16;scale=81;" | cat - halfpiArea_by_m.bc | bc -l
65537: (1.0000000000000000000000000000000000000000000000000000000000\
00000000000000000000000, 0): 1.5707962917479160825169843780711868627\
48822128541338008872254934835925225596242002
3.141592583495832165033968756142373725497644257082676017744509869671\
850451192484004
(Going to scale=81 did take a minute or four. I should time it:
scale=81
a(1)*4
3.141592653589793238462643383279502884197169399375105820974944592307816406286208996
Okay, three minutes, forty seconds.)

Here are the estimates:

3.14159258349584
3.14159258349583216503396875614237372418724
3.141592583495832165033968756142373725497644257082676017744509869671850451192484004 
The C program produces the same as the bc program to the C program's second-to-the-last digit.

Note that the bc program is showing precision loss out in the 37th digit. Again, remember why, from the fly-in-the-pi post.

And, just as a reminder (in bc -l),
scale=81
a(1)*4
3.141592653589793238462643383279502884197169399375105820974944592307816406286208996
So, even with all this effort, we have only calculated π to the sixth digit past the decimal point, which is also expected, see the theory post.

That's not bad results, really, but if we don't choose the number of rectangles carefully, the C program output starts varying from the bc program quite quickly.

Ten, 100, and 1000 are good, but 10,000 is not. 11 is good, but 23 is not. Try a few, and see if you can guess what numbers work better and which work worse, and why.

Now, if our only purpose were to calculate π to six decimal digits, we'd be satisfied with using 216 for the count and be satisfied with standard C floating point math.

But, of course, that's not the real purpose.

What we are really talking about here is a way to get good solutions to equations involving integral calculus. (Was I being sneaky for not warning you in advance?)

So we do care about the differences, sometimes. We can't always select the convenient counts of rectangles to estimate with. Which is why I'm going to demonstrate using the Gnu Multi-Precision library real-soon-now.