Friday, November 6, 2015

Small Prime Sieve in Ada (3 versions)

An explanation of the algorithm here can be found in a blog entry on the Sieve of Eratosthenes in my math and English blog, and in the previous blog entry, along with my reasons for doing this.

But that doesn't explain why I would do this three times.

Ada does not have a stepped loop. Altering the value of the index of a counted loop is considered dangerous, even if it is only to step the counter. Also, terminating a counted loop is considered difficult. So the counted loop is only present in forms that specify a range or a set of objects to count through. The programmer is to leave the details of setting up the count and counting through it to the compiler.

I understand the theory. I'll use strong language about this elsewhere, but I strongly disagree.

That said, there is one way to conform with the design of Ada. That is to set up a count (an artificial count in this case) and multiply by the step:


-- Archetypical implementation of the sieve of eratosthenes in Ada, with index multiply
-- By Joel Rees, Amagasaki, Japan, 2015
-- All rights reserved.
-- Permission granted by the author to use this code
-- for any purpose, 
-- on condition that substantial use 
-- shall retain this copyright and permission notice.
--
-- This is much faster than the modular division version, 
-- but reveals the effects of the tight semantics of Ada.
-- Still not anywhere close to the speed of the C version.
--
-- (With MAXSIEVE set to 131072:
-- me@fun:~/work/mathgames/sieve$ time ./adasieve >adaout.text
--
-- real    0m9.386s
-- user    0m1.112s
-- sys    0m7.948s
-- me@fun:~/work/mathgames/sieve$ time ./sieve_c > cout.text
--
-- real    0m0.148s
-- user    0m0.112s
-- sys    0m0.032s
--
-- With MAXSIEVE set to 1048576 and the output commented out:
-- me@fun:~/work/mathgames/sieve$ time ./adasieve 
-- 
-- real    0m0.125s
-- user    0m0.116s
-- sys    0m0.004s
-- me@fun:~/work/mathgames/sieve$ time ./sieve_c 
-- 
-- real    0m0.099s
-- user    0m0.096s
-- sys    0m0.004s
--
-- .. the latter of which is about the expected result.)
--
-- I read somewhere that it's the recommended approach.
--

with Ada.Text_IO; -- use Ada.Text_IO;    -- #include <stdio.h>
-- use would make it more like C, but I'm having trouble finding Integer_IO.
-- #include <stdlib.h> // We aren't using EXIT_SUCCESS

procedure AdaSieve is
    MAXSIEVE: constant := 256;
    MAXCANDIDATE: constant := MAXSIEVE - 1;
    FINALPASS: constant := ( ( MAXSIEVE + 1 ) / 2 );

    type Integer_Type is range 0 .. MAXCANDIDATE;
    subtype Candidate_Type is Integer_Type range
      2 .. Integer_Type'Last;

    package TextIO renames Ada.Text_IO;
    package IntIO is new Ada.Text_IO.Integer_IO( Integer_Type );
    sieve: array ( 2 .. MAXCANDIDATE ) of Boolean;
    -- index, prime: Integer_Type; -- loop variables
    -- This is the recommended approach!
    limit: Integer_Type;

begin
    -- sieve[ 0 ] := 0;
    -- sieve[ 1 ] := 0;
    for index in sieve'Range loop
        sieve( index ) := True;
    end loop;

    for prime in 2 .. FINALPASS loop
        if sieve( prime ) = True then
            limit := Integer_Type( Standard.Integer( Integer_Type'last ) / prime ); -- (gulp!)
            -- index check failed:
            -- limit := Integer_Type( ( Standard.Integer( Integer_Type'last ) + 1 ) / prime );
            for index in 2 .. Integer( limit ) loop 
-- TextIO.Put( "clearing index " ); -- debug
-- IntIO.Put( Item  => Integer_Type( index ) ); -- debug
-- TextIO.Put( " element " ); -- debug
-- IntIO.Put( Item  => Integer_Type( index * prime ) ); -- debug
-- TextIO.Put_Line( "." ); -- debug
                sieve( index * prime ) := False; 
            end loop;
        end if;
    end loop;

    
    IntIO.Put( Item  => 0 );
    TextIO.Put_Line( ": is not prime." );
    IntIO.Put( Item  => 1 );
    TextIO.Put_Line( ": is not prime." );
    for index in sieve'Range loop
        IntIO.Put( Item  => Integer_Type( index ),
                   Width => IntIO.Default_Width,
                   Base  => IntIO.Default_Base );
        TextIO.Put( ": is " );
        if not sieve( index ) then
            TextIO.Put( "not " );
        end if;
        TextIO.Put_Line( "prime." );
    end loop;
    -- return EXIT_SUCCESS;
end AdaSieve;



This approach is useful for walking through the bitbuckets in certain hash implementations, but I don't consider it a natural fit for this algorithm. There is no particular motivation for the division that tells how many times the loop must run, and the multiplication, while much less time consuming than division, is not as cheap as the add.

(You'll note the debugging prints I used to make sure the count was surviving the typecasting.)


Then there is a way to cheat and use the natural range, but use modular division to skip the steps where the loop should not execute. You get punished for using this one by rather slow running time:


-- Archetypical implementation of the sieve of eratosthenes in Ada, with index modular divide
-- By Joel Rees, Amagasaki, Japan, 2015
-- All rights reserved.
-- Permission granted by the author to use this code
-- for any purpose, 
-- on condition that substantial use 
-- shall retain this copyright and permission notice.
--
-- The modular division in the inner loop makes this really, really slow.
--

with Ada.Text_IO; -- use Ada.Text_IO;    -- #include <stdio.h>
-- use would make it more like C, but I'm having trouble finding Integer_IO.
-- #include <stdlib.h> // We aren't using EXIT_SUCCESS

procedure AdaSieve is
    MAXSIEVE: constant := 256;
    MAXCANDIDATE: constant := MAXSIEVE - 1;
    FINALPASS: constant := ( ( MAXSIEVE + 1 ) / 2 );

    type Integer_Type is range 0 .. MAXCANDIDATE;
    subtype Candidate_Type is Integer_Type range
      2 .. Integer_Type'Last;

    package TextIO renames Ada.Text_IO;
    package IntIO is new Ada.Text_IO.Integer_IO( Integer_Type );
    sieve: array ( 2 .. MAXCANDIDATE ) of Boolean;
    -- index, prime: Integer_Type; -- loop variables

begin
    -- sieve[ 0 ] := 0;
    -- sieve[ 1 ] := 0;
    for index in sieve'Range loop
        sieve( index ) := True;
    end loop;

    for prime in 2 .. FINALPASS loop
        if sieve( prime ) = True then
            for index in prime + prime .. MAXCANDIDATE loop 
                if index mod prime = 0 then
                    sieve( index ) := False;
                  end if;
            end loop;
        end if;
    end loop;

    
    IntIO.Put( Item  => 0 );
    TextIO.Put_Line( ": is not prime." );
    IntIO.Put( Item  => 1 );
    TextIO.Put_Line( ": is not prime." );
    for index in sieve'Range loop
        IntIO.Put( Item  => Integer_Type( index ),
                   Width => IntIO.Default_Width,
                   Base  => IntIO.Default_Base );
        TextIO.Put( ": is " );
        if not sieve( index ) then
            TextIO.Put( "not " );
        end if;
        TextIO.Put_Line( "prime." );
    end loop;
    -- return EXIT_SUCCESS;
end AdaSieve;



This is actually a useful pattern. There are numerous algorithms in which one examines every member of a set or array in turn, performing some calculation to determine how to process the element under consideration.


Then there is a work-around, manufacturing the counted loop using a while. Yes, if you can handle counted loop termination, you also know how to manufacture a counted loop with a general pre-test loop:


-- Archetypical implementation of the sieve of eratosthenes in Ada, with while loop
-- By Joel Rees, Amagasaki, Japan, 2015
-- All rights reserved.
-- Permission granted by the author to use this code
-- for any purpose, 
-- on condition that substantial use 
-- shall retain this copyright and permission notice.
--
-- Timing results with MAXSIEVE set to 4194304 and output commented out:
-- me@fun:~/work/mathgames/sieve$ time ./adasieve
--
-- real    0m0.464s
-- user    0m0.448s
-- sys    0m0.012s
-- me@fun:~/work/mathgames/sieve$ time ./sieve_c
--
-- real    0m0.457s
-- user    0m0.436s
-- sys    0m0.020s
--
-- This is what we should expect for a running time.
--

with Ada.Text_IO; -- use Ada.Text_IO;    -- #include <stdio.h>
-- use would make it more like C, but I'm having trouble finding Integer_IO.
-- #include <stdlib.h> // We aren't using EXIT_SUCCESS

procedure AdaSieve is
    MAXSIEVE: constant := 256;
    MAXCANDIDATE: constant := MAXSIEVE - 1;
    FINALPASS: constant := ( ( MAXSIEVE + 1 ) / 2 );

    type Integer_Type is range 0 .. MAXCANDIDATE;
    subtype Candidate_Type is Integer_Type range
      2 .. Integer_Type'Last;

    package TextIO renames Ada.Text_IO;
    package IntIO is new Ada.Text_IO.Integer_IO( Integer_Type );
    sieve: array ( 2 .. MAXCANDIDATE ) of Boolean;
    windex: Integer; -- Not Candidate_Type, to use it as array index?

begin
    -- sieve[ 0 ] := 0;
    -- sieve[ 1 ] := 0;
    for index in sieve'Range loop
        sieve( index ) := True;
    end loop;

    for prime in 2 .. FINALPASS loop
        if sieve( prime ) = True then
            -- Manufacturing the for loop with while
            windex := prime + prime;
            while windex < MAXSIEVE loop 
                sieve( windex ) := False;
                windex := windex + prime;
            end loop;
        end if;
    end loop;

    
    IntIO.Put( Item  => 0 );
    TextIO.Put_Line( ": is not prime." );
    IntIO.Put( Item  => 1 );
    TextIO.Put_Line( ": is not prime." );
    for index in sieve'Range loop
        IntIO.Put( Item  => Integer_Type( index ),
                   Width => IntIO.Default_Width,
                   Base  => IntIO.Default_Base );
        TextIO.Put( ": is " );
        if not sieve( index ) then
            TextIO.Put( "not " );
        end if;
        TextIO.Put_Line( "prime." );
    end loop;
    -- return EXIT_SUCCESS;
end AdaSieve;



Even though the loop is a while loop instead of a for loop, this most closely matches my original C source.

The FORTH implementation can be found here.

[You might also be interested in the Ada language sieve entry in Rosetta Code.]

No comments:

Post a Comment