Thinking in scheme / racket

151 views
Skip to first unread message

Bob Heffernan

unread,
Jul 9, 2019, 8:09:04 AM7/9/19
to racket...@googlegroups.com
Dear all,

I recently wanted to count the number of primes in the sequences 2^n+3
and 2^n-3 (and a few more besides) where n is a positive integer.

After a while I realised that I had no real idea how to do this in racket /
scheme. I think part of my problem is that I really think of the problem in an
imperative way, i.e. "for each n check whether 2^n+3 is prime and if so add one
to a count".

The program at the bottom of my email is what I came up with. I hope it is
clear. I think a more idiomatic racket approach would be to use the for
looping construct but I couldn't quite figure out how to make this do what I
wanted.

I can imagine improving the below by modifying the count-primes-in-seq
procedure to take several "formulas" so I can deal with several sequences at
once. However, the code below is quite slow even just for one sequence so I
imagine that a different approach entirely is what is needed.

So, my question is this: how would you more experienced racketeers write code
for something like this?

I quickly threw together a naive program in Python (a language I'm not too
familiar with either) and it was faster to write and runs quicker. However, to
be quite honest, I am fonder of racket and would rather just get better at
writing racket code for the little things I need to do.

Any guidance is appreciated.

#lang racket

(require math/number-theory)

(define (count-primes-in-seq formula bound)
(let loop ([n 0]
[c 0])
(let ([k (formula n)])
(cond
[(> n bound)
c]
[(and (positive? k) (prime? k))
(loop [+ n 1] [+ c 1])]
[else
(loop [+ n 1] c)]))))

(define bound 1000)

; Primes of the form 2^n-3
(count-primes-in-seq
(λ (n) (- (expt 2 n) 3))
bound)

; Primes of the form 2^n+3
(count-primes-in-seq
(λ (n) (+ (expt 2 n) 3))
bound)


Regards,
Bob Heffernan

Daniel Prager

unread,
Jul 9, 2019, 8:51:23 AM7/9/19
to Bob Heffernan, Racket Users
Hi Bob

Here's how I'd write your function using for*/sum:

(define (count-primes-in-seq/2 f bound)
  (for*/sum ([n (in-range bound)]
             [k (in-value (f n))]
             #:when (and (positive? k) (prime? k)))
    1))


Performance is similar to your original.

Dan

Bob Heffernan

unread,
Jul 9, 2019, 12:03:27 PM7/9/19
to Daniel Prager, Racket Users
Daniel,

Thank you.
The piece of the puzzle I was missing was in-value.

Your version is much easier to read than mine. It is also easy to
modify it to get a list of the primes that occur, which is nice.

Regards,
Bob

Daniel Prager

unread,
Jul 9, 2019, 4:41:48 PM7/9/19
to Bob Heffernan, Daniel Prager, Racket Users
Hi Bob

My pleasure. 

(in-value ...) is a very neat facility that I suspect could do with a couple of examples in the Racket Guide and Reference to highlight its utility and application.

Dan

Maciek Godek

unread,
Jul 10, 2019, 5:46:45 AM7/10/19
to Racket Users

W dniu wtorek, 9 lipca 2019 14:09:04 UTC+2 użytkownik Bob Heffernan napisał:
Dear all,

I recently wanted to count the number of primes in the sequences 2^n+3
and 2^n-3 (and a few more besides) where n is a positive integer.


Hi!

A while ago, I wrote a booklet which used almost the same problem to introduce to, what you called nicely in the title of this thread, "thinking in Scheme", so if you're interested, you may want to check out the first chapter ("Introduction"):

 
HTH
Panicz

Bob Heffernan

unread,
Jul 10, 2019, 9:36:32 PM7/10/19
to Maciek Godek, Racket Users
On 19-07-10 02:46, Maciek Godek wrote:
> A while ago, I wrote a booklet which used almost the same problem to
> introduce to, what you called nicely in the title of this thread, "thinking
> in Scheme", so if you're interested, you may want to check out the first
> chapter ("Introduction"):

Maciek,

Thank you for your reply.

I skimmed through your booklet some time ago. (I too dislike the R
language, although some attempts have been made in recent years to make
it less awful). I read through the introduction again this morning and
enjoyed it.

The reason I like racket (and scheme-like languages in general) is that
they encourage the style of programming you are advocating (I think)
where the program is expressive and can be read and appreciated by
humans. In theory, my favourite style of programming is one that is
elegant and readable by humans.

My original email had to do with the problem of when this comes into
conflict with a prosaic computational task where my main aim is simply
to get the job done efficiently and my brain defaults to writing the
program in something like C.

For instance, another way of writing something like my original program
might be something like:

(define (f n) (- (expt 2 n) 3))
(define (good? n) (and (positive? n) (prime? n)))
(length (filter good? (map f (range 1 100))))

which is, I think, fairly expressive. It is still, however, relatively
inefficient. A python programmer might write something like

count = 0
for n in range(1, 1000):
if 2 ** n - 3 >= 0 and isprime(2 ** n - 3):
count += 1

print(count)

and happily report that her program ran quite a bit faster than mine.

I'm not sure how I'd advocate for Racket (or scheme) in a situation like this.

Regards,
Bob

Maciek Godek

unread,
Jul 11, 2019, 3:44:35 AM7/11/19
to Racket Users
Hi Bob!
I agree that the subject isn't easy.
I used to think that the Haskell code similar to your Racket code
(or my code from the pamphlet) could be made as efficient
under lazy evaluation as the imperative Python code you wrote above.
In principle, this should be possible, but as I have measured similar
Haskell code, it turned out to be orders of magnitude more resource hungry
(both time and memory).

I also recall that I once wrote a graph searching program for my professor
at a university. He wrote his version in C, but he had a bug that he couldn't
find. My version written in Scheme ran for around 2 weeks (in Chez).
His version ran in under 2 hours (after we fixed the bug).

Yet it's correctness, rather than performance, that should be our priority,
which is why learning to think systematically is important.

I prefer to use Racket over Python because of the versioning hell
that Python has gotten itself into. Racket comes bundled with a fairly
nice framework for developing GUI applications, which works on
Windows, Linux and OS X. Python has an excellent yet little known framework
for GUI apps called Enaml (I really wish Racket had something similar),
but deploying it is a hell.

I also think that functional programming perhaps seems to make little sense
in a small scale, but as your programs grow large, it becomes increasingly important.

James Geddes

unread,
Jul 11, 2019, 4:31:47 AM7/11/19
to Bob Heffernan, Maciek Godek, Racket Users
I am also interested in this problem, because many of my colleagues use Python and "isn't Python faster?" is a common argument. (Albeit one that I think is more of a rationalisation than an reason.)

In this case, however, I would not have thought that there's any prima facie reason why the Python version should be faster, even compared to the "filter and map" Racket version.

Indeed, I would have thought that the calculation time would be entirely dominated by the test for primality, and especially what happens once the candidate primes are bigger than 2^64 and can no longer be represented by a single word.

(One possible explanation might be that the Racket math library is written in Typed Racket; when calling it from untyped Racket, there is some type checking using contracts -- it's possible, I suppose, that this is causing a performance penalty.)

I am going to investigate but if anyone knows more than me I'd appreciate thoughts.

Regards,

James
> --
> You received this message because you are subscribed to the Google Groups "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to racket-users...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/racket-users/20190711013633.nm2jlznhzwibhsli%40bob-cit.
> For more options, visit https://groups.google.com/d/optout.

Bob Heffernan

unread,
Jul 11, 2019, 9:40:18 AM7/11/19
to James Geddes, Maciek Godek, Racket Users
On 19-07-11 09:31, James Geddes wrote:
> Indeed, I would have thought that the calculation time would be
> entirely dominated by the test for primality, and especially what
> happens once the candidate primes are bigger than 2^64 and can no
> longer be represented by a single word.

James,

I assume that you are right. The isprime procedure in my python code is
the one from sympy (I didn't bother to include the import line in my
previous email). The prime? procedure in my Racket code is that from
the math/number-theory library.

I don't know anything about sympy, but I seem to recall that Python
libraries for these sorts of things often rely on fast routines written
in C. I suppose it might be interesting to implement the same
prime-checking in both Python and Racket and *then* compare speeds.

I'm afraid I've not really looked into Typed Racket yet, although I
probably should.

Regards,
Bob

Bob Heffernan

unread,
Jul 11, 2019, 9:45:52 AM7/11/19
to Maciek Godek, Racket Users
On 19-07-11 00:44, Maciek Godek wrote:
> I also think that functional programming perhaps seems to make little sense
> in a small scale, but as your programs grow large, it becomes increasingly
> important.

Maciek,

You know, I think you might be right and I think this might be at the
root of my frustration.

I found my simple Racket program much harder to write than my simple
Python program. Now, this is mostly due to my own inexperience with
Racket and the functional style.

However, there might be a sense in which I should just be using the
right tool for the job (which may well be Octave or Julia for something
like this).

For another example, if you're manipulating a CSV file it may well make
more sense to use awk rather than Python or Racket or anything else.

Then again, I want to write all my programs in Racket so I can get
experience writing programs in Racket!

Regards,
Bob

James Geddes

unread,
Jul 11, 2019, 12:12:48 PM7/11/19
to Bob Heffernan, Maciek Godek, Racket Users
Dear Bob,

My sense is that there are, perhaps, two questions: first, what's the advantage of Racket when faced with a "prosaic computational task where ... my brain defaults to writing in something like C"?; and second, how to advocate for Racket when Python (appears to be) "quite a bit faster".

I’m interested in both of these because I am trying to understand where my colleagues are coming from (and possibly changing their minds).

On the first question, this discussion has perhaps produced four different versions of the prime-counting program:

1. Python (using a for-loop);
2. Racket (using accumulator-passing style);
3. Racket (using a for-loop);
4. Racket (using map/filter).

(Actually, I can’t find an example of (4) in the email chain, but there’s one at the bottom of this email.)

It’s certainly been educational for me to see all four. On reflection, do you still have the sense that (1) is the “most natural”?


On the second question, let me report some rough-and-ready timings on my machine for counting the number of primes of the form 2^n + 3 for n < 1000 (the code is at the end of the email):

Python: 265 ms
Racket (filter/map version): 760 ms
Racket (for loop version): 670 ms.

My own feeling is that /for this particular application/, there's just not that much difference.

I strongly suspect that all of the difference comes from the implementation of the primality testing functions. It turns out that, for sufficiently large numbers, both sympy and math/number-theory actually check for pseudoprimality. So there may be implementation-dependent details (which I don't understand) that are different, such as the probability that a composite number is incorrectly identified as a prime.

Indeed, maybe the answer to your second question is that, I rather suspect, Racket is quite a bit faster than Python, so one could reasonably turn the question around.

I implemented a cheap-and-cheerful, deterministic primality test (from Wikipedia, code again at the end of the email). I’m pretty sure I’ve written essentially identical code in Python and Racket: Python using `while` and mutation; Racket using `for/and`. Here are the times I get for testing the primality of 2^55 + 3 (which is in fact prime):

Python: 7 s
Racket: 0.8 s

Make of that what you will!

Best regards,

James




=== Python, count special primes ===

import timeit
import sympy

def count_special_primes(N):
count = 0
for n in range(N):
if sympy.isprime(2 ** n + 3):
count = count + 1
return count

print(timeit.timeit('count_special_primes(1000)', globals = globals(), number = 1))


=== Racket, count special primes ===

#lang racket

(require math/number-theory)

;; "filter/map" variation
;; Count primes of the form 2^n + 3 for n < N

(define (count-special-primes N)
(define (f n) (+ (expt 2 n) 3))
(length
(filter prime?
(map f (range N)))))

(time
(count-special-primes 1000))

;; “for-loop" variation

(define (count-special-primes/quicker N)
(define (f n) (+ (expt 2 n) 3))
(for/sum ([n (in-range N)])
(if (prime? (f n)) 1 0)))

(time
(count-special-primes/quicker 1000))


=== Python, primality test ===

Import timeit
import math

## Quick and dirty test for primality
## from https://en.wikipedia.org/wiki/Primality_test

def is_prime(n):

if n <= 3:
return (n > 1)
elif (n % 2 == 0) or (n % 3 == 0):
return False
else:
k = 5
k_max = int(math.sqrt(n)) + 1
while k < k_max:
if (n % k == 0) or (n % (k + 2) == 0):
return False
k = k + 6
return True


print(timeit.timeit('is_prime(2 ** 55 + 3)', globals = globals(), number = 1))


=== Racket, primality test ===

;; integer? -> integer?
;; Quick-and-dirty test for primality
;; from https://en.wikipedia.org/wiki/Primality_test

(define (prime/slow? n)
(if (<= n 3)
(> n 1)
(if (or (= (modulo n 2) 0)
(= (modulo n 3) 0))
#f
(prime-aux n))))

;; Test for primality assuming n > 3 and divisible by neither
;; 2 nor 3

(define (prime-aux n)
(let ([k-max (ceiling (sqrt n))])
(for/and ([k (in-range 5 n 6)]
#:break (> k k-max))
(not (or (= (modulo n k) 0)
(= (modulo n (+ k 2)) 0))))))

(time (prime/slow? (+ 3 (expt 2 55))))




> On 11 Jul 2019, at 02:36, Bob Heffernan <bob.he...@gmail.com> wrote:
>

matt...@ccs.neu.edu

unread,
Jul 11, 2019, 12:12:49 PM7/11/19
to Racket Users

If I may, let me address the (at least) four dimensions of coding that have come up in this thread, as concretely as possible but with some generalizations added:

1. Performance

Generally speaking, Python is a thin layer over C. It comes with almost all the performance advantages of C and all the disadvantages (complete lack of safety).

Specifically, the “naive” Racket code (that was mentioned) allocates many intermediate list raising the pressure on memory. As a functional programmer, you need to pay attention to allocation (cons, structs, lambda, objects) when performance matters. ~~ Haskell performs some “loop fusion” aka “deforestation” but the overhead of uniform laziness gets in the way. If you were to ask on a Haskell mailing list, you’d be told “sprinkle ! everywhere” (meaning make the program as strict as possible).

The performance of your program also suffers from linking Typed Racket code into Racket code in a critical place in the for loop. The for/ version in Typed Racket is the best you can do in terms of performance.

2. What matters about code (systematic design)

Performance, in many cases.

But real code is code that survives long enough so that person 2 will open a file many weeks/months/years after after person 1 wrote it. (Person 2 could be an aged version of person 1.) The performance of person 2 on changing this file depends on

— how systematically person 1 went about his work
— how well person 1 expressed this systematic procedure
(— how well person 2 is trained to understand these ideas)

The goal of HtDP is to get this idea across. One day I’ll write HtD C/S to scale this up.

2a. Functional programming

… excels at designing systematically code at all scales. There are many reasons; here are 3 examples:

— fp encourages type-driven design in a rich language of types (even if the types are informal as in HtDP).
— fp drives people to match expressions to the type structure (if you add a variant it’s easy to find where to add an action)
— fp makes testing very easy

If 3 isn’t enough, give me an N and I’ll come up with more.

2b. Functional programming can be done in almost any language, though some make it a bit harder than others. So don’t make judgments across entire languages.

(Josh Bloch: “Favor Immutability” (item 13) and “Minimize Mutability” (item 15) in “Effective Java.” He designed the API.)

3. Figuring out a good way of writing this Racket program goes as follows for experienced Racket/FP programmers:

FP a la Racket encourages one more way of programming:

— design systematically
— then transform systematically

So in Racket when you see

(filter … (map …))

you know that
(a) this allocates
(b) can be transformed into a for/list comprehension that uses #:when (by practice)
(c) (length (for/llist (.. #:when ..) ..)) can be transformed into (for/sum … #:when ) 1)


Now in a perfect world, you’d point a performance debugger at this (called optimization coach or profler) and it may tell you that calling isprime? from the Math library imposes a high cost and that adding types to your code will eliminate this “borderline cost” (but not the cost of primality checking).

— Matthias








Matthias Felleisen

unread,
Jul 11, 2019, 12:32:44 PM7/11/19
to Racket Users

Josh Rubin

unread,
Jul 11, 2019, 1:52:03 PM7/11/19
to Racket Users

On Tuesday, July 9, 2019 at 8:09:04 AM UTC-4, Bob Heffernan wrote:
Dear all,

I recently wanted to count the number of primes in the sequences 2^n+3
and 2^n-3 (and a few more besides) where n is a positive integer.

<SNIP>

Hi Bob. This has nothing to do with Racket, and you may already know it, so I will just hint. If p is prime, the sequence a^n + b (mod p) is periodic with period (p-1). Divisibility by p depends only on [ n mod (p-1) ]. A fast test can eliminate n for which a^n + b is divisible by p. Do this for 3 and 5. Check out "Fermat's little theorem" for the general case.

Jens Axel Søgaard

unread,
Jul 11, 2019, 3:10:27 PM7/11/19
to Bob Heffernan, Racket Users
Den tor. 11. jul. 2019 kl. 15.40 skrev Bob Heffernan <bob.he...@gmail.com>:
On 19-07-11 09:31, James Geddes wrote:
> Indeed, I would have thought that the calculation time would be
> entirely dominated by the test for primality, and especially what
> happens once the candidate primes are bigger than 2^64 and can no
> longer be represented by a single word.
 
I assume that you are right.  The isprime procedure in my python code is

the one from sympy (I didn't bother to include the import line in my
previous email). 

I believe that sympy uses the C library GMP.
 
The prime? procedure in my Racket code is that from
the math/number-theory library.

 
I don't know anything about sympy, but I seem to recall that Python
libraries for these sorts of things often rely on fast routines written
in C.  I suppose it might be interesting to implement the same
prime-checking in both Python and Racket and *then* compare speeds.

If you want, you can use the prime test from GMP from Racket too. 

#lang racket
(require gmp)

(define repetitions 25) ; manual says values between 15 and 50 are reasonable.

(define (is-prime? n)
  (case (mpz_probab_prime_p (mpz n) repetitions)
    [(2) #t]
    [(1) 'probably]
    [(0) #f]))

(for/list ([n (in-range 2 100)]
           #:when (is-prime? n))
  n)
Reply all
Reply to author
Forward
0 new messages