Niklaus Wirth gave the following problem back in 1973:
Develop a program that generates in ascending order the least 100 numbers of the set M, where M is defined as follows:
a) The number 1 is in M.
b) If x is in M, then y = 2 * x + 1 and z = 3 * x + 1 are also in M.
c) No other numbers are in M.
(via Programming Praxis)
It’s an interesting enough problem, so let’s work out a few different ways of doing it.
First, there’s the simple, brute force method. Start by writing a recursive definition to check if a number is a ‘Wirth number’ by reversing the two equations given in part b of the definition:
; recursively determine if a number is a wirth number (define (wirth? n) (and (positive? n) (integer? n) (or (= n 1) (wirth? (/ (- n 1) 2)) (wirth? (/ (- n 1) 3)))))
def is_wirth(n): '''Recursively determine if a number is a wirth number''' if n < 0 or n != int(n): return False elif n == 1: return True else: return is_wirth((n - 1) / 2.0) or is_wirth((n - 1) / 3.0)
With something like that, it would be easy enough to write a loop in either language that just starts at 1 and checks every number:
; list the first n wirth numbers (define (n-wirth n) (let loop ([i 1] [cnt 0]) (cond [(= cnt n) '()] [(wirth? i) (cons i (loop (+ i 1) (+ cnt 1)))] [else (loop (+ i 1) cnt)])))
def n_wirth(n): '''List the first n wirth numbers''' ls =  i = 1 while len(ls) < n: if is_wirth(i): ls.append(i) i += 1 return ls
Easy enough, and for the case of
n=100, it runs pretty much instantly:
> (time (n-wirth 100)) cpu time: 0 real time: 1 gc time: 0 '(1 3 4 7 9 10 13 15 19 21 22 27 28 31 39 40 43 45 46 55 57 58 63 64 67 79 81 82 85 87 91 93 94 111 115 117 118 121 127 129 130 135 136 139 159 163 165 166 171 172 175 183 187 189 190 193 202 223 231 235 237 238 243 244 247 255 256 259 261 262 271 273 274 279 280 283 319 327 331 333 334 343 345 346 351 352 355 364 367 375 379 381 382 387 388 391 405 406 409 418)
In fact, even until the first 10,000 such numbers it’s still pretty fast:
> (time (n-wirth 10000)) cpu time: 670 real time: 700 gc time: 63 '(...)
So why even bother doing better? Because we can!
The next step would be to realize that we’re redoing a lot of the work in
wirth?. This should look like the perfect opportunity to use memorization, which I just so happen to have posted about both in Python and Racket.
Make those changes and run it again:
> (time (n-wirth 10000)) cpu time: 218 real time: 213 gc time: 0 '(...)
Three times faster. None too shabby, particularly for such a relatively small runtime as it is. But there’s still at least two (fairly obvious) ways to do it!
The next would be to use generators. This solution got a bit strange, so I only have a Python version. The basic idea is that for any number, we want to be able to generate the two branches of Wirth numbers from that number, merging them as we go. So first, let’s start with a function that can merge two generators into a new generator (removing duplicates):
def merge_generators(g1, g2): '''Merge two numeric generators in increasing order.''' n1 = g1.next() n2 = g2.next() while True: if n1 < n2: yield n1 n1 = g1.next() elif n1 > n2: yield n2 n2 = g2.next() else: yield n1 n1 = g1.next() n2 = g2.next()
So for example, if we have generators for multiples of two and three:
def take(gen, n): '''Take the first n items from a generator.''' ls =  while len(ls) < n: ls.append(gen.next()) return ls def multiples_of(n): '''Return multiples of n, starting at n.''' i = n while True: yield i i += n >>> take(merge_generators(multiples_of(2), multiples_of(3)), 20) [2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 26, 27, 28, 30]
So how do we put that all together?
def wirth_gen(n): '''A generator for wirth numbers.''' yield n for i in merge_generators(wirth_gen(2 * n + 1), wirth_gen(3 * n + 1)): yield i
So you yield the number that you’re at, then you merge the two generators. Straight forward enough. And we can use the
take function from above to generate the first
def gen_n_wirth(n): '''List the first n wirth numbers using generators''' return take(wirth_gen(1), n)
I have to admit, I think that’s the most nested generator I’ve ever written. In practice though, it’s faster than either of the previous Racket versions, taking only 160ms to generate the first 10,000 numbers in the sequence.
Two more options and I think I’ll call it a day. The first is based pretty much directly on the sieves that I used in my original billion primes post. Basically, create a list of natural numbers. Starting at the bottom, we’ll check off the Wirth numbers, returning them as we go. The only problem is that we don’t have a solid upper bound (as we did with the primes). So we’ll just ask the user to supply one. Here’s the code:
; list the first n wirth numbers with a sieve (define (sieve-n-wirth n sieve-size) (define sieve (make-vector sieve-size #f)) (vector-set! sieve 1 #t) (take (for/list ([i (in-range sieve-size)] #:when (vector-ref sieve i)) (when (< (+ (* 2 i) 1) sieve-size) (vector-set! sieve (+ (* 2 i) 1) #t)) (when (< (+ (* 3 i) 1) sieve-size) (vector-set! sieve (+ (* 3 i) 1) #t)) i) n))
def sieve_n_wirth(n, size): '''Generate the first n wirth numbers using a sieve.''' sieve = [False for i in xrange(size)] sieve = True ls =  for i in xrange(size): if sieve[i]: ls.append(i) if 2 * i + 1 < size: sieve[2 * i + 1] = True if 3 * i + 1 < size: sieve[3 * i + 1] = True if len(ls) == n: break return ls
> (time (sieve-n-wirth 10000)) cpu time: 15 real time: 11 gc time: 0 '(...)
I don’t think we’re going to beat that time.
Last but not least, priority queues / heaps. The general idea is to let the data structure do the work. Since a heap is designed to let you insert items in any order but only pull them back out in order, it’s perfect for the task. You can get heaps in Python from
heapq and in Racket from
data/heap. Here’s how you use it (there’s a bit of ugliness to remove duplicates):
(require data/heap) ; list the first n wirth numbers with a heap (define (heap-n-wirth n) (define heap (make-heap <)) (heap-add! heap 1) (let loop ([n n]) (cond [(zero? n) '()] [else (define i (heap-min heap)) (heap-add! heap (+ (* 2 i) 1)) (heap-add! heap (+ (* 3 i) 1)) (let loop () ; remove duplicates (when (= i (heap-min heap)) (heap-remove-min! heap) (loop))) (cons i (loop (- n 1)))])))
from heapq import heappop, heappush def heap_n_wirth(n): '''Generate the first n wirth numbers using a heap.''' ls =  heap =  heappush(heap, 1) while len(ls) < n: i = heappop(heap) if not i in ls: ls.append(i) heappush(heap, 2 * i + 1) heappush(heap, 3 * i + 1) return ls
> (time (heap-n-wirth 10000)) cpu time: 562 real time: 557 gc time: 157 '(...)
And we’re back to the original runtime. So the sieved version still wins out.
If you have any particularly different algorithms (or any improvements to my versions above), be sure to let me know in the comments.
|← Confession – Day 39||All||Life of Pi →|
|← One Billion Primes – Segmented Sieve||By category||Generating non-repeating strings →|