Pythagorean Triples

When Programming Praxis mentioned that the newest challenge sounded like a Project Euler problem, they were’t wrong. Basically, the idea is to count the number of Pythagorean Triples with perimeters (sum of the three numbers) under a given value. The necessary code to brute force the problem is really straight forward, but then they asked for the count up to one million. With the brute force O(n^2) algorithm (and a relatively high constant), that’s not really feasible. So that’s when we have to get a bit more creative.

First, how about that brute force solution I promised:

(define (brute-count-triples-under sum)
  (define count 0)
  (for* ([a (in-range 1 sum)]
         [b (in-range 1 a)]
         #:when (let ([c (sqrt (+ (* a a) (* b b)))])
                  (and (<= (+ a b c) sum)
                       (integer? c))))
    (set! count (+ count 1)))
  count)

The basic idea is calculate every a and b such that a < b and such that c = \sqrt{a^2 + b^2} for some integer c and the sum a+b+c is less than the target. Simple enough, how does that work out?

> (time (brute-count-triples-under 100))
cpu time: 8 real time: 5 gc time: 0
17

And pretty fast too. But if we try to run longer examples:

> (time (brute-count-triples-under 1000))
cpu time: 584 real time: 585 gc time: 48
325
 
> (time (brute-count-triples-under 10000))
cpu time: 54687 real time: 54840 gc time: 1348
4858

Yeah, I don’t really want to try that on one million. So we need a faster algorithm. Luckily, you don’t have to try all of the possible ways that could generate a triple–it turns out there are algorithms to do it for you. One in particularly came from a British math professor A. Hall in the paper Genealogy of Pythagorean Triads which showed that given any primitive Pythagorean triple (one that is not a multiple of another, smaller triple), you can produce three more primitive triples with the formulas:

a – 2b + 2c, 2a – b + 2c, 2a – 2b + 3c
a + 2b + 2c, 2a + b + 2c, 2a + 2b + 3c
-a + 2b + 2c, -2a + b + 2c, -2a + 2b + 3c

You can generate even more primitive triples from those in turn, as long as you choose to do so. In our case, we can stop each branch once the perimeter of the triple is larger than our target as they don’t ever get smaller. So how do we turn that into code? We use Racket’s generators:

(define (primitive-triples [max-sum #f])
  (generator ()
    (let loop ([a 3] [b 4] [c 5])
      (when (or (not max-sum) (<= (+ a b c) max-sum))
        (yield (list (min a b) (max a b) c))
        ; a – 2b + 2c, 2a – b + 2c, 2a – 2b + 3c
        (loop (+ a (* -2 b) (* 2 c))
              (+ (* 2 a) (- b) (* 2 c))
              (+ (* 2 a) (* -2 b) (* 3 c)))
        ; a + 2b + 2c, 2a + b + 2c, 2a + 2b + 3c
        (loop (+ a (* 2 b) (* 2 c))
              (+ (* 2 a) b (* 2 c))
              (+ (* 2 a) (* 2 b) (* 3 c)))
        ; -a + 2b + 2c, -2a + b + 2c, -2a + 2b + 3c
        (loop (+ (- a) (* 2 b) (* 2 c))
              (+ (* -2 a) b (* 2 c))
              (+ (* -2 a) (* 2 b) (* 3 c)))))
    (yield #f)))

Similar to Python’s yield, this generator will create a sequence of primitive triples, scanning in a depth first manner down each branch of the aforementioned tree. From there, we just have to calculate each multiple to get a list of all Pythagorean triples with perimeters less than a given value:

(define (triples-under sum)
  (for*/list ([trip (in-producer (primitive-triples sum) #f)]
              [k (in-range 1 (+ 1 (quotient sum (apply + trip))))])
    (map (lambda (n) (* n k)) trip)))

So if we wanted to test this with all 17 triples with perimeters under 100, we could:

> (triples-under 100)
'((3 4 5)    (6 8 10)   (9 12 15)  (12 16 20) (15 20 25) (18 24 30) (21 28 35)
  (24 32 40) (5 12 13)  (10 24 26) (15 36 39) (7 24 25)  (9 40 41)  (20 21 29)
  (8 15 17)  (16 30 34) (12 35 37))

On the other hand though, we don’t necessarily care what the actual triples are, we just want to know how many there are. So a slight modification to just count them:

(define (count-triples-under sum)
  (define count 0)
  (for* ([trip (in-producer (primitive-triples sum) #f)]
         [k (in-range 1 (+ 1 (quotient sum (apply + trip))))])
    (set! count (+ count 1)))
  count)

And that’s it. We can finally answer the question originally asked of us:

> (time (count-triples-under 1000000))
cpu time: 6116 real time: 6121 gc time: 1792
808950

It’s still not as fast as I’d like, but it’s certainly faster than it would have been without the fancier algorithm.

And that’s all there is to it. If you’d like to download the entire source code for today, you can do so here: pythagorean-triples source

Like I said before, it definitely did have the feel of a Project Euler problem, where the straight forward solution is not generally the best one and if you work at it, you can find something that will run in a fraction of the time.

Perhaps I’ll start publishing my Project Euler solutions here. I’ve worked through a good chunk of the problems on two separate occasions, both in Python and in Scheme and it might be interesting to compare and contrast the different languages. Although since it’s sort of a contest, that might be frowned upon. We’ll see.

Edit:

Eesh. When programmingpraxis mentioned that generators were slow down in the comments, I remembered some chatter on the Racket mailing list (the thread from this message) talking about generator performance. But I hadn’t realized it would be quite this extensive. It’s easy enough to reimplement the code to make primitive-triples just return a list of all of the triples directly:

(define (primitive-triples-nogen max-sum)
  (let loop ([a 3] [b 4] [c 5])
    (if (> (+ a b c) max-sum)
        '()
        (append
         (list (list (min a b) (max a b) c))
         ; a – 2b + 2c, 2a – b + 2c, 2a – 2b + 3c
         (loop (+ a (* -2 b) (* 2 c))
               (+ (* 2 a) (- b) (* 2 c))
               (+ (* 2 a) (* -2 b) (* 3 c)))
         ; a + 2b + 2c, 2a + b + 2c, 2a + 2b + 3c
         (loop (+ a (* 2 b) (* 2 c))
               (+ (* 2 a) b (* 2 c))
               (+ (* 2 a) (* 2 b) (* 3 c)))
         ; -a + 2b + 2c, -2a + b + 2c, -2a + 2b + 3c
         (loop (+ (- a) (* 2 b) (* 2 c))
               (+ (* -2 a) b (* 2 c))
               (+ (* -2 a) (* 2 b) (* 3 c)))))))
 
(define (count-triples-under-nogen sum)
  (define count 0)
  (for* ([trip (in-list (primitive-triples-nogen sum))]
         [k (in-range 1 (+ 1 (quotient sum (apply + trip))))])
    (set! count (+ count 1)))
  count)

Then look at the performance we get:

> (time (count-triples-under 1000000))
cpu time: 7109 real time: 7129 gc time: 2292
808950
 
> (time (count-triples-under-nogen 1000000))
cpu time: 352 real time: 348 gc time: 112
808950

Yeah… So there is a bit of a performance penalty for generators after all (most likely due to the use of continuations to implement them according to the aforementioned thread). I’ll have to keep that in mind.

Edit 2:

So I decided to go for the best of both worlds and essentially make a generator while maintaining the state myself. Essentially, I’ll use set! to update the list of triples we haven’t yet tried, stored in the function’s state.

(define (primitive-triples-state [max-sum #f])
  (define triples '((3 4 5)))
  (lambda ()
    (let loop ()
      (cond
        [(null? triples) #f]
        [else
         (define a (caar triples))
         (define b (cadar triples))
         (define c (caddar triples))
         (cond
           [(and max-sum (> (+ a b c) max-sum))
            (set! triples (cdr triples))
            (loop)]
           [else
            (define r (car triples))
            (set! triples
                  (list*
                   ; a – 2b + 2c, 2a – b + 2c, 2a – 2b + 3c
                   (list (+ a (* -2 b) (* 2 c))
                         (+ (* 2 a) (- b) (* 2 c))
                         (+ (* 2 a) (* -2 b) (* 3 c)))
                   ; a + 2b + 2c, 2a + b + 2c, 2a + 2b + 3c
                   (list (+ a (* 2 b) (* 2 c))
                         (+ (* 2 a) b (* 2 c))
                         (+ (* 2 a) (* 2 b) (* 3 c)))
                   ; -a + 2b + 2c, -2a + b + 2c, -2a + 2b + 3c
                   (list (+ (- a) (* 2 b) (* 2 c))
                         (+ (* -2 a) b (* 2 c))
                         (+ (* -2 a) (* 2 b) (* 3 c)))
                   ; all of the rest
                   (cdr triples)))
            r])]))))
 
(define (count-triples-under-state sum)
  (define count 0)
  (for* ([trip (in-producer (primitive-triples-state sum) #f)]
         [k (in-range 1 (+ 1 (quotient sum (apply + trip))))])
    (set! count (+ count 1)))
  count)

It’s fractionally slower than the version that just directly returns the list (the overhead of the function calls, I’m guessing), but not enough to be statistically significant.

> (time (count-triples-under-nogen 1000000))
cpu time: 352 real time: 348 gc time: 112
808950
 
> (time (count-triples-under-state 1000000))
cpu time: 388 real time: 388 gc time: 0
808950

Now I’m wondering how hard it would be a write a version of the generator macro that bundles this all up for you. It wouldn’t be as general purpose, but I’m will to bet it could be done.

All
By category
  • programmingpraxis

    It’s slow because generators are slow. Try again just to solve the problem directly instead of writing a multi-purpose generator and using that to solve the problem.

  • http://blog.jverkamp.com JP

    But generators are so fun! :)

    My original thought was to avoid having to keep all of the primitive triples in memory, but thinking about it more that’s more than countered by having to keep all of the continuations in memory (and those are likely to be much larger). So another case of premature optimization gone awry?

    I’ve edited the post to add a non-generator version and of course you’re quite correct. I’m actually a little surprised at just how much faster though; it’s about a 200x speedup to not use generators.

  • programmingpraxis

    Yup. In my exercise on the sum of the first billion primes, I wrote a function to generate primes because that’s what I really wanted — the exercise to sum the first billion primes was just the tail wagging the dog. But it takes about thirty minutes to sum the first billion primes with a generator, and about thirty seconds to sum them using a segmented sieve directly. Generators look cheap but cost much, even when they are built in to the language, like Python. Avoid them if you can.

  • http://blog.jverkamp.com JP

    Out of curiosity, I’ve added a third version that’s conceptually between the others. Essentially, it acts like a generator but I’m maintaining the state myself. It’s just about as fast as the direct version, but you still get the ability to produce an infinite list of primitive Pythagorean triples if you wanted (although since it’s using a stack instead of a queue it’s still depth first so the infinite case will never get to some of the smallest ones in other branches; that’s easy enough to fix though if desired).