Academic year: 2022

1. Given the type definition:

data Tree a = Tip a | Bin (Tree a) (Tree a) deriving Show

Define

flatten :: Tree a → [a ]

that returns the tips of the tree from left to right.

Sol. The base functor of Tree a is F(a, X) = a + X × X.

Every F-algebra of type F(a, b) → b can be expressed as [g, f ], where f :: b×b → b and g :: a → b. We can derive the foldt operator as follows:

h = ([g, f ])

≡   { catamorphisms }

h · α = [g, f ] · F(id, h)

≡   { expanding α and F }

h · [Tip, Bin] = [g, f ] · (id + h × h)

≡   { fusion of coproducts; coproduct functor }

[h · Tip, h · Bin] = [g, f · (h × h)]

≡   { coproducts }

h · Tip = g ∧ h · Bin = f · (h × h).

Translated to Haskell, it reads:

foldt :: (b → b → b) → (a → b) → Tree a → b
foldt f g (Tip x ) = g x
foldt f g (Bin t u) = f (foldt f g t ) (foldt f g u)

With the foldt operator, it is now easy to define flatten.

flatten = foldt (++) wrap
  where wrap x = [x ]
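In plain ASCII Haskell (the definitions above use typeset symbols), the whole development so far can be run directly; a minimal self-contained sketch:

```haskell
-- Binary leaf trees: data is stored only at the tips.
data Tree a = Tip a | Bin (Tree a) (Tree a) deriving Show

-- foldt f g replaces every Bin node by f and every Tip by g.
foldt :: (b -> b -> b) -> (a -> b) -> Tree a -> b
foldt f g (Tip x)   = g x
foldt f g (Bin t u) = f (foldt f g t) (foldt f g u)

-- flatten lists the tips from left to right.
flatten :: Tree a -> [a]
flatten = foldt (++) wrap
  where wrap x = [x]
```

For example, flatten (Bin (Bin (Tip 1) (Tip 2)) (Tip 3)) evaluates to [1, 2, 3].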

2. What is the time complexity of the function flatten you have defined? If it is not linear, can you derive a linear variant of flatten, starting from this definition:

fastflatten :: Tree a → [a ] → [a ]
-- ...

Sol. The time complexity of flatten is Θ(n²) in the worst case, where n is the number of Tip nodes in the tree.


1. flatten runs in time O(n²). A simple observation shows that there are exactly n − 1 Bin nodes (easily proven by induction, or, if you like, by fold fusion), and flatten may spend O(n) time at each Bin node, which is a loose estimate. Thus the overall time complexity is

(n − 1) × O(n) = O(n²).

2. There exists a family of trees that takes Θ(n²) time to be flattened, which can be generated by

worst :: Int → Tree ()
worst 1 = Tip ()
worst n = Bin (worst (n − 1)) (Tip ())

The time complexity of flatten running on this particular set of cases is T (n) = T (n − 1) + Θ(n),

which is obviously Θ(n²).
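As a sanity check, the worst-case generator can be written in runnable ASCII Haskell (the Tree definitions are repeated here so the sketch is self-contained):

```haskell
data Tree a = Tip a | Bin (Tree a) (Tree a) deriving Show

foldt :: (b -> b -> b) -> (a -> b) -> Tree a -> b
foldt f g (Tip x)   = g x
foldt f g (Bin t u) = f (foldt f g t) (foldt f g u)

flatten :: Tree a -> [a]
flatten = foldt (++) (\x -> [x])

-- A left-nested "comb" tree with n tips: each (++) at a Bin node
-- must retraverse the entire list built so far on the left,
-- which is what makes flatten quadratic on this family.
worst :: Int -> Tree ()
worst 1 = Tip ()
worst n = Bin (worst (n - 1)) (Tip ())
```

Evaluating length (flatten (worst n)) for growing n illustrates the quadratic behaviour: doubling n roughly quadruples the work done by flatten.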

It is possible to derive a faster version of flatten by fold-fusion, using the idea of accumulating parameters. Our goal is to invent a linear-time function fastflatten that satisfies

fastflatten t ys = flatten t ++ ys.

Taking ys = [ ], fastflatten t [ ] computes flatten t, since [ ] is the unit of list concatenation. The defining equation above can be rewritten in a “pointless” style (following the title of Jeremy Gibbons’s radix-sort paper):

fastflatten t ys = flatten t ++ ys

≡   { treating (++) as an ordinary function }

fastflatten t ys = (++) (flatten t ) ys

≡   { extensionality }

fastflatten t = (++) (flatten t )

≡   { introducing function composition }

fastflatten t = ((++) · flatten) t

≡   { extensionality }

fastflatten = (++) · flatten

This immediately suggests that we try to fuse (++) into flatten, which is a fold.

If the fusion succeeds, the resulting fold would be fastflatten.

Let’s go back for a moment to the Algebra of Programming to make the fusion conditions right. The general fusion law states:

h · ([f ]) = ([g]) ⇐ h · f = g · F(id , h),


which for Tree a would specialise to

h · ([g, f ]) = ([g′, f′]) ⇐ h · [g, f ] = [g′, f′] · (id + h × h),

which can in turn be translated into (nearly) Haskell notation:

h · foldt f g = foldt f′ g′ ⇐ h · g = g′ ∧ h (f x y) = f′ (h x ) (h y).

Thus in order to express (++) · flatten as a single fold, we need to invent two functions f′ and g′ while at the same time proving that:

(++) · wrap = g′,

(++) (x ++ y) = f′ (x ++) (y ++).

It is easy to see that g′ can be (:), since its job is to concatenate a singleton list built from its argument onto the front of some other list. The inductive case is more interesting. Given two functions that concatenate x and y respectively onto the front of some list, how do we accomplish the task of concatenating x ++ y onto the front of some list? Hey! That’s applying the two functions in reverse argument order! Thus f′ is simply function composition (·). To gain more confidence, let’s do some formal reasoning for the inductive case, carried out pointwise:

(++) (x ++ y) z

=   { writing (++) as an infix operator }

(x ++ y) ++ z

=   { ++ is associative (aha, that’s the key, right?) }

x ++ (y ++ z )

=   { sections }

(x ++) ((y ++) z )

=   { function composition (prefix form) }

(·) (x ++) (y ++) z

Therefore by extensionality, we have (++) (x ++ y) = (·) (x ++) (y ++). We have thus derived fastflatten = foldt (·) (:). This has an intuitive meaning: transform the tips into functions that each add a tip to the front of some list, and then compose all those functions together. Equivalently we have fastflatten t ys = foldt (·) (:) t ys, which expands to

fastflatten (Tip x ) ys = x : ys

fastflatten (Bin t u) ys = fastflatten t (fastflatten u ys)

This is the standard implementation with an accumulating parameter; it is partly tail-recursive and runs in linear time.
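The derived function can be checked against its specification, fastflatten t ys = flatten t ++ ys, in runnable ASCII Haskell (a self-contained sketch with the definitions repeated):

```haskell
data Tree a = Tip a | Bin (Tree a) (Tree a) deriving Show

flatten :: Tree a -> [a]
flatten (Tip x)   = [x]
flatten (Bin t u) = flatten t ++ flatten u

-- Derived by fold fusion: tips become (x :) sections and
-- Bin nodes become function composition, here already
-- expanded into first-order accumulating-parameter form.
fastflatten :: Tree a -> [a] -> [a]
fastflatten (Tip x)   ys = x : ys
fastflatten (Bin t u) ys = fastflatten t (fastflatten u ys)
```

Spot checks such as fastflatten t [] == flatten t on sample trees give quick confidence in the derivation.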

I figured out the derivation myself during a class. The beauty in the derivation makes me believe that there must be some way to generalise the result, and that it is possible to discover the generalisation all by myself. Unfortunately, the derivation is already recorded in Section 3.5 of the Algebra of Programming (which is just the section that I skipped!), and a generalisation is proposed in an exercise. Anyway, it is still a wonderful experience to carry out a beautiful derivation by oneself.

3. Well, let’s practice using some library functions. . . Recall that [1 . .] creates an infinite list starting from 1. Define factorial using [1 . .], take, and foldr (thus you do not need recursion).

Sol.

fact :: Int → Int

fact n = foldr (×) 1 (take n [1 . .])
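In runnable ASCII form this is a one-liner; note that it also handles n = 0 correctly, since take 0 [1..] is the empty list and foldr returns the seed 1:

```haskell
-- factorial without explicit recursion: multiply together
-- the first n positive integers, drawn from the infinite list.
fact :: Int -> Int
fact n = foldr (*) 1 (take n [1 ..])
```

For instance, fact 5 evaluates to 120 and fact 0 to 1.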

4.1 Define a function

sieve :: Int → [Int ] → [Int ]

such that sieve n xs removes the elements of xs that are multiples of n. You may use recursion, a foldr, or the library function filter. You will also need the built-in function mod.

Sol.

sieve n = filter (isNotMultiple n)
  where isNotMultiple n x = x ‘mod‘ n ≢ 0
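The same definition in runnable ASCII Haskell, with ≢ written as /=:

```haskell
-- sieve n xs keeps exactly those elements of xs that are
-- not multiples of n.
sieve :: Int -> [Int] -> [Int]
sieve n = filter isNotMultiple
  where isNotMultiple x = x `mod` n /= 0
```

For example, sieve 2 [1 .. 10] yields [1,3,5,7,9].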

4.2 Having sieve, define

fix (x : xs) = x : sieve x (fix xs)

What does fix [2 . .] represent? In the GHCi or Hugs interpreter, how do you examine the first 100 outputs of fix [2 . .]?

Sol. Let’s expand the definition to

fix (x : xs) = x : filter (isNotMultiple x ) (fix xs),

Haskell is a lazy functional language, which means its reduction strategy is “outermost graph reduction,” if Richard Bird is right. Let’s try a few steps of the evaluation:

fix [2 . .]

=   { expanding fix }

2 : filter (isNotMultiple 2) (fix [3 . .])

=   { filter tries to pattern-match its argument; expanding fix }

2 : filter (isNotMultiple 2) (3 : filter (isNotMultiple 3) (fix [4 . .]))

=   { 3 is not a multiple of 2 }

2 : 3 : filter (isNotMultiple 2) (filter (isNotMultiple 3) (fix [4 . .]))

=   { the two filters pattern-match; expanding fix }

2 : 3 : filter (isNotMultiple 2) (filter (isNotMultiple 3) (4 : filter (isNotMultiple 4) (fix [5 . .])))

=   { 4 is not a multiple of 3 but is a multiple of 2 }

2 : 3 : filter (isNotMultiple 2) (filter (isNotMultiple 3) (filter (isNotMultiple 4) (fix [5 . .])))

=   { keep going }

. . .

It is easy to see that a number n > 2 appears in the list fix [2 . .] if and only if it passes through all the filters ahead of it, i.e., it is not a multiple of k for any 2 ≤ k < n, which is equivalent to saying that n is a prime number. Since 2 is also in the list, fix [2 . .] is the list of all prime numbers.

The first 100 numbers in the list can be examined by entering take 100 (fix [2 . .]).
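Put together in runnable ASCII Haskell, fix really does enumerate the primes; a self-contained sketch:

```haskell
-- sieve n removes multiples of n from a list.
sieve :: Int -> [Int] -> [Int]
sieve n = filter (\x -> x `mod` n /= 0)

-- fix, applied to [2..], produces the infinite list of primes:
-- each output element sieves out its multiples further down.
fix :: [Int] -> [Int]
fix (x : xs) = x : sieve x (fix xs)
```

Laziness is essential here: take 100 (fix [2 ..]) demands only finitely many sieving steps, even though fix unfolds over an infinite list.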

4.3 What about

fix2 (x : xs) = x : fix2 (sieve x xs)

What does it represent? Which of fix and fix2 is faster? Why?

Sol. Again, let’s try a few evaluation steps of fix2 [2 . .]:

fix2 [2 . .]

=   { expanding fix2 }

2 : fix2 (sieve 2 [3 . .])

=   { fix2 tries to pattern-match its argument; expanding sieve }

2 : fix2 (filter (isNotMultiple 2) [3 . .])

=   { evaluating filter }

2 : fix2 (3 : filter (isNotMultiple 2) [4 . .])

=   { expanding fix2 }

2 : 3 : fix2 (sieve 3 (filter (isNotMultiple 2) [4 . .]))

=   { expanding sieve }

2 : 3 : fix2 (filter (isNotMultiple 3) (filter (isNotMultiple 2) [4 . .]))

=   { evaluating the inner filter; 4 is a multiple of 2 }

2 : 3 : fix2 (filter (isNotMultiple 3) (filter (isNotMultiple 2) [5 . .]))

=   { evaluating the filters and fix2 }

2 : 3 : 5 : fix2 (filter (isNotMultiple 5) (filter (isNotMultiple 3) (filter (isNotMultiple 2) [6 . .])))

=   { keep going }

. . .


This suggests that a number n > 2 appears in the list fix2 [2 . .] if and only if n is not divisible by any prime number less than n. Therefore fix2 [2 . .] is again the list of prime numbers. For every number n > 2, fix2 tests n against only the primes less than n, while fix tests n against every k with 2 ≤ k < n. The prime number theorem states that

lim_{x→∞} π(x) / (x / ln x) = 1,

where π(x) is the number of primes less than or equal to x. This entails that

lim_{x→∞} π(x) / x = 0,

which means prime numbers are asymptotically very sparse (since the density goes to zero), though this may not be very obvious when x is not too large. So we can expect the speed ratio of fix2 to fix to increase as more terms are generated.

4.4 In the interpreter, check whether fix [2 . .] and fix2 [2 . .] yield the same output for the first 100, 200, or 300 terms. You may find standard library functions such as zipWith, (≡), take, and and useful. (and is a function of type [Bool ] → Bool. You can easily guess what it does. . . )

Sol. One can directly type, say,

and (zipWith (≡) (take 100 (fix [2 . .])) (take 100 (fix2 [2 . .])))

in the interpreter to check the first 100 terms. Or one can define

checkEqual f g n xs = and (zipWith (≡) (take n (f xs)) (take n (g xs)))

Then checkEqual fix fix2 n [2 . .] would check whether fix [2 . .] and fix2 [2 . .] yield the same result for the first n terms.
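In runnable ASCII Haskell, the comparison can be packaged as follows (a self-contained sketch with sieve, fix, and fix2 repeated):

```haskell
sieve :: Int -> [Int] -> [Int]
sieve n = filter (\x -> x `mod` n /= 0)

fix, fix2 :: [Int] -> [Int]
fix  (x : xs) = x : sieve x (fix xs)
fix2 (x : xs) = x : fix2 (sieve x xs)

-- Compare the first n outputs of two list transformers
-- on the same input.
checkEqual :: Eq b => (a -> [b]) -> (a -> [b]) -> Int -> a -> Bool
checkEqual f g n xs = and (zipWith (==) (take n (f xs)) (take n (g xs)))
```

For example, checkEqual fix fix2 100 [2 ..] returns True.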

It is possible to formally prove that fix [2 . .] = fix2 [2 . .], say by showing

approx n (fix [2 . .]) = approx n (fix2 [2 . .]) for all n,

where approx is defined to be

approx :: Integer → [a ] → [a ]
approx (n + 1) [ ] = [ ]
approx (n + 1) (x : xs) = x : approx n xs

as in Richard Bird’s functional programming book. Then it follows that

fix [2 . .] = lim_{n→∞} approx n (fix [2 . .]) = lim_{n→∞} approx n (fix2 [2 . .]) = fix2 [2 . .].
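Since n + k patterns are no longer legal Haskell, a runnable version of approx can use a guard instead; this is a sketch in which approx 0 is left undefined, playing the role of ⊥ in the book's definition:

```haskell
-- approx n xs yields at most the first n elements of xs;
-- approx 0 is deliberately undefined (it stands for bottom).
approx :: Integer -> [a] -> [a]
approx n _ | n < 1  = error "approx 0 = bottom"
approx _ []         = []
approx n (x : xs)   = x : approx (n - 1) xs
```

If the list runs out before the counter does, the whole list is returned, e.g. approx 5 [1,2,3] is [1,2,3]; and laziness means take 3 (approx 10 [1 ..]) is [1,2,3] without ever reaching the undefined case.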

This approach utilises the approximation lemma. However, we omit the proof here for obvious reasons. . . The other way to establish the equality is to separately prove that fix [2 . .] and fix2 [2 . .] both generate the list of prime numbers. Nevertheless, I guess the two approaches would turn out to be essentially the same.
