|

Transforming polymorphic values

Posted on Sunday, 18 October 2009 at 18:03.

{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE DeriveDataTypeable #-}

module Arith where

import Data.Generics
import Control.Applicative

Recently there’s been a bit of talk in the Haskell café about what a DSL is. Robert Atkey gave the nice answer that you can often capture a DSL in one or more type classes, and implementations of this DSL then correspond to instances of these type classes.

Arithmetic expressions with variables, for example, can be captured by this type:

type Arith = forall a. (Num a, Var a) => a

The Num constraint allows the use of integers and the basic operators, while Var allows the use of variables:

class Var a where
  var :: String -> a

I would like to show you the various things you can do with a value of type Arith.

Example Arith values

Before we do that, however, let’s think about what this definition of Arith really means. It’s not any specific type yet, such as Int or Float. Rather, it’s an expression composed entirely of functions from the Num and Var type classes. Since fromInteger and var are the only functions from those classes that don’t take recursive values as arguments, calls to those two functions will necessarily make up the leaves of any Arith tree—provided it is finite. Here are some example values:

example1 :: Arith
example1 = 1 + 2 * 3

example2 :: Arith
example2 = 2 * var "x" + 0 * var "y"

Note that the integer literals in these examples are automatically expanded to calls to fromInteger. To keep the examples somewhat smaller, we will ignore the negate, abs and signum functions from the Num type class. We will also ignore Num‘s superclasses Eq and Show, providing empty instances where necessary; they mess up some of our examples.

Evaluation as instance

Now, given values of type Arith, what can we do with them? One thing we can think of is to combine them into a bigger expression:

example3 :: Arith
example3 = 6 + example1 - 3 * example2

Can we inspect them, or make them smaller? Well, the type is polymorphic, so we can narrow it to a concrete type. We don’t know any types yet that are instances of both Num and Var, however, so let’s make one! We’ll go for one that evaluates the expression, yielding an integer given an environment:

newtype Eval = Eval { runEval :: Env -> Integer }
type    Env  = [(String, Integer)]

instance Show Eval
instance Eq Eval

instance Num Eval where
  Eval f + Eval g = Eval (liftA2 (+) f g)
  Eval f * Eval g = Eval (liftA2 (*) f g)
  Eval f - Eval g = Eval (liftA2 (-) f g)
  fromInteger n   = Eval (const n)

instance Var Eval where
  var name = Eval $ e -> case lookup name e of
    Nothing -> error ("unbound variable: " ++ name)
    Just x -> x

Now we can evaluate Ariths simply by calling runExpr on them:

*Arith> runEval example1 []
7
*Arith> runEval example2 [("x", 6), ("y", 7)]
12

This method of writing functions on Ariths is very much like writing catamorphisms: the tree is reduced to a value, and operands have already had the catamorphisms applied to them recursively, so any computation you want to run on the tree in this way has to be compositional.

The free instance

Let’s do something slightly more complicated: constant folding. The idea of constant folding is to eliminate trivial uses of the operators. x * 0, for example, can be reduced to 0, regardless of x‘s value. Other examples are multiplication by 1 and subtraction of 0. Also, if an expression doesn’t use any variables, we can evaluate it right away.

What type should our constant folding function have? We would expect something like this:

foldConstants :: Arith -> Arith

Can we write this function as a newtype with instances, too? This turns out to be a lot more difficult. For example, in the case of + we want to inspect the left-hand and right-hand sides to check whether they are zero. But the operands are of type Arith, so we would have to write helper functions such as isZero :: Arith -> Bool which in turn have to be written with new newtypes and instances. This quickly gets out hand. We are running into problems because our catamorphism isn’t regular anymore.

It would be nice if we could do a ‘deep’ inspection of the operands, like we can do when writing functions on normal datatypes. So let’s do that: make a normal datatype and write our foldConstants on that type.

data ReifyArith
  = Add ReifyArith ReifyArith
  | Mul ReifyArith ReifyArith
  | Sub ReifyArith ReifyArith
  | Int Integer
  | Var String
  deriving (Eq, Show, Data, Typeable)

instance Num ReifyArith where
  (+) = Add
  (*) = Mul
  (-) = Sub
  fromInteger = Int

instance Var ReifyArith where
  var = Var

I’ve called this datatype ReifyArith because it reifies an Arith value, turning it into something concrete and tangible. I like to call ReifyArith the free instance, because you can write such a datatype for any combination of classes that adhere to the requirements Robert Atkey mentioned in his email, either as an ADT or a GADT.

From ReifyArith back to Arith

The nice thing about the free instance is that we do not lose any information: we can easily go back to the polymorphic version of our expressions, like so:

generify :: ReifyArith -> Arith
generify expr = case expr of
  Add x y      -> generify x + generify y
  Mul x y      -> generify x * generify y
  Sub x y      -> generify x - generify y
  Int n        -> fromInteger n
  Var name     -> var name

Using this technique, we can manipulate Arith values in arbitrary ways. One thing I particularly like is that the ReifyArith is entirely internal to our module and doesn’t need to be exposed to the outside world.

Constant folding

Now that we have a normal datatype, we can write our constant folding transformation. We implement it as a function that does only one step of the rewriting and then use generic programming (Scrap Your Boilerplate in this case) to apply it recursively to our tree. This is why we also derived Data and Typeable for ReifyArith.

step :: ReifyArith -> ReifyArith
step expr = case expr of
  -- Reduce computations on integer literals
  Add (Int x) (Int y) -> Int (x + y)
  Mul (Int x) (Int y) -> Int (x * y)
  Sub (Int x) (Int y) -> Int (x - y)
  
  -- Rewrite rules based on 0 and 1
  Add (Int 0) y -> y
  Add x (Int 0) -> x
  Mul (Int 0) _ -> 0
  Mul _ (Int 0) -> 0
  Mul (Int 1) y -> y
  Mul x (Int 1) -> x
  Sub x (Int 0) -> x
  
  -- Catch all other cases
  _ -> expr

foldConstants :: Arith -> Arith
foldConstants expr = generify (everywhere (mkT step) expr)

Let’s see if it works:

*Arith> foldConstants example1 :: ReifyArith
Int 7
*Arith> foldConstants example2 :: ReifyArith
Mul (Int 2) (Var "x")

Indeed, in the first case the transformation is able to fold the entire expression to a single integer literal, and in the second case it has completely eliminated the second term.

Conclusion

We have seen that although polymorphic values look abstract and intangible, they are in fact easily manipulated. If a regular, compositional catamorphism is required, the function can be written as instances of the relevant datatypes. If a more complicated, non-regular scheme is required, we can reify the value as a concrete datatype and define the function on that.

Many thanks to Sjoerd Visscher for valuable feedback!

Comments

By Brent Yorgey on Sunday, 18 October 2009 at 20:28:

Neat, thanks for the elegant explanation and examples. Another name for the “free instance” might be the “initial instance”, since reification defines an initial algebra, as witnessed by the fact that there is always a unique morphism from the reified data type to any other instance–namely, generify! But “free instance” is a good name too, not just because it comes “for free” but because it really does correspond to the “free structure” generated by the given type class signatures.

By Jake McArthur on Monday, 19 October 2009 at 17:40:

This is lovely!

By augustss on Tuesday, 20 October 2009 at 00:18:

Be careful about simplifications like x*0=0, because they are rarely true.

By Martijn on Tuesday, 20 October 2009 at 08:24:

@Brent thanks for the feedback! Ah, I didn’t know it had a name already. Thanks for that.

@augustss Can you elaborate a bit? I hope this simplification doesn’t hurt in the context of simple arithmetic expressions?

By Linktipps #17 :: Blackflash on Sunday, 25 October 2009 at 14:22:

[...] [...]

By Martijn’s Journal » GADTs in Haskell 98 on Thursday, 12 November 2009 at 12:27:

[...] Transforming polymorphic values [...]

By Linktipps Januar 2010 :: Blackflash on Saturday, 19 December 2009 at 14:48:

[...] [...]

Leave a comment!

Martijn loves to receive comments! Add yours by filling out the fields below.