GADTs in Haskell 98

Posted on Thursday, 12 November 2009 at 12:30.

Last time we saw how some type classes can be reified to datatypes, transformed in arbitrary ways and then transformed back again to polymorphic types. I briefly mentioned that for some type classes you need GADTs rather than classical ADTs.

The dual to this story holds as well: many (if not all) GADTs can be represented as type classes. To see how, take a look at the standard GADT example:

data Term :: * -> * where 
  Lit  :: Int        -> Term Int 
  Inc  :: Term Int   -> Term Int 
  IsZ  :: Term Int   -> Term Bool 
  If   :: Term Bool  -> Term a    -> Term a     -> Term a 
  Pair :: Term a     -> Term b    -> Term (a,b) 
  Fst  :: Term (a,b) -> Term a 
  Snd  :: Term (a,b) -> Term b

Each constructor result type is indexed by the type of the construct it represents. This example is explained in detail in Simple Unification-based Type Inference for GADTs by Simon Peyton Jones et al.

Writing Term like this allows the following well-typed evaluation function:

eval :: Term a -> a
eval term = case term of
  Lit n    -> n
  Inc t    -> eval t + 1
  IsZ t    -> eval t == 0
  If c x y -> if eval c then eval x else eval y
  Pair x y -> (eval x, eval y)
  Fst t    -> fst (eval t)
  Snd t    -> snd (eval t)

To convert a GADT G to a type class, create a new type class GC with one type parameter g. This type parameter g will have the same kind as G, and its arguments will have the same semantics as those of the G. Then for each constructor create one function in the type class with the exact same type, replacing every occurrence of G by g.

If we follow this recipe for Term we get:

class TermC term where
  lit  :: Int         -> term Int
  inc  :: term Int    -> term Int
  isZ  :: term Int    -> term Bool
  if_  :: term Bool   -> term a    -> term a      -> term a
  pair :: term a      -> term b    -> term (a, b)
  fst_ :: term (a, b) -> term a
  snd_ :: term (a, b) -> term b

I’ve used the same names as the constructors except I made them lowercase. I appended _ to some to avoid name clashes (fst, snd) or lexical errors (if).

Of course, the free instance of TermC is Term.

Now the recipe doesn’t make any assumptions about the GADT in question, but I don’t know if this is right, because you can do some powerful things with GADTs. The two special things in datatypes I can think of—class constraints and existentially quantified variables—are no problem in type classes. Also the thing that makes GADTs truly special—custom type indices in constructor result types—have always been allowed in type classes. If I have overlooked something, please let me know and leave a comment.

However, perhaps the more important question is: can we do everything with TermC that we could do with Term? Well, we can certainly implement eval: this becomes a newtype with a corresponding instance TermC:

newtype Eval a = Eval { evalC :: a }

instance TermC Eval where
  lit       = Eval
  inc t     = Eval $ evalC t + 1
  isZ t     = Eval $ evalC t == 0
  if_ c x y = Eval $ if evalC c then evalC x else evalC y
  pair x y  = Eval $ (evalC x, evalC y)
  fst_ t    = Eval $ fst (evalC t)
  snd_ t    = Eval $ snd (evalC t)

The instance is very, very similar to the original evaluation function. Only instead of recursively calling eval, we call evalC and wrap the result in an Eval constructor. The type of evalC :: Eval a -> a is also similar to eval :: Term a -> a.

What I like most about encoding terms and their evaluation this way is that all the code is Haskell 98!

But not everything is this easy. The step evaluator on the Haskell wiki page on GADTs, for example, uses deep pattern matching and as we saw last time this is a lot easier to do on datatypes than in class instances. In fact, I don’t know if this particular example is possible to translate at all, and if it is, it’s certainly not going to be Haskell 98.

Another thing GADTs are useful for is witnesses in families of types. Again, I don’t know if it is possible to translate this use case to class instances.

Summing up, any GADT can be translated to a type class, but not all uses of GADTs have obvious translations. The standard GADT example, however, is trivial to translate. For that reason, it is probably not the most interesting choice as standard example of GADTs.


By jj on Thursday, 12 November 2009 at 15:08:

What I like about GADTs is not what they let us do, but what they stop us from doing. The standard GADT example you show makes it impossible to produce an ill-typed Term (not term, Term) without using error or naughty recursion. Using the type class TermC, you have to either give up pattern matching or type safety.

By Brent Yorgey on Thursday, 12 November 2009 at 18:58:

Aha, I never realized you could encode GADTs as type classes like this! Simple once you see it. Thanks for sharing!

By Chung-chieh Shan on Friday, 13 November 2009 at 09:45:

Jacques Carette and Oleg Kiselyov and I wrote up some more things that one can do (and have been done) in this way: “Finally
tagless, partially evaluated: Tagless staged interpreters for simpler
typed languages.” Journal of Functional Programming 19(5):509-543.

By Bob Atkey on Friday, 13 November 2009 at 13:28:

Unfortunately, not all GADTs can be expressed faithfully in this way, due to “leakage” between the types used for representation and the type parameters in the constructors. An example is given in Section 4 of the paper “Unembedding domain-specific languages” (available from my home page), where we called them “exotically typed terms”. The example in the paper involves Higher-Order Abstract Syntax, but it still happens when you have no negative occurrences of the type variable: a simple example is a GADT for representing typed S and K combinators with application. The problem seems to occur when there is a type parameter inside a GADT term that does not appear on the outside of the term.

Fortunately, all is not lost (if you allow yourself Haskell98 + Rank-2 polymorphism + existential types). One can still represent the Equality GADT as a type class (with single method refl :: f a a), and then use Neil Ghani and Patty Johann’s decomposition of GADTs into normal algebraic datatypes plus the equality GADT. Oleg did this in OCaml in a post to the OCaml mailing list. A disadvantage of this approach is that you have to apply the type equalities by hand.

Leave a comment!

Martijn loves to receive comments! Add yours by filling out the fields below.