Facebook Twitter GitHub LinkedIn LinkedIn LinkedIn
Back to all posts

Final tagless encodings have little to do with typeclasses

There’s a common misconception as to what final tagless, and more specifically, a final encoding is. A common claim I see is that final tagless means coding against typeclasses. The mtl library and code written in MTL style are raised as examples of final tagless.

I would like to argue that what people are referring to as final tagless is in fact just coding against an interface and that the novelty of final tagless really has very little to do with abstract interfaces. So then what is final tagless? It’s a complicated name for a not-so-complicated idea.

We can break it down into its constituent parts: final and tagless. The use of final is to contrast with the typical initial encoding of a language. Let’s start by looking at an initial encoding.

An Initial Encoding

The most straightforward initial encoding is to make a sum type with a constructor per type. Let’s consider modeling a boolean argument to a SQL WHERE clause:

data SqlExpr
    = B Bool
    | And SqlExpr SqlExpr
    | Or SqlExpr SqlExpr
    | Not SqlExpr

This initial encoding allows you to create arbitrarily nested boolean expressions, to run these expressions one would create an eval function that takes a SqlExpr and returns a Bool:

eval :: SqlExpr -> Bool
eval (B b)             = b
eval (Leq expr1 expr2) = eval expr1 <= eval expr2
eval (And expr1 expr2) = eval expr1 && eval expr2
eval (Or expr1 expr2)  = eval expr2 || eval expr2
eval (Not expr)        = not (eval expr)

This encoding is simple Haskell 98. However, our WHERE clause needs to take more than boolean arguments. Let’s try to expand this encoding to include Int:

data SqlExpr
    = I Int
    | B Bool
    | Leq SqlExpr SqlExpr
    | And SqlExpr SqlExpr
    | Or SqlExpr SqlExpr
    | Not SqlExpr

So now lets define an eval function. But what should the type be? We can create either an Int or Bool. Let’s change our data type to include a type variable, with the intent to allow us to specify a SqlExpr Bool or SqlExpr Int:

data SqlExpr a
    = I Int
    | B Bool
    | Leq (SqlExpr Int) (SqlExpr Int)
    | And (SqlExpr Bool) (SqlExpr Bool)
    | Or (SqlExpr Bool) (SqlExpr Bool)
    | Not (SqlExpr Bool)

This allows us to change our eval function to take a SqlExpr a and return an a:

eval :: SqlExpr a -> a
eval (B b)             = b
eval (I i)             = i
eval (Leq expr1 expr2) = eval expr1 <= eval expr2
eval (And expr1 expr2) = eval expr1 && eval expr2
eval (Or expr1 expr2)  = eval expr2 || eval expr2
eval (Not expr)        = not (eval expr)

Unfortunately, this doesn’t compile:

Main.hs:30:26: error:
    • Couldn't match expected type ‘a’ with actual type ‘Bool’
      ‘a’ is a rigid type variable bound by
        the type signature for:
          eval :: forall a. SqlExpr a -> a
        at Main.hs:29:1-22
    • In the expression: b
      In an equation for ‘eval’: eval (B b) = b
    • Relevant bindings include
        eval :: SqlExpr a -> a (bound at Main.hs:30:1)
   |
30 | eval (B b)             = b
   |

The compiler can’t tell that this is correct because there is nothing tying B to SqlExpr Bool and I to SqlExpr Int

We can solve this by introducing a universal result type :

data SqlExprResult
    = BoolResult Bool
    | IntResult Int


eval :: SqlExpr a -> SqlExprResult
eval (B b) = BoolResult b
eval (I i) = IntResult i
eval (Leq expr1 expr2) =
    let IntResult i1 = eval expr1
        IntResult i2 = eval expr2
    in BoolResult (i1 <= i2)
...

This universal result type is a tag, but unfortunately we have to pattern match on the result in our recursive constructors. This is an incomplete pattern match, making it possible to construct malformed statements like Leq (B True) (I 10), leading to runtime errors.

If we want to continue down the path of this initial encoding we can use fancy types like GADTs to eliminate these, this leads to a tagless initial encoding.

Switching to a Final Encoding

The paper Finally Tagless, Partially Evaluated presents an alternative to this initial encoding—the so-called final encoding. This encoding is called as such because it works in terms of the final representation rather than an intermediate datatype. Let’s write an implementation of our language SqlExpr language in a final encoding:

newtype SqlExpr a =
    SqlExpr { unSqlExpr :: a }


bool :: Bool -> SqlExpr Bool
bool b = SqlExpr b


int :: Int -> SqlExpr Int
int i = SqlExpr i


leq :: SqlExpr Int -> SqlExpr Int -> SqlExpr Bool
leq expr1 expr2 = SqlExpr (unSqlExpr expr1 <= unSqlExpr expr2)

...

eval :: SqlExpr a -> a
eval = unSqlExpr

This compiles, has no tags, and allows only well formed statements to compile. Even leq (bool True) (int 10) will fail to compile with the correct compiler error. This is all that final tagless means. And if this is the only interpretation our language needs, then we are done.

But this of course is only one of a family of interpreters available. Another such interpreter will generate the SQL expression rather than interpreting directly. Let’s write a version that generates output fitting for the rawQuery function in persistent.

{-# LANGUAGE OverloadedStrings #-}

import Data.Text.Lazy.Builder ( Builder )

-- Stand-in for Database.Persist.PersistValue
data PersistValue
    = PersistInt64 Integer
    | PersistBool Bool


data SqlExpr a =
    SqlExpr { unSqlExpr :: (Builder,  [PersistValue]) }


bool :: Bool -> SqlExpr Bool
bool b = SqlExpr ("?", [PersistBool b])


int :: Int -> SqlExpr Int
int i = SqlExpr ("?", [PersistInt64 (fromIntegral i)])


leq :: SqlExpr Int -> SqlExpr Int -> SqlExpr Bool
leq expr1 expr2 =
        let (b1, v1) = unSqlExpr expr1
            (b2, v2) = unSqlExpr expr2
            in SqlExpr ( b1 <> " <= " <> b2, v1 <> v2)


and_ :: SqlExpr Bool -> SqlExpr Bool -> SqlExpr Bool
and_ expr1 expr2 =
   let (b1, v1) = unSqlExpr expr1
       (b2, v2) = unSqlExpr expr2
   in SqlExpr ( b1 <> " AND " <> b2, v1 <> v2)


or_ :: SqlExpr Bool -> SqlExpr Bool -> SqlExpr Bool
or_ expr1 expr2 =
   let (b1, v1) = unSqlExpr expr1
       (b2, v2) = unSqlExpr expr2
   in SqlExpr ( b1 <> " OR " <> b2, v1 <> v2)


not_ :: SqlExpr Bool -> SqlExpr Bool
not_ expr =
    let (b, v) = unSqlExpr expr
    in SqlExpr ("NOT " <> b, v)

Both this and the preceding are valid interpretations. However, they are mutually exclusive. To support both, the paper presents creating a typeclass that abstracts over the representation. This is not a requirement to consider something final tagless. This confusion has led a lot of people astray and had them avoid the simple solution even in situations where several interpretations are not required.

Bonus Round: Context-aware encoding

There is potential issue in the code above. We aren’t setting parentheses. We also don’t want to use parentheses where they aren’t actually required. This requires us to know our context. One might think it wouldn’t be possible for a function to know what context it is in but we can use a trick where we make the context explicit as a function. Let’s look at our new version:

{-# LANGUAGE OverloadedStrings #-}

import Data.Text.Lazy.Builder ( Builder )

data PersistValue
    = PersistInt64 Integer
    | PersistBool Bool

type WithParens = Bool


parens :: Builder -> Builder
parens b = "(" <> b <> ")"


parensM :: Bool -> Builder -> Builder
parensM True  = parens
parensM False = id


data  SqlExpr a =
    SqlExpr { unSqlExpr :: WithParens -> (Builder,  [PersistValue]) }


bool :: Bool -> SqlExpr Bool
bool b = SqlExpr $ \_ -> ("?", [PersistBool b])


int :: Int -> SqlExpr Int
int i = SqlExpr (const ("?", [PersistInt64 (fromIntegral i)]))


leq :: SqlExpr Int -> SqlExpr Int -> SqlExpr Bool
leq expr1 expr2 =
        let (b1, v1) = unSqlExpr expr1 True
            (b2, v2) = unSqlExpr expr2 True
            in SqlExpr $ \p -> (parensM p (b1 <> " <= " <> b2), v1 <> v2)


and_ :: SqlExpr Bool -> SqlExpr Bool -> SqlExpr Bool
and_ expr1 expr2 =
    let (b1, v1) = unSqlExpr expr1 True
        (b2, v2) = unSqlExpr expr2 True
    in SqlExpr $ \p -> (parensM p  ( b1 <> " AND " <> b2), v1 <> v2)


or_ :: SqlExpr Bool -> SqlExpr Bool -> SqlExpr Bool
or_ expr1 expr2 =
    let (b1, v1) = unSqlExpr expr1 True
        (b2, v2) = unSqlExpr expr2 True
    in SqlExpr $ \p -> ( parensM p (b1 <> " OR " <> b2), v1 <> v2)


not_ :: SqlExpr Bool -> SqlExpr Bool
not_ expr =
    let (b, v) = unSqlExpr expr True
    in SqlExpr $ \p -> (parensM p ("NOT " <> b), v)


where_ :: SqlExpr Bool -> (Builder, [PersistValue])
where_ e =
    let (b, v) = unSqlExpr e False
    in ("WHERE " <> b, v)

Ben Levy is a Partner and Principal Software Engineer at Foxhound Systems. At Foxhound Systems, we focus on using Haskell to create fast and reliable custom built software systems. Looking for help with something you’re working on? Reach out to us at info@foxhound.systems.