Over the past several weeks I have implemented most of Simon Peyton Jones' proposal for a major revision to Template Haskell. This brings several new features to Template Haskell, including:

- Typed Template Haskell brackets and splices.
- Pattern splices and local declaration splices.
- The ability to add (and use) new top-level declarations from within top-level splices.

I will concentrate on the first new feature because it allows us to generate and
compile Haskell values at run-time without sacrificing type safety. The code in
this post is available on github; the
github repository README contains instructions for building the `th-new`

branch
of GHC, which is where work on typed Template Haskell is being done. The GHC
wiki contains more details about
the current
implementation status. The plan is that this work will land in HEAD before the
7.8 release of GHC.

## The staged power function: untyped

The driving example I will use is the staged power function. The idea is that we want to compute \(x^n\) where \(n\) is known statically (that is, it is a constant we know ahead of time), but \(x\) may vary. John Lato wrote a blog post, Runtime Meta-programming in Haskell, that shows how to implement the staged power function while providing a veneer of type-safety on top of unsafe runtime code generation using "classic" Template Haskell. Examining his implementation is instructive, so I'll start there.

John's "safe" Template Haskell expression type is a `newtype`

wrapper around the
Template Haskell expression type, `ExpQ`

, that adds a phantom type parameter.

newtype Meta a = Meta { unMeta :: ExpQ }

This definition means that for all α, we can construct a `Meta`

α from
*any* `ExpQ`

by applying the constructor `Meta`

. For example

e :: Meta Int e = Meta [|'c'|]

Not very safe! However, using the `Meta`

type, we can write combinators that
*are* safe in the sense that if their inputs are "well-typed" `Meta`

values,
then their outputs will be "well-typed" `Meta`

values. John defines one such
safe combinator as follows

compose :: Meta (a -> b) -> Meta (b -> c) -> Meta (a -> c) compose (Meta l) (Meta r) = Meta [| $r . $l |]

Although the `compose`

function uses unsafe operations internally, if its two
arguments really are Meta-wrapped Template Haskell expressions with the types ```
a
-> b
```

and `b -> c`

, respectively, then the result really will be a Meta-wrapped
Template Haskell expression of type `a -> c`

. This isn't enforced by the type
system, but it's true nonetheless.

John defines the staged power function as follows

mkPow :: Int -> Meta (Int -> Int) mkPow 0 = Meta [| const 1 |] mkPow n = Meta [| \x -> x * $(unMeta $ mkPow (n-1)) x |]

Again, the type system doesn't enforce safety in the sense that it doesn't
guarantee that the bit of Template Haskell that `mkPow`

produces really has type
`Int -> Int`

, but we can see by inspection that it will always return a Template
Haskell expression that has type `Int -> Int`

. With classic Template Haskell,
this is just about all that we can hope for.

## The staged power function: typed

With the new version of Template Haskell, we can do better. Here is the
definition of `compose`

using the new Template Haskell typed brackets and typed
splices (I have also changed the definition of `compose`

so that the arguments
are in the proper order :)).

compose :: TExpQ (b -> c) -> TExpQ (a -> b) -> TExpQ (a -> c) compose f g = [|| $$f . $$g ||]

We can see a new data type, `TExpQ a`

, and new syntax for splices and
brackets. `TExpQ a`

is a type alias for `Q (TExp a)`

, just as `ExpQ`

is an alias
for `Q Exp`

. A `TExp a`

is a *typed* Template Haskell expression—the type
checker guarantees that it is the abstract syntax representation of a Haskell
term of type `a`

. This is in contrast to `Exp`

, the existing *untyped*
expression representation used by Template Haskell. A value of type `Exp`

is the
abstract syntax of a Haskell expression of *some* type, but a value of type
`TExp a`

is the abstract syntax of a Haskell expression of type `a`

.

Values of type `TExp a`

can only be constructed using typed brackets and typed
splices. The syntax `[||e||]`

is a typed bracket, and the syntax `$$e`

is a
typed splice. No other varieties of splice may occur in a typed bracket. In
other words, `[||e||]`

is the only introduction form for `TExpQ`

, and `$$e`

is
the elimination form. It not the *only* elimination form; Template Haskell also
provides a constant `unType :: TExp a -> Exp`

that allows coercion from the type
Template Haskell world to the untyped world; it simply throws away the type
information

We can now write a completely safe version of the `mkPow`

function.

mkPow :: Num a => Int -> TExpQ (a -> a) mkPow 0 = [|| const 1 ||] mkPow n = [|| \x -> x * $$(mkPow (n-1)) x ||]

This version is type-checked by the compiler; there is no unsafe newtype wrapping/unwrapping going on.

## Under the covers

If you look at the definition of `TExp`

in the `template-haskell`

library,
you'll see that it actually is a newtype wrapper.

newtype TExp a = TExp { unType :: Exp }

Why is this safe? Because the *compiler* is checking that a typed bracket is
well-typed and then creating a `TExp`

whose type index *is that type*. Of course
you can always subvert type safety by importing `Language.Haskell.TH.Syntax`

and
mucking about with `TExp`

's. But as long as you stick to typed brackets and
splices, you will get type safety.

## Run time code generation and compilation

Now that we can generate type safe Haskell expressions at run time, can we also safely compile them at run time? The answer is a qualified yes.

First, how can we compile a `TExp a`

? John uses the
plugins package, originally
written by Don Stewart. I will use the GHC API directly. My implementation
strategy is the same as his: print a Template Haskell expression to a file and
compile and load the file. For this to work, we need two invariants to hold:

- The term that we pretty-print must be a faithful representation of the term that was type checked.
- The API that we use to compile the term we print out must assign it the same type that GHC did when it checked the term.

You might think that pretty-printing is safe, but in fact I had to fix a few
bugs that caused the pretty-printed versions of some Template Haskell terms to
*not even parse as valid Haskell*! I assume that the underlying expression value
is a faithful representation of the term that GHC type checked.

The second property does not obviously hold. First of all, we need to make sure (approximately) that the type environment we use to compile the term is the same as the type environment used to type check it. Template Haskell resolves some names so that they are fully qualified. Perhaps we could walk over the term to figure out exactly which modules we need to import to compile the term, but I'm not sure that is sufficient. At the moment I've punted, so run time compilation may fail at run time. See the github repository for the details. Furthermore, even if we got all the imports right, defaulting and such might conceivably get in the way, so we slap a type signature on out term when we print it out to make sure the type of out compiled term really matches the type of the typed bracket.

Given the function `compile :: forall a . Typeable a => TExp a -> IO a`

, which
is defined in the github repository accompanying this post, we can now compile
and run our staged power function. Note that the `Typeable`

constraint is what
allows us to print out the type of the term when we feed it to the GHC API.

main :: IO () main = do powTExp :: (TExp (Double -> Double)) <- runQ (mkPow 23) putStrLn . pprint . unType $ powTExp pow <- compile powTExp print $ pow 2.0 print $ pow 10.0

## Related work

Typed brackets bring much of the power of MetaML^{1} to Haskell. They are
also similar to F#'s code quotations^{2}; however, Haskell's untyped
brackets are now completely untyped—the type checker doesn't check them at
all. In classic Template Haskell, the compiler attempted to type check all
varieties of bracket, but an expression bracket still resulted in a plain old
`ExpQ`

. F#'s untyped quotations *are* type checked, although their precise
semantics are unclear to me. I like the untyped/typed distinction that is now
made by the new Template Haskell; typed brackets really are typed, and untyped
brackets are only parsed, never type checked.

Typed Template Haskell is quite different from MetaHaskell^{3},
which provides a mechanism for quoting many different object languages while
maintaining type safety. MetaHaskell also supports quotation and antiquotation
using open code, which introduces a host of difficulties.

## Footnotes:

^{1}

W. Taha and T. Sheard,
"Multi-stage
programming with explicit annotations," in *Proceedings of the 1997 ACM SIGPLAN symposium on Partial Evaluation and Semantics-Based Program Manipulation (PEPM '97)*, Amsterdam, The Netherlands, 1997, pp. 203–217.

^{2}

^{3}

G. Mainland,
"Explicitly
heterogeneous metaprogramming with MetaHaskell," in *Proceedings of the 17th
ACM SIGPLAN International Conference on Functional Programming (ICFP '12)*,
Copenhagen, Denmark, 2012, pp. 311–322.

Could you please elaborate on the difficulties introduced by free variables? Would be interesting to compare your experience with ours in the Scala reflection framework.

ReplyDeleteAlso, don't you find typed quasiquotes to be at times limiting? When would you recommend using them instead of untyped quasiquotes?

Finally have you looked into a possibility of mixing untyped and typed quasiquotes? Sometimes you know the type of an untyped quasiquote in advance - it's just that it's composed of parts that cannot be statically typed, so it cannot be given the type statically. In Scala we allow programmers to wrap untyped Tree's in typed Expr[T]'s. Have considered such a design?

I am really looking forward to using your improvements to Template Haskell, both for further work on metaprogramming and because I've hit some of the bugs that originally led to SPJ's proposal.

ReplyDeleteI'm not sure of a solution to the type environment either. I think walking the AST to get the correct imports is probably good enough for at least 95% of cases. But I'm not yet convinced it always is, or that it's the best way.