The following is a blog post written for PureScript, but it should work in Haskell with only very few modifications.

The Tagless Final encoding has gained some steam recently, with some people hailing 2017 as the year of Tagless Final.
Since it is conceptually similar to the Free Monad, various comparisons have been drawn, and the one trade-off that always comes up is the lack, or at least the difficulty, of inspecting tagless final programs; in fact, I couldn’t find a single example of it on the web.
This seems to make sense, as programs in the tagless final encoding aren’t values the way programs expressed in terms of free structures are.
However, in this blog post, I’d like to dispel the myth that inspecting and optimizing tagless final programs is more difficult than using `Free`.

To start with, this blog post will use a different tagless final encoding, based on records instead of type classes. This allows us to treat interpreters as values. This technique is described here.

Without further ado, let’s get into it, starting with our example algebra, a very simple key-value store:

```
newtype KVStore f = KVStore
  { put :: String -> String -> f Unit
  , get :: String -> f (Maybe String)
  }
```

To get the easiest example out of the way, here’s how to achieve parallelism in a tagless final program:

```
program :: forall f m. Parallel m f => KVStore f -> f (Maybe String)
program (KVStore k) = do
  k.put "A" "a"
  x <- ((<>) <$> k.get "B") `parApply` k.get "C"
  k.put "X" (fromMaybe "-" x)
  pure x
```

This program makes use of the `Parallel` type class, which allows us to use the `parApply` combinator to run independent computations with a related `Applicative` type. This is already much simpler than doing the same thing with `Free` and `FreeApplicative`. For more info on `Parallel`, check out the docs here.

However, this is kind of cheating: we’re not really inspecting the structure of our program at all, so let’s look at an example where we actually have access to the structure and can use it for optimizations.

Let’s say we have the following program:

```
program :: forall f. Apply f => KVStore f -> f (List String)
program (KVStore k) =
  (\f s _ t -> catMaybes (f : s : t : Nil)) <$>
    k.get "Cats" <*> k.get "Dogs" <*> k.put "Mice" "42" <*> k.get "Cats"
```

Not a very exciting program, but it has some definite optimization potential.
Right now, if our `KVStore` implementation is an asynchronous one with a network boundary, our program will make 4 network requests sequentially when interpreted with the standard `Apply` instance of something like `Aff`.
We also have a duplicate request for the `"Cats"` key.

So let’s look at what we could potentially do about this.
The first thing we should do is extract the static information.
The easiest way to do so is to interpret the program into something we can use, by means of a `Monoid`.
This is essentially equivalent to the `analyze` function commonly found on `FreeApplicative`.

Getting this done is actually quite simple, as we can use `Const` as our `Applicative` data type whenever the left-hand side of `Const` is a `Monoid`.
I.e. if `m` has a `Monoid` instance, `Const m a` has an `Applicative` instance.
You can read more about `Const` here.
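To see why this works, here is a quick Haskell sketch of `Const`’s `Applicative` in action (the post notes at the top that its ideas port between PureScript and Haskell; `collected` and the string annotations are made-up names for illustration): `pure` wraps `mempty`, and `<*>` combines the two left-hand sides with `<>`, so interpreting an applicative program at `Const` simply folds it into a `Monoid`.

```haskell
import Data.Functor.Const (Const (..))

-- `pure` produces `mempty`; `<*>` unions the two annotations with `<>`,
-- ignoring the (phantom) result type entirely.
collected :: Const [String] Int
collected = (+) <$> Const ["left"] <*> Const ["right"]

main :: IO ()
main = print (getConst collected)
-- ["left","right"]
```

Note that the `Int` result is never produced; only the monoidal annotations survive, which is exactly what static analysis needs.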

```
import Prelude

import Data.Const (Const(..))
import Data.Set as S
import Data.StrMap as M
import Data.Tuple (Tuple(..))

analysisInterpreter :: KVStore (Const (Tuple (S.Set String) (M.StrMap String)))
analysisInterpreter = KVStore
  { put : \key value -> Const $ Tuple S.empty (M.singleton key value)
  , get : \key -> Const $ Tuple (S.singleton key) M.empty
  }
```

By using a `Tuple` of `Set` and `StrMap` as our `Monoid`, we now get all the unique keys for our `get` and `put` operations.
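The pair combines componentwise: the `Set`s union and the maps union. A small Haskell sketch of how the four operations of `program` fold together (using `containers` in place of PureScript’s `Data.Set`/`Data.StrMap`; the literal pairs below just mimic what `analysisInterpreter` emits):

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Each operation contributes a (gets, puts) pair; `<>` unions both
-- components pointwise, deduplicating the repeated "Cats" get.
analysis :: (Set.Set String, Map.Map String String)
analysis =
  (Set.singleton "Cats", Map.empty)
    <> (Set.singleton "Dogs", Map.empty)
    <> (Set.empty, Map.singleton "Mice" "42")
    <> (Set.singleton "Cats", Map.empty)

main :: IO ()
main = print analysis
-- (fromList ["Cats","Dogs"],fromList [("Mice","42")])
```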
Next, we can use this information to recreate our program in an optimized way.

```
optimizedProgram :: forall f. Applicative f => KVStore f -> f (List String)
optimizedProgram (KVStore k) =
  let (Const (Tuple gets puts)) = program analysisInterpreter
  in traverse (\(Tuple key val) -> k.put key val) (M.toUnfoldable puts :: List (Tuple String String))
    *> (catMaybes <$> traverse k.get (fromFoldable gets))
```

And we got our first very simple optimization.
It’s not much, but we can imagine the power of this technique.
For example, if we were using something like GraphQL, we could batch all of our `get` requests into one large request, so only one network roundtrip is made.
We could imagine similar things for other use cases; e.g. if we’re querying a bunch of team members that all belong to the same team, it might make sense to make just one request for all of the team’s members instead of requesting them individually.

Other more complex optimizations could involve writing a new interpreter with the information we gained from our static analysis. One could also precompute some of the computations and then create a new interpreter with those computations in mind.

Embedding our Applicative program inside a larger monadic program is also trivial:

```
program :: forall f. Apply f => String -> KVStore f -> f (List String)
program mouse (KVStore k) =
  (\f s _ t -> catMaybes (f : s : t : Nil)) <$>
    k.get "Cats" <*> k.get "Dogs" <*> k.put "Mice" mouse <*> k.get "Cats"

optimizedProgram :: forall f. Applicative f => String -> KVStore f -> f (List String)
optimizedProgram mouse (KVStore k) =
  let (Const (Tuple gets puts)) = program mouse analysisInterpreter
  in traverse (\(Tuple key val) -> k.put key val) (M.toUnfoldable puts :: List (Tuple String String))
    *> (catMaybes <$> traverse k.get (fromFoldable gets))

monadicProgram :: forall f. Monad f => KVStore f -> f Unit
monadicProgram (KVStore k) = do
  mouse <- k.get "Mice"
  list <- optimizedProgram (fromMaybe "64" mouse) (KVStore k)
  k.put "Birds" (fromMaybe "128" (head list))
```

Here we refactor our `optimizedProgram` to take an extra parameter `mouse`. Then in our larger `monadicProgram`, we perform a `get` operation and apply its result to `optimizedProgram`.

So now we have a way to optimize one specific program; next, we should see if we can introduce some abstraction.

First we’ll have to look at the shape of a generic program. Such programs are usually functions from an interpreter `alg f` to an expression inside the type constructor `f`, such as `f a`.

```
type Program alg a = forall f. Applicative f => alg f -> f a
```

The program is only defined by the algebra and the resulting `a`, so it should work for all type constructors `f`.

```
optimize
  :: forall alg f a m
   . Applicative f
  => Monoid m
  => Program alg a
  -> alg (Const m)
  -> (m -> alg f -> f a)
  -> alg f
  -> f a
optimize program extract restructure =
  let (Const m) = program extract
  in restructure m
```

Now we should be able to express our original optimization with this new generic approach:

```
optimizedProgram :: forall f. Applicative f => String -> KVStore f -> f (List String)
optimizedProgram mouse =
  optimize (program mouse) analysisInterpreter (\(Tuple gets puts) (KVStore k) ->
    traverse (\(Tuple key val) -> k.put key val) (M.toUnfoldable puts :: List (Tuple String String))
      *> (catMaybes <$> traverse k.get (fromFoldable gets)))
```

So far so good, we’ve managed to write a function to generically optimize tagless final programs.
However, one of the main advantages of tagless final is that implementation and logic should be separate concerns.
With what we have right now, we’re violating the separation, by mixing the optimization part with the program logic part.
Our optimization should be handled by the interpreter, just as the sequencing of the individual steps of a monadic program is the job of the target `Monad` instance.

One way to go forward, is to create a typeclass that requires certain algebras to be optimizable. This typeclass could be written using the generic function we wrote before, so let’s see what we can come up with:

```
type OptimizerReqs alg f m =
  { extract :: alg (Const m)
  , rebuild :: m -> alg f -> f (alg f)
  }

class (Monad f, Monoid m) <= Optimizer alg f m | alg f -> m where
  reqs :: OptimizerReqs alg f m

optimize
  :: forall alg f m a
   . Optimizer alg f m
  => Program alg a
  -> alg f
  -> f a
optimize prog interpreter =
  let r = reqs :: OptimizerReqs alg f m
      (Const m) = prog r.extract
  in r.rebuild m interpreter >>= prog
```

This might look a bit daunting at first, but we’ll go through it bit by bit.
First we define our type class `Optimizer`, parameterized by an algebra `alg :: (* -> *) -> *` and a type constructor `f :: * -> *`.
This means we can define different optimizations for different algebras and different target types.
For example, we might want one optimization for a production `Optimizer KVStore (ExceptT e (Aff eff)) m` and another for a testing `Optimizer KVStore Identity m`.
Next, for our static analysis we need a `Monoid m`, so we parametrize the `Optimizer` with an extra parameter `m`.

The next two functions should seem familiar.
The `extract` function defines an interpreter to get an `m` out of our program.
The `rebuild` function takes that value of `m` and the interpreter and produces an `f (alg f)`, which can be understood as an `f` of an interpreter.
This means we can statically analyze a program and then use the result to create a new optimized interpreter, and this is exactly what the `optimize` function does.
It is also why we need the `Monad` constraint on `f`: we could get away with an `Applicative` constraint by returning just a new interpreter `alg f` from the `rebuild` function, but returning `f (alg f)` allows us to do more.

Let’s see what our program would look like with this new functionality:

```
monadicProgram :: forall f m. Optimizer KVStore f m => KVStore f -> f Unit
monadicProgram (KVStore k) = do
  mouse <- k.get "Mice"
  list <- optimize (program $ fromMaybe "64" mouse) (KVStore k)
  k.put "Birds" (fromMaybe "128" (head list))
```

Looking good so far; now all we need to run this is an actual instance of `Optimizer`.
We’ll use the standard `Aff` for this, and for simplicity our new optimization will only look at the `get` operations:

```
extract :: KVStore (Const (S.Set String))
extract = KVStore
  { get : \key -> Const $ S.singleton key
  , put : \_ _ -> Const S.empty
  }

rebuild :: forall e. S.Set String -> KVStore (Aff e) -> Aff e (KVStore (Aff e))
rebuild gs (KVStore interp) =
  precomputed <#> (\m -> KVStore $ interp
    { get = \key -> case M.lookup key m of
        Just a -> pure $ Just a
        Nothing -> interp.get key
    })
  where
    tupleList :: Aff e (List (Maybe (Tuple String String)))
    tupleList =
      parTraverse (\key -> interp.get key <#> (\m -> m <#> \s -> key /\ s)) (fromFoldable gs)

    precomputed :: Aff e (M.StrMap String)
    precomputed = tupleList <#> (M.fromFoldable <<< catMaybes)

instance kvStoreAffOptimizer :: Optimizer KVStore (Aff e) (S.Set String) where
  reqs = { extract, rebuild }
```

Our `Monoid` type is just a simple `Set String` here, as the `extract` function will only collect the keys of the `get` operations into the `Set`.
Then with `rebuild` we build up our new interpreter.
First we want to precompute all the `get`s of the program.
To do so, we run all of them in parallel and put the results into a map, discarding the keys where the `get` operation returned `Nothing`.
Once we have that precomputed map, we create a new interpreter with it: whenever the key given to a `get` operation is in the precomputed map, we simply lift its value into `Aff e (Maybe String)` instead of performing an actual request.
Otherwise, and for all the `put` operations, we’ll simply run the original interpreter.
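The lookup-then-fall-back core of `rebuild` can be sketched in a few lines of Haskell (`prefetched`, `optimizedGet`, and `realGet` are hypothetical names; `IO` stands in for `Aff`, and the literal list mimics the parallel prefetch results):

```haskell
import qualified Data.Map as Map
import Data.Maybe (catMaybes)

-- Hypothetical results of the parallel prefetch; Nothing results are dropped.
prefetched :: Map.Map String String
prefetched = Map.fromList (catMaybes [Just ("Cats", "Cats!"), Nothing, Just ("Dogs", "Dogs!")])

-- The optimized get: serve from the precomputed map first, else fall back
-- to the real interpreter's get.
optimizedGet :: (String -> IO (Maybe String)) -> String -> IO (Maybe String)
optimizedGet realGet key =
  case Map.lookup key prefetched of
    Just v  -> pure (Just v) -- served from the precomputed map, no request made
    Nothing -> realGet key   -- actual request

main :: IO ()
main = do
  r <- optimizedGet (\k -> pure (Just (k <> "?"))) "Cats"
  print r
-- Just "Cats!"
```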

Now we should have a great optimizer for `KVStore` programs interpreted into `Aff`.
Let’s see how we did by interpreting into a silly implementation that only prints whenever you use one of the operations:

```
testInterpreter :: forall e. KVStore (Aff e)
testInterpreter = KVStore
  { put : \_ value -> do
      liftEff $ unsafeCoerceEff $ log $ "Put something: " <> value
      pure unit
  , get : \key -> do
      liftEff $ unsafeCoerceEff $ log $ "Hit network for " <> key
      pure $ Just $ key <> "!"
  }
```

Now let’s run our program with this interpreter and the optimizations!

```
launchAff $ monadicProgram testInterpreter

-- Hit network for Mice
-- Hit network for Cats
-- Hit network for Dogs
-- Put something: Mice!
-- Put something: Cats!
```

And it works! We’ve now got a principled way to write programs that can then be optimized.

## Conclusion

Designing a way to completely separate the problem description from the actual problem solution is fairly difficult. The tagless final encoding gives us one fairly simple way to do so. Using the technique described in this blog post, we can gain even more control over the problem solution by inspecting the structure of our programs statically.

Another thing we haven’t covered here is programs with multiple algebras, which, as you can surely imagine, is quite a bit more complex; maybe that will be the topic of a follow-up blog post.

The code for this blog post can be found here, if people find it useful enough, I’ll publish and document it!

What kinds of problems and techniques would you like to see with regards to tagless final? I’d love to hear from you in the comments!