Understanding applicatives in Haskell
10 Apr 2024 (haskell, functional-programming)
My previous post was about the problem of parsing left-recursive grammars when using the parser combinator pattern in Haskell. In my code examples, I made extensive use of applicatives but didn’t explain what they are or how they work. This was partially because their syntax is fairly intuitive when applied to parser construction, but also because I hadn’t really grokked them myself—I was relying heavily on that intuitiveness without properly understanding what I was doing. So, I decided to take a deeper look at applicatives, what they are and how they work.
Rather than defining applicatives from the get-go, I’ll present how they’re used by parser libraries, as this is the way I first became properly aware of them.
Let’s say we’re building a programming language, and we want to be
able to declare variables in the style of C—for example, we might
declare an integer like int x;
. To represent this in the
AST, we might create a type like so:
data VarDecl = VarDecl { varType :: String, varName :: String }
Then to parse this kind of expression, we might create a parser like so:
varDeclP :: Parser VarDecl
= VarDecl <$> string <*> string <* symbol ";" varDeclP
As I mentioned previously, it is fairly intuitive to read this code
as “we want to parse a string, then another string, then a semicolon and
then use those things to construct a VarDecl
.” But how
exactly does this work? It isn’t immediately obvious how the two strings
end up populating the varType
and varName
fields correctly. To understand it, let’s break down this parser
function piece by piece.
Data constructors are functions
The first important thing to know is that in Haskell, instantiating a
data type is done using a data constructor, and these data
constructors are just functions. In the snippet above where we define
the VarDecl
type, we write VarDecl
on both
sides of the =
sign. The instance on the left is defining
the name of the type, whereas the instance on the right is
specifying the name of the constructor used for instantiating
the type. If we wanted to, we could give the constructor a different
name from the type, like VarDeclConstructor
.
We then specify that VarDecl
is a record type by placing
the fields we expect between curly braces. It isn’t obvious, but we’re
actually creating a function called VarDecl
with the
signature String -> String -> VarDecl
—a function that
takes two strings and returns a VarDecl
. The function’s
arguments are the fields we specified between the curly braces, in the
order we specified them.
To explicitly construct a VarDecl
for the expression
int x;
we would use the data constructor like so:
VarDecl "int" "x"
-- This creates a value like
-- VarDecl { varType = "int", varName = "x" }
Understanding data constructors gets us a step closer to
understanding how the two strings in varDeclP
are used to
construct a VarDecl
: we’re simply passing arguments to a
function in the order they appear in the constructor definition.
What is a functor?
The next term in the parser is the <$>
operator.
This is an alias of the fmap
function, made into an infix
operator. fmap
is part of the Functor
typeclass—it has the signature:
fmap :: Functor f => (a -> b) -> f a -> f b
It is used to “lift” a function into a functorial context, meaning a
function that takes a
and returns b
becomes a
function that takes f a
(a
in the context of a
functor f
) and returns f b
(b
in
the context of a functor f
).
I won’t go into detail about functors here, but they can be thought
of as types which can be mapped over. The clearest example of a functor
is the list type. For a list, fmap
behaves exactly the same
as map
: replacing f
in fmap
’s
signature with list types gives us:
fmap :: (a -> b) -> [a] -> [b]
fmap
takes a function from a -> b
and
gives us back a function that applies that function to each element in a
list and then returns the resulting list. Since <$>
is an alias for fmap
, the following expressions mean the
same thing:
fmap f xs == f <$> xs
Importantly, the behaviour of fmap
changes based on the
particular functorial context of its second argument. Let’s add
parentheses to our parser to make it clearer how this applies there.
varDeclP :: Parser VarDecl
= ((VarDecl <$> string) <*> string) <* symbol ";" varDeclP
Function application is left-associative, so the parentheses do not alter the function’s behaviour, they only highlight the associativity.
The first sub-expression we have is
VarDecl <$> string
. We are lifting the data
constructor VarDecl
(which, again, is just a function) into
whichever context string
is in. Which context is that? The
string
parser has the signature
string :: Parser String
, so assuming Parser
is
a member of the Functor
typeclass (it needs to be for the
parser combinator pattern to work) we can use it as the second argument
to fmap
. Let’s substitute Parser
for
f
and String
for a
in the
fmap
type signature:
fmap :: (String -> b) -> Parser String -> Parser b
This works, but what is the type of b
? Well, functions
are always curried, and the VarDecl
constructor has the
signature VarDecl :: String -> String -> VarDecl
, so
passing a single string argument to it leaves us with a resulting
function of type String -> VarDecl
—this is
b
.
fmap :: (String -> (String -> VarDecl)) -> Parser String -> Parser (String -> VarDecl)
Because our VarDecl
constructor is a binary function,
mapping it over a single argument in the Parser
context
results in a partially applied function lifted into that context. What
we want is to continue passing arguments to the function so we can
finish building our VarDecl
record, but we’ll need another
way to invoke it now that it’s in the Parser
context. Now
we’re ready to introduce applicatives.
What is an applicative?
An applicative (short for applicative functor) is a special case of
functors, where the type of value inside the functor can be a function.
For a type to qualify as a member of the Applicative
class,
it must implement two functions:
pure :: Functor f => a -> f a
<*> :: Functor f => f (a -> b) -> f a -> f b
The pure
function takes a value of type a
and lifts it into the context of a functor type f
. This
means that for a Functor f
to also be a member of the
Applicative
class, it must be possible to construct an
instance of f
from a single value.
The <*>
function, called the sequential
application operator, takes an applicative value (a function
a -> b
in a functorial context) and a value of type
f a
(a
in the same context). The value inside
the f a
argument is fed into the function
a -> b
inside the applicative, and the resulting
b
value is returned, still inside the f
context. As with functors, the specifics of how this behaviour is
implemented is different for each instance of
Applicative
.
Let’s revisit the list functor for a concrete example. To test if
lists are a member of the Applicative
class, we’ll need to
see if we can implement the pure
and <*>
functions. We can implement pure
by simply taking some
value x
and creating a list where x
is the
only element.
pure :: a -> [a]
pure x = [x]
For sequential application, we need to accept a list of functions
a -> b
along with a list of a
values, apply
the each function in the first list to each value in the second, and
then return the list of results.
<*> :: [a -> b] -> [a] -> [b]
:fs) <*> xs = (f <$> xs) <> (fs <*> xs)
(f<*> _ = [] []
This recursive implementation takes the list of functions and the
list of values, and it maps the first function over the full list of
values (using the fmap <$>
operator discussed
previously). Then it calls itself with the remainder of the list of
functions and the full list of values. When finally called with an empty
list of functions, it returns an empty list of values. This is not the
official implementation, it’s only an example to demonstrate that both
functions required by the Applicative
class can be
implemented for lists, and hence, lists are an example of an
Applicative
type.
Parser
as an
applicative
Returning to our parser function, we’ve evaluated the first
sub-expression and found that it evaluates to the type
Parser (String -> VarDecl)
. We know that
Parser
is a functor, so this type certainly looks like an
applicative, and indeed it is. Following the parser combinator pattern,
the Parser
type might be defined like so:
data Parser a = Parser { parse :: String -> Maybe (a, String) }
The Parser
has a parse
method which
attempts to construct a value of type a
by consuming
characters from an input string. If successful, the value is returned in
a tuple along with the remainder of the input string. Otherwise, it
returns Nothing
. Typically Either
would be
used instead of Maybe
to enable more informative errors to
be returned, but I’m using Maybe
here for simplicity. All
parsing errors just result in Nothing
.
The pure
and <*>
functions for this
type might be implemented like so:
pure :: a -> Parser a
pure x = Parser $ \input -> Just (a, input)
<*> :: Parser (a -> b) -> Parser a -> Parser b
Parser f <*> Parser g = do
Parser f <*> Parser g = Parser $ \input ->
case f input of
Nothing -> Nothing
Just (f', input') ->
case g input' of
Nothing -> Nothing
Just (val, rem) -> Just (f' val, rem)
The pure
function is fairly self-explanatory—it simply
takes a value x
and uses it to construct a parser that
consumes no input and always returns x
along with the full
input stream.
Sequential application is a little more complicated. This function
takes two parsers as its arguments: the first is a parser that attempts
to construct a function a -> b
; the second is a parser
that attempts to construct a value of type a
. It runs the
first parser on the input, and if it succeeds to construct the function
a -> b
, it continues by then applying the second parser
to the remaining input. If the second parser is also successful in
constructing a value of type a
, that value is passed to the
function a -> b
, resulting in a value of type
b
. This final value is then returned in the form of a
parser which always returns that b
value without consuming
any input. If either of the parsers fail, the whole operation results in
Nothing
.
This part had me stumped for a while. I couldn’t wrap my head around
the idea of a function inside a parser. A type like
Parser String
or Parser Int
is intuitive to
understand—they’re trying to parse strings or integers from their
inputs. But how could an unevaluated function be parsed from the text?
The key to understanding this is to think of Parser
not as
a container, but as a name for a particular kind of computation—namely a
computation that has access to, and can consume, a stream of input text.
The type passed to Parser
is the type of value you’re
trying to construct within that context. That is,
Parser (a -> b)
is not a parser of functions (which
doesn’t seem to make much sense), it is a function
a -> b
resulting from a computation in the
Parser
context.
With that said, we can now better make sense of our parser function.
Again, the sub-expression VarDecl <$> string
results
in a value of type Parser (String -> VarDecl)
. We have
used the string
function to consume a String
from the input text, and fed this into the VarDecl
constructor, partially applying it and resulting in another function
String -> VarDecl
. Because this is part of a computation
in the Parser
context (we lifted VarDecl
into
that context using fmap <$>
), this resulting function
is also in the Parser
context.
Working our way from the inside out, the next sub-expression is
(VarDecl <$> string) <*> string
(including the
expression we’ve already looked at). Here we’re using the sequential
application operator <*>
. Its first argument is the
result of the first expression (of type
Parser (String -> VarDecl)
) and its second argument is
the string
function which has a type of
Parser String
. Let’s substitute these types into the type
signature for the <*>
function.
<*> :: Parser (String -> VarDecl) -> Parser String -> Parser VarDecl
Thinking about how the <*>
is implemented for the
Parser
type, this expression will run the first parser to
parse a string and use it to partially apply the VarDecl
constructor as we’ve already described. Then it will run the second
parser to parse another string, and if successful, it will pass that
string into the partially applied function, finally arriving at a value
of type VarDecl
which it returns still wrapped in the
Parser
context. Voilà! We’ve constructed the
VarDecl
type by composing parsers together.
But hold on, the full function is:
= ((VarDecl <$> string) <*> string) <* symbol ";" varDeclP
There is another part to the expression, a symbol
parser
which looks for the semicolon that terminates our int x;
expression. It is important for the semicolon to be there, but we don’t
need its value to construct a VarDecl
record, so how is
this parser incorporated into the function?
This final parser is being applied using an <*
operator we haven’t looked at yet. This operator is also a part of the
Applicative
class, but it is not part of the minimal
definition. The type signature for this operator is:
<* :: Applicative f => f a -> f b -> f a
This is a function which takes two values both in an applicative context, and just seems to return the first one. This initially seems kind of pointless, but take a look at how it might be implemented.
Parser f <* Parser g = Parser $ \input ->
case f input of
Nothing -> Nothing
Just (x, rem) ->
case g rem of
Nothing -> Nothing
Just (_, rem') -> Just (x, rem')
Although the value produced by the second parser isn’t used, we do
nonetheless run both parsers, allowing them both to consume input and
possibly fail. If the second parser does fail, the whole expression also
fails. This is ideal for our case where we want to parse a semicolon but
then throw the parsed value away. Let’s plug the types from our parser
into the type signature for <*
.
<* :: Parser VarDecl -> Parser String -> Parser VarDecl
The parser on the left-hand side of the <*
is
executed and its resulting value is the value returned from the overall
expression. The parser on the right-hand side does have to run
successfully, but its resulting value has no effect on the output of the
overall expression.
The bigger picture
We’ve looked at how applicatives are used in the parser combinator
pattern, but the Applicative
class is intentionally
abstract so that it can be used in lots of different contexts. In
Haskell, I often find myself able to understand an abstraction well
enough to use it in a certain context, but unable to fully grasp the
essence of what it aims to express.
For example, I’ve been aware for quite a long time that an
applicative is “a function inside a functor”. That mental model works
well enough when thinking about lists or the Maybe
type,
because both of those types can be thought of as containers for other
values, which is intuitive. However, that understanding tripped me up
when it came to learning about parsers; how can a parser be a container
for a value, especially when that value is a function? Haskell is unique
in that you can often ascertain a lot about what a function does just by
looking at its type signature. Of course, this often isn’t all the
information you need to effectively use it, implementation does matter,
but often the essence of an abstraction is best understood by looking at
its type.
Take another look at the type of the sequential application operator:
<*> :: Functor f => f (a -> b) -> f a -> f b
It bears a resemblance to the type of the function application
operator $
:
$ :: (a -> b) -> a -> b
They seem to be expressing the same thing, except the former is in a
functorial context. The function application operator $
is
a controversial one, because in a simple expression, it doesn’t do
anything useful. For example, the following expressions are
equivalent:
== f $ x f x
The only purpose served by the operator is in longer, more complex
expressions where it can be used to disambiguate the precedence of
terms. The reason for this is because there is no question of how a
function a -> b
can be mapped over a value
a
—this is a pure function and a pure value, so Haskell
knows how to evaluate it. This is the essence of applicatives as an
abstraction.
The sequential application operator <*>
is serving a very similar purpose to the function application
operator $
, but in the former case, Haskell doesn’t know
how to apply the function to the argument, because they’re both in an
applicative context. The <*>
operator leverages the
applicative type to gain further information about how to map
the function over the value.
This is incredibly powerful; we’re able to express the composition of potentially very complex logic in terms of simple functions, because the complexity is abstracted away in the type. In the language of category theory, we’re able to move into another context (like a parser context) and express morphisms in that context using the syntax and semantics of functions.
After improving my understanding of applicatives, the question occurred to me: “then what is a monad?” That could warrant its own post, monads serve a similar purpose, the difference essentially being that monads can be chained together with the results of one term in the chain depending on the results of the previous terms—this isn’t possible with applicatives. In cases where that sort of behaviour isn’t needed, applicatives are a more natural fit.