We’ve seen how monads can be used to take values with contexts and apply
them to functions and how using >>=
or do
notation allows us to focus
on the values themselves while the context gets handled for us.
We’ve met the Maybe
monad and seen how it adds a context of possible
failure to values. We’ve learned about the list monad and saw how it
lets us easily introduce non-determinism into our programs. We’ve also
learned how to work in the IO
monad, even before we knew what a monad
was!
In this chapter, we’re going to learn about a few other monads. We’ll see how they can make our programs clearer by letting us treat all sorts of values as monadic ones. Exploring a few monads more will also solidify our intuition for monads.
The monads that we’ll be exploring are all part of the mtl
package. A
Haskell package is a collection of modules. The mtl
package comes with
the Haskell Platform, so you probably already have it. To check if you
do, type ghc-pkg list
in the command-line. This will show which Haskell
packages you have installed and one of them should be mtl
, followed by a
version number.
We’ve loaded our gun with the Maybe
monad, the list monad and the IO
monad. Now let’s put the Writer
monad in the chamber and see what
happens when we fire it!
Whereas Maybe
is for values with an added context of failure and the
list is for non-deterministic values, the Writer
monad is for values
that have another value attached that acts as a sort of log value.
Writer
allows us to do computations while making sure that all the log
values are combined into one log value that then gets attached to the
result.
For instance, we might want to equip our values with strings that explain what’s going on, probably for debugging purposes. Consider a function that takes a number of bandits in a gang and tells us if that’s a big gang or not. That’s a very simple function:
isBigGang :: Int -> Bool
isBigGang x = x > 9
Now, what if instead of just giving us a True
or False
value, we want it
to also return a log string that says what it did? Well, we just make
that string and return it alongside our Bool
:
isBigGang :: Int -> (Bool, String)
isBigGang x = (x > 9, "Compared gang size to 9.")
So now instead of just returning a Bool
, we return a tuple where the
first component of the tuple is the actual value and the second
component is the string that accompanies that value. There’s some added
context to our value now. Let’s give this a go:
ghci> isBigGang 3
(False,"Compared gang size to 9.")
ghci> isBigGang 30
(True,"Compared gang size to 9.")
So far so good. isBigGang
takes a normal value and returns a value with
a context. As we’ve just seen, feeding it a normal value is not a
problem. Now what if we already have a value that has a log string
attached to it, such as (3, "Smallish gang.")
, and we want to feed it to
isBigGang
? It seems like once again, we’re faced with this question: if
we have a function that takes a normal value and returns a value with a
context, how do we take a value with a context and feed it to the
function?
When we were exploring the Maybe
monad, we made a function applyMaybe
,
which took a Maybe a
value and a function of type a -> Maybe b
and fed
that Maybe a
value into the function, even though the function takes a
normal a
instead of a Maybe a
. It did this by minding the context that
comes with Maybe a
values, which is that they are values with possible
failure. But inside the a -> Maybe b
function, we were able to treat
that value as just a normal value, because applyMaybe
(which later
became >>=
) took care of checking if it was a Nothing
or a Just
value.
In the same vein, let’s make a function that takes a value with an
attached log, that is, an (a,String)
value and a function of type
a -> (b,String)
and feeds that value into the function. We’ll call it
applyLog
. But because an (a,String)
value doesn’t carry with it a
context of possible failure, but rather a context of an additional log
value, applyLog
is going to make sure that the log of the original value
isn’t lost, but is joined together with the log of the value that
results from the function. Here’s the implementation of applyLog
:
applyLog :: (a,String) -> (a -> (b,String)) -> (b,String)
applyLog (x,log) f = let (y,newLog) = f x in (y,log ++ newLog)
When we have a value with a context and we want to feed it to a
function, we usually try to separate the actual value from the context
and then try to apply the function to the value and then see that the
context is taken care of. In the Maybe
monad, we checked if the value
was a Just x
and if it was, we took that x
and applied the function to
it. In this case, it’s very easy to find the actual value, because we’re
dealing with a pair where one component is the value and the other a
log. So first we just take the value, which is x
and we apply the
function f
to it. We get a pair of (y,newLog)
, where y
is the new result
and newLog
the new log. But if we returned that as the result, the old
log value wouldn’t be included in the result, so we return a pair of
(y,log ++ newLog)
. We use ++
to append the new log to the old one.
Here’s applyLog
in action:
ghci> (3, "Smallish gang.") `applyLog` isBigGang
(False,"Smallish gang.Compared gang size to 9")
ghci> (30, "A freaking platoon.") `applyLog` isBigGang
(True,"A freaking platoon.Compared gang size to 9")
The results are similar to before, only now the number of people in the
gang had its accompanying log and it got included in the result log.
Here are a few more examples of using applyLog
:
ghci> ("Tobin","Got outlaw name.") `applyLog` (\x -> (length x, "Applied length."))
(5,"Got outlaw name.Applied length.")
ghci> ("Bathcat","Got outlaw name.") `applyLog` (\x -> (length x, "Applied length"))
(7,"Got outlaw name.Applied length")
See how inside the lambda, x
is just a normal string and not a tuple and
how applyLog
takes care of appending the logs.
Be sure you know what monoids are at this point! Cheers.
Right now, applyLog
takes values of type (a,String)
, but is there a
reason that the log has to be a String
? It uses ++
to append the logs,
so wouldn’t this work on any kind of list, not just a list of
characters? Sure it would. We can go ahead and change its type to this:
applyLog :: (a,[c]) -> (a -> (b,[c])) -> (b,[c])
Now, the log is a list. The type of values contained in the list has to
be the same for the original list as well as for the list that the
function returns, otherwise we wouldn’t be able to use ++
to stick them
together.
Would this work for bytestrings? There’s no reason it shouldn’t.
However, the type we have now only works for lists. It seems like we’d
have to make a separate applyLog
for bytestrings. But wait! Both lists
and bytestrings are monoids. As such, they are both instances of the
Monoid
type class, which means that they implement the mappend
function.
And for both lists and bytestrings, mappend
is for appending. Watch:
ghci> [1,2,3] `mappend` [4,5,6]
[1,2,3,4,5,6]
ghci> B.pack [99,104,105] `mappend` B.pack [104,117,97,104,117,97]
Chunk "chi" (Chunk "huahua" Empty)
Cool! Now our applyLog
can work for any monoid. We have to change the
type to reflect this, as well as the implementation, because we have to
change ++
to mappend
:
applyLog :: (Monoid m) => (a,m) -> (a -> (b,m)) -> (b,m)
applyLog (x,log) f = let (y,newLog) = f x in (y,log `mappend` newLog)
Because the accompanying value can now be any monoid value, we no longer
have to think of the tuple as a value and a log, but now we can think of
it as a value with an accompanying monoid value. For instance, we can
have a tuple that has an item name and an item price as the monoid
value. We just use the Sum
newtype to make sure that the prices get
added as we operate with the items. Here’s a function that adds drink to
some cowboy food:
import Data.Monoid
type Food = String
type Price = Sum Int
addDrink :: Food -> (Food,Price)
addDrink "beans" = ("milk", Sum 25)
addDrink "jerky" = ("whiskey", Sum 99)
addDrink _ = ("beer", Sum 30)
We use strings to represent foods and an Int
in a Sum
newtype
wrapper to
keep track of how many cents something costs. Just a reminder, doing
mappend
with Sum
results in the wrapped values getting added together:
ghci> Sum 3 `mappend` Sum 9
Sum {getSum = 12}
The addDrink
function is pretty simple. If we’re eating beans, it
returns "milk"
along with Sum 25
, so 25 cents wrapped in Sum
. If we’re
eating jerky we drink whiskey and if we’re eating anything else we drink
beer. Just normally applying this function to a food wouldn’t be
terribly interesting right now, but using applyLog
to feed a food that
comes with a price itself into this function is interesting:
ghci> ("beans", Sum 10) `applyLog` addDrink
("milk",Sum {getSum = 35})
ghci> ("jerky", Sum 25) `applyLog` addDrink
("whiskey",Sum {getSum = 124})
ghci> ("dogmeat", Sum 5) `applyLog` addDrink
("beer",Sum {getSum = 35})
Milk costs 25
cents, but if we eat it with beans that cost 10
cents,
we’ll end up paying 35
cents. Now it’s clear how the attached value
doesn’t always have to be a log, it can be any monoid value and how two
such values are combined into one depends on the monoid. When we were
doing logs, they got appended, but now, the numbers are being added up.
Because the value that addDrink
returns is a tuple of type (Food,Price)
,
we can feed that result to addDrink
again, so that it tells us what we
should drink along with our drink and how much that will cost us. Let’s
give it a shot:
ghci> ("dogmeat", Sum 5) `applyLog` addDrink `applyLog` addDrink
("beer",Sum {getSum = 65})
Adding a drink to some dog meat results in a beer and an additional 30
cents, so ("beer", Sum 35)
. And if we use applyLog
to feed that to
addDrink
, we get another beer and the result is ("beer", Sum 65)
.
Writer
typeNow that we’ve seen that a value with an attached monoid acts like a
monadic value, let’s examine the Monad
instance for types of such
values. The Control.Monad.Writer
module exports the Writer w a
type
along with its Monad
instance and some useful functions for dealing with
values of this type.
First, let’s examine the type itself. To attach a monoid to a value, we
just need to put them together in a tuple. The Writer w a
type is just a
newtype
wrapper for this. Its definition is very simple:
newtype Writer w a = Writer { runWriter :: (a, w) }
It’s wrapped in a newtype
so that it can be made an instance of Monad
and that its type is separate from a normal tuple. The a
type parameter
represents the type of the value and the w
type parameter the type of
the attached monoid value.
Its Monad
instance is defined like so:
instance (Monoid w) => Monad (Writer w) where
return x = Writer (x, mempty)
(Writer (x,v)) >>= f = let (Writer (y, v')) = f x in Writer (y, v `mappend` v')
First off, let’s examine >>=
. Its implementation is essentially the
same as applyLog
, only now that our tuple is wrapped in the
Writer newtype
, we have to unwrap it when pattern matching. We take the value x
and apply the function f
to it. This gives us a Writer w a
value and we
use a let
expression to pattern match on it. We present y
as the new
result and use mappend
to combine the old monoid value with the new one.
We pack that up with the result value in a tuple and then wrap that with
the Writer
constructor so that our result is a Writer
value instead of
just an unwrapped tuple.
So, what about return
? It has to take a value and put it in a default
minimal context that still presents that value as the result. So what
would such a context be for Writer
values? If we want the accompanying
monoid value to affect other monoid values as little as possible, it
makes sense to use mempty
. mempty
is used to present identity monoid
values, such as ""
and Sum 0
and empty bytestrings. Whenever we use
mappend
between mempty
and some other monoid value, the result is that
other monoid value. So if we use return
to make a Writer
value and then
use >>=
to feed that value to a function, the resulting monoid value
will be only what the function returns. Let’s use return
on the number 3
a bunch of times, only we’ll pair it with a different monoid every time:
ghci> runWriter (return 3 :: Writer String Int)
(3,"")
ghci> runWriter (return 3 :: Writer (Sum Int) Int)
(3,Sum {getSum = 0})
ghci> runWriter (return 3 :: Writer (Product Int) Int)
(3,Product {getProduct = 1})
Because Writer
doesn’t have a Show
instance, we had to use runWriter
to
convert our Writer
values to normal tuples that can be shown. For
String
, the monoid value is the empty string. With Sum
, it’s 0
, because
if we add 0 to something, that something stays the same. For Product
,
the identity is 1
.
The Writer
instance doesn’t feature an implementation for fail
, so if a
pattern match fails in do
notation, error
is called.
Now that we have a Monad
instance, we’re free to use do
notation for
Writer
values. It’s handy for when we have a several Writer
values and
we want to do stuff with them. Like with other monads, we can treat them
as normal values and the context gets taken for us. In this case, all
the monoid values that come attached get mappend
ed and so are reflected
in the final result. Here’s a simple example of using do
notation with
Writer
to multiply two numbers:
import Control.Monad.Writer
logNumber :: Int -> Writer [String] Int
logNumber x = Writer (x, ["Got number: " ++ show x])
multWithLog :: Writer [String] Int
multWithLog = do
a <- logNumber 3
b <- logNumber 5
return (a*b)
logNumber
takes a number and makes a Writer
value out of it. For the
monoid, we use a list of strings and we equip the number with a
singleton list that just says that we have that number. multWithLog
is a
Writer
value which multiplies 3
and 5
and makes sure that their attached
logs get included in the final log. We use return
to present a*b
as the
result. Because return
just takes something and puts it in a minimal
context, we can be sure that it won’t add anything to the log. Here’s
what we see if we run this:
ghci> runWriter multWithLog
(15,["Got number: 3","Got number: 5"])
Sometimes we just want some monoid value to be included at some
particular point. For this, the tell
function is useful. It’s part of
the MonadWriter
type class and in the case of Writer
it takes a monoid
value, like ["This is going on"]
and creates a Writer
value that
presents the dummy value ()
as its result but has our desired monoid
value attached. When we have a monadic value that has () as its result,
we don’t bind it to a variable. Here’s multWithLog
but with some extra
reporting included:
multWithLog :: Writer [String] Int
multWithLog = do
a <- logNumber 3
b <- logNumber 5
tell ["Gonna multiply these two"]
return (a*b)
It’s important that return (a*b)
is the last line, because the result
of the last line in a do
expression is the result of the whole do
expression. Had we put tell
as the last line, ()
would have been the
result of this do
expression. We’d lose the result of the
multiplication. However, the log would be the same. Here is this in
action:
ghci> runWriter multWithLog
(15,["Got number: 3","Got number: 5","Gonna multiply these two"])
Euclid’s algorithm is an algorithm that takes two numbers and computes
their greatest common divisor. That is, the biggest number that still
divides both of them. Haskell already features the gcd
function, which
does exactly this, but let’s implement our own and then equip it with
logging capabilities. Here’s the normal algorithm:
gcd' :: Int -> Int -> Int
gcd' a b
| b == 0 = a
| otherwise = gcd' b (a `mod` b)
The algorithm is very simple. First, it checks if the second number is
ghci> gcd' 8 3
1
It does. Very good! Now, we want to equip our result with a context, and
the context will be a monoid value that acts as a log. Like before,
we’ll use a list of strings as our monoid. So the type of our new gcd'
function should be:
gcd' :: Int -> Int -> Writer [String] Int
All that’s left now is to equip our function with log values. Here’s the code:
import Control.Monad.Writer
gcd' :: Int -> Int -> Writer [String] Int
gcd' a b
| b == 0 = do
tell ["Finished with " ++ show a]
return a
| otherwise = do
tell [show a ++ " mod " ++ show b ++ " = " ++ show (a `mod` b)]
gcd' b (a `mod` b)
This function takes two normal Int
values and returns a Writer [String] Int
,
that is, an Int
that has a log context. In the case where b
is 0
,
instead of just giving a
as the result, we use a do
expression to put
together a Writer
value as a result. First we use tell
to report that
we’re finished and then we use return
to present a
as the result of the
do
expression. Instead of this do
expression, we could have also written
this:
Writer (a, ["Finished with " ++ show a])
However, I think the do
expression is easier to read. Next, we have the
case when b
isn’t 0
. In this case, we log that we’re using mod
to figure
out the remainder of dividing a
and b
. Then, the second line of the do
expression just recursively calls gcd'
. Remember, gcd'
now ultimately
returns a Writer
value, so it’s perfectly valid that
gcd' b (a `mod` b)
is a line in a do
expression.
While it may be kind of useful to trace the execution of this new gcd'
by hand to see how the logs get appended, I think it’s more insightful
to just look at the big picture and view these as values with a context
and from that gain insight as to what the final result will be.
Let’s try our new gcd'
out. Its result is a Writer [String] Int
value
and if we unwrap that from its newtype
, we get a tuple. The first part
of the tuple is the result. Let’s see if it’s okay:
ghci> fst $ runWriter (gcd' 8 3)
1
Good! Now what about the log? Because the log is a list of strings,
let’s use mapM_ putStrLn
to print those strings to the screen:
ghci> mapM_ putStrLn $ snd $ runWriter (gcd' 8 3)
8 mod 3 = 2
3 mod 2 = 1
2 mod 1 = 0
Finished with 1
I think it’s awesome how we were able to change our ordinary algorithm
to one that reports what it does as it goes along just by changing
normal values to monadic values and letting the implementation of >>=
for Writer
take care of the logs for us. We can add a logging mechanism
to pretty much any function. We just replace normal values with Writer
values where we want and change normal function application to >>=
(or
do
expressions if it increases readability).
When using the Writer
monad, you have to be careful which monoid to use,
because using lists can sometimes turn out to be very slow. That’s
because lists use ++
for mappend
and using ++
to add something to the
end of a list is slow if that list is really long.
In our gcd'
function, the logging is fast because the list appending
ends up looking like this:
a ++ (b ++ (c ++ (d ++ (e ++ f))))
Lists are a data structure that’s constructed from left to right, and
this is efficient because we first fully construct the left part of a
list and only then add a longer list on the right. But if we’re not
careful, using the Writer
monad can produce list appending that looks
like this:
((((a ++ b) ++ c) ++ d) ++ e) ++ f
This associates to the left instead of to the right. This is inefficient because every time it wants to add the right part to the left part, it has to construct the left part all the way from the beginning!
The following function works like gcd'
, only it logs stuff in reverse.
First it produces the log for the rest of the procedure and then adds
the current step to the end of the log.
import Control.Monad.Writer
gcdReverse :: Int -> Int -> Writer [String] Int
gcdReverse a b
| b == 0 = do
tell ["Finished with " ++ show a]
return a
| otherwise = do
result <- gcdReverse b (a `mod` b)
tell [show a ++ " mod " ++ show b ++ " = " ++ show (a `mod` b)]
return result
It does the recursion first, and binds its result
value to result. Then
it adds the current step to the log, but the current step goes at the
end of the log that was produced by the recursion. Finally, it presents
the result of the recursion as the final result. Here it is in action:
ghci> mapM_ putStrLn $ snd $ runWriter (gcdReverse 8 3)
Finished with 1
2 mod 1 = 0
3 mod 2 = 1
8 mod 3 = 2
It’s inefficient because it ends up associating the use of ++
to the
left instead of to the right.
Because lists can sometimes be inefficient when repeatedly appended in
this manner, it’s best to use a data structure that always supports
efficient appending. One such data structure is the difference list. A
difference list is similar to a list, only instead of being a normal
list, it’s a function that takes a list and prepends another list to it.
The difference list equivalent of a list like [1,2,3]
would be the
function \xs -> [1,2,3] ++ xs
. A normal empty list is []
, whereas an
empty difference list is the function \xs -> [] ++ xs
.
The cool thing about difference lists is that they support efficient
appending. When we append two normal lists with ++
, it has to walk all
the way to the end of the list on the left of ++
and then stick the
other one there. But what if we take the difference list approach and
represent our lists as functions? Well then, appending two difference
lists can be done like so:
f `append` g = \xs -> f (g xs)
Remember, f
and g
are functions that take lists and prepend something to
them. So, for instance, if f
is the function ("dog"++)
(just another way
of writing \xs -> "dog" ++ xs
) and g
the function ("meat"++)
, then
f `append` g
makes a new function that’s equivalent to the following:
\xs -> "dog" ++ ("meat" ++ xs)
We’ve appended two difference lists just by making a new function that first applies one difference list some list and then the other.
Let’s make a newtype
wrapper for difference lists so that we can easily
give them monoid instances:
newtype DiffList a = DiffList { getDiffList :: [a] -> [a] }
The type that we wrap is [a] -> [a]
because a difference list is just a
function that takes a list and returns another. Converting normal lists
to difference lists and vice versa is easy:
toDiffList :: [a] -> DiffList a
toDiffList xs = DiffList (xs++)
fromDiffList :: DiffList a -> [a]
fromDiffList (DiffList f) = f []
To make a normal list into a difference list we just do what we did before and make it a function that prepends it to another list. Because a difference list is a function that prepends something to another list, if we just want that something, we apply the function to an empty list!
Here’s the Monoid
instance:
instance Monoid (DiffList a) where
mempty = DiffList (\xs -> [] ++ xs)
(DiffList f) `mappend` (DiffList g) = DiffList (\xs -> f (g xs))
Notice how for lists, mempty
is just the id
function and mappend
is
actually just function composition. Let’s see if this works:
ghci> fromDiffList (toDiffList [1,2,3,4] `mappend` toDiffList [1,2,3])
[1,2,3,4,1,2,3]
Tip top! Now we can increase the efficiency of our gcdReverse
function
by making it use difference lists instead of normal lists:
import Control.Monad.Writer
gcd' :: Int -> Int -> Writer (DiffList String) Int
gcd' a b
| b == 0 = do
tell (toDiffList ["Finished with " ++ show a])
return a
| otherwise = do
result <- gcd' b (a `mod` b)
tell (toDiffList [show a ++ " mod " ++ show b ++ " = " ++ show (a `mod` b)])
return result
We only had to change the type of the monoid from [String]
to
DiffList String
and then when using tell
, convert our normal lists into
difference lists with toDiffList
. Let’s see if the log gets assembled
properly:
ghci> mapM_ putStrLn . fromDiffList . snd . runWriter $ gcdReverse 110 34
Finished with 2
8 mod 2 = 0
34 mod 8 = 2
110 mod 34 = 8
We do gcdReverse 110 34
, then use runWriter
to unwrap it from the
newtype
, then apply snd
to that to just get the log, then apply
fromDiffList
to convert it to a normal list and then finally print its
entries to the screen.
To get a feel for just how much difference lists may improve your
performance, consider this function that just counts down from some
number to zero, but produces its log in reverse, like gcdReverse
, so
that the numbers in the log will actually be counted up:
finalCountDown :: Int -> Writer (DiffList String) ()
finalCountDown 0 = do
tell (toDiffList ["0"])
finalCountDown x = do
finalCountDown (x-1)
tell (toDiffList [show x])
If we give it 0
, it just logs it. For any other number, it first counts
down its predecessor to 0
and then appends that number to the log. So if
we apply finalCountDown
to 100
, the string "100"
will come last in the
log.
Anyway, if you load this function in GHCi and apply it to a big number,
like 500000
, you’ll see that it quickly starts counting from 0
onwards:
ghci> mapM_ putStrLn . fromDiffList . snd . runWriter $ finalCountDown 500000
0
1
2
...
However, if we change it to use normal lists instead of difference lists, like so:
finalCountDown :: Int -> Writer [String] ()
finalCountDown 0 = do
tell ["0"]
finalCountDown x = do
finalCountDown (x-1)
tell [show x]
And then tell GHCi to start counting:
ghci> mapM_ putStrLn . snd . runWriter $ finalCountDown 500000
We’ll see that the counting is really slow.
Of course, this is not the proper and scientific way to test how fast our programs are, but we were able to see that in this case, using difference lists starts producing results right away whereas normal lists take forever.
Oh, by the way, the song Final Countdown by Europe is now stuck in your head. Enjoy!
In the chapter about
applicatives, we saw that
the function type, (->) r
is an instance of Functor
. Mapping a function
f
over a function g
will make a function that takes the same thing as g
,
applies g
to it and then applies f
to that result. So basically, we’re
making a new function that’s like g
, only before returning its result, f
gets applied to that result as well. For instance:
ghci> let f = (*5)
ghci> let g = (+3)
ghci> (fmap f g) 8
55
We’ve also seen that functions are applicative functors. They allow us to operate on the eventual results of functions as if we already had their results. Here’s an example:
ghci> let f = (+) <$> (*2) <*> (+10)
ghci> f 3
19
The expression (+) <$> (*2) <*> (+10)
makes a function that takes
a number, gives that number to (*2)
and (+10)
and then adds together
the results. For instance, if we apply this function to 3
, it applies
both (*2)
and (+10)
to 3
, giving 6
and 13
. Then, it calls (+)
with 6
and 13
and the result is 19
.
Not only is the function type (->) r
a functor and an applicative
functor, but it’s also a monad. Just like other monadic values that
we’ve met so far, a function can also be considered a value with a
context. The context for functions is that that value is not present yet
and that we have to apply that function to something in order to get its
result value.
Because we’re already acquainted with how functions work as functors and
applicative functors, let’s dive right in and see what their Monad
instance looks like. It’s located in Control.Monad.Instances
and it goes
a little something like this:
instance Monad ((->) r) where
return x = \_ -> x
h >>= f = \w -> f (h w) w
We’ve already seen how pure
is implemented for functions, and return
is
pretty much the same thing as pure
. It takes a value and puts it in a
minimal context that always has that value as its result. And the only
way to make a function that always has a certain value as its result is
to make it completely ignore its parameter.
The implementation for >>=
seems a bit cryptic, but it’s really not
all that. When we use >>=
to feed a monadic value to a function, the
result is always a monadic value. So in this case, when we feed a
function to another function, the result is a function as well. That’s
why the result starts off as a lambda. All of the implementations of
>>=
so far always somehow isolated the result from the monadic value
and then applied the function f
to that result. The same thing happens
here. To get the result from a function, we have to apply it to
something, which is why we do (h w)
here to get the result from the
function and then we apply f
to that. f
returns a monadic value, which
is a function in our case, so we apply it to w
as well.
If don’t get how >>=
works at this point, don’t worry, because with
examples we’ll see how this is a really simple monad. Here’s a do
expression that utilizes this monad:
import Control.Monad.Instances
addStuff :: Int -> Int
addStuff = do
a <- (*2)
b <- (+10)
return (a+b)
This is the same thing as the applicative expression that we wrote
earlier, only now it relies on functions being monads. A do
expression
always results in a monadic value and this one is no different. The
result of this monadic value is a function. What happens here is that it
takes a number and then (*2)
gets applied to that number and the result
becomes a
. (+10)
is applied to the same number that (*2)
got applied to
and the result becomes b
. return
, like in other monads, doesn’t have any
other effect but to make a monadic value that presents some result. This
presents a+b
as the result of this function. If we test it out, we get
the same result as before:
ghci> addStuff 3
19
Both (*2)
and (+10)
get applied to the number 3
in this case.
return (a+b)
does as well, but it ignores it and always presents a+b
as the
result. For this reason, the function monad is also called the reader
monad. All the functions read from a common source. To illustrate this
even better, we can rewrite addStuff
like so:
addStuff :: Int -> Int
addStuff x = let
a = (*2) x
b = (+10) x
in a+b
We see that the reader monad allows us to treat functions as values with
a context. We can act as if we already know what the functions will
return. It does this by gluing functions together into one function and
then giving that function’s parameter to all of the functions that it
was glued from. So if we have a lot of functions that are all just
missing one parameter and they’d eventually be applied to the same
thing, we can use the reader monad to sort of extract their future
results and the >>=
implementation will make sure that it all works
out.
Haskell is a pure language and because of that, our programs are made of functions that can’t change any global state or variables, they can only do some computations and return them results. This restriction actually makes it easier to think about our programs, as it frees us from worrying what every variable’s value is at some point in time. However, some problems are inherently stateful in that they rely on some state that changes over time. While such problems aren’t a problem for Haskell, they can be a bit tedious to model sometimes. That’s why Haskell features a thing called the state monad, which makes dealing with stateful problems a breeze while still keeping everything nice and pure.
When we were dealing with random numbers,
we dealt with functions that took a random generator as a parameter and
returned a random number and a new random generator. If we wanted to
generate several random numbers, we always had to use the random
generator that a previous function returned along with its result. When
making a function that takes a StdGen
and tosses a coin three times
based on that generator, we had to do this:
threeCoins :: StdGen -> (Bool, Bool, Bool)
threeCoins gen =
let (firstCoin, newGen) = random gen
(secondCoin, newGen') = random newGen
(thirdCoin, newGen'') = random newGen'
in (firstCoin, secondCoin, thirdCoin)
It took a generator gen
and then random gen
returned a Bool
value along
with a new generator. To throw the second coin, we used the new
generator, and so on. In most other languages, we wouldn’t have to
return a new generator along with a random number. We could just modify
the existing one! But since Haskell is pure, we can’t do that, so we had
to take some state, make a result from it and a new state and then use
that new state to generate new results.
You’d think that to avoid manually dealing with stateful computations in this way, we’d have to give up the purity of Haskell. Well, we don’t have to, since there exist a special little monad called the state monad which handles all this state business for us and without giving up any of the purity that makes Haskell programming so cool.
So, to help us understand this concept of stateful computations better, let’s go ahead and give them a type. We’ll say that a stateful computation is a function that takes some state and returns a value along with some new state. That function would have the following type:
s -> (a,s)
s
is the type of the state and a
the result of the stateful
computations.
Assignment in most other languages could be thought of as a stateful computation. For instance, when we do
x = 5
in an imperative language, it will usually assign the value5
to the variablex
and it will also have the value5
as an expression. If you look at that functionally, you could look at it as a function that takes a state (that is, all the variables that have been assigned previously) and returns a result (in this case5
) and a new state, which would be all the previous variable mappings plus the newly assigned variable.
This stateful computation, a function that takes a state and returns a result and a new state, can be thought of as a value with a context as well. The actual value is the result, whereas the context is that we have to provide some initial state to actually get that result and that apart from getting a result we also get a new state.
Say we want to model operating a stack. You have a stack of things one on top of another and you can either push stuff on top of that stack or you can take stuff off the top of the stack. When you’re putting an item on top of the stack we say that you’re pushing it to the stack and when you’re taking stuff off the top we say that you’re popping it. If you want to something that’s at the bottom of the stack, you have to pop everything that’s above it.
We’ll use a list to represent our stack and the head of the list will be
the top of the stack. To help us with our task, we’ll make two
functions: pop
and push
. pop
will take a stack, pop one item and return
that item as the result and also return a new stack, without that item.
push
will take an item and a stack and then push that item onto the
stack. It will return ()
as its result, along with a new stack. Here
goes:
type Stack = [Int]
pop :: Stack -> (Int,Stack)
pop (x:xs) = (x,xs)
push :: Int -> Stack -> ((),Stack)
push a xs = ((),a:xs)
We used ()
as the result when pushing to the stack because pushing an
item onto the stack doesn’t have any important result value, its main
job is to change the stack. Notice how we just apply the first parameter
of push
, we get a stateful computation. pop
is already a stateful
computation because of its type.
Let’s write a small piece of code to simulate a stack using these functions. We’ll take a stack, push 3 to it and then pop two items, just for kicks. Here it is:
stackManip :: Stack -> (Int, Stack)
stackManip stack = let
((),newStack1) = push 3 stack
(a ,newStack2) = pop newStack1
in pop newStack2
We take a stack
and then we do push 3 stack
, which results in a tuple.
The first part of the tuple is a ()
and the second is a new stack and we
call it newStack1
. Then, we pop a number from newStack1
, which results
in a number a
(which is the 3
) that we pushed and a new stack which we
call newStack2
. Then, we pop a number off newStack2
and we get a number
that’s b
and a newStack3
. We return a tuple with that number and that
stack. Let’s try it out:
ghci> stackManip [5,8,2,1]
(5,[8,2,1])
Cool, the result is 5
and the new stack is [8,2,1]
. Notice how
stackManip
is itself a stateful computation. We’ve taken a bunch of
stateful computations and we’ve sort of glued them together. Hmm, sounds
familiar.
The above code for stackManip
is kind of tedious since we’re manually
giving the state to every stateful computation and storing it and then
giving it to the next one. Wouldn’t it be cooler if, instead of giving
the stack manually to each function, we could write something like this:
stackManip = do
push 3
a <- pop
pop
Well, using the state monad will allow us to do exactly this. With it, we will be able to take stateful computations like these and use them without having to manage the state manually.
The Control.Monad.State
module provides a newtype
that wraps stateful
computations. Here’s its definition:
newtype State s a = State { runState :: s -> (a,s) }
A State s a
is a stateful computation that manipulates a state of type s
and has a result of type a
.
Now that we’ve seen what stateful computations are about and how they
can even be thought of as values with contexts, let’s check out their
Monad
instance:
instance Monad (State s) where
return x = State $ \s -> (x,s)
(State h) >>= f = State $ \s -> let (a, newState) = h s
(State g) = f a
in g newState
Let’s take a gander at return
first. Our aim with return
is to take a
value and make a stateful computation that always has that value as its
result. That’s why we just make a lambda \s -> (x,s)
. We always
present x
as the result of the stateful computation and the state is
kept unchanged, because return
has to put a value in a minimal context.
So return
will make a stateful computation that presents a certain value
as the result and keeps the state unchanged.
What about >>=
? Well, the result of feeding a stateful computation to
a function with >>=
has to be a stateful computation, right? So we
start off with the State
newtype
wrapper and then we type out a lambda.
This lambda will be our new stateful computation. But what goes on in
it? Well, we somehow have to extract the result value from the first
stateful computation. Because we’re in a stateful computation right now,
we can give the stateful computation h
our current state s
, which
results in a pair of result and a new state: (a, newState)
. Every time
so far when we were implementing >>=
, once we had the extracted the
result from the monadic value, we applied the function f
to it to get
the new monadic value. In Writer
, after doing that and getting the new
monadic value, we still had to make sure that the context was taken care
of by mappend
ing the old monoid value with the new one. Here, we do f a
and we get a new stateful computation g
. Now that we have a new stateful
computation and a new state (goes by the name of newState
) we just apply
that stateful computation g
to the newState
. The result is a tuple of
final result and final state!
So with >>=
, we kind of glue two stateful computations together, only
the second one is hidden inside a function that takes the previous one’s
result. Because pop
and push
are already stateful computations, it’s
easy to wrap them into a State
wrapper. Watch:
import Control.Monad.State
pop :: State Stack Int
pop = State $ \(x:xs) -> (x,xs)
push :: Int -> State Stack ()
push a = State $ \xs -> ((),a:xs)
pop
is already a stateful computation and push
takes an Int
and returns
a stateful computation. Now we can rewrite our previous example of
pushing 3
onto the stack and then popping two numbers off like this:
import Control.Monad.State
stackManip :: State Stack Int
stackManip = do
push 3
a <- pop
pop
See how we’ve glued a push and two pops into one stateful computation?
When we unwrap it from its newtype
wrapper we get a function to which we
can provide some initial state:
ghci> runState stackManip [5,8,2,1]
(5,[8,2,1])
We didn’t have to bind the second pop
to a
because we didn’t use that a
at all. So we could have written it like this:
stackManip :: State Stack Int
stackManip = do
push 3
pop
pop
Pretty cool. But what if we want to do this: pop one number off the
stack and then if that number is 5
we just put it back onto the stack
and stop but if it isn’t 5
, we push 3
and 8
back on? Well, here’s the
code:
stackStuff :: State Stack ()
stackStuff = do
a <- pop
if a == 5
then push 5
else do
push 3
push 8
This is quite straightforward. Let’s run it with an initial stack.
ghci> runState stackStuff [9,0,2,1,0]
((),[8,3,0,2,1,0])
Remember, do
expressions result in monadic values and with the State
monad, a single do
expression is also a stateful function. Because
stackManip
and stackStuff
are ordinary stateful computations, we can
glue them together to produce further stateful computations.
moreStack :: State Stack ()
moreStack = do
a <- stackManip
if a == 100
then stackStuff
else return ()
If the result of stackManip
on the current stack is 100
, we run
stackStuff
, otherwise we do nothing. return ()
just keeps the state as
it is and does nothing.
The Control.Monad.State
module provides a type class that’s called
MonadState
and it features two pretty useful functions, namely get
and
put
. For State
, the get
function is implemented like this:
get = State $ \s -> (s,s)
So it just takes the current state and presents it as the result. The
put
function takes some state and makes a stateful function that
replaces the current state with it:
put newState = State $ \s -> ((),newState)
So with these, we can see what the current stack is or we can replace it with a whole other stack. Like so:
stackyStack :: State Stack ()
stackyStack = do
stackNow <- get
if stackNow == [1,2,3]
then put [8,3,1]
else put [9,2,1]
It’s worth examining what the type of >>=
would be if it only worked
for State
values:
(>>=) :: State s a -> (a -> State s b) -> State s b
See how the type of the state s
stays the same but the type of the
result can change from a
to b
? This means that we can glue together
several stateful computations whose results are of different types but
the type of the state has to stay the same. Now why is that? Well, for
instance, for Maybe
, >>=
has this type:
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
It makes sense that the monad itself, Maybe
, doesn’t change. It wouldn’t
make sense to use >>=
between two different monads. Well, for the
state monad, the monad is actually State s
, so if that s
was different,
we’d be using >>=
between two different monads.
At the beginning of this section, we saw how generating numbers can sometimes be awkward because every random function takes a generator and returns a random number along with a new generator, which must then be used instead of the old one if we want to generate another random number. The state monad makes dealing with this a lot easier.
The random
function from System.Random
has the following type:
random :: (RandomGen g, Random a) => g -> (a, g)
Meaning it takes a random generator and produces a random number along
with a new generator. We can see that it’s a stateful computation, so we
can wrap it in the State
newtype
constructor and then use it as a
monadic value so that passing of the state gets handled for us:
import System.Random
import Control.Monad.State
randomSt :: (RandomGen g, Random a) => State g a
randomSt = State random
So now if we want to throw three coins (True
is tails, False
is heads)
we just do the following:
import System.Random
import Control.Monad.State
threeCoins :: State StdGen (Bool,Bool,Bool)
threeCoins = do
a <- randomSt
b <- randomSt
c <- randomSt
return (a,b,c)
threeCoins
is now a stateful computations and after taking an initial
random generator, it passes it to the first randomSt
, which produces a
number and a new generator, which gets passed to the next one and so on.
We use return (a,b,c)
to present (a,b,c)
as the result without changing
the most recent generator. Let’s give this a go:
ghci> runState threeCoins (mkStdGen 33)
((True,False,True),680029187 2103410263)
Nice. Doing these sort of things that require some state to be kept in between steps just became much less of a hassle!
We know by now that Maybe
is used to add a context of possible failure
to values. A value can be a Just something
or a Nothing
. However useful
it may be, when we have a Nothing
, all we know is that there was some
sort of failure, but there’s no way to cram some more info in there
telling us what kind of failure it was or why it failed.
The Either e a
type on the other hand, allows us to incorporate a
context of possible failure to our values while also being able to
attach values to the failure, so that they can describe what went wrong
or provide some other useful info regarding the failure. An Either e a
value can either be a Right
value, signifying the right answer and a
success, or it can be a Left
value, signifying failure. For instance:
ghci> :t Right 4
Right 4 :: (Num t) => Either a t
ghci> :t Left "out of cheese error"
Left "out of cheese error" :: Either [Char] b
This is pretty much just an enhanced Maybe
, so it makes sense for it to
be a monad, because it can also be viewed as a value with an added
context of possible failure, only now there’s a value attached when
there’s an error as well.
Its Monad
instance is similar to that of Maybe
and it can be found in
Control.Monad.Error
:
instance (Error e) => Monad (Either e) where
return x = Right x
Right x >>= f = f x
Left err >>= f = Left err
fail msg = Left (strMsg msg)
return
, as always, takes a value and puts it in a default minimal
context. It wraps our value in the Right
constructor because we’re using
Right
to represent a successful computation where a result is present.
This is a lot like return
for Maybe
.
The >>=
examines two possible cases: a Left
and a Right
. In the case
of a Right
, the function f
is applied to the value inside it, similar to
how in the case of a Just
, the function is just applied to its contents.
In the case of an error, the Left
value is kept, along with its
contents, which describe the failure.
The Monad
instance for Either e
makes an additional requirement, and
that is that the type of the value contained in a Left
, the one that’s
indexed by the e
type parameter, has to be an instance of the Error
type
class. The Error
type class is for types whose values can act like error
messages. It defines the strMsg
function, which takes an error in the
form of a string and returns such a value. A good example of an Error
instance is, well, the String
type! In the case of String
, the strMsg
function just returns the string that it got:
ghci> :t strMsg
strMsg :: (Error a) => String -> a
ghci> strMsg "boom!" :: String
"boom!"
But since we usually use String
to describe the error when using Either
,
we don’t have to worry about this too much. When a pattern match fails
in do
notation, a Left
value is used to signify this failure.
Anyway, here are a few examples of usage:
ghci> Left "boom" >>= \x -> return (x+1)
Left "boom"
ghci> Right 100 >>= \x -> Left "no way!"
Left "no way!"
When we use >>=
to feed a Left
value to a function, the function is
ignored and an identical Left
value is returned. When we feed a Right
value to a function, the function gets applied to what’s on the inside,
but in this case that function produced a Left
value anyway!
When we try to feed a Right
value to a function that also succeeds,
we’re tripped up by a peculiar type error! Hmmm.
ghci> Right 3 >>= \x -> return (x + 100)
<interactive>:1:0:
Ambiguous type variable `a' in the constraints:
`Error a' arising from a use of `it' at <interactive>:1:0-33
`Show a' arising from a use of `print' at <interactive>:1:0-33
Probable fix: add a type signature that fixes these type variable(s)
Haskell says that it doesn’t know which type to choose for the e
part of
our Either e a
typed value, even though we’re just printing the Right
part. This is due to the Error e
constraint on the Monad
instance. So if
you get type errors like this one when using Either
as a monad, just add
an explicit type signature:
ghci> Right 3 >>= \x -> return (x + 100) :: Either String Int
Right 103
Alright, now it works!
Other than this little hangup, using this monad is very similar to using
Maybe
as a monad. In the previous chapter, we used the monadic aspects
of Maybe
to simulate birds landing on the balancing pole of a tightrope
walker. As an exercise, you can rewrite that with the error monad so
that when the tightrope walker slips and falls, we remember how many
birds were on each side of the pole when he fell.
In this section, we’re going to explore a few functions that either
operate on monadic values or return monadic values as their results (or
both!). Such functions are usually referred to as monadic functions.
While some of them will be brand new, others will be monadic
counterparts of functions that we already know, like filter
and foldl
.
Let’s see what they are then!
When we started our journey to the top of Monad Mountain, we first looked at functors, which are for things that can be mapped over. Then, we learned about improved functors called applicative functors, which allowed us to apply normal functions between several applicative values as well as to take a normal value and put it in some default context. Finally, we introduced monads as improved applicative functors, which added the ability for these values with context to somehow be fed into normal functions.
So every monad is an applicative functor and every applicative functor
is a functor. The Applicative
type class has a class constraint such
that our type has to be an instance of Functor
before we can make it an
instance of Applicative
. But even though Monad
should have the same
constraint for Applicative
, as every monad is an applicative functor, it
doesn’t, because the Monad
type class was introduced to Haskell way
before Applicative
.
But even though every monad is a functor, we don’t have to rely on it
having a Functor
instance because of the liftM
function. liftM
takes a
function and a monadic value and maps it over the monadic value. So it’s
pretty much the same thing as fmap
! This is liftM
’s type:
liftM :: (Monad m) => (a -> b) -> m a -> m b
And this is the type of fmap
:
fmap :: (Functor f) => (a -> b) -> f a -> f b
If the Functor
and Monad
instances for a type obey the functor and monad
laws, these two amount to the same thing (and all the monads that we’ve
met so far obey both). This is kind of like pure
and return
do the same
thing, only one has an Applicative
class constraint whereas the other
has a Monad
one. Let’s try liftM
out:
ghci> liftM (*3) (Just 8)
Just 24
ghci> fmap (*3) (Just 8)
Just 24
ghci> runWriter $ liftM not $ Writer (True, "chickpeas")
(False,"chickpeas")
ghci> runWriter $ fmap not $ Writer (True, "chickpeas")
(False,"chickpeas")
ghci> runState (liftM (+100) pop) [1,2,3,4]
(101,[2,3,4])
ghci> runState (fmap (+100) pop) [1,2,3,4]
(101,[2,3,4])
We already know quite well how fmap
works with Maybe
values. And liftM
does the same thing. For Writer
values, the function is mapped over the
first component of the tuple, which is the result. Doing fmap
or liftM
over a stateful computation results in another stateful computation,
only its eventual result is modified by the supplied function. Had we
not mapped (+100)
over pop
in this case before running it, it would have
returned (1,[2,3,4])
.
This is how liftM
is implemented:
liftM :: (Monad m) => (a -> b) -> m a -> m b
liftM f m = m >>= (\x -> return (f x))
Or with do
notation:
liftM :: (Monad m) => (a -> b) -> m a -> m b
liftM f m = do
x <- m
return (f x)
We feed the monadic value m
into the function and then we apply the
function f
to its result before putting it back into a default context.
Because of the monad laws, this is guaranteed not to change the context,
only the result that the monadic value presents. We see that liftM
is
implemented without referencing the Functor
type class at all. This
means that we can implement fmap
(or liftM
, whatever you want to call
it) just by using the goodies that monads offer us. Because of this, we
can conclude that monads are stronger than just regular old functors.
The Applicative
type class allows us to apply functions between values
with contexts as if they were normal values. Like this:
ghci> (+) <$> Just 3 <*> Just 5
Just 8
ghci> (+) <$> Just 3 <*> Nothing
Nothing
Using this applicative style makes things pretty easy. <$>
is just
fmap
and <*>
is a function from the Applicative
type class that has
the following type:
(<*>) :: (Applicative f) => f (a -> b) -> f a -> f b
So it’s kind of like fmap
, only the function itself is in a context. We
have to somehow extract it from the context and map it over the f a
value and then assemble the context back together. Because all functions
are curried in Haskell by default, we can use the combination of <$>
and <*>
to apply functions that take several parameters between
applicative values.
Anyway, it turns out that just like fmap
, <*>
can also be implemented
by using only what the Monad
type class give us. The ap
function is
basically <*>
, only it has a Monad
constraint instead of an
Applicative
one. Here’s its definition:
ap :: (Monad m) => m (a -> b) -> m a -> m b
ap mf m = do
f <- mf
x <- m
return (f x)
mf
is a monadic value whose result is a function. Because the function
is in a context as well as the value, we get the function from the
context and call it f
, then get the value and call that x
and then
finally apply the function to the value and present that as a result.
Here’s a quick demonstration:
ghci> Just (+3) <*> Just 4
Just 7
ghci> Just (+3) `ap` Just 4
Just 7
ghci> [(+1),(+2),(+3)] <*> [10,11]
[11,12,12,13,13,14]
ghci> [(+1),(+2),(+3)] `ap` [10,11]
[11,12,12,13,13,14]
Now we see that monads are stronger than applicatives as well, because
we can use the functions from Monad
to implement the ones for
Applicative
. In fact, many times when a type is found to be a monad,
people first write up a Monad
instance and then make an Applicative
instance by just saying that pure
is return
and <*>
is ap
. Similarly,
if you already have a Monad
instance for something, you can give it a
Functor
instance just saying that fmap
is liftM
.
The liftA2
function is a convenience function for applying a function
between two applicative values. It’s defined simply like so:
liftA2 :: (Applicative f) => (a -> b -> c) -> f a -> f b -> f c
liftA2 f x y = f <$> x <*> y
The liftM2
function does the same thing, only it has a Monad
constraint.
There also exist liftM3
and liftM4
and liftM5
.
We saw how monads are stronger than applicatives and functors and how
even though all monads are functors and applicative functors, they don’t
necessarily have Functor
and Applicative
instances, so we examined the
monadic equivalents of the functions that functors and applicative
functors use.
Here’s some food for thought: if the result of one monadic value is
another monadic value i.e. if one monadic value is nested inside the
other, can you flatten them to just a single normal monadic value? Like,
if we have Just (Just 9)
, can we make that into Just 9
? It turns out
that any nested monadic value can be flattened and that this is actually
a property unique to monads. For this, the join
function exists. Its
type is this:
join :: (Monad m) => m (m a) -> m a
So it takes a monadic value within a monadic value and gives us just a
monadic value, so it sort of flattens it. Here it is with some Maybe
values:
ghci> join (Just (Just 9))
Just 9
ghci> join (Just Nothing)
Nothing
ghci> join Nothing
Nothing
The first line has a successful computation as a result of a successful
computation, so they’re both just joined into one big successful
computation. The second line features a Nothing
as a result of a Just
value. Whenever we were dealing with Maybe
values before and we wanted
to combine several of them into one, be it with <*>
or >>=
, they
all had to be Just
values for the result to be a Just
value. If there
was any failure along the way, the result was a failure and the same
thing happens here. In the third line, we try to flatten what is from
the onset a failure, so the result is a failure as well.
Flattening lists is pretty intuitive:
ghci> join [[1,2,3],[4,5,6]]
[1,2,3,4,5,6]
As you can see, for lists, join
is just concat
. To flatten a Writer
value whose result is a Writer
value itself, we have to mappend
the
monoid value.
ghci> runWriter $ join (Writer (Writer (1,"aaa"),"bbb"))
(1,"bbbaaa")
The outer monoid value "bbb"
comes first and then to it "aaa"
is
appended. Intuitively speaking, when you want to examine what the result
of a Writer
value is, you have to write its monoid value to the log
first and only then can you examine what it has inside.
Flattening Either
values is very similar to flattening Maybe
values:
ghci> join (Right (Right 9)) :: Either String Int
Right 9
ghci> join (Right (Left "error")) :: Either String Int
Left "error"
ghci> join (Left "error") :: Either String Int
Left "error"
If we apply join
to a stateful computation whose result is a stateful
computation, the result is a stateful computation that first runs the
outer stateful computation and then the resulting one. Watch:
ghci> runState (join (State $ \s -> (push 10,1:2:s))) [0,0,0]
((),[10,1,2,0,0,0])
The lambda here takes a state and puts 2
and 1
onto the stack and
presents push 10
as its result. So when this whole thing is flattened
with join
and then run, it first puts 2
and 1
onto the stack and then
push 10
gets carried out, pushing a 10
on to the top.
The implementation for join
is as follows:
join :: (Monad m) => m (m a) -> m a
join mm = do
m <- mm
m
Because the result of mm
is a monadic value, we get that result and then
just put it on a line of its own because it’s a monadic value. The trick
here is that when we do m <- mm
, the context of the monad in which we
are in gets taken care of. That’s why, for instance, Maybe
values result
in Just
values only if the outer and inner values are both Just
values.
Here’s what this would look like if the mm
value was set in advance to
Just (Just 8)
:
joinedMaybes :: Maybe Int
joinedMaybes = do
m <- Just (Just 8)
m
Perhaps the most interesting thing about join
is that for every monad,
feeding a monadic value to a function with >>=
is the same thing as
just mapping that function over the value and then using join
to flatten
the resulting nested monadic value! In other words, m >>= f
is always
the same thing as join (fmap f m)
! It makes sense when you think about
it. With >>=
, we’re always thinking about how to feed a monadic value
to a function that takes a normal value but returns a monadic value. If
we just map that function over the monadic value, we have a monadic
value inside a monadic value. For instance, say we have Just 9
and the
function \x -> Just (x+1)
. If we map this function over Just 9
, we’re
left with Just (Just 10)
.
The fact that m >>= f
always equals join (fmap f m)
is very useful if
we’re making our own Monad
instance for some type because it’s often
easier to figure out how we would flatten a nested monadic value than
figuring out how to implement >>=
.
The filter
function is pretty much the bread of Haskell programming (map
being the butter). It takes a predicate and a list to filter out and
then returns a new list where only the elements that satisfy the
predicate are kept. Its type is this:
filter :: (a -> Bool) -> [a] -> [a]
The predicate takes an element of the list and returns a Bool
value.
Now, what if the Bool
value that it returned was actually a monadic
value? Whoa! That is, what if it came with a context? Could that work?
For instance, what if every True
or a False
value that the predicate
produced also had an accompanying monoid value, like
["Accepted the number 5"]
or ["3 is too small"]
? That sounds like it could work. If
that were the case, we’d expect the resulting list to also come with a
log of all the log values that were produced along the way. So if the
Bool
that the predicate returned came with a context, we’d expect the
final resulting list to have some context attached as well, otherwise
the context that each Bool
came with would be lost.
The filterM
function from Control.Monad
does just what we want! Its type
is this:
filterM :: (Monad m) => (a -> m Bool) -> [a] -> m [a]
The predicate returns a monadic value whose result is a Bool
, but
because it’s a monadic value, its context can be anything from a
possible failure to non-determinism and more! To ensure that the context
is reflected in the final result, the result is also a monadic value.
Let’s take a list and only keep those values that are smaller than 4. To
start, we’ll just use the regular filter
function:
ghci> filter (\x -> x < 4) [9,1,5,2,10,3]
[1,2,3]
That’s pretty easy. Now, let’s make a predicate that, aside from
presenting a True
or False
result, also provides a log of what it did.
Of course, we’ll be using the Writer
monad for this:
keepSmall :: Int -> Writer [String] Bool
keepSmall x
| x < 4 = do
tell ["Keeping " ++ show x]
return True
| otherwise = do
tell [show x ++ " is too large, throwing it away"]
return False
Instead of just returning a Bool
, this function returns a
Writer [String] Bool
. It’s a monadic predicate. Sounds fancy, doesn’t it? If
the number is smaller than 4
we report that we’re keeping it and then
return True
.
Now, let’s give it to filterM
along with a list. Because the predicate
returns a Writer
value, the resulting list will also be a Writer
value.
ghci> fst $ runWriter $ filterM keepSmall [9,1,5,2,10,3]
[1,2,3]
Examining the result of the resulting Writer
value, we see that
everything is in order. Now, let’s print the log and see what we got:
ghci> mapM_ putStrLn $ snd $ runWriter $ filterM keepSmall [9,1,5,2,10,3]
9 is too large, throwing it away
Keeping 1
5 is too large, throwing it away
Keeping 2
10 is too large, throwing it away
Keeping 3
Awesome. So just by providing a monadic predicate to filterM
, we were
able to filter a list while taking advantage of the monadic context that
we used.
A very cool Haskell trick is using filterM
to get the powerset of a list
(if we think of them as sets for now). The powerset of some set is a set
of all subsets of that set. So if we have a set like [1,2,3]
, its
powerset would include the following sets:
[1,2,3]
[1,2]
[1,3]
[1]
[2,3]
[2]
[3]
[]
In other words, getting a powerset is like getting all the combinations
of keeping and throwing out elements from a set. [2,3]
is like the
original set, only we excluded the number 1
.
To make a function that returns a powerset of some list, we’re going to
rely on non-determinism. We take the list [1,2,3]
and then look at the
first element, which is 1
and we ask ourselves: should we keep it or
drop it? Well, we’d like to do both actually. So we are going to filter
a list and we’ll use a predicate that non-deterministically both keeps
and drops every element from the list. Here’s our powerset
function:
powerset :: [a] -> [[a]]
powerset xs = filterM (\x -> [True, False]) xs
Wait, that’s it? Yup. We choose to drop and keep every element, regardless of what that element is. We have a non-deterministic predicate, so the resulting list will also be a non-deterministic value and will thus be a list of lists. Let’s give this a go:
ghci> powerset [1,2,3]
[[1,2,3],[1,2],[1,3],[1],[2,3],[2],[3],[]]
This takes a bit of thinking to wrap your head around, but if you just consider lists as non-deterministic values that don’t know what to be so they just decide to be everything at once, it’s a bit easier.
The monadic counterpart to foldl
is foldM
. If you remember your folds
from the folds section, you know that foldl
takes a binary
function, a starting accumulator and a list to fold up and then folds it
from the left into a single value by using the binary function. foldM
does the same thing, except it takes a binary function that produces a
monadic value and folds the list up with that. Unsurprisingly, the
resulting value is also monadic. The type of foldl
is this:
foldl :: (a -> b -> a) -> a -> [b] -> a
Whereas foldM
has the following type:
foldM :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
The value that the binary function returns is monadic and so the result of the whole fold is monadic as well. Let’s sum a list of numbers with a fold:
ghci> foldl (\acc x -> acc + x) 0 [2,8,3,1]
14
The starting accumulator is 0
and then 2
gets added to the accumulator,
resulting in a new accumulator that has a value of 2
. 8
gets added to
this accumulator resulting in an accumulator of 10
and so on and when we
reach the end, the final accumulator is the result.
Now what if we wanted to sum a list of numbers but with the added
condition that if any number is greater than 9
in the list, the whole
thing fails? It would make sense to use a binary function that checks if
the current number is greater than 9
and if it is, fails, and if it
isn’t, continues on its merry way. Because of this added possibility of
failure, let’s make our binary function return a Maybe
accumulator
instead of a normal one. Here’s the binary function:
binSmalls :: Int -> Int -> Maybe Int
binSmalls acc x
| x > 9 = Nothing
| otherwise = Just (acc + x)
Because our binary function is now a monadic function, we can’t use it
with the normal foldl
, but we have to use foldM
. Here goes:
ghci> foldM binSmalls 0 [2,8,3,1]
Just 14
ghci> foldM binSmalls 0 [2,11,3,1]
Nothing
Excellent! Because one number in the list was greater than 9
, the whole
thing resulted in a Nothing
. Folding with a binary function that returns
a Writer
value is cool as well because then you log whatever you want as
your fold goes along its way.
When we were solving the problem of implementing a RPN
calculator, we noted that it worked
fine as long as the input that it got made sense. But if something went
wrong, it caused our whole program to crash. Now that we know how to
take some code that we have and make it monadic, let’s take our RPN
calculator and add error handling to it by taking advantage of the Maybe
monad.
We implemented our RPN calculator by taking a string like "1 3 + 2 *"
,
breaking it up into words to get something like ["1","3","+","2","*"]
and then folding over that list by starting out with an empty stack and
then using a binary folding function that adds numbers to the stack or
manipulates numbers on the top of the stack to add them together and
divide them and such.
This was the main body of our function:
import Data.List
solveRPN :: String -> Double
solveRPN = head . foldl foldingFunction [] . words
We made the expression into a list of strings, folded over it with our folding function and then when we were left with just one item in the stack, we returned that item as the answer. This was the folding function:
foldingFunction :: [Double] -> String -> [Double]
foldingFunction (x:y:ys) "*" = (x * y):ys
foldingFunction (x:y:ys) "+" = (x + y):ys
foldingFunction (x:y:ys) "-" = (y - x):ys
foldingFunction xs numberString = read numberString:xs
The accumulator of the fold was a stack, which we represented with a
list of Double
values. As the folding function went over the RPN
expression, if the current item was an operator, it took two items off
the top of the stack, applied the operator between them and then put the
result back on the stack. If the current item was a string that
represented a number, it converted that string into an actual number and
returned a new stack that was like the old one, except with that number
pushed to the top.
Let’s first make our folding function capable of graceful failure. Its type is going to change from what it is now to this:
foldingFunction :: [Double] -> String -> Maybe [Double]
So it will either return Just
a new stack or it will fail with Nothing
.
The reads
function is like read
, only it returns a list with a single
element in case of a successful read. If it fails to read something,
then it returns an empty list. Apart from returning the value that it
read, it also returns the part of the string that it didn’t consume.
We’re going to say that it always has to consume the full input to work
and make it into a readMaybe
function for convenience. Here it is:
readMaybe :: (Read a) => String -> Maybe a
readMaybe st = case reads st of [(x,"")] -> Just x
_ -> Nothing
Testing it out:
ghci> readMaybe "1" :: Maybe Int
Just 1
ghci> readMaybe "GO TO HELL" :: Maybe Int
Nothing
Okay, it seems to work. So, let’s make our folding function into a monadic function that can fail:
foldingFunction :: [Double] -> String -> Maybe [Double]
foldingFunction (x:y:ys) "*" = return ((x * y):ys)
foldingFunction (x:y:ys) "+" = return ((x + y):ys)
foldingFunction (x:y:ys) "-" = return ((y - x):ys)
foldingFunction xs numberString = liftM (:xs) (readMaybe numberString)
The first three cases are like the old ones, except the new stack gets
wrapped in a Just
(we used return
here to do this, but we could have
written Just
just as well). In the last case, we do
readMaybe numberString
and then we map (:xs)
over it. So if the stack xs
is
[1.0,2.0]
and readMaybe numberString
results in a Just 3.0
, the result
is Just [3.0,1.0,2.0]
. If readMaybe numberString
results in a Nothing
then the result is Nothing
. Let’s try out the folding function by
itself:
ghci> foldingFunction [3,2] "*"
Just [6.0]
ghci> foldingFunction [3,2] "-"
Just [-1.0]
ghci> foldingFunction [] "*"
Nothing
ghci> foldingFunction [] "1"
Just [1.0]
ghci> foldingFunction [] "1 wawawawa"
Nothing
It looks like it’s working! And now it’s time for the new and improved
solveRPN
. Here it is ladies and gents!
import Data.List
solveRPN :: String -> Maybe Double
solveRPN st = do
[result] <- foldM foldingFunction [] (words st)
return result
Just like before, we take the string and make it into a list of words.
Then, we do a fold, starting with the empty stack, only instead of doing
a normal foldl
, we do a foldM
. The result of that foldM
should be a
Maybe
value that contains a list (that’s our final stack) and that list
should have only one value. We use a do
expression to get that value and
we call it result
. In case the foldM
returns a Nothing
, the whole thing
will be a Nothing
, because that’s how Maybe
works. Also notice that we
pattern match in the do
expression, so if the list has more than one
value or none at all, the pattern match fails and a Nothing
is produced.
In the last line we just do return result
to present the result of the
RPN calculation as the result of the final Maybe
value.
Let’s give it a shot:
ghci> solveRPN "1 2 * 4 +"
Just 6.0
ghci> solveRPN "1 2 * 4 + 5 *"
Just 30.0
ghci> solveRPN "1 2 * 4"
Nothing
ghci> solveRPN "1 8 wharglbllargh"
Nothing
The first failure happens because the final stack isn’t a list with one
element in it and so the pattern matching in the do
expression fails.
The second failure happens because readMaybe
returns a Nothing
.
When we were learning about the monad laws, we said that the <=<
function is just like composition, only instead of working for normal
functions like a -> b
, it works for monadic functions like a -> m b
.
For instance:
ghci> let f = (+1) . (*100)
ghci> f 4
401
ghci> let g = (\x -> return (x+1)) <=< (\x -> return (x*100))
ghci> Just 4 >>= g
Just 401
In this example we first composed two normal functions, applied the
resulting function to 4
and then we composed two monadic functions and
fed Just 4
to the resulting function with >>=
.
If we have a bunch of functions in a list, we can compose them one all
into one big function by just using id
as the starting accumulator and
the .
function as the binary function. Here’s an example:
ghci> let f = foldr (.) id [(+1),(*100),(+1)]
ghci> f 1
201
The function f
takes a number and then adds 1
to it, multiplies the
result by 100
and then adds 1
to that. Anyway, we can compose monadic
functions in the same way, only instead of normal composition we use <=<
and instead of id
we use return
. We don’t have to use a foldM
over a
foldr
or anything because the <=<
function makes sure that composition
happens in a monadic fashion.
When we were getting to know the list monad in the previous
chapter, we used it to figure out
if a knight can go from one position on a chessboard to another in
exactly three moves. We had a function called moveKnight
which took the
knight’s position on the board and returned all the possible moves that
he can make next. Then, to generate all the possible positions that he
can have after taking three moves, we made the following function:
in3 start = return start >>= moveKnight >>= moveKnight >>= moveKnight
And to check if he can go from start
to end
in three moves, we did the
following:
canReachIn3 :: KnightPos -> KnightPos -> Bool
canReachIn3 start end = end `elem` in3 start
Using monadic function composition, we can make a function like in3
,
only instead of generating all the positions that the knight can have
after making three moves, we can do it for an arbitrary number of moves.
If you look at in3, we see that we used moveKnight
three times and each
time we used >>=
to feed it all the possible previous positions. So
now, let’s make it more general. Here’s how to do it:
import Data.List
inMany :: Int -> KnightPos -> [KnightPos]
inMany x start = return start >>= foldr (<=<) return (replicate x moveKnight)
First we use replicate
to make a list that contains x
copies of the
function moveKnight
. Then, we monadically compose all those functions
into one, which gives us a function that takes a starting position and
non-deterministically moves the knight x
times. Then, we just make the
starting position into a singleton list with return
and feed it to the
function.
Now, we can change our canReachIn3
function to be more general as well:
canReachIn :: Int -> KnightPos -> KnightPos -> Bool
canReachIn x start end = end `elem` inMany x start
In this section, we’re going to look at an example of how a type gets
made, identified as a monad and then given the appropriate Monad
instance. We don’t usually set out to make a monad with the sole purpose
of making a monad. Instead, we usually make a type whose purpose is
to model an aspect of some problem and then later on if we see that the
type represents a value with a context and can act like a monad, we give
it a Monad
instance.
As we’ve seen, lists are used to represent non-deterministic values. A
list like [3,5,9]
can be viewed as a single non-deterministic value that
just can’t decide what it’s going to be. When we feed a list into a
function with >>=
, it just makes all the possible choices of taking an
element from the list and applying the function to it and then presents
those results in a list as well.
If we look at the list [3,5,9]
as the numbers 3
, 5
and 9
occurring at
once, we might notice that there’s no info regarding the probability
that each of those numbers occurs. What if we wanted to model a
non-deterministic value like [3,5,9]
, but we wanted to express that 3
has a 50% chance of happening and 5
and 9
both have a 25% chance of
happening? Let’s try and make this happen!
Let’s say that every item in the list comes with another value, a probability of it happening. It might make sense to present this like this then:
[(3,0.5),(5,0.25),(9,0.25)]
In mathematics, probabilities aren’t usually expressed in percentages,
but rather in real numbers between a 0 and 1. A 0 means that there’s no
chance in hell for something to happen and a 1 means that it’s happening
for sure. Floating point numbers can get real messy real fast because
they tend to lose precision, so Haskell offers us a data type for
rational numbers that doesn’t lose precision. That type is called
Rational
and it lives in Data.Ratio
. To make a Rational
, we write it as
if it were a fraction. The numerator and the denominator are separated
by a %
. Here are a few examples:
ghci> 1%4
1 % 4
ghci> 1%2 + 1%2
1 % 1
ghci> 1%3 + 5%4
19 % 12
The first line is just one quarter. In the second line we add two halves
to get a whole and in the third line we add one third with five quarters
and get nineteen twelfths. So let’use throw out our floating points and
use Rational
for our probabilities:
ghci> [(3,1%2),(5,1%4),(9,1%4)]
[(3,1 % 2),(5,1 % 4),(9,1 % 4)]
Okay, so 3
has a one out of two chance of happening while 5
and 9
will
happen one time out of four. Pretty neat.
We took lists and we added some extra context to them, so this
represents values with contexts too. Before we go any further, let’s
wrap this into a newtype
because something tells me we’ll be making some
instances.
import Data.Ratio
newtype Prob a = Prob { getProb :: [(a,Rational)] } deriving Show
Alright. Is this a functor? Well, the list is a functor, so this should probably be a functor as well, because we just added some stuff to the list. When we map a function over a list, we apply it to each element. Here, we’ll apply it to each element as well, only we’ll leave the probabilities as they are. Let’s make an instance:
instance Functor Prob where
fmap f (Prob xs) = Prob $ map (\(x,p) -> (f x,p)) xs
We unwrap it from the newtype
with pattern matching, apply the function
f
to the values while keeping the probabilities as they are and then
wrap it back up. Let’s see if it works:
ghci> fmap negate (Prob [(3,1%2),(5,1%4),(9,1%4)])
Prob {getProb = [(-3,1 % 2),(-5,1 % 4),(-9,1 % 4)]}
Another thing to note is that the probabilities should always add up to
1
. If those are all the things that can happen, it doesn’t make sense
for the sum of their probabilities to be anything other than 1
. A coin
that lands tails 75% of the time and heads 50% of the time seems like it
could only work in some other strange universe.
Now the big question, is this a monad? Given how the list is a monad,
this looks like it should be a monad as well. First, let’s think about
return
. How does it work for lists? It takes a value and puts it in a
singleton list. What about here? Well, since it’s supposed to be a
default minimal context, it should also make a singleton list. What
about the probability? Well, return x
is supposed to make a monadic
value that always presents x
as its result, so it doesn’t make sense for
the probability to be 0
. If it always has to present it as its result,
the probability should be 1
!
What about >>=
? Seems kind of tricky, so let’s make use of the fact
that m >>= f
always equals join (fmap f m)
for monads and think about
how we would flatten a probability list of probability lists. As an
example, let’s consider this list where there’s a 25% chance that
exactly one of 'a'
or 'b'
will happen. Both 'a'
and 'b'
are equally
likely to occur. Also, there’s a 75% chance that exactly one of 'c'
or
'd'
will happen. 'c'
and 'd'
are also equally likely to happen. Here’s a
picture of a probability list that models this scenario:
What are the chances for each of these letters to occur? If we were to
draw this as just four boxes, each with a probability, what would those
probabilities be? To find out, all we have to do is multiply each
probability with all of probabilities that it contains. 'a'
would occur
one time out of eight, as would 'b'
, because if we multiply one half by
one quarter we get one eighth. 'c'
would happen three times out of eight
because three quarters multiplied by one half is three eighths. 'd'
would also happen three times out of eight. If we sum all the
probabilities, they still add up to one.
Here’s this situation expressed as a probability list:
thisSituation :: Prob (Prob Char)
thisSituation = Prob
[( Prob [('a',1%2),('b',1%2)] , 1%4 )
,( Prob [('c',1%2),('d',1%2)] , 3%4)
]
Notice that its type is Prob (Prob Char)
. So now that we’ve figured out
how to flatten a nested probability list, all we have to do is write the
code for this and then we can write >>=
simply as join (fmap f m)
and
we have ourselves a monad! So here’s flatten
, which we’ll use because
the name join
is already taken:
flatten :: Prob (Prob a) -> Prob a
flatten (Prob xs) = Prob $ concat $ map multAll xs
where multAll (Prob innerxs,p) = map (\(x,r) -> (x,p*r)) innerxs
The function multAll
takes a tuple of probability list and a probability
p
that comes with it and then multiplies every inner probability with p
,
returning a list of pairs of items and probabilities. We map multAll
over each pair in our nested probability list and then we just flatten
the resulting nested list.
Now we have all that we need, we can write a Monad
instance!
instance Monad Prob where
return x = Prob [(x,1%1)]
m >>= f = flatten (fmap f m)
fail _ = Prob []
Because we already did all the hard work, the instance is very simple.
We also defined the fail
function, which is the same as it is for lists,
so if there’s a pattern match failure in a do
expression, a failure
occurs within the context of a probability list.
It’s also important to check if the monad laws hold for the monad that
we just made. The first one says that return x >>= f
should be equal
to f x
. A rigorous proof would be rather tedious, but we can see that if
we put a value in a default context with return
and then fmap
a function
over that and flatten the resulting probability list, every probability
that results from the function would be multiplied by the 1%1
probability that we made with return
, so it wouldn’t affect the context.
The reasoning for m >>= return
being equal to just m
is similar. The
third law states that f <=< (g <=< h)
should be the same as
(f <=< g) <=< h
. This one holds as well, because it holds for the list monad
which forms the basis of the probability monad and because
multiplication is associative. 1%2 * (1%3 * 1%5)
is equal to
(1%2 * 1%3) * 1%5
.
Now that we have a monad, what can we do with it? Well, it can help us do calculations with probabilities. We can treat probabilistic events as values with contexts and the probability monad will make sure that those probabilities get reflected in the probabilities of the final result.
Say we have two normal coins and one loaded coin that gets tails an astounding nine times out of ten and heads only one time out of ten. If we throw all the coins at once, what are the odds of all of them landing tails? First, let’s make probability values for a normal coin flip and for a loaded one:
data Coin = Heads | Tails deriving (Show, Eq)
coin :: Prob Coin
coin = Prob [(Heads,1%2),(Tails,1%2)]
loadedCoin :: Prob Coin
loadedCoin = Prob [(Heads,1%10),(Tails,9%10)]
And finally, the coin throwing action:
import Data.List (all)
flipThree :: Prob Bool
flipThree = do
a <- coin
b <- coin
c <- loadedCoin
return (all (==Tails) [a,b,c])
Giving it a go, we see that the odds of all three landing tails are not that good, despite cheating with our loaded coin:
ghci> getProb flipThree
[(False,1 % 40),(False,9 % 40),(False,1 % 40),(False,9 % 40),
(False,1 % 40),(False,9 % 40),(False,1 % 40),(True,9 % 40)]
All three of them will land tails nine times out of forty, which is less
than 25%. We see that our monad doesn’t know how to join all of the
False
outcomes where all coins don’t land tails into one outcome. That’s
not a big problem, since writing a function to put all the same outcomes
into one outcome is pretty easy and is left as an exercise to the reader
(you!)
In this section, we went from having a question (what if lists also carried information about probability?) to making a type, recognizing a monad and finally making an instance and doing something with it. I think that’s quite fetching! By now, we should have a pretty good grasp on monads and what they’re about.