I’m a software engineer living in San Francisco, constantly trying to level up my BBQ and powerlifting skills. 🇬🇧 🏳️‍🌈 🏋️‍♀️ 🍖

 

Monads, or Programmable Semicolons

In a spell of madness I’ve decided to explain the concept of monads in my own personal way, with a slant towards Python programmers.

Syntax as an Interface

Think about the case of magic methods in Python. They’re used as programmable ways to get special shorthand from the interpreter; for example object1 + object2 will be interpreted as object1.__add__(object2). This means that if your class implements the __add__(self, other) method, you can benefit from some interpreter shorthand for invoking it.

The other place this comes up in Python is for loops, a.k.a. iteration. This is a bit more useful than a simple operator override, since it deals with blocks of code. If an object implements a kind of sequence, lazy or otherwise, you can define __iter__ and use the object it returns as the subject of a for loop. For example:

class range(object):  
    def __init__(self, start, stop, step):
        self.start = start
        self.stop = stop
        self.step = step

    def __iter__(self):
        val = self.start
        while val < self.stop:
            yield val
            val += self.step

This means we can write code like:

for date in range(date(2014, 4, 5), date(2014, 4, 20), timedelta(days=1)):  
    print date.day

What would the alternative be if we didn’t have a programmable for? A method which returned eager-loaded native lists? An iterator function which accepted a callback? By having a syntax which you can plug into just by exporting an interface, Python enables you to write more semantic code with a negligible performance overhead.


In Python we take our ability to define order-of-operations for granted. What this means is that in the following program:

a = 0  
print("a is {}".format(a))  
a += 1  
print("a is {}".format(a))  

we assume that the output will be

a is 0
a is 1

because the order in which statements are executed is guaranteed by the interpreter; it implements the concept of sequential statements for us. If you consider each line of the program to be an expression with a result, Python implements a ‘newline operator’ (although in languages like C you’d consider it a ‘semicolon operator’) between statements, which joins subsequent ones together with certain semantics. In most CompSci educations this is called ‘control flow’.

However, monads were developed for languages like Haskell in which there are no statements; your entire program is effectively a single expression—like a giant Python lambda. What would such a program look like in today’s Python?

Well we could use the callback paradigm for this, which is what most JavaScript does today. Make every line of the program a lambda which takes a callback and calls it with the result of its expression:

# N.B.: it's OK to define functions and values this way as long as we don't
# try to change them later
incr = lambda val, callback: callback(val + 1)

(lambda a:
  incr([print("a is {}".format(a)), a][1],
       lambda a: print("a" is {}".format(a))))
)(0)

Which renders:

a is 0
a is 1

However this format becomes unwieldy quickly: the depth of callbacks is proportional not to complexity of the program, but to the number of sequential statements you want to run.

We can define a class which represents a value that can be fed into the next stage of computation:

class Val(object):  
    def __init__(self, value):
        self.value = value

    def then(self, func):
        return Val(func(self.value))

With this our program becomes:

incr = lambda a: a + 1  
do_print = lambda a: [print("a is {}".format(a)), a][1]  
Val(0).then(do_print).then(incr).then(do_print)  

We still get the correct output:

a is 0
a is 1

In this case Val is our monad: its __init__ and then methods define how we get values into the data processing pipeline, and how we chain parts of the pipeline together.

Beyond Applicability to Pure Languages

The real power of monads stands out when you realize that these simple semicolon-like semantics aren’t the only one you can implement; you can use this to use asynchronous I/O libraries and interfaces but still write code in the same way as synchronous I/O.

Imagine if the Python implementation of print() took a callback and a string, returning immediately without waiting for the string to actually be printed, and calling the callback when it’s finished printing. This is how everything works in Node.js, for example. We can expand our monad to support transforming this callback-style code into a more recognizable monadic sequence:

from functools import partial

class LazyVal(object):  
    def __init__(self, func):
        self.func = func

    # Note that we now define a classmethod for converting scalar objects
    # into LazyVals, since __init__ is now an implementation detail.
    @classmethod
    def value(cls, value):
        return cls(lambda cb: cb(value))

    def then(self, func):
        return LazyVal(lambda cb: self.func(partial(func, cb)))

    # Now that we're not executing everything immediately, we need a way
    # to map things from the LazyVal monad back into Pythonic flow
    # control.
    def eval(self, callback):
        return self.func(callback)

# in our invented Python implementation, print() now takes a string
# and a callback:
do_print = lambda cb, a: print("a is {}".format(a), lambda: cb(a))  
incr = lambda cb, a: cb(a + 1)

LazyVal.value(0).then(do_print).then(incr).then(do_print).eval(lambda a: None)  

We basically just implemented JavaScript Promises.


Another great example of flow control is exceptions, and this is supported through what is known as the Try monad:

class Try(object):  
    # Bring a variadic function into the Try monad
    @classmethod
    def lift(cls, func):
        return lambda *a, **kw: cls.make(lambda: func(*a, **kw))

    # Bring a 0-ary function into the Try monad
    @classmethod
    def make(cls, get_value):
        try:
            val = get_value()
        except Exception, exc:
            return Failure(exc)
        if isinstance(val, cls):
            return val
        return Success(val)

class Success(Try):  
    def __init__(self, value):
        self.value = value

    def then(self, func):
        return Try.lift(lambda: func(self.value))

    def or_else(self, func):
        return self

    def get_or_raise(self):
        return self.value

class Failure(Try):  
    def __init__(self, error):
        self.error = error

    def then(self, func):
        return self

    def or_else(self, func):
        return Try.lift(lambda: func(self.error))

    def get_or_raise(self):
        raise self.error

divide = lambda (a, b): a / b  
do_print = lambda a: [print("a is {}".format(a)), a][1]  
do_log_err = lambda exc: print("Exception: {}".format(exc))

Success((1.0, 2.0)).then(divide).then(do_print).or_else(do_log_err)  
Success((1.0, 0.0)).then(divide).then(do_print).or_else(do_log_err)  

The output of this program is:

a is 0.5
Exception: float division by zero

Which is exactly correct! You might have noticed that Success and Failure are nearly symmetrical; I find this to be one of the most elegant monads because it brings what seems like quite a janky type of control flow (one which C and Go don’t even implement) under the same understandable umbrella as the other monadic flows.

Syntax as an Interface, Reprise

Core to all of the monads we’ve explored today is a small interface:

  • A classmethod which maps values from regular Pythonic control flow into the monad (e.g. LazyVal.value or Val itself); and
  • A method on a monad which takes a function and returns a new monad representing the composition of the two pipes in the pipeline (then());

There are other optional parts to the interface, including:

  • Ways of getting values back out of a monad (e.g. Try.get_or_raise, LazyVal.eval)
  • Additional means of composition (e.g. Try.or_else)
  • Ways of transforming functions from a => b into a => monad[b] (e.g. Try.lift)

Again, let’s pretend we were in a language where the entire program had to evaluate to a single expression. Would we need a call to .then() for every ‘line’ of our program? Well, Haskell implements something called ‘do-notation’ wherein a series of statements operating under a monad can be strung together in a classical imperative programming style. This is the ‘programmable semicolon’ of monad lore:

do a <- Val(0)  
   print a
   a <- incr(a)
   print a

We’ve defined what it means to do one thing ‘and then’ do another thing by implementing a simple interface, and now our program can be a ‘pure’ expression whilst still having all the ordered I/O side-effects we need.


I dont expect Python to implement something like a do-notation, but perspective is worth 80 IQ points. It’s worth noting that Twisted’s Deferreds and JavaScript’s Promises are effectively a combination of the LazyVal and Try monads. Go forth and program semicolons!