Monoids: what they are, why they are useful, and what they teach us about software

Share on:

Monoids are a pretty interesting concept in software development. Monoids are everywhere. Monoids are simple yet powerful. And Monoids have a lot to teach us about software, in particular about composition and building powerful abstraction.

This post will take you through a small tour of what Monoids are and are for.

We will first define what Monoids are and show some first examples. We will then discuss the benefits they can bring to your code, and illustrate it through a more involved example. As closing thoughts, we will step back and think about what Monoids can teach us about programming in general.

Disclaimer: Starting from the second part, we will use Haskell and C++ for our examples, but this post is written such that the knowledge of these languages is not strictly required to follow it.

What is a Monoid?

We here review the basics of Monoids, by going through the formal definition and some classic examples. You are encouraged to skip this section if you are already familiar with Monoids.

Definition (formal)

A Monoid can be defined as a tuple $(M, op, e)$ where:

  • $M$ is a set of element
  • $op$ is an associative binary operation on two elements of M, returning a new element of M
  • $e$ is an element of $M$, neutral for op on both left and right side

Depending on the field of Math considered, Monoids have several definitions. This one is not the most general one, but will do for the rest of this post.

Definition (statically typed languages)

In a statically typed language, we can translate the notion of set into the notion of type. A Monoid consists of a type T and a function f obeying the following rules:

  • f takes two instances of T and returns a new instance of T
  • f is associative: $\forall a, b, c$ we have $f(f(a, b), c) = f(a, f(b, c))$
  • f has a neutral element e: $\forall a, f(e, a) = a = f(a, e)$

Some language like Haskell have an explicit interface for Monoids which a type can implement. Beware though that in general the same type T can participate in several Monoids.

Classical examples

I listed below some of the very classic and general examples. Almost all of them are pretty easy to identify and are documented in many online resources:

  • Integers form a Monoid under addition with the neutral element zero
  • Integers form a Monoid under multiplication with the neutral element one
  • Sequential containers form a Monoid under concatenation (strings, vectors…)
  • Associative containers form a Monoid under union (maps, sets…)

You can find a bunch more listed in this Wikipedia page dedicated to Monoids.

Monoids are everywhere

Here are some less classic examples, which I listed below to show that Monoids can be found in many different domains, and in many successful libraries:

  • C++ range-v3 form a Monoid under view::concat with the empty range as neutral element (*)
  • Money amounts define a Monoid under summation with the null amount as neutral element
  • Relative paths in a file system form a Monoid under appending
  • Access rights to files form a Monoid under intersection or union of rights

All of these examples are pretty close to the classic ones. Access rights are related to sets, ranges are related to sequential containers, money amounts are related to numbers (or to associative containers).

So an easy way to identify and recognise Monoids in your own code is simply to look at how the concept of your domain relate to the usual suspects.

(*) ERRATUM: range-v3 view::concat does not strictly speaking defines a Monoid: although we can view them as Monoids from a conceptual standpoint, the types do not align. Thank you Eric Niebler for pointing out that error.

Why Monoids are useful

Monoids are a great way to build complex behaviour out of simple elements, without having to introduce new concepts in your software. This section will explain why. The next section will illustrate it with code.

The closure gives composition free of intellectual charge

Because the binary operation of a Monoid takes two values of a given type and returns a new value of the same type (closure property), this operation can be chained indefinitely.

  • We can compose simple elements into composite elements
  • These composite elements can be composed further, just as simple elements can
  • Since we stay in the same world, this process can continue for ever

In addition to this, and because we are only dealing with one type, we get composition for free: we do not have to introduce new types, and therefore new concepts in our code.

A Monoid gives us a way to build complexity out of simplicity, with no conceptual cost added.

Associativity gives abstraction over construction details

Let us imagine for a moment that the binary operation was not associative:

  • The outcome of combining several elements together would depend on the order in which we combine them (the more elements we have, the larger the number of outcomes)
  • So to get a desired outcome, we would have to care about this composition order, all the way down to the simplest elements we combine together
  • We would therefore have to distinguish primitive elements (not built by combination) from composite ones, and care about how many combination took place

In short, we would have to know all about the details of construction of a value. Instead, associativity abstracts away the details of construction. Given a value, we do not have to care how it was built, and whether it is a composite or not. We might not be even able to tell (*).

This abstraction has additional benefits as well: we can split work of constructing a value to different parts of the software, or to different threads even.

(*) In fact, we must not be able to tell or the abstraction is broken. Associativity requires this kind of information to be erased. This is the point of an abstraction: getting rid of unimportant details.

A more involved example of Monoid

We will now illustrate how Monoids abstract away the details of creation of a value, to help us build powerful abstractions. To do so, we will create a small library to represent shapes in any number of dimensions.

We will give the examples in Haskell first, and translate them in C++ at the end of each section.

Abstract definition of a shape

In the context of this example, we will define a shape as being a region of space, with an arbitrary number of dimension. For instance, a disk is a region of a 2D space. It contains all the points at a distance from the centre of the disk lower or equal than the radius.

Our goal is to write a small library to define and combine such shapes together into arbitrarily complex shapes. To do so, and in order to be agnostic to sampling issues, we will use an idea I first read about from Conal Elliott. We will represent shapes as functions from coordinates to a boolean, whose value indicates whether the point is part of the shape. A simple and beautiful idea.

Shapes and coordinates

To define a shape in code, we will define a Shape type which wraps a function from a coordinate (any kind of coordinate, so we use a template parameter) to a boolean:

newtype Shape coord = Shape {  -- A shape templated on any type of coordinate
  isInShape :: coord -> Bool   -- Contains a function from a coordinate to a boolean
}

To define a shape in 2D, we define a type alias for coordinates in two dimensions. We then define a 2-dimentional shape as a type alias on a shape templated on a 2D coordinate:

type Coord2D = (Double, Double)
type Shape2D = Shape Coord2D

Viewing shapes as function is a pretty interesting model. Thanks to it, we can for instance pretty easily define the complement of a shape. This new shape is such that it contains all the points that were not in the previous shape (and does not include any of the points of the previous shape).

We call the function which computes this complement outside. It simply negates the predicate of the input shape (note that this works in any number of dimensions, with any coordinate system):

outside :: Shape coord -> Shape coord
outside s = Shape (not . isInShape s)

Here is the equivalent code (up to the strong typing) in C++:

template<class Coordinate>
using Shape = std::function<bool (Coordinate const&)>;

using Coord2D = std::pair<double, double>;
using Shape2D = Shape<Coord2D>;

template<class Coordinate>
Shape<Coordinate> outside(Shape<Coordinate> const& s) {
  return [=](Coordinate const& coord) {
    return not s(coord);
  };
}

Example of shapes

To define a shape, we only need to define a function which returns whether a coordinate is inside the shape or not. Here is how we would define a disk:

disk :: Coord2D -> Radius -> Shape2D
disk center radius =
  Shape $ \coord -> euclidianDistance center coord <= radius
  • The disk function returns a shape that wraps a lambda function
  • The lambda takes a coordinates and returns whether it is in the disk
  • To do so, it computes the euclidian distance of the coordinate to the centre
  • A point is in the disk if its distance to the center is inferior to the radius

We can test our disk in the REPL, by asking whether a point is inside it:

> let c = disk (1, 1) 1

> isInShape c (2, 0)
False

> isInShape c (2, 1)
True

Here is the equivalent code (up to the strong typing) in C++:

Shape2D disk(Coord2D center, double radius) {
  return [=](Coord2D const& c) {
    return distance(center, c) <= radius;
  };
}

Now that we have the basics for a shape, let us see what kind of Monoid we can find here.

Intersecting shapes is a Monoid

A shape forms a Monoid under the intersection of shapes. The intersection of two shapes is defined as another shape which contains only the coordinates that are in both input shapes:

intersect :: Shape coord -> Shape coord -> Shape coord
intersect s1 s2 =
  Shape $ \coord -> isInShape s1 coord && isInShape s2 coord

We can even generalize this to define the intersection of an arbitrarily large number of shapes (the resulting code is available in this GitHub Gist).

We also need a neutral element. In the case of the intersection, it would be the shape that covers the whole space. We can define it as a shape which always returns that a coordinate is inside it:

allSpace :: Shape coord
allSpace = Shape (const True) -- `const True` is a function which always returns True

Here is the equivalent code (up to the strong typing) in C++:

template<typename Coordinate>
Shape<Coordinate> allSpace() {
  return [](Coordinate const&) {
    return true;
  };
}

template<typename Coordinate>
Shape<Coordinate> intersect(Shape<Coordinate> const& lhs, Shape<Coordinate> const& rhs) {
  return [=](Coordinate const& coord) {
    return lhs(coord) && rhs(coord);
  };
}

Finally, because the and operator is associative, our intersect function is associative as well. So we have a Monoid. Let us play with it.

Note: We can define another Monoid as well for the superposition of shapes: the full code is provided here. This intersection / union process can be generalised to any predicate (with the AND and the OR operators).

Complex shapes out of simple shapes

Using our intersection operation, we can build more complex shapes. We can for instance intersect a disk with a rectangle, giving us a shape like this:

    *******************
  ***********************
 *************************
 *************************
  ***********************
    *******************

To build a ring-like shape, we can intersect the outside of a small disk with the interior of a big disk, both centered at the same point:

ring :: Coord2D -> Radius -> Radius -> Shape2D
ring center smallRadius bigRadius =
  intersect
    (disk center bigRadius)
    (outside (disk center smallRadius))

Using a small helper function, we can draw one in the console:

     ****    
   ********  
   ********  
  ***    *** 
  ***    *** 
  ***    *** 
  ***    *** 
   ********  
   ********  
     **** 

Here is the equivalent code (up to the strong typing) in C++:

Shape2D ring(Coord2D center, double smallRadius, double bigRadius) {
  return intersect(
    disk(center, bigRadius),
    outside(disk(center, smallRadius))
  );
}

Nothing forbids us to compose these shapes even further, as the intersection gives us back a shape like any other, we can intersect a ring with a rectangle for instance. There is no limit to our crazy imagination.

Building a vocabulary of shapes

Since the output of a Monoid operator is also a shape, we can define complex shapes out of very simple shapes and keep on composing them. The details of construction of these new shapes are abstracted.

We are also unable to distinguish a composite shape from a primitive shape. Composite shapes behave, look and smell exactly like primitive shapes, enriching our vocabulary of primitive shapes at a low cost.

Finally, and thanks to the associative nature of the Monoid operator, we never have to care about the order in which these intersections take place. We can first intersect a rectangle with a disk and then a ring, or do it the other way around. It does not matter: given a shape, we never have to care how it was constructed.

EDIT: as u/carrutstick noted, the use of the “order” word in the sentence above might be misleading. It refers to the order of application of the associative binary operation, not the commutativity, that is to say intersect(intersect(rectangle, disk), ring) is the same as intersect(rectangle, intersect(disk, ring)).

More advanced examples of Monoids

There are plenty more examples of Monoid we could talk about.

I have been lucky enough to participate to a training given by Cyrille Martraire, in which he gave us some more example on how to apply this notion to represent financial cashflows. French speakers might be interested in looking at his Monoid talk at the Devoxx 2015.

If you happen to attend the CppCon 2017, you might be interested in coming our presentation on How to apply Functional Programming ideas to build a HTTP router API. At this occasion, Jeremy Demeule and myself will be showing two more examples of Monoids.

Closing thoughts: what Monoids teach us about software

We saw that Monoids have very good characteristics in terms of composition and abstraction, two very valuable characteristics to get in our code. By looking at how they achieve this, we can make some parallel with other good ways to achieve composition.

Associativity and ordering

As we just saw, the associativity is a pretty important property of the Monoid operation. Had the binary operation not been associative, the whole abstraction would have collapsed.

  • Without associativity, we have to care about the order in which we combine the elements
  • This ordering provokes an combinatorial explosion of the possible outcomes

In fact, associativity is not as important as being ordering independent (it just so happens that associativity is sufficient to get this property for Monoids). In general, code whose outcome does not depend on its order of execution favours greater composition.

Side effect and ordering

Let us think about the obvious kind of things that would instantaneously make us lose associativity (or alternatively, make the result of our program sensitive to ordering).

The obvious suspect is side effects. Any side effect (an action performed by a process, visible from the rest of the program) inside the binary operation of a Monoid would instantaneously break the associative property. For instance, a adding a print statement would result in different traces being emitted depending on the order of evaluation.

In general, any assignment to a variable aliased externally, or any IO effect, will introduce a dependency to ordering and time in our programs. It will make the program vulnerable to issues such as race conditions, but also greatly limit its composability.

Side effect and composition

In general, there is a pretty strong correlation between our ability to compose program and the amount of side-effects this program relies on to perform its job.

If we see Monoids are a small laboratory to experiment with the consequences of side-effects on a whole program, we understand that limiting side effects to a limited scope in our software is key to build composable and leak-free abstractions.

Updated:

 

<
Previous Post
List Monad connection with Non-deterministic Polynomial time complexity
>
Next Post
Transforming data structures into types: an introduction to dependent typing and its benefits

Comments