The Lawless Guide to Monads

This is an introduction to the concept of monads in Haskell for those who are starting to get familiar with the basics of the language - things like polymorphism, type signatures, higher-order functions and so. I will use a very simple example that looks useless but will hopefully help you understand this concept.

Start a REPL session with the command ghci and try the following:

ghci> f = (\x -> [2*x])
ghci> f 3
[6]

Here we define a function f that takes a number and gives back a list of number(s). We can use the :type command to check this:

ghci> :type f
f :: Num a => a -> [a]

It can take some time to get use to reading type signatures but this is a fundamental part of programming in Haskell so make sure you understand them.

We will be playing quite a bit with lists so for brevity, let’s create, say, a list of numbers named xs that we can use for trying stuff out:

ghci> xs = [1,2,3]
ghci> :type xs
xs :: Num a => [a]

Since we have a list of numbers and a function that takes a number, we could use the map function to apply f to each element of xs:

ghci> map f xs
[[2],[4],[6]]

As you can see, we have successfully multiplied each element of xs by two, although, because our function returns a list with a single number, we get this funky list of lists of numbers (i.e. Num a => [[a]]). But worry not: as Haskell is such a great language, we can easily flatten it:

ghci> concat (map f xs)
[2,4,6]

If by any chance you are wondering “can we make this shorter?” I’ve got great news for you :D Haskell is such a magnificent language that it provides an operator called “bind” to do the same thing in half the characters!

ghci> xs >>= f
[2,4,6]

Fantastic isn’t it? Next,

ghci> xs >>= (\x -> [2*x])
[2,4,6]
ghci> xs >>= (\x -> return (2*x))
[2,4,6]

The first expression is there just to remind you what f is. Now, the second expression is equivalent to the first but it uses a function named return. Note that this name is unfortunately misleading: this is not a statement for “returning” some value to the caller (like it would be for instance in C), this is just an ordinary function, and in this case we are simply applying it to (2*x).

For the sake of clarity, this is an equivalent expression without using >>=:

concat (map (\x -> return (x*2)) xs)
[2,4,6]

Since we still get the same result, it seems like return is just wrapping a value in a list, no? Hmmm… Let’s check its type signature:

ghci> :type return
return :: Monad m => a -> m a

This says that return is a function that takes a value of some generic type a and gives back a monad of type a. “But weren’t we expecting a list of a?”, I hear you say. Well, guess what? Lists are monads!

You see, for a type to be a monad it needs to implement the functions >>= (bind) and return that work for its own type, and Haskell lists do that out of the box. This idea of classifying something in terms of what you can do with it might seem a bit strange if you don’t have experience with type classes1. If this is the case for you then you may want to revisit ad hoc polymorphism in Haskell to get more familiar with it.

Back to monads, we haven’t checked the signature of >>= yet:

ghci> :type (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b

Since >>= is an infix function, the first parameter represents the left-hand-side, this is m a, and the second one the right-hand-side, (a -> m b). If we consider that in our case m is a list then you could spell out this signature as “bind is a function that takes two parameters: (i) a list of type A and (ii) a function that takes a value of type A and gives back a list of type B; bind then gives back a list of type B”.

If there is one thing you should remember from all this it is this >>= function, in particular its type signature (i.e. m a -> (a -> m b) -> m b). As a mnemonic you could say that “a monad represents a type that can be bound over”, similar to the classic “a functor represents a type that can be mapped over”.

Congrats, you now know what a monad is. There are still a few details2 you could learn but you’ll be alright skipping those for now. In my next article I will show you how to use a monad called IO to deal with I/O operations in a purely functional way.

  1. 📆 March 29, 2024
  2. 🏷️ functional programming

Cryptographic protocol analysis with Verifpal

Most software projects avoid formal verification due to time constraints and expertise requirements. Tools like the Coq proof assistant, commonly used to introduce the topic to computer science students, can appear too cumbersome for real-world applications and therefore hard to justify.

In contrast, in the field of cryptography specialised tools such as ProVerif have been regularly used for the last two decades for verifying important properties of cryptographic protocols, but these still involve writing scarily long proofs.

However, in recent years, multiple similar tools that try simplify the process have been developed. One that caught my attention was Verifpal, whose creators have prioritised user-friendliness in its design. Created by Nadim Kobeissi, Georgio Nicolas and Mukesh Tiwari, it was first released in 20191.

The Verifpal User Manual is really fun and it only requires some undergraduate-level knowledge of cryptography. In this post, I will show you a short exercise from the manual (“Challenge-Response Protocol”) to illustrate what automated formal verification looks like with Verifpal.

So, first thing our model will need is an adversary, which in this case is an active attacker that has total control of the network2:

attacker[active]

Yup, that’s all. Now we need some principals, this is, the parties who are trying to communicate:

principal Server [
	knows private s
	gs = G^s
]
principal Client [
	knows private c
	gc = G^c
	generates nonce
]

Here we have two principals, a client named Client and a server named Server. The server has a secret s that in this example represents a Diffie-Hellman secret exponent, and its derived public key named gs. Similarly, the client knows its own secret exponent and public key, plus a random nonce that the client will use as a challenge for the server:

Client->Server: nonce
principal Server[
	proof = SIGN(s, nonce)
]

The first line means that Client sends the nonce to Server through the network. Then Server digitally signs the nonce using its secret exponent s.

Server->Client: gs, proof // uh-oh
principal Client[
	valid = SIGNVERIF(gs, nonce, proof)?
	generates attestation
	signed = SIGN(c, attestation)
]

Now Server sends its public key gs and the signed challenge proof to Client. The // uh-oh is just a code comment (ignore it for now ;p). Then Client validates the signed challenge (the ? after SIGNVERIF tells that the execution should end here if the signature verification fails), generates an attestation message and signs it.

Client->Server: [gc], attestation, signed
principal Server[
	storage = SIGNVERIF(gc, attestation, signed)?
]

In the final part of our protocol Client sends its public key gc, the attestation message and the digital signature. Then Server verifies that attestation was signed by Client.

Note the square brackets wrapping gc: it means that this value cannot be tempered by an attacker, only read. Why? Well, let’s just assume the principals are able to share public keys through some other channel that they consider secure, so basically we want to keep this part out of the scope of our model.

Our protocol is now defined. Let’s see if there is any way an attacker could impersonate the server or the client. For that we need to check if an attacker can do anything interesting with our signatures:

queries[
	authentication? Server->Client: proof
	authentication? Client->Server: signed
]

A query is the way you ask Verifpal to verify properties of your protocol. Here we care about the authenticity of some values; could anyone perform a successful MITM? Let’s run the following command in the shell to see:

$ verifpal verify challenge-response.vp
Verifpal 0.27.2 - https://verifpal.com
  Warning • Verifpal is Beta
 Verifpal • Parsing model 'challenge-response.vp'...
[...]
 Analysis • Initializing Stage 2 mutation map for Client... 
   Result • authentication? Server -> Client: proof — When:
[...]
            gs → G^nil ← mutated by Attacker (originally G^s)
            proof → SIGN(nil, nonce) ← mutated by Attacker (originally SIGN(s, nonce))
            valid → nil ← obtained by Attacker
            storage → nil ← obtained by Attacker
            proof (SIGN(nil, nonce)), sent by Attacker and not by Server, is successfully used in SIGNVERIF(G^nil, nonce, SIGN(nil, nonce))? within Client's state.
[...]

The full output is a bit long for a blog post, but as you can see an attacker was able to impersonate the server by swapping the server’s public key gs for its own and then using it to sign a new proof!

This happened because we forgot to indicate that gs, like gc, can’t be tempered; remember the “uh-oh”? Let’s fix that:

Server->Client: [gs], proof

Now if we run the verify command again we will get this instead:

[...]
 Verifpal • All queries pass. 

 Verifpal • Thank you for using Verifpal. 

Nice! This confirms that the properties we wanted to verify are true ^_^

  1. 📆 December 27, 2023
  2. 🏷️ cryptography, formal methods

A couple of Readline incantations

You may have never heard of GNU Readline before, but provably you have used it many times. Readline is a library for command-line interfaces that manages user interaction: it’s the thing that shows you a prompt and allows you to write commands in programs like Bash and the Postgres client.

One of the “tricks” you learn when using a shell for the first time is to press Tab to autocomplete commands and paths, that’s Readline. You also learn that you can press Up (or Ctrl+p) to navigate the command history, that’s Readline too. There are of course many key bindings to edit text in different ways, and if you are a daily CLI user reading the Readline manpages is a must. One that I use all the time with Bash, for example, is Alt+. (aka yank-last-arg) which grabs the last argument of the previous command, so if you just created a new directory like mkdir mynewproject you can quickly go inside it by typing cd followed by Alt+.1 and enter. It is also possible to indicate which argument in particular you want from the history by using Alt+arg_number followed by Ctrl+Alt+Y (aka yank-nth-arg).

A very cool thing about Readline is that it allows us to create new custom key bindings. For instance, ages ago I wrote the following logical extension of the yank-nth-arg that yanks all the arguments starting from the nth argument:

# /etc/inputrc
"\e,": "0\eb!:\ef\C-b\C-d*\e\C-e"

To use this, you first have to type the initial argument number followed by Alt+, (i.e. a comma instead of period). This basically combines already existing features into one Readline command, but the power of Readline doesn’t stop there: You can also create completely new functionalities by running shell commands and using their output on the fly!

This is one I personally think is a must for Bash:

"\e\C-p": "$(printf %q "$(pwd)")\e\C-e"

Now you can press Ctrl+Alt+P to get the current full path, which you can use, for example, to quickly cd into any directory ancestor by pressing Alt and then Backspace as many times as you need.