Functions Everywhere
In the previous post we used a small subset of a given programming language and tried (successfully) to define our own implementation of recursion. Let’s try to use even smaller subset.
Imagine… We don’t have numbers, strings, operations… Everything we have is the ability to define functions (abstraction) and to call (invoke) them (application). Let’s start with this idea in mind and see where it’ll take us.
In The Beginning Was The Function
So, what do we have? We can define functions:
fn (x) -> x end
…and we can call (invoke) them:
fn (x) -> x end.(fn (x) -> x end)
That’s all we can do.
But what should we pass to a function as its arguments?
We only have one type of data - functions. We can define them (creating instances of this data type) and invoke them. If we look at the code sample above, we can see that we are calling the function with a function as an argument. Yes. This is it! The only thing we can pass to a function call is another (or the same) function.
Now, one last constraint : we only have the ability to define functions of one argument.
So to summarize:
- We only have one type of data - functions.
- We know how to define a function of one argument.
- We know how to call a function. From (1) : one function can be called with a parameter of type
function
.
And now, to make our lives a bit easier, we’ll add one additional rule:
- We can ‘give a name’ to our functions, so we can reuse them:
id = fn(x) -> x end
id.(id)
What should we do with the rules we have? The answer is simple : let’s think of a few challenges we could solve using only these four base rules.
So here are the challenges:
- Let’s find a way to pass more than one argument to a function call
- Let’s define the simplest data type - the boolean
- Let’s define numbers and operations over those numbers
Multiple Arguments
The function, we defined and called above is known as identity or I combinator. When called with an argument, it just returns it.
We want to define another well known combinator - K
.
This is a very simple function, also known as ‘The Kestrel’ (if you want to know why, take a look at this book).
The interesting thing about the Kestrel combinator is that its definition looks like this : K(x, y) = x
, which means that it is a function of two arguments and it returns the first of them.
The question is : How to define it, when we can only define functions of one argument?
To answer this question, we can look at a technique, which main idea is to change the logic of a function of multiple arguments to a function of one argument, which returns a function of one argument, which returns a function of one argument and so on, depending on the number of the arguments of the original function. The last function of one argument executes the original function’s calculation using the arguments of the functions wrapping it. This technique is known as ‘schönfinkelisation’. Let’s look at a code example:
a = fn (x, y, z) -> (x + y) * z end
b = fn (x) ->
fn (y) ->
fn (z) -> (x + y) * z end
end
end
a.(2, 3, 4)
# 20
b.(2).(3).(4)
# 20
We can see that the result of calling the two functions is the same, but the invocations and the definitions differ. The important thing is that the logic is kept the same. And yes, we went out of our constraints a bit for this example. Interesting fact: this technique was named after its inventor Moses Schönfinkel and further developed by Haskell Curry. That’s why today we call it ‘currying’ and not ‘schönfinkelisation’. Guess why? :)
This technique allows us to think of a function of one argument returning a function of one argument and so on, as a functions of multiple arguments. That’s how we are going to define the Kestrel:
kestrel = fn (x) -> fn (y) -> x end end
We can define another combinator, named after a bird - ‘The Starling’, also known as the S combinator.
Its definition is: S(x, y, z) = x(z, (y(z)))
.
This means that it expects three arguments: the first, x
, should be a function of (at least) two arguments,
which S
calls with its third argument, z
and y(z)
.
And here is ‘The Starting’, written in Elixir
:
starling =
fn (x) ->
fn (y) ->
fn (z) ->
x.(z).(y.(z))
end
end
end
Interesting fact:
starling.(kestrel).(kestrel)
is
id
Notice that comparing functions is hard.
In this case is not so hard, but generally it is a hard thing to do.
We can assume that two functions are identical if they return the same value for all of the possible values we can pass to them in a function call.
This equality is hard to prove.
We say that two functions are functionally equivalent, if for every y
they return the same value.
As mentioned above, proving that id
and starling.(kestrel).(kestrel)
are functionally equivalent is not that hard:
starling.(kestrel).(kestrel)
# By the definition of starling =>
fn (z) ->
kestrel.(z).(kestrel.(z))
end
# By the definition of kestrel =>
fn (z) ->
z
end
# By the definition of id =>
id
Now we are ready to define a new type of data.
Boolean Values
Let’s define TRUE and FALSE like this:
bool_true = fn (a) -> fn (b) -> a end end
bool_false = fn (a) -> fn (b) -> b end end
Which means that we define the two boolean values as follows:
- TRUE is a function of two arguments, which always returns the first one.
- FALSE is a function of two arguments, which always returns the second one.
We could also define them using the S and K combinators:
bool_true = kestrel
bool_false = starling.(kestrel)
The first line is clear, and the second:
K(x, y) = x
S(x, y, z) = x(z, y(z))
=>
S(K, a, b) = K(b, a(b)) = b === FALSE(a, b) = b
Now we have the two boolean values. Starting from this point, we can define the simplest operation on them: NOT:
bool_not = fn (p) -> p.(bool_false).(bool_true) end
bool_not.(bool_true)
# bool_true.(bool_false).(bool_true) => bool_false
bool_not.(bool_false)
# bool_false.(bool_false).(bool_true) => bool_true
It’s easy to define a function for conjunction. Let’s think about it:
- We know, that if we pass to
bool_true
its two arguments, it will always return the first one. - We know, that if we pass to
bool_false
its two arguments, it will always return the second one. - We know, that AND should be a function of two boolean arguments; If the first one is
bool_false
, AND should returnbool_false
, we don’t need to check the second one. - From (1) and (2),
bool_true.(x).(bool_false)
isx
, andbool_false.(x).(bool_false)
isbool_false
. - From (3) and (4), we can define AND as:
bool_and = fn (a) -> fn (b) -> a.(b).(bool_false) end end
bool_and.(bool_true).(bool_true)
# bool_true.(bool_true).(bool_false) => bool_true
bool_and.(bool_false).(bool_true)
# bool_false.(bool_true).(bool_false) => bool_false
bool_and.(bool_true).(bool_false)
# bool_true.(bool_false).(bool_false) => bool_false
bool_and.(bool_false).(bool_false)
# bool_false.(bool_false).(bool_false) => bool_false
In a similar fashion we can define OR as:
bool_or = fn (a) -> fn (b) -> a.(bool_true).(b) end end
So we have logical NOT, conjunction and disjunction. We can also define IF or rather COND. This is a function of three arguments. If the first one is TRUE, it will return the second one, if the first one is instead FALSE, it will return the third one:
bool_cond = fn (a) -> fn (b) -> fn (c) -> a.(b).(c) end end end
Let’s break out of our constraint to have only one type of data and write a function,
that returns the right Elixir
boolean value when we call it with bool_true
or bool_false
:
church_to_bool =
fn (church_bool) ->
church_bool.(true).(false)
end
church_to_bool.(bool_true)
# true
church_to_bool.(bool_false)
# false
church_to_bool.(bool_and.(bool_false).(bool_true))
# false
This way of representing the boolean values and the operations on them as functions is called Church’s encoding of booleans and is named after Alonzo Church. He used this in the lambda calculus. We’ll be talking a lot about the lambda calculus and Church’s encodings. For example, in the following section we will define our own numbers.
Numbers And Arithmetic Operations
We’ll do something very similar to the way we defined our boolean values. We’ll assume that:
0
iszero(f, x) = x
.1
isone(f, x) = f(x)
.2
istwo(f, x) = f(f(x))
.3
isthree(f, x) = f(f(f(x)))
.
The idea is that a number is a function of two arguments. When the number is zero, the function just returns its second argument. When the number is one, the function applies its first argument on the second one : f(x). When the number is two, the function applies its first argument on itself and then the result on its second argument : f(f(x)). We can continue forever like that, so n is fn(x).
We can write this representation in Elixir
as:
zero = fn (f) -> fn (x) -> x end end
one = fn (f) -> fn (x) -> x |> f.() end end
two = fn (f) -> fn (x) -> x |> f.() |> f.() end end
three = fn (f) -> fn (x) -> x |> f.() |> f.() |> f.() end end
# And so on...
Now, let’s introduce a few operations on these numbers.
Our first task is to define a function is_zero(n)
, that takes a number as an argument and returns bool_true
if the number is zero
or bool_false
if it isn’t.
Let’s think about it:
- The numbers are functions of two arguments.
- We can pass two values to a given number and examine its behaviour.
- If we pass the values
g
anda
to a number,g
will be called one or more times if the number is notzero
. If the number iszero
,g
won’t be called at all. - From (3) :
g
should be a function, which always returnsbool_false
anda
could bebool_true
. Now if we pass these functions tozero
, we’ll get the second argument :a
orbool_true
. If the number is notzero
, we’ll get the result of callingg
, which isbool_false
.
bool_true = fn (a) -> fn (b) -> a end end
bool_false = fn (a) -> fn (b) -> b end end
is_zero =
fn (n) ->
n.(fn (_) -> bool_false end).(bool_true)
end
is_zero.(zero) |> church_to_bool.()
# true
is_zero.(one) |> church_to_bool.()
# false
The next task is to implement a function, which given a number returns the next number.
This is the function succ(n)
, which when called with zero
returns one
, when called with one
, returns two
and so on.
As we know the numbers themselves are functions of two arguments, so n
is actually fn(x) and n+1
is f(fn(x)) or f(n(f, x)).
We can define succ
as a function of three arguments : the number n
, f
and x
.
This function constructs a new number : f(n(f, x)):
succ =
fn (n) ->
fn (f) ->
fn(x) ->
f.(n.(f).(x))
end
end
end
succ.(zero)
# f.(x) => one
succ.(two)
# f.( f.(f.(x)) ) => three
four = succ.(three)
five = succ.(four)
The function for adding two numbers is very similar. We construct a new number using the two numbers we want to add- m and n. The new number is m(f, n(f, x)) => m(f, fn(x)) => fm(fn(x)) => fm + n(x) => m + n:
plus =
fn (m) ->
fn (n) ->
fn (f) ->
fn(x) ->
m.(f).(n.(f).(x))
end
end
end
end
six = plus.(four).(two)
It is hard to debug the results we get by invoking these operations on numbers.
That’s why let’s leave our constraints behind for a bit and write a function which transforms our numbers based on functions to Elixir
numbers:
church_to_int =
fn (n) ->
n.(&(&1 + 1)).(0)
end
six |> church_to_int.()
# 6
seven = succ.(six)
seven |> church_to_int.()
# 7
plus.(four).(five) |> church_to_int.()
# 9
twenty = plus.(plus.(four).(five)).(plus.(six).(five))
twenty |> church_to_int.()
# 20
Let’s define another operation - pred
.
This one will return the previous number when we apply it to a number.
Important thing : we don’t have negative numbers, so pred(zero) == zero
.
Let’s think.
- To calculate
n+1
fromn
(usingsucc
), we passn
’s value ( fn(x) ) to its first argument f. - From (1) : to calculate
n-1
we just have to remove one f fromn
’s return value.
The problem is that doing this is not as easy as it sounds.
The idea is to construct n
applying f to x n times and on every step to keep the value of the previous one.
In the end we are going to return the previous value for fn(x), which is fn-1(x).
Our task now is to think of a way to keep both the previous and current result of applying f on every step. If we had a tuple of values it would work, wouldn’t it? But we don’t have this data type in our little language subset.
The solution is : let’s define a custom PAIR type, based on functions, like we did it for the booleans and the numbers. We assume that a PAIR is defined as:
pair =
fn (a) ->
fn (b) ->
fn (c) -> c.(a).(b) end
end
end
The only thing left is to implement retrieving the first or the second element of a pair
:
first =
fn (p) ->
g = fn (x) -> fn (y) -> x end end
p.(g)
end
second =
fn (p) ->
g = fn (x) -> fn (y) -> y end end
p.(g)
end
first.(pair.(one).(two)) |> church_to_int.()
# 1
second.(pair.(one).(two)) |> church_to_int.()
# 2
The above idea is simple - the function, constructing the pair, expects 3 arguments, but we call it with only two - the two elements of the pair.
The third argument is a function of two arguments, which will be called with the two elements of the pair as arguments.
So if we pass the right function as third argument of pair
, we could get the right element.
Interesting fact is, that the function we use to read the first element of a pair in first
is bool_true
and the one in second
is bool_false
.
It is common to see the same function playing different role, depending on the type of data it operates on or represents.
Don’t forget that the K combinator is also bool_true
.
Let’s return to the pred
operation.
It is time to define the case, in which we call it with zero
:
zero_case = pair.(zero).(zero)
And now the case for every other number:
main_step =
fn (p) ->
pair.(second.(p)).(succ.(second.(p)))
end
Basically it takes a pair of numbers and returns a new pair of numbers. The new pair’s first element is the second element of the original pair and its second element is the second element of the original pair, incremented by one.
- We know that the number itself is a function of two arguments, which applies its first argument to its second one as many times, as the value of the number it represents.
- The
main_step
function expects a pair of numbers and creates a new pair using the following logic :(n, m)
->(m, m+1)
. - If we start with passing
(0, 0)
tomain_step
once, we’ll get(0, 1)
. If we pass this result tomain_step
, we’ll get(1, 2)
. Passing this tomain_step
will produce(2, 3)
. So, callingmain_step
on(0, 0)
n times will return(n-1, n)
. If we now read the first element of the pair we will get what we wanted -n-1
.
If we have the number n (don’t forget, it is a function of two arguments) and call it like this: n.(main_step).(zero_case)
,
by the definition of n
, main_step
will get applied to zero_case
exactly n
times.
Returning the first element of this result will produce n-1
:
pred =
fn (n) ->
first.(n.(main_step).(zero_case))
end
# And so:
pred.(zero) # =>
first.(zero.(main_step).(zero_case)) # =>
first.((fn (f) -> fn (x) -> x end end).(main_step).(zero_case)) # =>
first.(zero_case) # =>
zero
pred.(two) # =>
first.(two.(main_step).(zero_case)) # =>
first.((fn (f) -> fn (x) -> f.(f.(x)) end end).(main_step).(zero_case)) # =>
first.(main_step.(main_step.(zero_case))) # =>
first.(main_step.(pair.(second.(zero_case)).(succ.(second.(zero_case))))) # =>
first.(main_step.(pair.(zero).(one))) # =>
first.(pair.(second.(pair.(zero).(one))).(succ.(second.(pair.(zero).(one))))) # =>
first.(pair.(one).(succ.(one))) # =>
one
We can define subtraction very easy, using pred
.
If we want to subtract m
from n
we just have to apply pred
exactly m times to n
:
minus =
fn (n) ->
fn(m) ->
m.(pred).(n)
end
end
minus.(seven).(three) |> church_to_int.()
# 4
Again, we used the fact that our numbers are just functions of two arguments which apply their first argument on their second as many times as their numerical value.
Let’s define multiplication using the same property:
mult =
fn (n) ->
fn (m) ->
m.(plus.(n)).(zero)
end
end
mult_without_plus =
fn (n) ->
fn (m) ->
fn (f) ->
n.(m.(f))
end
end
end
one_hundred = mult.(twenty).(five)
one_hundred |> church_to_int.()
# 100
And what about comparing numbers? Not a problem! Here is the operation for checking if two numbers are equal:
num_eq =
fn (n) ->
fn (m) ->
# bool_and.(is_zero.(minus.(n).(m))).(is_zero.(minus.(m).(n)))
bool_and.(is_zero.((m).(pred).(n))).(is_zero.((n).(pred).(m)))
end
end
num_eq.(one_hundred).(one_hundred) |> church_to_bool.()
# true
We use is_zero
and minus
.
If we subtract m
from n
and we get zero
, the numbers are equal.
We don’t have negative numbers, so if we subtract bigger number from smaller one, we’ll get zero
(pred(zero) == zero
) and num_eq
will return bool_true
.
That’s why we use bool_and
to subtract both n
from m
and m
from n
and check if both of these subtractions are zero
.
The strict inequalities, greater than and less than are similar:
num_gt =
fn (n) ->
fn (m) ->
bool_and.(is_zero.((n).(pred).(m))).(bool_not.(is_zero.((m).(pred).(n))))
end
end
num_lt =
fn (n) ->
fn (m) ->
bool_and.(is_zero.((m).(pred).(n))).(bool_not.(is_zero.((n).(pred).(m))))
end
end
num_gt.(two).(one) |> church_to_bool.()
# true
num_gt.(two).(three) |> church_to_bool.()
# false
num_lt.(two).(three) |> church_to_bool.()
# true
num_lt.(four).(three) |> church_to_bool.()
# false
We know that (n).(pred).(m)
returns zero
, if n
is greater than m
or equal to m
.
We don’t want to return bool_true
if n
and m
are equal.
That’s why we also check that (m).(pred).(n)
is not zero
.
And this is how the strictly greater than check works.
The implementation of the strictly less than check is equivalent.
Defining their non-strict counterparts is even easier:
num_gt_eq =
fn (n) ->
fn (m) ->
is_zero.((n).(pred).(m))
end
end
num_lt_eq =
fn (n) ->
fn (m) ->
is_zero.((m).(pred).(n))
end
end
num_gt_eq.(two).(one) |> church_to_bool.()
# true
num_gt_eq.(two).(two) |> church_to_bool.()
# true
num_gt_eq.(two).(three) |> church_to_bool.()
# false
num_lt_eq.(two).(three) |> church_to_bool.()
# true
num_lt_eq.(three).(three) |> church_to_bool.()
# true
num_lt_eq.(four).(three) |> church_to_bool.()
# false
Let’s summarize : we can compare numbers, add them to one another, subtract them and multiply them.
It was not very easy to define the subtraction operation.
It required the pred
function, which required working with pairs.
Our next tasks is even harder. We are going to define the division operation. The algorithm can be described like this:
a / b =
if a >= b do
1 + (a - b) / b
else
0
end
We have all the operations we need to implement this algorithm, but the division operation is used recursively. That’s why we’ll have to use the Y combinator:
y = fn (h) ->
(fn (f) ->
f.(f)
end).(fn (g) ->
h.(fn (x) -> g.(g).(x) end)
end)
end
And with the Y combinator, the definition of the division operation looks like this:
divide =
fn (n) ->
fn (m) ->
y.(fn (divide1) ->
fn (current) ->
num_gt_eq.(current).(m).(
fn (_) -> succ.(divide1.(minus.(current).(m))) end
).(
fn (_) -> zero end
).(zero)
end
end).(n)
end
end
divide.(twenty).(two) |> church_to_int.()
# 10
divide.(twenty).(three) |> church_to_int.()
# 6
divide.(twenty).(four) |> church_to_int.()
# 5
Actually it was not so hard. But that’s because we are using a lot of things we defined in this and the previous posts. Basically we are implementing the pseudo-code from above with our own operations and with the Y combinator, providing the recursion.
When we need to do n / m
, we call divide.(n).(m)
.
This will pass n
as the initial current
value.
If current
is greater than or equal to m
, we add 1
to the recursive call to divide
with current - m
and m
as arguments (the first time the arguments are n - m
and m
).
If current
is less than m
, we return zero
.
Notice that we achieved that by only using functions of one argument.
The operation, returning the remainder of a division is very similar:
remainder =
fn (n) ->
fn (m) ->
y.(fn (remainder1) ->
fn (current) ->
num_gt_eq.(current).(m).(
fn (_) -> remainder1.(minus.(current).(m)) end
).(
fn (_) -> current end
).(zero)
end
end).(n)
end
end
remainder.(twenty).(two) |> church_to_int.()
# 0
remainder.(twenty).(three) |> church_to_int.()
# 2
remainder.(twenty).(seven) |> church_to_int.()
# 6
In the end, we return current
and not the number of the divisions.
Let’s add another operation as a bonus - the exponentiation:
power =
fn (n) ->
fn (m) ->
m.(n)
end
end
power.(two).(three) |> church_to_int.()
# 8
This one is very simple, so I’ll leave it to you.
This post became a bit long, so let’s conclude it with something we did in the previous one. We’ll define a function calculating the factorial but this time we’ll only be using functions of one argument as our data type and operations. We’ve got all we need to do it.
Our previous definition of factorial was:
proto_factorial = fn g ->
fn
0 -> 1
n -> n * g.(n - 1)
end
end
factorial = y.(proto_factorial)
By using our own numbers and booleans, based on functions and the operations we created,
we can transform the above factorial
into:
proto_factorial = fn g ->
fn (n) ->
is_zero.(n).(fn (_) -> one end).(
fn (_) ->
mult.(n).(g.(pred.(n)))
end
).(zero)
end
end
factorial = y.(proto_factorial)
factorial.(five) |> church_to_int.()
# 120
The problem with this implementation is that if you invoke it with even a small number like ten, it’ll take time. This applies to subtracting large numbers too.
The numbers we defined using functions are very unoptimized for real calculations.
Their idea is to prove that only using anonymous functions of one argument, we can define numbers and operations on them.
The anonymous functions of one argument are exactly what the lambda calculus is based on.
The untyped lambda calculus is one of the simples programming languages, which theoretically can be used to calculate any program written in another programming language, like Elixir
for example.
And this is very interesting.
The lambda calculus is the base of the modern functional programming and is very important. That’s why we’ll continue tackling with it in future posts. The numbers, we defined and used in this post, are known as ‘Church numerals’. There are even versions of these numerals supporting negative numbers.
Related Books And Links
So a fellow functional programming zealot, Phil, gave me the idea to add links in the end of my posts to some books or articles, related to the topics I write about. I’ll start adding such links from now on.
If the topic covered here was of interest to you, checkout the following books:
- To Mock a Mockingbird - Puzzles, games and birds. This is the book providing bird names to some well known combinators.
- Types and Programming Languages - This book will be an integral part of the topics I’m going to cover in the future. Types, lambda calculus, computability etc.
- Understanding Computation - Contains similar exercises to the ones we did through this post, written in
Ruby
.
Code
- The code I wrote for this post is located here.
- The markdown source of the post can be found here. If you want to contribute, by fixing mistakes I made, or adding more content, you can do it in a PR there.