# *Real numbers are hard*

I'm sure that you, dear reader, are familiar with the real numbers: $0$, $1$, $\frac{1}{2}$, $\sqrt{2}$, $\pi$, $e$.
They are a merry bunch, with many neighbours by either side (some might even object that there are *too many* neighbours).
You can add them, you can multiply them, you can even (*gasps*)

*divide* them!

The real numbers are essentially what you get when you ask "what is in between all of these rationals?".
They completely fill the gaps between the rationals and so give a continuous, smooth line of numbers
that we call the *Real Line*.

These numbers allow us to ask questions regarding continuous **things**.
Curves, areas, distances, lengths, etc.

They are so ubiquitous that when someone thinks of "numbers" they usually think of the real numbers. They are everywhere! Even inside your computer!

Go on, take a look! I can see one right here inside mine.

Wait. (*double checks script*)

This isn't a real number. This is a floating point number! (*gasps*)

```
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Reinterpret the bits of a 32-bit float as an unsigned integer. */
typedef union {
    float f;
    uint32_t u;
} float_as_unsigned;

int main() {
    float_as_unsigned pi = {.f = M_PI};
    printf("%f\n", pi.f);
    printf("%x\n", pi.u);
    return 0;
}
```

```
3.141593
40490fdb
```

Look how they massacred my beautiful $\pi$ (*sniffles*).
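If you'd rather not compile C, the same bit reinterpretation can be sketched with *Python*'s `struct` module:

```python
# Pack pi as a 32-bit big-endian float, then read those same bytes back
# as an unsigned integer, mirroring the C union above.
import math
import struct

bits, = struct.unpack(">I", struct.pack(">f", math.pi))
print(f"{bits:x}")  # 40490fdb
```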

Alright, jokes aside, computers don't really use real numbers. Pretty much every computer uses floats, 32-bit or 64-bit, or some variation thereof (some people really like Unums).

I'll stick to ~~berating~~ discussing floats just for simplicity.

## Floats

From a high and mighty mathematical perspective, floats are just a crude approximation of reals. In fact, they are even a crude approximation of rational numbers, seeing as how they lack some of the big, important, properties that numbers usually have.

The most blatant broken property is associativity. Raise thyself, oh JavaScript developer, who hast encountered once before the dreadful loss of precision below:

```
> 0.1 + (0.2 + 0.3)
0.6
> (0.1 + 0.2) + 0.3
0.6000000000000001
```

Floats also break distributivity:

```
> (1+2)/10
0.3
> 1/10 + 2/10
0.30000000000000004
```

Thankfully, IEEE-754 floats guarantee commutativity of addition.

There are plenty of reasons not to use floats, just as there are a handful of reasons to use them.

Either way, floats aren't really what I want to discuss. So what is? Ah, yes, real numbers!

## Real numbers

So why don't we just use real numbers in our computers, if floats are so weird?

Well, the straight answer is: Real numbers are **weirder**. *Much*, **much**, *weirder*.

The usual argument for why we can't use real numbers in computers is what I call "The Finite Argument". It goes somewhat like this:

## The Finite Argument

- Real numbers are infinite.
- Computers are finite.
- You can't fit the infinite in the finite.

Therefore, you can't fit real numbers in a computer.

Note of clarification: the statement '*Real numbers are infinite*' can refer
to the fact that there are **an infinite amount of** real numbers, or to the fact
that **a single real number is infinite**.

If we interpret the statement in the first form, then we could also apply this argument to integers, since they are also infinite. And the argument will be correct, up to a point.

The trouble with integers is that, to represent arbitrarily large integers, we would require arbitrarily large amounts of memory.

We do find these kinds of ever expanding, ever memory-consuming integers in the wild, such as *Python*'s `bignums`.

In this sense, we do have a concrete encoding of the integers onto a computer's memory,
but for very large integers, we require a very large amount of memory.

The **Finite Argument**, as I've presented it here, is a tad ambiguous as to what
*fitting* a number in a computer means.
I've chosen to present it in this form, since it's how I've most often heard it.

Personally, I believe it's far more interesting to interpret *fitting* as constructing an
encoding of any arbitrary element of our infinite set into computer memory.

Otherwise, we will inevitably find ourselves faced with the brick wall that is the finite
memory that we have at our disposal, and in that sense, we couldn't even *fit* the integers
into our computers.

So, I'm going to expand and refine the **Finite Argument** somewhat:

## The Infinite Element Argument

- Each individual real number is infinite.
- Computers are finite.
- You can't encode an infinite object in a finite memory.

Therefore, you can't fit real numbers in a computer.

This seems like a somewhat valid argument. $\pi$ is infinite, we can't possibly encode all of its digits in a computer, and if we can't store $\pi$, then how can we even store any other real number?

## Infinite redundancy

Alright, let's take a step back.
Maybe, *just maybe*, we can't store $\pi$, but we can store $1$.
That's a real number!

Why can we store $1$? We were doing it before with integers, how can we do it now? Okay, $1$ is really $1.0000000000000...$ when we view it in decimal (or any base, really). It also has an infinite amount of digits, just like $\pi$, so does that mean we can't even store $1$?? Were we deceived??

No, we were not. Well, kind of.

See how I wrote the infinite expansion of $1$? I put an ellipsis at the end ($...$) to signify that the $0$'s go on forever. We could store that in memory as $1.(0\infty)$.

And there we go! A finite representation of an infinite sequence of digits. We can store a real number in finite memory.

Now, I concede, $1$ is a pretty basic real number. But now we know how to encode the integers as real numbers. We just need some special way of signaling that a symbol repeats forever, such as $\infty$.
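As a sketch, such an encoding could be a pair: a finite prefix plus the block that repeats forever. The function and names here are mine, not a standard format:

```python
# (prefix, block) finitely encodes an eventually repeating expansion:
# "1.(0∞)" becomes ("1.", "0"), "0.(3∞)" becomes ("0.", "3").
def expand(prefix, block, n):
    """Return the first n characters of the infinite expansion."""
    s = prefix
    while len(s) < n:
        s += block
    return s[:n]

print(expand("1.", "0", 8))  # 1.000000
print(expand("0.", "3", 8))  # 0.333333
```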

We can now even encode rational numbers as their decimal expansion, such as $\frac{1}{3}$, with $0.(3\infty)$. Actually, a better encoding might simply be $\frac{1}{3}$ itself, in the sense that we store the numerator and the denominator separately in memory.
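The numerator/denominator idea is exactly what *Python*'s built-in `fractions` module does, for example:

```python
# Exact rational arithmetic: the numerator and denominator are stored separately.
from fractions import Fraction

third = Fraction(1, 3)
print(third + Fraction(2, 3))  # 1, exactly: no 0.999... ambiguity
print(Fraction(1, 10) + Fraction(2, 10) + Fraction(3, 10))  # 3/5, exactly
```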

So, do we have our silver bullet to encoding real numbers? Not really, no.

Take a look at this number:

$$0.123456789101112131415161718192021222324252627282930...$$

To make your visual parsing of the number easier, I'll write it like this: $$0.1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29|30...$$

This real number encodes every single non-negative integer in its decimal expansion. The fractional part of the number contains the natural numbers in sequence. We can't really use $\infty$ here, as there is no forever-repeating sequence.

How do we encode this real number in a finite manner?

## Information is Computation

##### Or is it the other way around?

I'll let you in on a little secret. Come closer. *Here*:
*We already did*.

'*Where?*', you ask? Take a look above:

> The fractional part of the number contains the natural numbers in sequence.

Is this not a complete and accurate description of the real number, in a finite form, in this case, around 75 characters?

Words are ambiguous, however, and are not really helpful when encoding things into computer memory.

So, we get the closest, best thing: ✨*structured words*✨ (also known as **code**).

Take this *Python* program below:

```
def real_weird():
    yield "0"
    yield "."
    n = 1
    while True:
        yield str(n)
        n += 1

for i in real_weird():
    print(i, end="")
```

```
0.1234567891011121314151617181920212223[...]
```

This program defines an infinite generator that outputs each digit in our real number that encodes all natural numbers. It then uses the generator to print, for eternity, the next number in the sequence.

Notice something about the program above? You can read it, you can understand it and you can execute it to get each digit of the real number we want to encode.

Knowing the number and knowing this program is, essentially, *the same thing*.

**The program *is* the encoding.**

## This is about Turing Machines, isn't it?

So, we're trying to encode a real number as a program that can output the digits of the real number in question.

There are better ways of encoding real numbers programmatically, and we will touch on that later, but for now, let's stick to outputting digits.

There can be a lot of programs that can do this.
We can print $1$ with the following *C* program:

```
#include <stdio.h>

int main() {
    printf("1.");
    while (1) { printf("0"); }
    return 0;
}
```

or this *Python* program:

```
def one():
    yield "1"
    yield "."
    while True:
        yield "0"

for i in one():
    print(i, end="")
```

or this *Haskell* program:

```
module Main where

one :: String
one = '1' : '.' : zeroes

zeroes :: String
zeroes = '0' : zeroes

main :: IO ()
main = putStr one
```

We can even print the same number in many different ways in the **same** programming
language.

So **encodings are not unique**. That, at least, is clear.

So, we really should settle on a common language to encode real numbers in.

Any particular choice will be arbitrary, but the construct in question must
be expressive enough to be able to encode **any** real number, which relies
on non-termination.
We'll likely have to settle on something Turing Complete.

Well, if it's going to be Turing Complete, might as well be a **Turing Machine**!

Again, the choice is arbitrary, here.

If it eases your mind, substitute any mention of Turing Machines with your favorite programming language or kitchen utensil.

## Mapping Machines to Numbers

Our goal as of now is to have, for every real number $x$, some Turing Machine that outputs $x$'s digits.

Different real numbers need to map to different Turing Machines, so we need **at least**
as many Turing Machines as there are real numbers.

There are a lot of Turing Machines, so this should be easy enough.

**phone rings**

Hello?

Hey, what's up? I need your help, can you come over?

I can't, I'm encoding $\pi$ in a Turing Machine.

All right, well, you do you, I guess. Any reason for the insanity?

I'm trying to encode every real number as a Turing Machine.

What do you mean, "encode *every* real number"?

I'm encoding every real number. To replace floats.

Hate to break it to you, but you're not going to be able to do that.

What do you mean, "I'm not able to do that"?

It means it's not possible. There aren't *enough* Turing Machines!

WHAT DO YOU MEAN, "NOT ENOUGH TURING MACHINES"!?

Look, the Real Numbers aren't countable, while the Turing Machines are.

Wait, let me look that up... Okay, so Cantor's diagonal argument, huh?

That's the one.

*sighs*

Anyway, someone broke production on a Friday. Have to go!

**phone hangs**

Sorry you had to hear that.

Where were we? Oh yes, we just had our dreams shattered.

## Shattered Dreams

Turns out, there are **way more** real numbers than there are Turing Machines.
An infinitely obscene amount more.

I won't get into the gritty details of why. See Description Numbers and Cantor's diagonal argument if you want a more detailed analysis.

Essentially, every Turing Machine has an identifying integer that describes it, and it is a known fact in math that there are infinitely more reals than there are integers, and that there is no way to fully map the integers onto the entirety of the reals.

If every Turing Machine has some unique integer, and if you can't fully map the integers to the reals, then we also can't map the Turing Machines to the reals.

$$ |\text{TM}| \leq |\mathbb{Z}| \lt |\mathbb{R}| $$
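To see why every machine has such an identifying integer: a machine (or program) is a finite string of symbols, and any finite string can be read off as one big integer. A sketch, using an arbitrary program text as the example:

```python
# Any program is a finite byte string, and a finite byte string is just a
# (large) integer, so every program has an identifying integer.
source = 'print("hello")'
number = int.from_bytes(source.encode(), "big")

# The mapping is reversible: the integer determines the program completely.
back = number.to_bytes((number.bit_length() + 7) // 8, "big").decode()
assert back == source
print(number)
```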

The real numbers that you can't assign a Turing Machine to are aptly called the *Uncomputable Numbers*.
They just can't be computed.

That feels weird, right? It is very much alien to me that some numbers just can't be computed or calculated.
We can't necessarily write the entirety of $\pi$ down, but we can compute it. There are some numbers for which we
**can't even do that**.

Take for instance what has come to be my favorite number: Chaitin's Constant. It is a normal, transcendental and uncomputable real number, that is usually written down as $\Omega$.

It is uncomputable because, in a loose sense, the number encodes the probability that a randomly constructed program will or will not halt, and computing $\Omega$ would solve the halting problem.
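To make "the probability that a randomly constructed program halts" a little more concrete: for a prefix-free universal machine $U$, the constant sums over every halting program $p$, weighting each by its length $|p|$ in bits:

$$\Omega = \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|}$$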

**sighs**

Somehow **the Finite Argument** was both wrong and right.
There are in fact too many real numbers for our computers to handle.
But not because computers are finite.
Instead, **the act of computation itself** is too "small" to handle **all** *real numbers*.

## Picking up the pieces

At this point, we have a choice.

We can give up, knowing that our initial quest to fully replace floats with all real numbers has **failed**, or
we can *move the goalposts*.

We can't compute all real numbers? Well, let's focus on the ones that we **can**.
Let's have a look at the Computable Numbers.

Now, I'm going to be honest with you.
Up until now I've been somewhat cheeky and loose with my definition
of a computable number.
I've said that a number is computable if there is some Turing Machine that outputs its **digits**.
That is somewhat incorrect.

Take, for example, the numbers $0.3333...$ and $0.6666...$, which are the infinite decimal representations of $\frac{1}{3}$ and $\frac{2}{3}$. The sum of these two numbers should be $1$, but what should its decimal representation be? $1.0000...$? $0.9999...$?

If we want to place a $1$ at the start, we would first need to check that there are indeed an infinite number of $3$'s and $9$'s, which we can't do in a finite amount of time. If we want to place a $0$ at the start, then we need to make sure that there is some digit like $2$ that would stop the carry from infinitely propagating, which we also can't do, since our search would never end.

So, we can't compute $\frac{1}{3} + \frac{2}{3}$ when encoding numbers by their decimal digits.

Fear not, there are encodings which are suitable, and that do not suffer from such issues,
such as those that use **negative digits**.

Other definitions of computability circumvent the problem by just requiring that the output number have an arbitrarily small error from the desired number.
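Under that definition, a real is computable when, for any requested precision $n$, some program returns a rational within $2^{-n}$ of it. A sketch for $\sqrt{2}$ by interval bisection (the function name is mine, not a standard API):

```python
# A computable real as an approximation function: given n, return a rational
# within 2**-n of sqrt(2), by repeatedly halving an interval that brackets it.
from fractions import Fraction

def sqrt2_approx(n):
    lo, hi = Fraction(1), Fraction(2)       # sqrt(2) lies in [1, 2]
    while hi - lo > Fraction(1, 2 ** n):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid
        else:
            hi = mid
    return lo

print(float(sqrt2_approx(20)))  # ≈ 1.41421...
```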

There are many definitions of computable real numbers. I chose the decimal one since it's simple to understand and it's similar to the one Turing proposed in his 1936 paper on the Entscheidungsproblem:

> The "computable" numbers may be described briefly as the real numbers whose expressions as a decimal are calculable by finite means.

## What was the point of all of this?

I hope you can see just how weird it is to try and make real numbers work with computing.
It really is **impossible** to work with **all** real numbers in computing, so we have to
somewhat tie our hands behind our backs and settle with the numbers that we can work with.

Of course, this is a tragedy to the hypothetical mathematician that sees absolute perfection in the real numbers, but it is of little concern to anyone else. The computable numbers are extremely interesting in their own regard, and I believe that they warrant a separate post just for them.

All in all, the main takeaway from all of this is that **real numbers are hard**.