The Intellectual Gravity of Brilliant Baseball Players

(by MarkCC) Feb 21 2013

Some of my friends at work are baseball fans. I totally don't get baseball - to me, it's about as interesting as watching paint dry. But thankfully, some of my friends disagree, which is how I found this lovely little bit of crackpottery.

You see, there's a (former?) baseball player named Jose Canseco, who's been plastering twitter with his deep thoughts about science.

At first glance, this is funny, but not particularly interesting. I mean, it's a classic example of my mantra: the worst math is no math.

The core of this argument is pseudo-mathematical. The dumbass wants to make the argument that under current gravity, it wouldn't be possible for things the size of the dinosaurs to move around. The problem with this argument is that there's no problem! Things the size of dinosaurs could move about in current gravity with absolutely no difficult. If you actually do the math, it's fine.

If dinosaurs had the anatomy of human beings, then it's true that if you scaled them up, they wouldn't be able to walk. But they didn't. They had anatomical structures that were quite different from ours in order to support their massive size. For example, here's a bone from quetzlcoatlus:

Media,111639,en See the massive knob sticking out to the left? That's a muscle attachement point. That gave the muscles much greater torque than ours have, which they needed. (Yes, I know that Quetzalcoatlus wwasn't really a dinosaur, but it is one of the kinds of animals that Canseco was talking about, and it was easy to find a really clear image.)

Most animal joints are, essentially, lever systems. Muscles attach to two different bones, which are connected by a hinge. The muscle attachement points stick out relative to the joint. When the muscles contract, that creates a torque which rotate the bones around the joint.

The lever is one of the most fundamental machines in the universe. It operates by the principal of torque. Our regular daily experiences show that levers act in a way that magnifies our efforts. I can't walk up to a car and lift it. But with a lever, I can. Muscle attachment points are levers. Take another look at that bone picture: what you're seeing is a massive level to magnify the efforts of the muscles. That's all that a large animal needed to be able to move around in earths gravity.

This isn't just speculation - this is stuff that's been modeled in great detail. And it's stuff that can be observed in modern day animals. Look at the skeleton of an elephant, and compare it to the skeleton of a dog. The gross structure is very similar - they are both quadripedal mammals. But if you look at the bones, the muscle attachment points in the elephants skeleton have much larger projections, to give the muscles greater torque. Likewise, compare the skeleton of an american robin with the skeleton of a mute swan: the swan (which has a maximum recorded wingspan of 8 feet!) has much larger projections on the attachment points for its muscles. If you just scaled a robin from its 12 inch wingspan to the 8 feet wingspan of a swan, it wouldn't be able to walk, much less fly! But the larger bird's anatomy is different in order to support its size - and it can and does fly with those 8 foot wings!

That means that on the basic argument for needing different gravity, Canseco fails miserably.

Canseco's argument for how gravity allegedly changed is even worse.

What he claims is that at the time when the continental land masses were joined together as the pangea supercontinent, the earths core moved to counterbalance the weight of the continents. Since the earths core was, after this shift, farther from the surface, the gravity at the surface would be smaller.

This is an amusingly ridiculous idea. It's even worse that Ted Holden and his reduced-felt-gravity because of the electromagnetic green saturn-star.

First, the earths core isn't some lump of stuff that can putter around. The earth is a solid ball of material. It's not like a ball of powdered chalk with a solid lump of uranium at the center. The core can't move.

Even if it could, Canseco is wrong. Canseco is playing with two different schemes of how gravity works. We can approximate the behavior of gravity on earth by assuming that the earth is a point: for most purposes, gravity behaves almost as if the entire mass of the earth was concentrated at the earths center of mass. Canseco is using this idea when he moves the "core" further from the surface. He's using the idea that the core (which surrounds the center of mass in the real world) is the center of mass. So if the core moves, and the center of mass moves with it, then the point-approximation of gravity will change because the distance from the center of mass has increased.

But: the reason that he claims the core moved is because it was responding to the combined landmasses on the surface clumping together as pangea. That argument is based on the idea that the core had to move to balance the continents. In that case, the center of gravity wouldn't be any different - if the core could move to counterbalance the continents, it would move just enough to keep the center of gravity where it was - so if you were using the point approximation of gravity, it would be unaffected by the shift.

He's combining incompatible assumptions. To justify moving the earths core, he's *not* using a point-model of gravity. He's assuming that the mass of the earths core and the mass of the continents are different. When he wants to talk about the effect of gravity of an animal on the surface, he wants to treat the full mass of the earth as a point source - and he wants that point source to be located at the core.

It doesn't work that way.

The thing that I find most interesting about this particular bit of crackpottery isn't really about this particular bit of crackpottery, but about the family of crackpottery that it belongs to.

People are fascinated by the giant creatures that used to live on the earth. Intuitively, because we don't see giant animals in the world around us, there's a natural tendency to ask "Why?". And being the pattern-seekers that we are, we intuitively believe that there must be a reason why the animals back then were huge, but the animals today aren't. It can't just be random chance. So people keep coming up with reasons. Like:

  1. Neal Adams: who argues that the earth is constantly growing larger, and that gravity is an illusion caused by that growth. One of the reasons, according to his "theory", for why we know that gravity is just an illusion, is because the dinosaurs supposedly couldn't walk in current gravity.
  2. Ted Holden and the Neo-Velikovskians: who argue that the solar system is drastically different today than it used to be. According to Holden, Saturn used to be a "hyperintelligent green electromagnetic start", and the earth used to be tide-locked in orbit around it. As a result, the felt effect of gravity was weaker.
  3. Stephen Hurrell, who argues similarly to Neal Adams that the earth is growing. Hurrell doesn't dispute the existence of gravity the way that Adams does, but similarly argues that dinosaurs couldn't walk in present day gravity, and resorts to an expanding earth to explain how gravity could have been weaker.
  4. Ramin Amir Mardfar: who claims that the earth's mass has been continually increasing because meteors add mass to the earth.
  5. Gunther Bildmeyer, who argues that gravity is really an electromagnetic effect, and so the known fluctuations in the earths magnetic fields change gravity. According to him, the dinosaurs could only exist because of the state of the magnetic field at the time, which reduced gravity.

There are many others. All of them grasping at straws, trying to explain something that doesn't need explaining, if only they'd bother to do the damned math, and see that all it takes is a relatively small anatomical change.

Share

22 responses so far

Euler's Equation Crackpottery

(by MarkCC) Feb 18 2013

One of my twitter followers sent me an interesting piece of crackpottery. I debated whether to do anything with it. The thing about crackpottery is that it really needs to have some content. Total incoherence isn't amusing. This bit is, frankly, right on the line.

Euler's Equation and the Reality of Nature.

a) Euler's Equation as a mathematical reality.

Euler's identity is "the gold standard for mathematical beauty'.
Euler's identity is "the most famous formula in all mathematics".
‘ . . . this equation is the mathematical analogue of Leonardo
da Vinci’s Mona Lisa painting or Michelangelo’s statue of David’
‘It is God’s equation’, ‘our jewel ‘, ‘ It is a mathematical icon’.
. . . . etc.

b) Euler's Equation as a physical reality.

"it is absolutely paradoxical; we cannot understand it,
and we don't know what it means, . . . . .’
‘ Euler's Equation reaches down into the very depths of existence’
‘ Is Euler's Equation about fundamental matters?’
‘It would be nice to understand Euler's Identity as a physical process
using physics.‘
‘ Is it possible to unite Euler's Identity with physics, quantum physics ?’

My aim is to understand the reality of nature.

Can Euler's equation explain me something about reality?

To give the answer to this. question I need to bind Euler's equation with an object – particle. Can it be math- point or string- particle or triangle-particle? No, Euler's formula has quantity (pi) which says me that the particle must be only a circle .

Now I want to understand the behavior of circle - particle and therefore I need to use spatial relativity and quantum theories. These two theories say me that the reason of circle – particle’s movement is its own inner impulse (h) or (h*=h/2pi).

a) Using its own inner impulse (h) circle - particle moves ( as a wheel) in a straight line with constant speed c = 1. We call such particle - ‘photon’. From Earth – gravity point of view this speed is maximally. From Vacuum point of view this speed is minimally. In this movement quantum of light behave as a corpuscular (no charge).

b) Using its own inner impulse / intrinsic angular momentum ( h* = h / 2pi ) circle - particle rotates around its axis. In such movement particle has charge, produce electric waves ( waves property of particle) and its speed ( frequency) is : c.

1. We call such particle - ‘ electron’ and its energy is: E=h*f.

In this way I can understand the reality of nature.

==.

Best wishes.

Israel Sadovnik Socratus.

Euler's equation says that e^{i\pi} + 1 = 0. It's an amazingly profound equation. The way that it draws together fundamental concepts is beautiful and surprising.

But it's not nearly as mysterious as our loonie-toon makes it out to be. The natural logarithm-base is deeply embedded in the structure of numbers, and we've known that, and we've known how it works for a long time. What Euler did was show the relationship between e and the fundamental rotation group of the complex numbers. There are a couple of ways of restating the definition of that make the meaning of that relationship clearer.

For example:

e^z = lim_{n\rightarrow \infty}(1 + \frac{z}{n})^n

That's an alternative definition of what e is. If we use that, and we plug i\pi into it, we get:

e^{i\pi} = lim_{n \rightarrow \infty}(1+\frac{i\pi}{n})^n

If you work out that limit, it's -1. Also, if you take values of N, and plot (1 + \frac{i\pi}{n})^1, (1+\frac{i\pi}{n})^2, (1 + \frac{i\pi}{n})^3, and (1 + \frac{i\pi}{n})^4, ... on the complex plane, as N gets larger, the resulting curve gets closer and closer to a semicircle.

An equivalent way of seeing it is that exponents of e^i are rotations in the complex number plane. The reason that e^{i\pi} = -1 is because if you take the complex number (1 + 0i), and rotate it by \pi radians, you get -1: 1(e^{i\pi}) = -1.

That's what Euler's equation means. It's amazing and beautiful, but it's not all that difficult to understand. It's not mysterious in the sense that our crackpot friend thinks it is.

But what really sets me off is the idea that it must have some meaning in physics. That's silly. It doesn't matter what the physical laws of the universe are: the values of \pi and e will not change. It's like trying to say that there must be something special about our universe that makes 1 + 1 = 2 - or, conversely, that the fact that 1+1=2 means something special about the universe we live in. These things are facts of numbers, which are independent of physical reality. Create a universe with different values for all of the fundamental constants - e and π will be exactly the same. Create a universe with less matter - e and π will still be the same. Create a universe with no matter, a universe with different kinds of matter, a universe with 300 forces instead of the four that we see - and e and π won't change.

What things like e and π, and their relationship via Euler's equation tell us is that there's a fundamental relationship between numbers and shapes on a two-dimensional plane which does not and cannot really exist in the world we live in.

Beyond that, what he's saying is utter rubbish. For example:

These two theories say me that the reason of circle – particle’s movement is its own inner impulse (h) or (h*=h/2pi). Using its own inner impulse (h) circle - particle moves ( as a wheel) in a straight line with constant speed c = 1. We call such particle - ‘photon’. From Earth – gravity point of view this speed is maximally. From Vacuum point of view this speed is minimally. In this movement quantum of light behave as a corpuscular (no charge).

This is utterly meaningless. It's a jumble of words that pretends to be meaningful and mathematical, when in fact it's just a string of syllables strung together nonsensical ways.

There's a lot that we know about how photons behave. There's also a lot that we don't know about photons. This word salad tells us exactly nothing about photons. In the classic phrase, it's not even wrong: what it says doesn't have enough meaning to be wrong. What is the "inner impulse" of a photon according to this crackpot? We can't know: the term isn't defined. We are pretty certain that a photon is not a wheel rolling along. Is that what the crank is saying? We can't be sure. And that's the problem with this kind of crankery.

As I always say: the very worst math is no math. This is a perfect example. He starts with a beautiful mathematical fact. He uses it to jump to a completely non-mathematical conclusion. But he writes a couple of mathematical symbols, to pretend that he's using math.

Share

12 responses so far

The Meta of Gödel

(by MarkCC) Feb 17 2013

As you may be figuring out, there's a reason why I resisted walking through Gödel's proof of incompleteness for so long. Incompeteness isn't a simple proof!

To refresh your memory, here's a sketch of the proof:

  1. Take a simple logic. We've been using a variant of the Principia Mathematica's logic, because that's what Gödel used.
  2. Show that any statement in the logic can be encoded as a number using an arithmetic process based on the syntax of the logic. The process of encoding statements numerically is called Gödel numbering.
  3. Show that you can express meta-mathematical properties of logical statements in terms of arithemetic properties of their Gödel numbers. In particular, we need to build up the logical infrastructure that we need to talk about whether or not a statement is provable.
  4. Using meta-mathematical properties, show how you can create an unprovable statement encoded as a Gödel number.

What we've done so far is the first two steps, and part of the third. In this post, we saw the form of the Principia's logic that we're using, and how to numerically encode it as a Gödel numbering. We've start started on the third point in this post, by figuring out just what it means to say that things are encoded arithmetically. Now we can get to the part where we see how to encode meta-mathematical properties in terms of arithmetic properties of the Gödel numbering. In this post, we're going to build up everything we need to express syntactic correctness, logical validity, and provability in terms of arithmetical properties of Gödel numbers. (And, as a reminder, I've been using this translation on Gödel's original paper on incompleteness.)

This is the most complex part of the incompleteness proof. The basic concept of what we're doing is simple, but the mechanics are very difficult. What we want to do is define a set of predicates about logical statements, but we want those predicates to be expressed as arithmetic properties of the numerical representations of the logical statements.

The point of this is that we're showing that done in the right way, arithmetic is logic - that doing arithmetic on the Gödel numbers is doing logical inference. So what we need to do is build up a toolkit that shows us how to understand and manipulate logic-as-numbers using arithmetic. As we saw in the last post, primitive recursion is equivalent to arithmetic - so if we can show how all of the properties/predicates that we define are primitive recursive, then they're arithmetic.

This process involves a lot of steps, each of which is building the platform for the steps that follow it. I struggled quite a bit figuring out how to present these things in a comprehensible way. What I ended up with is writing them out as code in a pseudo-computer language. Before inventing this language, I tried writing actual executable code, first in Python and then in Haskell, but I wasn't happy with the clarity of either one.

Doing it in an unimplemented language isn't as big a problem as you might think. Even if this was all executable, you're not going to be able to actually run any of it on anything real - at least not before you hair turns good and gray. The way that this stuff is put together is not what any sane person would call efficient. But the point isn't to be efficient: it's to show that this is possible. This code is really all about searching; if we wanted to be efficient, this could all be done in a different representation, with a different search method that was a lot faster - but that wolud be harder to understand.

So, in the end, I threw together a simple language that's easy to read. This language, if it were implemented, wouldn't really even be Turing complete - it's a primitive recursive language.

Basics

We'll start off with simple numeric properties that have no obvious connection to the kinds of meta-mathematical statements that we want to talk about, but we'll use those to define progressively more and more complex and profound properties, until we finally get to our goal.

# divides n x == True if n divides x without remainder.
pred divides(n, x) = x mod n == 0

pred isPrime(0) = False
pred isPrime(1) = False
pred isPrime(2) = True
pred isPrime(n) = {
  all i in 2 to n {
    not divides(i, n)
  }
}

fun fact(0) = 1
fun fact(n) = n * fact(n - 1)

Almost everything we're going to do here is built on a common idiom. For anything we want to do arithmetically, we're going to find a bound - a maximum numeric value for it. Then we're going to iterate over all of the values smaller than that bound, searching for our target.

For example, what's the nth prime factor of x? Obviously, it's got to be smaller than x, so we'll use x as our bound. (A better bound would be the square root of x, but it doesn't matter. We don't care about efficiency!) To find the nth prime factor, we'll iterate over all of the numbers smaller than our bound x, and search for the smallest number which is prime, which divides x, and which is larger than the n-1th prime factor of x. We'll translate that into pseudo-code:

fun prFactor(0, x) = 0
fun prFactor(n, x) = {
  first y in 1 to x {
    isPrime(y) and divides(y, x) and prFactor(n - 1, x) < y 
  }
}

Similarly, for extracting values from strings, we need to be able to ask, in general, what's the nth prime number? This is nearly identical to prFactor above. The only difference is that we need a different bound. Fortunately, we know that the nth prime number can't be larger than the factorial of the previous prime plus 1.

fun nthPrime(0) = 0
fun nthPrime(n) = {
  first y in 1 to fact(nthPrime(n - 1)) + 1  {
    isPrime(y) and y > nthPrime(n - 1)) 
  }
}

In composing strings of Gödel numbers, we use exponentiation. Given integers x and n, xn, we can obviously compute them via primitive recursion. I'll define them below, but in the rest of this post, I'll write them as an operator in the language:

fun pow(n, 0) = 1
fun pow(n, i) = n * pow(n, i - 1)

String Composition and Decomposition

With those preliminaries out of the way, we can get to the point of defining something that's actually about one of the strings encoded in these Gödel numbers. Given a number n encoding a string, item(n, x) is the value of the nth character of x. (This is slow. This is really slow! We're getting to the limit of what a very powerful computer can do in a reasonable amount of time. But this doesn't matter. The point isn't that this is a good way of doing these things: it's that these things are possible. To give you an idea of just how slow this is, I started off writing the stuff in this post in Haskell. Compiled with GHC, which is a very good compiler, item to extract the 6th character of an 8 character string took around 10 minutes on a 2.4Ghz laptop.)

fun item(n, x) = {
  first y in 1 to x {
    divides(prFactor(n, x) ** y, y) and
      not divides(prFactor(n, x)**(y+1), x) 
  }
}

Given a string, we want to be able to ask how long it is; and given two strings, we want to be able to concatenate them.

fun length(x) = {
  first y in 1 to x {
    prFactor(y, x) > 0 and prFactor(y + 1, x) == 0
  }
}

fun concat(x, y) = {
  val lx = length(x)
  val ly = length(y)

  first z in 1 to nthprime(lx + ly)**(x + y) {
    (all n in 1 to lx {
        item(n, z) == item(n, x)
     }) and (all n in 1 to ly {
        item(n + lx, z) == item(n, y)
      }) 
  }
}

fun concatl([]) = 0
fun concatl(xs) = {
  concat(head(xs), concatl(tail(xs)))
}

fun seq(x) = 2**x

We want to be able to build statements represented as numbers from other statements represented as numbers. We'll define a set of functions that either compose new strings from other strings, and to check if a particular string is a particular kind of syntactic element.

# x is a variable of type n.
pred vtype(n, x) = {
  some z in 17 to x {
    isPrime(z) and x == n**z
  }
}

# x is a variable
pred isVar(x) = {
  some n in 1 to x {
    vtype(n, x)
  }
}

fun paren(x) =
  concatl([gseq(11), x, gseq(13)])

# given the Gödel number for a statement x, find 
# the Gödel number for not x.
fun gnot(x) =
  concat(gseq(5), paren(x))

# Create the number for x or y.
fun gor(x, y) =
  concatl([paren(x), seq(7), paren(y)])

# Create the number for 'forall x(y)'.
fun gforall(x, y) =
  concatl([seq(9), seq(x), paren(y)])

# Create the number for x with n invocations of the primitive
# successor function.
fun succn(0, x) = x
fun succn(n, x) = concat(seq(3), succn(n - 1, x))

# Create the number n using successor and 0.
fun gnumber(n) = succn(n, seq(1))

# Check if a statement is type-1.
pred stype_one(x) = {
  some m in 1 to x {
     m == 1 or (vtype(1, m) and x == succn(n, seq(m))
  }
}

# Check if a statement is type n.
pred fstype(1, x) = stype_one(x)
pred fstype(n, x) =
  some v in 1 to x {
    vtype(n, v) and R(v)
  }
}

That last function contains an error: the translation of Gödel that I'm using says R(v) without defining R. Either I'm missing something, or the translator made an error.

Formulae

Using what we've defined so far, we're now ready to start defining formulae in the basic Principia logic. Forumlae are strings, but they're strings with a constrained syntax.

pred elFm(x) = {
  some y in 1 to x {
    some z in 1 to x {
      some n in 1 to x {
        stype(n, y) and stype(n+1, z) and x == concat(z, paren(y))
      }
    }
  }
}

All this is doing is expressing the grammar rule in arithmetic form: an elementary formula is a predicate: P(x), where x is a variable on level n, and P is a variable of level x + 1.

The next grammar rule that we encode this way says how we can combine elementary formulae using operators. There are three operators: negation, conjunction, and universal quantification.

pred op(x, y, z) = {
  x == gnot(y) or
  x == gor(y, z) or 
  (some v in 1 to x { isVar(v) and x == gforall(v, y) })
}

And now we can start getting complex. We're going to define the idea of a valid sequence of formulae. x is a valid sequence of formulae when it's formed from a collection of formulae, each of which is either an elementary formula, or is produced from the formulae which occured before it in the sequence using either negation, logical-or, or universal quantification.

In terms of a more modern way of talking about it, the syntax of the logic is a grammar. A formula sequence, in this system, is another way of writing the parse-tree of a statement: the sequence is the parse-tree of the last statement in the sequence.

pred fmSeq(x) = {
  all p in 0 to length(x) { 
    elFm(item(n, x)) or
      some p in 0 to (n - 1) {
        some q in 0 to (n - 1) {
          op(item(n,x), item(p, x), item(q, x))
        }
      }
  }
}

The next one bugs me, because it seems wrong, but it isn't really! It's a way of encoding the fact that a formula is the result of a well-defined sequence of formulae. In order to ensure that we're doing primitive recursive formulae, we're always thinking about sequences of formulae, where the later formulae are produced from the earlier ones. The goal of the sequence of formula is to produce the last formula in the sequence. What this predicate is really saying is that a formula is a valid formula if there is some sequence of formulae where this is the last one in the sequence.

Rephrasing that in grammatical terms, a string is a formula if there is valid parse tree for the grammar that produces the string.

pred isFm(x) = {
  some n in 1 to nthPrime(length(x)**2)**(x*length(x)**2) {
    fmSeq(n)
  }
}

So, now, can we say that a statement is valid because it's parsed according to the grammar? Not quite. It's actually a familiar problem for people who write compilers. When you parse a program in some language, the grammar doesn't usually specify variables must be declared before they're used. It's too hard to get that into the grammar. In this logic, we've got almost the same problem: the grammar hasn't restricted us to only use bound variables. So we need to have ways to check whether a variable is bound in a Gödel-encoded formula, and then use that to check the validity of the formula.

# The variable v is bound in formula x at position n.
pred bound(v, n, x) = {
  isVar(v) and isFm(x) and 
  (some a in 1 to x {
    some b in 1 to x {
      some c in 1 to x {
        x == concatl([a, gforall(v, b), c]) and
        isFm(b) and
        length(a) + 1 ≤ n ≤ length(a) + length(forall(v, b))
      }
    }
  })
}

# The variable v in free in formula x at position n
pred free(v, n, x) = {
  isVar(v) and isFm(x) and
  (some a in 1 to x {
    some b in 1 to x {
      some c in 1 to x {
        v == item(n, x) and n ≤ length(x) and not bound(v, n, x)
      }
    }
  })
}

pred free(v, x) = {
  some n in 1 to length(x) {
    free(v, n, x)
  }
}

To do logical inference, we need to be able to do things like replace a variable with a specific infered value. We'll define how to do that:

# replace the item at position n in x with y.
fun insert(x, n, y) = {
  first z in 1 to nthPrime(length(x) + length(y))**(x+y) {
    some u in 1 to x {
      some v in 1 to x {
        x == concatl([u, seq(item(n, x)), v]) and
        z == concatl([u, y, v]) and
        n == length(u) + 1
      }
    }
  }
}

There are inference operations and validity checks that we can only do if we know whether a particular variable is free at a particular position.

# freePlace(k, v, k) is the k+1st place in x (counting from the end)
# where v is free.
fun freePlace(0, v, x) = {
  first n in 1 to length(x) {
    free(v, n, x) and 
    not some p in n to length(x) {
      free(v, p, x)
    }
  }
}

fun freePlace(k, v, x) = {
  first n in 1 to freePlace(n, k - 1, v) {
    free(v, n, x) and 
    not some p in n to freePlace(n, k - 1, v) {
      free(v, p, x)
    }
  }
}

# number of places where v is free in x
fun nFreePlaces(v, x) = {
  first n in 1 to length(x) {
    freeplace(n, v, x) == 0
  }
}

In the original logic, some inference rules are defined in terms of a primitive substitution operator, which we wrote as subst[v/c](a) to mean substitute the value c for the variable c in the statement a. We'll build that up on a couple of steps, using the freePlaces function that we just defined.

# Subst1 replaces a single instance of v with y.
fun subst'(0, x, v, y) = x
fun subst1(0k, x, v, y) =
  insert(subst1(k, x, v, y), freePlace(k, v, x), y)

# subst replaces all instances of v with y
fun subst(x, v, y) = subst'(nFreePlaces(v, x), x, v, y)

The next thing we're going to do isn't, strictly speaking, absolutely necessary. Some of the harder stuff we want to do will be easier to write using things like implication, which aren't built in primitive of the Principia logic. To write those as clearly as possible, we'll define the full suite of usual logical operators in terms of the primitives.

# implication
fun gimp(x, y) = gor(gnot(x), y)

# logical and
fun gand(x, y) = gnot(gor(gnot(x), gnot(y)))

# if/f
fun gequiv(x, y) = gand(gimp(x, y), gimp(y, x))

# existential quantification
fun gexists(v, y) = not(gforall(v, not(y)))

Axioms

The Peano axioms are valid logical statements, so they have Gödel numbers in this system. We could compute their value, but why bother? We know that they exist, so we'll just give them names, and define a predicate to check if a value matches them.

The form of the Peano axioms used in incompleteness are:

  1. Zero: ¬(succ(x1) = 0)
  2. Uniqueness: succ(x1) = succ(y1) \Rightarrow x = y
  3. Induction: x2(0) ∧ ∀x1(x2(x1)⇒ x2(succ(x1))) ⇒ ∀x1(x2(x1))
const pa1 = ...
const pa2 = ...
const pa3 = ...

pred peanoAxiom(x) = 
  (x == pa1) or (x == pa2) or (x == pa3)

Similarly, we know that the propositional axioms must have numbers. The propositional
axioms are:

  1. p \lor p \Rightarrow p
  2. p \Rightarrow p \lor q
  3. p \lor q \Rightarrow p \lor q
  4. (p \Rightarrow q) \Rightarrow (r \lor p \Rightarrow r \lor q)

I'll show the translation of the first - the rest follow the same pattern.

# Check if x is a statement that is a form of propositional
# axiom 1: y or y => y
pred prop1Axiom(x) =
  some y in 1 to x {
    isFm(x) and x == imp(or(y, y), y)
  }
}

pred prop2Axiom(x) = ...
pred prop3Axiom(x) = ...
pred prop4Axiom(x) = ...
pred propAxiom(x) = prop2Axiom(x) or prop2Axiom(x) or 
    prop3Axiom(x) or prop4Axiom(x)

Similarly, all of the other axioms are written out in the same way, and we add a predicate isAxiom to check if something is an axiom. Next is quantifier axioms, which are complicated, so I'll only write out one of them - the other follows the same basic scheme.

The two quantifier axioms are:

  1. \forall v(a) \Rightarrow \text{subst}[v/c](a)
  2. \forall v(b \lor a) \Rightarrow (b \or \forall v(a))
quantifier_axiom1_condition(z, y, v) = {
  not some n in 1 to length(y) {
    some m in 1 to length(z) {
      some w in 1 to z {
         w == item(m, z) and bound(w, n, y) and free(v, n, y)
      }
    }
  }
}

pred quantifier1Axiom(x) = {
  some v in 1 to x {
    some y in 1 to x {
      some z in 1 to x {
        some n in 1 to x {
          vtype(n, v) and stype(n, z) and
          isFm(y) and 
          quantifier_axiom1_condition(z, y, v) and
          x = gimp(gforall(v, y), subst(y, v, z))
        }
      }
    }
  }
}
      
quanitifier_axiom2 = ...
isQuantifierAxiom = quantifier1Axiom(x) or quantifier2Axiom(x)

We need to define a predicate for the reducibility axiom (basically, the Principia's version of the ZFC axiom of comprehension). The reducibility axiom is a schema: for any predicate a, \exists u (\forall v (u(v) \Leftrightarrow a. In our primitive recursive system, we can check if something is an instance of the reducibility axiom schema with:

pred reduAxiom(x) =
  some u in 1 to x {
    some v in 1 to x {
      some y in 1 to x {
        some n in 1 to x {
          vtype(n, v) and
          vtype(n+1, u) and
          not free(u, y) and
          isFm(y) and
          x = gexists(u, gforall(v, gequiv(concat(seq(u), paren(seq(v))), y)))
        }
      }
    }
  }
}

Now, the set axiom. In the logic we're using, this is the axiom that defines set equality. It's written as \forall x_1 (x_2(x_1) \Leftrightarrow y_2(y_1) \Rightarrow x_2 = x_1). Set equality is defined for all types of sets, so we need to have one version of axiom for each level. We do that using type-lifting: we say that the axiom is true for type-1 sets, and that any type-lift of the level-1 set axiom is also a version of the set axiom.

fun typeLift(n, x) = {
  first y in 1 to x**(x**n) {
    all k in 1 to length(x) {
      item(k, x) ≤ 13 and item(k, y) == item(k, v) or
      item(k, x) > 13 and item(k, y) = item(k, x) * prFactor(1, item(k, x))**n
    }
  }
}

We haven't defined the type-1 set axiom. But we just saw the axiom above, and it's obviously a simple logical statement. That mean that it's got a Gödel number. Instead of computing it, we'll just say that that number is called sa1. Now we can define a predicate to check if something is a set axiom:

val sa1 = ... 
pred setAxiom(x) = 
  some n in 1 to x {
    x = typeLift(n, sa)
  }
}

We've now defined all of the axioms of the logic, so we can now create a general predicate to see if a statement fits into any of the axiom categories:

pred isAxiom(x) =
  peanoAxiom(x) or propAxiom(x) or quantifierAxom(x) or
  reduAxiom(x) or setAxiom(x)

Proofs and Provability!

With all of the axioms expressible in primitive recursive terms, we can start on what it means for something to be provable. First, we'll define what it means for some statement x to be an immediate consequence of some statements y and z. (Back when we talked about the Principia's logic, we said that x is an immediate consequence of y and z if either: y is the formula z ⇒ x, or if c is the formula ∀v.x).

pred immConseq(x, y, z) = {
  y = imp(z, x) or 
  some v in 1 to x {
    isVar(v) and x = forall(v, y)
  }
}

Now, we can use our definition of an immediate consequence to specify when a sequence of formula is a proof figure. A proof figure is a sequence of statements where each statement in it is either an axiom, or an immediate consequence of two of the statements that preceeded it.

pred isProofFigure(x) = {
  (all n in 0 to length(x) {
    isAxiom(item(n, x)) or
    some p in 0 to n {
      some q in 0 to n {
        immConseq(item(n, x), item(p, x), item(q, x))
      }
    }
  }) and
  length(x) > 0
}

We can say that x is a proof of y if x is proof figure, and the last statement in x is y.

pred proofFor(x, y) =
  isProofFigure(x) and
  item(length(x), x) == y

Finally, we can get to the most important thing! We can define what it means for something to be provable! It's provable if there's a proof for it!

pre provable(x) =
  some y {
    proofFor(y, x)
  }
}

Note that this last one is not primitive recursive! There's no way that we can create a bound for this: a proof can be any length.

At last, we're done with these definition. What we've done here is really amazing: now, every logical statement can be encoded as a number. Every proof in the logic can be encoded as a sequence of numbers: if something is provable in the Principia logic, we can encode that proof as a string of numbers, and check the proof for correctness using nothing but (a whole heck of a lot of) arithmetic!

Next post, we'll finally get to the most important part of what Gödel did. We've been able to define what it means for a statement to be provable - we'll use that to show that there's a way of creating a number encoding the statement that something is not provable. And we'll show how that means that there is a true statement in the Principia's logic which isn't provable using the Principia's logic, which means that the logic isn't complete.

In fact, the proof that we'll do shows a bit more than that. It doesn't just show that the Principia's logic is incomplete. It shows that any consistent formal system like the Principia, any system which is powerful enough to encode Peano arithmetic, must be incomplete.

Share

3 responses so far

Defining Properties Arithmetically (part 1): Gödel and Primitive Recursion

(by MarkCC) Feb 11 2013

When I left off, we'd seen how to take statements written in the logic of the Principia Mathematica, and convert them into numerical form. What we need to see now is how to get meta-mathematical.

We want to be able to write second-order logical statements. The basic trick to incompleteness is that we're going to use the numerical encoding of statements to say that a predicate or relation is represented by a number. Then we're going to write predicates about predicates by defining predicates on the numerical representations of the first-order predicates. That's going to let us create a true statement in the logic that can't be proven with the logic.

To do that, we need to figure out how to take our statements and relations represented as numbers, and express properties of those statements and relations in terms of arithmetic. To do that, we need to define just what it means to express something arithmetically. Gödel did that by defining "arithmetically" in terms of a concept called primitive recursion.

I learned about primitive recursion when I studied computational complexity. Nowadays, it's seen as part of theoretical computer science. The idea, as we express it in modern terms, is that there are many different classes of computable functions. Primitive recursion is one of the basic complexity classes. You don't need a Turing machine to compute primitive recursive functions - they're a simpler class.

The easiest way to understand primitive recursion is that it's what you get in a programming language with integer arithmetic, and simple for-loops. The only way you can iterate is by repeating things a bounded number of times. Primitive recursion has a lot of interesting properties: the two key ones for our purposes here are: number theoretic proofs are primitive recursive, and every computation of a primitive recursive function is guaranteed to complete within a bounded amount of time.

The formal definition of primitive recursion, the way that Gödel wrote it, is quite a bit more complex than that. But it means the same thing.

We start with what it means to define a formula via primitive recursion. (Note the language that I used there: I'm not explaining what it means for a function to be primitive recursive; I'm explaining what it means to be defined via primitive recursion.) And I'm defining formulae, not functions. In Gödel's proof, we're always focused on numerical reasoning, so we're not going to talk about programs or algorithms, we're going to about the definition of formulae.

A formula \phi(x_1, x_2, ..., x_n) is defined via primitive recursion if, for some other formulae \psi and \mu:

  • Base: \phi(0, x_2, ..., x_n) = \psi(x_2, ..., x_n)
  • Recursive: \phi(i+1, x_2, ..., x_n) = \mu(i, \phi(i, x_2, ..., x_n), x_2, ..., x_n).

So, basically, the first parameter is a bound on the number of times that \phi can invoked recursively. When it's 0, you can't invoke \phi any more.

A formula is primitive recursive if it defined from a collection of formulae \phi_1, ..., \phi_n where any formula \phi_i is defined via primitive recursion from \phi_1, ..., \phi_{i-1}, or the primitive succ function from Peano arithmetic.

For any formula \phi_i in that sequence, the degree of the formula is the number of other primitive recursive formulae used in its definition.

Now, we can define a primitive recursive property: R(x_1, ..., x_n) is primitive recursive if and only if there exists a primitive recursive function \phi such that \phi(x_1, ..., x_n) = 0.

With primitive recursive formulae and relations defined, there's a bunch of theorems about how you can compose primitive recursive formulae and relations:

  1. Every function or relation that you get by substituting a primitive recursive function for a variable in a primitive recursive function/relation is primitive recursive.
  2. If R and S are primitive relations, then ¬R, R∧S, R∨S are all primitive recursive.
  3. If \phi(x_1, ..., x_n) and \psi(x_1, ..., x_n) are primitive recursive functions, then the relation R(x_1, ..., x_n) \Leftrightarrow (\phi(x_1, ..., x_n) = \psi(x_1, ..., x_n) is also primitive recursive.
  4. Let xv and zv be finite-length tuples of variables. If the function \phi(xv) and the relation R(y, zv) are primitive recursive, then so are the relations:
    • S(xv, zv) \Leftrightarrow  (\exists y \le \phi(xv). R(y, zv))
    • T(xv, zv) \Leftrightarrow  (\forall y \le A(xv). R(y, zv))
  5. Let xv and zv be finite-length tuples of variables. And let \text{argmin}[y \le f(x).R(x)] be the smallest value of x for which y \le f(x) and R(x) is true, or 0 if there is no such value. Then if the function \phi(xv) and the relation R(y, zv) are primitive recursive, then so is the function P(xv, zv) = (\text{argmin}[y \le A(xv). R(y, zv))].

By these definitions, addition, subtraction, multiplication, and integer division are all primitive recursive.

Ok. So, now we've got all of that out of the way. It's painful, but it's important. What we've done is come up with a good formal description of what it means for something to be an arithmetic property: if we can write it as a primitive recursive relation or formula, it's arithmetic.

So now, finally, we're ready to get to the really good part! Now that we know what it means to define something arithmetically, we can see how to define meta-mathematical properties and formulae arithmetically. Next post, hopefully tomorrow, I'll start showing you arithmetic expressions of meta-math!

Share

4 responses so far

G&oum;del Numbering: Encoding Logic as Numbers

(by MarkCC) Feb 07 2013

The first step in Gödel's incompleteness proof was finding a way of taking logical statements and encoding them numerically. Looking at this today, it seems sort-of obvious. I mean, I'm writing this stuff down in a text file - that text file is a stream of numbers, and it's trivial to convert that stream of numbers into a single number. But when Gödel was doing it, it wasn't so obvious. So he created a really clever mechanism for numerical encoding. The advantage of Gödel's encoding is that it makes it much easier to express properties of the encoded statements numerically.

Before we can look at how Gödel encoded his logic into numbers, we need to look at the logic that he used. Gödel worked with the specific logic variant used by the Principia Mathematica. The Principia logic is minimal and a bit cryptic, but it was built for a specific purpose: to have a minimal syntax, and a complete but minimal set of axioms.

The whole idea of the Principia logic is to be purely syntactic. The logic is expected to have a valid model, but you shouldn't need to know anything about the model to use the logic. Russell and Whitehead were deliberately building a pure logic where you didn't need to know what anything meant to use it. I'd really like to use Gödel's exact syntax - I think it's an interestingly different way of writing logic - but I'm working from a translation, and the translator updated the syntax. I'm afraid that switching between the older Gödel syntax, and the more modern syntax from the translation would just lead to errors and confusion. So I'm going to stick with the translation's modernization of the syntax.

The basic building blocks of the logic are variables. Already this is a bit different from what you're probably used to in a logic. When we think of logic, we usually consider predicates to be a fundamental thing. In this logic, they're not. A predicate is just a shorthand for a set, and a set is represented by a variable.

Variables are stratified. Again, it helps to remember where Russell and
Whitehead were coming from when they were writing the Principia. One of their basic motivations was avoiding self-referential statements like Russell's paradox. In order to prevent that, they thought that they could create a stratified logic, where on each level, you could only reason about objects from the level below. A first-order predicate would be a second-level object could only reason about first level objects. A second-order predicate would be a third-level object which could reason about second-level objects. No predicate could ever reason about itself or
anything on its on level. This leveling property is a fundamental property built into their logic. The way the levels work is:

  • Type one variables, which range over simple atomic values, like
    specific single natural numbers. Type-1 variables are written as
    a1, b1.
  • Type two variables, which range over sets of atomic values, like
    sets of natural numbers. A predicate, like IsOdd, about specific natural numbers would be represented as a type-2 variable. Type-2 variables are
    written a2, b2, ...
  • Type three variables range over sets of sets of atomic values.
    The mappings of a function could be represented as type-3 variables:
    in set theoretic terms, a function is set of ordered pairs. Ordered pairs, in
    turn, can be represented as sets of sets - for example, the
    ordered pair (1, 4) would be represented by the set { {1}, {1, 4} }.
    A function, in turn, would be represented by a type-4 variable - a set
    of ordered pairs, which is a set of sets of sets of values.

Using variables, we can form simple atomic expressions, which in Gödel's terminology are called signs. As with variables, the signs are divided into stratified levels:

  • Type-1 signs are variables, and successor expressions. Successor expressions
    are just Peano numbers written with "succ": 0, succ(0), succ(succ(0)),
    succ(a1), etc.
  • Signs of any type greater than 1 are just variables of that type/level.

Now we can assemble the basic signs into formulae. Gödel explained how to build formulae in a classic logicians form, which I think is hard to follow, so I've converted it into a grammar:

 elementary_formula → signn+1(signn)
 formula → ¬(elementary_formula)
 formula → (elementary_formula) \or (elementary_formula)
 formula → ∀ signn (elementary_formula)

That's all that's really included in Gödel's logic. It's enough: everything else can be defined in terms of combinatinos of these. For example, you can define logical and in terms of negation and logical or: (a ∧ b) ⇔ ¬ (¬ a ∨ ¬ b).

Next, we need to define the axioms of the system. In terms of logic the way I think of it, these axioms include both "true" axioms, and the inference rules defining how the logic works. There are five families of axioms.

  1. First, there's the Peano axioms, which define the natural numbers.
    1. ¬(succ(x1) = 0): 0 is a natural number, and it's not the successor of anything.
    2. succ(x1) = succ(y1) ⇒ x1 = y1: Successors are unique.
    3. (x2(0) ∧ ∀ x1 (x2(x1) ⇒ x2(succ(x1)))) ⇒ ∀ x1(x2(x1)): induction works on the natural numbers.
  2. Next, we've got a set of basic inference rules about simple propositions. These
    are defined as axiom schemata, which can be instantiated for any set of formalae
    p, q, and r.

    1. p ∨ p ⇒ p
    2. p ⇒ p ∨ q
    3. p ∨ q ⇒ q ∨ p
    4. (p ⇒ q) ⇒ (p ∨ r) ⇒ (q ∨ r)
  3. Axioms that define inference over quantification. v is a variable,
    a is any formula, b is any formula where v is not a free variable, and c is a sign of the same level as v, and
    which doesn't have any free variables that would be bound if it were inserted as
    a replacement for v.

    1. ∀ v (a) ⇒ subst[v/c](a): if formula a is true for all values of v, then you can substitute any specific value c for v in a, and a must still be true.
    2. (∀ v (b ∨ a)) ⇒ (b ∨ ∀ v (a))
  4. The Principia's version of the set theory axiom of comprehension:
    &exists; u (∀ v ( u(v) ⇔ a )).
  5. And last but not least, an axiom defining set equivalence:
    ∀ xi (xi+1(xi) ⇔ yi+1(yi)) ⇒ xi+1 = yi+1
  6. So, now, finally, we can get to the numbering. This is quite clever. We're going to use the simplest encoding: for every possible string of symbols in the logic, we're going to define a representation as a number. So in this representation, we are not going to get the property that every natural number is a valid formula: lots of natural numbers won't be. They'll be strings of nonsense symbols. (If we wanted to, we could make every number be a valid formula, by using a parse-tree based numbering, but it's much easier to just let the numbers be strings of symbols, and then define a predicate over the numbers to identify the ones that are valid formulae.)

    We start off by assigning numbers to the constant symbols:

    Symbols Numeric Representation
    0 1
    succ 3
    ¬ 5
    7
    9
    ( 11
    ) 13

    Variables will be represented by powers of prime numbers, for prime numbers greater that 13. For a prime number p, p will represent a type one variable, p2 will represent a type two variable, p3 will represent a type-3 variable, etc.

    Using those symbol encodings, we can take a formula written as symbols x1x2x3...xn, and encode it numerically as the product 2x13x25x2...pnxn, where pn is the nth prime number.

    For example, suppose I wanted to encode the formula: ∀ x1 (y2(x1)) ∨ x2(x1).

    First, I'd need to encode each symbol:

    1. "∀" would be 9.
    2. "x1"" = 17
    3. "(" = 11
    4. "y2" = 192 = 361
    5. "(" = 11
    6. "x1" = 17
    7. ")" = 13
    8. "∨" = 7
    9. "x2" = 172 = 289
    10. "(" = 11
    11. "x1" = 17
    12. ")" = 13
    13. ")" = 13

    The formula would thus be turned into the sequence: [9, 17, 11, 361, 11, 17, 13, 7, 289, 11, 17, 13, 13]. That sequence would then get turned into a single number
    29 317 511 7361 1111 1317 1713 197 23289 2911 3117 3713 4113, which according to Hugs is the number (warning: you need to scroll to see it. a lot!):
    1,821,987,637,902,623,701,225,904,240,019,813,969,080,617,900,348,538,321,073,935,587,788,506,071,830,879,280,904,480,021,357,988,798,547,730,537,019,170,876,649,747,729,076,171,560,080,529,593,160,658,600,674,198,729,426,618,685,737,248,773,404,008,081,519,218,775,692,945,684,706,455,129,294,628,924,575,925,909,585,830,321,614,047,772,585,327,805,405,377,809,182,961,310,697,978,238,941,231,290,173,227,390,750,547,696,657,645,077,277,386,815,869,624,389,931,352,799,230,949,892,054,634,638,136,137,995,254,523,486,502,753,268,687,845,320,296,600,343,871,556,546,425,114,643,587,820,633,133,232,999,109,544,763,520,253,057,252,248,982,267,078,202,089,525,667,161,691,850,572,084,153,306,622,226,987,931,223,193,578,450,852,038,578,983,945,920,534,096,055,419,823,281,786,399,855,459,394,948,921,598,228,615,703,317,657,117,593,084,977,371,635,801,831,244,944,526,230,994,115,900,720,026,901,352,169,637,434,441,791,307,175,579,916,097,240,141,893,510,281,613,711,253,660,054,258,685,889,469,896,461,087,297,563,191,813,037,946,176,250,108,137,458,848,099,487,488,503,799,293,003,562,875,320,575,790,915,778,093,569,309,011,025,000,000,000.

    Next, we're going to look at how you can express interesting mathematical properties in terms of numbers. Gödel used a property called primitive recursion as an example, so we'll walk through a definition of primitive recursion, and show how Gödel expressed primitive recursion numerically.

    Share

4 responses so far

Gödel's Incompleteness

(by MarkCC) Feb 03 2013

I've mentioned Gödel's incompleteness theorems many times on this blog, but I've never actually written about them in detail. I was asking, on twitter, for topics that readers would be interested in, and one thing that came up is actually go through the proof of incompleteness. I'm happy to take the time to do that incompleteness is one of the most beautiful and profound proofs that I've ever seen.

It takes a fair bit of effort though, so it's going to take a couple of posts just to get through the preliminaries. What I'm going to do is work with this translation of the original paper where Gödel published his first incompleteness proof. Before we can get to the actual proof, we need to learn a bit about the particular kind of logic that he used in his proof.

The story of incompleteness is really pretty interesting. In the late 19th and early 20th centuries, mathematicians started to get really interested in formal foundations. Why, exactly, this happened is somewhat of a debate, but the story I find most likely is that it's a result of Cantor and his naive set theory.

Set theory came along, and it seemed like such a great idea. It seems incredibly simple, and it makes many proofs in many different subject areas appear to be really simple. But there's a huge problem: it's not consistent.

The problem is what's come to be known as Russell's paradox. I've explained Russell's paradox a bunch of times on the blog. For this post, I'm going to do it a bit differently. As part of my prep for this series of posts, I've been reading the book Gödel's Proof, and in that book, they have a way of rephrasing Russell's paradox which I think clarifies it nicely, so I'm going to try their version.

If you think about the set of all sets, you can partition it into two non-overlapping subsets. There's a subset of normal sets, and a class of abnormal sets. What makes a set abnormal is that it's self-referential - an abnormal set contains itself as a member.

So considered the set of all normal sets. Is it, itself, a normal set?

The answer is: argh! Because if the set of all normal sets is normal, then it must be a member of itself. But if it's a member of itself, it can't be normal. Similarly, if you start off saying that the set of all normal sets isn't normal, then by definition, it is normal. No matter what you do, you're screwed.

What Russell's paradox did was show that there was a foundational problem in math. You could develop what appeared to be a valid mathematicial structure and theory, only to later discover that all the work you did was garbage, because there was some non-obvious fundamental inconsistency in how you defined it. But the way that foundations were treated simple wasn't strong or formal enough to be able to detect, right up front, whether you'd hard-wired an inconsistency into your theory. So foundations had to change, to prevent another Cantor incident.

So, eventually, along came two mathematicians, Russell and Whitehead, and they created an amazing piece of work called the Principia Mathematica. The principia was supposed to be an ideal, perfect mathematical foundation. The Principia was supposed to have two key properties: it was supposed to consistent, and it was supposed to be complete.

  • Consistent meant that the statement would not allow any inconsistencies of any kind. If you used the logic and the foundantions of the Principia, you couldn't even say anything like Russell's paradox: you couldn't even write it as a valid statement.
  • Complete meant that every true statement was provably true, every false statement was provably false, and every statement was either true or false.

A big part of how it did this was by creating a very strict stratification of logic. You could reason about a specific number, using level-0 statements. You could reason about sets of numbers using level-1 statements. And you could reason about sets of sets of numbers using level-2 statements. In a level-1 statement, you could make meta-statements about level-0 properties, but you couldn't reason about level-1. In level-2, you could reason about level-1 and level-0, but not about level-2. This meant that you couldn't make a set like the set of normal sets - because that set is formed using a predicate - and when you're forming a set like "the set of proper sets", the sets you can reason about are a level below you. You can form a level-1 set using a level-1 statement about level-0 objects. But you can't make a level-1 statement about level-1 objects! So self-referential systems would be completely impossible in the Principia's logic.

As a modern student of math, it's hard to understand what a profound thing they were trying to do. We've grown up learning math long after incompleteness became a well-known fact of life. (I read "Gödel Escher Bach" when I was a college freshman - well before I took any particularly deep math classes - so I knew about incompleteness before I knew enough to really understand what completeness woud have meant!) The principia would have been the perfection of math, a final ultimate perfect system. There would have been nothing that we couldn't prove, nothing in math that we couldn't know!

What Gödel did was show that using the Principia's own system, and it's own syntax, that not only was the principia itself flawed, but that any possible effort like the principia would inevitably be flawed!

With the incompleteness proof, Gödel showed that even in the Principia, even with all of the effort that it made to strictly separate the levels of reasoning, that he could form self-referential statements, and that those self-referential statements were both true and unprovable.

The way that he did it was simply brilliant. The proof was a sequence of steps.

  1. He showed that using Peano arithmetic - that is, the basic definition of natural numbers and natural number arithmetic - that you could take any principia-logic statement, and uniquely encode it as a number - so that every logical statement was a number, and ever number was a specific logical statement.
  2. Then using that basic mechanic, he showed how you could take any property defined by a predicate in the principia's logic, and encode it as a arithmetic property of the numbers. So a number encoded a statement, and the property of a number could be encoded arithmetically. A number, then, could be both a statement, and a definition of an arithmetic property of a stament, and a logical description of a logical property of a statement - all at once!
  3. Using that encoding, then - which can be formed for any logic that can express Peano arithmetic - he showed that you could form a self-referential statement: a number that was a statement about numbers including the number that was statement itself. And more, it could encode a meta-property of the statement in a way that was both true, and also unprovable: he showed how to create a logical property "There is no proof of this statement", which applied to its own numeric encoding. So the statement said, about itself, that it couldn't be proven.

The existence of that statement meant that the Principia - and any similar system! - was incomplete. Completeness means that every true statement is provably true within the system. But the statement encodes the fact that it cannot be proven. If you could prove it, the system would be inconsistent. If you can't, it's consistent, but incomplete.

We're going to go through all of that in detail. But to do that in a way where we can really understand it, we're going to need to start by looking at the logic used by the Principia. Then we'll look at how to encode Principia-logical statements as natural numbers using a technique called Gödel numbering, and how to express predicates numerically. And then, finally, we can look at how you can form the classic Gödel statement that shows that the Principia's system of logic is incomplete.

Share

16 responses so far

Least Square Linear Regression

(by MarkCC) Jan 30 2013

There's one topic I've been asked about multiple times, but which I've never gotten around to writing about. It happens to be one of the first math things that my dad taught me about: linear regression.

Here's the problem: you're doing an experiment. You're measuring one quantity as you vary another. You've got a good reason to believe that there's a linear relationship between the two quantities. But your measurements are full of noise, so when you plot the data on a graph, you get a scattershot. How can you figure out what line is the best match to your data, and how can you measure how good the match is?

When my dad taught me this, he was working for RCA manufacturing semiconductor chips for military and satellite applications. The focus of his work was building chips that would survive in the high-radiation environment of space - in the jargon, he was building radiation hard components. They'd put together a set of masks for an assembly line, and do a test run. Then they'd take the chips from that run, and they'd expose them to gamma radiation until they failed. That would give them a good way of estimating the actual radiation hardness of the run, and whether it was good enough for their customers. Based on a combination of theory and experience, they knew that the relationship they cared about was nearly linear: for a given amount of radiation, the number of specific circuitry failures was proportional to the amount of gamma exposure.

graph For example, here's a graph that I generated semi-randomly of data points. The distribution of the points isn't really what you'd get from real observations, but it's good enough for demonstration.

The way that we'd usually approach this is called least square linear regression. The idea is that what we want do do is create a line where the square of the vertical distance between the chosen line and the measured data points is a minimum.

For the purposes of this, we'll say that one quantity is the independent value, and we'll call that x, and the other quantity is the dependent variable, and we'll call that y. In theory, the dependent variable, as its name suggests depends on the independent variable. In fact, we don't always really know which value depends on the other, so we do our best to make an intelligent guess.

So what we want to do is find a linear equation, y = mx + b where the mean-square distance is minimal. All we need to do is find values for m (the slope of the line) and b (the point where the line crosses the y axis, also called the y intercept). And, in fact, b is relatively easy to compute once we know the slope of the line. So the real trick is to find the slope of the line.

The way that we do that is: first we compute the means of x and y, which we'll call \overline{x} and \overline{y}. Then using those, we compute the slope as:

 m = \frac{\Sigma_{i=1}^n (x-\hat{x})(y-\hat{y})}{\Sigma_{i=1}^{n} (x-\hat{x})^2}

Then for the y intercept: b = \hat{y} - m\hat{x}.

In the case of this data: I set up the script so that the slope would be about 2.2 +/- 0.5. The slop in the figure is 2.54, and the y-intercept is 18.4.

Now, we want to check how good the linear relationship is. There's several different ways of doing that. The simplest is called the correlation coefficient, or r.

 r = 	\frac{\left(\Sigma (x-\hat{x})\right) \left(\Sigma (y - \hat{y})\right)}{<br />
	\sqrt{ \left(\Sigma (x-\hat{x})^2\right) \left(\Sigma (y - \hat{y})^2\right)  }<br />
}<br />

If you look at this, it's really a check of how well the variation between the measured values and the expected values (according to the regression) match. On the top, you've got a set of products; on the bottom, you've got the square root of the same thing squared. The bottom is, essentially, just stripping the signs away. The end result is that if the correlation is perfect - that is, if the dependent variable increases linearly with the independent, then the correlation will be 1. If the dependency variable decreases linearly in opposition to the dependent, then the correlation will be -1. If there's no relationship, then the correlation will be 0.

For this particular set of data, I generated it with a linear equation with a little bit of random noise. The correlation coefficient is slighly greater than 0.95, which is exctly what you'd expect.

When you see people use linear regression, there are a few common errors that you'll see all the time.

  • No matter what your data set looks like, linear regression will find a line. It won't tell you "Oops, I couldn't find a match". So the fact that you fit a line means absolutely nothing by itself. If you're doing it right, you start off with a hypothesis based on prior plausibility for a linear relation, and you're using regression as part of a process to test that hypothesis.
  • You don't get to look at the graph before you do the analysis. What I mean by that is, if you look at the data, you'll naturally notice some patterns. Humans are pattern seekers - we're really good at noticing them. And almost any data set that you look at carefully enough will contain some patterns purely by chance. If you look at the data, and there's a particular pattern that you want to see, you'll probably find a way to look at the data that produces that pattern. For example, in the first post on this blog, I was looking at a shoddy analysis by some anti-vaxxers, who were claiming that they'd found an inflection point in the rate of autism diagnoses, and used linear regression to fit two lines - one before the inflection, one after. But that wasn't supported in the data. It was random - the data was very noisy. You could fit different lines to different sections by being selective. If you picked one time, you'd get a steeper slope before that time, and a shallower one after. But by picking different points, you could get a steeping slope after. The point is, when you're testing the data, you need to design the tests before you've seen the data, in order to keep your bias out!

  • A strong correlation doesn't imply linear correlation. If you fit a line to a bunch of data that's not really linear, you can still get a strong positive (or negative) correlation. Correlation is really testing whether the data is increasing the way you'd expect it to, not whether it's truly linear. Random data will have a near-zero correlation. Data where the dependent variable doesn't vary consistently with the independent will have near-zero correlation. But there are plenty of ways of getting data where the dependent and independent variables increase together that produce a strong correlation. You need to do other things to judge the strength of the fit. (I might do some more posts on this kind of thing to cover some of that.)
Share

20 responses so far

Not Really Free Energy from Silicon and Water

(by MarkCC) Jan 28 2013

A reader sent me a link to an article about a new technique for producing hydrogen. His original request was for me to debunk it, because it looked like another free energy scam. But it isn't, so I'm going to attempt to explain why.

BUFFALO, N.Y. — Super-small particles of silicon react with water to produce hydrogen almost instantaneously, according to University at Buffalo researchers.

In a series of experiments, the scientists created spherical silicon particles about 10 nanometers in diameter. When combined with water, these particles reacted to form silicic acid (a nontoxic byproduct) and hydrogen — a potential source of energy for fuel cells.

The reaction didn’t require any light, heat or electricity, and also created hydrogen about 150 times faster than similar reactions using silicon particles 100 nanometers wide, and 1,000 times faster than bulk silicon, according to the study.

Producing hydrogen from water, with no light, heat, or electricity! I can't blame you if you read that, and think that it sounds like a free-energy scam. We're talking about producing hydrogen from water without using any energy!

But here's the catch: it's a chemical reaction. It consumes the silicon, producing silicic acid.

There are all sorts of chemical reactions that can release energy. That's why we like gasoline so much: it's a chemical that reacts easily with oxygen, and releases lots of energy when it does. There's no cheating going on there. As thermodynamics predicts, everything trends towards the lowest energy state - if you look at the thermodynamic system containing a reaction that releases energy, it properly ends up in a lower energy state.

In the case of this silicon bead technique, all that we're seeing is a typical entropy increasing chemical reaction.

If the silicon wasn't consumed by the reaction, but we were still able to break water molecules producing free hydrogen, which we could then burn to produce energy, that would be a free energy system: you could take water, dump in the silicon, produce hydrogen, burn the hydrogen, releasing energy and producing water, and then take the resulting water, and dump it back into the silicon mixture. But that's not what's going on here. Instead, the silicon is getting consumed, and if you wanted to re-use it, you would need to use energy to extract it from the silicic acid - which would be a typical lossy thermodynamic process.

Share

2 responses so far

Recipe: Kimchi-Marinated Steak Sandwich

(by MarkCC) Jan 26 2013

Last weekend, I was watching cooking shows on PBS. One of my favorite chefs, Ming Tsai, came on. Along with a guest, he ad-libbed a dish that looked absolutely phenomenal: a kimchi marinated cheesesteak sandwich.

I didn't write down the recipe; I just took the idea, and ran with it. I mean come on, what could be better than a well-seared medium-rare flank steak with kimchi and Tallegio? My version is, I think, a bit lighter than what they made, but it's not exactly a light snack!

For my tastes, this is an amazing combination. Start with flank steak, which is one of my favorite cuts of beef. It's not the tenderest piece of beef, so it needs some work, but it's got an absolutely fantastic flavor, and as long as you prepare it properly, don't overcook it, and slice it thin, it comes out great.

Then you've got kimchi. I only discovered kimchi fairly recently, and I absolutely love the stuff, and I'm always looking for more things to eat it with.

Finally, tallegio cheese. I adore soft cheeses, and tallegio is probably the nicest soft cheese that I've had from Italy. It's very creamy, with a nice funkiness to it, but not an overpowering stinkiness.

Put all three of those together in a sandwich, and I'm in heaven.

Ingredients

  • One largish (1 1/2 pound) flank steak.
  • 1/2 cup kimchi, finely minced, plus 1 tablespoon of
    the liquid from the kimchi.
  • Some more kimchi, left whole.
  • salt and pepper
  • one teaspoon sugar.
  • one half of a large onion, cut into thin slices.
  • 1/2 pound tallegio cheese.
  • Bagette.

One note about the ingredients: I recently made homemade kimchi for the first time. It's really easy to make, and it is really a totally different thing from the stuff out of a jar. I don't see why it's so different: kimchi is a fermented preserved cabbage, which is exactly the kind of thing that should be really easy to make in quantity, and then sell in jars. But the fact is, nothing from the store remotely compares to homemade. I basically used this recipe, except that I couldn't get any daikon, so I left that out.

Instructions

  1. Take a couple of forks, and go up and down the steak, poking holes. Don't be lazy: the more you do this, the better. You're tenderizing the meat, and making a way for the marinade to help penetrate.
  2. Salt the meat generously, then coat both sides with the minced kimchi. Refrigerate overnight.
  3. Scrape the kimchi off the steak, but do not throw it away. We're going to cook it, and use it as a topping.
  4. Heat a cast-iron pan on high heat until it's smoking hot.
  5. Sprinkle half the sugar and some more salt and pepper onto the top of the steak, and then put the sugared side down in the pan, for about 3 minutes. While it's cooking on that side, put the rest of the sugar and salt onto the other side. Then turn it over, and cook the other side for 3 minutes.
  6. Transfer the steak into a 350 degree oven for 8-10 minutes. (I did eight; it was exactly the way I like it, nice and medium rare, but I think a lot of people would prefer it a bit more well done. Take it out after the ten minutes, and leave it alone for a while. Don't slice it immediately!
  7. While the steak is resting, heat up a pan, and add some oil. Toss in the onions with a bit of salt, and stir them around. When they just start to turn brown, add in the kimchi and any liquid left from the marinade, and turn down the heat. Cook it for about five minutes, adding a bit of water if it gets too dry.
  8. Slice the break in half, and spread a thin layer of the taleggio cheese over both halves.
  9. Slice the steak on a bias against the grain into thin slices.
  10. Take any drippings from slicing the steak, and pour them into the onions and kimchi.
  11. Put a heap of steak slices onto the bread, and then put a pile of the cooked onions and kimchi, and then a piece of fresh uncooked kimchi on top.
  12. Close up the sandwich, and eat!

Philadelphia, eat your heart out. Your cheesesteaks got nothing on this!

Share

No responses yet

Back to an old topic: Bad Vaccine Math

(by MarkCC) Jan 25 2013

The very first Good Math/Bad Math post ever was about an idiotic bit of antivaccine rubbish. I haven't dealt with antivaccine stuff much since then, because the bulk of the antivaccine idiocy has nothing to do with math. But the other day, a reader sent me a really interesting link from what my friend Orac calls a "wretched hive of scum and quackery", naturalnews.com, in which they try to argue that the whooping cough vaccine is an epic failure:

(NaturalNews) The utter failure of the whooping cough (pertussis) vaccine to provide any real protection against disease is once again on display for the world to see, as yet another major outbreak of the condition has spread primarily throughout the vaccinated community. As it turns out, 90 percent of those affected by an ongoing whooping cough epidemic that was officially declared in the state of Vermont on December 13, 2012, were vaccinated against the condition -- and some of these were vaccinated two or more times in accordance with official government recommendations.

As reported by the Burlington Free Press, at least 522 cases of whooping cough were confirmed by Vermont authorities last month, which was about 10 times the normal amount from previous years. Since that time, nearly 100 more cases have been confirmed, bringing the official total as of January 15, 2013, to 612 cases. The majority of those affected, according to Vermont state epidemiologist Patsy Kelso, are in the 10-14-year-old age group, and 90 percent of those confirmed have already been vaccinated one or more times for pertussis.

Even so, Kelso and others are still urging both adults and children to get a free pertussis shot at one of the free clinics set up throughout the state, insisting that both the vaccine and the Tdap booster for adults "are 80 to 90 percent effective." Clearly this is not the case, as evidenced by the fact that those most affected in the outbreak have already been vaccinated, but officials are apparently hoping that the public is too naive or disengaged to notice this glaring disparity between what is being said and what is actually occurring.

It continues in that vein. The gist of the argument is:

  1. We say everyone needs to be vaccinated, which will protect them from getting the whooping cough.
  2. The whooping cough vaccine is, allagedly, 80 to 90% effective.
  3. 90% of the people who caught whooping cough were properly vaccinated.
  4. Therefore the vaccine can't possibly work.

What they want you to do is look at that 80 to 90 percent effective rate, and see that only 10-20% of vaccinated people should be succeptible to the whooping cough, and compare that 10-20% to the 90% of actual infected people that were vaccinated. 20% (the upper bound of the succeptible portion of vaccinated people according to the quoted statistic) is clearly much smaller than 90% - therefore it's obvious that the vaccine doesn't work.

Of course, this is rubbish. It's a classic apple to orange-grove comparison. You're comparing percentages, when those percentages are measuring different groups - groups with wildly difference sizes.

Take a pool of 1000 people, and suppose that 95% are properly vaccinated (the current DTAP vaccination rate in the US is around 95%). That gives you 950 vaccinated people and 50 unvaccinated people who are unvaccinated.

In the vaccinated pool, let's assume that the vaccine was fully effective on 90% of them (that's the highest estimate of effectiveness, which will result in the lowest number of succeptible vaccinated - aka the best possible scenario for the anti-vaxers). That gives us 95 vaccinated people who are succeptible to the whooping cough.

There's the root of the problem. Using numbers that are ridiculously friendly to the anti-vaxers, we've still got a population of twice as many succeptible vaccinated people as unvaccinated. so we'd expect, right out of the box, that better than 2/3rds of the cases of whooping cough would be among the vaccinated people.

In reality, the numbers are much worse for the antivax case. The percentage of people who were ever vaccinated is around 95%, because you need the vaccination to go to school. But that's just the childhood dose. DTAP is a vaccination that needs to be periodically boosted or the immunity wanes. And the percentage of people who've had boosters is extremely low. Among adolescents, according to the CDC, only a bit more than half have had DTAP boosters; among adults, less that 10% have had a booster within the last 5 years.

What's your succeptibility if you've gone more than 5 years without vaccination? Somewhere 40% of people who didn't have boosters in the last five years are succeptible.

So let's just play with those numbers a bit. Assume, for simplicity, than 50% of the people are adults, and 50% children, and assume that all of the children are fully up-to-date on the vaccine. Then you've got 10% of the children (10% of 475), 10% of the adults that are up-to-date (10% of 10% of 475), and 40% of the adults that aren't up-to-date (40% of 90% of 475) is the succeptible population. That works out to 266 succeptible people among the vaccinated, which is 85%: so you'd expect 85% of the actual cases of whooping cough to be among people who'd been vaccinated. Suddenly, the antivaxers case doesn't look so good, does it?

Consider, for a moment, what you'd expect among a non-vaccinated population. Pertussis is highly contagious. If someone in your household has pertussis, and you're succeptible, you've got a better than 90% chance of catching it. It's that contagious. Routine exposure - not sharing a household, but going to work, to the store, etc., with people who are infected still gives you about a 50% chance of infection if you're succeptible.

In the state of Vermont, where NaturalNews is claiming that the evidence shows that the vaccine doesn't work, how many cases of Pertussis have they seen? Around 600, out of a state population of 600,000 - an infection rate of one tenth of one percent. 0.1 percent, from a virulently contagious disease.

That's the highest level of Pertussis that we've seen in the US in a long time. But at the same time, it's really a very low number for something so contagious. To compare for a moment: there's been a huge outbreak of Norovirus in the UK this year. Overall, more than one million people have caught it so far this winter, out of a total population of 62 million, for a rate of about 1.6% or sixteen times the rate of infection of pertussis.

Why is the rate of infection with this virulently contagious disease so different from the rate of infection with that other virulently contagious disease? Vaccines are a big part of it.

Share

8 responses so far

« Newer posts Older posts »

Bad Behavior has blocked 1092 access attempts in the last 7 days.