Static Typing: Give me a break!

I'm a software engineer. I write code for a living. I'm also a programming language junkie. I love programming languages. I'm obsessed with programming languages. I've taught myself more programming languages than any sane person has any reason to know.

Learning that many languages, I've developed some pretty strong opinions about what makes a good language, and what kind of things I really want to see in the languages that I use.

My number one preference: strong static typing. That's part of a more general preference, for preserving information. When I'm programming, I know what kind of thing I expect in a parameter, and I know what I expect to return. When I'm programming in a weakly typed language, I find that I'm constantly throwing information away, because I can't actually say what I know about the program. I can't put my knowledge about the expected behavior into the code. I don't think that that's a good thing.

But... (you knew there was a but coming, didn't you?)

This is my preference. I believe that it's right, but I also believe that reasonable people can disagree. Just because you don't think the same way that I do doesn't mean that you're an idiot. It's entirely possible for someone to know as much as I do about programming languages and have a different opinion. We're talking about preferences.

Sadly, that kind of attitude is entirely too uncommon. I seriously wonder sometimes if I'm crazy, because it seems like everywhere I look, on every topic, no matter how trivial, most people absolutely reject the idea that it's possible for an intelligent, knowledgeable person to disagree with them. It doesn't matter what the subject is: politics, religion, music, or programming languages.

What brought this little rant on is that someone sent me a link to a comic called "Cartesian Closed Comic". It's a programming-language geek comic. But this particular strip bugs me: it seems utterly typical of the kind of attitude that I'm griping about.

See, this is a pseudo-humorous way of saying "Everyone who disagrees with me is an idiot". It's not that reasonable people can disagree. It's that people who disagree with me only disagree because they're ignorant. If you like static typing, you probably know type theory. If you don't like static typing, there's almost no chance that you know anything about type theory. So the reason that those stupid dynamic typing people don't agree with people like me is because they just don't know as much as I do. And the reason that arguments with them don't go anywhere isn't because we have different subjective preferences: it's because they're just too ignorant to understand why I'm right and they're wrong.

Most programmers - whether they prefer static typing or not - don't know type theory. Most of the arguments about whether to use static or dynamic typing aren't based on type theory. It's just the same old snobbery, the "you can't disagree with me unless you're an idiot".

Among intelligent skilled engineers, the static versus dynamic typing thing really comes down to a simple, subjective argument:

Static typing proponents believe that expressing intentions in a statically checkable form is worth the additional effort of making all of the code type-correct.

Dynamic typing proponents believe that it's not: that strong typing creates an additional hoop that the programmer needs to jump through in order to get a working system.

Who's right? In fact, I don't think that either side is universally right. Building a real working system is a complex thing. There's a complex interplay of design, implementation, and testing. What static typing really does is take some amount of stuff that could be checked with testing, and allow the compiler to check it in an abstract way, instead of with specific tests.

Is it easier to write code with type declarations, or with additional tests? Depends on the engineers and the system that they're building.
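To make the trade-off concrete, here's a toy sketch in Python (the function names are mine, and the annotation would be checked by an external tool like mypy, not at runtime): the same contract, expressed once as a type annotation and once as a test.

```python
# The same contract, written two ways. The annotation can be verified
# statically (e.g. by mypy) at every call site; the test verifies the
# behavior dynamically, for the specific inputs it exercises.

def mean_typed(values: list[float]) -> float:
    """The annotation states what we expect: a list of floats in,
    a float out. A static checker can enforce this everywhere."""
    return sum(values) / len(values)

def mean_untyped(values):
    """Same code, no declared contract. The knowledge about expected
    inputs lives in the tests instead."""
    return sum(values) / len(values)

def test_mean():
    # The dynamic-typing side of the bargain: a concrete test stands
    # in for the abstract check the type system would have done.
    assert mean_untyped([1.0, 2.0, 3.0]) == 2.0

test_mean()
```

Neither version is obviously better here; that's exactly the point of the argument above.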

Sloppy Dualism Denies Free Will?

When I was an undergrad in college, I was a philosophy minor. I spent countless hours debating ideas about things like free will. My final paper was a 60-page rebuttal to what I thought was a sloppy argument against free will. It's been more years since I wrote that than I care to admit - and I still keep seeing the same kind of sloppy arguments, which I'd argue are ultimately circular, because they hide their conclusion in their premises.

There's an argument against free will that I find pretty compelling. I don't agree with it, but I do think that it's a solid argument:

Everything in our experience of the universe ultimately comes down to physics. Every phenomenon that we can observe is, ultimately, the result of particles interacting according to basic physical laws. Thermodynamics is the ultimate, fundamental ruler of the universe: everything that we observe is a result of a thermodynamic process. There are no exceptions to that.

Our brain is just another physical device. It's another complex system made of an astonishing number of tiny particles, interacting in amazingly complicated ways. But ultimately, it's particles interacting the way that particles interact. Our behavior is an emergent phenomenon, but ultimately, we don't have any ability to make choices, because there's no mechanism that allows us free choice. Our choices are determined by the physical interactions, and our consciousness of those results is just a side-effect.

If you want to argue that free will doesn't exist, that argument is rock solid.

But for some reason, people constantly come up with other arguments - in fact, much weaker arguments that come from what I call sloppy dualism. Dualism is the philosophical position that says that a conscious being has two different parts: a physical part, and a non-physical part. In classical terms, you've got a body which is physical, and a mind/soul which is non-physical.

In this kind of argument, you rely on that implicit assumption of dualism, essentially asserting that whatever physical process we can observe isn't really you, and that therefore by observing any physical process of decision-making, you infer that you didn't really make the decision.

For example...

And indeed, this is starting to happen. As the early results of scientific brain experiments are showing, our minds appear to be making decisions before we're actually aware of them — and at times by a significant degree. It's a disturbing observation that has led some neuroscientists to conclude that we're less in control of our choices than we think — at least as far as some basic movements and tasks are concerned.

This is something that I've seen a lot lately: when you do things like functional MRI, you find that our brains settle on a decision before we consciously become aware of making the choice.

Why do I call it sloppy dualism? Because it's based on the idea that somehow the piece of our brain that makes the decision is different from the part of our brain that is our consciousness.

If our brain is our mind, then everything that's going on in our brain is part of our mind. Taking a piece of our brain and saying "Whoops, that piece of your brain isn't you, so when it made the decision, it was deciding for you instead of it being you deciding" is simply absurd.

By starting with the assumption that the physical process of decision-making we can observe is something different from your conscious choice of the decision, this kind of argument is building the conclusion into the premises.

If you don't start with the assumption of sloppy dualism, then this whole argument says nothing. If we don't separate our brain from our mind, then this whole experiment says nothing about the question of free will. It says a lot of very interesting things about how our brain works: it shows that there are multiple levels to our minds, and that we can observe those different levels in how our brains function. That's a fascinating thing to know! But does it say anything about whether we can really make choices? No.

The Investors vs. the Tabby

There's an amusing article making its rounds of the internet today, about the successful investment strategy of a cat named Orlando.

A group of people at the Observer put together a fun experiment. They asked three groups to pretend that they had 5000 pounds, and asked each of them to invest it, however they wanted, in stocks listed on the FTSE. They could only change their investments at the end of a calendar quarter. At the end of the year, they compared the results of the three groups.

Who were the three groups?

1. The first was a group of professional investors - people who are, at least in theory, experts at analyzing the stock market and using that analysis to make profitable investments.
2. The second was a classroom of students, who are bright, but who have no experience at investment.
3. The third was an orange tabby cat named Orlando. Orlando chose stocks by throwing his toy mouse at a target board randomly marked with investment choices.

As you can probably guess by the fact that we're talking about this, Orlando the tabby won, by a very respectable margin. (Let's be honest: if the professional investors came in first, and the students came in second, no one would care.) At the end of the year, the students had lost 160 pounds on their investments. The professional investors ended with a profit of 176 pounds. And the cat ended with a profit of 542 pounds - more than triple the profit of the professionals.

Most people, when they saw this, had an immediate reaction: "see, those investors are a bunch of idiots. They don't know anything! They were beaten by a cat!"
And on one level, they're absolutely right. Investors and bankers like to present themselves as the best of the best. They deserve their multi-million dollar earnings, because, so they tell us, they're more intelligent, more hard-working, more insightful than the people who earn less. And yet, despite their self-alleged brilliance, professional investors can't beat a cat throwing a toy mouse!

It gets worse, because this isn't a one-time phenomenon: there've been similar experiments that selected stocks by throwing darts at a news-sheet, or by rolling dice, or by picking slips of paper from a hat. Many times, when people have done these kinds of experiments, the experts don't win. There's a strong implication that "expert investors" are not actually experts.

Does that really hold up? Partly yes, partly no. But mostly no.

Before getting to that, there's one thing in the article that bugged the heck out of me: the author went out of his/her way to make sure that they defended the humans, presenting their performance as if positive outcomes were due to human intelligence, and negative ones were due to bad luck. In fact, I think that in this experiment, it was all luck.

For example, the author discusses how the professionals were making more money than the cat up until the last quarter of the year, and it's presented as human intelligence out-performing the random cat. But there's no reason to believe that. There's no evidence that there's anything qualitatively different about the last quarter that made it less predictable than the first three.

The headmaster at the students' school actually said "The mistakes we made earlier in the year were based on selecting companies in risky areas. But while our final position was disappointing, we are happy with our progress in terms of the ground we gained at the end and how our stock-picking skills have improved." Again, there's absolutely no reason to believe that the students' stock-picking skills miraculously improved in the final quarter; it's much more likely that they just got lucky.

The real question that underlies all of this: is the performance of individual stocks in a stock market actually predictable, or is it dominantly random? Most of the evidence that I've seen suggests a combination: on short timescales, it's predominantly random, but on longer timescales it becomes much more predictable.

But people absolutely do not want to believe that. We humans are natural pattern-seekers. It doesn't matter whether we're talking about financial markets, pixel-patterns in a bitmap, or answers on a multiple choice test: our brains look for patterns. If you randomly generate data, and you look at it long enough, with enough possible strategies, you'll find a pattern that fits. But it's an imposed pattern, and it has no predictive value. It's like the images of Jesus on toast: we see patterns in noise. So people see patterns in the market, and they want to believe that it's predictable.
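That pattern-seeking trap is easy to demonstrate with a simulation. This is a sketch with made-up numbers - nothing here models a real market - but it shows how the best of many random "strategies" looks skilled on the data it was selected on, and loses its edge on fresh data:

```python
import random

random.seed(0)

# Purely random daily "returns": there is no pattern to find.
def returns(n):
    return [random.gauss(0, 1) for _ in range(n)]

def strategy_score(signal, rets):
    # +1 means "long", -1 means "short" on each day.
    return sum(s * r for s, r in zip(signal, rets))

in_sample = returns(250)
out_sample = returns(250)

# 1000 coin-flip strategies; pick the one that "performed" best in-sample.
strategies = [[random.choice([-1, 1]) for _ in range(250)] for _ in range(1000)]
best = max(strategies, key=lambda s: strategy_score(s, in_sample))

print(strategy_score(best, in_sample))   # typically large and positive: looks like skill
print(strategy_score(best, out_sample))  # typically near zero: the pattern was imposed
```

The best strategy was guaranteed to exist by the selection process, not by any predictive power.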

Second, people want to take responsibility for good outcomes, and excuse bad ones. If you make a million dollars betting on a horse, you're going to want to say that it was your superior judgement of the horses that led to your victory. When an investor makes a million dollars on a stock, of course he wants to say that he made that money because he made a smart choice, not because he made a lucky choice. But when that same investor loses a million dollars, he doesn't want to say that he lost it because he's stupid; he wants to say that he lost money because of bad luck - random factors beyond his control that he couldn't predict.

The professional investors were doing well during part of the year: therefore, during that part of the year, they claim that their good performance was because they did a good job judging which stocks to buy. But when they lost money during the last quarter? Bad luck. But overall, their knowledge and skills paid off! What evidence do we have to support that? Nothing: but we want to assert that we have control, that experts understand what's going on, and are able to make intelligent predictions.

The students' performance was lousy, and if they had invested real money, they would have lost a tidy chunk of it. But their teacher believes that their performance in the last quarter wasn't luck - that their skills had improved. Nonsense! They were lucky.

On the general question: Are "experts" useless for managing investments?

It's hard to say for sure. In general, experts do perform better than random, but not by a huge margin, and certainly not by as much as they'd like us to believe. The Wall Street Journal used to run an experiment comparing dartboard stock selection against human experts, and against passive investment in the Dow Jones Index stocks, over a one-year period. The pros won 60% of the time. That's better than chance: the experts' knowledge and skills were clearly benefiting them. But blindly throwing darts at a wall could still beat the experts two times out of five!

When you actually do the math and look at the data, it appears that human judgement does have value. Taken over time, human experts do outperform random choices, by a small but significant margin.

What's most interesting is a time-window phenomenon. In most studies, human performance relative to random choice is directly related to the amount of time that the investment strategy is followed: the longer the timeframe, the better the humans perform. In daily investments, like day-trading, most people don't do any better than random; the performance of day-traders is pretty much in line with what you'd expect from random choice. Monthly, it's still mostly a wash. But if you look at yearly performance, you start to see a significant difference: humans typically outperform random choice by a small but definite margin. And if you look at longer time-frames, like five or ten years, you start to see really sizeable differences. The data makes it look like the daily fluctuations of the market are chaotic and unpredictable, but that there are long-term trends that we can identify and exploit.

A Bad Mathematical Refutation of Atheism

At some point a few months ago, someone (sadly I lost their name and email) sent me a link to yet another Cantor crank. At the time, I didn't feel like writing another Cantor crankery post, so I put it aside. Now, having lost it, I was using Google to try to find the crank in question. I didn't, but I found something really quite remarkably idiotic.

(As a quick side-comment, my queue of bad-math-crankery is, sadly, empty. If you've got any links to something yummy, please shoot it to me at markcc@gmail.com.)

The item in question is this beauty. It's short, so I'll quote the whole beast.

MYTH: Cantor's Set Theorem disproves divine omniscience

God is omniscient in the sense that He knows all that is not impossible to know. God knows Himself, He knows and does, knows every creature ideally, knows evil, knows changing things, and knows all possibilites. His knowledge allows free will.

Cantor's set theorem is often used to argue against the possibility of divine omniscience and therefore against the existence of God. It can be stated:

1. If God exists, then God is omniscient.
2. If God is omniscient, then, by definition, God knows the set of all truths.
3. If Cantor's theorem is true, then there is no set of all truths.
4. But Cantor’s theorem is true.
5. Therefore, God does not exist.

However, this argument is false. The non-existence of a set of all truths does not entail that it is impossible for God to know all truths. The consistency of a plausible theistic position can be established relative to a widely accepted understanding of the standard model of Cantorian set theorem. The metaphysical Cantorian premises imply that Cantor’s theorem is inapplicable to the things that God knows. A set of all truths, if it exists, must be non-Cantorian.

The attempted disproof of God’s omniscience is, from a meta-mathematical standpoint, is inadequate to the extent that it doesn't explain well-known mathematical contexts in which Cantor’s theorem is invalid. The "disproof" doesn't acknowledge standard meta-mathematical conceptions that can analogically be used to establish the relative consistency of certain theistic positions. The metaphysical assertions concerning a set of all truths in the atheistic argument above imply that Cantor’s theorem is inapplicable to a set of all truths.

This is an absolute masterwork of crankery! It's a remarkably silly argument on so many levels.

1. The first problem is just figuring out what the heck he's talking about! When you say "Cantor's theorem", what I think of is one of Cantor's actual theorems: "For any set S, the powerset of S is larger than S." But that is clearly not what he's referring to. I did a bit of searching to make sure that this wasn't my error, but I can't find anything else called Cantor's theorem.
2. So what the heck does he mean by "Cantor's set theorem"? From his text, it appears to be a statement something like: "there is no set of all truths". The closest actual mathematical statement that I can come up with to match that is Gödel's incompleteness theorem. If that's what he means, then he's messed it up pretty badly. The closest I can come to stating incompleteness informally is: "In any formal mathematical system that's powerful enough to express Peano arithmetic, there will be statements that are true, but which cannot be proven". It's long, complex, not particularly intuitive, and it's still not a particularly good statement of incompleteness.

Incompleteness is a difficult concept, and as I've written about before, it's almost impossible to state incompleteness in an informal way. When you try to do that, it's inevitable that you're going to miss some of its subtleties. When you try to take an informal statement of incompleteness, and reason from it, the results are pretty much guaranteed to be garbage - as he's done. He's using a mis-statement of incompleteness, and trying to reason from it. It doesn't matter what he says: he's trying to show how "Cantor's set theorem" doesn't disprove his notion of theism. Whether it does or not doesn't matter: for any statement X, no matter what X is, you can't show that "Cantor's set theorem", or Gödel's incompleteness theorem, or anything else does or doesn't disprove X if the statement you're actually arguing against isn't X.

3. Ignoring his mis-identification of the supposed theorem, the way that he stated it is actually meaningless. When we talk about sets, we're using the word set in the sense of either ZFC or NBG set theory. Mathematical set theory defines what a set is, using first order predicate logic. His version of "Cantor's set theorem" talks about a set which cannot be a set!

He wants to create a set of truths. In set theory terms, that's something you'd define with the axiom of specification: you'd use a predicate ranging over your objects to select the ones in the set. What's your predicate? Truth. At best, that's going to be a second-order predicate. You can't form sets using second-order predicates! The entire idea of "the set of truths" isn't something that can be expressed in set theory.

4. Let's ignore the problems with his "Cantor's theorem" for the moment. Let's pretend that the "set of all truths" was well-defined and meaningful. How does his argument stand up? It doesn't: it's a terrible argument. It's ultimately nothing more than "Because I say so!" hidden behind a collection of impressive-sounding words. The argument, ultimately, is that the set of all truths as understood in set theory isn't the same thing as the set of all truths in theology (because he says that they're different), therefore you can't use a statement about the set of all truths from set theory to talk about the set of all truths in theology.
5. I've saved what I think is the worst for last. The entire thing is a strawman. As a religious science blogger, I get almost as much mail from atheists trying to convince me that my religion is wrong as I do from Christians trying to convert me. After doing this blogging thing for six years, I'm pretty sure that I've been pestered with every argument, both pro- and anti-theistic that you'll find anywhere. But I've never actually seen this argument used anywhere except in articles like this one, which purport to show why it's wrong. The entire argument being refuted is a total fake: no one actually argues that you should be an atheist using this piece of crap. It only exists in the minds of crusading religious folk who prop it up and then knock it down to show how smart they supposedly are, and how stupid the dirty rotten atheists are.
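For reference, the theorem that actually is called Cantor's theorem - the powerset statement from point 1 - is short enough to state and prove in a few lines:

```latex
% Cantor's theorem, as referenced in point 1 above.
\begin{theorem}
For any set $S$, there is no surjection $f : S \to \mathcal{P}(S)$;
hence $|S| < |\mathcal{P}(S)|$.
\end{theorem}
\begin{proof}
Given any $f : S \to \mathcal{P}(S)$, let $D = \{x \in S \mid x \notin f(x)\}$.
If some $d \in S$ had $f(d) = D$, then $d \in D \iff d \notin f(d) = D$,
a contradiction. So no $f$ is surjective.
\end{proof}
```

Notice that this says nothing remotely like "there is no set of all truths" - which is the point of the list above.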

Let's Get Rid of Zero!

One of my tweeps sent me a link to a delightful pile of rubbish: a self-published "paper" by a gentleman named Robbert van Dalen that purports to solve the "problem" of zero. It's an amusing pseudo-technical paper that defines a new kind of number which doesn't work without the old numbers, and which gets rid of zero.

Before we start, why does Mr. van Dalen want to get rid of zero?

So what is the real problem with zeros? Zeros destroy information.

That is why they don’t have a multiplicative inverse: because it is impossible to rebuilt something you have just destroyed.

Hopefully this short paper will make the reader consider the author’s firm believe that: One should never destroy anything, if one can help it.

We practically abolished zeros. Should we also abolish simplifications? Not if we want to stay practical.

There's nothing I can say to that.

So what does he do? He defines a new version of both integers and rational numbers. The new integers are called accounts, and the new rationals are called super-rationals. According to him, these new numbers get rid of that naughty information-destroying zero. (He doesn't bother to define real numbers in his system; I assume that either he doesn't know or doesn't care about them.)

Before we can get to his definition of accounts, he starts with something more basic, which he calls "accounting naturals".

He doesn't bother to actually define them - he handwaves his way through, and sort-of defines addition and multiplication, with:

Addition
a + b = a concat b
Multiplication
a * b = a concat a concat ... concat a (with b repetitions of a)

So... a sloppy definition of positive integer addition, and a handwave for multiplication.

What can we take from this introduction? Well, our author can't be bothered to define basic arithmetic properly. What he really wants to say is, roughly, Peano arithmetic, with 0 removed. But my guess is that he has no idea what Peano arithmetic actually is, so he handwaves. The real question is, why did he bother to include this at all? My guess is that he wanted to pretend that he was writing a serious math paper, and he thinks that real math papers define things like this, so he threw it in, even though it's pointless drivel.

With that rubbish out of the way, he defines an "Account", his new magical integer, as a pair of "accounting naturals". The first member of the pair is called the credit, and the second is the debit. If the credit is a and the debit is b, then the account is written (a%b). (He used backslash instead of percent; but that caused trouble for my wordpress config, so I switched to percent-sign.)

Addition
a%b ++ c%d = (a+c)%(b+d)
Multiplication
a%b ** c%d = ((a*c)+(b*d))%((a*d)+(b*c))
Negation
- a%b = b%a

So... for example, consider 5*6. We need an "account" for each: We'll use (7%2) for 5, and (9%3) for 6, just to keep things interesting. That gives us: 5*6 = (7%2)*(9%3) = (63+6)%(21+18) = 69%39, or 30 in regular numbers.

Yippee, we've just redefined multiplication in a way that makes us use good old natural-number multiplication, only now we need to do it four times, plus two additions, to multiply two numbers! Wow, progress! (Of a sort. I suppose that if you're a cloud computing provider renting out CPUs, then this would be progress.)

Oh, but that's not all. See, each of these "accounts" isn't really a number. The numbers are equivalence classes of accounts. So once you get the result, you "simplify" it, to make it easier to work with.

So make that 4 multiplications, 2 additions, and one subtraction. Yeah, this is looking nice, huh?
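The arithmetic above is easy to check mechanically. Here's a sketch in Python - the (credit, debit) tuple representation and the function names are mine - verifying the 5*6 example:

```python
# van Dalen's "account" arithmetic, with an account (a%b) represented
# as the Python tuple (a, b): (credit, debit).

def acc_add(a, b):
    # a%b ++ c%d = (a+c)%(b+d)
    return (a[0] + b[0], a[1] + b[1])

def acc_mul(a, b):
    # a%b ** c%d = ((a*c)+(b*d)) % ((a*d)+(b*c))
    return (a[0] * b[0] + a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def acc_neg(a):
    # -(a%b) = b%a
    return (a[1], a[0])

def simplify(a):
    # Pick the canonical member of the equivalence class: subtract
    # the common part out of credit and debit (one subtraction).
    m = min(a)
    return (a[0] - m, a[1] - m)

def value(a):
    # My shorthand for reading an account as an ordinary integer.
    return a[0] - a[1]

five, six = (7, 2), (9, 3)      # 5 as 7%2, 6 as 9%3
product = acc_mul(five, six)
print(product)                  # (69, 39), i.e. 69%39
print(value(product))           # 30 -- four multiplications and two additions later
```

Running it confirms the worked example: (7%2)*(9%3) = 69%39, which simplifies to 30%0, i.e. plain old 30.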

So... what does it give us?

As far as I can tell, absolutely nothing. The author promises that we're getting rid of zero, but it sure looks like this has zeros: 1%1 is zero, isn't it? (And even if we pretend that there is no zero, since Mr. van Dalen specifically doesn't define division on accounts, we don't even get anything nice like closure.)

But here's where it gets really rich. See, this is great, cuz there's no zero. But as I just said, it looks like 1%1 is 0, right? Well it isn't. Why not? Because he says so, that's why! Really. Here's a verbatim quote:

An Account is balanced when Debit and Credit are equal. Such a balanced Account can be interpreted as (being in the equivalence class of) a zero but we won’t.

Yeah.

But, according to him, we don't actually get to see these glorious benefits of no zero until we add rationals. But not just any rationals, dum-ta-da-dum-ta-da! super-rationals. Why super-rationals, instead of account rationals? I don't know. (I'm imagining a fraction with blue tights and a red cape, flying over a building. That would be a lot more fun than this little "paper".)

So let's look at the glory that is super-rationals. Suppose we have two accounts, e = a%b and f = c%d. Then a "super-rational" is a ratio like e/f.

So... we can now define arithmetic on the super-rationals:

Addition
e/f +++ g/h = ((e**h)++(g**f))/(f**h); or in other words, pretty much exactly what we normally do to add two fractions. Only now those multiplications are much more laborious.
Multiplication
e/f *** g/h = (e**g)/(f**h); again, standard rational mechanics.
Multiplication Inverse (aka Reciprocal)
`(e/f) = f/e; he introduces this hideous notation - backquote for reciprocal - for no apparent reason. Why? I guess for the same reason that he did ++ and +++ - aka, no particularly good reason.

So, how does this actually help anything?

It doesn't.

See, zero is no longer properly defined, and that's what he wants to accomplish. We've got the simplified integer 0 (aka "balance"), defined as 1%1. We've got a whole universe of rational pseudo-zeros - 0/1, 0/2, 0/3, 0/4 - all of which are distinct. In this system, (1%1)/(4%2) (aka 0/2) is not the same thing as (1%1)/(5%2) (aka 0/3)!
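Worked out concretely - a Python sketch, with an account (a%b) as the tuple (a, b) and a super-rational as a pair of accounts (my representation, not his):

```python
# The "super-rationals" and their pseudo-zeros, checked mechanically.

def simplify(a):
    # Canonical member of an account's equivalence class.
    m = min(a)
    return (a[0] - m, a[1] - m)

def sr_canonical(x):
    # Simplify numerator and denominator of a super-rational.
    return (simplify(x[0]), simplify(x[1]))

def sr_recip(x):
    # The backquote "reciprocal": swap numerator and denominator.
    return (x[1], x[0])

# The two pseudo-zeros from the text: (1%1)/(4%2) and (1%1)/(5%2).
zero_over_2 = ((1, 1), (4, 2))
zero_over_3 = ((1, 1), (5, 2))

print(sr_canonical(zero_over_2))  # ((0, 0), (2, 0)), i.e. "0/2"
print(sr_canonical(zero_over_3))  # ((0, 0), (3, 0)), i.e. "0/3" -- a *different* zero

# Reciprocals are total, so the reciprocal of "0/2" is "2/0".
print(sr_canonical(sr_recip(zero_over_2)))  # ((2, 0), (0, 0))
```

So the system really does contain infinitely many distinct zero-like values, exactly as claimed.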

The "advantage" of this is that if you work through this stupid arithmetic, you essentially get something sort-of close to 0/0 = 0. Kind-of. (There's no rule for converting a super-rational to an account; but assuming that if the denominator is 1, you can eliminate it, you get 1/0 = 0.)

I'm guessing that he intends identities to apply, so: (4%1)/(1%1) = ((4%1)/(2%1)) *** ((2%1)/(1%1)) = ((4%1)/(2%1)) *** ((1%1)/(2%1)) = (1%1)/(2%1). So 1/0 = 0/1 = 0... If you do the same process with 2/0, you end up getting the result being 0/2. And so on. So we've gotten closure over division and reciprocal by getting rid of zero, and replacing it with an infinite number of non-equal pseudo-zeros.

What's his answer to that? Of course, more hand-waving!

Note that we also can decide to simplify a Super-Rational as we would a Rational by calculating the Greatest Common Divisor (GCD) between Numerator and Denominator (and then divide them by their GCD). There is a catch, but we leave that for further research.

The catch that he just waved away? Exactly what I just pointed out - an infinite number of pseudo-0s, unless, of course, you admit that there is a zero, in which case they all collapse down to be zero... in which case this is all pointless.

Essentially, this is all a stupidly overcomplicated way of saying something simple, but dumb: "I don't like the fact that you can't divide by zero, and so I want to define x/0=0."

Why is that stupid? Because dividing by zero is undefined for a reason: it doesn't mean anything! The nonsense of it becomes obvious when you really think about identities. If 4/2 = 2, then 2*2=4; if x/y=z, then x=z*y. But mix zero into that: if 4/0 = 0, then 0*0=4. That's nonsense.

You can also see it by rephrasing division in English. Asking "what is four divided by two?" is asking "If I have 4 apples, and I want to distribute them into 2 equal piles, how many apples will be in each pile?". Try that with zero: "I want to distribute 4 apples into 0 piles; how many apples will there be in each pile?" You're not distributing the apples at all. You can't, because there are no piles to distribute them to. That's exactly the point: you can't divide by zero.

If you do as Mr. van Dalen did, and basically define x/0 = 0, you end up with a mess. You can handwave your way around it in a variety of ways - but they all end up breaking things. In the case of this account nonsense, you end up replacing zero with an infinite number of pseudo-zeros which aren't equal to each other. (Or, if you define the pseudo-zeros as all being equal, then you end up with a different mess, where (2/0)/(4/0) = 2/4, or other weirdness, depending on exactly how you define things.)

The other main approach is another pile of nonsense I wrote about a while ago, called nullity. Zero is an inevitable necessity to make numbers work. You can hate the fact that division by zero is undefined all you want, but the fact is, it's both necessary and right. Division by zero doesn't mean anything, so mathematically, division by zero is undefined.

For every natural number N, there's a Cantor Crank C(N)

More crankery? Of course! What kind? What else? Cantor crankery!

It's amazing that so many people are so obsessed with Cantor. Cantor just gets under people's skin, because it feels wrong. How can there be more than one infinity? How can it possibly make sense?

As usual in math, it all comes down to the axioms. In most math, we're working from a form of set theory - and the consequences of the axioms of set theory are quite clear: given the way that we define numbers, and the way that we define sizes, this is the way it is.

Today's crackpot doesn't understand this. But interestingly, the focus of his problem with Cantor isn't the diagonalization. He thinks Cantor went wrong way before that: Cantor showed that the set of even natural numbers and the set of all natural numbers are the same size!
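The result the crank objects to is just the observation that the map n → 2n pairs the naturals with the evens exactly, with nothing left over. That's trivially checkable on any initial segment; here's a Python sketch:

```python
# The bijection at issue: n <-> 2n pairs every natural number with
# exactly one even natural, and vice versa.

def to_even(n):
    return 2 * n

def from_even(m):
    assert m % 2 == 0
    return m // 2

# Check the pairing on an initial segment: one-to-one, with matching inverses.
naturals = range(100)
evens = [to_even(n) for n in naturals]
assert len(set(evens)) == len(evens)                  # no natural maps to two evens
assert [from_even(m) for m in evens] == list(naturals)  # the inverse recovers every natural
```

The whole point of Cantor's definition is that "same size" means exactly this: such a pairing exists.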

Unfortunately, his original piece is written in Portuguese, and I don't speak Portuguese, so I'm going from a translation, here.

The Brazilian philosopher Olavo de Carvalho has written a philosophical “refutation” of Cantor’s theorem in his book “O Jardim das Aflições” (“The Garden of Afflictions”). Since the book has only been published in Portuguese, I’m translating the main points here. The enunciation of his thesis is:

Georg Cantor believed to have been able to refute Euclid’s fifth common notion (that the whole is greater than its parts). To achieve this, he uses the argument that the set of even numbers can be arranged in biunivocal correspondence with the set of integers, so that both sets would have the same number of elements and, thus, the part would be equal to the whole.

And his main arguments are:

It is true that if we represent the integers each by a different sign (or figure), we will have a (infinite) set of signs; and if, in that set, we wish to highlight with special signs, the numbers that represent evens, then we will have a “second” set that will be part of the first; and, being infinite, both sets will have the same number of elements, confirming Cantor’s argument. But he is confusing numbers with their mere signs, making an unjustifiable abstraction of mathematical properties that define and differentiate the numbers from each other.

The series of even numbers is composed of evens only because it is counted in twos, i.e., skipping one unit every two numbers; if that series were not counted this way, the numbers would not be considered even. It is hopeless here to appeal to the artifice of saying that Cantor is just referring to the “set” and not to the “ordered series”; for the set of even numbers would not be comprised of evens if its elements could not be ordered in twos in an increasing series that progresses by increments of 2, never of 1; and no number would be considered even if it could be freely swapped in the series of integers.

He makes two arguments, but they both ultimately come down to: "Cantor contradicts Euclid, and his argument just can't possibly make sense, so it must be wrong".

The problem here is: Euclid, in "The Elements", wrote several different collections of basic statements as part of his axioms. One of them was the following five rules, the "common notions":

1. Things which are equal to the same thing are also equal to one another.
2. If equals be added to equals, the wholes are equal.
3. If equals be subtracted from equals, the remainders are equal.
4. Things which coincide with one another are equal to one another.
5. The whole is greater than the part.

The problem that our subject has is that Euclid's axiom isn't an axiom of mathematics. Euclid proposed it, but it doesn't work in number theory as we formulate it. When we do math, the axioms that we start with do not include this axiom of Euclid.

In fact, Euclid's axioms aren't what modern math considers axioms at all. These aren't really primitive ground statements. Most of them are statements that are provable from the actual axioms of math. For example, the second and third axioms are provable using the axioms of Peano arithmetic. The fourth one doesn't appear to be a statement about numbers at all; it's a statement about geometry. And in modern terms, the fifth one is either a statement about geometry, or a statement about measure theory.

The first argument is based on some strange notion of signs distinct from numbers. I can't help but wonder if this is an error in translation, because the argument is so ridiculously shallow. Basically, it concedes that Cantor is right if we're considering the representations of numbers, but then goes on to draw a distinction between representations ("signs") and the numbers themselves, and argues that for the numbers, the argument doesn't work. That's the beginning of an interesting argument: numbers and the representations of numbers are different things. It's definitely possible to make profound mistakes by confusing the two. You can prove things about representations of numbers that aren't true about the numbers themselves. Only he doesn't actually bother to make an argument beyond simply asserting that Cantor's proof only works for the representations.

That's particularly silly because Cantor's proof that the even naturals and the naturals have the same cardinality doesn't talk about representation at all. It shows that there's a 1 to 1 mapping between the even naturals and the naturals. Period. No "signs", no representations.

The second argument is, if anything, even worse. It's almost the rhetorical equivalent of sticking his fingers in his ears and shouting "la la la la la". Basically, he says that when you're producing the set of even naturals, you're skipping things. And if you're skipping things, those things can't possibly be in the set that doesn't include the skipped things. And if there are things that got skipped and left out, well, that means that it's ridiculous to say that the set that included the left out stuff is the same size as the set that omitted the left out stuff, because, well, stuff got left out!

Here's the point. Math isn't about intuition. The properties of infinitely large sets don't make intuitive sense. That doesn't mean that they're wrong. Things in math are about formal reasoning: starting with a valid inference system and a set of axioms, and then using the inference to reason. If we look at set theory, we use the axioms of ZFC. And using the axioms of ZFC, we define the size (or, technically, the cardinality) of sets. Using that definition, two sets have the same cardinality if and only if there is a one-to-one mapping between the elements of the two sets. If there is, then they're the same size. Period. End of discussion. That's what the math says.

Cantor showed, quite simply, that there is such a mapping. As a set comprehension, it's just:

{ (n, 2n) : n ∈ N }

There it is. It exists. It's simple. It works, by the axioms of Peano arithmetic and the axiom of comprehension from ZFC. It doesn't matter whether it fits your notion of "the whole is greater than the part". The entire proof is that set comprehension. It exists. Therefore the two sets have the same size.
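To make the mapping concrete, here's a quick Python sketch of it - not part of Cantor's proof, obviously, just an illustration of the pairing f(n) = 2n, spot-checked on a finite prefix:

```python
# Cantor's pairing between the naturals and the even naturals:
# each natural n is matched with the even number 2n.
def f(n):
    return 2 * n

def f_inverse(m):
    # Defined on exactly the even naturals.
    assert m % 2 == 0
    return m // 2

# Check the pairing on a finite prefix: it's one-to-one (distinct
# naturals map to distinct evens) and onto (every even number 2n
# is hit by exactly one natural, namely n).
naturals = list(range(10))
evens = [f(n) for n in naturals]
assert evens == [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
assert [f_inverse(m) for m in evens] == naturals
```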

Everyone should program, or Programming is Hard? Both!

I saw something on twitter a couple of days ago, and I promised to write this blog post about it. As usual, I'm behind on all the stuff I want to do, so it took longer to write than I'd originally planned.

My professional specialty is understanding how people write programs. Programming languages, development environments, code management tools, code collaboration tools, etc. - that's my bread and butter.

So, naturally, this ticked me off.

The article starts off by, essentially, arguing that most of the programming tutorials on the web stink. I don't entirely agree with that, but to me, it's not important enough to argue about. But here's where things go off the rails:

But that's only half the problem. Victor thinks that programming itself is broken. It's often said that in order to code well, you have to be able to "think like a computer." To Victor, this is absurdly backwards-- and it's the real reason why programming is seen as fundamentally "hard." Computers are human tools: why can't we control them on our terms, using techniques that come naturally to all of us?

For some reason, so many people have this bizarre idea that programming is this really easy thing that programmers just make difficult out of spite or elitism or cluelessness or something, I'm not sure what. And as long as I've been in the field, there's been a constant drumbeat of people saying that it's all easy, that programmers just want to make it difficult by forcing you to think like a machine. That what we really need to do is just humanize programming, and then it will all be easy and everyone will do it and the world will turn into a perfect computing utopia.

First, the whole "think like a machine" thing is a verbal shorthand that attempts to make programming as we do it sound awful: it's not just that programming is hard, but those damned engineers are claiming that you need to dehumanize yourself to do it!

To be a programmer, you don't need to think like a machine. But you do need to understand how machines work - because what you're really doing is building a machine!

When you're writing a program, on a theoretical level, what you're doing is designing a machine that performs some mechanical task. That's really what a program is: it's a description of a machine. And what a programming language is, at heart, is a specialized notation for describing a particular kind of machine.

No one will go to an automotive engineer, and tell him that there's something wrong with the way transmissions are designed, because they make you understand how gears work. But that's pretty much exactly the argument that Victor is making.

How hard is it to program? That all depends on what you're trying to do. Here's the thing: the complexity of the machine that you need to build is what determines the complexity of the program. If you're trying to build a really complex machine, then a program describing it is going to be really complex.

Period. There is no way around that. That is the fundamental nature of programming.

In the usual argument, one thing that I constantly see is something along the lines of "programming isn't plumbing: everyone should be able to do it". And my response to that is: of course - just like everyone should be able to do their own plumbing.

That sounds like an amazingly stupid thing to say. Especially coming from me: the one time I tried to fix my broken kitchen sink, I did over a thousand dollars worth of damage.

But: plumbing isn't just one thing. It's lots of related but different things:

• There are people who design plumbing systems for managing water distribution and waste disposal for an entire city. That's one aspect of plumbing. And that's an incredibly complicated thing to do, and I don't care how smart you are: you're not going to be able to do it well without learning a whole lot about how plumbing works.
• Then there are people who design the plumbing for a single house. That's plumbing, too. That's still hard, and requires a lot of specialized knowledge, most of which is pretty different from the city designer's.
• There are people who don't design plumbing, but are able to build the full plumbing system for a house from scratch using plans drawn by a designer. Once again, that's still plumbing. But it's yet another set of knowledge and skills.
• There are people who can come into a house when something isn't working, and without ever seeing the design, and figure out what's wrong, and fix it. (There's a guy in my basement right now, fixing a drainage problem that left my house without hot water, again! He needed to do a lot of work to learn how to do that, and there's no way that I could do it myself.) That's yet another set of skills and knowledge - and it's still plumbing.
• There are non-professional people who can fix leaky pipes, and replace damaged bits. With a bit of work, almost anyone can learn to do it. Still plumbing. But definitely: everyone really should be able to do at least some of this.

• And there are people like me who can use a plumbing snake and a plunger when the toilet clogs. That's still plumbing, but it requires no experience and no training, and absolutely everyone should be able to do it, without question.

All of those things involve plumbing, but they require vastly different amounts and kinds of training and experience.

Programming is exactly the same. There are different kinds of programming, which require different kinds of skills and knowledge. The tools and training methods that we use are vastly different for those different kinds of programming - so different that for many of them, people don't even realize that they are programming. Almost everyone who uses computers does do some amount of programming:

• When someone puts together a presentation in powerpoint, with things that move around, appear, and disappear on your command: that is programming.
• When someone puts formulas into a spreadsheet: that is programming.
• When someone builds a website - even a simple one - and uses either a set of tools, or CSS and HTML, to put the site together: that is programming.
• When someone writes a macro in Word or Excel: that is programming.
• When someone sets up an autoresponder to answer their email while they're on vacation: that is programming.

People like Victor completely disregard those things as programming, and then gripe about how all programming is supercomplexmagicalsymbolic gobbledygook. Most people do write programs without knowing about it, precisely because they're doing it with tools that present the programming task as something that's so natural to them that they don't even recognize that they are programming.

But on the other hand, the idea that you should be able to program without understanding the machine you're using or the machine that you're building: that's also pretty silly.

When you get beyond the surface, and start to get to doing more complex tasks, programming - like any skill - gets a lot harder. You can't be a plumber without understanding how pipe connections work, what the properties of the different pipe materials are, and how things flow through them. You can't be a programmer without understanding something about the machine. The more complicated the kind of programming task you want to do, the more you need to understand.

Someone who does Powerpoint presentations doesn't need to know very much about the computer. Someone who wants to write spreadsheet macros needs to understand something about how the computer processes numbers, what happens to errors in calculations that use floating point, etc. Someone who wants to build an application like Word needs to know a whole lot about how a single computer works, including details like how the computer displays things to people. Someone who wants to build Google doesn't need to know how computers render text clearly on the screen, but they do need to know how computers work, and also how networks and communications work.
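That floating point point deserves a concrete illustration. Here's the classic example, in Python - exactly the kind of surprise that bites people who write spreadsheet macros without understanding how computers represent numbers:

```python
import math

# 0.1 and 0.2 have no exact binary floating point representation,
# so the computed sum isn't exactly 0.3:
total = 0.1 + 0.2
print(total)          # 0.30000000000000004
print(total == 0.3)   # False

# The usual fix: compare within a tolerance instead of exactly.
print(math.isclose(total, 0.3))  # True
```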

To be clear, I don't think that Victor is being dishonest. But the way that he presents things often does come off as dishonest, which makes it all the worse. To give one demonstration, he presents a comparison of how we teach programming to how we teach cooking. In it, he talks about how we'd teach people to make a soufflé. He shows a picture of raw ingredients on one side, and a fully baked soufflé on the other, and says, essentially: "This is how we teach people to program. We give them the raw ingredients, and say fool around with them until you get the soufflé."

The thing is: that's exactly how we really teach people to cook - he's just taken it far out of context. If we want them to be able to prepare exactly one recipe, then we give them complete, detailed, step-by-step instructions. But once they know the basics, we don't do that anymore. We encourage them to start fooling around. "Yeah, that soufflé is great. But what would happen if I steeped some cardamom in the cream? What if I left out the vanilla? Would it turn out as good? Would that be better?" In fact, if you never do that experimentation, you'll probably never learn to make a truly great soufflé! Because the ingredients are never exactly the same, and the way that it turns out is going to depend on the vagaries of your oven, the weather, the particular batch of eggs that you're using, the amount of gluten in the flour, and so on.

Writing complicated programs is complicated. To write a program that manipulates symbolic data, you need to understand how the data symbolizes things. To write a program that manipulates numbers, you need to understand how the numbers work, and how the computer represents them. To build a machine, you need to understand the machine that you're building. It's that simple.

There's always more Cantor crackpottery!

I'm not the only one who gets mail from crackpots!

A kind reader forwarded me yet another bit of Cantor crackpottery. It never ceases to amaze me how many people virulently object to Cantor, and how many of them just spew out the same exact rubbish, somehow thinking that they're different from all the others who made the same argument.

This one is yet another variation on the representation scheme. That is, it's an argument that you can write out all of the real numbers whose decimal forms have one digit after the decimal point; then all of the reals with two digits; then all of them with 3 digits; etc. This will produce an enumeration, and therefore there's a one-to-one mapping from the naturals to the reals. Presto, Cantor goes out the window.

Or not.

As usual, the crank starts off with a bit of pomposity:

Dear Colleague,

My mathematic researshes lead me to resolve the continuum theory of Cantor, subject of controversy since a long time.

This mail is made to inform the mathematical community from this work, and share the conclusions.

You will find in attachment extracts from my book "Théorie critique fondamentale des ensembles de Cantor",

Inviting you to contact me,

Francis Collot,

Member of the American mathematical society

Membre de la société mathématique de France

Member of the Bulletin of symbolic logic

Director of éditions européennes

As a quick aside, I love how he signs his email "Member of the AMS", as if that were something meaningful. The AMS is a great organization - but anyone can be a member. All you need to do is fill out a form, and write them a check. It's not something that anyone sane or reasonable brags about, because it doesn't mean anything.

Anyway, let's move on. Here's the entirety of his proof. I've reproduced the formatting as well as I could; the original document sent to me was a PDF, so the tables don't cut-and-paste.

The well-order on the set of real numbers result from this remark that it is possible to build, after the comma, a set where each subset has the same number of ordered elements (as is ordered the subset 2 : 10 …13 … 99).

Each successive integer is able to be followed after the comma (in french the real numbers have one comma after the integer) by an increasing number of figures.

0,0   0,10   0,100
0,1   0,11   0,101
0,2   0,12   0,102
…     …      …
0,9   0,99   0,999

It is the same thing for each successive interger before the comma.

1 2 3

So it is the 2 infinite of real number.

For this we use the binary notation.

But Cantor and his disciples never obtained this simple result.

After that, the theory displays that the infinity is the asymptote of the two branches of the hyperbole thanks to an introduction of trigonometry notions.

The successive numbers which are on cotg (as 1/2, 1/3, 1/4, 1/5) never attain 0 because it would be necessary to write instead (1/2, 1/3, 1/4, 1/4 ).

The 0 of the cotg is also the origin of the asymptote, that is to say infinite.

The beginning is, pretty much, a typical example of the representational crankery. It's roughly a restatement of, for example, John Gabriel and his decimal trees. The problem with it is simple: this kind of enumeration will enumerate all of the real numbers with finite length representations. Which means that the total set of values enumerated by this won't even include all of the rational numbers, much less all of the real numbers.
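You can see what's missing with a short sketch - my own illustration, not anything from Mr. Collot's document. Enumerate every number in [0, 1) with at most k digits after the decimal point, and notice that 1/3 = 0.333…, which has no finite decimal representation, never shows up, no matter how large k gets:

```python
from fractions import Fraction

def finite_decimals(k):
    """All values in [0, 1) with at most k digits after the decimal point."""
    return {Fraction(d, 10 ** k) for d in range(10 ** k)}

# 1/2 = 0.5 and 1/4 = 0.25 have terminating decimals, so they show up:
assert Fraction(1, 2) in finite_decimals(1)
assert Fraction(1, 4) in finite_decimals(2)

# But 1/3 = 0.333... has no finite decimal form, so it never appears,
# no matter how many digits we allow:
for k in range(1, 5):
    assert Fraction(1, 3) not in finite_decimals(k)
```

So the enumeration never even reaches all of the rationals, let alone the reals.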

(As an interesting aside: you can see a beautiful example of what Mr. Collot missed by looking at Conway's introduction to the surreal numbers, On Numbers and Games, which I wrote about here. He specifically deals with this problem in terms of "birthdays" and the requirement to include numbers who have an infinite birthday, and thus an infinite representation in the surreal numbers.)

After the enumeration stuff, he really goes off the rails. I have no idea what that asymptote nonsense is supposed to mean. I think part of the problem is that Mr. Collot isn't very good at English, but the larger part of it is that he's an incoherent crackpot.

Mathematical Illiteracy in the NYT

I know I'm late to the game here, but I can't resist taking a moment to dive in to the furor surrounding yesterday's appalling NY Times op-ed, "Is Algebra Necessary?". (Yesterday was my birthday, and I just couldn't face reading something that I knew would make me so angry.)

In case you haven't seen it yet, let's start with a quick look at the argument:

A typical American school day finds some six million high school students and two million college freshmen struggling with algebra. In both high school and college, all too many students are expected to fail. Why do we subject American students to this ordeal? I’ve found myself moving toward the strong view that we shouldn’t.

My question extends beyond algebra and applies more broadly to the usual mathematics sequence, from geometry through calculus. State regents and legislators — and much of the public — take it as self-evident that every young person should be made to master polynomial functions and parametric equations.

There are many defenses of algebra and the virtue of learning it. Most of them sound reasonable on first hearing; many of them I once accepted. But the more I examine them, the clearer it seems that they are largely or wholly wrong — unsupported by research or evidence, or based on wishful logic. (I’m not talking about quantitative skills, critical for informed citizenship and personal finance, but a very different ballgame.)

Already, this is a total disgrace. The number of cheap fallacies in this little excerpt is just amazing. To point out a couple:

1. Blame the victim. We do a lousy job teaching math. Therefore, math is bad, and we should stop teaching it.
2. Obfuscation. The author really wants to make it look like math is really terribly difficult. So he chooses a couple of silly phrases that take simple mathematical concepts, and express them in ways that make them sound much more complicated and difficult. It's not just algebra: It's polynomial functions and parametric equations. What do those two terms really mean? Basically, "simple algebra". "Parametric equations" means "equations with variables". "Polynomial equations" means equations that include variables with exponents. Which are, of course, really immensely terrifying and complex things that absolutely never come up in the real world. (Except, of course, in compound interest, investment, taxes, mortgages....)
3. Qualification. The last paragraph essentially says "There are no valid arguments to support the teaching of math, except for the valid ones, but I'm going to exclude those."

And from there, it just keeps getting worse. The bulk of the argument can be reduced to the first point above: lots of students fail high-school level math, and therefore, we should give up and stop teaching it. It repeats the same thing over and over again: algebra is so terribly hard, students keep failing it, and it's just not useful (except for all of the places where it is).

One way of addressing the stupidity of this is to just take what the moron says, and try applying it to any other subject:

A typical American school day finds some six million high school students and two million college freshmen struggling with grammatical writing. In both high school and college, all too many students are expected to fail. Why do we subject American students to this ordeal? I’ve found myself moving toward the strong view that we shouldn’t.

My question extends beyond just simple sentence construction and applies more broadly to the usual English sequence - from basic sentence structure and grammar through writing full-length papers and essays. State regents and legislators — and much of the public — take it as self-evident that every young person should be made to master rhetoric, thesis construction, and logical synthesis.

Would any newspaper in the US, much less one as obsessed with its own status as the New York Times, ever consider publishing an article like that claiming that we shouldn't bother to teach students to write? It's an utter disgrace, but in America, land of the mathematically illiterate, this is an acceptable, respectable argument when applied to mathematics.

Is algebra really so difficult? No.

But more substantially, is it really useful? Yes. Here's a typical example, from real life. My wife and I bought our first house back in 1997. Two years later, the interest rate had gone down by a substantial amount, so we wanted to refinance. We had a choice between two refinance plans: one had an interest rate 1/4% lower, but required pre-paying 2% of the principal in a special lump interest payment. Which mortgage should we have taken?

The answer is, it depends on how long we planned to own the house. The idea is that we needed to figure out when the amount of money saved by the lower interest rate would exceed the 2% pre-payment.

How do you figure that out?

Well, the amortization equation describing the mortgage is:

m = p × i × (1 + i)^n / ((1 + i)^n - 1)

Where:

• m is the monthly payment on the mortgage.
• p is the amount of money being borrowed on the loan.
• i is the interest rate per payment period.
• n is the number of payments.

Using that equation, we can compute the monthly payment. If we calculate that for both mortgages, we get two values, m1 and m2. Now, how many months before it pays off? If P is the amount of the pre-payment to get the lower interest rate, then the break-even point comes after k = P/(m1 - m2) months, where m1 - m2 is the monthly saving from the lower rate. It happened that for us, k worked out to around 60 - that is, about 5 years. We did make the pre-payment. (And that was a mistake; we didn't stay in that house as long as we'd planned.)
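If you want to see the whole thing worked out, here's a short Python sketch. The loan numbers are hypothetical, not the ones from our actual mortgage - the point is just that the computation is nothing more than the algebra described above:

```python
def monthly_payment(p, annual_rate, years):
    """Amortized monthly payment: m = p*i*(1+i)^n / ((1+i)^n - 1)."""
    i = annual_rate / 12      # interest rate per (monthly) payment period
    n = years * 12            # total number of payments
    return p * i * (1 + i) ** n / ((1 + i) ** n - 1)

# Hypothetical loan, just for illustration: $200,000 over 30 years.
principal = 200_000
m1 = monthly_payment(principal, 0.0700, 30)  # plan 1: higher rate, nothing up front
m2 = monthly_payment(principal, 0.0675, 30)  # plan 2: 1/4% lower, 2% up front
prepayment = 0.02 * principal                # the 2% lump payment

# Break-even: the month where accumulated savings cover the prepayment.
k = prepayment / (m1 - m2)
print(f"monthly savings: ${m1 - m2:.2f}; break-even after {k:.0f} months")
```

Whether the pre-payment is worth it then comes down to whether you expect to keep the house longer than k months.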

According to our idiotic author, expecting the average American to be capable of figuring out the right choice in that situation is completely unreasonable. We're screwing over students all over the country by expecting them to be capable of dealing with parametric polynomial equations like this.

Of course, the jackass goes on to talk about how we should offer courses in things like statistics instead of algebra. How on earth are you going to explain bell curves, standard deviations, margins of error, etc., without using any algebra? The guy is so totally clueless that he doesn't even understand when he's using it.

Total crap. I'm going to leave it with that, because writing about this is just making me so damned angry. Mathematical illiteracy is just as bad as any other kind of illiteracy.

The American Heat Wave and Global Warming

Global warming is a big issue. If we're honest and we look carefully at the data, it's beyond question that the atmosphere of our planet is warming. It's also beyond any honest question that the preponderance of the evidence is that human behavior is the primary cause. It's not impossible that we're wrong - but when we look at the real evidence, it's overwhelming.

Of course, this doesn't stop people from being idiots.

But what I'm going to focus on here isn't exactly the usual idiots. See, here in the US, we're in the middle of a dramatic heat wave. All over the country, we've been breaking daily temperature records. As I write this, it's 98 degrees outside here in NY, and we're expecting another couple of degrees. Out in the west, there are gigantic wildfires, caused by poor snowfall last winter, poor rainfall this spring, and record heat to dry everything out. So: is this global warming?

We're seeing lots and lots of people saying yes. Or worse, saying that it is, because of the heat wave, while pretending that they're not really saying that it is. For one, among all-too-many examples, you can look at Bad Astronomy here. Not to rag too much on Phil though, because he's just one among about two dozen different examples of this that I've seen in the last 3 days.

Weather 10 or 20 degrees above normal isn't global warming. A heat wave, even a massive epic heat wave, isn't proof that global warming is real, any more than an epic cold wave or blizzard is evidence that global warming is fake.

I'm sure you've heard many people say weather is not climate. But for human beings, it's really hard to understand just what that really means. Climate is a world-wide long-term average; weather is instantaneous and local. This isn't just a nitpick: it's a huge distinction. When we talk about global warming, what we're talking about is the year-round average temperature changing by one or two degrees. A ten degree variation in local weather doesn't tell us anything about the worldwide trend.

Global warming is about climate. And part of what that means is that in some places, global warming will probably make the weather colder. Cold weather isn't evidence against global warming. Most people realize that - which is why we all laugh when gasbags like Rush Limbaugh talk about how a snowstorm "proves" that global warming is a fraud. But at the same time, we look at weather like what we have in the US, and conclude that "Yes, global warming is real". But we're making the same mistake.

Global warming is about a surprisingly small change. Over the last hundred years, global warming is a change of about 1 degree Celsius in the global average temperature. That's about 1.8 degrees Fahrenheit, for us Americans. It seems minuscule, and it's a tiny fraction of the temperature difference that we're seeing this summer in the US.

But that tiny difference in climate can cause huge differences in weather. As I mentioned before, it can make local weather either warmer or colder - not just by directly warming the air, but by altering wind and water currents in ways that create dramatic changes.

For example, global warming could, likely, make Europe significantly colder. How? The weather in western Europe is greatly affected by an ocean water current called the atlantic conveyor. The conveyor is a cyclic ocean current, where (driven in part by the jet stream), warm water flows north from the equator in a surface current, cooling as it goes, until it finally sinks and starts to cycle back south in a deep underwater current. This acts as a heat pump, moving energy from the equator north and east to western Europe. This is why Western Europe is significantly warmer than places at the same latitude in Eastern North America.

Global warming could alter the flow of the atlantic conveyor. (We don't know if it will - but it's one possibility, which makes a good example of something counter-intuitive.) If the conveyor is slowed, so that it transfers less energy, Europe will get colder. How could the conveyor be slowed? By ice-melt. The conveyor works as a cycle because of the differences in density between warm and cold water: cold water is denser than warm water, so the cold water sinks as it cools. It warms in the tropics, gets pushed north by the jet stream, cools along the way and gradually sinks.

But global warming is melting a lot of Arctic and glacier ice, which produces freshwater. Freshwater is less dense than saltwater. So when the freshwater dilutes the cold water at the northern end of the conveyor, it reduces its density relative to the pure salt-water - and that reduces the tendency of the cold water to sink, which could slow the conveyor.

There are numerous similar phenomena that involve changes in ocean currents and wind due to relatively small temperature variations. El Niño and La Niña, conveyor changes, changes in the moisture-carrying capacity of wind currents - they're all caused by relatively small changes, well within the couple of degrees of variation that we see occurring.

But we need to be honest and careful. This summer may be incredibly hot, and we had an unusually warm winter before it - but we really shouldn't try to use that as evidence of global warming. Because if we do, when some colder-than-normal weather occurs somewhere, the cranks and liars who want to convince people that global warming is an elaborate fraud will use that to muddle things - and when they do, it'll be our fault when people fall for it, because we'll be the ones who primed them for that argument. As nice, as convenient, as convincing as it might seem to draw a correlation between a specific instance of extreme weather and global warming, we really need to stop doing it.
