## Correction, Sigma Algebras, and Mass functions

(by MarkCC) Sep 02 2013

So, I messed up a bit in the previous post. Let me get that out of the way before we move forward!

In measure theory, you aren't just working with sets. You're working with something called σ-algebras. It's a very important distinction.

The problem is, our intuition of sets doesn't always work. Sets, as defined formally, are really pretty subtle. We expect certain things to be true, because they make sense. But in fact, they are not implied by the definition of sets. A σ-algebra is, essentially, a well-behaved collection of sets - a collection whose behavior matches our usual expectations.

To be formal, a σ-algebra over a set S is a collection Σ of subsets of S such that:

1. Σ contains S itself.
2. Σ is closed over set complement.
3. Σ is closed over countable union.
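
For a finite set, the definition can be checked by brute force. Here's a minimal sketch, assuming a finite S (so that countable unions reduce to pairwise unions); the name `is_sigma_algebra` is just something I made up for illustration. It checks that S is a member of the collection, and that the collection is closed under complement and union.

```python
from itertools import combinations

def is_sigma_algebra(s, sigma):
    """Brute-force check that `sigma` is a sigma-algebra over the finite set `s`."""
    s = frozenset(s)
    sigma = {frozenset(a) for a in sigma}
    # Every member must be a subset of s, and s itself must be a member.
    if s not in sigma or any(not a <= s for a in sigma):
        return False
    # Closed under complement.
    if any(s - a not in sigma for a in sigma):
        return False
    # Closed under union (pairwise is enough for a finite collection).
    return all(a | b in sigma for a, b in combinations(sigma, 2))

S = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, S]
bad = [set(), {1}, {2}, S]   # the complement of {1} is missing

print(is_sigma_algebra(S, good))  # True
print(is_sigma_algebra(S, bad))   # False
```

Of course, the interesting cases are the infinite ones, where no brute-force check is possible; this only shows what the closure rules are asking for.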

The reason why you need to make this restriction is, ultimately, because of the axiom of choice. Using the axiom of choice, you can create sets which are unmeasurable. They're clearly subsets of a measurable set, and supersets of other measurable sets - and yet, they are, themselves, not measurable. This leads to things like the Banach-Tarski paradox: you can take a measurable set, divide it into non-measurable subsets, and then combine those non-measurable subsets back into measurable sets whose sizes seem to make no sense. You can take a sphere the size of a baseball, slice it into pieces, and then re-assemble those pieces into a sphere the size of the earth, without stretching them!

These non-measurable sets blow away our expectations about how things should behave. The restriction to σ-algebras is just a way of saying that we need to be working in a space where all sets are measurable. When we're looking at measure theory (or probability theory, where we're building on measures), we need to exclude non-measurable sets. If we don't, we're seriously up a creek without a paddle. If we allowed non-measurable sets, then the probability theory we're building would be inconsistent, and that's the kiss of death in mathematics.

Ok. So, with that out of the way, how do we actually use Kolmogorov's axioms? It all comes down to the idea of a sample space. You need to start with an experiment that you're going to observe. For that experiment, there is a set of possible outcomes. The set of all possible outcomes is the sample space.

Here's where, sadly, even axiomatized probability theory gets a bit handwavy. Given the sample space, you can define the structure of the sample space with a function, called the probability mass function, f, which maps each possible event in the sample space to a probability. To be a valid mass function for a sample space S, it's got to have the following properties:

1. For each event e in S, 0 ≤ f(e) ≤ 1.
2. The sum of the probabilities in the sample space must be 1: $\sum_{e \in S} f(e) = 1$.
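
Those two conditions are easy to check mechanically for a finite sample space. A minimal sketch, assuming the mass function is given as a dictionary from outcomes to exact fractions (the function name and the example data are mine, just for illustration):

```python
from fractions import Fraction

def is_valid_mass_function(f):
    """Check the two conditions on a mass function given as {outcome: probability}."""
    in_range = all(0 <= p <= 1 for p in f.values())
    return in_range and sum(f.values()) == 1

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
broken = {"heads": Fraction(1, 2), "tails": Fraction(2, 3)}   # sums to 7/6

print(is_valid_mass_function(fair_die))  # True
print(is_valid_mass_function(broken))    # False
```

I'm using `Fraction` rather than floating point so that the "sums to exactly 1" test isn't thrown off by rounding.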

So we wind up with a sort of circularity: in order to describe the probability of events, we need to start by knowing the probability of events. In fact, this isn't really a problem: we're talking about taking something that we observe in the real world, and mapping it into the abstract space of math. Whenever we do that, we need to take our observations of the real world and create an approximation as a mathematical model.

The point of probability theory isn't to do that primitive mapping. In general, we already understand how rolling a single die works. We know how it should behave, and we know how and why its actual behavior can vary from our expectation. What we want to know is really how many events combine.

We don't need any special theory to figure out what the probability of rolling a 3 on a six-sided die is: that's easy, and it's obvious: it's 1 in 6. But what's the probability of winning a game of craps?

If all days of the year 2001 are equally likely, then we don't need anything fancy to ask what the probability is that someone born in 2001 has a birthday on July 21st. It's easy: 1 in 365. But if I've got a group of 35 people, what's the probability of two of them sharing the same birthday?

Both of those questions start with the assignment of a probability mass function, which is trivial. But they involve combining the probabilities given by those mass functions, and using them with Kolmogorov's axioms to figure out the probabilities of the complicated events.
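
To give a taste of what that combination looks like for the birthday question, here's a minimal sketch. It assumes 365 equally likely days and independent birthdays, and it uses the standard trick of computing the probability that everyone's birthday is different, then subtracting from 1. The function name is mine.

```python
from fractions import Fraction

def shared_birthday_probability(people, days=365):
    """P(at least two of `people` share a birthday), all days equally likely."""
    all_distinct = Fraction(1)
    for i in range(people):
        # Person i must avoid the i birthdays already taken.
        all_distinct *= Fraction(days - i, days)
    return 1 - all_distinct

p = shared_birthday_probability(35)
print(float(p))  # roughly 0.81
```

So in a group of 35, a shared birthday is far more likely than not, which is not what most people's intuition says.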

## Kolmogorov's Axioms of Probability

(by MarkCC) Aug 24 2013

The way that I've talked about probability so far is mostly informal. That's the way that probability theory was treated for a long time. You defined probability spaces over collections of equal probability sets. You combined probability spaces by combining their events into other kinds of equally probable events.

The problem with that should be obvious: it's circular. You want to define the probability of events; to do that, you need to start with equally probable events, which means that on some level, you already know the probabilities. If you don't know the probabilities, you can't talk about them. The reality is somewhat worse than that, because this way of looking at things completely falls apart when you start trying to think about infinite probability spaces!

So what can you do?

The answer is to reformulate probability. Mathematicians knew about this kind of problem for a very long time, but they mostly just ignored it: probability wasn't considered a terribly interesting field.

Then, along came Kolmogorov - the same brilliant guy whose theory of computational complexity is so fascinating to me! Kolmogorov created a new formulation of probability theory. Instead of starting with a space of equally probable discrete events, you start with a measure space.

Before we can look at how Kolmogorov reformulated probability (the Kolmogorov axioms), we need to look at just what a measure space is.

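A measure space is just a set with a measure function. So let X be a set. A measure μ on X is a function from subsets of X to real numbers, $\mu: 2^X \rightarrow \mathbb{R}$, with the following properties:

• Measures are non-negative: $\forall x \subseteq X: \mu(x) \ge 0$
• The measure of the empty set is always 0: $\mu(\emptyset) = 0$
• The measure of a finite union of pairwise disjoint subsets is the sum of the individual measures: $\mu(x \cup y) = \mu(x) + \mu(y)$ whenever $x \cap y = \emptyset$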

So the idea is pretty simple: a measure space is just a way of defining the size of a subset in a consistent way.

To work with probability, you need a measure space where the measure of the entire set is 1. With that idea in mind, we can put together a proper, formal definition of a probability space that will really allow us to work with, and to combine probabilities in a rigorous way.

Like our original version, a probability space has a set of events, called its event space. We'll use F to represent the set of all possible events, and e to represent an event in that set.

There are three fundamental axioms of probability, which are going to look really similar to the three axioms of a measure space:

1. Basic measure: the probability of any event is a non-negative real number: $\forall e \in F: P(e) \ge 0$.
2. Unit measure: the probability that some event will occur is 1, which we write as $P(\Omega) = 1$ ($\Omega$ is called the unit event, and is the union of all possible events). Alternatively, the probability of no event occurring is 0: $P(\emptyset) = 0$.
3. Combination: For any two mutually exclusive events or sets of events $e$ and $f$, the probability of $e$ or $f$ is $P(e) + P(f)$: $P(e \cup f) = P(e) + P(f)$. This can be extended to any countable sequence of unions.
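
To make the axioms concrete, here's a minimal sketch that builds the probability measure for one fair six-sided die and checks all three axioms by brute force over every possible event. The names `OMEGA`, `P`, and `all_events` are my own choices for illustration, and the whole thing only works because the space is tiny and finite.

```python
from fractions import Fraction
from itertools import chain, combinations

OMEGA = frozenset(range(1, 7))            # outcomes of one fair die

def P(event):
    """Probability of an event (a set of outcomes); each outcome weighs 1/6."""
    return Fraction(len(frozenset(event) & OMEGA), len(OMEGA))

def all_events(omega):
    """Every subset of omega, as frozensets."""
    items = list(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]

events = all_events(OMEGA)
assert all(P(e) >= 0 for e in events)                  # basic measure
assert P(OMEGA) == 1 and P(frozenset()) == 0           # unit measure
assert all(P(e | f) == P(e) + P(f)                     # combination
           for e in events for f in events if not (e & f))

print(P({2, 4, 6}))  # 1/2
```

Note that the combination check only looks at pairs of events with no outcomes in common; for overlapping events, simple addition would count the shared outcomes twice.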

This is very similar to the informal version we used earlier. But as we'll see later, this simple formulation from measure theory will give us a lot of additional power.

It's worth taking a moment to point out two implications of these axioms. (In fact, I've seen some presentations that treat some of these as additional axioms, but they're provable from the first three.)

• Monotonicity: if $e \subseteq f$, then $P(e) \le P(f)$.
• Upper Bound: for any event or set of events $e$, $P(e) \le 1$.
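
Both implications fall out of the axioms in a couple of lines. Here's a sketch of the derivation, writing $e$ and $f$ for events and $\Omega$ for the unit event (the symbols are my choice of notation). The first two lines split $f$ into $e$ and the part of $f$ outside $e$, which are mutually exclusive, and then apply the combination and basic measure axioms; the last line applies monotonicity to $e \subseteq \Omega$ along with the unit measure axiom.

$$
\begin{aligned}
e \subseteq f &\implies f = e \cup (f \setminus e), \quad e \cap (f \setminus e) = \emptyset \\
&\implies P(f) = P(e) + P(f \setminus e) \ge P(e) \\
e \subseteq \Omega &\implies P(e) \le P(\Omega) = 1
\end{aligned}
$$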

The brilliance of Kolmogorov was realizing that these rules were everything you need to work out any probability you want - in both finite and infinite spaces. We'll see that there's a lot of complexity in the combinatorics of probability, but it will all always ultimately come back to these three rules.

## Infinite Cantor Crankery

(by MarkCC) Jul 29 2013

I recently got yet another email from a Cantor crank.

Sadly, it's not a particularly interesting letter. It contains an argument that I've seen more times than I can count. But I realized that I don't think I've ever written about this particular boneheaded nonsense!

I'm going to paraphrase the argument: the original is written in broken English and is hard to follow.

• Cantor's diagonalization creates a magical number ("Cantor's number") based on an infinitely long table.
• Each digit of Cantor's number is taken from one row of the table: the Nth digit is produced by the Nth row of the table.
• This means that the Nth digit only exists after processing N rows of the table.
• Suppose it takes time t to get the value of a digit from a row of the table.
• Therefore, for any natural number N, it takes N*t time to get the first N digits of Cantor's number.
• Any finite prefix of Cantor's number is a rational number, which is clearly in the table.
• The full Cantor's number doesn't exist until an infinite number of steps has been completed, at time ∞*t.
• Therefore Cantor's number never exists. Only finite prefixes of it exist, and they are all rational numbers.

The problem with this is quite simple: Cantor's proof doesn't create a number; it identifies a number.

It might take an infinite amount of time to figure out which number we're talking about - but that doesn't matter. The number, like all numbers, exists, independent of our ability to compute it. Once you accept the rules of real numbers as a mathematical framework, then all of the numbers, every possible one, whether we can identify it, or describe it, or write it down - they all exist. What a mechanism like Cantor's diagonalization does is just give us a way of identifying a particular number that we're interested in. But that number exists, whether we describe it or identify it.
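
One way to see the "identifies, not creates" point: the diagonal number is pinned down by a finite rule that answers "what is digit N?" for any N you care to ask about, without anyone having to walk the whole table first. Here's a minimal sketch. The table `sevenths` is a made-up stand-in (row n holds the decimal digits of n/7), not a real enumeration of anything interesting, and the choice of 5 and 6 as replacement digits is just one common convention.

```python
def diagonal_digit(table, n):
    """Digit n of the diagonal number: any digit that differs from row n's digit n."""
    return 5 if table(n, n) != 5 else 6

# Hypothetical table for illustration: digit `col` (after the decimal point)
# of the number row/7.
def sevenths(row, col):
    return (row * 10 ** (col + 1) // 7) % 10

prefix = [diagonal_digit(sevenths, n) for n in range(10)]
print(prefix)
```

The rule is the description of the number. Every digit is determined the moment the table is; how long it would take us to print them has nothing to do with it.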

The easiest way to show the problem here is to think of other irrational numbers. No irrational number can ever be written down completely. We know that there's got to be some number which, multiplied by itself, equals 2. But we can't actually write down all of the digits of that number. We can write down progressively better approximations, but we'll never actually write the square root of two. By the argument above against Cantor's number, we can show that the square root of two doesn't exist. If we need to create the number by writing down all of its digits, then the square root of two will never get created! Nor will any other irrational number. If you insist on writing numbers down in decimal form, then neither will many fractions. But in math, we don't create numbers: we describe numbers that already exist.
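
Those progressively better approximations are easy to produce. A minimal sketch using exact integer arithmetic (the function name is mine): ask for as many digits as you like, and you get them, but no call ever returns all of them.

```python
from math import isqrt

def sqrt2_digits(n):
    """The square root of 2, truncated to n digits after the decimal point."""
    scaled = isqrt(2 * 10 ** (2 * n))     # floor(sqrt(2) * 10**n)
    s = str(scaled)
    return s[0] + "." + s[1:]

for n in (1, 5, 10):
    print(sqrt2_digits(n))
# 1.4
# 1.41421
# 1.4142135623
```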

But we could weasel around that, and create an alternative formulation of mathematics in which all numbers must be writeable in some finite form. We wouldn't need to say that we can create numbers, but we could constrain our definitions to get rid of the nasty numbers that make things confusing. We could make a reasonable argument that those problematic real numbers don't really exist - that they're an artifact of a flaw in our logical definition of real numbers. (In fact, some mathematicians like Greg Chaitin have actually made that argument semi-seriously.)

By doing that, irrational numbers could be defined out of existence, because they can't be written down. In essence, that's what my correspondent is proposing: that the definition of real numbers is broken, and that the problem with Cantor's proof is that it's based on that faulty definition. (I don't think that he'd agree that that's what he's arguing - but either numbers exist that can't be written in a finite amount of time, or they don't. If they do, then his argument is worthless.)

You certainly can argue that the only numbers that should exist are numbers that can be written down. If you do that, there are two main paths. There's the theory of computable numbers (which allows you to keep π and the square roots), and there's the theory of rational numbers (which discards everything that can't be written as a finite fraction). There are interesting theories that build on either of those two approaches. In both, Cantor's argument doesn't apply, because in both, you've restricted the set of numbers to be a countable set.

But that doesn't say anything about the theory of real numbers, which is what Cantor's proof is talking about. In the real numbers, numbers that can't be written down in any form do exist. Numbers like the number produced by Cantor's diagonalization definitely do. The infinite time argument is a load of rubbish because it's based on the faulty concept that Cantor's number doesn't exist until we create it.

The interesting thing about this argument, to me, is its selectivity. To my correspondent, the existence of an infinitely long table isn't a problem. He doesn't think that there's anything wrong with the idea of an infinite process creating an infinite table containing a mapping between the natural numbers and the real numbers. He just has a problem with the infinite process of traversing that table. Which is really pretty silly when you think about it.

## Recipe: Sous Vide Braised Pork Belly with Chevre Polenta

(by MarkCC) Jul 01 2013

I really outdid myself with tonight's dinner. It was a total ad-lib - no recipe written in advance, just randomly trying to make something good. It turned out so good that I need to write down what I did, so that I can make it again!

## Part 1: the pork

• 2 1/2 pounds pork belly. I'm picky about pork; if I'm going to eat it, I want it to be good. I didn't grow up eating pork. My family didn't keep kosher, but we didn't bring pork into the house. To this day, I don't like most pork. Grocery store pork is, typically, bland, greasy, and generally nasty stuff. But the first real pork that I ate was at Momofuku in Manhattan. It was Berkshire pork, from a farm in upstate NY. That was delicious. Since then I've experimented, and I really think that nothing compares to fresh Berkshire. It costs a lot more than grocery store pork, but it's worth it. I order it direct from Flying Pig Farm.
• 4 cloves garlic.
• 1 teaspoon salt.
• 1 1/2 teaspoons fennel pollen.
• 1 teaspoon dried rosemary.
• 1 tablespoon olive oil.
• pepper
• 1/4 cup salt.
• 1/4 cup sugar.
1. Prepare the pork belly: trim off the skin, and any egregiously extra fat from the skin side.
2. Put the garlic, fennel pollen, rosemary, 1 teaspoon of salt, and the olive oil into a mortar and pestle, and crush them to a paste.
3. Coat the pork with the herb paste.
4. Add fresh-ground black pepper to the pork.
5. Mix together 1/4 cup each of sugar and salt, and coat the pork with it.
6. Put the pork into the fridge overnight.
7. In the morning, remove the pork from the fridge, and discard any liquids that were drawn out by the salt.
8. Seal the pork in a sous vide bag, and cook at 190 degrees for 5 hours. (If you don't have a sous vide machine, you could probably do it covered in a 200 degree oven. You'll probably want to add a bit of water.)
9. Take out the pork, and separate the meat from the liquid that's collected in the bags. (Do NOT discard it; that's pure flavor!) Put both into the fridge for a couple of hours to cool.
10. When it's cool, the fat that rendered out of the pork will have solidified - remove it, and discard it. (Or keep it for something else.)
11. Cut the pork into 2 inch thick chunks.
12. In a smoking hot cast iron pan, brown the pork chunks on all sides.
13. Add in the reserved liquids, along with 1/4 cup of port wine. Reduce until it forms a glaze over the pork. Remove the pork to a plate - it's done!

## Part 2: the Polenta

• 1 cup polenta. I use very coarse polenta - I like my polenta to have some texture. (My friend Anoop teases me, insisting that I'm making grits.)
• 4 cups chicken stock.
• 1 cup water.
• 1 teaspoon salt.
• 1 tablespoon butter.
• 2 ounces chevre goat cheese.
1. Put the salt, water, and chicken stock into a pan, and bring to a boil.
2. Reduce the heat to medium low, and stir in the polenta.
3. Cook the polenta on medium low to low heat for 1 1/2 hours.
4. Remove from heat, add in the butter, and stir until it's all melted and blended in.
5. Crumble the goat cheese in, and stir it in.

## Part 3: the assembly

1. Put a big pile of the polenta in the middle of a plate.
2. Put a couple of chunks of the glazed pork onto the polenta.
3. Put sauteed asparagus around the outside.

## Independence and Combining probabilities.

(by MarkCC) Jun 24 2013

As I alluded to in my previous post, simple probability is really a matter of observation, to produce a model. If you roll a six sided die over and over again, you'll see that each face comes up pretty much equally often, and so you model it as 1/6 probability for each face. There's not really a huge amount of principle behind the basic models: it's really just whatever works. This is the root of the distinction between interpretations: a frequentist starts with an experiment, and builds a descriptive model based on it, and says that the underlying phenomenon being tested has the model as a property; a Bayesian does almost the same thing, but says that the model describes the state of their knowledge.

Where probability starts to become interesting is when you combine things. I know the probability of outcomes for rolling one die: how can I use that to see what happens when I roll five dice together? I know the probability of drawing a specific card from a deck: what are the odds of being dealt a particular poker hand?

We'll start with the easiest part: combining independent probabilities. Two events are independent when there's no way for the outcome of one to influence the outcome of the other. For example, if you're flipping a coin several times, the result of one coin flip has no effect on the result of a subsequent flip. On the other hand, dealing 10 cards from a deck is a sequence of dependent events: once you've dealt one card, the next deal must come from the remaining cards: you can't deal the same card twice.

If you know the probability space of your trials, then recognizing an independent situation is easy: if the outcome of one trial doesn't alter the probability space of other trials, then they're independent.

Look back at the coin flip example: we know what the probability space of a coin flip looks like: it's got two, equally probable outcomes. If you've flipped a coin once, and you're going to flip another coin, the result of the first flip can't do anything that alters the probability space of a subsequent flip.

But if you think about dealing cards, that's not true. With a standard deck of cards, the initial probability space has 52 outcomes, each of which is equally likely. So the odds of being dealt the 5 of spades is exactly 1/52.

Now, suppose that you got lucky, and you did get dealt the 5 of spades on the first card. What's the probability of being dealt the 5 of spades on the second? If they were independent events, it would still be 1/52. But once you've dealt one card, you can't deal it again. The probability of being dealt the 5 of spades as the second card is 0: it's impossible. The probability space only has 51 possible outcomes, and the 5 of spades is not one of them. The space has changed. That's the definition of a dependent event.
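
The changed space is easy to see if you just build the deck and remove the card. A minimal sketch (the encoding of cards as `(rank, suit)` pairs is my own choice for illustration):

```python
from fractions import Fraction

deck = [(rank, suit) for suit in "SHDC" for rank in range(1, 14)]
target = (5, "S")                                   # the 5 of spades

p_first = Fraction(deck.count(target), len(deck))

# After the 5 of spades is dealt, the second draw comes from a 51-card space.
remaining = [card for card in deck if card != target]
p_again = Fraction(remaining.count(target), len(remaining))
p_other = Fraction(remaining.count((5, "H")), len(remaining))

print(p_first, p_again, p_other)  # 1/52 0 1/51
```

The 5 of spades drops to probability 0, and every other card goes from 1/52 to 1/51: the first deal changed the space for the second.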

When you're faced with dependent probabilities, you need to figure out how the probability space will be changed, and incorporate that into your computation. Once you've incorporated the change in the probability space of the second test, then you've got a new independent probability, and you can combine them. Figuring out how to alter the probability space can be extremely difficult, but that's what makes it interesting.

When you're dealing with independent events, it's easy to combine them. There are two basic ways of combining event probabilities, and they should be familiar from logic: event1 AND event2, and event1 OR event2.

Suppose you're looking at two tests with independent outcomes. I know that the probability of event e is P(e), and the probability of event f is P(f). Then the probability of e & f - that is, of having e as the outcome of the first trial, and f as the outcome of the second - is P(e)×P(f). The odds of flipping HTTH with a coin are (1/2)*(1/2)*(1/2)*(1/2)=(1/16).
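
You can sanity-check the multiplication rule by listing every possible sequence of four flips and counting. A minimal sketch, assuming a fair coin:

```python
from fractions import Fraction
from itertools import product

sequences = list(product("HT", repeat=4))       # all 16 equally likely outcomes
p_enumerated = Fraction(sequences.count(tuple("HTTH")), len(sequences))
p_multiplied = Fraction(1, 2) ** 4

print(p_enumerated, p_multiplied)  # 1/16 1/16
```

Counting outcomes and multiplying the per-flip probabilities give the same answer, which is exactly what independence buys you.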

If you're looking at independent alternatives - that is, the probability of e OR f, where e and f can't both happen - you combine the probabilities of the events with addition: P(e) + P(f). So, the odds of drawing any heart from a deck: for each card, it's 1/52. There are thirteen different hearts. So the odds of drawing a heart are 1/52 + 1/52 + ... = 13/52 = 1/4.
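
Here's the same addition spelled out, as a minimal sketch. Each of the thirteen hearts is its own alternative, no two of them can be the card you draw at the same time, and so their probabilities simply add. The deck encoding is my own.

```python
from fractions import Fraction

deck = [(rank, suit)
        for suit in ("spades", "hearts", "diamonds", "clubs")
        for rank in range(1, 14)]

p_card = Fraction(1, len(deck))                   # any one specific card
p_heart = sum(p_card for rank, suit in deck if suit == "hearts")

print(p_heart)  # 1/4
```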

That still doesn't get us to the really interesting stuff. We still can't quite work out something like the odds of being dealt a flush. To get there, we need to learn some combinatorics, which will allow us to formulate the probability spaces that we need for an interesting probability.

## Probability Spaces

(by MarkCC) Jun 19 2013

Sorry for the slowness of the blog lately. I finally got myself back onto a semi-regular schedule when I posted about the Adria Richards affair, and that really blew up. The amount of vicious, hateful bile that showed up, both in comments (which I moderated) and in my email was truly astonishing. I've written things which pissed people off before, and I've gotten at least my fair share of hatemail. But nothing I've written before came close to preparing me for the kind of unbounded hatred that came in response to that post.

I really needed some time away from the blog after that.

Anyway, I'm back, and it's time to get on with some discrete probability theory!

I've already written a bit about interpretations of probability. But I haven't said anything about what probability means formally. When I say that the probability of rolling a 3 with a pair of fair six-sided dice is 1/18, how do I know that? Where did that 1/18 figure come from?

The answer lies in something called a probability space. (I'm going to explain the probability space in frequentist terms, because I think that that's easiest, but there is (of course) an equivalent Bayesian description.) Suppose I'm looking at a particular experiment. In classic mathematical form, a probability space consists of three components (Ω, E, P), where:

1. Ω, called the sample space, is a set containing all possible outcomes of the experiment. For a pair of dice, Ω would be the set of all possible rolls: {(1,1), (1,2), (1,3), (1,4), (1,5), (1, 6), (2,1), ..., (6, 5), (6,6)}.
2. E is an equivalence relation over Ω, which partitions Ω into a set of events. Each event is a set of outcomes that are equivalent. For rolling a pair of dice, an event is a total - each event is the set of outcomes that have the same total. For the event "3" (meaning a roll that totalled three), the set would be {(1, 2), (2, 1)}.
3. P is a probability assignment. For each event e in E, P(e) is a value between 0 and 1, where:

   $$\sum_{e \in E} P(e) = 1$$

(That is, the sum of the probabilities of all of the possible events in the space is exactly 1.)

The probability of an event e being the outcome of a trial is P(e).

So the probability of any particular event as the result of a trial is a number between 0 and 1. What's it mean? If the probability of event e is p, then if we repeat the trial N times, we expect N*p of those trials to have e as their result. If the probability of e is 1/4, and we repeat the trial 100 times, we'd expect e to be the result 25 times.
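
For the pair of dice, all three components are small enough to build explicitly. A minimal sketch, assuming fair dice so that each of the 36 outcomes is equally likely (the variable names are mine):

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))      # sample space: 36 outcomes

events = defaultdict(set)                          # outcomes grouped by total
for roll in omega:
    events[sum(roll)].add(roll)

# Probability assignment: the share of outcomes that fall in each event.
P = {total: Fraction(len(rolls), len(omega)) for total, rolls in events.items()}

print(sorted(events[3]))   # [(1, 2), (2, 1)]
print(P[3])                # 1/18
print(sum(P.values()))     # 1
```

That's where the 1/18 comes from: two of the 36 equally likely outcomes total three.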

But in an important sense, that's a cop-out. We've defined probability in terms of this abstract model, where the third component is the probability. Isn't that circular?

Not really. For a given trial, we create the probability assignment by observation and/or analysis. The important point is that this is really just a bare minimum starting point. What we really care about in probability isn't the chance associated with a single, simple, atomic event. What we want to do is take the probability associated with a group of single events, and use our understanding of that to allow us to explore a complex event.

If I give you a well-shuffled deck of cards, it's easy to show that the odds of drawing the 3 of diamonds is 1/52. What we want to do with probability is things like ask: What are the odds of being dealt a flush in a poker hand?

The construction of a probability space gives us a well-defined platform to use for building probabilistic models of more interesting things. Given the probability spaces of two single dice, we can combine them together to create the probability space of the two dice rolled together. Given the probability space of a pair of dice, we can construct the probability space of a game of craps. And so on.

## Probability and Interpretations

(by MarkCC) May 12 2013

I'm going to do some writing about discrete probability theory. Probability is an extremely important area of math. We encounter aspects of it every day. It's also a very poorly understood area - it's one that we see abused or just fouled up every day.

I'm going to focus on discrete probability theory. What that means is that we're going to look at things where the space containing the things that we're going to look at contains a countable number of elements. The probability of getting a certain sequence of coin flips, or of getting a certain hand of cards, is described by discrete probability theory. On the other hand, the odds of a radioactive isotope decaying at a particular time require continuous probability theory.

Before getting into the details, there's one important thing to mention. When you're talking about probability, there are two fundamental schools of interpretation. There are frequentist interpretations, and there are Bayesian interpretations.

In a frequentist interpretation, when you say the probability of an event is 0.6, what you mean is that if you were to perform a series of experiments precisely reproducing the event, then on average, if you did 100 experiments, the event would occur 60 times. In the frequentist interpretation, the probability is an intrinsic property of the event. For a frequentist, it makes sense to say that there is a "real" probability associated with an event.

In a Bayesian interpretation, when you say that the probability of an event is 0.6, what you mean is that based on your current state of knowledge about the event, you have a 60% certainty that the event will occur. In a strict Bayesian interpretation, the event doesn't have any kind of intrinsic probability associated with it. The specific event that you're interested in either will occur, or it won't. There's no real probability involved. What probability measures is how certain you are about whether or not it will occur.

For example, think about flipping a fair coin.

A frequentist would say that you can flip a coin many times, and half of the time, it will land on heads. So the probability of a coin flip landing on the head of the coin is 0.5. A Bayesian would say that the coin will land either on heads or on tails. Since you don't know which, and you have no other information to use to be able to make a better prediction, you can have a certainty of 0.5 that it will land on the head of the coin.

In the real world, I think that most people are really somewhere in between.

I think that all but the most fervent Bayesians do rely on an intuitive notion of the "intrinsic" probability of an event. They may describe it in different terms, but when it comes down to it, they're using the basic frequentist notion. And I don't think that you can find a sane frequentist anywhere who won't use Bayes' theorem to update their priors in the face of new information - which is the most fundamental notion in the Bayesian interpretation.

One note before I finish this, and get started on the real meaty posts. In the past, when I've talked about probability, people have started stupid flamewars in the comments. People get downright religious about interpretations of probability. There are religious Bayesians, who think that all frequentists are stupid idiots who should be banished from the field of math; likewise, there are religious frequentists who think that Bayesians are all a crop of arrogant know-it-alls who should be sent to Siberia. I am not going to tolerate any of that nonsense. If you feel that you cannot read posts on probability without going into a diatribe about those stupid frequentists/Bayesians and their deliberately stupid ideas, please go away and don't even read these posts. If you do go into such a diatribe, I will delete your comments without any hesitation.

## Speed-Crankery

(by MarkCC) May 05 2013

A fun game to play with cranks is: how long does it take for the crank to contradict themselves?

When you're looking at a good example of crankery, it's full of errors. But for this game, it's not enough to just find an error. What we want is for them to say something so wrong that one sentence just totally tears them down and demonstrates that what they're doing makes no sense.

"The color of a clear sky is green" is, most of the time, wrong. If a crank makes some kind of argument based on the alleged fact that the color of a clear daytime sky is green, the argument is wrong. But as a statement, it's not nonsensical. It's just wrong.

On the other hand, "The color of a clear sky is steak frites with béarnaise sauce and a nice side of roasted asparagus", well... it's not even wrong. It's just nonsense.

Today's crank is a great example of this. If, that is, it's legit. I'm not sure that this guy is serious. I think this might be someone playing games, pretending to be a crank. But even if it is, it's still fun.

About a week ago, I got an email titled "I am a Cantor crank" from a guy named Chris Cuellar. The contents were:

...AND I CHALLENGE YOU TO A DUEL!! En garde!

Haha, ok, not exactly. But you really seem to be interested in this stuff. And so am I. But I think I've nailed Cantor for good this time. Not only have I come up with algorithms to count some of these "uncountable" things, but I have also addressed the proofs directly. The diagonalization argument ends up failing spectacularly, and I believe I have a good explanation for why the whole thing ends up being invalid in the first place.

And then I also get to the power set of natural numbers... I really hope my arguments can be followed. The thing I have to emphasize is that I am working on a different system that does NOT roll up cardinality and countability into one thing! As it will turn out, rational numbers are bigger than integers, integers are bigger than natural numbers... but they are ALL countable, nonetheless!

Anyway, I had started a little blog of my own a while ago on these subjects. The first post is here:

http://laymanmath.blogspot.com/2012/09/the-purpose-and-my-introduction.html

Have fun... BWAHAHAHA

So. We've got one paragraph of intro. And then everything crashes and burns in an instant.

"Rational numbers are bigger than integers, integers are bigger than natural numbers, but they are all countable". This is self-evident rubbish. The definition of "countable" says that an infinite set I is countable if, and only if, you can create a one-to-one mapping between the members of I and the natural numbers. The definition of cardinality says that if you can create a one-to-one mapping between two sets, the sets are the same size.

When Mr. Cuellar says that the set of rational numbers is bigger than the set of natural numbers, but that they are still countable... he's saying that there is not a one-to-one mapping between the two sets, but that there is a one-to-one mapping between the two sets.
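
For the record, the one-to-one mapping really does exist. One well-known way to build it for the positive rationals is the Calkin-Wilf sequence, which visits every positive rational exactly once; pairing the Nth element of the sequence with the natural number N gives the mapping. This is my choice of construction for illustration, not something from the email, and the sketch below only covers the positive rationals.

```python
from fractions import Fraction
from itertools import islice
from math import floor

def positive_rationals():
    """Yield every positive rational exactly once, in Calkin-Wilf order."""
    q = Fraction(1)
    while True:
        yield q
        # Standard successor rule for the Calkin-Wilf sequence.
        q = 1 / (2 * floor(q) - q + 1)

first = list(islice(positive_rationals(), 8))
print([str(q) for q in first])
# ['1', '1/2', '2', '1/3', '3/2', '2/3', '3', '1/4']
```

By Cantor's definition, that's all it takes for the two sets to be the same size. There's no room left over for "bigger, but still countable".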

Look - you don't get to redefine terms, and then pretend that your redefined terms mean the same thing as the original terms.

If you claim to be refuting Cantor's proof that the cardinality of the real numbers is bigger than the cardinality of the natural numbers, then you have to use Cantor's definition of cardinality.

You can change the definition of the size of a set - or, more precisely, you can propose an alternative metric for how to compare the sizes of sets. But any conclusions that you draw about your new metric are conclusions about your new metric - they're not conclusions about Cantor's cardinality. You can define a new notion of set size in which all infinite sets are the same size. It's entirely possible to do that, and to do that in a consistent way. But it will say nothing about Cantor's cardinality. Cantor's proof will still work.

What my correspondent is doing is, basically, what I did above in saying that the color of the sky is steak frites. I'm using terms in a completely inconsistent, meaningless way. Steak frites with béarnaise sauce isn't a color. And what Mr. Cuellar does is similar: he's using the word "cardinality", but whatever he means by it, it's not what Cantor meant, and it's not what Cantor's proof meant. You can draw whatever conclusions you want from your new definition, but it has no bearing on whether or not Cantor is correct. I don't even need to visit his site: he's demonstrated, in record time, that he has no idea what he's doing.

## The Gravitational Force of Rubbish

(by MarkCC) May 01 2013

Imagine, for just a moment, that you were one of a group of scientists that had proven the most important, the most profound, the most utterly amazing scientific discovery of all time. Where would you publish it?

Maybe Nature? Science? Or maybe you'd prefer to go open-access, and go with PLOS ONE? Or more mainstream, and send a press release to the NYT?

Well, in the case of today's crackpots, they bypassed all of those boring journals. They couldn't be bothered with a pompous rag like the Times. No, they went for the really serious press: America Now with Leeza Gibbons.

What did they go to this amazing media outlet to announce? The most amazing scientific discovery of all time: gravity is an illusion! There's no gravity. In fact, not just is there no gravity, but all of that quantum physics stuff? It's utter rubbish. You don't need any of that complicated stuff! No - you need only one thing: the solar wind.

A new theory on the forces that control planetary orbit refutes the 400-year old assumptions currently held by the scientific community. Scientific and engineering experts Gerhard and Kevin Neumaier have established a relationship between solar winds and a quantized order in both the position and velocity of the solar system's planets, and movement at an atomic level, with both governed by the same set of physics.

The observations made bring into question the Big Bang Theory, the concept of black holes, gravitational waves and gravitons. The Neumaiers' paper, More Than Gravity, is available for review at MoreThanGravity.com

Pretty damned impressive, huh? So let's follow their instructions, and go over to their website.

Ever since humankind discovered that the Earth and the planets revolved around the Sun, there was a question about what force was responsible for this. Since the days of Newton, science has held onto the notion that an invisible force, which we have never been able to detect, controls planetary motion. There are complicated theories about black holes that have never been seen, densities of planets that have never been measured, and subatomic particles that have never been detected.

However, it is simpler than all of that and right in front of us. The Sun and the solar wind are the most powerful forces in our solar system. They are physically moving the planets. In fact, the solar wind spins outward in a spiral at over a million miles per hour that controls the velocity and distances that planets revolve around the Sun. The Sun via the solar wind quantizes the orbits of the planets – their position and speed.

The solar wind also leads to the natural log and other phenomenon from the very large scale down to the atomic level. This is clearly a different idea than the current view that has been held for over 400 years. We have been working on this for close 50 years and thanks to satellite explorations of space have data that just was not available when theories long ago were developed. We think that we have many of the pieces but there are certainly many more to be found. We set this up as a web site, rather as some authoritative book so that there would be plenty of opportunity for dialog. The name for this web site, www.MorethanGravity.com was chosen because we believe there is far more to this subject than is commonly understood. Whether you are a scientific expert in your field or just have a general interest in how our solar system works, we appreciate your comments.

See, it's all about the solar wind. There's no such thing as gravity - that's just nonsense. The sun produces the solar wind, which does absolutely everything. The wind comes out of the sun, and spirals out from the sun. That spiral motion has eddies in it at quantized intervals, and that's where the planets are. Amazing, huh?

Remember my mantra: the worst math is no math. This is a beautiful demonstration of that.

Of course... why does the solar wind move in a spiral? Everything we know says that in the absence of a force, things move in a straight line. It can't be spiraling because of gravity, because there is no gravity. So why does it spiral? Our brilliant authors don't bother to say. What makes it spiral, instead of just move straight? Mathematically, spiral motion is very complicated. It requires a centripetal force which is smaller than the force that would produce an orbit. Where's that force in this framework? There isn't any. They just say that that's how the solar wind works, period. There are many possible spirals, with different radial velocities - which one does the solar wind follow according to this, and why? Again, no answer from the authors.

Or... why is the sun producing the solar wind at all? According to those old, stupid theories that this work of brilliance supersedes, the sun produces a solar wind because it's fusing hydrogen atoms into helium. That's happening because gravity is causing the atoms of the sun to be compressed together until they fuse. Without gravity, why is fusion happening at all? And given that it's happening, why does the sun not just explode into a supernova? We know, from direct observation, that the energy produced by fusion creates an outward force. But gravity can't be holding the sun together - so why is the sun there at all? Still, no answers.

They do, eventually, do some math. One of the big "results" of this hypothesis is about the "quantization" of the orbits of planets around the sun. They were able to develop a simple equation which predicts the locations where planets could exist in their "solar wind" system.

Let’s start with the distance between the planets and the Sun. We guessed that if the solar system was like an atom, that planetary distance would be quantized. This is to say that we thought that the planets would have definite positions and that they would be either in the position or it would be empty. In a mathematical sense, this would be represented by a numerical integer ordering (0,1,2,3,…). If the first planet, Mercury was in the 0 orbital, how would the rest of the planets line up? Amazingly well we found.

If we predict the distance from the surface of the Sun to each planet in this quantized approach, the results are astounding. If D equals the mean distance to the surface of the Sun, and d0 as the distance to Mercury, we can describe the relationship that orders the planets mathematically as:

Each planetary position can be predicted from this equation in a simple calculation as we increase the integer (or planet number) n. S is the solar factor, which equals 1.387. The solar factor is found in the differential rotation of the Sun and the profile of the solar wind which we will discuss later.

Similar to the quantized orbits that exist within an atom, the planetary bodies are either there or not. Mercury is in the zero orbital. The next orbital is missing a planet. The second, third, and fourth orbitals are occupied by Venus, Earth, and Mars respectively. The fifth orbital is missing. The sixth orbital is filled with Ceres. Ceres is described as either the largest of all asteroids or a minor planet (with a diameter a little less than half that of Pluto), depending on who describes it. Ceres was discovered in 1801 as astronomers searched for the missing planets that the Titius-Bode Law predicted would exist.

So. What they found was an exponential equation which produces very approximate versions of the sizes of the first 8 planets' orbits, as well as a couple of missing ones.
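The equation itself didn't survive in the quoted text. Going only by the description around it (exponential, an integer n, Mercury at n = 0, S = 1.387), one plausible guess at its form is D = d0 · S^n. To be clear, that form is my assumption, not something the quoted text confirms. Here's a rough sketch of what it would predict for the slots the authors name. The actual distances are standard rounded semi-major axes in AU, measured from the sun's center rather than its surface (a difference of about 0.005 AU).

```python
S = 1.387      # the "solar factor" from the quoted passage
D0_AU = 0.387  # Mercury's mean distance from the sun in AU (rounded)

# Slot numbers are the ones the quoted passage names. Distances are rounded
# semi-major axes in AU, measured from the sun's center.
NAMED_SLOTS = {
    0: ("Mercury", 0.387),
    2: ("Venus", 0.723),
    3: ("Earth", 1.000),
    4: ("Mars", 1.524),
    6: ("Ceres", 2.77),
}


def predicted_distance(n: int) -> float:
    """ASSUMED form of the missing equation: D_n = d0 * S**n."""
    return D0_AU * S ** n


def relative_error(n: int) -> float:
    """Signed error of the assumed formula relative to the actual distance."""
    actual = NAMED_SLOTS[n][1]
    return (predicted_distance(n) - actual) / actual


for n, (name, actual) in sorted(NAMED_SLOTS.items()):
    print(f"n={n} {name:8s} predicted {predicted_distance(n):.3f} AU, "
          f"actual {actual:.3f} AU, error {relative_error(n):+.1%}")
```

Under that assumed form, the named slots land within several percent of the real values. That's in the right neighborhood, which is all "very approximate" means here.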

This is, in its way, interesting. Not because they found anything, but rather because they think that this is somehow profound.

We've got 8 data points (or 9, counting the asteroid belt). More precisely, we have 9 ranges, because all of the orbits are elliptical, but the authors of this junk are producing a single number for the size of the orbits, and they can declare success if their number falls anywhere within the range from perihelion to aphelion in each of the orbits.

It would be shocking if there weren't any number of simple equations that described exactly the 9 data points of the planets' orbits.
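One way to see how little an exact fit proves: any 9 points with distinct x values are hit exactly by some polynomial of degree at most 8. Here's a small sketch using Lagrange interpolation. Numbering the bodies 0 through 8 in order is an arbitrary choice of mine for this illustration, and the distances are rounded semi-major axes in AU. A degree-8 polynomial isn't what anyone would call elegant, but it makes the point that hitting 9 numbers is not, by itself, evidence of anything.

```python
from fractions import Fraction

# (index, rounded semi-major axis in AU). The 0..8 numbering is arbitrary.
POINTS = [
    (0, Fraction("0.387")),   # Mercury
    (1, Fraction("0.723")),   # Venus
    (2, Fraction("1.000")),   # Earth
    (3, Fraction("1.524")),   # Mars
    (4, Fraction("2.77")),    # Ceres
    (5, Fraction("5.203")),   # Jupiter
    (6, Fraction("9.537")),   # Saturn
    (7, Fraction("19.19")),   # Uranus
    (8, Fraction("30.07")),   # Neptune
]


def lagrange(points, x):
    """Evaluate the unique lowest-degree polynomial through `points` at x."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


# Exact rational arithmetic, so the fit is exact rather than approximate.
for x, y in POINTS:
    print(x, float(lagrange(POINTS, x)), float(y))
```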

But they couldn't even make that work directly. They only manage to get a partial hit - getting an equation that hits the right points, but which also generates a bunch of misses. There's nothing remotely impressive about that.

From there, they move on to the strawmen. For example, they claim that their "solar wind" hypothesis explains why the planets all orbit in the same direction on the same plane. According to them, if orbits were really gravitational, then planets would orbit in random directions on random planes around the sun. But their theory is better than gravity, because it says why the planets are in the same plane, and why they're all orbiting in the same direction.

The thing is, this is a really stupid argument. Why are the planets in the same plane, orbiting in the same direction? Because the solar system was formed out of a rotating gas cloud. There's a really good, solid, well-supported explanation of why the planets exist, and why they orbit the sun the way they do. Gravity doesn't explain all of it, but gravity is a key piece of it.

What they don't seem to understand is how amazingly powerful the theory of gravity is as a predictive tool. We've sent probes to the outer edges of the solar system. To do that, we didn't just aim a rocket towards Jupiter and fire it off. We've done things like the Cassini probe, where we launched a rocket towards Venus. It used the gravitational field of Venus twice to accelerate it with a double-slingshot maneuver, and send it back towards earth, using the earth's gravity to slingshot it again, to give it the speed it needed to get to Jupiter and, with one more gravity assist there, on to Saturn.

This wasn't a simple thing to do. It required an extremely deep understanding of gravity, with extremely accurate predictions of exactly how gravity behaves.

How do our brilliant authors answer this? By handwaving. The extent of their response is:

Gravitational theory works for things like space travel because it empirically measures the force of a planet, rather than predicting it.

That's a pathetic handwave, and it's not even close to true. The gravitational slingshot is a perfect answer to it. A slingshot doesn't just use some "empirically measured" force of a planet. It's a very precise prediction of what the forces will be at different distances, how that force will vary, and what effects that force will have.
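To give a feel for what "predicting the force at different distances" means, here's a minimal sketch of Newton's inverse-square law applied to Venus. The constants are standard published values for Venus. The function name is mine, and this is only the basic force law, nowhere near a real trajectory calculation.

```python
GM_VENUS = 3.24859e14  # m^3/s^2, standard gravitational parameter of Venus
R_VENUS = 6.0518e6     # m, mean radius of Venus


def gravitational_acceleration(distance_m: float) -> float:
    """Acceleration toward Venus at a given distance from its center (m/s^2)."""
    return GM_VENUS / distance_m ** 2


# Predicted acceleration at 1, 2, 5, and 10 Venus radii from the planet's center.
for multiple in (1, 2, 5, 10):
    r = multiple * R_VENUS
    print(f"{multiple:2d} radii: {gravitational_acceleration(r):.3f} m/s^2")
```

Nothing in there is measured on the fly. Given the planet's mass and the probe's distance, the law tells you the acceleration in advance, at every point along the path. That's what lets you plan a slingshot years before the probe gets there.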

They do a whole lot more handwaving of very much the same order. Pure rubbish.

## What the heck is a DNS amplification DoS attack?

(by MarkCC) Apr 08 2013

A couple of weeks ago, there was a bunch of news about a major DoS attack on Spamhaus. Spamhaus is an online service that maintains a blacklist of mail servers that are known for propagating spam. I've been getting questions about what a DoS attack is, and more specifically what a "DNS amplification attack" (the specific attack at the heart of last week's news) is. This all became a bit more relevant to me last week, because some asshole who was offended by my post about the Adria Richards affair launched a smallish DoS attack against scientopia. (This is why we were intermittently very slow last week, between Tuesday and Thursday. Also, to be clear, the DNS amplification attack was used on Spamhaus. Scientopia was hit by a good old fashioned DDoS attack.)

So what is a DoS attack? And what specifically is a DNS amplification attack?

Suppose that you're a nasty person who wants to take down a website like scientopia. How could you do it? You could hack into the server, and delete everything. That would kill it pretty effectively, right?

It certainly would. But from the viewpoint of an attacker, that's not a particularly easy thing to do. You'd need to get access to a privileged account on our server. Even if we're completely up to date on patches and security fixes, it's probably possible to do that, but it's still probably going to be a lot of work. Even for a dinky site like scientopia, getting that kind of access probably isn't trivial. For a big security-focused site like spamhaus, that's likely to be close to impossible: there are layers of security that you'd need to get through, and there are people constantly watching for attacks. Even if you got through, if the site has reliable backups, it won't be down for long, and once they get back up, they'll patch whatever hole you used to get in, so you'd be back to square one. It's a lot of work, and there are much easier ways to take down a site.

What you, as an attacker, want is a way to take the site down without having any kind of access to the system. You want a way that keeps the site down for as long as you want it down. And you want a way that doesn't leave easily traced connections to you.

That's where the DoS attack comes in. DoS stands for "denial of service". The idea of a DoS attack is to take a site down without really taking it down. You don't actually kill the server; you just make it impossible for legitimate users to access it. If the site's users can't access the site even though the server is technically still up and running, you've still effectively killed it.

How do you do that? You overwhelm the server. You target some finite resource on the server, and force it to use up that resource just dealing with requests or traffic that you sent to the server, leaving it with nothing for its legitimate users.

In terms of the internet, the two resources that people typically target are CPU and network bandwidth.

Every time that you send a request to a webserver, the server has to do some computation to process that request. The server has a finite amount of computational capability. If you can hit it with enough requests that it spends all of its time processing your requests, then the site becomes unusable, and it effectively goes down. This is the simplest kind of DoS attack. It's generally done in a form called a DDoS - a distributed denial of service attack, where the attacker uses thousands or millions of virus-infected computers to send requests. The server gets hit by a vast storm of requests, and it can't distinguish the legitimate requests from the ones generated by the attacker. This is the kind of attack that hit Scientopia last week. We were getting streams of a couple of thousand malformed requests per second.
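Here's a toy model of why that works. It assumes the server can handle a fixed number of requests per second and, since it can't tell the requests apart, serves a uniformly random subset when overloaded. The capacity and the legitimate traffic rate below are made-up numbers for illustration; only the "couple of thousand per second" attack rate comes from what we actually saw.

```python
def legit_fraction_served(capacity: float, legit_rate: float, attack_rate: float) -> float:
    """Fraction of legitimate requests that get served, in requests per second.

    Toy assumption: when overloaded, the server serves a uniformly random
    subset of incoming requests, because it can't tell legit from attack.
    """
    total = legit_rate + attack_rate
    if total <= capacity:
        return 1.0
    return capacity / total


# Hypothetical server: 500 req/s capacity, 50 req/s of real readers.
print(legit_fraction_served(500, 50, 0))     # no attack
print(legit_fraction_served(500, 50, 2000))  # attack of ~2000 req/s
```

In this model, once the attack traffic dwarfs the server's capacity, most real readers get nothing, even though the server never actually goes down.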

This kind of attack can be very effective. It's hard - not impossible, but hard - to fight. You need to identify the common traits of the attackers, and set up some kind of filter to discard those requests. From the attacker's point of view, it's got one problem: price. Most people don't have a personal collection of virus-infected machines that they can use to mount an attack. What they actually do is rent machines! Virus authors run services where they'll use the machines that they've infected to run an attack for you, for a fee. They typically charge per machine-hour. So to keep a good attack going for a long time is expensive! Another problem with this kind of attack is that every bit of load you inflict on the server also costs the attacking machine something. The client needs to establish a connection to the server, and that consumes CPU, network connections, and bandwidth on the client.

The other main DoS vector is network bandwidth. Every server running a website is connected to the network by a connection with a fixed capacity, called its bandwidth. A network connection can only carry a certain quantity of information. People love to make fun of the congressman who said that the internet is like a series of tubes, but that's not really a bad analogy. Any given connection is a lot like a pipe. You can only cram so much information through that pipe in a given period of time. If you can send enough traffic to completely fill that pipe, then the computer on the other end is, effectively, off the network. It can't receive any requests.

For a big site like spamhaus, it's very hard to get enough machines attacking to effectively kill the site. The amount of bandwidth, and the number of different network paths connecting spamhaus to the internet is huge! The number of infected machines available for an attack is limited, and the cost of using all of them is prohibitive.

What an attacker would like for killing something like Spamhaus is an attack where the amount of work/cpu/traffic used to generate the attack is much smaller than the amount of work/cpu/traffic used by the server to combat the attack. That's where amplification comes in. You want to find some way of using a small amount of work/traffic on your attacker machines to cause your target to lose a large amount of work/traffic.

In this recent attack on Spamhaus, they used an amplification attack that was based on a basic internet infrastructure service called the Domain Name System (DNS). DNS is the service which is used to convert between the name of a server (like scientopia.org) and its numeric internet address (184.106.221.182). DNS has some technical properties that make it ideal for this kind of attack:

1. It's not a connection-based service. In most internet services, you establish a connection to a server, and send a request on that connection. The server responds on the same connection. In a connection-based service, that means two things. First, you need to use just as much bandwidth as the target, because if you drop the connection, the server sees the disconnect and stops processing your request. Second, the server knows who it's connected to, and it always sends the results of a request to the client that requested it. But DNS doesn't work that way. DNS requests are normally sent over UDP, with no connection, and the response goes to whatever source address the request claims to come from. So you can fake a DNS request by putting someone else's address as the "respond-to" address in the request.
2. It's possible to set up DNS to create very large responses to very small requests. There are lots of ways to do this. The important thing is that it's really easy to use DNS in a way that allows you to amplify the amount of data being sent to a server by a factor of 100. In one common form of DNS amplification, you send 60 byte requests, which generate responses larger than 6,000 bytes.

Put these two properties together, and you get a great attack vector: you can send tiny, cheap requests to a server, which don't cause any incoming traffic on your attacker machine, and which send large quantities of data to your target. Doing this is called a DNS amplification attack: it's an amplification attack which uses properties of DNS to generate large quantities of data sent to your target, using small quantities of data sent by your attackers.
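The arithmetic behind "amplification" is simple enough to sketch. The 60-byte and 6,000-byte figures are the ones from the list above. The 10 Gbit/s target link is a hypothetical number I picked for illustration, and the function names are mine.

```python
REQUEST_BYTES = 60      # size of one small DNS request (from the text above)
RESPONSE_BYTES = 6000   # size of the response it triggers (from the text above)


def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """How many bytes hit the target for each byte the attacker sends."""
    return response_bytes / request_bytes


def attacker_bandwidth_needed(target_link_bps: float, factor: float) -> float:
    """Attacker bandwidth (bits/s) needed to fill the target's link."""
    return target_link_bps / factor


factor = amplification_factor(REQUEST_BYTES, RESPONSE_BYTES)
target_link = 10e9  # hypothetical 10 Gbit/s connection at the target
print(f"amplification: {factor:.0f}x")
print(f"attacker needs: {attacker_bandwidth_needed(target_link, factor) / 1e6:.0f} Mbit/s")
```

With a factor of 100, filling the target's pipe takes about one percent of the bandwidth it would take to do it directly, and none of the response traffic ever comes back to the attacker.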

That's exactly what happened to Spamhaus last week. The attackers used a very common DNS extension, which allowed them to amplify 60 byte requests into 4,000 byte responses, and to send the responses to the spamhaus servers.

There are, of course, more details. (For example, when direct attacks didn't work, they tried an indirect attack that didn't target the spamhaus servers, but instead tried to attack other servers that spamhaus relied on.) But this is the gist.
