## Probability and Interpretations

(by MarkCC) May 12 2013

I'm going to do some writing about discrete probability theory. Probability is an extremely important area of math. We encounter aspects of it every day. It's also a very poorly understood area - it's one that we see abused or just fouled up every day.

I'm going to focus on discrete probability theory. What that means is that we're going to look at things where the space containing the things that we're going to look at contains a countable number of elements. The probability of getting a certain sequence of coin flips, or of getting a certain hand of cards are described by discrete probability theory. On the other hand, the odds of a radioactive isotope decaying at a particular time requires continuous probability theory.

Before getting into the details, there's one important thing to mention. When you're talking about probability, there are two fundamental schools of interpretetation. There are frequentist interpretations, and there are Bayesian interpretations.

In a frequentist interpretation, when you say the probability of an event is 0.6, what you mean is that if you were to perform a series of experiments precisely reproducing the event, then on average, if you did 100 experiments, the event would occur 60 times. In the frequentist interpretation, the probability is an intrinsic property of the event. For a frequentist, it makes sense to say that there is a "real" probability associated with an event.

In a Bayesian interpretation, when you say that the probability of an event is 0.6, what you mean is that based on your current state of knowledge about the event, you have a 60% certainty that the event will occur. In a strict Bayesian interpretation, the event doesn't have any kind of intrinsic probability associated with it. The specific event that you're interested in either will occur, or it won't. There's no real probability involved. What probability measures is how certain you are about whether or not it will occur.

For example, think about flipping a fair coin.

A frequentist would say that you can flip a coin many times, and half of the time, it will land on heads. So the probability of a coin flip landing on the head of the coin is 0.5. A Bayesian would say that the coin will land either on heads or on tails. Since you don't know which, and you have no other information to use to be able to make a better prediction, you can have a certainty of 0.5 that it will land on the head of the coin.

In the real world, I think that most people are really somewhere in between.

I think that all but the most fervent Bayesians do rely on an intuitive notion of the "intrinsic" probability of an event. They may describe it in different terms, but when it comes down to it, they're using the basic frequentist notion. And I don't think that you can find a sane frequentist anywhere who won't use Bayes theorem to update their priors in the face of new information - which is the most fundamental notion in the Bayesian interpretation.

One note before I finish this, and get started on the real meaty posts. In the past, when I've talked about probability, people have started stupid flamewars in the comments. People get downright religious about interpretations of probability. There are religious Bayesians, who think that all frequentists are stupid idiots who should be banished from the field of math; likewise, there are religious frequentists who think that Bayesians are all a crop of arrogant know-it-alls who should be sent to Siberia. I am not going to tolerate any of that nonsense. If you feel that you cannot read posts on probability without going into a diatribe about those stupid frequentists/Bayesians and their deliberately stupid ideas, please go away and don't even read these posts. If you do go into such a diatribe, I will delete your comments without any hesitation.

## Speed-Crankery

(by MarkCC) May 05 2013

A fun game to play with cranks is: how long does it take for the crank to contradict themselves?

When you're looking at a good example of crankery, it's full of errors. But for this game, it's not enough to just find an error. What we want is for them to say something so wrong that one sentence just totally tears them down and demonstrates that what they're doing makes no sense.

"The color of a clear sky is green" is, most of the time, wrong. If a crank makes some kind of argument based on the alleged fact that the color of a clear daytime sky is green, the argument is wrong. But as a statement, it's not nonsensical. It' just wrong.

On th other hand, "The color of a clear sky is steak frite with bernaise sauce and a nice side of roasted asparagus", well... it's not even wrong. It's just nonsense.

Today's crank is a great example of this. If, that is, it's legit. I'm not sure that this guy is serious. I think this might be someone playing games, pretending to be a crank. But even if it is, it's still fun.

About a week ago, I got en mail titled "I am a Cantor crank" from a guy named Chris Cuellar. The contents were:

...AND I CHALLENGE YOU TO A DUEL!! En garde!

Haha, ok, not exactly. But you really seem to be interested in this stuff. And so am I. But I think I've nailed Cantor for good this time. Not only have I come up with algorithms to count some of these "uncountable" things, but I have also addressed the proofs directly. The diagonalization argument ends up failing spectacularly, and I believe I have a good explanation for why the whole thing ends up being invalid in the first place.

And then I also get to the power set of natural numbers... I really hope my arguments can be followed. The thing I have to emphasize is that I am working on a different system that does NOT roll up cardinality and countability into one thing! As it will turn out, rational numbers are bigger than integers, integers are bigger than natural numbers... but they are ALL countable, nonetheless!

Anyway, I had started a little blog of my own a while ago on these subjects. The first post is here:

http://laymanmath.blogspot.com/2012/09/the-purpose-and-my-introduction.html

Have fun... BWAHAHAHA

So. We've got one paragraph of intro. And then everything crashes and burns in an instant.

"Rational numbers are bigger than integers, integers are bigger than natural numbers, but they are all countable". This is self-evident rubbish. The definition of "countable" say that an infinite set I is countable if, and only if, you can create a one-to-one mapping between the members of I and the natural numbers. The definition of cardinality says that if you can create a one-to-one mapping between two sets, the sets are the same size.

When Mr. Cuellar says that the set of rational numbers is bigger that the set of natural numbers, but that they are still countable... he's saying that there is not a one-to-one mapping between the two sets, but that there is a one-to-one mapping between the two sets.

Look - you don't get to redefine terms, and then pretend that your redefined terms mean the same thing as the original terms.

If you claim to be refuting Cantor's proof that the cardinality of the real numbers is bigger than the cardinality of the natural numbers, then you have to use Cantor's definition of cardinality.

You can change the definition of the size of a set - or, more precisely, you can propose an alternative metric for how to compare the sizes of sets. But any conclusions that you draw about your new metric are conclusions about your new metric - they're not conclusions about Cantor's cardinality. You can define a new notion of set size in which all infinite sets are the same size. It's entirely possible to do that, and to do that in a consistent way. But it will say nothing about Cantor's cardinality. Cantor's proof will still work.

What my correspondant is doing is, basically, what I did above in saying that the color of the sky is steak frites. I'm using terms in a completely inconsistent meaningless way. Steak frites with bernaise sauce isn't a color. And what Mr. Cuellar does is similar: he's using the word "cardinality", but whatever he means by it, it's not what Cantor meant, and it's not what Cantor's proof meant. You can draw whatever conclusions you want from your new definition, but it has no bearing on whether or not Cantor is correct. I don't even need to visit his site: he's demonstrated, in record time, that he has no idea what he's doing.

## The Gravitational Force of Rubbish

(by MarkCC) May 01 2013

Imagine, for just a moment, that you were one a group of scientists that had proven the most important, the most profound, the most utterly amazing scientific discovery of all time. Where would you publish it?

Maybe Nature? Science? Or maybe you'd prefer to go open-access, and go with PLOS ONE? Or more mainstream, and send a press release to the NYT?

Well, in the case of today's crackpots, they bypassed all of those boring journals. They couldn't be bothered with a pompous rag like the Times. No, they went for the really serious press: America Now with Leeza Gibbons.

What did they go to this amazing media outlet to announce? The most amazing scientific discovery of all time: gravity is an illusion! There's no gravity. In fact, not just is there no gravity, but all of that quantum physics stuff? It's utter rubbish. You don't need any of that complicated stuff! No - you need only one thing: the solar wind.

A new theory on the forces that control planetary orbit refutes the 400-year old assumptions currently held by the scientific community. Scientific and engineering experts Gerhard and Kevin Neumaier have established a relationship between solar winds and a quantized order in both the position and velocity of the solar system's planets, and movement at an atomic level, with both governed by the same set of physics.

The observations made bring into question the Big Bang Theory, the concept of black holes, gravitational waves and gravitons. The Neumaiers' paper, More Than Gravity, is available for review at MoreThanGravity.com

Pretty damned impressive, huh? So let's follow their instructions, and go over to their website.

Ever since humankind discovered that the Earth and the planets revolved around the Sun, there was a question about what force was responsible for this. Since the days of Newton, science has held onto the notion that an invisible force, which we have never been able to detect, controls planetary motion. There are complicated theories about black holes that have never been seen, densities of planets that have never been measured, and subatomic particles that have never been detected.

However, it is simpler than all of that and right in front of us. The Sun and the solar wind are the most powerful forces in our solar system. They are physically moving the planets. In fact, the solar wind spins outward in a spiral at over a million miles per hour that controls the velocity and distances that planets revolve around the Sun. The Sun via the solar wind quantizes the orbits of the planets – their position and speed.

The solar wind also leads to the natural log and other phenomenon from the very large scale down to the atomic level. This is clearly a different idea than the current view that has been held for over 400 years. We have been working on this for close 50 years and thanks to satellite explorations of space have data that just was not available when theories long ago were developed. We think that we have many of the pieces but there are certainly many more to be found. We set this up as a web site, rather as some authoritative book so that there would be plenty of opportunity for dialog. The name for this web site, www.MorethanGravity.com was chosen because we believe there is far more to this subject than is commonly understood. Whether you are a scientific expert in your field or just have a general interest in how our solar system works, we appreciate your comments.

See, it's all about the solar wind. There's no such thing as gravity - that's just nonsense. The sun produces the solar wind, which does absolutely everything. The wind comes out of the sun, and spirals out from the sun. That spiral motion has eddies in it an quantized intervals, and that's where the planets are. Amazing, huh?

Remember my mantra: the worst math is no math. This is a beautiful demonstration
of that.

Of course... why does the solar wind move in a spiral? Everything we know says that in the absence of a force, things move in a straight line. It can't be spiraling because of gravity, because there is no gravity. So why does it spiral? Our brilliant authors don't bother to say. What makes it spiral, instead of just move straight? Mathematically, spiral motion is very complicated. It requires a centripetal force which is smaller than the force that would produce an orbit. Where's that force in this framework? There isn't any. They just say that that's how the solar wind works, period. There are many possible spirals, with different radial velocities - which one does the solar wind follow according to this, and why? Again, no answer from the authors.

Or... why is the sun producing the solar wind at all? According to those old, stupid theories that this work of brilliance supercedes, the sun produces a solar wind because it's fusing hydrogen atoms into helium. That's happening because gravity is causing the atoms of the sun to be compressed together until they fuse. Without gravity, why is fusion happening at all? And given that it's happening, why does the sun not just explode into a supernova? We know, from direct observation, that the energy produced by fusion creates an outward force. But gravity can't be holding the sun together - so why is the sun there at all? Still, no answers.

They do, eventually, do some math. One of the big "results" of this hypothesis is about the "quantization" of the orbits of planets around the sun. They were able to develop a simple equation which predicts the locations where planets could exist in their "solar wind" system.

Let’s start with the distance between the planets and the Sun. We guessed that if the solar system was like an atom, that planetary distance would be quantized. This is to say that we thought that the planets would have definite positions and that they would be either in the position or it would be empty. In a mathematical sense, this would be represented by a numerical integer ordering (0,1,2,3,…). If the first planet, Mercury was in the 0 orbital, how would the rest of the planets line up? Amazingly well we found.

If we predict the distance from the surface of the Sun to each planet in this quantized approach, the results are astounding. If D equals the mean distance to the surface of the Sun, and d0 as the distance to Mercury, we can describe the relationship that orders the planets mathematically as:

Each planetary position can be predicted from this equation in a simple calculation as we increase the integer (or planet number) n. S is the solar factor, which equals 1.387. The solar factor is found in the differential rotation of the Sun and the profile of the solar wind which we will discuss later.

Similar to the quantized orbits that exist within an atom, the planetary bodies are either there or not. Mercury is in the zero orbital. The next orbital is missing a planet. The second, third, and fourth orbitals are occupied by Venus, Earth, and Mars respectively. The fifth orbital is missing. The sixth orbital is filled with Ceres. Ceres is described as either the largest of all asteroids or a minor planet (with a diameter a little less than half that of Pluto), depending on who describes it. Ceres was discovered in 1801 as astronomers searched for the missing planets that the Titius-Bode Law predicted would exist.

So. What they found was an exponential equation which products very approximate versions of the size of first 8 planets' orbits, as well as a couple of missing ones.

This is, in its way, interesting. Not because they found anything, but rather because they think that this is somehow profound.

We've got 8 data points (or 9, counting the asteroid belt). More precisely, we have 9 ranges, because all of the orbits are elliptical,but the authors of this junk are producing a single number for the size of the orbits, and they can declare success if their number falls anywherewithin the range from perihelion to aphelion in each of the orbits.

It would be shocking if there weren't any number of simple equations that described exactly the 9 data points of the planet's orbits.

But they couldn't even make that work directly. They only manage to get a partial hit - getting an equation that hits the right points, but which also generates a bunch of misses. There's nothing remotely impressive about that.

From there, they move on to the strawmen. For example, they claim that their "solar wind" hypothesis explains why the planets all orbit in the same direction on the same plane. According to them, if orbits were really gravitational, then planets would orbit in random directions on random planes around the sun. But their theory is better than gravity, because it says why the planets are in the same plane, and why they're all orbiting in the same direction.

The thing is, this is a really stupid argument. Why are the planets in the same plane, orbiting in the same direction? Because the solar system was formed out of a rotating gas cloud. There's a really good, solid, well-supported explanation of why the planets exist, and why they orbit the sun the way they do. Gravity doesn't explain all of it, but gravity is a key piece of it.

What they don't seem to understand is how amazingly powerful the theory of gravity is as a predictive tool. We've sent probes to the outer edges of the solar system. To do that, we didn't just aim a rocket towards Jupiter and fire it off. We've done things like the Cassini probe, where we launched a rocket towards Venus. It used the gravitational field of Venus twice to accelerate it with a double-slingshot maneuver, and send it back towards earth, using the earth's gravity to slingshot it again, to give it the speed it needed to get to Jupiter.

This wasn't a simple thing to do. It required an extremely deep understanding of gravity, with extremely accurate predictions of exactly how gravity behaves.

How do our brilliant authors answer this? By handwaving. The extend of their response is:

Gravitational theory works for things like space travel because it empirically measures the force of a planet, rather than predicting it.

That's a pathetic handwave, and it's not even close to true. The gravitational slingshot is a perfect answer to it. A slingshot doesn't just use some "empirically measured" force of a planet. It's a very precise prediction of what the forces will be at different distances, how that force will vary, and what effects that force will have.

They do a whole lot more handwaving of very much the same order. Pure rubbish.

## What the heck is a DNS amplification DoS attack?

(by MarkCC) Apr 08 2013

A couple of weeks ago, there was a bunch of news about a major DOS attack on Spamhaus. Spamhaus is an online service that maintains a blacklist of mail servers that are known for propagating spam. I've been getting questions about what a DoS attack is, and more specifically what a "DNS amplification attack" (the specific attack at the heart of last week's news) is. This all became a bit more relevant to me last week, because some asshole who was offended by my post about the Adria Richards affair launched a smallish DoS attack against scientopia. (This is why we were interrmitently very slow last week, between tuesday and thursday. Also, to be clear, the DNS amplification attack was used on Spamhaus. Scientopia was hit by a good old fashioned DDoS attack.)

So what is a DoS attack? And what specifically is a DNS amplification attack?

Suppose that you're a nastly person who wants to take down a website like scientopia. How could you do it? You could hack in to the server, and delete everything. That would kill it pretty effectively, right?

It certainly would. But from the viewpoint of an attacker, that's not a particularly easy thing to do. You'd need to get access to a privileged account on our server. Even if we're completely up to date on patches and security fixes, it's probably possible to do that, but it's still probably going to be a lot of work. Even for a dinky site like scientopia, getting that kind of access probably isn't trivial. For a big security-focused site like spamhaus, that's likely to be close to impossible: there are layers of security that you'd need to get through, and there are people constantly watching for attacks. Even if you got through, if the site has reliable backups, it won't be down for long, and once they get back up, they'll patch whatever hole you used to get in, so you'd be back to square one. It's a lot of work, and there are much easier ways to take down a site.

What you, as an attacker, want is a way to take the site down without having any kind of access to the system. You want a way that keeps the site down for as long as you want it down. And you want a way that doesn't leave easily traced connections to you.

That's where the DoS attack comes in. DoS stands for "denial of service". The idea of a DoS attack is to take a site down without really taking it down. You don't actually kill the server; you just make it impossible for legitimate users to access it. If the sites users can't access the site even though the server is technically still up and running, you've still effectively killed it.

How do you do that? You overwhelm the server. You target some finite resource on the server, and force it to use up that resource just dealing with requests or traffic that you sent to the server, leaving it with nothing for its legitimate users.

In terms of the internet, the two resources that people typically target are CPU and network bandwidth.

Every time that you send a request to a webserver, the server has to do some computation to process that request. The server has a finite amount of computational capability. If you can hit it with enough requests that it spends all of its time processing your requests, then the site becomes unusable, and it effectively goes down. This is the simplest kind of DoS attack. It's generally done in a form called a DDoS - distributed denial of server attack, where the attacker users thousands or millions of virus-infected computers to send requests. The server gets hit by a vast storm of requests, and it can't distinguish the legitimate requests from the ones generated by the attacker. This is the kind of attack that hit Scientopia last week. We were getting streams of a couple of thousands malformed requests per second.

This kind of attack can be very effective. It's hard - not impossible, but hard - to fight. You need to identify the common traits of the attackers, and set up some kind of filter to discard those requests. From the attacker's point of view, it's got one problem: price. Most people don't have a personal collection of virus-infected machines that they can use to mount an attack. What they actually do is rent machines! Virus authors run services where they'll use the machines that they've to run an attack for you, for a fee. They typically charge per machine-hour. So to keep a good attack going for a long time is expensive! Another problem with this kind of attack is that the amount of traffic that you can inflict on the server per attacker is also used by the client. The client needs to establish a connection to the server. That consumes CPU, network connections, and bandwidth on the client.

The other main DoS vector is network bandwidth. Every server running a website is connected to the network by a connection with a fixed capacity, called it's bandwidth. A network connection can only carry a certain quantity of information. People love to make fun of the congressman who said that the internet is like a series of tubes, but that's not really a bad analogy. Any given connection is a lot like a pipe. You can only cram so much information through that pipe in a given period of time. If you can send enough traffic to completely fill that pipe, then the computer on the other end is, effectively, off the network. It can't receive any requests.

For a big site like spamhaus, it's very hard to get enough machines attacking to effectively kill the site. The amount of bandwidth, and the number of different network paths connecting spamhaus to the internet is huge! The number of infected machines available for an attack is limited, and the cost of using all of them is prohibitive.

What an attacker would like for killing something like Spamhaus is an attack where the amount of work/cpu/traffic used to generate the attack is much smaller than the amount of work/cpu/traffic used by the server to combat the attack. That's where amplification comes in. You want to find some way of using a small amount of work/traffic on your attacker machines to cause your target to lost a large amount of work/traffic.

In this recent attack on Spamhaus, they used an amplification attack, that was based on a basic internet infrastructure service called the Domain Name Service (DNS). DNS is the service which is used to convert between the name of a server (like scientopia.org), and its numeric internet address (184.106.221.182). DNS has some technical properties that make it idea for this kind of attack:

1. It's not a connection-based service. In most internet services, you establish a connection to a server, and send a request on that connection. The server responds on the same connection. In a connection-based service, that means two things. First, you need to use just as much bandwidth as the target, because if you drop the connection, the server sees the disconnect and stops processing your request. Second, the server knows who it's connected to, and it always sends the results of a request to the client that requested it. But DNS doesn't work that way. In DNS, you send a request without a connection, and in the request, you provide an address that the response should be sent to. So you can fake a DNS request, by putting someone else's address as the "respond-to" address in the request.
2. It's possible to set up DNS to create very large responses to very small requests. There are lots of ways to do this. The important thing is that it's really easy to use DNS in a way that allows you to amplify the amount of data being sent to a server by a factor of 100. In one common form of DNS amplification, you send 60 byte requests, which generate responses larger than 6,000 bytes.

Put these two properties together, and you get a great attack vector: you can send tiny, cheap requests to a server, which don't cause any incoming traffic on your attacker machine, and which send large quantities of data to your target. Doing this is called a DNS amplification attack: it's an amplification attack which uses properties of DNS to generate large quantities of data send to your server, using small quantities of data sent by your attackers.

That's exactly what happened to Spamhaus last week. The attackers used a very common DNS extension, which allowed them to amplify 60 byte requests into 4,000 byte responses, and to send the responses to the spamhaus servers.

There are, of course, more details. (For example, when direct attacks didn't work, they tried an indirect attack that didn't target the spamhaus servers, but instead tried to attack other servers that spamhaus relied on.) But this is the gist.

## A White Boy's Observations of Sexism and the Adria Richards Fiasco

(by MarkCC) Mar 28 2013

I've been watching the whole Adria Richards fiasco with a sense of horror and disgust. I'm finally going to say something, but for the most part, it's going to be indirect.

See, I'm a white guy, born as a member of an upper middle class white family. That means that I'm awfully lucky. I'm part of the group that is, effectively, treated as the normal, default person in most settings. I'm also a guy who's married to a chinese woman, and who's learned a bit about how utterly clueless I am.

Here's the fundamental issue that underlies all of this, and many similar stories: our society is deeply sexist and racist. We are all raised in an environment in which mens voices are more important than womens. It's so deeply ingrained in us that we don't even notice it.

What this means is that we are all to some degree, sexist, and racist. When I point this out, people get angry. We also have learned that sexism is a bad thing. So when I say to someone that you are sexist, it's really easy to interpret that as me saying that you're a bad person: sexism is bad, if I'm sexist, them I'm bad.

But we really can't get away from this reality. We are sexists. For many of us, we're not deliberately sexist, we're not consciously sexist. But we are sexist.

Here's a really interesting experiment to try, if you have the opportunity. Visit an elementary school classroom. First, just watch the teacher interact with the students while they're teaching. Don't try to count interactions. Just watch. See if you think that any group of kids is getting more attention than any other. Most of the time, you probably will get a feeling that they're paying roughly equal attention to the boys and the girls, or to the white students and the black students. Then, come back on a different day, and count the number of times that they call on boys versus calling on girls. I've done this, after having the idea suggested by a friend. The result was amazing. I really, honestly believed that the teacher was treating her students (the teacher I did this with was a woman) equally. But when I counted?She was calling on boys twice as often as girls.

This isn't an unusual outcome. Do some looking online for studies of classroom gender dynamics, and you'll find lots of structured observations that come to the same conclusion.

My own awakening about these kinds of things came from my time working at IBM. I've told this first story before, but it's really worth repeating.

One year, I managed the summer intership programs for my department. The previous summer, IBM research had wound up with an intership class consisting of 99% men. (That's not an estimate: that's a real number. That year, IBM research hired 198 summer interns, of whom 2 were women.) For a company like IBM, numbers like that are scary. Ignoring all of the social issues of excluding potentially great candidates, numbers like that can open the company up to gender discrimination lawsuits!

So my year, they decided to encourage the hiring of more diverse candidates. The way that they did that was by allocating each department a budget for summer interns. They could only hire up to their budgeted number of interns. Only women and minority candidates didn't count against the budget.

When the summer program hiring opened, my department was allocated a budget of six students. All six slots were gone within the first day. Every single one of them went to a white, american, male student.

The second day, the guy across the hall from me came with a resume for a student he wanted to hire. This was a guy who I really liked, and really respected greatly. He was not, by any reasonable measure, a bad guy - he was a really good person. Anyway, he had this resume, for yet another guy. I told him the budget was gone, but if he could find a good candidate who was either a woman or minority, that we could hire them. He exploded, ranting about how we were being sexist, discriminating against men. He just wanted to hire the best candidate for the job! We were demanding that he couldn't hire the best candidate, he had to hire someone less qualified, in order to satisfy some politically correct bureaucrat! There was nothing I could do, so eventually he stormed out.

Three days later, he came back to my office with another resume. He was practically bouncing off the walls he was so happy. "I found another student to hire. She's even better than the guy I originally came to you with! She's absolutely perfect for the job!". We hired her.

I asked him why he didn't find her before. He had no answer - he didn't know why he didn't find her resume of his first search.

This was a pattern that I observed multiple times that year. Looking through a stack of resumes, without deliberately excluding women, somehow, all of the candidates with female names wound up back in the slushpile. I don't think that anyone was deliberately saying "Hmm, Jane, that's a woman's name, I don't want to hire a woman". But I do think that in the process of looking through a file containing 5000 resumes, trying to select which ones to look at, on an unconscious level, they were more likely to look carefully at a candidate with a male name, because we all learn, from a young age, that men are smarter than women, men are more serious than women, men are better workers than women, men are more likely to be technically skilled than women. Those attitudes may not be part of our conscious thought, but they are part of the cultural background that gets drummed into us by school, by books, by movies, by television, by commercials.

As I said, that was a real awakening for me.

I was talking about this with my next-door office neighbor, who happened to be one of the only two women in my department (about 60 people) at the time. She was shocked that I hadn't noticed this before. So she pointed out to me that in meetings, she could say things, and everyone would ignore it, but if a guy said the same thing, they'd get listened to. We'd been in many meetings together, and I'd never noticed this!

So I started paying attention, and she was absolutely right.

What happened next is my second big awakening.

I started watching this in meetings, and when people brushed over something she'd said, I'd raise my voice and say "X just suggested blah, which I think is a really good idea. What about it?". I wanted to help get her voice listened to.

She was furious at me. This just blew my mind. I was really upset at her at first. Dammit, I was trying to help, and this asshole was yelling at me for it! She'd complained about how people didn't listen to her, and now when I was trying to help get her listened to, she was complaining again!

What I realized after I calmed down and listened to her was that I was wrong. I hadn't spoken to her about doing it. I didn't understand what it meant. But the problem was, people didn't take her seriously because she was a woman. People might listen to me, because I'm also a white guy. But when I spoke for her, I wasn't helping. When a man speaks on behalf of a woman, we're reinforcing the idea that a woman's voice isn't supposed to be heard. I was substituting my man's voice for her woman's, and by doing that, I was not just not helping her, but I was actively hurting, because the social interpretation of my action was that "X can't speak for herself". And more, I learned that by taking offense at her, for pointing out that I had screwed up, I was definitely in the wrong - that I had an instinct for reacting wrong.

What I learned, gradually, from watching things like this, from becoming more sensitive and aware, and by listening to what women said, was that this kind of thing is that I was completely clueless.

The fact is, I constantly benefit from a very strong social preference. I don't notice that. Unless I'm really trying hard to pay attention, I'm not aware of all of the benefits that I get from that. I don't notice all of the times when I'm getting a benefit. Worse, I don't notice all of the times when my behavior is asserting that social preference as my right.

It's very easy for a member of an empowered majority to just take things for granted. We see the way that we are treated as a default, and assume that everyone is treated the same way. We don't perceive that we are being treated preferentially. We don't notice that the things that offend us are absolutely off limits to everyone, but that things that we do to offend others are accepted as part of normal behavior. Most importantly, we don't notice when our behavior is harmful to people who aren't part of our empowered group. And when we do offend someone who isn't part of the empowered majority, we take offense at the fact that they're offended. Because they're saying that we did something bad, and we know that we aren't bad people!

The way that this comes back to the whole Adria Richards fiasco is very simple. Many people have looked at what happened at PyCon, and said something like "She shouldn't have tweeted their picture", or "She shouldn't have been offended, they didn't do anything wrong", or "She should have just politely spoken to them".

I don't know whether what she did was right or not. I wasn't there. I didn't hear the joke that the guys in question allegedly told. What I do know is that for a member of the minority out-group, there is frequently no action that will be accepted as "right" if it includes the assertion that the majority did something offensive.

I've seen this phenomena very directly myself, not in the context of sexism, but in terms of antisemitism. There's an expression that I've heard multiple times in the northeast US, to talk about bartering a price for a car: "jewing the salesman down". I absolutely find that extremely offensive. And I've called people out on it. There is no response that's actually acceptable.

If I politely say "You know, that's relying on a stereotype of me and my ancestors that's really hurtful", the response is: "Oh, come on, it's just harmless. I'm not talking about you, it's just a word. You're being oversensitive". If I get angry, the response is "You Jews are so strident". If I go to an authority figure in the setting, "You Jews are so passive aggressive, why couldn't you just talk to me?". No matter what I do, I'm wrong. Women deal with this every day, only they're in a situation where the power dynamic is even less in their favor.

That's the situation that women - particularly women in tech - find themselves in every day. We are sexist. We do mistreat women in tech every day, without even knowing that we're doing it. And we're very likely to take offense if they mention that we did something wrong. Because we know that we're good people, and since we aren't deliberately doing something bad, they must be wrong.

For someone in Adria Richards' situation at PyCon, there is no course of action that can't be taken wrong. As a woman hearing the joke in question, she certainly knew whether or not it was offensive to her. But once she'd heard something offensive, there was nothing she could do that someone couldn't turn into a controversy.

Was the joke offensive? We don't know what, specifically, he said. The only fact that we're certain of is that in her judgement, it was offensive; that the authorities at PyCon agreed, and asked the gentleman in question to apologize.

Did the guy who made the joke deserve to be fired? I don't know. If this stupid joke were the first time he'd ever done something wrong, then he didn't deserve to be fired. But we don't know what his history is like. I know how hard it is to hire skilled engineers, so I'm very skeptical that any company would fire someone over one minor offense. It's possible that his company has a crazy hair-trigger HR department. But it's also possible that there's background that we don't know about. That he's done stuff before, and been warned. If that's the case, then his company could have decided that this was the last straw.

Did Adria Richards deserve to be fired? Almost certainly not. We know more about her case than we do about the guy who told the joke. We know that her company fired her over this specific incident, because in their announcement of her firing, they told us the reason. They didn't cite any past behavior - they just specifically cited this incident and its aftermath as the reason for firing her. It's possible that there's a history here that we don't know about, that she'd soured relations with customers of her company in incidents other than this, and that this was a last straw. But it doesn't seem likely, based on the facts that we're aware of.

Did either of them deserve to be threatened? Absolutely not.

## Genius Continuum Crackpottery

(by MarkCC) Mar 21 2013

There's a lot of mathematical crackpottery out there. Most of it is just pointless and dull. People making the same stupid mistakes over and over again, like the endless repetitions of the same-old supposed refutations of Cantor's diagonalization.

After you eliminate that, you get reams of insanity - stuff which
is simply so incoherent that it doesn't make any sense. This kind of thing is usually word salad - words strung together in ways that don't make sense.

After you eliminate that, sometimes, if you're really lucky, you'll come accross something truly special. Crackpottery as utter genius. Not genius in a good way, like they're an outsider genius who discovered something amazing, but genius in the worst possible way, where someone has created something so bizarre, so overwrought, so utterly ridiculous that it's a masterpiece of insane, delusional foolishness.

Today, we have an example of that: Existics!. This is a body of work by a high school dropout named Gavin Wince with truly immense delusions of grandeur. Pomposity on a truly epic scale!

I'll walk you through just a tiny sample of Mr. Wince's genius. You can go look at his site to get more, and develop a true appreciation for this. He doesn't limit himself to mere mathematics: math, physics, biology, cosmology - you name it, Mr. Wince has mastered it and written about it!

The best of his mathematical crackpottery is something called C3: the Canonized Cardinal Continuum. Mr. Wince has created an algebraic solution to the continuum hypothesis, and along the way, has revolutionized number theory, algebra, calculus, real analysis, and god only knows what else!

Since Mr. Wince believes that he has solved the continuum hypothesis. Let me remind you of what that is:

1. If you use Cantor's set theory to explore numbers, you get to the uncomfortable result that there are different sizes of infinity.
2. The smallest infinite cardinal number is called ℵ0,
and it's the size of the set of natural numbers.
3. There are cardinal numbers larger than ℵ0. The first
one larger than ℵ0 is ℵ1.
4. We know that the set of real numbers is the size of the powerset
of the natural numbers - 20 - is larger than the set of the naturals.
5. The question that the continuum hypothesis tries to answer is: is the size
of the set of real numbers equal to ℵ1? That is, is there
a cardinal number between ℵ0 and |20|?

The continuum hypothesis was "solved" in 1963. In 1940, Gödel showed that you couldn't disprove the continuum hypothesis using ZFC. In 1963,
another mathematician named Paul Cohen, showed that it couldn't be proven using ZFC. So - a hypothesis which is about set theory can be neither proven nor disproven using set theory. It's independent of the axioms of set theory. You can choose to take the continuum hypothesis as an axiom, or you can choose to take the negation of the continuum hypothesis as an axiom: either choice is consistent and valid!

It's not a happy solution. But it's solved in the sense that we've got a solid proof that you can't prove it's true, and another solid proof that you can't prove it's false. That means that given ZFC set theory as a basis, there is no proof either way that doesn't set it as an axiom.

But... Mr. Wince knows better.

The set of errors that Wince makes is really astonishing. This is really seriously epic crackpottery.

He makes it through one page without saying anything egregious. But then he makes up for it on page 2, by making multiple errors.

First, he pulls an Escultura:

x1 = 1/21 = 1/2 = 0.5

x2 = 1/21 + 1/22 = 1/2 + 1/4 = 0.75

x3 = 1/21 + 1/22 + 1/23 = 1/2 + 1/4 + 1/8 = 0.875

...

At the end or limit of the infinite sequence, the final term of the sequence is 1.0

...

In this example we can see that as the number of finite sums of the sequence approaches the limit infinity, the last term of the sequence equals one.

xn = 1.0

If we are going to assume that the last term of the sequence equals one, it can be deduced that, prior to the last term in the sequence, some finite sum in the series occurs where:

xn-1 = 0.999…

xn-1 = 1/21 + 1/22 + 1/23 + 1/24 + … + 1/2n-1 = 0.999…

Therefore, at the limit, the last term of the series of the last term of the sequence would be the term, which, when added to the sum 0.999… equals 1.0.

There is no such thing as the last term of an infinite sequence. Even if there were, the number 0.999.... is exactly the same as 1. It's a notational artifact, not a distinct number.

But this is the least of his errors. For example, the first paragraph on the next page:

The set of all countable numbers, or natural numbers, is a subset of the continuum. Since the set of all natural numbers is a subset of the continuum, it is reasonable to assume that the set of all natural numbers is less in degree of infinity than the set containing the continuum.

We didn't need to go through the difficult of Cantor's diagonalization! We could have just blindly asserted that it's obvious!

or actually... The fact that there are multiple degrees of infinity is anything but obvious. I don't know anyone who wasn't surprised the first time they saw Cantor's proof. It's a really strange idea that there's something bigger than infinity.

Moving on... the real heart of his stuff is built around some extremely strange notions about infinite and infinitessimal values.

Before we even look at what he says, there's an important error here
which is worth mentioning. What Mr. Wince is trying to do is talk about the
continuum hypothesis. The continuum hypothesis is a question about the cardinality of the set of real numbers and the set of natural numbers.
Neither infinites nor infinitessimals are part of either set.

Infinite values come into play in Cantor's work: the cardinality of the natural numbers and the cardinality of the reals are clearly infinite cardinal numbers. But ℵ0, the smallest infinite cardinal, is not a member of either set.

Infinitessimals are fascinating. You can reconstruct differential and integral calculus without using limits by building in terms of infinitessimals. There's some great stuff in surreal numbers playing with infinitessimals. But infinitessimals are not real numbers. You can't reason about them as if they were members of the set of real numbers, because they aren't.

Many of his mistakes are based on this idea.

For example, he's got a very strange idea that infinites and infinitessimals don't have fixed values, but that their values cover a range. The way that he gets to that idea is by asserting the existence
of infinity as a specific, numeric value, and then using it in algebraic manipulations, like taking the "infinityth root" of a real number.

For example, on his way to "proving" that infinitessimals have this range property that he calls "perambulation", he defines a value that he calls κ:

In terms of the theory of numbers, this is nonsense. There is no such thing as an infinityth root. You can define an Nth root, where N is a real number, just like you can define an Nth power - exponents and roots are mirror images of the same concept. But roots and exponents aren't defined for infinity, because infinity isn't a number. There is no infinityth root.

You could, if you really wanted to, come up with a definition of exponents that that allowed you to define an infinityth root. But it wouldn't be very interesting. If you followed the usual pattern for these things, it would be a limit: . That's clearly 1. Not 1 plus something: just exactly 1.

But Mr. Cringe doesn't let himself be limited by silly notions of consistency. No, he defines things his own way, and runs with it. As a result, he gets a notion that he calls perambulation. How?

Take the definition of κ:

Now, you can, obviously, raise both sides to the power of infinity:

Now, you can substitute ℵ0 for . (Why? Don't ask why. You just can.) Then you can factor it. His factoring makes no rational sense, so I won't even try to explain it. But he concludes that:

• Factored and simplified one way, you end up with (κ+1) = 1 + x, where x is some infinitessimal number larger than κ. (Why? Why the heck not?)
• Factored and simplified another way, you end up with (κ+1) = ℵ
• If you take the mean of of all of the possible factorings and reductions, you get a third result, that (κ+1) = 2.

He goes on, and on, and on like this. From perambulation to perambulating reciprocals, to subambulation, to ambulation. Then un-ordinals, un-sets... this is really an absolute masterwork of utter insane crackpottery.

## Pi-day randomness

(by MarkCC) Mar 14 2013

One of my twitter friends was complaining about something that's apparently making the rounds of Facebook for π-day. It annoyed me sufficiently to be worth ranting about a little bit.

Why isn't π rational if π=circumference/diameter, and both measurements are plainly finite?

There's a couple of different ways of interpreting this question.

The stupidest way of interpreting it is that the author didn't have any clue of what an irrational number is. An irrational number is a number which cannot be written as a ratio of two integers. Another way of saying essentially the same thing is that there's no way to create a finite representation of an irrational number. I've seen people get this wrong before, where they confuse not having a finite representation with not being finite.

π doesn't have a finite representation. But it's very clearly finite - it's less that 3 1/4, which is obviously not infinite. Anyone who can look at π, and be confused about whether or not it's finite is... well... there's no nice way to say this. If you think that π isn't finite, you're an idiot.

The other way of interpreting this statement is less stupid: it's a question of measurement. If you have a circular object in real life, then you can measure the circumference and the diameter, and do the division on the measurements. The measurements have finite precision. So how can the ratio of two measurements with finite precision be irrational?

The answer is, they can't. But perfect circles don't exist in the real world. Many mathematical concepts don't exist in the real world. In the real world, there's no such thing as a mathematical point, no such thing as a perfect line, no such thing as perfectly parallel lines.

π isn't a measured quantity. It's a theoretical quantity, which can be computed analytically from the theoretical properties derived from the abstract properties of an ideal, perfect circle.

No "circle" in the real world has a perfect ratio of π between its circumference and its diameter. But the theoretical circle does.

The facebook comments on this get much worse than the original question. One in particular really depressed me.

Just because the measurements are finite doesn't mean they're rational.
Pi is possibly rational, we just haven't figured out where it ends.

Gah, no!

We know an awful lot about π. And we know, with absolute, 100% perfect certainty that π never ends.

We can define π precisely as a series, and that series makes it abundantly clear that it never ends.

That series goes on forever. π can't ever end, because that series never ends.

Just for fun, here's a little snippet of Python code that you can play with. You can see how, up to the limits of your computer's floating point representation, that a series computation of π keeps on going, changing with each additional iteration.

def pi(numiter):
val = 3.0
sign = 1
for i in range(numiter):
term = ((i+1)*2) * ((i+1)*2 + 1) * ((i+1) *2 + 2)
val = val + sign*4.0/term
sign = sign * -1
return val


## Finally: Gödel's Proof of Incompleteness!

(by MarkCC) Mar 12 2013

Finally, we're at the end of our walkthrough of Gödel great incompleteness proof. As a refresher, the basic proof sketch is:

1. Take a simple logic. We've been using a variant of the Principia Mathematica's logic, because that's what Gödel used.
2. Show that any statement in the logic can be encoded as a number using an arithmetic process based on the syntax of the logic. The process of encoding statements numerically is called Gödel numbering.
3. Show that you can express meta-mathematical properties of logical statements in terms of arithemetic properties of their Gödel numbers. In particular, we need to build up the logical infrastructure that we need to talk about whether or not a statement is provable.
4. Using meta-mathematical properties, show how you can create an unprovable statement encoded as a Gödel number.

What came before:

1. Gödel numbering: The logic of the Principia, and how to encode it as numbers. This was step 1 in the sketch.
2. Arithmetic Properties: what it means to say that a property can be expressed arithemetically. This set the groundwork for step 2 in the proof sketch.
3. Encoding meta-math arithmetically: how to take meta-mathematical properties of logical statements, and define them as arithmetic properties of the Gödel numberings of the statements. This was step 2 proper.

So now we can move on to step three, where we actually see why mathematical logic is necessarily incomplete.

When we left off with Gödel, we'd gone through a very laborious process showing how we could express meta-mathematical properties of logical statements as primitive recursive functions and relations. We built up to being able to express one non-primitive recursive property, which describes the property that a given statement is provable:

pred provable(x) =
some y {
proofFor(y, x)
}
}


The reason for going through all of that was that we really needed to show how we could capture all of the necessary properties of logical statements in terms of arithmetic properties of their Gödel numbers.

Now we can get to the target of Gödel's effort. What Gödel was trying to do was show how to defeat the careful stratification of the Principia's logic. In the principia, Russell and Whitehead had tried to avoid problems with self-reference by creating a very strict stratification, where each variable or predicate had a numeric level, and could only reason about objects from lower levels. So if natural numbers were the primitive objects in the domain being reasoned about, then level-1 objects would be things like specific natural numbers, and level-1 predicates could reason about specific natural numbers, but not about sets of natural numbers or predicates over the natural numbers. Level-2 objects would be sets of natural numbers, and level-2 predicates could reason about natural numbers and sets of natural numbers, but not about predicates over sets of natural numbers, or sets of sets of natural numbers. Level-3 objects would be sets of sets of natural numbers... and so on.

The point of this stratification was to make self-reference impossible. You couldn't make a statement of the form "This predicate is true": the predicate would be a level-N predicate, and only a level N+1 predicate could reason about a level-N predicate.

What Gödel did in the laborious process we went through in the last
post is embed a model of logical statements in the natural numbers. That's the real trick: the logic is designed to work with a set of objects that are a model of the natural numbers. By embedding a model of logical statements in
the natural numbers, he made it possible for a level-1 predicate (a predicate about a specific natural number) to reason about any logical statement or object. A level-1 predicate can now reason about a level-7 object! A level-1 predicate can reason about the set defined by a level-1 predicate: a level-1 predicate can reason about itself!.

Now, we can finally start getting to the point of all of this: incompleteness! We're going to use our newfound ability to nest logical statements into numbers to construct an unprovable true statement.

In the last post, one of the meta-mathematical properties that we defined for the Gödel-numbered logic was immConseq, which defines when some statement x is an immediate consequence of a set of statements S. As a reminder, that means that x can be inferred from statements in S in one inferrence step.

We can use that property to define what it means to be a consequence of a set of statements: it's the closure of immediate consequence. We can define it in pseudo-code as:

def conseq(κ) = {
K = κ + axioms
do {
for all c in immConseq(K) {
if c not in K {
}
}
return K
}


In other words, Conseq(κ) is the complete set of everything that can possibly be inferred from the statements in κ and the axioms of the system. We can say that there's a proof for a statement x in κ if and only if x ∈ Conseq(κ).

We can the idea of Conseq use that to define a strong version of what it means for a logical system with a set of facts to be consistent. A system is ω-consistent if and only if there is not a statement a such that: a ∈ Conseq(κ) ∧ not(forall(v, a)) ∈ Conseq(κ).

In other words, the system is ω-consistent as long as it's never true that both a universal statement and it. But for our purposes, we can treat it as being pretty much the same thing. (Yes, that's a bit hand-wavy, but I'm not trying to write an entire book about Gödel here!)

(Gödel's version of the definition of ω-consistency is harder to read than this, because he's very explicit about the fact that Conseq is a property of the numbers. I'm willing to fuzz that, because we've shown that the statements and the numbers are interchangable.)

Using the definition of ω-consistency, we can finally get to the actual statement of the incompleteness theorem!

Gödel's First Incompleteness Theorem: For every ω-consistent primitive recursive set κ of formulae, there is a primitive-recursive predicate r(x) such that neither forall(v, r) nor not(forall(v, r)) is provable.

To prove that, we'll construct the predicate r.

First, we need to define a version of our earlier isProofFigure that's specific to the set of statements κ:

pred isProofFigureWithKappa(x, kappa) = {
all n in 1 to length(x) {
isAxiom(item(n, x)) or
item(n, x) in kappa or
some p in 0 to n {
some q in 0 to n {
immedConseq(item(n, x), item(p, x), item(q, x))
}
}
} and length(x) > 0
}


This is the same as the earlier definition - just specialized so that it ensures that every statement in the proof figure is either an axiom, or a member of κ.

We can do the same thing to specialize the predicate proofFor and provable:

pred proofForStatementWithKappa(x, y, kappa) = {
isProofFigureWithKappa(x, kappa) and
item(length(x), x) = y
}

pred provableWithKappa(x, kappa) = {
some y {
proofForStatementWithKappa(y, x, kappa)
}
}


If κ is the set of basic truths that we can work with, then
provable in κ is equivalent to provable.

Now, we can define a predicate UnprovableInKappa:

pred NotAProofWithKappa(x, y, kappa) = {
not (proofForKappa(x, subst(y, 19, number(y))))
}


Based on everything that we've done so far, NotAProofWithKappa is primitive recursive.

This is tricky, but it's really important. We're getting very close to the goal, and it's subtle, so let's take the time to understand this.

• Remember that in a Gödel numbering, each prime number is a variable. So 19 here is just the name of a free variable in y.
• Using the Principia's logic, the fact that variable 19 is free means that the statement is parametric in variable 19. For the moment, it's an incomplete statement, because it's got an unbound parameter.
• What we're doing in NotAProofWithKappa is substituting the numeric coding of y for the value of y's parameter. When that's done, y is no longer incomplete: it's unbound variable has been replaced by a binding.
• With that substitution, NotAProofWithKappa(x, y, kappa) is true when x does not prove that y(y) is true.

What NotAProofWithKappa does is give us a way to check whether a specific sequence of statements x is not a proof of y.

We want to expand NotAProofWithKappa to something universal. Instead of just saying that a specific sequence of statements x isn't a proof for y, we want to be able to say that no possible sequence of statements is a proof for y. That's easy to do in logic: you just wrap the statement in a "∀ x ( )". In Gödel numbering, we defined a function that does exactly that. So the universan form of provability is:
∀ a (NotAProofWithKappa(a, y, kappa)).

In terms of the Gödel numbering, if we assume that the Gödel number for the variable a is 17, and the variable y is numbered as 19, we're talking about the statement p = forall(17, ProvableInKappa(17, 19, kappa).

p is the statement that for some logical statement (the value of variable 19, or y in our definition), there is no possible value for variable 17 (a) where a proves y in κ.

All we need to do now is show that we can make p become self-referential. No problem: we can just put number(p) in as the value of y in UnprovableInKappa. If we let q be the numeric value of the statement UnprovableInKappa(a, y), then:

r = subst(q, 19, p)

i = subst(p, 19, r)

i says that there is no possible value x that proves p(p). In other words, p(p) is unprovable: there exists no possible proof that there is no possible proof of p!

This is what we've been trying to get at all this time: self-reference! We've got a predicate y which is able to express a property of itself. Worse, it's able to express a negative property of itself!

Now we're faced with two possible choices. Either i is provable - in which case, κ is inconsistent! Or else i is unprovable - in which case κ is incomplete, because we've identified a true statement that can't be proven!

That's it: we've shown that in the principia's logic, using nothing but arithmetic, we can create a true statement that cannot be proven. If, somehow, it were to be proven, the entire logic would be inconsistent. So the principia's logic is incomplete: there are true statements that cannot be proven true.

We can go a bit further: the process that we used to produce this result about the Principia's logic is actually applicable to other logics. There's no magic here: if your logic is powerful enough to do Peano arithmetic, you can use the same trick that we demonstrated here, and show that the logic must be either incomplete or inconsistent. (Gödel proved this formally, but we'll just handwave it.)

Looking at this with modern eyes, it doesn't seem quite as profound as it did back in Gödel's day.

When we look at it through the lens of today, what we see is that in the Principia's logic, proof is a mechanical process: a computation. If every true statement was provable, then you could take any statement S, and write a program to search for a proof of either S or ¬ S, and eventually, that program would find one or the other, and stop.

In short, you'd be able to solve the halting problem. The proof of the halting problem is really an amazingly profound thing: on a very deep level, it's the same thing as incompleteness, only it's easier to understand.

But at the time that Gödel was working, Turing hadn't written his paper about the halting problem. Incompletess was published in 1931; Turing's halting paper was published in 1936. This was a totally unprecedented idea when it was published. Gödel produced one of the most profound and surprising results in the entire history of mathematics, showing that the efforts of the best mathematicians in the world to produce the perfection of mathematics were completely futile.

(by MarkCC) Mar 02 2013

Over on twitter, some folks were chatting about the latest big security botch. A major service, called Evernote, had a security breach where a password file was stolen. Evernote has handled the situation quite well, being open about what happened, and explaining the risks.

In their description of the breach, they said that the stolen passwords were "both hashed and salted". Apparently this sounds funny to people outside of software. (Amazing how jargon becomes so ingrained that I didn't even notice the fact that it could be interpreted in a funny way until it was pointed out to me!)

Anyway, since discussion of this is going around, I thought I'd explain just what password hashing and salting means.

Let's start at the beginning. You're some kind of system that wants to have password security. Obviously, you need to save the passwords somewhere, right?

As we'll see, that's only partially correct - but let's go with it for now. You need to store something that lets you check if a user supplied the right password, so you need to store something.

The most naive approach is create a file (or database, or whatever) that contains the usernames and passwords of all of your users. Something like:


alice:abc
mark:pass
joe:123
jen:hello


Suppose you were a thief, and you wanted to crack this password file, what would you do? You'd try to steal that file! If you can get hold of that password file, then you'd have all of the passwords for all of the users of the system.

That means that this is a terrible way of storing the passwords. One step, and a thief has completely breached your system. We don't want that. So what should we do?

First, we could encrypt the file.

This might seem like a good idea at first. If a thief were the steal the file, the wouldn't be able to find your user's passwords without figuring out the encryption key! It's going to take a lot of work to figure out that key!

The first problem with this is that any password can be cracked given enough time and power. If there's only one encryption key for the entire file, then it's worth investing a lot of time and power into breaking it - and once it's broken, then everything is revealed.

The second problem is: how does your system check a user's password? It needs to decrypt the file! That means that the encryption key must be available to your system! So all that a thief needs to do is figure out where your system is getting the key from. You've got your entire security system for all of your users set up with a single point of protection, and somewhere in your system, everything that you need to break that protection is available!

What can we do to improve this? The answer is something called crypto graphic hashing.

Cryptographic hashing is a combination of two concepts: hashing, and one-way functions.

A really simple example of a not-very-good hash function of a string would be something like: convert all of the characters in the string to their numeric values, and exclusive-or the binary representation of those bits. With that hash function, you could take a string like "ABCD", and convert it to the numeric values of the characters ([65, 66, 67, 68]), and then x-or them together (1000001 xor 1000010 xor 1000011 xor 1000100 = 0000100) for a result of 4. Real practical hash functions are more complicated.

For example, at least some versions of Java use the following as the default hash for a string of characters:

There's a class of mathematical functions called one-way functions. A one way function is a function , where given , it's easy to compute , but given it's extremely difficult (or even impossible) to compute .

If we combine those two, we have what's called a crpytogrphic hash function: a function that takes an arbitrary input string, and converts it to a number, in a way where it's very difficult to figure out what the input string that produced the number was. That's great for storing passwords! We don't store the password at all. We just store the hash-value produced from the password. Then when a user comes and logs in, we take their password, hash it, and compute the result to the stored hash.

alice:7a2d28fc
mark:dfd4e1c6
joe:ed849ee1
jen:bb76e739


This is much better than storing the encrypted password. There is no encryption key that a thief can use to decrypt the password. Even if a thief knows the hash values of your user's passwords, they can't get in to the system! And your system actually never stores the actual values of the user's passwords - just their hashcodes!

So again, let's look at this from the perspective of a thief. How can a thief break into a system with hashed passwords?

If they don't know what hash function you're using, then they're completely stuck. Sadly, they can probably figure it out. Designing new crpytographic hash functions is hard. Implementing cryptographic hash functions correctly is hard. As a result, most people just use a hash function from a library. That means that for a thief, it's usually pretty easy to figure out what hash function is being used by a system.

Once they know what hash function you used, their only choice to break your system is to try to guess the passwords. That is, they can guess passwords, compute their hash codes, and search through your password file to see if any of the users password hashes matches. If they find one, they're gold!

In fact, there's a common strategy based on this idea called a rainbow table. A rainbox table is a list of common passwords, and the numeric value that they hash to with a common crptographic hash value. Something like:

pass 1b93eb12
abc 7a2d28fc
... ...

If you can somehow steal the passwords file, then with a rainbow table, you can find users with common passwords. For example, in the table above, you can see that the hashcode "7a2d28fc" occurs in the passwords file for the username "alice", and it's also in the rainbow table for the password "abc". So a thief could determing that alice's password was "abc". Even with the best crpytographic hash, all it takes is one idiot user who uses "password" as their password, and your system's security is breached.

Salting passwords addresses that problem. In a salting strategy, you don't hash a user's password by itself: you combine it with some additional data, and then hash that combination. The additional information is called the salt..

You can use lots of different things for the salt. There's a complex set of tradeoffs in the exact salting strategy, which are beyond the scope of this post, but a few examples include:

1. Always use a fixed salt string. This is weak, but better than nothing. It's got a similar weakness to the encrypted password system: you only need one salt to give you a handle on breaking all of the passwords, and that one salt needs to be in the system.
2. Add a random piece of data for each password. The catch here is that you need to store the salt data for each password. This is what unix passwords used to use. They added 12 random bits to each password. In the passwords file, they stored the salt and the hashed password. The weakness of this is that the salt is right there with the password entry. But because each user has a different salt, that means that any attempt to breach the system needs to look at each user separately.
3. Salt on metadata: that is, take information about the user that isn't part of their username, and use that as the salt. For example, you could use a person's birthday as the salt for their account.

If each user has a different salt, then even if you've got terrible passwords, a thief needs to do a lot of work to try to break your system. Even with a rainbow-table like strategy, they can't compute the hashcode for a given common password once, and then search the password hash list for that code - they need to recompute it for each possible salt value!

What salting does is, effectively, increase the amount of effort needed to break the passwords. If you add 12 bits of salt, then a rainbow table needs 4096 times more entries to find common passwords! If your salt is long enough, then it can make it effectively impossible to create a rainbox table at all. If they try to attack you without a rainbow table, a 12 bit salt means that your attacker needs to attack the passwords of each of your users seperately! Even if they know the value of the salt, you've made it much harder for them to breach your security.

## New Dimensions of Crackpottery

(by MarkCC) Feb 26 2013

I have, in the past, ranted about how people abuse the word "dimension", but it's been a long time. One of my followers on twitter sent me a link to a remarkable piece of crackpottery which is a great example of how people simply do not understand what dimensions are.

There are several ways of defining "dimension" mathematically, but they all come back to one basic concept. A dimension an abstract concept of a direction. We can use the number of dimensions in a space as a way of measuring properties of that space, but those properties all come back to the concept of direction. A dimension is neither a place nor a state of being: it is a direction.

Imagine that you're sitting in an abstract space. You're at one point. There's another point that I want you to go to. In order to uniquely identify your destination, how many directions do I need to mention?

If the space is a line, you only need one: I need to tell you the distance. There's only one possible direction that you can go, so all I need to tell you is how far. Since you only need one direction, the line is one-dimensional.

If the line is a plane, then I need to tell you two things. I could do that by saying "go right three steps then up 4 steps", or I could say "turn 53 degrees clockwise, and then walk forward 5 steps." But there's no way I can tell you how to get to your destination with less than two directions. You need two directions, so the plane is two dimensional.

If the space is the interior of a cube, then you'll need three directions, which means that the cube is three dimensional.

On to the crackpottery!

E=mc2 represents a translation across dimensions, from energy to matter.

No, it does not. Energy and matter are not dimensions. is a statement about the fundamental relation between energy and matter, not a statement about dimensions. Our universe could be 2 dimensional, 3 dimensional, 4 dimensional, or 22 dimensional: relativity would still mean the same thing, and it's not a statement about a "translation across dimensions".

Energy can travel at the speed of light, and as Special Relativity tells us, from the perspective of light speed it takes no time to travel any distance. In this way, energy is not bound by time and space the way matter is. Therefore, it is in a way five-dimensional, or beyond time.

Bzzt, no.

Energy does not travel. Light travels, and light can transmit energy, but light isn't energy. Or, from another perspective, light is energy: but so is everything else. Matter and energy are the same thing.

From the perspective of light speed time most certainly does pass, and it does take plenty of time to travel a distance. Light takes roughly 6 minutes to get from the sun to the earth. What our intrepid author is trying to talk about here is the idea of time dilation. Time dilation describes the behavior of particles with mass when they move at high speeds. As a massive particle moves faster and approaches the speed of light, the mass of the particle increases, and the particle's experience of time slows. If you could accelerate a massive particle to the speed of light, its mass would become infinite, and time would stop for the particle. "If" is the key word there: it can't. It would require an infinite amount of energy to accelerate it to the speed of light.

But light has no mass. Relativity describes a strange property of the universe, which is hard to wrap your head around. Light always moves at the same speed, no matter your perspective. Take two spacecraft in outer space, which are completely stationary relative to each other. Shine a laser from one, and measure how long it takes for the light to get to the other. How fast is it going? Roughly 186,000 miles/second. Now, start one ship moving away from the other at half the speed of light. Repeat the experiment. One ship is moving away from the other at a speed of 93,000 miles/second. From the perspective of the moving ship, how fast is the light moving away from it towards the other ship? 186,000 miles/second. From the perspective of the stationary ship, how fast is the laser light approaching it? 186,000 miles/second.

It's not that there's some magic thing about light that makes it move while time stops for it. Light is massless, so it can move at the speed of light. Time dilation doesn't apply because it has no mass.

But even if that weren't the case, that's got nothing to do with dimensionality. Dimensionality is a direction: what does this rubbish have to do with the different directions that light can move in? Absolutely nothing: the way he's using the word "dimension" has nothing to do with what dimensions mean.

All “objects” or instances of matter are time-bound; they change, or die, or dissolve, or evaporate. Because they are subject to time, objects can be called four-dimensional.

Nope.

Everything in our universe is subject to time, because time is one of the dimensions in our universe. Time is a direction that we move. We don't have direct control over it - but it's still a direction. When and where did I write this blog post compared to where I am when you're reading it? The only way you can specify that is by saying how far my position has changed in four directions: 3 spatial directions, and time. Time is a dimension, and everything in our universe needs to consider it, because you can't specify anything in our universe without all four dimensions.

The enormous energy that can be released from a tiny object (as in an atomic bomb) demonstrates the role dimensions play in constructing reality.

No: the enormous energy that can be released from a tiny object demonstrates the fact that a small quantity of matter is equivalent to a large quantity of energy. As you'd expect if you look at that original equation: . A gram of mass - something the size of a paperclip - is equivalent to about 25 million kilowatt-hours of energy - or more than the total yearly energy use of 1,200 average americans. That's damned impressive and profound, without needing to draw in any mangled notions of dimensions or magical dimensional powers.

Higher dimensions are mind-blowingly powerful; even infinitely so. Such power is so expansive that it can’t have form, definition, or identity, like a ball of uranium or a human being, without finding expression in lower dimensions. The limitations of time and space allow infinite power to do something other than constantly annihilate itself.

Do I even need to respond to this?

Einstein’s equation E=mc2 bridges the fourth and the fifth dimensions, expressed as matter and energy. Imagine a discovery that bridges expressions of the fifth and sixth dimensions, such as energy and consciousness. Consciousness has the five-dimensional qualities of energy, but it can’t be “spent” in the way energy can because it doesn’t change form the way energy does. Therefore, it’s limitless.

And now we move from crackpottery to mysticism. Einstein's mass-energy equation doesn't bridge dimensions, and dimensionality has nothing do with mass-energy equivalence. And now our crackpot friend suddenly throws in another claim, that consciousness is the sixth dimension? Or consciousness is the bridge between the fifth and sixth dimensions? It's hard to figure out just what he's saying here, except for the fact that it's got nothing to do with actual dimensions.

Is there a sixth dimension? Who knows? According to some modern theories, our universe actually has many more than the 4 dimensions that we directly experience. There could be 6 or 10 or 20 dimensions. But if there are, those dimensions are just other directions that things can move. They're not abstract concepts like "consciousness".

And of course, this is also remarkably sloppy logic:

1. Consciousness has the 5-dimensional qualities of energy
2. Consciousness can't be spent.
3. Consciousness can't change form.
4. Therefore consciousness is unlimited.

The first three statements are just blind assertions, given without evidence or argument. The fourth is presented as a conclusion drawn from the first three - but it's a non-sequitur. There's no real way to conclude the last statement given the first three. Even if you give him all the rope in the world, and accept those three statements as axioms - it's still garbage.

Older posts »

• Scientopia Blogs