As an avid reader of Language Log, my interest was recently piqued by a commenter asking for a linguist's eye-view on the "Knobe Effect":
"Speaking of Joshua Knobe, has any linguist looked into the Knobe Effect? The questionnaire findings are always passed off as evidence for some special philosophical character inherent in certain concepts like intentionality or happiness. I'd be interested in a linguist's take. If I had to guess, I'd say the experimenters have merely found some (elegant and) subtle polysemic distinctions that some words have. As in, 'intend' could mean different things depending on whether the questionnaire-taker believes blameworthiness or praiseworthiness to be the salient question. Or 'happy' could mean 'glad' in one context but 'wholesome' in another, etc…"
Asking for an opinion, eh? When do I not have an opinion? (To be fair, it happens more than you might expect).
But of course, I do have an opinion on this, and it's not quite the same as the one articulated by Edge. This post is a long one, so let me offer a teaser by saying that the questions at stake in this are : What is experimental philosophy and is it new? How does the language we speak both encode and subsequently shape our moral understanding? How can manipulating someone's linguistic expectations change their reasoning? And what can we learn about all these questions by productively plumbing the archives of everyday speech?
For those who are not familiar, Joshua Knobe is an up-and-coming 'experimental philosopher' at Yale, and is well-known for his experimental work looking at how we interpret a person's actions depending on linguistic context. The idea underpinning his approach is that we can better understand philosophical concepts if we look at how people use and respond to them in practice. Many of these experiments focus on intentionality : i.e., in what contexts do we say that a person acted intentionally, and in what contexts unintentionally? Based on these findings, Josh wants to claim that he has discovered something 'deep' about the nature of theory of mind, intentional action, and moral judgment. But has he? I'd argue that he's discovered something about how we use certain words and what we take them to mean. Is that deep? Perhaps! Read on -- and you tell me.
There is one thing I'd say first though, which is that while Josh's approach is widely taken to be innovative or revolutionary, it's almost certainly not. Wittgenstein proposed this method of investigation in the 1930s, and Chomsky roundly denied that linguistic research could tell us anything about these kinds of 'philosophical' questions in the 1960s, in response to an enthusiastic outburst by Zeno Vendler. (See also J.L. Austin and Gilbert Ryle.) So it's been 'a thing' -- or a 'thing to contest' -- for a while now. And I'm not sure it's even proper to say that it's just now experiencing a resurgence; any cognitive scientist working on the nature of language, language learning, or concepts is almost certainly engaging with philosophy (and philosophical questions) in a similar, experimental way.
When philosophers use a word -- "knowledge," "being," "object," "I," "proposition," "name" -- and try to grasp the essence of the thing, one must always ask oneself : is the word ever actually used in this way in the language-game which is its original home? What we do is to bring words back from their metaphysical to their everyday use. --Wittgenstein, Philosophical Investigations
That quibble aside, Josh's is certainly a clever means of investigation, and one that I would endorse. It's really the various conclusions that Josh entertains that I find a mite strange.
This is because I think there is a fairly simple (and mechanistic) explanation for the "Knobe Effect." The effects he gets in his experiments come straight out of corpus data (i.e., reams of data about how we use words in everyday speech). That may sound like so much Greek to you, so let me begin by giving you some examples of the kinds of experiments Josh conducts.
In one of his best known papers, he asked bystanders in a public park to read one of the following two stories.
"The vice-president of a company went to the chairman of the board and said, ‘We are thinking of starting a new program. It will help us increase profits, but it will also harm the environment.’ The chairman of the board answered, ‘I don’t care at all about harming the environment. I just want to make as much profit as I can. Let’s start the new program.’ They started the new program. Sure enough, the environment was harmed."
"The vice-president of a company went to the chairman of the board and said, ‘We are thinking of starting a new program. It will help us increase profits, and it will also help the environment.’ The chairman of the board answered, ‘I don’t care at all about helping the environment. I just want to make as much profit as I can. Let’s start the new program.’ They started the new program. Sure enough, the environment was helped."
Josh then asked the bystanders to rate whether the chairman had intentionally harmed / helped the environment on a scale of 0-6. The results were intriguing: Those who read the 'harm' story were eager to blame the chairman for the ill-effects wrought on the environment (and tended to agree that he had 'intentionally' harmed the environment). Those who read the 'help' story, on the other hand, weren't eager to award the chairman credit (and tended to agree that he hadn't 'intentionally' helped the environment). In other words, even though the stories were virtually identical, intentionality was far more likely to be attributed to the chairman in the 'harm' situation than in the 'help' situation. (The differences were highly significant; p < .001.)
Josh has since replicated this effect in several similar studies. The question is -- what does this tell us? In the original experiment, Josh concluded mildly that
"there seems to be an asymmetry whereby people are considerably more willing to blame the agent for bad side-effects than to praise the agent for good side-effects. And this asymmetry in people’s assignment of praise and blame may be at the root of the corresponding asymmetry in people’s application of the concept intentional: namely, that they seem considerably more willing to say that a side-effect was brought about intentionally when they regard that side-effect as bad than when they regard it as good."
Not much I disagree with there; but we'll get to his more ambitious conclusions in a moment. First, though, let's take a detour --
A Complementary Vein of Research
At the time that I became familiar with Knobe's research, I was hard at work on an article with Prof Plum and Prof Teenie Matlock about lexical priming effects on temporal reasoning. --Which is a fancy way of saying that we were looking at how we could mess with people's understanding of time by having them read different motion verbs in context.
The experimental design was similar in style to Knobe's. In four experiments, we had people read a series of virtually identical 'priming' sentences and then had them answer the following question : "Next Wednesday's meeting has been moved ahead two days. Which day is it on now?" (Stop. Pause for a moment. Which day is it?)
Trick question. This question is ambiguous because the answer could either be Monday or Friday, depending on how you're conceptualizing time.
Don't believe me? Try this on for size:
First, imagine that time is like a conveyor belt and that you are standing in place, as it slips past you. If you conceive of time this way, then the future is a point up ahead, moving ever closer to you. Adopt this perspective, and you'll almost surely answer Monday because moving time 'ahead' will move it closer towards where you are now.
Now imagine that the conveyor belt stops, and you are the one moving through time. In this case, moving 'ahead' in time will move you farther along into the future. If you imagine it this way, then you should answer Friday.
These are known as "time-moving" and "ego-moving" metaphors, respectively. See now?
(Don't worry if you don't right away. It's kind of like those ambiguous duck / rabbit pictures; it can be hard to see one once you've spotted the other.)
But back to the experiment -- we know that when faced with that ambiguous Wednesday question, participants split pretty evenly between Monday and Friday. What we wanted to see was whether we could push our participants toward one answer or the other, simply by having them read a short series of sentences first. As in the Knobe experiments, the sentences were virtually identical but differed on one critical word.
In a particularly illustrative test of this, we had participants read sentences that differed on the motion verb -- for some it was "comes," for others "goes" (e.g., "The road comes all the way from New York" versus "The road goes all the way to New York"). They then answered the ambiguous time question. What we found was that participants who read "comes" skewed heavily toward Monday responses, whereas participants who read "goes" skewed heavily toward Friday. --Which was exactly what we expected.
The question is : why? Or perhaps -- how? How were we able to manipulate their temporal reasoning in this way?
Simple! By manipulating their expectations.
In English, when we use the word "comes," we typically follow it up with 'past-looking' words, like sooner or before (e.g., "I hope summer comes early this year" or "My station comes before hers"). But with "goes" we tend toward just the opposite pattern; we favor 'future-looking' words like after or later ("he goes later in the heat" or "her talk goes after his"). If we look at the words that follow "comes" and "goes" in a large record of human writing and speech -- say, the Corpus of Contemporary American English (COCA) -- then we find that these differences aren't trivial in English. The ratio of future- to past-looking words following "goes" is roughly 2:1; for "comes," the trend reverses, at roughly 1:2. In plain English, this means that "goes" is future-biased, while "comes" is past-biased.
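If you want a feel for how one might pull these biases out of a corpus, here's a minimal sketch in Python. Everything in it is illustrative: the word lists, the five-word window, and the corpus file ("corpus_sample.txt") are stand-ins for this post, not the materials from the actual analysis.

```python
import re
from collections import Counter

# Illustrative word lists -- placeholders, not the ones used in the actual study.
FUTURE_WORDS = {"after", "later", "next", "soon", "ahead"}
PAST_WORDS = {"before", "earlier", "sooner", "ago", "early"}

def bias_counts(tokens, verb, window=5):
    """Count future- vs. past-looking words in the few words after `verb`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == verb:
            for follower in tokens[i + 1 : i + 1 + window]:
                if follower in FUTURE_WORDS:
                    counts["future"] += 1
                elif follower in PAST_WORDS:
                    counts["past"] += 1
    return counts

# "corpus_sample.txt" is a placeholder for whatever corpus text you have on hand.
text = open("corpus_sample.txt", encoding="utf-8").read().lower()
tokens = re.findall(r"[a-z']+", text)
for verb in ("comes", "goes"):
    c = bias_counts(tokens, verb)
    ratio = c["future"] / max(c["past"], 1)
    print(f"{verb}: future={c['future']}, past={c['past']}, future/past ~= {ratio:.2f}")
```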
Still, you might be left wondering -- why should our participants be sensitive to this information?
Over the past couple of years, there have been a number of truly groundbreaking studies showing that we probabilistically track how words are used in speech (we know whether this word is used with these words or with those, and to what extent these and not others). These 'usage patterns' inform our expectations about what's coming next in reading or conversation. So, for example, if I strike up a new topic by asking, "What are the effects of doing ___?" you've already begun probabilistically anticipating words like "drugs," "business," and "exercise" (and I can bet you're not thinking of words like "laundry," "research," or "kung fu"). Or, to give another example, should I say, "I'm coming down with a ___," then you'll be expecting words like "flu," "cold," "virus," and so on.
We know that people are able to track this kind of information, because if we throw in a curveball and say "What are the effects of doing Youtube-style videos?" or "I'm coming down with a bottle of beer later," then we can watch their brains registering a pretty mean N400 (surprise!) spike, as they work a bit harder to compute what was just said. (Impressively, the size of the neural spike correlates well with people's reported expectations about what should fill that slot.)
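For the curious, here is a toy illustration of what that 'probabilistic tracking' amounts to: estimate how strongly a context word predicts each possible continuation from simple bigram counts. The mini-corpus below is made up for the example; in practice, these expectations would be built up from something on the scale of COCA.

```python
from collections import Counter, defaultdict

# A made-up mini-corpus: "a cold" is common, "a beer" (in this frame) is rare.
sentences = (["i'm coming down with a cold"] * 3
             + ["i'm coming down with a virus"] * 2
             + ["i'm coming down with a beer"])
tokens = " ".join(sentences).split()

# Tally bigram counts: how often does each word follow each other word?
bigrams = defaultdict(Counter)
for w1, w2 in zip(tokens, tokens[1:]):
    bigrams[w1][w2] += 1

def expectation(context):
    """Estimated P(next word | context), straight from the bigram counts."""
    followers = bigrams[context]
    total = sum(followers.values())
    return {w: round(n / total, 2) for w, n in followers.most_common()}

print(expectation("a"))
# -> {'cold': 0.5, 'virus': 0.33, 'beer': 0.17}
# The low-probability continuation ("beer") is the one that earns the big N400.
```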
Flipping back to the 'time' research for a minute, it's pretty clear that we should be sensitive to the words that tend to follow a given word, and even to whether that word tends to hang out with future-looking words or past-looking words (i.e., whether it's future- or past-biased). From this vantage point, what the "comes / goes" experiment is telling us is that we can cleverly push our participants towards "Monday" or "Friday" responses simply by manipulating their linguistic expectations. Get them thinking "past," and they'll answer Monday; get them thinking "future," and they'll answer Friday. Now, if you're not impressed by this example --which may seem rather intuitive, on inspection-- you may be more impressed by the next experiment we did.
In that experiment, we had participants read a single sentence ("X trees run along the edge of a driveway") before answering the ambiguous Wednesday question. The only word that differed between participants was 'X', which was always a number word (either four, eight, ten, eleven, twelve, nineteen, twenty, over eighty, or a hundred). Because number words are frequently used with time words, and because time words are future-biased, we expected that all of these words would bias participants toward a Friday response. The question was whether we could account for the strength of that bias.
So Prof Plum built some handy models of what linguistic expectation might look like using COCA, and then tested the fit with our priming data.
To examine whether lexical prediction might provide a [good account for our data], we compared the responses provided by our participants given the various number primes to our models of the linguistic expectations the numbers could be expected to produce. As we expected, there was a good fit between the predicted priming and the degree of bias exhibited in the empirical data. The predictions of Model 1, in which we sought to account for the way that different time words might be expected to produce different degrees of future bias, correlated well with the pattern of data produced by our participants (r=.76, t=3.09, p<0.01). Perhaps surprisingly, however, the simpler model (2) performed even better (r=.92, t=5.46, p<0.001). Indeed... if only the proportion of time words amongst the 10 most frequent words in the distribution following each number word is considered, this correlation increases (r=.96, t=8.05, p<0.0001).
(Won't lie -- I do find that r=.96 to be mildly satisfactory).
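To make the logic of that second model concrete, here's a rough Python sketch of the bookkeeping: take the proportion of time words among a number word's ten most frequent corpus followers, then correlate those proportions with the Friday bias each prime produced. The time-word list, the follower counts, and the bias values below are invented placeholders, not the data or materials from the paper.

```python
import numpy as np
from collections import Counter

# Illustrative time-word list -- a placeholder, not the one used in the paper.
TIME_WORDS = {"years", "days", "months", "weeks", "hours", "minutes"}

def time_word_proportion(followers, k=10):
    """Proportion of time words among the k most frequent followers of a prime."""
    top = [word for word, _ in followers.most_common(k)]
    return sum(word in TIME_WORDS for word in top) / k

# An invented follower distribution for "four" (the real one would come from COCA).
four = Counter({"years": 120, "days": 90, "people": 80, "times": 60, "of": 50,
                "or": 40, "months": 35, "hours": 30, "men": 25, "miles": 20})
print(time_word_proportion(four))  # -> 0.4 (four of the ten most frequent followers)

# With one proportion per number prime in hand, the "model" is just a correlation
# between those proportions and the observed Friday bias. Placeholder values:
predicted = [0.40, 0.30, 0.50, 0.20, 0.60, 0.30, 0.50, 0.10, 0.40]
observed = [0.64, 0.55, 0.70, 0.48, 0.78, 0.57, 0.69, 0.41, 0.66]
r = np.corrcoef(predicted, observed)[0, 1]
print(f"r = {r:.2f}")
```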
The bottom line is : We have implicit knowledge of how words are used in speech that comes from a lifetime of exposure to the distributional patterns of words in our language. In reading or listening, this knowledge shapes our expectations and understanding of what we're taking in. As scientists, we can subtly, or even powerfully, manipulate people's expectations and understanding of a given question or task by using particular words in particular ways. And even better, we can predict -- with no little accuracy!-- the effect this is going to have on our subjects.
(Didn't advertisers figure this out a long time ago?)
Returning, then, to Knobe Effects
I had just finished the write-up of these findings when I read the "harm / help" paper, and I was immediately struck by the thought -- what if Knobe effects can be explained in a similar way?
Curious, Prof Plum and I set out to investigate. To do so, we started out with a simple question -- how do we talk about 'help' or 'harm' in terms of intent? To get a measure of this, we looked at the words preceding 'help' and 'harm' in COCA, and calculated how many of these words expressed intent. (Intentional words include 'will,' 'can,' 'may,' 'would,' 'could,' 'should' and so on). We wanted to know how often we say things like "I could help her" versus "He will harm the project."
The difference was striking (and optimism-inspiring!):
It turns out that we talk far more about intentionally helping than intentionally harming or hurting (another word we tested).
To be specific : if we account for frequency differences between 'help' and 'harm,' the proportional difference is about three-fold; if we simply look at raw frequency counts, that number skyrockets to a fifty-odd-fold difference. By either measure, we have considerably more practice talking about "help" in an explicitly intentional way than we do "harm."
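For concreteness, here's roughly what that preceding-word analysis looks like in code. The modal list is the one mentioned above (minus the 'and so on'); the corpus file is a stand-in for COCA, and dividing by each verb's total count is the frequency-normalization step just described.

```python
import re
from collections import Counter

# Intent-marking modals from the post ('and so on' left to the reader).
INTENT_WORDS = {"will", "can", "may", "would", "could", "should"}

def intent_proportion(tokens, verb):
    """Share of a verb's occurrences that directly follow an intent word."""
    total = intended = 0
    for prev, word in zip(tokens, tokens[1:]):
        if word == verb:
            total += 1
            intended += prev in INTENT_WORDS
    return total, (intended / total if total else 0.0)

# "corpus_sample.txt" stands in for COCA here.
text = open("corpus_sample.txt", encoding="utf-8").read().lower()
tokens = re.findall(r"[a-z']+", text)
for verb in ("help", "harm", "hurt"):
    total, prop = intent_proportion(tokens, verb)
    print(f"{verb}: {total} occurrences, {prop:.3f} preceded by an intent word")
```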
At first glance, this asymmetry might seem like a surprising result. If we so often talk about 'help' in an intentional way, then why didn't we interpret the chairman's actions as intentionally helpful in Knobe's scenario? Well -- quite simply -- because we have more practice with it: practice tells us which contexts 'help' is typically used intentionally in, and the scenario Knobe spells out clearly doesn't match them. With "hurt," on the other hand, we have far less practice talking about it in explicitly intentional contexts, which means we should be less discriminating about where we're willing to apply it.
But of course, there's more to it than that. It's also the case that we're far more likely to mention that something was unintentional when it has produced an ill effect.
Here are the results of a simple Google search comparing usage of "didn't mean to..." with "meant to..."
We can see from usage of "didn't mean to..." that when we're emphasizing that something is unintentional, we're far more likely to be talking about hurt / harm than help / support. (Just to make the contrast even more obvious : the phrase "didn't mean to hurt" captures fully 5% of all usages of the word 'hurt,' whereas "didn't mean to help" accounts for just .001% of usages of 'help.') Looking back to the chart -- notice how this trend does an about-face when we look at usage of "meant to..."!
Even this chart doesn't fully capture the differences though. As I was double-checking the counts, the song "Do you really want to hurt me?" came to mind. Hmm. Do I? What would happen to the frequency counts if I took out all instances of "never meant to," "not meant to," "wasn't meant to," and "weren't meant to"?
Wow. So once we clean that up, the "meant to hurt" counts drop by almost 80%, while "meant to help" count drops by, oh, a respectable 4%. This data is screaming : intentionality is the bastion of aid, not injury. Or, to put it somewhat differently : we like to take credit when a good act is intentional, and, equally, we like to avoid blame when a bad act is unintentional. In either case, we're apt to point it out!
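The counts above came from a simple Google search, but the same bookkeeping is easy to sketch over any text sample: count "didn't mean to X" and "meant to X" for each verb, then subtract the negated forms ("never / not / wasn't / weren't meant to"). The file name and the exact patterns below are illustrative, not the queries actually used.

```python
import re

# "corpus_sample.txt" stands in for whatever text you're searching.
text = open("corpus_sample.txt", encoding="utf-8").read().lower()

def count(pattern):
    """Number of matches for a phrase pattern in the text."""
    return len(re.findall(pattern, text))

for verb in ("hurt", "help"):
    unintended = count(rf"didn't mean to {verb}\b")
    meant_all = count(rf"\bmeant to {verb}\b")  # includes negated uses
    negated = count(rf"\b(?:never|not|wasn't|weren't) meant to {verb}\b")
    print(f"{verb}: didn't mean to = {unintended}, meant to = {meant_all}, "
          f"meant to (negations removed) = {meant_all - negated}")
```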
All of which brings us back to Knobe's experiments. There is one, absolutely vital thing to note there : neither intentionality nor unintentionality is made explicit in the scenarios he presents.
What's so delightful about this is how our expectations of what should be made explicit drive us in opposite directions in the case of 'help' versus 'harm.' So, with regard to the "chairman-environment" scenario, here's the rub:
--Since the good act hasn't been explicitly noted as intentional, we're unlikely to see it as intentional, since we expect to see that attribution when it's relevant.
--At the same time, since the bad act hasn't been obviously noted as unintentional, we're unlikely to see it as unintentional, since, once again, we would expect to see that clarified if it were!
This provides one alternative way of accounting for Knobe's data. Of course, you might think (rightly) that this is simply a re-description of his findings in terms of word meaning and use. For instance, he says :
By systematically varying aspects of the vignettes, researchers can determine which factors are influencing people’s intuitions. It can thereby be shown that these intuitions show a systematic sensitivity to moral considerations.
In fairness, there is nothing I disagree with here -- 'help' and 'harm' are, to my understanding, 'moral' words. So if their usage reflects moral considerations, my analysis is entirely consistent with Knobe's conclusions.
However, I would press that by offering a more mechanistic account of what is happening in this experiment, we can clear away some of the confusion about what is motivating these results. For example, Knobe lays out these possible explanations for his data:
One view holds that the emotional reactions triggered by morally bad behaviors can distort people’s theory-of-mind judgments. A second view holds that moral considerations play a role in the pragmatics of people’s use of certain terms but not in the semantics of their theory-of-mind concepts. Finally, a third view holds that moral considerations truly do play a role in the fundamental competence underlying people’s theory-of-mind capacities.
To this, he adds a fourth possibility:
These results provide at least tentative support for the view that the effects can emerge even in the absence of emotional responses, and some researchers have recently suggested that the effects might be due, not to an emotional bias, but rather to an innate, domain-specific ‘moral grammar.’
Of course, this is Hauser's speculation, not Knobe's. But -- an innate, domain-specific moral grammar? I certainly wouldn't draw that conclusion from all this.
More seriously, the analysis I've offered suggests something very simple : that the way we understand and attribute intentionality has to do with our particular linguistic expectations about how "good" or "bad" actions will be described within a given narrative frame. Far from being 'innate' or 'universal,' these expectations may be (and are even likely to be) culturally contingent. Caitlin Fausey, a graduate student wunderkind, formerly of Lera Boroditsky's lab, has done some fascinating work on how causal attribution and blame differs between languages, such as English, Spanish and Japanese. Her work shows, quite clearly, that the way we use language to describe events can significantly impact our reasoning about (and memory for) them.
From the perspective of -- well, this armchair at least -- Knobe's work can be seen as adding to that literature: a curious and important contribution, no doubt. His results also tell us something about how the language we speak both encodes, and subsequently shapes, our culture's moral perspectives and expectations.
Is this deep?
I'd say it's unexpectedly so.
Knobe, J. (2003). Intentional action and side effects in ordinary language. Analysis, 63 (279), 190-194 DOI: 10.1111/1467-8284.00419
Knobe, J. (2005). Theory of mind and moral cognition: exploring the connections. Trends in Cognitive Sciences, 9 (8), 357-359 DOI: 10.1016/j.tics.2005.06.011
Ramscar, M., Matlock, T., & Dye, M. (2010). Running down the clock: the role of expectation in our understanding of time and motion. Language and Cognitive Processes, 25 (5), 589-615
1. The Chomsky-Foucault Debate (Hat tip to Ben H.)
2. The following papers on the "time" experiments : the pioneers, McGlone & Harding (1998); see also Boroditsky (2000); Boroditsky & Ramscar (2002); Matlock, Ramscar, & Boroditsky (2005). There are many more besides; Ramscar, Matlock & Dye (2010) provides a brief review.