The question is: are you dumber than a rat?

Many developmental psychologists buy into an argument that suggests that children are dumber than rats.  Should you?


Human cognition is geared towards the central task of predicting the world around it.  As you may remember from an earlier post I did on the A-not-B task in infants, children aren't born understanding causal relationships right off the bat -- as a kid, you need to learn that when batter goes into the oven, it comes out as cake; when a dog jumps in water, it comes out wet; and when a shaggy dog runs dripping through the house, mommy gets mad. As an adult, prediction operates in just about everything you do, from how much you drink at a party (who do you really want to be going home with?) to how hard you push down on the brakes (how fast do you need the car to stop?) to what you think I'm going to say next (yep, there's lots of evidence that you're predicting my words in a manner not wholly unlike Google auto-complete).

One thing that matters immensely in all of this is informativity.  There are many illusory correlations in the world that you might forge -- how do you establish the causal links that matter and are meaningful?

A simple way to begin answering that question is by asking -- how do other animals do it?  The behavior of rats in conditioning experiments proves illustrative.  Say you take a rat, and every so often, you play a piano tone and give it a little **zap**! Pretty quickly, the rat will begin to react fearfully whenever it hears the tone, because it's predicting the upcoming shock.  (Not too hard to learn that one, eh?)  But next, let's say that you give another rat the same number of tone-shock pairings, but this time, you also occasionally throw in the odd note without shocking the rat.  This rat won't be as skittish when it hears the tone sound, because the tone doesn't necessarily predict a shock.  In line with this, the more you increase the number of tones-without-shocks, the less the rat will fear the tone.  This is because you've introduced 'noise' into the signal, making the tone less informative about fur-singeing jolts.  (This example comes courtesy of Bob Rescorla and one Prof Plum, who is ever so fond of mentioning it.)
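(For the quantitatively minded: below is a minimal sketch of how this kind of error-driven learning is standardly modeled, using the Rescorla-Wagner update rule. The learning rate, trial counts, and blocked trial order are invented for illustration -- they aren't fitted to any actual rat.)

```python
# A minimal Rescorla-Wagner sketch: how strongly does the tone come to predict the shock?
# All parameter values here are illustrative assumptions, not fitted to data.

def train(n_tone_shock, n_tone_alone, learning_rate=0.1, n_blocks=50):
    """Interleave reinforced and unreinforced tone trials; return the final
    associative strength v between tone and shock (0 = no expectation, 1 = certain)."""
    v = 0.0
    for _ in range(n_blocks):
        for _ in range(n_tone_shock):   # tone followed by shock: error is (1 - v), so v rises
            v += learning_rate * (1.0 - v)
        for _ in range(n_tone_alone):   # tone with no shock: error is (0 - v), so v falls
            v += learning_rate * (0.0 - v)
    return v

# Rat 1: every tone is followed by a shock.
print(train(n_tone_shock=1, n_tone_alone=0))   # approaches 1.0 -- very skittish
# Rat 2: the same number of tone-shock pairings, plus two tones-without-shocks per pairing.
print(train(n_tone_shock=1, n_tone_alone=2))   # roughly 0.3 -- the tone is much less informative
```

(The second rat ends up expecting the shock far less -- the 'noise in the signal' effect described above, captured in a dozen lines.)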

The bottom line is that if a rat is trying to establish when to jump, it's going to want to be tracking both positive evidence (tone and shock together) and negative evidence (tone without shock).  And this becomes doubly important if you complicate the learning problem, and introduce various types of tones and shocks, and relationships between the two.  The important thing is: rats can do this kind of learning without any trouble at all. What might surprise you is that there's a huge debate in psychology over whether people can.

To be fair, this debate only occurs in language. Learning theoretic models are used widely across many disciplines in psychology and neuroscience, but are conspicuously absent from mainstream research into language acquisition. Why? Well -- there are Chomsky's many arguments that language cannot be learned from the input (that could be part of it). And in addition, there is this delightful argument that Steven Pinker and colleagues like to tout, which is misleadingly called the "logical problem of language acquisition" [1].

I'm likely going to devote a series of posts to that particular 'problem,' at some point -- but in short, the argument is that because children early on make grammatical 'mistakes' in their speech (e.g., saying 'mouses' instead of 'mice' or 'go-ed' instead of 'went'), and because they do not receive much in the way of corrective feedback from their parents (apparently no parent ever says "No, Johnny, for the last time it's MICE"), it must therefore be impossible to explain how children ever learn to correct these errors. How -- ask the psychologists -- could little Johnny ever possibly 'unlearn' these mistakes? This supposed puzzle is taken by many in developmental psychology to be one of a suite of arguments that have effectively disproved the idea that language can be learned without an innate grammar [2].

As Pinker wrote, rather famously:

"The implications of the lack of negative evidence for children’s overgeneralization are central to any discussion of learning, nativist or empiricist.” (Pinker, 2004)

This statement is, quite frankly, ridiculous, and betrays a complete lack of understanding of basic human learning mechanisms. (And you thought 'igon values' was bad...!)

To help you understand why, let's start off by making the (uncontroversial) assumption that -- like other young animals -- little kids are trying to figure out just what in their environment is informative, so they can better grasp (and predict) the workings of the world around them.  It's easy to see how this pursuit might readily lend itself  to language learning, since the more predictable upcoming speech is, the easier it is to make sense of [3].  Indeed, it seems as though figuring out what things in the world predict which words, and which words predict which other words, would be a pretty fundamental aspect of what learning a language is all about.

In line with this, there's a growing body of evidence suggesting that expectation and prediction operate in both linguistic processing and production.  So, if you're listening to someone speak, you are predicting --probabilistically-- what they're going to say next (your brain is like Google Instant on crack).  For example, if I say "hit the nail on the..." you can fill in head, and if I say "I'm coming down with a...", you can predict cold -- flu -- fever -- and so on, with varying degrees of certainty.  What's more, the more you hear a word occupy a given context, the more strongly you will predict it in that context in the future (DeLong, Urbach & Kutas, 2005).
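(If you'd like a toy picture of what this sort of prediction amounts to computationally, here's a sketch that simply counts which words have followed which words in a tiny made-up 'experience' and turns those counts into graded expectations. The corpus and the bare-bones bigram counting are stand-ins for illustration only -- nobody thinks the brain is literally a lookup table.)

```python
from collections import Counter, defaultdict

# Toy 'experience': contexts a listener has encountered before.
corpus = [
    "hit the nail on the head",
    "hit the nail on the head",
    "coming down with a cold",
    "coming down with a cold",
    "coming down with a flu",
    "coming down with a fever",
]

# Count which word follows each word (a simple bigram table).
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1

def predict(prev_word):
    """Return candidate next words with probabilities based on past counts."""
    counts = follows[prev_word]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

print(predict("the"))   # 'nail' and 'head' split the probability mass
print(predict("a"))     # 'cold' is most expected, with 'flu' and 'fever' less so
```

(In a real learner the predictive context is of course far richer than the previous word, but the basic move -- turning past co-occurrence into graded expectations about what comes next -- is the same.)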

So -- we know rats do it.  We know adult humans do it.  But how does prediction help kids?

Well, let's say for example that a child is trying to figure out what the heck 'door' means.  At first glance, this looks pretty difficult.  For one thing, people don't usually go up to doors and point them out, Vanna-White style.  "This, my darling, is a door."  Nor is it the case that every time a door is in full view, the kid hears the word 'door.' (They're more apt to hear "Hi honey, I'm home!" or "Solicitors?  Lord, not again!")

But here's precisely where informativity becomes important.  Because even with a noisy signal, a child can use both positive and negative evidence to disambiguate which things in the world best predict the word 'door' (namely, doors).  Turns out that 'Honey,' 'home' and 'solicitors' will all get used in contexts where there are no doors to be found.  For example, mom might call dad 'Honey' at the movie theater or dad might call a 'solicitor' a dirty word over the phone.  If the child had originally anticipated that the word 'door' might be used in any of those contexts, that idea will fast be binned; prediction-error will teach the child otherwise.  It's like -- huh -- I was expecting something to happen, but it didn't. Quick, I need to revise those expectations! [4]  (Again, it's like the rat failing to be shocked -- when something fails to happen, that counts as evidence too [5]).

Prediction and prediction-error will also be helpful as children learn how to use words.  For example, if a child is learning how to talk about plural things (like rats, cats and so on), there will initially be a lot of evidence that groups-of-things take a +s ending.  (There are far more regular plural nouns in English than irregular plurals, so the vast majority of the input will suggest this).  It's no surprise then, that children initially 'over-regularize' plural words, and end up saying things like 'mouses' and 'gooses.'  However, prediction can help children learn that 'mice' and 'geese' are actually preferred.  How?  Quite simply -- if the child is expecting 'mouses' or 'gooses,' her expectations will be violated every time she hears 'mice' and 'geese' instead.  And clearly that will happen a lot.  Over time, this will so weaken her expectation of 'mouses' and 'gooses,' that she will stop producing these kinds of words in context.
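(Here too, the same error-driven update from the rat sketch above can be used to make the point concrete. Treating 'mouses' and 'mice' as two competing expectations, with made-up starting strengths and a made-up learning rate, is my simplification for the purposes of illustration -- it is not the actual model from the lab.)

```python
# Toy error-driven sketch of 'unlearning' an over-regularization.
# Two competing expectations about the plural of 'mouse': the regularized
# form 'mouses' and the irregular form 'mice'. Starting values and the
# learning rate are illustrative assumptions.

learning_rate = 0.05
expect = {"mouses": 0.8, "mice": 0.1}   # early on, the +s pattern dominates

def hear_plural(form_heard):
    """Update both expectations after hearing one plural of 'mouse'."""
    for form in expect:
        outcome = 1.0 if form == form_heard else 0.0
        expect[form] += learning_rate * (outcome - expect[form])

# The child never hears 'mouses'; adults say 'mice' every time.
for _ in range(100):
    hear_plural("mice")

print(expect)   # 'mice' approaches 1.0, while 'mouses' decays toward 0.0
```

(Because 'mouses' never turns up in the input, every 'mice' the child hears is a small prediction failure for the regularized form, and its strength drains away -- no explicit correction required.)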

I should emphasize that I'm not just saying this is possible 'in theory.'  The hive-mind in my lab --led by the buzzing Prof Plum-- has actually modeled learning in these kinds of word-learning scenarios and shown, in some pretty elegant behavioral experiments, that kids behave exactly as these learning models would predict.

But those eastern seaboard psychologists are having none of it.  Evidence -- theory -- logic aside, Pinker (and others of his ilk) would claim that negative evidence simply doesn't exist!  --and that even if it does, children simply can't learn from it.

This is a, erm, puzzling conclusion to draw.  So -- we know that rats use prediction-error to learn -- and, what's more, it's pretty obvious that people do too; it's not as if we somehow fail to notice when something that we expect will happen doesn't.

The joke Prof Plum always tells about this is: imagine you were fixed up on a date and you went to the restaurant and the date didn't show. Would you need the waiter to come tell you that your date hadn't arrived? Or might you not notice yourself that something was amiss?

(You get the point.)

Yet for all this, what learning-denialist developmental psychologists would have you believe is that even though you use predictive learning mechanisms every day of your life -- to successfully navigate a busy sidewalk, to add just the right amount of milk to your coffee, and to understand what your friend is saying over a choppy cellphone signal -- even though all these things are undeniably true -- children, you should know, can use none of these mechanisms in learning language.  No.  Because when it comes to learning language, children, it would seem, are dumber than rats...

Well, I'm so glad we got that cleared up!

What I'm left wondering is how we somehow evolved a gene that specifies the workings of a complex innate grammar, while simultaneously switching off all our general-purpose learning mechanisms -- for language and children only.  Fancy trick, that one.

[Access the follow-up discussion to this post here.]

In Support of...

"Evidently, the language faculty incorporates quite specific principles that lie well beyond any "general learning mechanisms," and there is good reason to suppose that it is only one of a number of such special faculties of mind. It is, in fact, doubtful that "general learning mechanisms," if they exist, play a major part in the growth of our systems of knowledge and belief about the world in which we live--our cognitive systems. ...It is fair to say that in any domain in which we have any understanding about the matter, specific and often highly structured capacities enter into the acquisition and use of belief and knowledge."

--Noam Chomsky in The Managua Lectures

[1]  There are an array of similar 'Poverty of the Stimulus' arguments which follow this reasoning (including, perhaps most notably, 'Baker's paradox').

[2]  These kinds of arguments have the same creationist flair that early Chomsky arguments did: it's 'impossible,' they say, it could 'only work this way.'  Is that science?  Or dogma?

[3]  In the past, you may have had the experience of reading a highly technical text or listening to a very dense talk that you had the worst time trying to follow.  Even if the words in play were ones that you were relatively familiar with, they may have been used in ways that were completely unfamiliar and unexpected, rendering them virtually incomprehensible.  On the flip side, you may have had the experience of listening to something so predictable (and boring) that it put you to sleep.  From the perspective of information theory, one of the aims of communication is to effectively manage the amount of 'uncertainty' (or 'entropy') in what's being communicated, such that the message is predictable enough to be understood, but not so predictable as to be boring.
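(For anyone who likes to see the numbers: here is a tiny sketch of the entropy calculation this footnote alludes to. The two toy distributions over possible continuations are invented purely for illustration.)

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A context where one continuation almost always occurs: very predictable (and boring).
predictable = [0.97, 0.01, 0.01, 0.01]
# A context where several continuations are equally likely: much harder to follow.
surprising = [0.25, 0.25, 0.25, 0.25]

print(entropy(predictable))   # ~0.24 bits of uncertainty
print(entropy(surprising))    # 2.0 bits of uncertainty
```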

[4] I phrase this actively, but we think that this kind of learning isn't conscious -- it's implicit.

[5] You might reasonably think that listening to language and quivering in anticipation of a shock seem like mighty different things -- they certainly are. But there's good neural evidence that our brains' reward systems respond similarly to surprise in our environment, whether it be an unexpected musical phrasing in a concerto, or an amusing word in context, or -- woe be the rat -- a painful and surprising sensation. The learning rate tends to look a lot different, of course...

Citing Scholars

Attribution where attribution's due: While the opinions discussed in this post are my own, many of the ideas -- about informativity and negative evidence -- are properly attributed to Michael Ramscar (the diligent Prof Plum). I even occasionally steal his jokes, it's true... The idea that so-called implicit negative evidence -- or 'prediction-error' -- might function in this way in language learning has been discussed by, among others, Elman, 1991; Bates & Carnevale, 1993; Rohde & Plaut, 1999; Seidenberg & MacDonald, 1999; Lewis & Elman, 2001; Pullum & Scholz, 2002; Prinz, 2002; Ramscar, 2002; Cowie, 2003; Johnson, 2004; MacWhinney, 2004; Ramscar & Yarlett, 2007; Hahn & Oaksford, 2008. Here's Jeff Elman's wonderful insight:

“If we accept that prediction or anticipation plays a role in language learning, then this provides a partial solution to what has been called Baker's paradox... The paradox is that children apparently do not receive (or ignore, when they do) negative evidence in the process of language learning. Given their frequent tendency initially to over-generalize from positive data, it is not clear how children are able to retract the faulty over-generalizations... However, if we suppose that children make covert predictions about the speech they will hear from others, then failed predictions constitute an indirect source of negative evidence which could be used to refine and retract the scope of generalization” (Elman, 1991).

Of course, you'd never know that this literature existed from reading Pinker, Marcus, et al. When it comes to allowing (or addressing) new information, the other side of the debate looks suspiciously like East Germany before the Wall fell. As far as they're concerned -- and in terms of how they cite the literature -- none of what I've just written exists, and neither do any of the articles cited above.  (If you're going to play the Pinker, 2004 card -- don't.  He devotes a single line to prediction error and he gets it wrong).

Remember when Gladwell wrote that article about Pinker being out there all alone on the lonely ice floe of IQ fundamentalism? It's like that with language -- except it's not a lonely ice floe. It's Antarctica. And there's still many a cold and disenchanted developmental psychologist stranded there. [[Help!]] ;)

Reading Materials

Ramscar, M., Yarlett, D., Dye, M., Denny, K., & Thorpe, K. (2010). The Effects of Feature-Label-Order and their implications for symbolic learning. Cognitive Science, 34 (6), 909-957.

Ramscar, M., & Yarlett, D. (2007). Linguistic self-correction in the absence of feedback: A new approach to the logical problem of language acquisition. Cognitive Science, 31, 927-960.

Rescorla, R. (1988). Pavlovian conditioning: It's not what you think it is. American Psychologist, 43 (3), 151-160 DOI: 10.1037//0003-066X.43.3.151

Ramscar, M., & Dye, M. (2009). Expectation and error distribution in language learning: the curious absence of mouses in adult speech. (under review)

33 responses so far

  • I can't speak for all Eastern Seaboard psychologists, but your story about over-generalization of morphology is actually one that I (or Pinker!) would endorse. But I think you've mischaracterized the problem of negative evidence (and you definitely misquoted Pinker).

    • melodye says:

      Dear Mr. Hartshorne... If you want to comment credibly on my blog, you shouldn't selectively delete comments on your own blog that you can't respond to. This is the same sort of communist-bloc-style censorship that I rail against above.

      What's in the water at Harvard these days, anyway?

      Here's a brief reply, with regard to the LPLA:

      Although it is agreed that children learn that the regular form of English plurals involves adding a final sibilant, many linguists argue that morphology depends on innate rules. The claim is that while the particulars (content) of the rules for specific languages are “learned” (including, explicitly, the English –s), the operations of the rules themselves are constrained and structured by innate mechanisms (see e.g., Clahsen, 1999; Pinker, 1998; Marcus, Brinkmann, Clahsen, Wiese, & Pinker, 1995; Pinker & Prince, 1988, etc.).

      “The organization of morphology has implications for the acquisition of morphology. Understanding language acquisition requires specifying the innate mechanisms that accomplish language learning, and the language-particular information that these mechanisms learn. It has been fruitful to posit that the universal basic organization of grammar is inherent in the learning mechanisms, which are deployed to acquire the particular words and rules in a given language” (Kim, Marcus, Pinker, Hollander & Coppola, 1994, p. 174-5).

      “…focusing on a single rule of grammar [regular inflection], we find evidence for a system that is independent of real-world meaning, non-associative (unaffected by frequency and similarity), sensitive to abstract formal distinctions… more sophisticated than the kinds of "rules" that are explicitly taught, developing on a schedule not timed by environmental input, organized by principles that could not have been learned, possibly with a distinct neural substrate and genetic basis” (Pinker, 1991, p. 533).

      “A model of overregularization that my colleagues and I have proposed depends on the existence of mental rules…The key property of a rule appears to be its ability to treat all instances of a class equally, regardless of their degree of resemblance to stored forms. Rules…apply in an all-or-none fashion, applying to any item carrying the appropriate symbol. For example, the add –ed rule applies just as readily to any novel word carrying the symbol [verb]” (Marcus, 1995).

      "I do not exclude the possibility that high-frequency regular [plural]s are redundantly stored in the lexicon. And of course, a mechanism is necessary that blocks composition of the regular form in the presence of a listed irregular alternative." (Jackendoff, 2007, p10)

      Now, with regard to how Pinker / Marcus / and so on define negative evidence:

      “As we shall see, over-regularization is a prime example of one of the fundamental problems in understanding language learnability: how children avoid or unlearn errors in the absence of parental corrections.” (Marcus et al., 1992, p. 4)

      “As children acquire the English past tense system, they sometimes apply the regular past rule (add -ed) to irregular stems (e.g., go, make, or sing), thus producing erroneous past tense forms such as goed or maked. If negative evidence is not available, children must stop producing these forms through some internal (possibly linguistically specific) mechanism” (Marcus, 1993 p. 78).

      “The implications of the lack of negative evidence for children’s overgeneralization are central to any discussion of learning, nativist or empiricist” (Pinker, 2004, p. 950).

      Next time, don't bring your fists to a gunfight ;)

      • Dude, take a chill pill. The comments you mention were flagged as spam by Blogger. I'll leave it to you to figure out why. And just so you know, putting smiley faces after ad hominem attacks doesn't make them cute.

        I take it these quotes above are supposed to have something to do with arguments about innate rules, but since none of them mention innate rules, I'm not sure how. Can you clarify?

        I'm particularly mystified by the Marcus 1993 quote. I thought you agreed that children don't learn to avoid over-regularization through explicit feedback from parents, and that you thought they use an internal mechanism instead. So what are you arguing against?

        And the final Pinker quote isn't about what you think it's about.

        In other words, you haven't addressed any of my comments yet. Actually, you haven't addressed any of the issues I brought up in the Hummel/Ramscar debate yet, either. I'd be willing to admit I was wrong, but you have to actually supply an argument. Name-calling is not the same thing.

        • Just so people don't have to go over to my blog to get the Pinker quote, here it is in all its glory:

          "This nature–nurture dichotomy is also behind MacWhinney’s mistaken claim that the absence of negative evidence in language acquisition can be tied to Chomsky, nativism, or poverty-of-the-stimulus arguments. Chomsky (1965, p. 32) assumed that the child’s input ‘consist[s] of signals classified as sentences and nonsentences _’ – in other words, negative evidence. He also invokes indirect negative evidence (Chomsky, 1981). And he has never appealed to Gold’s theorems to support his claims about the innateness of language. In fact it was a staunch ANTI-nativist, Martin Braine (1971), who first noticed the lack of negative evidence in language acquisition, and another empiricist, Melissa Bowerman (1983, 1988), who repeatedly emphasized it. The implications of the lack of negative evidence for children’s overgeneralization are central to any discussion of learning, nativist or empiricist."

          For context, it is this statement that Melodye calls "quite frankly, ridiculous."

          • melodye says:

            If you notice above, I link explicitly to the Pinker paper with the quote -- so that anyone can read it. I also wrote above "If you’re going to play the Pinker, 2004 card — don’t. He devotes a single line to prediction error and he gets it wrong."

            What's so galling about Pinker's paper is that he acts as if that is the *extent* of the literature. Neither he nor Chomsky understands how implicit negative evidence (i.e., prediction error) can function in language learning. This is why Pinker is so dismissive of it.

            Additionally, there is a huge literature on this (cited above) that goes completely unmentioned. Pinker acts as if Bowerman had the last word in 1983, and as if Chomsky explored how negative evidence might operate in language learning in good faith. Neither of these things is true.

            I don't see any reason to cite this entire quote in context -- it's just as misleading as the one-liner.

        • melodye says:

          Comments that appear briefly on the site and then disappear aren't 'flagged as spam,' Josh. You're playing dirty, and I don't respect that.

          The word innate is in almost all of the quotes above. Your comment suggesting otherwise is a strange sort of non sequitur, as far as I can understand it.

          With regard to the Marcus quote -- everyone agrees that children don't learn to correct their errors by way of explicit feedback. That much is obvious and taken for granted by researchers on both sides of the debate. The question is -- do they learn by way of implicit negative feedback (i.e., prediction-error)? According to Pinker, Chomsky, etc., implicit negative feedback cannot account for how children do this type of learning. From what I can gather, this conclusion arises out of a serious misunderstanding of how learning models work.

          Here is Pinker's one-line response to 'prediction error' (and implicit negative evidence):

          "Appeals to ‘ indirect negative evidence ’ are no better : children fail to hear goed, but they also fail to hear wugged, yet they have no problem generalizing from wug to wugged (nor do adults when they first hear spam and generalize to spammed)." (Pinker, 2004)

          This statement illustrates how little Pinker understands of learning theory. In learning theory, generalization constitutes a lack of discrimination, and there are heaps of evidence that young children generalize over lots of linguistic kinds and categories (for example, children may call all animals 'dogs' until they learn to discriminate dogs from lions, tigers and bears). But over the course of learning, categories -- word usage -- and so on become better discriminated, such that children call cats cats and dogs dogs, and rats rats, and the plural of mouse 'mice' (not 'mouses'). This isn't a problem for learning theory. At all. But Pinker likes to pretend like it is.

          Additionally, Prof Plum has done quite a number of behavioral experiments which show that the kinds of generalization children do can be explained by way of semantics, and not, as Pinker claims, by supposed 'grammatical status.' Indeed, one of the papers that put Plum on the map was "The role of meaning in inflection: why the past tense does not require a rule," which argues against the dual-route account of inflectional morphology. Since then, we've replicated that experiment with reading time measures (a response to the recent Huang & Pinker paper) and we've given the same treatment to level-ordering theory (a seven-experiment, eighty-page paper, accepted at Cog Psych). You might also note that, despite their protestations, Gordon & Miozzo (2007) is a replication of Ramscar (2002).

        • melodye says:

          I scooped this summary out of some of my notes. You might consider taking a look at their paper as well --

          Baayen and Moscoso del Prado Martin (2005) thoroughly and meticulously examined the merits of Kim et al’s claim, from the vantage point of lexical statistics. The pair set out to determine whether there were systematic semantic differences between regular and irregular verbs, and concluded that there is “a conspiracy of subtle probabilistic (graded) semantic distributional properties that lead to irregulars having somewhat different semantic properties compared to regulars” (p. 669). Specifically, they found that irregular verbs tend to have more meanings than regulars (i.e., greater ‘semantic density’) and that irregular verbs tend to cluster in semantic neighborhoods (i.e., a higher proportion of their closest synonyms tend to also be irregular). They also found that this occurs on a graded scale, with large subclasses of irregulars behaving more like regulars, as compared to smaller, more idiosyncratic subclasses.

          With these results to hand, Baayen and Moscoso del Prado Martin found that a wide range of empirical findings suggesting ‘dissociations’ between regulars and irregulars could be accounted for in terms of their respective distributional and semantic properties, without recourse to any putative grammatical rule. (Their analysis ranged from studies involving association norms and word-naming latencies, to those involving familiarity ratings and neuroimaging). The force of their work is clear: contrary to Kim et al.’s claim, meaning and form are interrelated, if perhaps not in any simple, collapsible, or deterministic manner. As Huang and Pinker acknowledge, their findings are “consistent with the [general] idea that people tend to generalize inflectional patterns among verbs with similar meanings.”

          • This makes me think that you might enjoy Johanna Barðdal's recent book _Productivity: Evidence from Case and Argument Structure in Icelandic_, which ought to be in the Stanford library, and should be ordered if it isn't. One of the glories of Icelandic is that its case marking patterns literally have irregularities nested inside irregularities: an experiencer of a verb without a semantically higher-ranked argument (such as Agent) will most regularly be nominative, less regularly dative, and least regularly accusative, and the Icelanders have been running around for about 30 years collecting data on how this works, is or isn't acquired, plus what's going on in the closely related language Faroese. It is, as Prof Plum would expect, highly keyed to semantics.

            http://www.benjamins.com/cgi-bin/t_bookview.cgi?bookid=CAL%208

  • William Idsardi says:

    Coincidentally, there was an article published today in the Journal of Neuroscience, http://www.jneurosci.org/cgi/content/short/30/38/12608 "The Basolateral Amygdala Is Critical for the Acquisition and Extinction of Associations between a Neutral Stimulus and a Learned Danger Signal But Not between Two Neutral Stimuli" which demonstrates that in rats the neural substrate for learning of fear-associated responses is distinct from that for learning of other (neutral) associations. So we should probably drop the tone-zap story as a lead-in for general association-forming abilities.

    • melodye says:

      A more pertinent question is -- why would anyone think that the amygdala would be involved in learning about neutral stimuli? It's usually activated when events evoke emotional reactions...

      It's true that the authors point out that the amygdala is important when learning about shocks (and not neutral stimuli), but nowhere do they conclude that the learning mechanisms for shocks and neutral stimuli are different. Far from it:

      "Acquisition and extinction of an association between two neutral stimuli also require NMDAr activation. However, the present results show that the acquisition and extinction of these associations do not require the BLA."

      What they've shown is that learning about shocks engages the amygdala in addition to the learning processes that are engaged by neutral stimuli.

      Here's a helpful paper about how learning works:

      Burke CJ, Tobler PN, Baddeley M, Schultz W. (2010) Neuronal mechanisms of observational learning. Proc Natl Acad Sci (USA) 107: 14431-14436

  • mo says:

    Do you think that if infants were exposed to the tone-shock paradigm they would not behave in the same way? I'm assuming you would; but so would anyone else, whether it's Pinker or Chomsky. Assuming this doesn't change the nature of the logical argument against negative evidence for some aspects of language, because the problem is not one of unambiguous cases, where you supply the various elements to be combined; the problem is precisely HOW you know what the elements are to be combined. To give a reductio ad absurdum: if you trained rats with low energy cosmic ray bursts paired with shock, they would have no predictions for a trivial reason - they cannot detect cosmic radiation. Therefore it is not enough to just state the elements in the environment; why it's just those elements and not others needs explanation too.

    Let me put it in another way. If all that is needed is a sensitivity to negative evidence, as rats clearly have, why don't rats learn human language?

    • melodye says:

      I've actually written extensively on one reason that may begin to explain why humans -- but not apes and other mammals -- are able to learn language: the slow development of the prefrontal cortex.

      You can access the popular science article "The Advantages of Being Helpless" at Scientific American Mind.

      Here are the relevant journal articles:

      Thompson-Schill, S., Ramscar, M., & Chrysikou, E. (2009) Cognition without control: when a little frontal lobe goes a long way. Current Directions in Psychological Science, 18(5), 259-263.

      Ramscar, M. & Gitcho, N. (2007) Developmental change and the nature of learning in childhood. Trends in Cognitive Sciences, 11(7), 274-279.

      More importantly -- the 'shock' example was meant to be illustrative. But rats (and other animals) can do this kind of learning even if what they're learning is far less salient or surprising. The question, again, is why do we think that human learners are somehow incapable of employing this kind of learning when it comes to language?

      If learning models can explain complex linguistic phenomena, then why do we need to posit all this built-in hardware?

      • mo says:

        Show me any writing by Pinker or Chomsky that says that the rat kind of learning is not possible with human infants. There is none. The reason is that this is orthogonal to the issue of the logical poverty of stimulus argument.

        Everything boils down to representation and process. If you don't have the right representations, you cannot get off the ground, as in the cosmic rays example. If you don't have the process, you cannot get off the ground (imagine no Hebbian process, for example).

        Whether or not Chomskyan/Pinkerian representations/processes are the appropriate ones is an empirical question. What is clear is that you need to discover these, and that they must contain SOMEthing that is human-specific. Merely by replacing "NP" with "cortical activation patterns that discriminate material objects from their motion" or anything of the sort, one can hardly be said to have circumvented the problem of representations/processes - merely provided a different hypothesis for what they might be.

        Again, whether or not these processes/representations are "purely linguistic" or derived from general cognitive principles is an empirical question. The answer cannot be 100% the latter - at the very least, it must be a unique combination of general cognition, for the simple empirical fact that only we, as a species, have the fullness of human language.

        A Chomskyan language model IS also a learning model. And all models need hardware, whether the kinds that you prefer or the kinds Chomskyans toy with. So your last sentence doesn't make much sense.

        A more meta-question - should we rely on our findings on hardware design to completely limit our models of the software? It's like insisting that the only way to understand a programming language is by understanding how computer chips work. As a reductio ad absurdum, imagine if I insisted that any model that is not built up from quantum principles is useless. Not just the science of the mind, but I suspect a whole host of sciences would be barren.

        • melodye says:

          I think you should read my post: "A Thinking Machine: On Metaphors for Mind." You're working with a computational metaphor that is -- to my mind -- out of date.

          To respond more closely -- if I can use a model of rat learning to explain complex linguistic behavior, why do I need to posit the kind of representational priors suggested by Pinker / Chomsky / etc? What does it add to my understanding? The point I've raised above is that that camp completely ignores how powerful learning theory (and general learning mechanisms) are, and makes arguments about what's 'possible,' 'impossible,' 'necessary,' etc, on the basis of a profound misunderstanding of what human learning capabilities actually amount to.

          This is why it's so misleading to call a Chomskyan model of language a 'learning model.' There's a reason that no nativist makes contact with the contemporary literature on learning and learning theory. If they did, they'd be forced to explain why they still find it necessary to explain everything in terms of UG, when simple learning models can account for much of complex linguistic behavior.

          Occam's Razor, right?

          [See new post in response to further comments]

          • Well, not until we hear the story about basic clause structure. Minimally, arranging verbs in systematic patterns with nouns and their modifiers, possibly but not always with some recursive structure in the noun phrases.

          • Mo's argument is that what you can learn is constrained by the representations you have. I'm not sure which part of that computational "metaphor" is out of "date".

            Are you arguing that you can have a learning theory that doesn't have representations? Certainly all the Ramscar models have representations (a perceptual representation is still a representation), and it's hard to imagine what such a model would look like.

            Or are you arguing that you have a model that can learn *anything* regardless of the representations it uses? For instance, it could learn differential equations entirely in terms of colors. That'd be one hell of a model and I'd like to see it!

            If you don't object to either of those claims, then it's not clear what part of Mo's statement you could be objecting to.

            One more time for the road: there are some simple learning models that can (maybe) account for learning concrete nouns. I'll even give you some concrete adjectives like blue. I'm not sure whether that counts as complex linguistic behavior, but it certainly isn't "much of complex linguistic behavior." It's *certainly* not what motivates Chomsky's or Pinker's work.

            This is a little like saying you don't need all that fancy quantum mechanics to explain perfect spheres moving in a vacuum -- classical mechanics will do just fine. That's essentially true, but it misses the point. If you've got a "simple learning model" that can account for things like verbs and quantifiers, let's see it.

          • mo says:

            (extending gameswithwords)

            If you think the Chomskyan model is like a spreadsheet, the Ramscar model like a search engine, or even if you think in terms of neural nets, for now these are all just variants of computer programs. I don't see the paradigm shift in comparing your notions of language to computer program A vs program B.

            And I'm sorry, but there is no model of general learning that captures human language. For example, if you look at the best speech recognition systems, as when you phone the bank and the "lady" apologizes because she could not understand that and would you please say something or hold for the operator, it is via a model of mapping acoustic patterns onto pre-determined lookup tables; albeit in a probabilistic, fuzzy way.

            Sure there are teeny models for some horribly obvious relations in the world and in some speech tokens; but beyond that all of general learning is just a promissory note. Of COURSE parts of language interface with other cognitive systems and of COURSE they rely on perception. But equally, this is not enough, as a striking lack of talking rats might suggest.

            And coming to evolution, try looking for "general" solutions to a problem in other areas. Well, there is a general principle of branching structures that accounts for alveolar structure, blood circulatory systems, etc.; so there might be general principles of some kinds. But to suggest that that is ALL there is is to just ignore the rest of biology. It's the old old line (from Randy Gallistel) of insisting on a "general perception mechanism" as opposed to photoreceptors, hair cells, mechanoreceptors, hemoglobin.

            Anyhow - I'm gonna shut up. In the end I care about which theory makes the better predictions - not for learning simple word meanings, but for accounting for the organization of the core grammatical properties of language like constituency, categories, recursion and systematicity.


  • Kudos to you for taking biggies like Chomsky and Pinker to task; I squarely fall in your camp and had in the past tried to be on the other side of Chomsky -- see the debate on Chomsky vs. Skinner in my blog posts. Similarly, I've never been convinced of the poverty of the stimulus or the correctness of much of Pinker's theoretical positions on language. I also view the brain as a prediction machine à la Jeff Hawkins. Thanks for putting it all together.



  • physioprof says:

    You had me until here:

    What I'm left wondering is how we somehow evolved a gene that specifies the workings of a complex innate grammar, while simultaneously switching off all our general-purpose learning mechanisms -- for language and children only.

    No one that I am aware of has ever claimed that human beings "evolved a gene that specifies the workings of a complex innate grammar". Whatever complex innate grammar exists must arise emergently out of the interplay among a large number of genetically and epigenetically specified developmental processes.

    • melodye says:

      So -- what you're saying is that it's genes (plural) that encode the grammar and selectively switch off learning mechanisms? Can't say I buy it. Increasing the number of unknown factors doesn't make an implausible story any more plausible.

  • physioprof says:

    I don't know about "switch[ing] off learning mechanisms", but why is it implausible that genetically determined neural structures have been selected for during human evolution that impose constraints on language that go beyond associative or other forms of learning? This should be uncontroversial. It is no different from, e.g., the fact that the evolution of trichromatic color vision imposes constraints on distinguishable visual wavelength spectra.

  • Firionel says:

    I don't get much of the intricate technicalities being debated here, but there is something I can't help wondering about:

    Might it be that one part of the problem here is that no one is exactly sure how humans reason to begin with? I mean we all have a very good idea of what would be an optimal (bayesian, probabilistic, pick your favourite) mode of inference given a certain set of information. And it is also pretty clear that humans aren't very good at applying such strategies even when they explicitly strive to do just that. (You may have a look at this somewhat surprising inquiry into how good professional scientists are at applying basic logic when called upon to do so.)

    So when people are confused about how humans learn language, are they maybe just confused about how humans learn in general? And conversely, is it imaginable that the task of learning language (which arguably is accomplished using rather sparse data) is among the factors favouring the specific heuristics employed in human learning behaviour more generally? Again, without understanding much of the subject matter in question here, it is certainly true that overgeneralization is more or less omnipresent in human reasoning (specifically in math undergrads... :( ).

  • Thanks for this posting and for the interesting arguments following it, I'm learning a lot about the different models of learning and development and their underlying assumptions from this give and take. Language seems to particularly bring out the sharp edges of the different models.


  • Asp says:

    I am a complete layman, but I'm interested in the topic. However, your tone of voice is putting me off your own argument. Sarcasm doesn't particularly add to the persuasiveness of a theory, and to a layman it detracts from and blurs the science. I feel more sympathetic with Pinker and Chomsky and others, even though I DON'T EVEN KNOW anything about their work, just because I find your mocking tone so hard to swallow. It makes me doubt whether you're representing their work truthfully, because it's so mocking.

    I know my opinion probably doesn't matter much because your writing is more oriented to those more knowledgeable of the field and all that, but I had to say it. Reader feedback, for what it's worth. I'm not going to stop reading your blog over it, but I wanted to say it.

    • melodye says:

      Asp: while I appreciate the comment, it's difficult for me to imagine writing dispassionately on a subject that I am -- well -- so obviously passionate about. I care deeply about the research I do, and I find the state of the field to be frustrating. My sarcasm comes out of this frustration; as far as I'm concerned, Chomsky and Pinker's arguments are every inch the caricature I depict. Yet theirs is the dominant model for research in our field. And I think this model has set back research into language by decades, at the very least.

      Moreover, it's not simply that people accept the status quo without thinking critically; it's also that there are barriers in place to publishing research outside of this paradigm. I find this hard to stomach.

      In a way, I think too that it's important that I write in the way I do. It makes my own opinions transparent. I would argue that this is far better than the articles on language published in top science journals that are taken to represent 'scientific truth,' when they rest on seriously flawed assumptions, bad arguments and conceptual confusion. Or take "The Language Instinct" as another example: Pinker is a clever, witty and 'dispassionate' writer, and yet his portrayal of the field -- and of research into language -- is outstandingly myopic. He cherrypicks his premises; makes straw-man arguments; and cites highly questionable research as empirical 'truth.' My comments about East Germany are meant to be humorous, but at the same time, I'm dead serious about the parallel. You should read Gladwell's take on how Pinker framed the IQ debate. I think it's illustrative of the kind of tactics that Pinker (and other researchers like him) choose in these kinds of debates.

      At the very least, my tactics are everywhere obvious and subject to scrutiny. I hope that gives you some reassurance; if not, there are many other excellent blogs on language that take a different tack (see Language Log).


  • Amos Zeeberg says:

    Re "Google Instant on Crack": Google released a series of ads that prompts that response from the viewer's brain. The lure is irresistible.


