Field of Science


Universal Grammar is dead. Long live Universal Grammar.

Last year, in a commentary on Evans and Levinson's "The myth of language universals: Language diversity and its importance for cognitive science" in Behavioral and Brain Sciences (a journal that publishes one target paper and dozens of commentaries in each issue), Michael Tomasello wrote:
I am told that a number of supporters of universal grammar will be writing commentaries on this article. Though I have not seen them, here is what is certain. You will not be seeing arguments of the following type: I have systematically looked at a well-chosen sample of the world's languages, and I have discerned the following universals ... And you will not even be seeing specific hypotheses about what we might find in universal grammar if we followed such a procedure.
Hmmm. There are no specific proposals about what might be in UG... Clearly Tomasello doesn't read this blog much. Granted, for that he should probably be forgiven. But he also clearly hasn't read Chomsky lately. Here's the abstract of the well-known Hauser, Chomsky & Fitch (2002):
We submit that a distinction should be made between the faculty of language in the broad sense (FLB) and in the narrow sense (FLN). FLB includes a sensory-motor system, a conceptual-intentional system, and the computational mechanisms for recursion, providing the capacity to generate an infinite range of expressions from a finite set of elements. We hypothesize that FLN only includes recursion and is the only uniquely human component of the faculty of language.
Later on, HCF make it clear that FLN is another way of thinking about what elsewhere is called "universal grammar" -- that is, constraints on learning that allow the learning of language.

Tomasello's claim about the other commentaries (that they won't make specific claims about what is in UG) is also quickly falsified, and by the usual suspects. For instance, Steve Pinker and Ray Jackendoff devote much of their commentary to describing grammatical principles that could be -- but aren't -- instantiated in any language.

Tomasello's thinking is perhaps made clearer by a comment later in his commentary:
For sure, all of the world's languages have things in common, and [Evans and Levinson] document a number of them. But these commonalities come not from any universal grammar, but rather from universal aspects of human cognition, social interaction, and information processing...
Thus, it seems he agrees that there are constraints on language learning that shape what languages exist. This, for instance, is the usual counter-argument to Pinker and Jackendoff's nonexistent languages: those languages don't exist because they're really stupid languages to have. I doubt Pinker or Jackendoff are particularly fazed by those critiques, since they are interested in constraints on language learning, and this proposed Stupidity Constraint is still a constraint. Even Hauser, Chomsky and Fitch (2002) allow for constraints on language that are not specific to language (that's their FLB).

So perhaps Tomasello fundamentally agrees with people who argue for Universal Grammar, and this is just a terminology war. They call fundamental cognitive constraints on language learning "Universal Grammar," while he uses the term to refer to something else: for instance, proposals about specific grammatical rules that we are born knowing. His claim, then, is that nobody has any proposals about such rules.

If that is what he is claiming, that is also quickly falsified (if it hasn't already been falsified by HCF's claims about recursion). Mark C. Baker, by the third paragraph of his commentary, is already quoting one of his well-known suggested language universals:
(1) The Verb-Object Constraint (VOC): A nominal that expresses the theme/patient of an event combines with the event-denoting verb before a nominal that expresses the agent/cause does.
And I could keep on picking examples. For those outside of the field, it's important to point out that there wasn't anything surprising in the Baker commentary or the Pinker and Jackendoff commentary. They were simply repeating well-known arguments they (and others) have made many times before. And these are not obscure arguments. Writing an article about Universal Grammar that fails to mention Chomsky, Pinker, Jackendoff or Baker would be like writing an article about major American cities without mentioning New York, Boston, San Francisco or Los Angeles.

Don't get me wrong. Tomasello has produced absurd numbers of high-quality studies and I am a big admirer of his work. But if he is going to make blanket statements about an entire literature, he might want to read one or two of the papers in that literature first.

-------
Tomasello, M. (2009). Universal grammar is dead. Behavioral and Brain Sciences, 32(5). DOI: 10.1017/S0140525X09990744

Evans, N., & Levinson, S. (2009). The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences, 32(5). DOI: 10.1017/S0140525X0999094X

Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298(5598), 1569-1579. PMID: 12446899

Baker, M. (2009). Language universals: Abstract but not mythological. Behavioral and Brain Sciences, 32(5). DOI: 10.1017/S0140525X09990604

Pinker, S., & Jackendoff, R. (2009). The reality of a universal language faculty. Behavioral and Brain Sciences, 32(5). DOI: 10.1017/S0140525X09990720

Findings: Linguistic Universals in Pronoun Resolution

Unlike a proper name (Jane Austen), a pronoun (she) can refer to a different person just about every time it is uttered. While we occasionally get bogged down in conversation trying to interpret a pronoun (Wait! Who are you talking about?), for the most part we sail through sentences with pronouns, not even noticing the ambiguity.

I have been running a number of studies on pronoun understanding. One line of work looks at a peculiar contextual effect, originally discovered by Garvey and Caramazza in the mid-70s:

(1) Sally frightens Mary because she...
(2) Sally loves Mary because she...

Although the pronoun is ambiguous, most people guess that she refers to Sally in (1) but Mary in (2). That is, the verb used (frightens, loves) seems to affect pronoun resolution. Over the last 36 years, many thousands of undergraduates (and many more thousands of participants at gameswithwords.org) have been put through pronoun-interpretation experiments in an attempt to figure out what is going on. While this is a relatively small problem in the Big World of Pronouns -- it applies only to a small number of sentences in which pronouns appear -- it is also a thorn in the side of many broader theories of pronoun processing. And so the interest.

One open question has been whether the same verbs show the same pronoun biases across different languages -- that is, whether frighten is subject-biased and fear is object-biased in every language (the presence of frightens in sentences like (1) causes people to resolve the pronoun to the subject, Sally, whereas the presence of loves in (2) pushes them towards the object, Mary). If so, it would suggest that something about the literal meaning of the verb is what gives rise to the pronoun bias.

(What else could be causing the pronoun bias, you ask? There are lots of other possibilities. For instance, it might be that verbs have some lexical feature tagging them as subject- or object-biased -- not an obvious solution to me, but no more unlikely than other proposals out there for other phenomena. Or people might have learned that certain verbs probabilistically predict that subsequent pronouns will be interpreted as referring to the previous subject or object -- that is, there is no real reason that frighten is subject-biased; it's a statistical fluke of our language, and we all learn to talk/listen that way because everyone else talks/listens that way.)


random cheetah picture
(couldn't find a picture about cross-linguistic studies of pronouns)

Over the last couple years, I ran a series of pronoun interpretation experiments in English, Russian and Mandarin. There is also a Japanese experiment, but the data for that one have been slow coming in. The English and Russian experiments were run through my website, and I ran the Mandarin one in Taiwan last Spring. I also analyzed Spanish data reported by Goikoetxea et al. (2008). Basically, in all the experiments participants were given sentences like (1) and (2) -- but in the relevant language -- and asked to identify who the pronoun referred to.

The results show a great deal of cross-linguistic regularity. Verbs that are subject-biased in one language are almost always subject-biased in the others, and the same is true for object-biased verbs. I am in the process of writing up the results (just finished Draft 3) and I will discuss these data in more detail in the future, answering questions like how I identify the same verb in different languages. For now, though, here is a little data.

Below is a table with four groups of related verbs and the percentage of people who interpreted the pronoun as referring to the subject of the previous clause. Not every verb appeared in all four experiments; where an experiment didn't include the relevant verb, I've put in an ellipsis.


Subject-Biases for Four Groups of Related Verbs in Four Languages

              Group 1           Group 2           Group 3           Group 4
English       convinces 57%     forgives 45%      remembers 24%     understands 60%
Spanish       …                 …                 recordar 22%      comprender 63%
Russian       ubezhdala 74%     izvinjala 33%     pomnila 47%       ponimala 60%
Mandarin      shuofu 73%        baorong 37%       …                 …



For some of these verbs, the numbers are closer than for others, but for all verbs, if the verb was subject-biased in one language (more than 50% of participants interpreted the pronoun as referring to the subject), it was subject-biased in all languages. If it was object-biased in one language, it was object-biased in the others.
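For the curious, here's roughly how numbers like those in the table are computed (a toy sketch in Python with invented responses, not the actual data or analysis code): each response is coded as pointing to the subject or the object of the preceding clause, and a verb's bias is just the percentage of subject choices.

    # Toy sketch: computing per-verb subject bias from coded responses.
    # Each response is (verb, referent), where referent is "subject" or "object".
    # The data below are invented for illustration; they are not the real results.

    from collections import defaultdict

    responses = [
        ("frightens", "subject"), ("frightens", "subject"), ("frightens", "object"),
        ("loves", "object"), ("loves", "object"), ("loves", "subject"),
    ]

    counts = defaultdict(lambda: {"subject": 0, "object": 0})
    for verb, referent in responses:
        counts[verb][referent] += 1

    for verb, c in counts.items():
        total = c["subject"] + c["object"]
        bias = 100 * c["subject"] / total
        label = "subject-biased" if bias > 50 else "object-biased"
        print(f"{verb}: {bias:.0f}% subject choices ({label})")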

For the most part, this is not how I analyze the data in the actual paper. In general, it is hard to identify translation-equivalent verbs (for instance, does the Russian nenavidet' mean hate, despise or detest?), so I employ some tricks to get around that. So this particular table actually just got jettisoned from Draft 3 of the paper, but I like it and feel it should get published somewhere. Now it is published on the blog.


BTW, if anyone knows how to make researchblogging.org bibliographies in Chrome without getting funky ampersands (see below), please let me know.
---------
Garvey, C., & Caramazza, A. (1974). Implicit causality in verbs. Linguistic Inquiry, 5, 459-464.

Goikoetxea, E., Pascual, G., & Acha, J. (2008). Normative study of the implicit causality of 100 interpersonal verbs in Spanish. Behavior Research Methods, 40(3), 760-772. DOI: 10.3758/BRM.40.3.760

photo: Kevin Law

Negative Evidence: Still Missing after all these Years

My pen-pal Melodye has posted a thought-provoking piece at Child's Play on negative evidence. As she rightly points out, issues of negative evidence have played a crucial role in the development of theories of language acquisition. But she doesn't think that's a good thing. Rather, it's "ridiculous, [sic] and belies a complete lack of understanding of basic human learning mechanisms."

The argument over negative evidence, as presented by Melodye, is ridiculous, but that seems to stem from (a) conflating two different types of negative evidence, and (b) misunderstanding what the argument was about.
Fig. 1. Melodye notes that rats can learn from negative evidence, so why can't humans? We'll see why.

Here's Melodye's characterization of the negative evidence argument:
[T]he argument is that because children early on make grammatical ‘mistakes’ in their speech (e.g., saying ‘mouses’ instead of ‘mice’ or ‘go-ed’ instead of ‘went’), and because they do not receive much in the way of corrective feedback from their parents (apparently no parent ever says “No, Johnny, for the last time it’s MICE”), it must therefore be impossible to explain how children ever learn to correct these errors. How — ask the psychologists — could little Johnny ever possibly ‘unlearn’ these mistakes? This supposed puzzle is taken by many in developmental psychology to be one of a suite of arguments that have effectively disproved the idea that language can be learned without an innate grammar.
What's the alternative? Children are predicting what word is going to come up in a sentence.
[I]f the child is expecting ‘mouses’ or ‘gooses,’ her expectations will be violated every time she hears ‘mice’ and ‘geese’ instead.  And clearly that will happen a lot.  Over time, this will so weaken her expectation of ‘mouses’ and ‘gooses,’ that she will stop producing these kinds of words in context.
I can't speak for every Nativist, or for everyone who has studied over-regularization, but since Melodye cites Pinker extensively and specifically, and since I've worked on over-regularization within the Pinkerian tradition, I think I can reasonably speak for at least a variant of Pinkerianism. And I think Melodye actually agrees with us almost 100%.


My understanding of Pinker's Words and Rules account -- and recall that I published work on this theory with one of Pinker's students, so I think my understanding is well-founded -- is that children originally over-regularize the plural of mouse as mouses, but eventually learn that mice is the plural of mouse by hearing mice a lot. That is, our account is almost identical to Melodye's except it doesn't include predictive processing. I actually agree that if children are predicting mouses and hear mice, that should make it easier to correct their mistaken over-regularization. But the essential story is the same.
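To make the prediction-based part of the story concrete, here is a minimal sketch of the kind of error-driven (delta-rule) updating Melodye has in mind, applied to the plural of mouse. The starting weights and learning rate are arbitrary numbers of my own; the point is just that every time mice shows up where mouses was expected, the expectation for mouses gets weaker.

    # Minimal delta-rule sketch: expectations for competing plural forms of "mouse".
    # Every observed plural is "mice", so the prediction error repeatedly
    # strengthens "mice" and weakens the over-regularized "mouses".

    weights = {"mouses": 0.6, "mice": 0.2}   # arbitrary starting expectations
    learning_rate = 0.1

    for trial in range(50):                  # 50 encounters with the word "mice"
        observed = "mice"
        for form in weights:
            target = 1.0 if form == observed else 0.0
            error = target - weights[form]   # prediction error for this form
            weights[form] += learning_rate * error

    print(weights)  # "mice" approaches 1.0, "mouses" approaches 0.0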



Where I've usually seen Nativists bring up this particular negative evidence argument (and remember there's another) is in the context of Behaviorism, according to which rats (and humans) learn through being explicitly rewarded for doing the right thing and explicitly punished for doing the wrong thing. The fact that children learning language are almost never corrected (as Melodye notes) is evidence against that very particular type of Empiricist theory.


That is, we don't argue -- and to my knowledge never have argued -- that children can only learn the word mice through Universal Grammar. Again, it's possible (likely?) that someone has made that argument. But not us.[1]


Negative Evidence #2


There is a deeper problem with negative evidence that does implicate, if not Universal Grammar, at least generative grammars. That is, as Pinker notes in the article cited by Melodye, children generalize some things and not others. Compare:


(1) John sent the package to Mary.
(2) John sent Mary the package.
(3) John sent the package to the border.
(4) *John sent the border the package.


That * means that (4) is ungrammatical, or at least most people find it ungrammatical. Now, on a word-prediction theory that tracks only surface statistics (the forms of words, not their meaning or syntactic structure), you'd probably have to argue that whenever children have heard discussions of packages being sent to Mary, they've heard either (1) or (2), but in discussions of sending packages to borders, they've only ever heard (3) and never (4). The absence of (4) is surprising -- a prediction error -- and thus they've learned that (4) is no good.


The simplest version of this theory won't work, though, because children (and you) have presumably never heard any of the sentences below (where Gazeidenfrump and Bleizendorf are people's names, the dax is an object, and a dacha is a kind of house used in Russia):


(5) Gazeidenfrump sent the dax to Bleizendorf.
(6) Gazeidenfrump sent Bleizendorf the dax.
(7) Gazeidenfrump sent the dax to the dacha.
(8) *Gazeidenfrump sent the dacha the dax. 


Since we've heard (and expected) sentence (8) exactly as often as we've heard/expected (5)-(7) -- namely, never -- failures of prediction can't explain why we know (8) is bad but (5)-(7) aren't. (BTW, if you don't like my examples, there are many, many more in the literature; these are the best I can think of off the top of my head.)
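Here's the same point in toy form. A model that tracks only strings of words (the sketch below counts whole sentences, deliberately the dumbest possible case) assigns all four novel sentences the same count -- zero -- and so has nothing to say about why (8) in particular is bad. The corpus and the counting scheme are invented for illustration.

    # Toy illustration: a pure surface-statistics "model" that counts word strings.
    # All four novel sentences (5)-(8) are equally unseen, so string counts alone
    # cannot single out (8) as the bad one.

    from collections import Counter

    corpus = [
        "John sent the package to Mary",
        "John sent Mary the package",
        "John sent the package to the border",
    ]
    counts = Counter(corpus)

    novel = [
        "Gazeidenfrump sent the dax to Bleizendorf",   # (5)
        "Gazeidenfrump sent Bleizendorf the dax",      # (6)
        "Gazeidenfrump sent the dax to the dacha",     # (7)
        "Gazeidenfrump sent the dacha the dax",        # (8) -- the bad one
    ]

    for sentence in novel:
        print(counts[sentence], sentence)   # prints 0 for every novel sentence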


So we can't be tracking just the words themselves, but something more abstract. Pinker has an extended discussion of this problem in his 1989 book, in which he argues that the constraint is semantic: we know that you can use the double-object construction (e.g., 2, 4, 6 or 8) only if the recipient of the object can actually possess the object (that is, the dax becomes Bleizendorf's, but it doesn't become the dacha's, since dachas -- and borders -- can't own things). I'm working off of memory now, but I think -- but won't swear -- that Pinker's solution also involves some aspects of the syntactic/semantic structures above being innate.
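A crude way to picture the shape of that constraint (my paraphrase, not Pinker's formalization): whether the double-object frame is licensed depends on a semantic property of the recipient, namely whether it's the kind of thing that can possess something.

    # Sketch of a possession-based licensing check for the double-object frame.
    # The "possible_possessors" set stands in for whatever semantic knowledge
    # actually does this work; it is illustrative only.

    possible_possessors = {"Mary", "John", "Bleizendorf"}   # people can own things;
                                                            # borders and dachas can't

    def double_object_ok(recipient: str) -> bool:
        """License 'sent RECIPIENT the thing' only if the recipient can own things."""
        return recipient in possible_possessors

    for recipient in ["Mary", "the border", "Bleizendorf", "the dacha"]:
        print(f"sent {recipient} the package ->",
              "OK" if double_object_ok(recipient) else "*")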


Pinker's account is not perfect and may end up being wrong in places, but the fact remains that negative evidence (implicit or otherwise) can't by itself explain where children (and adults) do and do not generalize.


-----
Notes: 





[1] Melodye quotes Pinker saying "The implications of the lack of negative evidence for children's overgeneralization are central to any discussion of learning, nativist or empiricist." That is the quote that she says is "quite frankly, ridiculous." Here is the full quote. I'll let you decide whether it's ridiculous:
This nature–nurture dichotomy is also behind MacWhinney’s mistaken claim that the absence of negative evidence in language acquisition can be tied to Chomsky, nativism, or poverty-of-the-stimulus arguments. Chomsky (1965, p. 32) assumed that the child’s input ‘consist[s] of signals classified as sentences and nonsentences’ – in other words, negative evidence. He also invokes indirect negative evidence (Chomsky, 1981). And he has never appealed to Gold’s theorems to support his claims about the innateness of language. In fact it was a staunch ANTI-nativist, Martin Braine (1971), who first noticed the lack of negative evidence in language acquisition, and another empiricist, Melissa Bowerman (1983, 1988), who repeatedly emphasized it. The implications of the lack of negative evidence for children’s overgeneralization are central to any discussion of learning, nativist or empiricist.



-----
Quotes:
Pinker, S. (2004). Clarifying the logical problem of language acquisition. Journal of Child Language, 31(4), 949-953. DOI: 10.1017/S0305000904006439


photo: Big Fat Rat

Intelligent Nihilism

The latest issue of Cognitive Science, which is rapidly becoming one of my favorite journals, carries an interesting and informative debate on the nature of language, thought, cognition and learning, between John Hummel at University of Illinois-Urbana-Champaign, and Michael Ramscar, at Stanford University. This exchange of papers highlights what I think is the current empirical standstill between two very different world-views.

Hummel takes up the cause of "traditional" models on which thought and language are deeply symbolic and involve algebraic rules. Ramscar defends more "recent" alternative models built on associative learning -- essentially, an update on the story that was traditional before the symbolic models.

Limitations of Relational Systems

The key to Hummel's argument, I think, is his focus on explicitly relational systems:
John can love Mary, or be taller than Mary, or be the father of Mary, or all of the above. The vocabulary of relations in a symbol system is open-ended ... and relations can take other relations as arguments (e.g., Mary knows John loves her). More importantly, not only can John love Mary, but Sally can love Mary, too, and in both cases it is the very same "love" relation ... The Mary that John loves can be the very same Mary that is loved by Sally. This capacity for dynamic recombination is at the heart of a symbolic representation and is not enjoyed by nonsymbolic representations.
That is, language has many predicates (e.g., verbs) that seem to allow arbitrary arguments. So talking about the meaning of love is really talking about the meaning of X loves Y: X has a particular type of emotional attachment to Y. You're allowed to fill in "X" and "Y" more or less how you want, which is what makes them symbols.

Hummel argues that language is even more symbolic than that: not only do we need symbols to refer to arguments (John, Mary, Sally), but we need symbols to refer to predicates as well. We can talk about love, which is itself a relation between two arguments. Similarly, we can talk about friendship, which is an abstract relation. This is a little slippery if you're new to the study of logic, but doing this requires a second-order logic, which is formally far more powerful than a first-order one.

Where Hummel wants to go with this is that associationist theories, like Ramscar's, can't represent second-order logical systems (and probably aren't even up to the task of the types of first-order systems we might want). Intuitively, this is because associationist theories represent similarities between objects (or at least how often both occur together), and it's not clear how they would represent dissimilarities, much less represent the concept of dissimilarity:
John can be taller than Mary, a beer bottle taller than a beer can, and an apartment building is taller than a house. But in what sense, other than being taller than something, is John like a beer bottle or an apartment building? Making matters worse, Mary is taller than the beer bottle and the house is taller than John. Precisely because of their promiscuity, relational concepts defy learning in terms of simple associative co-occurrences.
It's not clear from these quotes, but there's a lot of math to back this stuff up: second-order logical systems are extremely powerful and can do lots of useful work, and formally less powerful computational systems simply can't do as much.
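To see what that "dynamic recombination" buys you, here is a minimal sketch of a symbolic representation in which relations are first-class objects: the same LOVES relation can be bound to different argument pairs, and a whole proposition can itself serve as an argument to another relation (the second-order flavor Hummel has in mind). The encoding is mine and is meant only to illustrate the idea.

    # Minimal sketch of symbolic, role-bound propositions.
    # A relation is a named symbol; a proposition binds arguments to its roles,
    # and propositions can themselves fill argument slots (Mary knows John loves her).

    from dataclasses import dataclass
    from typing import Tuple, Union

    @dataclass(frozen=True)
    class Prop:
        relation: str
        args: Tuple[Union[str, "Prop"], ...]

    p1 = Prop("LOVES", ("John", "Mary"))    # John loves Mary
    p2 = Prop("LOVES", ("Sally", "Mary"))   # same LOVES relation, new binding
    p3 = Prop("KNOWS", ("Mary", p1))        # a relation taking a proposition as argument

    # The same symbol "Mary" is the loved one in p1/p2 and the knower in p3,
    # and the same LOVES relation recombines freely with different arguments.
    print(p1.relation == p2.relation, p3.args[1] == p1)   # True True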


The Response

Ramscar's response is not so much to deny the mathematical truths Hummel is proposing. Yes, associationist models can't capture all that symbolic systems can do, but language is not a symbolic system:
We think that mapping natural language expressions onto the promiscuous relations Hummel describes is harder than he does. Far harder: We think you cannot do it.
Ramscar identifies a couple of old problems: one is polysemy, the fact that words have multiple meanings (John can both love Mary and love a good argument, but probably not in the same way). Fair enough -- nobody has a fully working explanation of polysemy.

The other problem is the way in which the symbols themselves are defined. You might define DOG in terms of ANIMAL, PET, FOUR-LEGGED, etc. Then those symbols also have to be defined in terms of other symbols (e.g., FOUR-LEGGED has to be defined in terms of FOUR and LEG). Ramscar calls this the turtles-all-the-way-down argument.

This is fair in the sense that nobody has fully worked out a symbolic system that explains all of language and thought. It's unfair in that he doesn't have all the details of his own theory worked out, either, and his model is every bit as turtles-all-the-way-down. Specifically, concepts are defined in terms of co-occurrences of features (a dog is a unique pattern of co-occurring tails, canine teeth, etc.). Either those features are themselves symbols, or they are always patterns of co-occurring features (tail = co-occurrence of fur, flexibility, cylindrical shape, etc.), which are themselves patterns of other feature co-occurrences, and so on. (It's also unfair in that he's criticizing a very old symbolic theory; there are newer, possibly better ones around, too.)

Implicit in his argument is the following: anything that symbolic systems can do that associationist systems can't is something humans can't do either. He doesn't address this directly, but presumably this means that we don't represent abstract concepts such as taller than or friendship, or, if we do, it's via a method very different from formal logic (what that would be is left unspecified).

It's A Matter of Style

Here's what I think is going on: symbolic computational systems are extremely powerful and can do lots of fancy things (like second-order logics). If human brains instantiate symbolic systems, that would explain very nicely lots of the fancy things we can do. However, we don't really have any sense of how neurons could instantiate symbols, or even if it's possible. So if you believe in symbolic computation, you're basically betting that neurons can do more than it seems.

Associationist systems face the opposite problem: we know a lot about associative learning in neurons, so this seems like an architecture that could be instantiated in the brain. The problem is that associative learning is an extremely underpowered learning system. So if you like associationist systems, you're betting that humans can't actually do many of the things (some of) us think humans can do.

Over at Child's Play, Dye claimed that the argument in favor of Universal Grammar was a form of Intelligent Design: we don't know how that could be learned/evolve, so it must be innate/created. I'll return the favor by labeling Ramscar's argument Intelligent Nihilism: we don't know how the brain could give rise to a particular type of behavior, so humans must not be capable of it.

The point I want to make is we don't have the data to choose between these options. You do have to work within a framework if you want to do research, though, and so you pick the framework that strikes you as most plausible. Personally, I like symbolic systems.

----------
Hummel, J. E. (2010). Symbolic versus associative learning. Cognitive Science, 34, 958-965.

Ramscar, M. (2010). Computing machinery and understanding. Cognitive Science, 34, 966-971.

photos: Anirudh Koul (jumping), wwarby (turtles), kaptain kobold (Darwin)

When is the logically impossible possible?

Child's Play has posted the latest in a series of provocative posts on language learning. There's much to recommend the post, and it's one of the better defenses of statistical approaches to language learning around on the Net. It would benefit from some corrections, though, and into the gap I humbly step...


The post sets up a classic dichotomy:
Does language “emerge” full-blown in children, guided by a hierarchy of inbuilt grammatical rules for sentence formation and comprehension? Or is language better described as a learned system of conventions — one that is grounded in statistical regularities that give the appearance of a rule-like architecture, but which belie a far more nuanced and intricate structure?
It's probably obvious from the wording which one they favor. It's also, less obviously, a false dichotomy. There probably was a very strong version of Nativism that at one point looked like their description of Option #1, but very little Nativist theory I've read from the last few decades looks anything like that. Syntactic Bootstrapping and Semantic Bootstrapping are both much more nuanced (and interesting) theories.


Some Cheek!


Here's where the post gets cheeky: 

For over half a century now, many scientists have believed that the second of these possibilities is a non starter. Why? No one’s quite sure — but it might be because Chomsky told them it was impossible.
Wow? You mean nobody really thought it through? That seems to be what Child's Play thinks, but it's a misrepresentation of history. There are a lot of very good reasons to favor Nativist positions (that is, ones with a great deal of built-in structure). As Child's Play discuss -- to their credit -- any language admits an infinite number of grammatical sentences, so no finite list of memorized sentences will do (they treat this as a straw-man argument, but I think historically that was once a serious theory). There are a number of other deep learning problems facing Empiricist theories (Pinker has an excellent paper on the subject from around 1980). And there are deep regularities across languages -- such as linking rules -- that are either crazy coincidences or reflections of innate structure.


The big one, from my standpoint, is that any reasonable theory of language is going to have to have, in the adult state, a great deal of structure. That is, one wants to know why "John threw the ball AT Sally" means something different from "John threw the ball TO Sally." Or why "John gave Mary the book" and "John gave the book to Mary" mean subtly different things (if you don't see that, try substituting "the border" for "Mary"). A great deal of meaning is tied up in structure, and representing structure as statistical co-occurrences doesn't obviously do the job.


Unlike Child's Play, I'm not going to rule out the possibility that the opposing theories can get the job done (though I'm pretty sure they can't). I'm simply pointing out that Nativism didn't emerge from a sustained period of collective mental alienation.


Logically Inconsistent


Here we get to the real impetus for this response, which is this extremely odd section towards the end:
We only get to this absurdist conclusion because Miller & Chomsky’s argument mistakes philosophical logic for science (which is, of course, exactly what intelligent design does).  So what’s the difference between philosophical logic and science? Here’s the answer, in Einstein’s words, “No amount of experimentation can ever prove me right; a single experiment can prove me wrong.”
In context, this means something like "Just because our theories have been shown to be logically impossible doesn't mean they are impossible." I've seen similar arguments before, and all I can say each time is:


Huh?


That is, they clearly understand logic quite differently from me. If something is logically impossible, it is impossible. 2 + 2 = 100 is logically impossible, and no amount of experimenting is going to prove otherwise. The only way a logical proof can be wrong is if (a) your assumptions were wrong, or (b) your reasoning was faulty. For instance, the above math problem is actually correct if the answer is written in base 2. 
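(For the skeptical, a two-line check that the digits "100" really do denote four in base 2.)

    # Quick check that "100" in base 2 is the number four.
    assert int("100", 2) == 2 + 2
    print(format(2 + 2, "b"))   # prints 100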


In general, one usually runs across this type of argument when there is a logical argument against a researcher's pet theory, and said researcher can't find a flaw with the argument. They simply say, "I'm taking a logic holiday." I'd understand saying, "I'm not sure what the flaw in this argument is, though I'm pretty sure there is one." It wouldn't be convincing (or worth publishing), but I can see that. Simply saying, "I've decided not to believe in logic because I don't like what it's telling me" is quite another thing.

Universal meaning

My earlier discussion of Evans and Levinson's critique of universal grammar was vague on details. Today I wanted to look at one specific argument.

Funny words

Evans and Levinson briefly touch on universal semantics (variously called "the language of thought" or "mentalese"). The basic idea is that language is a way of encoding our underlying thoughts. The basic structure of those thoughts is the same from person to person, regardless of what language they speak. Quoting Pinker, "knowing a language, then, is knowing how to translate mentalese into strings of words and vice versa. People without a language would still have mentalese, and babies and many nonhuman animals presumably have simpler dialects."

Evans and Levinson argue that this must be wrong, since other languages have words for things that English has no word for, and similarly English has words that don't appear in other languages. This is evidence against a simplistic theory on which all languages have the same underlying vocabulary and differ only in pronunciation, but that's not the true language of thought hypothesis. Many of the authors cited by Evans and Levinson -- particularly Pinker and Gleitman -- have been very clear about the fact that languages pick and choose in terms of what they happen to encode into individual words.

The Big Problems of Semantics

This oversight was doubly disappointing because the authors didn't discuss the big issues in language meaning. One classic problem, which I've discussed before on this blog, is the gavagai problem. Suppose you are visiting another country where you don't speak a word of the language. Your host takes you on a hike, and as you are walking, a rabbit bounds across the field in front of you. Your host shouts "gavagai!" What should you think gavagai means?

There are literally an infinite number of possibilities, most of which you probably won't consider. Gavagai could mean "white thing moving," or "potential dinner," or "rabbit" on Tuesdays but "North Star" any other day of the week. Most likely, you would guess it means "rabbit" or "running rabbit" or maybe "Look!" This is a problem to solve, though -- given the infinite number of possible meanings, how do people narrow down on the right one?

Just saying "I'll ask my host to define the word" won't work, since you don't know any words yet. This is the problem children face: before explicit definitions of words can help them learn anything, they must already have learned a good number of words.

One solution to this problem is to assume that humans are built to expect words of certain sorts and not others. We don't have to learn that gavagai doesn't change its meaning based on the day of the week, because we simply assume that it doesn't.
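Here's that idea in toy form (the candidate meanings and the particular constraints are invented for illustration): if the learner simply refuses to entertain hypotheses that are, say, time-dependent or about parts rather than whole objects, the wild candidates never even enter the race.

    # Toy sketch of constrained hypothesis filtering for "gavagai".
    # Each candidate meaning is tagged with properties; built-in biases rule out
    # whole classes of hypotheses before any evidence is consulted.

    candidates = [
        {"meaning": "rabbit", "whole_object": True, "time_dependent": False},
        {"meaning": "white thing moving", "whole_object": False, "time_dependent": False},
        {"meaning": "rabbit on Tuesdays, North Star otherwise",
         "whole_object": True, "time_dependent": True},
        {"meaning": "undetached rabbit parts", "whole_object": False, "time_dependent": False},
    ]

    def permitted(hypothesis) -> bool:
        """Built-in biases: prefer whole objects, and assume meanings are stable over time."""
        return hypothesis["whole_object"] and not hypothesis["time_dependent"]

    print([h["meaning"] for h in candidates if permitted(h)])   # ['rabbit']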

More problems

That's one problem in semantics that is potentially solved by universal grammar, but it's not the only one. Another famous one is the linking problem. Suppose you hear the sentence "the horse pilked the bear". You don't know what pilk means, but you probably think the sentence describes the horse doing something to the bear. If instead you found out that it describes a situation in which the bear knocked the horse flat on its back, you'd probably be surprised.

That's for a good reason. In English, transitive verbs describe the subject doing something to the object. That's not just true of English; it's true of almost every language. However, there are some languages where this might not be true. Part of the confusion is that defining "subject" and "object" is not always straightforward from language to language. Also, languages allow things like passivization -- for instance, you can say John broke the window or The window was broken by John. When you run into a possible exception to the subject-is-the-doer rule, you want to make sure you aren't just looking at a passive verb.
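Just to make the idea of a linking rule concrete, here is a toy version (the role labels and the passive check are simplifications of mine): given who did what to whom, the rule says where each participant surfaces in the sentence, and a passive clause flips the mapping.

    # Toy linking rule: map thematic roles to grammatical positions.
    # Active: agent -> subject, patient -> object.  Passive flips the mapping.

    def link(agent: str, patient: str, passive: bool = False):
        """Return (subject, object_or_oblique) for a transitive event."""
        if passive:
            return patient, agent     # "The window was broken by John"
        return agent, patient         # "John broke the window"

    print(link("the horse", "the bear"))              # ('the horse', 'the bear')
    print(link("John", "the window", passive=True))   # ('the window', 'John')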

Once again, this is an example where we have very good evidence of a generalization across all languages, but there are a few possible exceptions. Whether those exceptions are true exceptions or just misunderstood phenomena is an important open question.

-------
Evans, N., & Levinson, S. (2009). The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences, 32(5). DOI: 10.1017/S0140525X0999094X


Do Language Universals Exist?

Is there an underlying structure common to all languages? There are at least two arguments in favor of that position. One is an in-principle argument, and one is based on observed data.

Since Chomsky, many researchers have noted that language would be impossible to learn if one approached it without preconceptions. It's like solving for 4 variables with only 3 equations -- for those of you who have forgotten your math, that can't be done: the system is underdetermined, so there is no unique solution. Quine pointed out the problem for semantics, but the problem extends to syntax.
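The analogy can be made literal (the particular equations below are arbitrary): with three equations and four unknowns, the system is underdetermined, so more than one assignment of values satisfies it, and nothing in the equations themselves tells you which one to pick.

    # Three equations in four unknowns: the constraints do not pin down a solution.
    import numpy as np

    A = np.array([[1.0, 1.0, 0.0, 0.0],
                  [0.0, 1.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0, 1.0]])
    b = np.array([3.0, 5.0, 7.0])

    x1 = np.array([1.0, 2.0, 3.0, 4.0])   # one solution
    x2 = np.array([0.0, 3.0, 2.0, 5.0])   # a different solution to the same equations

    print(np.allclose(A @ x1, b), np.allclose(A @ x2, b))   # True True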

The data-driven argument is based on the observation that diverse languages share many properties. All languages, it is claimed, have nouns and verbs. All languages have consonants and vowels. All languages put agents (the do-ers; Jane in Jane broke the window) in subject position and patients (the do-ees; the window in Jane broke the window) in object position. And so on. (Here's an extensive list.)

Though many researchers subscribe to this universal grammar hypothesis, it has always been controversial. Last year, Evans and Levinson published an extensive refutation of the hypothesis in Behavioral and Brain Sciences. They don't tackle the in-principle argument (it's actually tough to argue against, since it turns out to be logically necessary), but they do take issue with the data-based argument.

Rare languages

Evans and Levinson point out that at best 10% of the world's 7,000 or so languages have been studied in any great detail, and that the bulk of all work on language has focused on English. They claim that researchers only believe in linguistic universals because they've only looked at a relatively small number of often closely-related languages, and they bring up counter-examples to proposed universals found in obscure languages.

This argument cuts both ways. The correct characterization of a language is very, very hard. Much of the work I have been doing lately has been an attempt to correctly characterize the semantics of about 300 related verbs in English. Hundreds of papers have been written about these verbs over the last half-century. Many of them have turned out to be wrong --  not because the researchers were bad, but because the problem is hard.

That's 300 verbs in the most-studied language on the planet, and we still have work to do. Evans and Levinson are basing their arguments on broad-scale phenomena in extremely rare, poorly-studied languages.

A friend of a friend told me...

The rare languages that Evans and Levinson make use of are not -- as they readily acknowledge -- well-understood. In arguing against recursion as a linguistic universal, they bring up Piraha, a language spoken in a handful of villages deep in the Amazon. Without discussing recursion in detail, the basic claim is that there are sentences that are ungrammatical in Piraha, and these sentences are ungrammatical because they require recursion.
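For readers wondering what recursion means here, a toy grammar fragment (mine, and not a claim about Piraha): a rule that can apply to its own output lets you embed phrases inside phrases without limit.

    # Toy recursive rule: a noun phrase can contain another noun phrase as possessor.
    #   NP -> Name | NP 's N
    # Applying the rule to its own output yields unboundedly deep embedding.

    def possessive_np(depth: int) -> str:
        """Build a noun phrase with `depth` levels of possessor embedding."""
        np_phrase = "John"
        nouns = ["brother", "friend", "neighbor", "dog"]
        for i in range(depth):
            np_phrase = f"{np_phrase}'s {nouns[i % len(nouns)]}"
        return np_phrase

    print(possessive_np(1))   # John's brother
    print(possessive_np(3))   # John's brother's friend's neighbor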

To my knowledge, there is one Spanish-Piraha bilingual speaker, in addition to two English-speaking missionaries who, as adults, learned Piraha. The claim that Piraha doesn't have recursion is based on the work of one of those missionaries. So the claim that sentences with recursion are ungrammatical in Piraha rests on a limited number of observations. It's not that I don't trust that particular researcher -- it's that I don't trust any single study (including my own), because it's easy to make mistakes.

Looking back at English, I study emotion verbs in which the subject of the verb experiences an emotion (e.g., fear, like, love). A crucial pillar of one well-known theory from the 1990s was that such verbs can't be prefixed with "un". That is, English doesn't have the words unfeared or unliked. While I agree that these words sound odd, a quick Google search shows that unfeared and unliked are actually pretty common. Even more problematic for the theory, unloved is a perfectly good English word. In fact, many of these verbs do allow "un" prefixation. The author, despite being an experienced researcher and a native speaker of English, was just wrong.

Even assuming that you are correct in claiming that a certain word or sentence doesn't appear in a given language, you could be wrong about why. Some years ago, Michael Tomasello (and others) noticed that certain constructions are rarer in child speech than one might naively expect. He assumed this was because the children didn't know those constructions were grammatical. For instance, in inflected languages such as Spanish or Italian, young children rarely use any given verb in all of its possible forms. A number of people (e.g., Charles Yang) have pointed out that this assumes the children would want to say all those forms. Take a look at this chart of all the forms of the Spanish verbs hablar, comer and vivir. The child might be excused for never using the form habríamos hablado ("we would have spoken") -- that doesn't mean she doesn't know what it is.

In short, even in well-studied languages spoken by many linguists, there can be a lot of confusion. This should give us pause when looking at evidence from a rare language, spoken by few and studied by fewer.

Miracles are unlikely, and rare

Some centuries ago, David Hume got annoyed at people claiming that God must exist -- otherwise, how could you explain the miracles recorded in the Bible? Hume pointed out that, by definition, a miracle is something that is essentially impossible. As a general rule, seas don't part, water doesn't turn into wine, and nobody turns into a pillar of salt. Then consider that any evidence you have that a miracle did in fact happen could be wrong. If a friend tells you they saw someone turn into a pillar of salt, they could be lying. If you saw it yourself, you could be hallucinating. Hume concluded that however strong your evidence that a miracle happened, it could never be as strong as the extreme unlikelihood of the miracle actually happening -- and, in any case, the chance that the Bible is wrong is way higher than the chance that Moses in fact parted the Sea of Reeds.

(For those of you who are worried, this isn't necessarily an argument against the existence of God, just an argument against gullibility.)

Back to the question of universals. Let's say you have a candidate linguistic universal, such as recursion, that has shown up in a large number of unrelated and well-studied languages. These facts have been verified by many, many researchers, and you yourself speak several of the languages in question. So the evidence that this is in fact a linguistic universal is very strong.

Then you come across a paper that claims said linguistic universal doesn't apply in some language X. Either the paper is right, and you have to toss out the linguistic universal, or it's wrong, and you don't. Evans and Levinson err on the side of tossing out the linguistic universal. Given the strength of evidence in favor of some of these universals, and the fact that the counter-examples involve relatively poorly-understood languages, I think one might rather err on the other side. As they say, extraordinary claims require extraordinary evidence.

The solution

Obviously, the solution is not to say something about "extraordinary claims" and wander on. Evans and Levinson's paper includes a plea to researchers to look beyond the usual suspects and start doing more research on distant languages. I couldn't agree more, particularly as many of the world's languages are dying and the opportunity to study them is quickly disappearing.

-------
Evans, N., & Levinson, S. (2009). The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences, 32(5). DOI: 10.1017/S0140525X0999094X

Sure, faces are special, but why?

Faces are special. There appears to be a dedicated area of the brain for processing faces. Neonates just a day or two old prefer looking at pictures of faces to looking at non-faces.

This has led many researchers to claim humans are born with innate knowledge about faces. Others, however, have claimed that these data are not the result of nature so much as nurture. Pawan Sinha at MIT attached a video camera to his infant child and let the tape roll for a few hours. He found that faces were frequently the most salient objects in the baby's visual field, and (I'm working from memory of a talk here) also found that a computational algorithm could fairly easily learn to recognize faces. Similarly, a number of researchers have claimed that the brain area thought to be specialized for face detection is in fact simply involved in detecting any object for which one has expertise, and all humans are simply face experts.

Things seemed to be at an impasse, but today Yoichi Sugita from AIST spoke at both Harvard and MIT. The abstract itself was enough to catch everybody's attention:

Infant monkeys were reared with no exposure to any faces for 12 months. Before being allowed to see a face, the monkeys showed preference for human- and monkey faces in photographs. They still preferred faces even when presented in reversed contrast. But, they did not show preference for faces presented in upside-down. After the deprivation period, the monkeys were exposed first to human faces for a week. Soon after, their preference changed drastically. They preferred upright human faces but lost preference for monkey faces. Furthermore, they lost preference for human faces presented in reversed contrast. These results indicate that the interrelated features of the face can be detected without experience, and that a face prototype develops abruptly when flesh faces are shown.
Just to parse this: the monkeys were raised individually without contact with other monkeys. They did have contact with a human caregiver who wore a mask that obstructed view of the face. The point about not preferring upside down faces is important, as this is a basic feature of face processing.

This seems like pretty decisive evidence for an innate face module in the brain, though one that requires some tuning (the monkeys' face preferences evolved with experience). However, Sugita apparently noted during the talk -- I heard this second-hand -- that perhaps the monkeys in question did in fact have some experience with faces prior to the face-preference test: they could have learned by touching their own faces. This strikes me as a stretch, since that doesn't explain why they would become face experts.