Games with Words: Cognitive Science, January 2010

Cognitive Science, January 2010

Posted by GamesWithWords on Tuesday, June 08, 2010

In my continuing series on the past year in Cognitive Science: January, 2010.

Once again, the discussion of some of these papers will be technical.

January

Lee & Sarnecka. A model of knower-level behavior in number concept development.

Children learn the full meanings of number words slowly, one word at a time. The authors present a Bayesian model of number word acquisition -- or, more specifically, of performance on the famous Give-A-Number task. The model assumes that each child has a certain baseline preference to give certain numbers of items more than others. It also assumes that the child knows certain number words and not others. If the child, say, knows one and two, the child will give that number of items when asked and not when asked about a different number word (e.g., three), even if the child doesn't know what that other number word means.

The model was then fed data on the actual performance of a set of children, and it estimated which words each child knew and what that child's baseline preferences were. The model learned that children prefer to give either a handful of items or all the available items, which accords well with what has been seen over the years. It also seemed to do a reasonable job on several other counts.
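To make the logic concrete, here is a minimal sketch of the two assumptions and the estimation step. It is my own simplification, not the authors' implementation: the base-rate numbers, the single strength parameter, and the function names are all hypothetical, and the base rate is held fixed here, whereas the model described above estimates it from the data.

```python
def response_probs(base_rate, knower_level, asked, strength=10.0):
    """P(child gives k items | asked for `asked`), for k = 1..len(base_rate).

    base_rate: baseline preference for giving 1, 2, ... items (sums to 1).
    knower_level: the child knows the number words 1..knower_level.
    strength: hypothetical factor by which knowledge shifts the base rate.
    """
    weights = []
    for k, prior in enumerate(base_rate, start=1):
        if k > knower_level:
            weights.append(prior)             # unknown number: base rate only
        elif k == asked:
            weights.append(prior * strength)  # known and requested: boosted
        else:
            weights.append(prior / strength)  # known, not requested: avoided
    total = sum(weights)
    return [w / total for w in weights]


def knower_level_posterior(base_rate, trials, max_level=4, strength=10.0):
    """Posterior over knower levels 0..max_level, assuming a uniform prior.

    trials: list of (asked, given) pairs from a Give-A-Number session.
    """
    scores = []
    for level in range(max_level + 1):
        likelihood = 1.0
        for asked, given in trials:
            probs = response_probs(base_rate, level, asked, strength)
            likelihood *= probs[given - 1]
        scores.append(likelihood)
    total = sum(scores)
    return [s / total for s in scores]


# Made-up base rate over 1..5 items: a handful (2 or 3) or all 5 is preferred.
base_rate = [0.10, 0.20, 0.20, 0.05, 0.45]
# Made-up session: correct on "one" and "two", gives everything for "three".
trials = [(1, 1), (2, 2), (3, 5), (1, 1), (2, 2), (3, 5)]

posterior = knower_level_posterior(base_rate, trials)
best_level = max(range(len(posterior)), key=posterior.__getitem__)
print(best_level, [round(p, 3) for p in posterior])
```

With these invented trials, most of the posterior lands on level 2: a child who gets one and two right but hands over everything when asked for three looks like a "two-knower."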

None of this was necessarily surprising, in the sense that the model modeled well-known data correctly. That said, psychological theories are often complex. Theorists (often) state them in loose terms and then make claims about what predictions the theory makes in terms of behavior in different tasks. Without specifying the theory in a formal model, though, it's not always clear that those are in fact the predictions the theory makes. This paper represents, among other things, an attempt to take a well-known theory and show that it does in fact account for the observed data. To the extent it gets things wrong, the model presents a starting point for further refinement.

There has been a movement in some quarters to make and test more explicit models. This is undoubtedly a good thing. The question is whether there are many behaviors that we understand sufficiently well to produce reasonable models ... that aren't so simplistic that the formal model itself doesn't really tell us anything we don't know. That seems to be a point one could argue. One thing I like about this particular model is that the authors attempt to capture fine-grained aspects of individual subjects' performances, which is something we ultimately want to be able to do.

Estigarribia. Facilitation by variation: Right-to-left learning of English yes/no questions

The syntax of questions has played a key role in the development of modern linguistics. In particular, a great deal of ink has been spilled about auxiliary inversion. Compare That is a soccer ball with Is that a soccer ball. Well-known theories of English posit that the auxiliary is is generated in normal declarative position (that is...) and must be moved to the front of the sentence to form a question (is that...).

Estigarribia argues that many theories have assumed parents model auxiliary-inverted questions for their children. A (smallish) corpus analysis reveals that in fact ~20% of parental yes/no questions with auxiliaries are non-auxiliary-initial (that is a soccer ball?). Of all yes/no questions, canonical auxiliary-first questions make up less than half, with sentence fragments being quite common (soccer ball?).

Again looking at the corpus of 6 young children, Estigarribia finds that the children begin by producing the simplest, fragment questions (a soccer ball?). Next, they begin producing what Estigarribia calls subject-predicate questions (that a soccer ball?). Full-on auxiliary-inverted questions appear relatively late (is that a soccer ball). Estigarribia finds this consistent with a learning mechanism in which children learn the ends of sentences better than the beginnings of sentences, similar to the MOSAIC model.

One limitation is that children have difficulty producing long sentences, and the data are consistent with children producing shorter sentences first and eventually progressively-longer sentences. Estigarribia shows that he finds the same order of acquisition even in children who have somewhat longer MLUs at the beginning of the study (that is, produce longer sentences), but one can still worry. The fact that children selectively produce the ends of the sentences rather than the beginning could be due to the fact that the end of a question (a soccer ball?) is a lot more informative than the beginning (is that a?).

It might be somewhat more impressive if children produce non-inverted questions (that is a soccer ball?) before inverted questions, but Estigarribia does not analyze those types of sentences. What I find most compelling about this study is in fact the adult data. As Estigarribia points out, we don't want to think of language acquisition as a process in which children ultimately eliminate non-canonical questions (that is, those without inverted auxiliaries), since in fact adults produce many such sentences.

Nakatani & Gibson. An on-line study of Japanese nesting complexity.

Mary met the senator who attacked the reporter who ignored the president is easier to understand than The reporter who the senator who Mary met attacked ignored the president, even though the latter sentence is grammatical (of sorts) and means the same thing. Why this is the case has been a focus of study in psycholinguistics for many years.

The authors lay out a couple of families of hypotheses. On one (the retrieval hypotheses), the second sentence is harder to interpret because the relevant nouns are far from their verbs, making it harder to integrate ignored with the reporter. On the other (the expectancy hypotheses), all the nested relative clauses (who...) generate expectations about what verbs are coming up. The more expectations, the more has to be kept in memory, and the harder the sentence is.

These hypotheses (and a similar surprisal hypothesis) are tested using the self-paced reading methodology in Japanese, a language with a few nice properties like relatively free word order, which makes controlling the stimuli slightly easier than it is in English. The results ultimately support the expectancy hypotheses over the retrieval hypotheses.

One of the interesting things about this paper is that one well-known retrieval hypothesis is actually Gibson's. So is one of the expectancy hypotheses, which he developed after he (apparently) decided the original theory was probably wrong. The willingness to abandon a cherished theoretical position in the face of new evidence is a trait more prized than seen in academia, and it's something to be admired -- and something very typical of Gibson.

Mirman, Strauss, Dixon & Magnuson. Effect of representational distance between meanings on recognition of ambiguous spoken words.

The authors looked at word recognition using two different paradigms (lexical decision and eye-tracking). All the words could be nouns. Some had only one strong meaning, a noun meaning (acorn, lobster). Some were homophones with two common noun meanings (chest -- chest of drawers or person's chest) and some were homophones with a common noun and a common verb meaning (bark -- the dog barked or the tree's bark).

Participants were fastest to interpret the unambiguous words (acorn, lobster), next fastest at recognizing the noun-verb words (bark) and slowest at the noun-noun words (chest). The authors take this in the context of previous research showing that words with two closely related meanings are faster to interpret than words with two very different meanings. In this study, the two meanings of the noun-verb homophones were no more closely related semantically than those of the noun-noun homophones. So the authors suggest that syntactic distance matters as well -- two meanings of the same syntactic type (e.g., noun) interfere with one another more than two meanings of different types (e.g., noun-verb).

An alternative explanation of these data is one of priming. Two-thirds of the stimuli in this study were unambiguously nouns. This may have primed the noun meanings of the noun-verb homophones and helped automatically suppress the verb meaning. Thus, participants processed the noun-verb homophones more like unambiguous, non-homophonic words. The way to test this, of course, would be to run a similar study with unambiguous verbs, verb-verb homophones, and the same noun-verb homophones.
