Field of Science


Fractionating IQ

Near the dawn of the modern study of the mind, the great psychological pioneer Charles Spearman noticed that people who are good at one kind of mental activity tend to be good at most other mental activities. Thus, the notion of g (for "general intelligence") was born: the idea that there is some underlying factor that determines -- all else equal -- how good someone is at any particular intelligent task. This of course fits folk psychology quite well: g is just another word for "smarts".

The whole idea has always been controversial, and many people have argued that there is more than one kind of smarts out there (verbal vs. numeric, logical vs. creative, etc.). Enter a recent paper by Hampshire and colleagues (Hampshire, Highfield, Parkin & Owen, 2012), which tries to bring both neuroimaging and large-scale Web-based testing to bear on the question.

In the neuroimaging component, they asked sixteen participants to carry out twelve difficult cognitive tasks while their brains were scanned and applied principal components analysis (PCA) to the results. PCA is a sophisticated statistical method for grouping things.

A side note on PCA

If you already know what PCA is, skip to the next section. Basically, PCA is a very sophisticated way of sorting things. Imagine you are sorting dogs. The simplest thing you could do is have a list of dog breeds and go through each dog and sort it according to its breed.

What if you didn't already have a dog breed manual? Well, German shepherds are more similar to one another than any given German shepherd is to a poodle. So by looking through the range of dogs you see, you could probably find a reasonable way of sorting them, "rediscovering" the various dog breeds in the process. (In more difficult cases, there are algorithms you could use to help out.)

That works great if you have purebreds. What if you have mutts? This is where PCA comes in. PCA assumes that there are some number of breeds and that each dog you see is a mixture of those breeds. So a given dog may be 25% German Shepherd, 25% border collie, and 50% poodle. PCA tries to "learn" how many breeds there are, the characteristics of those breeds, and the mixture of breeds that makes up each dog -- all at the same time. It's a very powerful technique (though not without its flaws).
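If you'd like to see the mechanics rather than the dog analogy, here is a minimal sketch using scikit-learn on made-up data; the "traits," the number of components, and all the numbers are invented purely for illustration.

import numpy as np
from sklearn.decomposition import PCA

# Made-up data: 100 "dogs", each described by 5 measured traits
# (size, ear shape, coat length, and so on). Rows are dogs, columns are traits.
rng = np.random.default_rng(0)
traits = rng.normal(size=(100, 5))

# Ask PCA for two underlying "breeds" (components).
pca = PCA(n_components=2)
mixtures = pca.fit_transform(traits)    # each dog expressed as a blend of the two components

print(pca.components_)                  # what each component looks like in trait space
print(pca.explained_variance_ratio_)    # how much variation each component accounts for
print(mixtures[0])                      # dog 0's "mixture" of the two components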

Neuroimaging intelligence

Analysis focused only on the "multiple demands" network previously identified as being related to IQ and shown in red in part A of the graph below. PCA discovered two underlying components that accounted for about 90% of the variance in the brain scans across the twelve tasks. One was particularly important for working memory tasks, so the authors called it MDwm (see part B of the graph below), and it involved mostly the IFO, SFS and ventral ACC/preSMA (see part A below for locations). The other was mostly involved in various reasoning tasks and involved more IFS, IPC and dorsal ACC/preSMA.


Notice that all tasks involved both factors, and some tasks (like the paired associates memory task) involved a roughly equal portion of each.

Sixteen subjects isn't very many

The authors put versions of those same twelve tasks on the Internet. They were able to get data from 44,600 people, which makes it one of the larger Internet studies I've seen. The authors then applied PCA to those data. This time they got three components, two of which were quite similar to the two components found in the neuroimaging study (they correlated at around r=.7, which is a very strong correlation in psychology). The third component seemed to be particularly involved in tasks requiring language. Most likely that did not show up in the neuroimaging study because the neuroimaging study focused on the "multiple demands" network, whereas language primarily involves other parts of the brain.

The factors dissociated in other ways as well. Whereas people's working memory and reasoning abilities start to decline around the time they reach the legal drinking age in the US (coincidence?), verbal skills remain largely undiminished until around age 50. People who suffer from anxiety had lower than average working memory abilities, but average reasoning and verbal abilities. Several other demographic factors similarly had differing effects on working memory, reasoning, and verbal abilities.

Conclusions

The data in this paper are very pretty, and it was a particularly nice demonstration of converging behavioral and neuropsychological methods. I am curious what the impact will be. The authors are clearly arguing against a view on which there is some unitary notion of IQ/g. It occurred to me as I wrote this that while I've read many papers lately discussing the different components of IQ, I haven't read anything recent that endorses the idea of a unitary g. I wonder if there is anyone, and, if so, how they account for this kind of data. If I come across anything, I will post it here.


------
Hampshire, A., Highfield, R., Parkin, B., & Owen, A. (2012). Fractionating Human Intelligence. Neuron, 76(6), 1225-1237. DOI: 10.1016/j.neuron.2012.06.022

Faster fMRI?

A paper demonstrating a new technique for "ultrafast fMRI" has been getting some buzz on the blogosphere. Although movies often depict fMRI showing real-time activity in the brain, in fact typical methods only collect from one slice of the brain at a time, taking a fair amount of time to cover the entire brain (Neuroskeptic puts this at about 2-3 seconds). This new technique (GIN) can complete the job in 50 ms, and without sacrificing spatial resolution (which is the great advantage of fMRI relative to other neuroimaging techniques like EEG or MEG).

Does this mean fMRI is about to get 50 times faster?

Not exactly. What fMRI is measuring is the change in blood oxygenation in areas of your brain. When a particular area starts working harder, more oxygen-rich blood is sent in its direction, and that can be detected using MRI. The limitation is that it takes a while for this blood to actually get there (around 5-10 seconds). One commenter on the Neuroskeptic post (which is where I heard about this article) wrote "making fMRI 50 times faster is like using an atomic clock to time the cooking of a chicken."

The basic fact is that fMRI is never going to compete with EEG or MEG in terms of temporal resolution, because the latter directly measure the electrical activity in the brain and can do so on very fine time-scales. But that doesn't mean that speeding up fMRI data acquisition isn't a good idea. As the authors of the paper write:
fMRI studies, especially related to causality and connectivity, would benefit from reduced repetition time in terms of better statistics and physiological noise characteristics...
They don't really say *how* these studies would achieve this benefit. The rest of the discussion is mostly about how their technique improves on other attempts at ultra-fast fMRI, which tend to have poor spatial resolution. They do mention that maybe ultra-fast fMRI would help simultaneous EEG-fMRI studies to strengthen the link between the EEG signal and the fMRI signal, but it's not obvious to me just how helpful this would be, given the very different timing of EEG and fMRI.

But that's not going to stop me from speculating as to how faster data-acquisition might improve fMRI. (Any readers who know more about fMRI should feel free to step in for corrections/additions).

Speculations

The basic problem is that what you want to do is model the hemodynamic response (the change in blood oxygenation levels) due to a given trial. This response unfolds over a time-course of 5-10 seconds. If you are only measuring what is happening every couple seconds, you have pretty sparse data from which to reconstruct that response. Here's an example of some reconstructed responses (notice they seem to be sampling once every second or so):


Much faster data-collection would help with this reconstruction, leading to more accurate results (and conclusions). The paper also mentions that their technique helps with motion-correction. One of the basic problems in fMRI is that if somebody moves their head/brain even just a few millimeters, everything gets thrown off. It's very hard to sit in a scanner for an hour or two without moving even a smidge (one technique, used by some hard-core researchers, is a bite bar, which is perfectly fitted to your jaw and keeps you completely stabilized). Various statistical techniques can be used to try to mitigate any movement that happens, but they only work so well. The authors of the paper write:
Obviously, all InI-based and comparable imaging methods are sensitive to motion especially at the edges of the brain with possible incorrect estimation of prior information. However, due to the large amount of data, scan times are currently short (4 min in the current study), which mitigates the motion problem.
I take this to mean that because their ultra-rapid scanning technique collects so much data from each trial, you don't need as many trials, so the entire experiment can be shortened. Note that they are focused on the comparison between their technique and other related techniques, not the comparison between their technique and standard fMRI techniques. But it does seem reasonable that more densely sampling the hemodynamic response for an individual trial should mean you need fewer trials overall, thus shortening experiments.
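To get a feel for what denser sampling buys you, here is a toy simulation of my own (not from the paper), using a standard double-gamma approximation of the hemodynamic response; the sampling rates and parameters are purely illustrative.

import numpy as np
from scipy.stats import gamma

def hrf(t):
    # A rough double-gamma hemodynamic response (peak near 5 s, undershoot near 15 s).
    return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0

t_fast = np.arange(0, 30, 0.05)   # 50 ms sampling, in the spirit of the new technique
t_slow = np.arange(0, 30, 3.0)    # a typical whole-brain TR of ~3 s

print(len(t_fast), "samples vs.", len(t_slow), "samples of the same response")
print("peak located at", t_fast[np.argmax(hrf(t_fast))], "s with fast sampling")
print("peak located at", t_slow[np.argmax(hrf(t_slow))], "s with slow sampling")

With the slow grid you get only a handful of points per response, so even something as basic as where the peak falls is estimated coarsely; that is the reconstruction problem described above.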

-----
Boyacioğlu, R., & Barth, M. (2012). Generalized iNverse imaging (GIN): Ultrafast fMRI with physiological noise correction. Magnetic Resonance in Medicine. DOI: 10.1002/mrm.24528

Perspective in language

Language often indicates perspective:
(1) Give me that.
(2) *Give me this.
The reason that (2) is weird -- by convention, an asterisk marks a bad sentence -- is that the word this suggests that whatever is being requested is close to the speaker. Consider also:
(3) Jane came home.
(4) Jane went home.
If we were currently at Jane's home, it would be more natural to say (3) than (4). Of course, we could say (4), but we would be shifting our perspective, treating wherever Jane was as the reference point, rather than where we are now (this is particularly common in story-telling).

A less prosaic example

That is all fairly familiar, so when I turned to section 6.1 of Chapter 1 of Leonard Talmy's Toward a Cognitive Semantics, titled "Perspectival Location", I wasn't expecting anything particularly new. Then I read these examples (p. 69):
(5) The lunchroom door slowly opened and two men walked in.
(6) Two men slowly opened the lunchroom door and walked in.
These sentences describe the same event, but place the reader in a very different position. As Talmy points out, when reading (5), one gets the sense that you are in the lunchroom, whereas in (6), you get the sense that you are outside of the lunchroom ... either that, or the door to the lunchroom is transparent glass.

Implied movement


Talmy gives another great pair of examples on page 71:
(7) There are some houses in the valley.
(8) There is a house every now and then through the valley.
The first sentence implies a static point of view, far from the houses, allowing you to see all the houses at once (Talmy calls this "stationary distal perspective point with global scope of attention"), whereas (8) gives the sense of moving through the valley and among the houses, with only a few within view at any given time ("moving proximal perspective point with local scope of attention").


Writing

Talmy's purpose is to put together a taxonomy of linguistic devices, and most of this chapter is trying to lay out all the different factors along which language can vary (for instance, the different types of perspective one can take). And that is of course why I'm reading it.

But it's also interesting to think about as a writer. One flaw in bad writing is using sentences that adopt the wrong perspective (telling a story about Jennifer, who is in the lunchroom, and then using (6)). This example from Talmy shows just how complicated the issues are ... and the tools available to a good writer for subtly guiding the reader through the story.

Nature, Nurture, and Bayes

I generally have very little good to say about the grant application process, but it does force me to catch up on my reading. I just finished several papers by Amy Perfors, who I think does some of the more interesting computational models of language out there.*

A strange sociological fact about language research is that people generally come in two camps: a) those who don't (really) believe language is properly characterized by hierarchical phrase structure and also don't believe in much innate structure but do believe in powerful innate learning mechanisms, and b) those who believe language is properly characterized by *innate* hierarchical phrase structure and who don't put much emphasis on learning mechanisms. But there's no logically necessary connection between being a Nativist and believing in hierarchical phrase structure or being an Empiricist and believing in relatively simple syntactic forms. In the last few years, Perfors has been staking out some of that (largely) unclaimed territory where hierarchical phrase structure and Empiricism meet.

In "The learnability of abstract syntactic principles," she and her colleagues consider the claim by (some) Nativists that children must have an innate expectation that language be something like a hierarchical context-free grammar because there isn't enough data in the input to rule out alternative grammars. (Empiricists often buck the whole question by saying language is no such thing.) Perfors et al. show that, in fact, with some relatively simple assumptions and a powerful (Bayesian) learning device, the learner would conclude that the most likely representation of English is a hierarchical context-free grammar, based on relatively little input (reproducing what happened in linguistics, where linguists came to the same conclusion). You do have to assume that children have the innate capacity to represent such grammars, but you don't need to assume that they prefer such grammars.
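The underlying logic -- a prior that penalizes complex grammars traded off against how well each grammar fits the corpus -- can be sketched in a few lines. This is a toy illustration of Bayesian model comparison, not Perfors et al.'s actual model, and every number in it is invented.

import math

# Two candidate "grammars" for the same corpus of n sentences. The more complex
# grammar gets a lower prior (it takes more to specify), but fits each observed
# sentence slightly better (higher likelihood per sentence).
n_sentences = 500
grammars = {
    "flat / linear": (math.log(0.7), math.log(0.010)),   # (log prior, log-likelihood per sentence)
    "hierarchical":  (math.log(0.3), math.log(0.012)),
}

for name, (log_prior, loglik_per_sentence) in grammars.items():
    log_posterior = log_prior + n_sentences * loglik_per_sentence   # up to a constant
    print(f"{name:15s} log posterior ~ {log_posterior:.1f}")

# With enough sentences the likelihood term swamps the prior, so the grammar that
# fits the data better wins even though it started out less probable.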

"Joint acquisition of word order and word reference" presents some interesting data bearing on a number of questions, but following the theme above, she notes that her model does not require very much data to conclude that the typical word-order in English is subject-verb-object. She and her colleagues note: "The fact that word order can be acquired quickly from so [little data] despite the lack of bias [for a particular word order] may suggest no need to hypothesize that children are born with strong innate constraints on word ordering to explain their rapid acquisition."

I'm sympathetic to all these points, and I think they bring an important perspective to the question of language learning (one that is not, I should say, unique to Perfors, but certainly a minority perspective). What I can't help wondering is this: she (and others) show that you could learn the structure of language based on the input without (certain) innate assumptions that the input will be of a particular sort. Fine. But why is the input of that particular sort across (most? all?) languages? One thing the Nativist positions Perfors argues against have going for them is that they give a (more or less) principled explanation. Empiricists (typically) do not. (I am aware that some try to give explanations in terms of optimal information structure. What I have seen of this work has not struck me as overwhelmingly convincing, but I admit I haven't read enough of it and that I am willing to be convinced, though my prior on this line of argumentation is fairly low).


*My quasi-journalistic training always makes me want to disclose when I know personally the people I am writing about. But psycholinguistics is a small world. It would be safe for the reader to assume that I know *all* of the people I write about to one degree or another.

*********
Perfors, A., Tenenbaum, J. B., & Regier, T. (2010). The learnability of abstract syntactic principles. Cognition. PMID: 21186021

Maurits, L., Perfors, A., & Navarro, D. (2009). Joint acquisition of word order and word reference. Proceedings of the 31st Annual Conference of the Cognitive Science Society, 1728-1733.

The missing linking hypothesis

Science just published a paper on language evolution to much fanfare. The paper, by Quentin Atkinson, presents an analysis suggesting that language was "invented" just one time in Africa. That language first appeared in Africa would be of little surprise, since that's where we evolved. That there was only one point at which it evolved is somewhat more controversial, and also trivially false if one includes sign languages, at least some of which have appeared de novo in modern times (and one could make a case for including spoken creoles in the list of de novo languages).

What still boggles my mind is the analysis that supports these conclusions. In many ways, it seems brilliant -- but I can't escape the feeling that there is something amiss with the argument. The problem, as we'll see, is a series of missing linking hypotheses.

The Data


The primary finding is that the further you go from Africa (very roughly following plausible migration paths), the fewer phonemes the local language has. Hawai'ian -- the language spoken farthest from our African point of origin -- has only 13 phonemes. Some languages in Africa have more than 100.

To support the claim that this demonstrates that language evolved in Africa, one must add some additional data and hypotheses. One datum is that languages spoken by more people have more phonemes. Atkinson argues that whenever a new population migrated away from the parent population, it would necessarily be a smaller group ... and thus their language would have fewer phonemes than the parent group. Keep this up and over time, you end up with just a few phonemes left.

Population genetics


This argument seems to derive a lot of its plausibility from well-known phenomena in population genetics. Whenever a new population branches off (migrates away), it will almost by definition have less genetic diversity than the mother population. And in fact Africa has greater genetic diversity than other continents.

Atkinson tries to apply the same reasoning to phonemes:
Where individuals copy phoneme distinctions made by the most proficient speakers (with some loss), small population size will reduce phoneme diversity. De Boer models the evolution of vowel inventories using a different approach, in which individuals copy any members of their group with some error, and finds the same population size effect.
I see the logic, but then phonemes aren't genes. When ten people leave home to start a new village, they can only take ten sets of genes with them, and even some of that diversity may be lost because of vagaries of reproduction. Those alleles, once gone, are not easily reconstructed.

As far as I can tell, to apply the same logic to phonemes we have to assume a fair percentage of children fail to learn all the phonemic contrasts in their native language. For some reason, this does not prevent them from communicating successfully. In a large population, the fact that many people lack this or that phonemic contrast doesn't matter, as on average, most people know any given phonemic contrast, and thus it is transmitted across the generations. When a small group leaves home, however, it's quite possible that by accident there will be a phonemic contrast that few (or none) of them use. The next generation is then unlikely to use that contrast.
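Here is a toy simulation of that logic (mine, not Atkinson's): each founder knows each contrast with some probability, and a contrast survives the migration only if at least one founder uses it. All of the parameters are invented.

import numpy as np

rng = np.random.default_rng(1)
n_contrasts = 50      # phonemic contrasts in the parent language
p_known = 0.3         # chance that any one founder commands any one contrast

def surviving_contrasts(group_size):
    # Each founder knows each contrast independently; a contrast survives the
    # migration only if at least one founder uses it.
    knows = rng.random((group_size, n_contrasts)) < p_known
    return knows.any(axis=0).sum()

for size in (1000, 30, 5):
    lost = [n_contrasts - surviving_contrasts(size) for _ in range(500)]
    print(f"founder group of {size:4d}: ~{np.mean(lost):.1f} contrasts lost on average")

At these made-up settings, only very small founding groups lose many contrasts, which is part of why I'd like to see real data on how often speakers actually lack contrasts.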

This may be true, but I don't find its plausibility so overwhelming that I'm willing to accept it on face value. I'd actually like to see data showing that many or most speakers of a given language do not use all the phonemic contrasts (beyond the fact that of course some dialects are missing certain phonemes, as in the fact that Californians do not distinguish between cot and caught; dialectical variation probably cannot support Atkinson's argument, but I leave the proof to the reader ... or to the comment section).

Phonemes and Population Size


Atkinson reports being inspired by the relatively recent finding that languages spoken by more people have more phonemes. Interestingly, the authors of that paper note that "we do not have well-developed theoretical arguments to offer about why this should be." It seems to me that Atkinson's analyses depend crucially on the answer to this puzzle, though as I mentioned at the outset, I haven't been able to quite work out all the details yet.

Atkinson's analysis crucially depends on (among other things) the following supposition: the current population size of any language community is roughly predicted by the number of branching points (migrations) since the original language (which arose somewhere on the order of 50,000 to 100,000 years ago). I'm still on the fence as to whether this claim is preposterous or very reasonable.

It is certainly very easy to construct scenarios on which this supposition would be false. Civilizations expand and contract rapidly (consider that English was confined to only one part of Great Britain half a millennium ago, or that Celtic languages were spoken across Europe only 2,000 years ago). Relative population size today seems to be driven more by poverty, access to birth control and education, etc., than anything else. Atkinson only needs there to be a mid-sized correlation, but 50,000 years is a very, very long time.

Atkinson also needs it to be the case that the further from Africa a language is spoken, the more branching points there have been. The problem we have is that there is a lot of migration within already-settled areas (Indo-European expansion, Mandarin expansion, Bantu expansion, etc.). So we need it to be the case that most of the branching of language groups happened going into new, unsettled areas, and relatively little of it is a result of invading already-populated areas. That may be true, but consider that all of Africa, Europe, Asia and the Americas were settled by 10,000 years ago, which leaves a lot of time for language communities to move around.

Conclusion


Atkinson put together a very interesting dataset that needs to be explained. His explanation may well be the right one. However, his explanation requires making a number of conjectures for which he offers little support. They may all be true, but this is a dangerous way to build theories. It's a little like playing Six Degrees of Kevin Bacon where you are allowed to conjecture the existence of movies and co-stars. It should be obvious that with those rules, you can connect Kevin Bacon to anyone, including yourself.

Talking about Love

Much of my work is on verbs that describe emotion, called "psych verbs." The curious thing about psych verbs is that they come in two varieties, those that put the experiencer of the emotion in subject position (Mary likes/hates/fears John) and those that put the experiencer of the emotion in object position (Mary delights/angers/frightens John).

These verbs have caused a four-decades-long headache for theorists trying to explain how people know what should be the subject and what should be the object of a given verb. Many theorists would like to posit theories on which you put the "do-er" in subject position and the one "done to" in object position. But some psych verbs seem to go one way and some the other.

There are basically only three theoretical possibilities:

a) There's no general rule that will tell you whether the experiencer of an emotion should be the subject or object of a given verb.

b) There's a general rule that tells you the experiencer should be the subject (or, on other theories, the object), and then there are some exceptions.

c) There are no exceptions. There are two kinds of psych verbs that actually mean very different things. Each group follows a particular rule: one sends the experiencer to subject; the other, to object.

I started out as a fan of theory (b). The results of my own work have pushed me in the direction of (c). The only theory that I'm pretty sure is wrong is (a). There are a lot of reasons I think (a) is wrong. One has to do with Broca's aphasia.

Broca's aphasia

People with Broca's aphasia -- typically caused by a stroke or brain injury -- have difficulty with grammar but are relatively good at remembering what individual words mean. Classically, Broca's aphasia was thought to result from damage to Broca's area, though I've heard that association is not as solid as once believed.
Some well-known language-related areas of the brain.

Either way, Maria Mercedes Pinango published a study in 2000 looking at how well Broca's aphasics understand psych verbs. She found that they had particular trouble with experiencer-object verbs (delights/angers/frightens) ... unless the verbs were in passive form (Mary is delighted/angered/frightened by John), in which case they had more trouble with the experiencer-subject verbs.

There are a lot of reasons this could be. The main aspect of the finding that interests me here is that this is *not* what you'd expect on theory (a), since on that theory, all psych verbs are more or less the same and there's no particular reason Broca's aphasia or anything else should impact one more than the other.

One worry one might have about this study is that it was published as a book chapter and not in a journal, and book chapters don't (usually) undergo the same review process. I don't personally know that much about aphasia or how one goes about testing aphasics, so it's hard for me to review Pinango's methods. More importantly, there weren't many participants in the study (these participants are not easy to find), so one would like replication.

Replication


As it happens, Cynthia Thompson and Miseon Lee recently published just such a replication (well, they published it in 2009, but one doesn't always hear about papers right away). It's a nice study with 5 Broca's aphasics, published in the Journal of Neurolinguistics. They tested both sentence comprehension and sentence production, finding that while passive sentences were harder overall, experiencer-subject verbs (like/hate/fear) were easier in the active form and experiencer-object verbs (delight/anger/frighten) were easier in the passive form. This effect was much more pronounced in sentence production than comprehension (in the latter case, it was not strictly significant), most likely because comprehension is easier.

Again, these are not the results you expect if the rules that tell you who should be a subject and who should be an object are verb-by-verb, since then there's no reason brain damage should affect one class of verbs as opposed to another (since there are no verb classes).* What exactly it does mean is much trickier. Give me another 20-30 years, and hopefully I'll have an answer.





*Actually, I can come up with a just-so story that saves theory (a). But it's certainly not what you would expect, and I believe there are a lot of other data from other paradigms that speak against theory (a).

_________

Thompson, C. K., & Lee, M. (2009). Psych verb production and comprehension in agrammatic Broca's aphasia. Journal of Neurolinguistics, 22(4), 354-369. PMID: 20174592

Learning What Not to Say

A troubling fact about language is that words can be used in more than one way. For instance, I can throw a ball, I can throw a party, and I can throw a party that is also a ball.

These cats are having a ball.

The Causative Alternation

Sometimes the relationship between different uses of a word is completely arbitrary. If there's any relationship between the different meanings of ball, most people don't know it. But sometimes there are straightforward, predictable relationships. For instance, consider:

John broke the vase.
The vase broke.

Mary rolled the ball.
The ball rolled.

This is the famous causative alternation. Some verbs can be used with only a subject (The vase broke. The ball rolled) or with a subject and an object (John broke the vase. Mary rolled the ball). The relationship is highly systematic. When there is both a subject and an object, the subject has done something that changed the object. When there is only a subject, it is the subject that undergoes the change. Not all verbs work this way:

Sally ate some soup.
Some soup ate.

Notice that Some soup ate doesn't mean that some soup was eaten, but rather has to mean nonsensically that it was the soup doing the eating. Some verbs simply have no meaning at all without an object:

Bill threw the ball.
*The ball threw.

In this case, The ball threw doesn't appear to mean anything, nonsensical or otherwise (signified by the *). Try:

*John laughed Bill.
Bill laughed.

Here, laughed can only appear with a subject and no object.

The dative alternation

Another famous alternation is the dative alternation:

John gave a book to Mary.
John gave Mary a book.

Mary rolled the ball to John.
Mary rolled John the ball.

Once again, not all verbs allow this alternation:

John donated a book to the library.
*John donated the library a book.

(Some people actually think John donated the library a book sounds OK. That's all right. There is dialectical variation. But for everyone there are verbs that won't alternate.)

The developmental problem


These alternations present a problem for theory: how do children learn which verbs can be used in which forms? A kid who learns that all verbs that appear with both subjects and objects can appear with only subjects is going to sound funny. But so is the kid who thinks verbs can only take one form.
The trick is learning what not to say

One naive theory is that kids are very conservative. They only use verbs in constructions that they've heard. So until they hear "The vase broke," they don't think that break can appear in that construction. The problem with this theory is that lots of verbs are so rare that it's possible that (a) the verb can be used in both constructions, but (b) you'll never hear it used in both.

Another possibility is that kids are wildly optimistic about verb alternations and assume any verb can appear in any form unless told otherwise. There are two problems with this. The first is that kids are rarely corrected when they say something wrong. But perhaps you could just assume that, after a certain amount of time, if you haven't heard, e.g., The ball threw, then threw can't be used without an object. The problem with that is, again, that some verbs are so rare that you'll only hear them a few times in your life. By the time you've heard that verb enough to know for sure it doesn't appear in a particular construction, you'll be dead.

The verb class hypothesis

In the late 1980s, building on previous work, Steven Pinker suggested a solution to this problem. Essentially, there are certain types of verbs which, in theory, could participate in a given alternation. Verbs involving caused changes (break, eat, laugh) in theory can participate in the causative alternation, and verbs involving transfer of possession (roll, donate) in theory can participate in the dative alternation, and this knowledge is probably innate. What a child has to learn is which verbs do participate in the dative alternation.

For reasons described above, this can't be done one verb at a time. And this is where the exciting part of the theory comes in. Pinker (building very heavily on work by Ray Jackendoff and others) argues that verbs have core aspects of their meaning and some extra stuff. For instance, break, crack, crash, rend, shatter, smash, splinter and tear all describe something being caused to fall to pieces. What varies between the verbs is the exact manner in which this happens. Jackendoff and others argue that the shared meaning is what is important to grammar, whereas the manner of falling to pieces is extra information which, while important, is not grammatically central.

Pinker's hypothesis was that verb alternations make use of this core meaning, not the "extra" meaning. From the perspective of the alternation, then, break, crack, crash, rend, shatter, smash, splinter and tear are all the same verb. So children are not learning whether break alternates; they learn whether the whole class of verbs alternates. Since there are many fewer classes than there are verbs (my favorite compendium, VerbNet, has only about 270), the fact that some verbs are very rare isn't that important. If you know what class a verb belongs to, as long as the class itself is common enough, you're golden.
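A minimal sketch of the idea (mine, not Pinker's formal proposal): the learner records alternation evidence at the level of verb classes, so even a verb it has never heard alternate inherits its class's behavior. The verb-to-class mapping below is a made-up stand-in for something like VerbNet.

# Illustrative verb-to-class mapping (a tiny stand-in for a real verb-class inventory).
VERB_CLASS = {
    "break": "fall-to-pieces", "shatter": "fall-to-pieces", "splinter": "fall-to-pieces",
    "roll": "manner-of-motion", "bounce": "manner-of-motion",
    "laugh": "emotional-expression", "giggle": "emotional-expression",
}

# Evidence the child happens to have heard: classes observed in the
# intransitive ("The vase broke") frame.
classes_heard_alternating = set()

def hear_intransitive(verb):
    classes_heard_alternating.add(VERB_CLASS[verb])

def expects_alternation(verb):
    # Generalize from the class, not the individual verb.
    return VERB_CLASS[verb] in classes_heard_alternating

hear_intransitive("break")              # the child hears "The vase broke"
print(expects_alternation("splinter"))  # True: never heard, but same class as "break"
print(expects_alternation("giggle"))    # False: its class has provided no such evidence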

Testing the theory


This particular theory has not been tested as much as one might expect, partly because it is hard to test. It is rather trivial to show that verbs do or don't participate in alternations as a class, partly because that's how verb classes are often defined (that's how VerbNet does it). Moreover, various folks (like Stefanowitsch, 2008) argue that although speakers might notice the verb classes, that doesn't prove that people actually do use those verb classes to learn which verbs alternate and which do not.

The best test, then, is to teach people -- particularly young children -- new verbs that either belong to a class that does alternate or to a class that does not, and see if they think those new verbs should or should not alternate. Very few such studies have been done.

Around the same time Pinker's seminal Learnability and Cognition, which outlines the theory I described above, came out in 1989, a research team led by his student Jess Gropen (Gropen, Pinker, Hollander, Goldberg and Wilson, 1989) published a study of the dative alternation. They taught children new verbs of transfer (such as "moop," which meant to move an object to someone using a scoop), which in theory could undergo the dative alternation. The question they asked was whether kids would be more likely to use those verbs in the alternation if the verbs were monosyllabic (moop) or bisyllabic (orgulate). They were more likely to do so for the monosyllabic verbs, and in fact in English monosyllabic verbs are more likely to alternate. The issue of how many syllables a verb has did come up in Learnability and Cognition, but it wasn't -- at least to me -- the most compelling part of the story (which is why I left it out of the discussion so far!).

Ambridge, Pine and Rowland (2011)


Ben Ambridge, Julian Pine and Caroline Rowland of the University of Liverpool have a new study in press which is the only study to have directly tested whether verb meaning really does guide which constructions a child thinks a given verb can be used in, at least to the best of my knowledge -- and apparently to theirs, since they don't cite anyone else. (I've since learned that Brooks and Tomasello, 1999, might be relevant, but the details are sufficiently complicated and the paper sufficiently long that I'm not yet sure.)

They taught children two novel verbs, one of which should belong to a verb class that participates in the causative alternation (a manner of motion verb: bounce, move, twist, rotate, float) and one of which should not (an emotional expression: smile, laugh, giggle). Just to prove to you that these classes exist, compare:

John bounced/moved/twisted/rotated/floated the ball.

The ball bounced/moved/twisted/rotated/floated.


*John smiled/laughed/giggled Sally.
Sally smiled/laughed/giggled.

Two groups of children (5-6 years old and 9-10 years old) were taught both types of verbs with subjects only. After a lot of training, they were shown new sentences with the verbs and asked to rate how good the sentences were. In the case of the manner of motion verb, they liked the sentences whether the verb had a subject and an object or only a subject. That is, they thought the verb participated in the causative alternation. For the emotion expression verb, however, they thought it sounded good with a subject only; when it had both a subject and an object, they thought it did not sound good. This was true both for the older kids and the younger kids.

This is, I think, a pretty nice confirmation of Pinker's theory. Interestingly, Ambridge and colleagues think that Pinker is nonetheless wrong, but based on other considerations. Partly, our difference of opinion comes from the fact that we interpret Pinker's theory differently. I think I'm right, but that's a topic for another post. Also, there is some disagreement about a related phenomenon (entrenchment), but that, too, is a long post, and the present post is long enough.


____
Gropen, J., Pinker, S., Hollander, M., Goldberg, R., & Wilson, R. (1989). The learnability and acquisition of the dative alternation in English. Language, 65(2). DOI: 10.2307/415332


Ambridge, B., Pine, J. M., & Rowland, C. F. (2011). Children use verb semantics to retreat from overgeneralization errors. Cognitive Linguistics.



So maybe reading *should* be harder

Some weeks back I chided Jonah Lehrer for his assertion that he'd
love [e-readers] to include a feature that allows us to undo their ease, to make the act of reading just a little bit more difficult. Perhaps we need to alter the fonts, or reduce the contrast, or invert the monochrome color scheme. Our eyes will need to struggle, and we’ll certainly read slower, but that’s the point: Only then will we process the text a little less unconsciously, with less reliance on the ventral pathway. We won’t just scan the words – we will contemplate their meaning.
This sounded like a bunch of neuro-babble to me, partly because the research he cited seemed to be about something else entirely.

Obviously, the ventral pathway is the problem.

Spoke too soon


To the rescue come Diemand-Yauman, Oppenheimer & Vaughan, who just published a new paper in my favorite journal, Cognition. The abstract says it all:
Previous research has shown that disfluency -- the subjective experience of difficulty associated with cognitive operations -- leads to deeper processing. Two studies explore the extent to which this deeper processing engendered by disfluency interventions can lead to improved memory performance. Study 1 found that information in hard-to-read fonts was better remembered than easier to read information in a controlled laboratory setting. Study 2 extended this finding to high school classrooms. The results suggest that superficial changes to learning materials could yield significant improvements in educational outcomes.
The first experiment involved remembering 21 pieces of information over a 15-minute interval, which, while promising, has its limitations. Here are the authors:
There are a number of reasons why this result might not generalize to actual classroom environments. First, while the effects persisted for 15 min, the time between learning and testing is typically much longer in school settings. Moreover, there are a large number of other substantive differences between the lab and actual classrooms, including the nature of materials, the learning strategies adopted, and the presence of distractions in the environment... Another concern is that because disfluent reading is, by definition, perceived as more difficult, less motivated students may become frustrated. While paid laboratory participants are willing to persist in the face of challenging fonts for 90 s, the increase in perceived difficulty may provide motivational barriers for actual students.
Or it could just make the students bored.

In a second, truly heroic study, the researchers talked a bunch of teachers at a public high school into sending them all their classroom worksheets and PowerPoint slides. The researchers recreated two versions of these materials: one in an easy-to-read font and one in a difficult-to-read font. Each of the teachers taught at least two sections of the same course, so they were able to use one set of materials with one group of students and the other set with another group. The classes included English, Physics, Chemistry and History.

Once again, the researchers found better learning with the hard-to-read fonts.

Notes and Caveats


The researchers seem open to a number of possibilities as to why hard-to-read fonts would lead to better learning:
It is worth noting that it is not the difficulty, per se, that leads to improvements in learning but rather the fact that the intervention engages processes that support learning.
Moreover, unlike Lehrer, they don't recommend making everything harder to read, learn or do:
Not all difficulties are desirable, and presumably interventions that engage more elaborative processes without also increasing difficulty would be even more effective at improving educational outcomes.
There is one obvious concern one might have about their Experiment 2: the teachers were blind to the hypothesis, but not to condition. The authors attempt to wave this away by asserting that the teachers would likely form the wrong hypothesis (that learning should be worse when the font is hard to read), and thus any "experimenter" bias would be in the wrong direction. However, we have no way of knowing whether the teachers attempted to compensate for the hard-to-read materials by explaining things better. In fact, the authors had no way of testing whether the teachers behaved similarly in both conditions.

That's not at all saying I think it was a bad study or shouldn't have been published. I think it's a fantastic study. I don't know how they roped those teachers into the project, but this is the kind of go-get-it science people should be practicing. The study isn't perfect or conclusive, but no studies are. The goal is simply to have results that are clear enough that they generate more research and new hypotheses.


-------
Diemand-Yauman, C., Oppenheimer, D. M., & Vaughan, E. B. (2011). Fortune favors the bold (and the italicized): Effects of disfluency on educational outcomes. Cognition, 118, 111-115. DOI: 10.1016/j.cognition.2010.09.012

Universal Grammar is dead. Long live Universal Grammar.

Last year, in a commentary on Evans and Levinson's "The myth of language universals: Language diversity and its importance for cognitive science" in Behavioral and Brain Sciences (a journal which publishes one target paper and dozens of commentaries in each issue), Michael Tomasello wrote:
I am told that a number of supporters of universal grammar will be writing commentaries on this article. Though I have not seen them, here is what is certain. You will not be seeing arguments of the following type: I have systematically looked at a well-chosen sample of the world's languages, and I have discerned the following universals ... And you will not even be seeing specific hypotheses about what we might find in universal grammar if we followed such a procedure.
Hmmm. There are no specific proposals about what might be in UG... Clearly Tomasello doesn't read this blog much. Granted, for that he should probably be forgiven. But he also clearly hasn't read Chomsky lately. Here's the abstract of the well-known Hauser, Chomsky & Fitch (2002):
We submit that a distinction should be made between the faculty of language in the broad sense (FLB) and in the narrow sense (FLN). FLB includes a sensory-motor system, a conceptual-intentional system, and the computational mechanisms for recursion, providing the capacity to generate an infinite range of expressions from a finite set of elements. We hypothesize that FLN only includes recursion and is the only uniquely human component of the faculty of language.
Later on, HCF make it clear that FLN is another way of thinking about what elsewhere is called "universal grammar" -- that is, constraints on learning that allow the learning of language.

Tomasello's claim about the other commentaries (that they won't make specific claims about what is in UG) is also quickly falsified, and by the usual suspects. For instance, Steve Pinker and Ray Jackendoff devote much of their commentary to describing grammatical principles that could be -- but aren't -- instantiated in any language.

Tomasello's thinking is perhaps made clearer by a comment later in his commentary:
For sure, all of the world's languages have things in common, and [Evans and Levinson] document a number of them. But these commonalities come not from any universal grammar, but rather from universal aspects of human cognition, social interaction, and information processing...
Thus, it seems he agrees that there are constraints on language learning that shape what languages exist. This, for instance, is the usual counter-argument to Pinker and Jackendoff's nonexistent languages: those languages don't exist because they're really stupid languages to have. I doubt Pinker or Jackendoff are particularly fazed by those critiques, since they are interested in constraints on language learning, and this proposed Stupidity Constraint is still a constraint. Even Hauser, Chomsky and Fitch (2002) allow for constraints on language that are not specific to language (that's their FLB).

So perhaps Tomasello fundamentally agrees with people who argue for Universal Grammar, and this is just a terminology war. They call fundamental cognitive constraints on language learning "Universal Grammar," and he uses the term to refer to something else: for instance, proposals about specific grammatical rules that we are born knowing. Then his claim is that nobody has any proposals about such rules.

If that is what he is claiming, that is also quickly falsified (if it hasn't already been falsified by HCF's claims about recursion). Mark C. Baker, by the third paragraph of his commentary, is already quoting one of his well-known suggested language universals:
(1) The Verb-Object Constraint (VOC): A nominal that expresses the theme/patient of an event combines with the event-denoting verb before a nominal that expresses the agent/cause does.
And I could keep on picking examples. For those outside of the field, it's important to point out that there wasn't anything surprising in the Baker commentary or the Pinker and Jackendoff commentary. They were simply repeating well-known arguments they (and others) have made many times before. And these are not obscure arguments. Writing an article about Universal Grammar that fails to mention Chomsky, Pinker, Jackendoff or Baker would be like writing an article about major American cities without mentioning New York, Boston, San Francisco or Los Angeles.

Don't get me wrong. Tomasello has produced absurd numbers of high-quality studies and I am a big admirer of his work. But if he is going to make blanket statements about an entire literature, he might want to read one or two of the papers in that literature first.

-------
Tomasello, M. (2009). Universal grammar is dead. Behavioral and Brain Sciences, 32(05). DOI: 10.1017/S0140525X09990744

Evans, N., & Levinson, S. (2009). The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences, 32(05). DOI: 10.1017/S0140525X0999094X

Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298(5598), 1569-1579. PMID: 12446899

Baker, M. (2009). Language universals: Abstract but not mythological. Behavioral and Brain Sciences, 32(05). DOI: 10.1017/S0140525X09990604

Pinker, S., & Jackendoff, R. (2009). The reality of a universal language faculty. Behavioral and Brain Sciences, 32(05). DOI: 10.1017/S0140525X09990720

Cognitive Science, March 2010

In my continuing series on the past year in Cognitive Science: March, 2010.

Once again, the discussion of some of these papers will be technical.

March


Baroni, Murphy, Barbu, Poesio. Strudel: A corpus-based semantic model based on properties and types.

You are who your friends are. A number of computational linguists have been interested in just how much you can learn about a word based on the other words it tends to appear with. Interestingly, if you take a word (e.g., dog) and look at the words it tends to co-occur with (e.g., cat), those other words often describe properties or synonyms of the target word. A number of researchers have suggested that this might be part of how we learn the meanings of words.
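The basic move -- represent a word by counts of the words it co-occurs with, and compare those count vectors -- can be sketched as follows. This is a generic toy distributional model for illustration, not Strudel, and the little corpus is obviously made up.

from collections import Counter
from itertools import combinations

corpus = [
    "the dog chased the cat",
    "the cat watched the dog",
    "the dog wore a leash",
    "a banana and an apple",
]

# Count, for each word, the other words appearing in the same sentence.
cooc = {}
for sentence in corpus:
    words = sentence.split()
    for w1, w2 in combinations(set(words), 2):
        cooc.setdefault(w1, Counter())[w2] += 1
        cooc.setdefault(w2, Counter())[w1] += 1

def cosine(a, b):
    shared = set(a) & set(b)
    num = sum(a[w] * b[w] for w in shared)
    den = (sum(v * v for v in a.values()) ** 0.5) * (sum(v * v for v in b.values()) ** 0.5)
    return num / den if den else 0.0

# Words that keep similar company ("dog" and "cat") end up with similar vectors.
print(cosine(cooc["dog"], cooc["cat"]))
print(cosine(cooc["dog"], cooc["banana"]))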

Baroni et al. are sympathetic to that literature, but they point out that such models only learn that dog and cat are somehow related. So they don't actually tell you what the word dog means. Moreover, dog is also related to leash, but not in the same way it's related to cat, which is something those models ignore. Their paper presents a new model, Strudel, which attempts to close some of the gap.

The model also keeps track of what words co-occur with a target word. It additionally tracks how those words are related (e.g., dogs and cats is considered to be different from dogs chase cats). The more different types of constructions that connect the target word and a given "friend", the more important that friend is thought to be.

This model ends up doing a better job than some older models at finding semantic associates of target words. It also can cluster different words (e.g., apple, banana, dog, cat) into categories (fruit, animal) with some success. Moreover, with some additional statistical tricks, they were able to clump the various "friends" into different groups based on the type of constructions they appear in. Properties, for instance, often appear in constructions involving X has Y. Conceptually-similar words appear in other types of constructions (e.g., X is like Y).

This presents some clear advantages over previous attempts, but it has some of the same limitations as well. The model discovers different types of features of a target word (properties, conceptually-similar words, etc.), but the label "property" has to be assigned by the researchers. The model doesn't know that has four legs is a property of dog and that like to bark is not -- it only knows that the two facts are of different sorts.

Perruchet & Tillman. Exploiting multiple sources of information in learning an artificial language: human data and modeling. 

Over the last 15 years, a number of researchers have looked at statistically-based word segmentation. After listening to a few minutes of speech in an unknown language, people can guess which sequences of phonemes are more likely to be words in that language.

It turns out that some sequences of phonemes just sound more like words, independent of any learning. The authors check to see whether that matters. Participants were assigned to learn one of two languages: a language in which half of the words a priori sounded like words, and a language in which half the words a priori sounded particularly not like words. Not only did participants do better in the first condition on the words that sound like words, they did better on the "normal" words, too -- even though those were the same as the "normal" words in the second condition. The authors argue that this is consistent with the idea that already knowing some words helps you identify other words.

They also find that the fact that some words a priori sound more like they are words is easy to implement in their previously-proposed PARSER model, which then produces data somewhat like the human data from the experiment.
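For readers new to this literature, the workhorse statistic is the transitional probability between adjacent syllables: high inside words, low across word boundaries. Here is a toy sketch of that idea (mine, not the PARSER model), using made-up words.

import random
from collections import Counter

random.seed(0)

# Four made-up three-syllable "words", concatenated in random order with no pauses.
words = ["tupiro", "golabu", "bidaku", "padoti"]
stream = "".join(random.choice(words) for _ in range(300))
syllables = [stream[i:i + 2] for i in range(0, len(stream), 2)]

# Transitional probability P(next syllable | current syllable), estimated from the stream.
pair_counts = Counter(zip(syllables, syllables[1:]))
syll_counts = Counter(syllables[:-1])
tp = {pair: count / syll_counts[pair[0]] for pair, count in pair_counts.items()}

# Posit a word boundary wherever the transitional probability dips.
boundaries = [i + 1 for i, pair in enumerate(zip(syllables, syllables[1:])) if tp[pair] < 0.5]
print(boundaries[:6])   # boundaries every three syllables: the made-up words are recovered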

Gildea & Temperley. Do grammars minimize dependency length?

Words in a sentence are dependent on other words. In secondary school, we usually used the term "modify" rather than "depend on." So in The angry butcher yelled at the troublesome child, "the angry butcher" and "at the troublesome child" both modify/depend on yelled. Similarly, "the angry" modifies/depends on butcher. Etc.

This paper explores the hypothesis that people try to keep words close to the words they depend on. The authors worked through the Wall Street Journal corpus and calculated both the actual dependency lengths in each sentence (for each word, count the words that intervene between it and the word it depends on, and sum) and the shortest possible dependency length. They found that observed dependency lengths were much closer to the optimum in both the WSJ corpus and the Brown corpus than would be expected by chance. However, when they looked at two corpora in German, dependency lengths were still shorter than would be expected by chance, but the effect was noticeably smaller. The authors speculate that this is because German has relatively free word order, because it has some verb-final constructions, or for some other reason, or any combination of these.
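The measure itself is easy to state in code. Here is a sketch of the summed dependency length for the example sentence above; the dependency arcs are my own rough annotation, whereas the actual study of course used treebank parses.

# "The angry butcher yelled at the troublesome child"
# Each word is paired with the index of the word it depends on (None for the root).
sentence = ["The", "angry", "butcher", "yelled", "at", "the", "troublesome", "child"]
heads    = [2,      2,       3,         None,     3,    7,     7,             4]

def total_dependency_length(heads):
    # For each word, count how many words separate it from its head, then sum.
    return sum(abs(i - h) - 1 for i, h in enumerate(heads) if h is not None)

print(total_dependency_length(heads))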

Mueller, Bahlmann & Friederici. Learnability of embedded syntactic structures depends on prosodic cues. 

Center-embedded structures are hard to process and also difficult to teach people in artificial grammar learning studies that don't provide feedback. The authors exposed participants to A1A2B1B2 structures with or without prosodic cues. Participants largely failed to learn the grammar without prosodic cues. However, if a falling contour separated each 4-syllable phrase (A1A2B1B2) from the next, participants learned much more. They did even better if a pause was added in addition to the falling contour between 4-syllable phrases. Adding an additional pause between the As and Bs (in order to accentuate the difference between As and Bs) did not provide any additional benefit.

Cognitive Science, January 2010

In my continuing series on the past year in Cognitive Science: January, 2010.

Once again, the discussion of some of these papers will be technical.

January


Lee & Sarnecka. A model of knower-level behavior in number concept development.

Children learn the full meanings of number words slowly, one word at a time. The authors present a Bayesian model of number word acquisition -- or, more specifically, of performance on the famous Give-A-Number task. The model assumes that each child has a certain baseline preference to give certain numbers of items more than others. It also assumes that the child knows certain number words and not others. If the child, say, knows one and two, the child will give that number of items when asked and not when asked about a different number word (e.g., three), even if the child doesn't know what that other number word means.
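A minimal generative sketch of that assumption (mine, loosely following the description above, not Lee & Sarnecka's actual model): a simulated "two-knower" answers a Give-N request from her known words when possible and from a baseline preference otherwise. All the numbers are invented.

import random

random.seed(0)

def give_n_response(request, known_words, base_preference):
    # Simulate one trial of the Give-A-Number task.
    if request in known_words:
        return request                   # the child knows the word: give exactly that many
    # Otherwise fall back on a baseline preference (e.g., grab a handful or give everything),
    # avoiding amounts reserved for words the child does know.
    options = [n for n in base_preference if n not in known_words]
    return random.choice(options)

known = {1, 2}                           # a "two-knower"
baseline = [3, 4, 5, 10, 10, 10]         # weighted toward grabbing all 10 available items

for request in (1, 2, 3, 5):
    print(request, "->", give_n_response(request, known, baseline))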

The model was then fed data on the actual performance of a set of actual children and estimates what words the child knows and what the child's baseline preferences are. The model learned that children prefer to either give a handful of items or all the available items, which accords well with what has been seen over the years. It also seemed to do a reasonable job of doing several other things.

None of this was necessarily surprising, in the sense that the model modeled well-known data correctly. That said, psychological theories are often complex. Theorists (often) state them in loose terms and then make claims about what predictions the theory makes in terms of behavior in different tasks. Without specifying the theory in a formal model, though, it's not always clear that those are in fact the predictions the theory makes. This paper represents, among other things, an attempt to take a well-known theory and show that it does in fact account for the observed data. To the extent it gets things wrong, the model presents a starting point for further refinement.

There has been a movement in some quarters to make and test more explicit models. This is undoubtedly a good thing. The question is whether there are many behaviors that we understand sufficiently well to produce reasonable models ... that aren't so simplistic that the formal model itself doesn't really tell us anything we don't know. That seems to be a point one could argue. One thing I like about this particular model is that the authors attempt to capture fine-grained aspects of individual subjects' performances, which is something we ultimately want to be able to do.

Estigarribia. Facilitation by variation: Right-to-left learning of English yes/no questions

The syntax of questions has played a key role in the development of modern linguistics. In particular, a great deal of ink has been spilled about auxiliary inversion. Compare That is a soccer ball with Is that a soccer ball. Well-known theories of English posit that the auxiliary is is generated in normal declarative position (that is...) and must be moved to the front of the sentence to form a question (is that...).

Estigarribia argues that many theories have assumed parents model auxiliary-inverted questions for their children. A (smallish) corpus analysis reveals that in fact ~20% of parental yes/no questions with auxiliaries are non-auxiliary-initial (that is a soccer ball?). Of all yes/no questions, canonical auxiliary-first questions make up less than half, with sentence fragments being quite common (soccer ball?).

Again looking at the corpus of 6 young children, Estigarribia finds that the children begin by producing the simplest, fragment questions (a soccer ball?). Next, they begin producing what Estigarribia calls subject-predicate questions (that a soccer ball?). Full-on auxiliary-inverted questions appear relatively late (is that a soccer ball). Estigarribia finds this consistent with a learning mechanism in which children learn the ends of sentences better than the beginnings of sentences, similar to the MOSAIC model.
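Here is a toy way to see how a learner that acquires the ends of utterances first -- the right-to-left idea MOSAIC embodies -- would pass through exactly these three stages. This is only an illustration of the idea, not Estigarribia's analysis or the MOSAIC model itself.

```python
def producible_part(utterance, window):
    """Toy right-to-left learner: the child can only reproduce the
    last `window` words of an adult utterance."""
    words = utterance.split()
    return " ".join(words[-window:])

adult_question = "is that a soccer ball"
for window in (3, 4, 5):  # the production window grows with development
    print(producible_part(adult_question, window))
# -> "a soccer ball"          (fragment question)
# -> "that a soccer ball"     (subject-predicate question)
# -> "is that a soccer ball"  (auxiliary-inverted question)
```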

One limitation is that children have difficulty producing long sentences, and the data are consistent with children simply producing shorter sentences first and progressively longer sentences later. Estigarribia shows that he finds the same order of acquisition even in children who have somewhat longer MLUs (mean lengths of utterance) at the beginning of the study -- that is, who already produce longer sentences -- but one can still worry. The fact that children selectively produce the ends of the sentences rather than the beginnings could be due to the fact that the end of a question (a soccer ball?) is a lot more informative than the beginning (is that a?).

It might be somewhat more impressive if children produced non-inverted questions (that is a soccer ball?) before inverted questions, but Estigarribia does not analyze those types of sentences. What I find most compelling about this study is in fact the adult data. As Estigarribia points out, we don't want to think of language acquisition as a process in which children ultimately eliminate non-canonical questions (that is, those without inverted auxiliaries), since in fact adults produce many such sentences.

Nakatani & Gibson. An on-line study of Japanese nesting complexity.

Mary met the senator who attacked the reporter who ignored the president is easier to understand than The reporter who the senator who Mary met attacked ignored the president, even though the latter sentence is grammatical (of sorts) and means the same thing. Why this is the case has been a focus of study in psycholinguistics for many years.

The authors lay out a couple of hypotheses. On one, the second sentence is harder to interpret because the relevant nouns are far from the verbs, making it harder to integrate ignored with the reporter. On other hypotheses, all the nested relative clauses (who...) generate expectations about what verbs are coming up. The more expectations, the more has to be kept in memory, and the harder the sentence is.
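To make the contrast concrete, here is a rough sketch of the two kinds of complexity metric applied to the nested sentence above. The dependencies are hand-coded and the measures are crude simplifications, not the actual metrics from either theory.

```python
sentence = "The reporter who the senator who Mary met attacked ignored the president"
words = sentence.split()

# Retrieval/distance idea: cost grows with how far back a verb's subject is.
dependencies = {"met": "Mary", "attacked": "senator", "ignored": "reporter"}
for verb, subj in dependencies.items():
    distance = words.index(verb) - words.index(subj)
    print(f"{subj} ... {verb}: subject is {distance} words back")

# Expectancy idea: each relative clause opened by "who" predicts another
# upcoming verb; the more verbs simultaneously predicted, the harder the sentence.
expected = 1  # the main-clause verb is predicted from the start
peak = expected
for w in words:
    if w == "who":
        expected += 1   # a new verb is now expected
    if w in dependencies:
        expected -= 1   # a verb arrives and discharges one prediction
    peak = max(peak, expected)
print("peak number of simultaneously predicted verbs:", peak)
```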

These hypotheses (and a similar surprisal hypothesis) are tested using the self-paced reading methodology in Japanese, a language with a few nice properties like relatively free word order, which makes controlling the stimuli slightly easier than it is in English. The results ultimately support the expectancy hypotheses over the retrieval hypotheses.

One of the interesting things about this paper is that one well-known retrieval hypothesis is actually Gibson's. So is one of the expectancy hypotheses, which he developed after he (apparently) decided the original theory was probably wrong. The willingness to abandon a cherished theoretical position in the face of new evidence is a trait more prized than seen in academia, and it's something to be admired -- and something very typical of Gibson.

Mirman, Strauss, Dixon & Magnuson. Effect of representational distance between meanings on recognition of ambiguous spoken words.

The authors looked at word recognition using two different paradigms (lexical decision and eye-tracking). All the words could be nouns. Some had only a strong noun meaning (acorn, lobster). Some were homophones with two common noun meanings (chest -- a chest of drawers or a person's chest), and some were homophones with a common noun and a common verb meaning (bark -- the dog barked or the tree's bark).

Participants were fastest to interpret the unambiguous words (acorn, lobster), next fastest at recognizing the noun-verb words (bark), and slowest at the noun-noun words (chest). The authors take this in the context of previous research showing that words with two closely related meanings are faster to interpret than words with two very different meanings. In this study, the two meanings of the noun-verb homophones were no more closely related than those of the noun-noun homophones. So the authors suggest that syntactic distance matters as well -- two meanings of the same syntactic type (e.g., noun) interfere with one another more than two meanings of different types (e.g., noun-verb).

An alternative explanation of these data is one of priming. 2/3 of the stimuli in this study were unambiguously nouns. This may have primed the noun meanings of the noun-verb homophones and helped automatically suppress the verb meaning. Thus, participants processed the noun-verb homophones more like unambiguous, non-homophonic words. The way to test this, of course, would be to run a similar study with unambiguous verbs, verb-verb homophones, and the same noun-verb homophones.

Cognitive Science, April 2010

This week I was tasked by the lab to check the last year's worth (or so) of issues of Cognitive Science and see what papers might be of interest to folks in the lab (other people are covering other journals). There are of course many good papers not on the list below; I focused largely on the psycholinguistics articles. There are a lot of articles, so I'm going to be breaking up issues into separate posts.

Fair warning: my discussion of these articles is brief and so somewhat technical.

April 2010


Szymanik & Zajenkowski. Comprehension of simple quantifiers: empirical evaluation of a computational model.

Different quantifiers seem to require different amounts of computation. Formal logic suggests that checking the truth of Some of the cars are blue simply requires checking whether at least one car is blue (or failing to find any). Most of the cars are blue probably requires something like figuring out how many cars make up half of the cars and checking whether more than that number are blue. That's harder.
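The intuition can be captured in a few lines of code. This is just a sketch of the two verification procedures, not the authors' computational model.

```python
def check_some(items, prop):
    """'Some of the items are P': succeed as soon as one P item is found."""
    return any(prop(x) for x in items)

def check_most(items, prop):
    """'Most of the items are P': requires counting all the P items and
    comparing the count to half the total -- more work than 'some'."""
    items = list(items)
    return sum(1 for x in items if prop(x)) > len(items) / 2

cars = ["blue", "red", "blue", "green", "blue"]
is_blue = lambda c: c == "blue"
print(check_some(cars, is_blue))  # can stop at the first blue car
print(check_most(cars, is_blue))  # has to tally everything
```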

S&Z had people evaluate the truth value of sentences like those in the examples. People were slower for the "harder" quantifiers. This suggests people are actually running through something like the formal procedures theorists use to describe quantifiers.

The only odd thing about the results is that a ton of research (e.g., Bott & Noveck) has suggested that evaluating sentences with some can be very slow, presumably because it involves a scalar implicature, whereas in this study some was one of the fastest quantifiers. This suggests either that for some reason people weren't computing implicatures in this study, or that the other quantifiers were really slow (or that Polish, the language used, is just different).

Matthews & Bannard. Children's production of unfamiliar word sequences is predicted by positional variability and latent classes in a large sample of child-directed speech.

Two- and three-year-olds were asked to repeat back four-word sequences. Several things were varied, such as how predictable the final word was based on the first three in the sequence (e.g., jelly probably commonly appears after peanut butter and ...) and whether the words that do commonly appear as the fourth word in such a sequence are semantically related (e.g., pretty much everything following I drive a ... is going to be some kind of vehicle).

Importantly, in the actual sequences presented to the children, the final word was one that hardly ever appears in that sequence (e.g., I hate green boxes). Kids were better at repeating the sequences when (1) entropy over the 4th word was high (e.g., many different words commonly follow the first three in the sequence, as in I drive a rather than peanut butter and), and (2) when most words that typically appear in that 4th position are semantically related (I drive a truck/car/bus/Toyota/Ford).
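For readers who want the entropy manipulation spelled out, here is a small sketch. The continuation counts below are invented for illustration, not taken from the paper's corpus.

```python
import math
from collections import Counter

def continuation_entropy(continuations):
    """Shannon entropy (in bits) of the distribution over words that
    follow a given three-word context."""
    counts = Counter(continuations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical continuation counts for two three-word contexts:
peanut_butter_and = ["jelly"] * 90 + ["toast"] * 5 + ["honey"] * 5
i_drive_a = ["car"] * 30 + ["truck"] * 25 + ["bus"] * 20 + ["Toyota"] * 15 + ["Ford"] * 10

print(continuation_entropy(peanut_butter_and))  # low entropy: one dominant continuation
print(continuation_entropy(i_drive_a))          # higher entropy: many plausible continuations
```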

The authors (tentatively) suggest that such results are predicted by theories on which young children's grammars are very item-specific, involving many narrow sentence templates (I drive a + Vehicle), rather than theories on which young children's grammars involve broad abstract categories (e.g., Noun + Verb).

However, my labmate and collaborator Timothy O'Donnell has been working on a computational model that involves abstract grammatical categories but nonetheless stores high-frequency constructions (something that has been allowed, but not specifically explained, by many grammatical theories, such as Pinker's Words & Rules theory, that are the traditional alternatives to item-based theories). One consequence of his model is that if a particular construction appears very frequently with little variation (peanut butter and jelly; 653,000 hits on Google), the model finds slight alternatives to that construction (peanut butter and toast; 120,000 hits on Google) extremely unlikely.

Casasanto, Fotakopoulou & Boroditsky. Space and time in the child's mind: Evidence for a cross-dimensional asymmetry.

4-5 year-olds and 9-10 year-olds watched movies of two animals traveling along parallel paths for different distances or durations and judged which one went longer, temporally or spatially. As has previously been shown in adults, the children's judgments of temporal length were affected by spatial length (e.g., if animal A went farther than B but in a shorter amount of time, children sometimes erroneously said A took more time) more than their judgments of spatial length were affected by temporal length (that is, children were not as likely to be confused about which animal went farther).

One obvious confound, which the authors consider, is that the stimuli stayed on the screen until the children responded, which meant that information about physical distance was available at response time, but children had to remember the duration information. The authors point to a previous study with adults that controlled for this confound and got the same results, but they have not yet run that version with children (since I haven't read the study they refer to and the method isn't described, I can't really comment).

These results are taken as evidence for a theory on which our notion of time is dependent on our representations of space, but not vice versa.

Fay, Garrod, Roberts & Swoboda. The interactive evolution of human communication systems.

People played what amounted to repeated games of Pictionary. The games were played over and over with the same words, and the question was how the pictorial representations changed over repeated games. Participants were assigned either in pairs or communities of 8. The pairs played against each other only. In the communities, games were still played in pairs, but each person played first against one member of the community, then against another, and so on until they had played with everyone.

The people in the pairs condition rapidly created abstract visual symbols for the different target words, as has happened in previous research. What was interesting was that in the communities condition, participants also created similarly abstract symbols that were rapidly shared throughout the community, such that participants who had never played against one another could communicate with abstract symbols that they had each learned from others.

The study is meant to be a model of vocabulary development and dispersal, and it certainly made for a good read (I've been a fan of Garrod's work in other contexts as well). I don't know much about theories of language evolution, so it's difficult for me to say what the theoretical impact of this work is. One obvious question is whether it matters that people in a single community know they are in a single community. That is, did they continue to use the abstract symbols they'd learned because they reasonably thought the other person might know it, or was it instinct after having used that symbol many times?

How many commercial brands does your kid know?

A recently published study by Anna McAlister and T. Bettina Cornwell at the University of Michigan reports that smarter kids are more affected by branding and know more brands. A number of people are interested in this because the naive prediction might have been that smarter people (including kids) should be less impressionable and less susceptible to marketing, rather than more. 

The study caught my eye because it is a nice example of a problem that developmental psychologists run into. One question a developmental psychologist might be interested in is the age at which children acquire a particular ability (like susceptibility to branding). This type of research has implications for education, public policy, etc. But the age estimate you get depends on which kids you test.

It happens to be the case that the children who are most easily recruited into developmental psychology studies tend to be relatively advanced. This happens for a number of reasons. Developmental labs tend to be in universities, which tend to be surrounded by relatively affluent, well-educated communities. Even within a community, not all parents are equally likely to bring their kid in to do a study, and those who do often seem to be the sorts of parents who have advanced children. Many studies may disproportionately test children of professors and graduate students. It's easy to imagine additional reasons.

Unfortunately, there is a problem with the direct link to the study. It should be the top study on this Google Scholar search.

Galileo -- Smarter than you thought

It is often said of cognitive scientists that we have, as a group, a memory that only stretches back about 10 years. This is so for reasons both good and bad. Methods change and improve constantly, making much of the older literature irrelevant. Then there is the fact that there is so much new work that it's hard to find time to read the old.

This is a shame, because some of the really old work is impressive for its prescience. A recent issue of Trends in Neurosciences carried an article on Galileo's work on perception. Most people then -- and probably most people now -- conceived of the senses as passing along an accurate representation of the world to your brain. We now know the senses are plagued by illusions (many of them actually adaptive).

Galileo was on to this fact. His study of the moon proved that perceptions of brightness are constantly subject to illusion. More generally, he noted -- contrary to the popular view -- that much of what we sense about the world is in a real sense an illusion. Objects exist, but colors and tastes in an important sense do not. It's worth presenting a few of the quotes from the article:

I say that, as soon as I conceive of a piece of matter, or a corporeal substance,...I do not feel my mind forced to conceive it as necessarily accompanied by such states as being white or red, bitter or sweet, noisy or quiet, or having a nice or nasty smell. On the contrary, if we were not guided by our senses, thinking or imagining would probably never arrive at them by themselves. This is why I think that, as far as concerns the object in which these tastes, smells, colours, etc., appear to reside, they are nothing other than mere names, and they have their location only in the sentient body. Consequently, if the living being were removed, all these qualities would disappear and be annihilated.

see also:

A wine's good taste does not belong to the objective determinations of the wine and hence of an object, even of an object considered as appearance, but belongs to the special character of the sense in the subject who is enjoying this taste. Colours are not properties of bodies to the intuition of which they attach, but are also only modifications of the sense of sight, which is affected in a certain manner by light.


Piccolino, M., & Wade, N. J. (2008). Galileo Galilei's vision of the senses. Trends in Neurosciences, 31(11).

New research on understanding metaphors

Metaphors present a problem for anybody trying to explain language, or anybody trying to teach a computer to understand language. It is clear that nobody is supposed to take the statement, "Sarah Palin is a barracuda" literally.


However, we can imagine that such phrases are memorized like any other idiom or, for that matter, any word. Granted, we aren't sure how word-learning works, but at least metaphor doesn't present any new problems.

Clever Speech

At least, not as long as it's a well-known metaphor. The problem is that the most entertaining and inventive language often involves novel metaphors.

So suppose someone says "Sarah Palin is the new Harriet Miers." It's pretty clear what this means, but it seems to require some very complicated processing. Sarah Palin and Harriet Miers have many things in common. They are white. They are female. They are Republican. They are American. They were born in the 20th Century. What are the common characteristics that matter?

This is especially difficult, since in a typical metaphor, the common characteristics are often abstract and only metaphorically common.

Alzheimer's and Metaphor

Some clever new research just published in Brain and Language looked at comprehension of novel metaphors in Alzheimer's Disease patients.

It is already known that AD patients do reasonably well on comprehending well-known metaphors. But what about new metaphors?

Before I get to the data, a note about why anybody would bother troubling AD patients with novel metaphors: neurological patients can often help discriminate between theories that are otherwise difficult to distinguish. In this case, one theory is that something called executive function is important in interpreting new metaphors.

Executive function is hard to explain and much about it is poorly understood, but what is important here is that AD patients are impaired in terms of executive function. So they provide a natural test case for the theory that executive function is necessary to understand novel metaphors.

The results

In this study, AD patients were as good as controls at understanding popular metaphors. While control participants were also very good at novel metaphors, AD patients had a marked difficulty. This may suggest that executive function is important in understanding novel metaphors and gives some credence to theories based around that notion.

This still leaves us a long way from understanding how humans so easily draw abstract connections between largely unrelated objects to produce and understand metaphorical language. But it's another step in that direction.


-----
Amanzio, M., Geminiani, G., Leotta, D., & Cappa, S. (2008). Metaphor comprehension in Alzheimer's disease: Novelty matters. Brain and Language, 107(1), 1-10. DOI: 10.1016/j.bandl.2007.08.003