Games with Words

Another problem with statistical translation

Posted by GamesWithWords on Tuesday, March 22, 2011

In the process of writing my latest article for Scientific American Mind, I spent a lot of time testing out automatic translators like Google Translate. As I discuss in the article, these programs have gotten a lot better in recent years, but on the whole they are still not very good.

I was curious what the Italian name of one of my favorite arias meant. So I typed O Soave Fanciulla into Google Translate. Programs like Google Translate are trained by comparing bilingual documents and noting, for a given word in one language, what word typically appears in the other language in the same place. Not surprisingly, Google Translate translated O Soave Fanciulla as O Soave Fanciulla -- no doubt because, in the bilingual corpora GT was trained on, sentences with the phrase o soave fanciulla in Italian had o soave fanciulla in English.

I was reduced to translating the words one at a time: soave -> sweet, fanciulla -> girl. GT thinks o means or, but I expect that's the wrong reading in this context ("or sweet girl"?).
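To make the failure concrete, here is a minimal sketch of the position-matching idea described above. It is not how Google Translate actually works; the function names and the three-sentence corpus are invented for illustration, and the corpus is rigged so that the aria title usually appears untranslated on the English side.

```python
from collections import Counter, defaultdict

def train_word_table(sentence_pairs):
    """Count, for each source word, which target word sits in the same position."""
    table = defaultdict(Counter)
    for source, target in sentence_pairs:
        for src_word, tgt_word in zip(source.lower().split(), target.lower().split()):
            table[src_word][tgt_word] += 1
    return table

def translate(sentence, table):
    """Swap each word for its most frequent counterpart; unknown words pass through."""
    output = []
    for word in sentence.lower().split():
        counts = table.get(word)
        output.append(counts.most_common(1)[0][0] if counts else word)
    return " ".join(output)

# Invented toy corpus (an assumption, not real training data): English texts
# that mention the aria tend to leave its title in Italian.
toy_pairs = [
    ("o soave fanciulla", "o soave fanciulla"),
    ("o soave fanciulla", "o soave fanciulla"),
    ("una fanciulla canta", "a girl sings"),
]

table = train_word_table(toy_pairs)
print(translate("o soave fanciulla", table))
```

Because the untranslated title outnumbers the one genuine translation of fanciulla, the most frequent "translation" of each word is the word itself.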

Blogger Spam Filter: Not Totally Useless

Posted by GamesWithWords on Monday, February 28, 2011

For the first time ever, Google/Blogger's spam filter actually caught a spam comment. Usually, it lets the spam go right through unmolested and only traps legitimate comments.

We can hope this is the start of a trend.

Overheard: Converting common knowledge into scientific knowledge

Posted by josh on Sunday, February 20, 2011

Because they are so familiar, it is easy to assume that category labels drawn from everyday language are self-evidently the correct way to describe emotion. However, transforming everyday categorical descriptions into an effective research tool is at the least a challenge.

Cowie & Cornelius (2003) Describing the emotional states that are expressed in speech. Speech Communication 40, 5-32.

New experiment: Mind Reading Quotient

Posted by GamesWithWords on Tuesday, February 15, 2011

Language requires a lot of inference. Consider the following three conversations:

A: Are there lots of people at the party?
B: Well, most people have left already.

A: How long has the party been going on?
B: Well, most people have left already.

A: Is it a good party?
B: Well, most people have left already.

In each of these cases, B's statement literally means the same thing, but the interpretation is different. Explaining (a) why this should be the case, and (b) how people figure out the implicit meanings is a very active area of research in modern linguistics and psycholinguistics.

The Mind Reading Quotient

Basically, understanding conversations like the ones above seems to require a certain amount of "mind reading" -- that is, guessing what the speaker (B, in this case) means to say. If you've ever wondered "what did she mean by that?" you were engaged in this kind of mind reading.

I just posted a new experiment -- the Mind Reading Quotient -- which consists of several short tests of this kind of mind reading ability. A couple of the tests look specifically at trying to work out what somebody is saying. A couple of the tests look at similar skills in the non-linguistic domain.

My favorite of the non-linguistic tasks is a coordination game. Thomas Schelling won a Nobel Prize in part for pioneering work on the topic. He found that people are very good at guessing what another person is thinking under certain conditions. For instance, if you tell two people they must meet up in New York City -- but without communicating with each other in any way -- they are actually fairly likely to succeed. Most likely, they would both show up on the corner of Times Square (or in one of a very small number of likely locations). The Mind Reading Quotient includes several such problems.

The goal of this study in part is to get a sense of how good people are at such tasks. There are a lot of thought experiments out there, but not nearly enough data. I will also be looking to see if people who are better at one of these tasks are also better at the others -- that is, is there a single underlying "mind reading ability," or does each task require a separate set of skills?

Reports so far are that the experiment runs 20-25 minutes. Because this is broken up into 7 separate activities, it should seem faster than that. And a lot of the tasks are fun (at least, I think so). Plus, at the end of the experiment, you'll be able to see your scores on many of the different sub-tasks. In two cases (a vocabulary test and an empathy test), I also have percentile scores already worked out, so you can see how you compare to average.

Follow this link to the study.

---
For previous posts about pragmatics and other linguistic inferences, check out this one, this one and this one.

image CC by Ignacio Conejo.

Missing Words

Posted by josh on Thursday, February 10, 2011

My dictionary lists several Chinese words for disdain, but none for discourage. The government in Orwell's 1984 would have loved this, as they -- along with many contemporary writers (I'm talking about you, Bill Bryson) -- believed that if you don't have a word for something, you can't think about it. I guess China has no need for the motivational speaker industry.

You can't be discouraged if you don't have a word for it.

Unfortunately for the government of Oceania, there's very little evidence this is true. The availability of certain words in a language may have effects on memory or speeded recognition, but probably does nothing so drastic as making certain thoughts inaccessible. I think examples like the one above make it clear just how unlikely the hypothesis was to be true to begin with.

-----
photo credit here.

New Experiment: Drama Queen

Posted by GamesWithWords on Thursday, February 03, 2011

The latest experiment in my quest to understand how people use emotion verbs is now posted. You will be introduced to a character who is, as the name of the game implies, a drama queen. She has many fraught relationships with her friends. You will be introduced to a number of friends, how Susan feels about each friend, and a new verb that you will try to use to describe that relationship. Enjoy.

Love, Marriage & Race

Posted by josh on Wednesday, February 02, 2011

People who have been following this blog know that birth order affects who you are friends with and who you marry. Here's some comprehensive evidence on race. It probably won't come as a surprise, but it's nice to have numbers.

Talking about Love

Posted by GamesWithWords on Wednesday, January 26, 2011

Much of my work is on verbs that describe emotion, called "psych verbs." The curious thing about psych verbs is that they come in two varieties, those that put the experiencer of the emotion in subject position (Mary likes/hates/fears John) and those that put the experiencer of the emotion in object position (Mary delights/angers/frightens John).

These verbs have caused a four-decades-long headache for theorists trying to explain how people know what should be the subject and what should be the object of a given verb. Many theorists would like to posit theories on which you put the "do-er" in subject position and the one "done to" in object position. But some psych verbs seem to go one way and some the other.

There are basically only three theoretical possibilities:

a) There's no general rule that will tell you whether the experiencer of an emotion should be the subject or object of a given verb.

b) There's a general rule that tells you the experiencer should be the subject (or, on other theories, the object), and then there are some exceptions.

c) There are no exceptions. There are two kinds of psych verbs that actually mean very different things. Each group follows a particular rule: one sends the experiencer to subject; the other, to object.

I started out as a fan of theory (b). The results of my own work have pushed me in the direction of (c). The only theory that I'm pretty sure is wrong is (a). There are a lot of reasons I think (a) is wrong. One has to do with Broca's aphasia.

Broca's aphasia

People with Broca's aphasia -- typically caused by a stroke or brain injury -- have difficulty with grammar but are relatively good at remembering what individual words mean. Classically, Broca's aphasia was thought to result from damage to Broca's area, though I've heard that association is not as solid as once believed.

Some well-known language-related areas of the brain.

Either way, Maria Mercedes Pinango published a study in 2000 looking at how well Broca's aphasics understand psych verbs. She found that they had particular trouble with experiencer-object verbs (delights/angers/frightens) ... unless the verbs were in passive form (Mary is delighted/angered/frightened by John), in which case they had more trouble with the experiencer-subject verbs.

There are a lot of reasons this could be. The main aspect of the finding that interests me here is that this is *not* what you'd expect on theory (a), since on that theory, all psych verbs are more or less the same and there's no particular reason Broca's aphasia or anything else should impact one more than the other.

One worry one might have about this study was that it was published as a book chapter and not in a journal, and book chapters don't (usually) undergo the same review process. I don't personally know that much about aphasia or how one goes about testing aphasics, so it's hard for me to review Pinango's methods. More importantly, there weren't many participants in the study (these participants are not easy to find), so one would like replication.

Replication

As it happens, Cynthia Thompson and Miseon Lee recently published just such a replication (well, they published it in 2009, but one doesn't always hear about papers right away). It's a nice study with 5 Broca's aphasics, published in the Journal of Neurolinguistics. They tested both sentence comprehension and sentence production, finding that while passive sentences were harder overall, experiencer-subject verbs (like/hate/fear) were easier in the active form and experiencer-object verbs (delight/anger/frighten) were easier in the passive form. This effect was much more pronounced in sentence production than comprehension (in the latter case, it was not strictly significant), most likely because comprehension is easier.

Again, these are not the results you expect if the rules that tell you who should be a subject and who should be an object are verb-by-verb, since then there's no reason brain damage should affect one class of verbs as opposed to another (since there are no verb classes).* What exactly it does mean is much trickier. Give me another 20-30 years, and hopefully I'll have an answer.

*Actually, I can come up with a just-so story that saves theory (a). But it's certainly not what you would expect, and I believe there are a lot of other data from other paradigms that speak against theory (a).

_________

Thompson CK, and Lee M (2009). Psych verb production and comprehension in agrammatic Broca's aphasia. Journal of neurolinguistics, 22 (4), 354-369 PMID: 20174592

New York Times, You Can't Handle the Truth.

Posted by GamesWithWords on Tuesday, January 25, 2011

Earlier today I wrote about the research behind an opinion article at the New York Times. When I looked at the sources cited, I was unable to find any information supporting the claims made in the article. In fact, what I found directly contradicted those claims. I finished by saying that while I was willing to believe these claims, I'd like to know what data support them. In passing, I mentioned that I had submitted an abbreviated version of this analysis as a comment on the Times website.

That comment was not published. I figured maybe there had been a computer error, so I submitted another one later in the day. That one was also not published. Finally, at 6:13pm, I submitted an innocuous and useless comment under an assumed name:

I agree with Pat N. It's nice to hear from someone who has some optimism (@ Dr. Q).

This comment was published almost immediately.

The Times states that "comments are moderated and generally will be posted if they are on-topic and not abusive." Since the moderators didn't publish the comment, we can conclude one of two things:

1) Discussion of the empirical claims made in a New York Times article is not "on topic."
2) Pointing out a mistake made in a New York Times article is a kind of abuse.

Do students at selective schools really study less?

Posted by GamesWithWords on Tuesday, January 25, 2011

*Updated with More Analysis*

So says Philip Babcock in today's New York Times. He claims:

Full-time college students in the 1960s studied 24 hours per week, on average, and their counterparts today study 14 hours per week. The 10-hour decline is visible for students from all demographic groups and of all cognitive abilities, in every major and at every type of college.

The claim that this is true for "every type of college" is important because he wants to conclude that schools have lowered their standards. The alternative is that there are more low-quality schools now, or that some schools have massively lowered their standards. These are both potentially problems -- and are probably real -- but are not quite the same problem as all schools everywhere lowering their standards.

So it's important to show that individual schools have lowered their standards, and that this is true for the selective schools as well as the not-selective schools. The article links to this study by Babcock. This study analyzes a series of surveys of student study habits from the 1960s to the 2000s, and thus seems to be the basis of his argument, and in fact the introduction contains almost the identical statement that I have quoted above. Nonetheless, despite these strong conclusions, the data that would support them appear to be missing.

SAT scores and size are not available in the early years, so study time by college selectivity is not reported.

He goes on to say that he can look at selectivity in the more recent surveys: specifically matched 1988-2003 surveys. These do show a decrease in study time on the order of 1-2 hours for high-, medium- and low-selectivity schools (I cannot find how selectivity was defined). Whether this is even statistically significant is unclear, as he does not report any statistics or confidence intervals. In any case, it is not a 10-hour difference.

What Babcock might have meant, and more problems with the data

It is possible that when Babcock was saying that the decrease in study time was true of all types of schools, he meant that when you look at all types of schools in 2003/4, students at all levels report studying less than the average student reported in 1961. The problem is that, for all we know, the schools in his sample were more selective in 1961 than they were in 2003/4.

Moreover, there is something worrisome about his selectivity data. Whenever analyzing data, many researchers like to do what is called a "sanity check": they make sure that the data contain results that are known to be true. If you were looking at a study of different types of athletes, you might make sure that the jockeys are shorter than the basketball players, lighter than the football players and chew less tobacco than the baseball players. If you find any of these things do not hold, you might go back and make sure there isn't a typo somewhere in your data-entry process.

I worry that Babcock's data fail the sanity check. Specifically, look at the number of hours studied according to selectivity of school in 2003:

highly selective: 13.47 hours
middle: 14.68 hours
non-selective: 16.49 hours

Note that this effect is larger than the decline in number of hours studied between 1988 and 2003, so in terms of this dataset, this is a large effect (again, I cannot tell if it is significant, because the relevant statistical information is not provided) and it's not in the direction one would think. I will admit that it is possible that students at highly selective schools really do study less than the folks at JuCo, but that conflicts heavily with my pretty extensive anecdotal database. So either a) the world is very different from how I thought it was -- in which case, I want more evidence than just this survey -- b) Babcock has defined selectivity incorrectly, or c) there is something wrong with these data.
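For readers who like to see a sanity check spelled out, here is a minimal sketch using only the three 2003 figures quoted above. The expected ordering (more selective means more study time) is my own prior expectation, not something in Babcock's paper, and the function name is made up.

```python
# Hours studied per week in 2003 by school selectivity, as quoted above.
hours_2003 = {
    "highly selective": 13.47,
    "middle": 14.68,
    "non-selective": 16.49,
}

def passes_sanity_check(hours, expected_order):
    """True if study time strictly decreases along expected_order."""
    values = [hours[group] for group in expected_order]
    return all(earlier > later for earlier, later in zip(values, values[1:]))

# Assumption: the prior expectation is that students at more selective
# schools study more, so hours should fall as selectivity falls.
expected_order = ["highly selective", "middle", "non-selective"]

check_ok = passes_sanity_check(hours_2003, expected_order)
gap = round(hours_2003["non-selective"] - hours_2003["highly selective"], 2)

print(check_ok)  # False: the ordering is the reverse of the expectation
print(gap)       # 3.02 hours between the two extremes
```

The roughly 3-hour gap between the extremes is what I am comparing against the 1-2 hour decline between 1988 and 2003.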

One last worrisome fact

I considered the possibility that the data Babcock was quoting were in a different paper. The only other paper on Babcock's website that looked promising was this American Enterprise Institute report. This is not a research paper, but rather summarizes research. Specifically, according to footnote #2, it summarizes the research in the paper I just discussed. Strangely, this paper does have a graph (Figure 4) breaking down study habits of students in the 1960s based on selectivity of the school they are attending: the very data he states do not exist in the later paper.

I'm not really sure what to make of that, and have nothing further to say on the topic. At the very least, I would be hesitant to use those graphs as evidence to support the general claim that study habits have changed even at the selective schools, since it's unclear where the data came from, or if in fact they even exist (to be clear: it's Babcock who says they don't exist, not me).

Conclusion

To summarize, there seems to be very little evidence to support Babcock's conclusion that study time has decreased even at selective schools by 10 hours from the 1960s to modern day. That is, he has a survey from 1961 in which students studied 25 hrs/week, two surveys in the 1980s in which students studied 17 hours/week, and two surveys in the 2000s in which students studied 14-15 hrs/week, but these surveys are all based on different types of students at different schools, so it's hard to draw any strong conclusions. If I compared the weight of football players from Oberlin in 1930 and Ohio State in 2005, I'd find a great increase in weight, but in fact the weight of football players at Oberlin probably has not increased much over that time period.

Moreover, there are aspects of these data that deserve some skepticism. When I report to people who went to selective schools that these data suggest students at such schools study 13 hrs/week, the response is usually something like, "Do you mean per day?"

Finally, since no statistics were run, it's quite possible that none of the results in this study are significant.

I want to be clear that I'm not saying that Babcock's claims aren't true. I'm just saying that it's not clear he has any evidence to support them (which is not to say I think it's a bad study: it was a good study to have done and clearly took a lot of work, but I find it at best suggestive of future avenues of research and certainly not conclusive).

New tags

Posted by GamesWithWords on Friday, January 21, 2011

Rather than write a new blog post (or my nearly-due BUCLD proceedings paper), I decided to revamp the post tags on this blog. Their usage has been inconsistent, which is making it harder and harder to find old blog posts that I want to link to.

Hopefully the new and improved tags will also be useful for you, dear reader. Now if you want to find any of my articles on the academic career path, on animal language or on universal grammar -- just to give a few examples -- they are only a mouse click away.

In addition to standard tags, there is also a series of tags beginning with the preposition "on". These appear on most posts now and are more meta-thematic than the others.

Learning What Not to Say

Posted by GamesWithWords on Monday, January 17, 2011

A troubling fact about language is that words can be used in more than one way. For instance, I can throw a ball, I can throw a party, and I can throw a party that is also a ball.

These cats are having a ball.

The Causative Alternation

Sometimes the relationship between different uses of a word is completely arbitrary. If there's any relationship between the different meanings of ball, most people don't know it. But sometimes there are straightforward, predictable relationships. For instance, consider:

John broke the vase.
The vase broke.

Mary rolled the ball.
The ball rolled.

This is the famous causative alternation. Some verbs can be used with only a subject (The vase broke. The ball rolled) or with a subject and an object (John broke the vase. Mary rolled the ball). The relationship is highly systematic. When there is both a subject and an object, the subject has done something that changed the object. When there is only a subject, it is the subject that undergoes the change. Not all verbs work this way:

Sally ate some soup.
Some soup ate.

Notice that Some soup ate doesn't mean that some soup was eaten, but rather has to mean nonsensically that it was the soup doing the eating. Some verbs simply have no meaning at all without an object:

Bill threw the ball.
*The ball threw.

In this case, The ball threw doesn't appear to mean anything, nonsensical or otherwise (signified by the *). Try:

*John laughed Bill.
Bill laughed.

Here, laughed can only appear with a subject and no object.

The dative alternation

Another famous alternation is the dative alternation:

John gave a book to Mary.
John gave Mary a book.

Mary rolled the ball to John.
Mary rolled John the ball.

Once again, not all verbs allow this alternation:

John donated a book to the library.
*John donated the library a book.

(Some people actually think John donated the library a book sounds OK. That's all right. There is dialectal variation. But for everyone there are verbs that won't alternate.)

The developmental problem

These alternations present a problem for theory: how do children learn which verbs can be used in which forms? A kid who learns that all verbs that appear with both subjects and objects can appear with only subjects is going to sound funny. But so is the kid who thinks verbs can only take one form.

The trick is learning what not to say

One naive theory is that kids are very conservative. They only use verbs in constructions that they've heard. So until they hear "The vase broke," they don't think that break can appear in that construction. The problem with this theory is that lots of verbs are so rare that it's possible that (a) the verb can be used in both constructions, but (b) you'll never hear it used in both.

Another possibility is that kids are wildly optimistic about verb alternations and assume any verb can appear in any form unless told otherwise. There are two problems with this. The first is that kids are rarely corrected when they say something wrong. But perhaps you could just assume that, after a certain amount of time, if you haven't heard e.g. The ball threw then threw can't be used without an object. The problem with that is, again, that some verbs are so rare that you'll only hear them a few times in your life. By the time you've heard that verb enough to know for sure it doesn't appear in a particular construction, you'll be dead.

The verb class hypothesis

In the late 1980s, building on previous work, Steven Pinker suggested a solution to this problem. Essentially, there are certain types of verbs which, in theory, could participate in a given alternation. Verbs involving caused changes (break, eat, laugh) in theory can participate in the causative alternation, and verbs involving transfer of possession (roll, donate) in theory can participate in the dative alternation, and this knowledge is probably innate. What a child has to learn is which verbs do participate in the dative alternation.

For reasons described above, this can't be done one verb at a time. And this is where the exciting part of the theory comes in. Pinker (building very heavily on work by Ray Jackendoff and others) argues that verbs have core aspects of their meaning and some extra stuff. For instance, break, crack, crash, rend, shatter, smash, splinter and tear all describe something being caused to fall to pieces. What varies between the verbs is the exact manner in which this happens. Jackendoff and others argue that the shared meaning is what is important to grammar, whereas the manner of falling to pieces is extra information which, while important, is not grammatically central.

Pinker's hypothesis was that verb alternations make use of this core meaning, not the "extra" meaning. From the perspective of the alternation, then, break, crack, crash, rend, shatter, smash, splinter and tear are all the same verb. So children are not learning whether break alternates; they learn whether the whole class of verbs alternates. Since there are many fewer classes than there are verbs (my favorite compendium VerbNet has only about 270), the fact that some verbs are very rare isn't that important. If you know what class it belongs to, as long as the class itself is common enough, you're golden.
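The class-based idea can be sketched in a few lines. This is a toy illustration rather than Pinker's model or VerbNet's actual data: the class labels, the tiny lexicon and the function names are all hypothetical. The learner pools what it has heard over the whole class, so a verb it has never heard in a given frame inherits the frames of its classmates.

```python
from collections import defaultdict

# Hypothetical toy lexicon: verb -> semantic class (labels invented here).
VERB_CLASS = {
    "break": "caused-breaking",
    "shatter": "caused-breaking",
    "splinter": "caused-breaking",
    "laugh": "emotional-expression",
    "giggle": "emotional-expression",
    "smile": "emotional-expression",
}

def learn_class_frames(observations):
    """Pool the frames heard for each verb into a set of frames per class."""
    class_frames = defaultdict(set)
    for verb, frame in observations:
        class_frames[VERB_CLASS[verb]].add(frame)
    return dict(class_frames)

def allows(verb, frame, class_frames):
    """A verb allows a frame if any member of its class was heard in that frame."""
    return frame in class_frames.get(VERB_CLASS[verb], set())

# What the learner has heard; "splinter" and "smile" never occur in the input.
heard = [
    ("break", "transitive"),      # John broke the vase.
    ("break", "intransitive"),    # The vase broke.
    ("shatter", "transitive"),
    ("laugh", "intransitive"),    # Bill laughed.
    ("giggle", "intransitive"),
]

class_frames = learn_class_frames(heard)
print(allows("splinter", "intransitive", class_frames))  # True, via its class
print(allows("smile", "transitive", class_frames))       # False, via its class
```

The point of the sketch is only that rare verbs stop being a problem once the evidence is counted per class rather than per verb.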

Testing the theory

This particular theory has not been tested as much as one might expect, partly because it is hard to test. It is rather trivial to show that verbs do or don't participate in alternations as a class, partly because that's how verb classes are often defined (that's how VerbNet does it). Moreover, various folks (like Stefanowitsch, 2008) argue that although speakers might notice the verb classes, that doesn't prove that people actually do use those verb classes to learn which verbs alternate and which do not.

The best test, then, is to teach people -- particularly young children -- new verbs that either belong to a class that does alternate or to a class that does not and see if they think those new verbs should or should not alternate. Very few such studies have been done.

Around the same time that Pinker's seminal Learnability and Cognition, which outlines the theory I described above, came out in 1989, a research team led by his student Jess Gropen (Gropen, Pinker, Hollander, Goldberg and Wilson, 1989) published a study of the dative alternation. They taught children new verbs of transfer (such as "moop," which meant to move an object to someone using a scoop), which in theory could undergo the dative alternation. The question they asked was whether kids would be more likely to use those verbs in the alternation if the verbs were monosyllabic (moop) or bisyllabic (orgulate). They were more likely to do so for the monosyllabic verbs, and in fact in English monosyllabic verbs are more likely to alternate. This issue of how many syllables the verb has did come up in Learnability and Cognition, but it wasn't -- at least to me -- the most compelling part of the story (which is why I left it out of the discussion so far!).

Ambridge, Pine and Rowland (2011)

Ben Ambridge, Julian Pine and Caroline Rowland of the University of Liverpool have a new study in press which is the only study to have directly tested whether verb meaning really does guide which constructions a child thinks a given verb can be used in, at least to the best of my knowledge -- and apparently to theirs, since they don't cite anyone else. (I've since learned that Brooks and Tomasello, 1999, might be relevant, but the details are sufficiently complicated and the paper sufficiently long that I'm not yet sure.)

They taught children two novel verbs, one of which should belong to a verb class that participates in the causative alternation (a manner of motion verb: bounce, move, twist, rotate, float) and one of which should not (an emotional expression: smile, laugh, giggle). Just to prove to you that these classes exist, compare:

John bounced/moved/twisted/rotated/floated the ball.

The ball bounced/moved/twisted/rotated/floated.

*John smiled/laughed/giggled Sally.
Sally smiled/laughed/giggled.

Two groups of children (5-6 years old and 9-10 years old) were taught both types of verbs with subjects only. After a lot of training, they were shown new sentences with the verbs and asked to rate how good the sentences were. In the case of the manner of motion verb, they liked the sentences whether the verb had a subject and an object or only a subject. That is, they thought the verb participated in the causative alternation. For the emotion expression verb, however, they thought it sounded good with a subject only; when it had both a subject and an object, they thought it did not sound good. This was true both for the older kids and the younger kids.

This is, I think, a pretty nice confirmation of Pinker's theory. Interestingly, Ambridge and colleagues think that Pinker is nonetheless wrong, but based on other considerations. Partly, our difference of opinion comes from the fact that we interpret Pinker's theory differently. I think I'm right, but that's a topic for another post. Also, there is some disagreement about a related phenomenon (entrenchment), but that, too, is a long post, and the present post is long enough.

____
Gropen, J., Pinker, S., Hollander, M., Goldberg, R., and Wilson, R. (1989). The Learnability and Acquisition of the Dative Alternation in English. Language, 65 (2). DOI: 10.2307/415332

Ambridge, B., Pine, J. M., and Rowland, C. F. (2011). Children use verb semantics to retreat from overgeneralization errors. Cognitive Linguistics.

For picture credits, look here and here.

New Experiment: EmotionSense

Posted by GamesWithWords on Tuesday, January 11, 2011

I just posted a new experiment on the website: EmotionSense. I have lately gotten very interested in verb-learning, specifically how we decide which of the participants in an event should be the grammatical subject, which the grammatical object, etc. (see this post and this one). In this experiment, you'll answer some questions about different types of emotions. I'll use this information to help design some upcoming verb-learning experiments.

As usual, the experiment is short and should take 3-5 minutes.

[snappy caption goes here]

-----

photo credit here

Learning the passive

Posted by GamesWithWords on Monday, January 10, 2011

If Microsoft Word had its way, passive verbs would be excised from the language. That would save children some trouble: passive verbs are more difficult to learn than one might think, because not all verbs passivize. Consider:

*The bicycle was resembled by John.
*Three bicycles are had by John.
*Many people are escaped by the argument.

The bicycle was resembled by John: A how-to guide.

So children must learn which verbs have passives and which don't. I recently sat down to read Pinker, Lebeaux and Frost (1987), a landmark study of how children learn to passivize verbs. This is not a work undertaken lightly. At 73 pages, Pinker et al. (1987) is not Steve Pinker's longest paper -- that honor goes to his 120-page take-down of Connectionist theories of language, Pinker and Prince (1988) -- but it is long, even for psycholinguistics. It's worth the read, both for the data and because it lays out the core of what became Learnability and Cognition, one of the books that has had the most influence on my own work and thinking.

The Data

The authors were primarily interested in testing the following claim: that children are conservative learners and only passivize verbs that they have previously heard in the passive. This would prevent them from over-generating passives that don't exist in the adult language.

First, the authors looked at a database of transcriptions of child speech. A large percentage of the passive verbs they found were passives the children couldn't possibly have heard before because they aren't legal passives in the adult language:

It's broked? (i.e., is it broken?)
When I get hurts, I put dose one of does bandage on.
He all tieded up, Mommy.

Of course, when we say that the child couldn't have heard such passives before, we can't really be sure what the child heard. It just seems unlikely. To more carefully control what the child had heard, the authors taught children of various ages (the youngest group was 4 years old) made-up verbs. For instance, they might demonstrate a stuffed frog jumping on top of a stuffed elephant and say, "Look, the frog gorped the elephant." Then they would show the elephant jumping on top of a mouse and ask the child, "What happened to the mouse?"

If you think "gorp" has a passive form, the natural thing to do would be to say "The mouse was gorped by the elephant." But a child who only uses passive verbs she has heard before would refuse to utter such a sentence. However, across a range of different made-up verbs and across four different experiments, the authors found that children were willing -- at least some of the time -- to produce these new passive verbs. (In addition to production tests, there were also comprehension tests where the children had to interpret a passivization of an already-learned verb.)

Some Considerations

These data conclusively proved that children are not completely conservative, at least not by 4 years of age (there has been a lot of debate more recently about younger children). We now know that the conservative child theory had to be wrong -- again, at least for 4-year-olds -- but it's worth remembering that at the time, this was a serious hypothesis.

There is a lot of other data in the paper. Children are more likely to produce new passive forms as they get older (higher rates for 5-year-olds than 4-year-olds). The authors also taught children verbs where the agent is the object and the patient is the subject (that is, where The frog gorped the elephant means "the elephant jumped on top of the frog"). Children had more difficulty passivizing those verbs. However, a lot of these additional analyses are difficult to interpret because of the small sample sizes (16 children and only a handful of verbs per experiment or sub-experiment).

Theory

Fair warning: the rest of this post is pretty technical.

What excites me about this paper is the theoretical work. For instance, the authors propose a theory of linking rules that have strong innate constraints and yet still some language-by-language variation.

The linkages between individual thematic roles in thematic cores and individual grammatical functions in predicate-argument structures is in turn mediated by a set of unmarked universal linking rules: agents are mapped onto subjects; patients are mapped onto objects; locations and paths are mapped onto oblique objects. Themes are mapped onto any unique grammatical function but can be expressed as oblique, object or subject; specifically, as the 'highest' function on that list that has not already been claimed by some other argument of the verb.
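To make the passage concrete, here is a minimal sketch of those linking rules as I read them. The function and variable names are my own, not the paper's, and the handling of a theme that finds every function already claimed is an assumption (it is simply left unlinked).

```python
# Grammatical functions ordered from 'highest' to 'lowest', following the passage.
FUNCTIONS_HIGH_TO_LOW = ["subject", "object", "oblique"]

# Unmarked default links for every role except theme.
DEFAULT_LINKS = {
    "agent": "subject",
    "patient": "object",
    "location": "oblique",
    "path": "oblique",
}


def link_roles(roles):
    """Map a verb's thematic roles onto grammatical functions."""
    links = {}
    claimed = set()

    # First pass: roles with a fixed default claim their function.
    for role in roles:
        if role in DEFAULT_LINKS:
            function = DEFAULT_LINKS[role]
            links[role] = function
            claimed.add(function)

    # Second pass: a theme takes the highest function nobody has claimed.
    # Assumption: if all three are claimed, the theme stays unlinked.
    for role in roles:
        if role == "theme":
            for function in FUNCTIONS_HIGH_TO_LOW:
                if function not in claimed:
                    links[role] = function
                    claimed.add(function)
                    break

    return links


print(link_roles(["agent", "theme"]))
print(link_roles(["theme"]))
```

With an agent present, the theme lands on the object; on its own, the theme surfaces as the subject.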

With respect to passivization, what is important is that only verbs which have agents as subjects are going to be easily passivized. The trick is that what counts as an 'agent' can vary from language to language.

It is common for languages to restrict passivized subjects to patients affected by an action ... The English verbal passive, of course, is far more permissive; most classes of transitive verbs, even those that do not involve physical actions, have the privilege of passivizability assigned to them. We suggest this latitude is possible because what counts as the patient of an action is not self-evident ... Languages have the option of defining classes in which thematic labels are assigned to arguments whose roles abstractly resemble those of physical thematic relations...

This last passage sets up the core of the theory to be developed in Learnability and Cognition. Children are born knowing that certain canonical verbs -- ones that very clearly have agents and patients, like break -- must passivize, and that a much larger group of verbs in theory might passivize, because they could be conceived of as metaphorically having agents and patients. What they have to learn is which verbs from that broader set actually do passivize. Importantly, verbs come in classes of verbs with similar meanings. If any verb from that set passivizes, they all will.
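The class-based prediction can be sketched as a toy learner. Everything here is hypothetical: the class names, the verbs in each class, and the PassiveLearner class are illustrations I made up, not the classes proposed in the paper or the book.

```python
# Hypothetical verb classes grouped by similar meaning (illustrative only).
VERB_CLASSES = {
    "change_of_state": {"break", "smash", "crack"},
    "perception": {"see", "hear", "notice"},
    "possession": {"have", "lack"},
}

# Assumption: canonical agent-patient classes are passivizable from the start.
CANONICAL_CLASSES = {"change_of_state"}


class PassiveLearner:
    def __init__(self):
        self.passivizable_classes = set(CANONICAL_CLASSES)

    def class_of(self, verb):
        """Return the name of the class containing this verb, or None."""
        for name, verbs in VERB_CLASSES.items():
            if verb in verbs:
                return name
        return None

    def hear_passive(self, verb):
        """Hearing one verb in the passive licenses its whole class."""
        verb_class = self.class_of(verb)
        if verb_class is not None:
            self.passivizable_classes.add(verb_class)

    def allows_passive(self, verb):
        return self.class_of(verb) in self.passivizable_classes


learner = PassiveLearner()
print(learner.allows_passive("hear"))  # no perception passive heard yet
learner.hear_passive("see")
print(learner.allows_passive("hear"))  # now licensed via the shared class
```

The point of the sketch is that the learner never stores individual verbs, only classes, so a single passive of "see" is enough to license "hear."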

This last prediction is the one I am particularly interested in. A later paper (Gropen, Pinker, Hollander, Goldberg & Wilson, 1989) explored this hypothesis with regard to the dative alternation, but I don't know of much other work. In general, Learnability and Cognition got less attention than it should have, perhaps because by the time it was published, the Great Past Tense Debate had already begun. I've often thought of continuing this work, but teaching novel verbs to children in the course of an experiment is damn hard. Ben Ambridge has recently run a number of great studies on the acquisition of verb alternations (like the passive), so perhaps he will eventually tackle this hypothesis directly.

----
Pinker S, Lebeaux DS, and Frost LA (1987). Productivity and constraints in the acquisition of the passive. Cognition, 26 (3), 195-267 PMID: 3677572

Mendeley -- Not quite ready for prime time

Posted by GamesWithWords on Thursday, January 06, 2011

Prompted by Prodigal Academic, I decided to give Mendeley a shot. That is, instead of working on a long-overdue draft of a paper.

Mendeley is two things. First, it is a PDF library/reader. Second, it is a citation manager.

Currently, I use Papers for the first and Endnote for the second. Both work well enough -- if not perfectly -- but it is a pain that I have to enter every paper I want to cite into two different programs.

(Don't tell me I could export my Papers citations library to Endnote. First, I'd have to do that every time I update my library, which is annoying. Second, Papers was created by someone who clearly never cites books, book chapters, conference proceedings, etc. So I'd have to fix all of those in Endnote ... every time I export.)

(Also, don't tell me about Zotero. Maybe it's gotten better in the last year since I tried it, but it was seriously feature-deficient and buggy beyond all belief.)

First glance

At first, I was pleasantly surprised. Unlike Papers, Mendeley is free so long as you don't want to use their Cloud functionality much (I don't). Papers is convinced there are people named Marc Hauser, Marc D Hauser, M D Hauser, and M Hauser. Mendeley can be led astray but has some nice options to allow you to collapse two different author records -- or two different keywords.

(On that note, my Papers library has implicit causality, Implicit causality and Implicit Causality all as different keywords. Once Papers has decided the keyword for a paper is, say, Implicit Causality, nothing on G-d's green Earth will convince it to switch to implicit causality. And its searches are case sensitive. Mendeley has none of these "features.")

Also, Mendeley will let you annotate PDFs and export the PDFs with your annotations in a format readable by other PDF viewers (if, for instance, you wanted to share your annotated PDF with someone). That's a nice feature.

These would all be nice additional features if the core functionality of Mendeley was there. I'm sorry to say that the product just doesn't seem to be ready for prime time.

I typed "prime time" into Flickr, and this is what it gave me. Not sure why.

Second glance

The first disappointment is that Mendeley does not have smart collections. Like smart playlists in iTunes, smart collections are collections of papers defined by various search terms. If you have a smart collection that indexes all articles with the keywords "implicit causality," "psych verbs" and "to read", then whenever you add a new paper with those keywords, it automatically goes into the smart collection. This is very handy, and it's an excellent feature of Papers (except that, as mentioned above, my smart folder for implicit causality searches for the keywords "implicit causality," "Implicit causality" OR "Implicit Causality").
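For anyone who hasn't used one, a smart collection is just a saved query that gets re-run against the library. Here is a minimal sketch of the idea; the Paper and SmartCollection classes are hypothetical, not how Papers or Mendeley actually store anything. I've assumed a paper must carry all of the listed keywords, and I've made the match case-insensitive, which is the behavior I wish Papers had.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    title: str
    authors: list
    keywords: list = field(default_factory=list)


@dataclass
class SmartCollection:
    name: str
    required_keywords: list

    def matches(self, paper):
        # casefold() makes "Implicit Causality" and "implicit causality" equal.
        have = {keyword.casefold() for keyword in paper.keywords}
        return all(k.casefold() in have for k in self.required_keywords)

    def papers(self, library):
        # Re-evaluated on every call, so new papers show up automatically.
        return [paper for paper in library if self.matches(paper)]


library = [
    Paper("Paper A", ["Author One"], ["Implicit Causality", "psych verbs", "to read"]),
    Paper("Paper B", ["Author Two"], ["implicit causality"]),
]
reading_list = SmartCollection(
    "IC reading list", ["implicit causality", "psych verbs", "to read"]
)
print([paper.title for paper in reading_list.papers(library)])

library.append(Paper("Paper C", ["Author Three"], ["to read", "Psych Verbs", "implicit causality"]))
print([paper.title for paper in reading_list.papers(library)])
```

Because the collection stores only the query, nothing has to be updated when a paper is added; the next lookup simply includes it.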

I suspect Mendeley doesn't have smart collections because it doesn't have a serious search function. You can search for papers written by a given author or with a given keyword, but if you want to search for papers written by the conjunction of two authors or any paper on "implicit causality" written by Roger Brown, you're out of luck. Rather, it'll perform the search. It just won't find the right papers.

Third glance

That might be forgivable if the citation function in Mendeley was usable. The idea is that as you write a manuscript, when you want to cite, say, my paper on over-regularization (18 citations and counting!), you would click on a little button that takes you to Mendeley. You find my paper in your PDF library, click another button, and (Hartshorne & Ullman, 2006) appears in your Word document (or NeoOffice or whatever) and the full bibliographic reference appears in your manuscript's bibliography. You can even choose what citation style you're using (e.g., APA).

Sort of. Let's say you want to cite two different papers by Roger Brown and Deborah Fish, both published in 1983 (which, in fact, I did want to do). Here's what it looks like:

Implicit causality effects are found in both English (Brown & Fish, 1983) and Mandarin (Brown & Fish, 1983)

At least in APA style, those two papers should be listed as (Brown & Fish, 1983a) and (Brown & Fish, 1983b), because obviously otherwise nobody has any idea which paper you are citing.

This gets worse. Suppose you wrote:

Implicit causality effects have been found in multiple languages (Brown & Fish, 1983; Brown & Fish, 1983).

Correct APA 5th Ed. style is, I believe, (Brown & Fish, 1983a, 1983b). Actually, I'm not sure what exactly the correct style is, because Endnote always takes care of it for me.

There are other issues. Mendeley doesn't have a mechanism for suppressing the author. So you end up with:

As reported by Brown and Fish (Brown & Fish, 1983; Brown & Fish, 1983), verbs have causality implicit in their meaning.

instead of

As reported by Brown and Fish (1983a, 1983b), verbs have causality implicit in their meaning.
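Neither behavior is hard to implement. Here is a rough sketch of year suffixes plus author suppression. The Reference class and function names are hypothetical, the titles are placeholders rather than the real ones, and the suffix order (alphabetical by title) is my understanding of APA rather than something I've checked against the manual.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    authors: tuple
    year: int
    title: str


def author_label(authors):
    """Format an author list for an in-text citation."""
    if len(authors) == 1:
        return authors[0]
    if len(authors) == 2:
        return f"{authors[0]} & {authors[1]}"
    return ", ".join(authors[:-1]) + f", & {authors[-1]}"


def year_labels(bibliography):
    """Give same-author, same-year references the suffixes a, b, c, ..."""
    groups = defaultdict(list)
    for ref in bibliography:
        groups[(ref.authors, ref.year)].append(ref)

    labels = {}
    for (_, year), refs in groups.items():
        if len(refs) == 1:
            labels[refs[0]] = str(year)
            continue
        # Assumption: suffixes follow alphabetical order of title.
        ordered = sorted(refs, key=lambda r: r.title.casefold())
        for index, ref in enumerate(ordered):
            labels[ref] = f"{year}{chr(ord('a') + index)}"
    return labels


def cite(refs, labels, suppress_author=False):
    """Build one parenthetical citation, grouping years under shared authors."""
    years_by_author = {}
    for ref in refs:
        years_by_author.setdefault(ref.authors, []).append(labels[ref])

    parts = []
    for authors, years in years_by_author.items():
        years_text = ", ".join(sorted(years))
        if suppress_author:
            parts.append(years_text)
        else:
            parts.append(f"{author_label(authors)}, {years_text}")
    return "(" + "; ".join(parts) + ")"


english = Reference(("Brown", "Fish"), 1983, "English study (placeholder title)")
mandarin = Reference(("Brown", "Fish"), 1983, "Mandarin study (placeholder title)")
labels = year_labels([english, mandarin])

print(cite([english], labels))
print(cite([english, mandarin], labels))
print(cite([english, mandarin], labels, suppress_author=True))
```

The suffixes have to be computed over the whole bibliography, not per citation, which is presumably why this belongs in the citation manager rather than in my head.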

Nor does Mendeley know about et al:

Hauser, Chomsky and Fitch (Hauser, Chomsky & Fitch, 2001) put forward a new proposal....blah blah blah...as has been reported several times in the literature (Hauser, Chomsky & Fitch, 2001; Brown & Fish, 1983; Brown & Fish, 1983).

That is, the second time you cite a paper with more than two authors, it doesn't contract to (Hauser et al., 2001). Unfortunately, there is no work-around for any of these problems. In theory, you can edit the citations to make them match APA style. But within a few seconds, a friendly dialog box pops up and asks you if you really want to keep your edited citation. You can click "OK" or click "cancel," but either way it just changes your carefully-edited citation back to its default -- at least it does on my Mac (the forums suggest that this works for some people).
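The et al. contraction only requires remembering which papers have already been cited. A minimal sketch, assuming the APA rule as I understand it (three to five authors are listed in full the first time and contracted afterwards; six or more are contracted from the start). CitationTracker is a made-up name, not anything in Mendeley or Endnote.

```python
def author_label(authors):
    """Format a full author list for an in-text citation."""
    if len(authors) == 1:
        return authors[0]
    if len(authors) == 2:
        return f"{authors[0]} & {authors[1]}"
    return ", ".join(authors[:-1]) + f", & {authors[-1]}"


class CitationTracker:
    """Remembers which papers have been cited so repeats can use et al."""

    def __init__(self):
        self._seen = set()

    def cite(self, authors, year):
        key = (tuple(authors), year)
        seen_before = key in self._seen
        self._seen.add(key)

        # Assumed rule: 6+ authors always contract; 3-5 contract on repeats.
        if len(authors) >= 6 or (len(authors) >= 3 and seen_before):
            return f"({authors[0]} et al., {year})"
        return f"({author_label(authors)}, {year})"


tracker = CitationTracker()
print(tracker.cite(["Hauser", "Chomsky", "Fitch"], 2001))
print(tracker.cite(["Hauser", "Chomsky", "Fitch"], 2001))
print(tracker.cite(["Brown", "Fish"], 1983))
```

Two-author papers never contract, so Brown and Fish come out the same every time.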

It's possible that people who don't use APA won't have as many of these problems. Numbered citations, for instance, probably work fine. I've never submitted a paper anywhere that used numbered citations, though. So I either need to switch professions or continue using Endnote to write my papers.

Hopefully

One can hope that Mendeley will solve some of these issues. I found discussions on their "suggested features" forum going back many months for each of the problems discussed above, which suggests I may be waiting a while for these fixes. I do understand that Mendeley is technically in beta testing. But it's been in beta testing for over two years, so that's not really an excuse at this point.

Alternatively, maybe Papers will add a good citation feature (and discover books). Or maybe Zotero will confront its own demons. I'm going to have to wait and see.

It makes one appreciate Endnote. Yes, it's a dinosaur. No, it hasn't added any really useable features since I started using it in 2000. But it worked then, and it still works now. There's something to be said for that.
