Field of Science

More on DragonDictate

DragonDictate continues to do a decent job of writing my email, so long as I don't talk about work. For writing papers, etc., it continues to be of limited use.

I was just dictating notes on how children learn to count. In two back-to-back sentences, I mentioned "subset-knowers". The first time, this was transcribed as "subset-members", and the second time, it was "sunset-whores".


I have been doing a great deal of writing lately, though obviously not here. I thought that at some point in graduate school I should try getting some of my projects published, and that the time was now. Since this requires writing them up, I have been writing. I have gotten a lot of writing done, but it came with an increased number of hours spent sitting at my computer. Knowing enough friends who have suffered from repetitive stress injuries, I decided to take a proactive approach to ergonomics.

One outcome of this process was that I purchased voice-recognition software, namely Dragon Dictate. This actually complements my preference to pace while I think. My writing style involves a lot of thinking, punctuated by occasional bursts of typing, so being able to write as I pace seemed like a good idea.

I cannot say that this experiment has been an overwhelming success. Based on what I have learned from the documentation, Dragon Dictate seems to place a great deal of faith in transitional probabilities. That is, the hypotheses it makes about what you are saying are based not only on the sounds you make but also on what words typically follow one another.

Of course, what words typically follow one another depends a great deal on what you are talking about. I suspect that Dragon Dictate was not trained on a corpus involving a great deal of psycholinguistics papers, but it is psycholinguistics papers that I am writing. Dragon Dictate makes a number of very systematic and very annoying errors. For instance, it is absolutely convinced that, no matter how carefully I say the word "verb", I could not possibly have meant to say that word, and probably meant "four herbs" or some such. In the general case, this is probably the right conclusion. The word "verb" is so rarely spoken that even if you think you heard it, what was actually said was probably something else. However, since almost all my papers are about verbs, I use that word so often that the right hypothesis is probably that no matter what you think you heard, the word I actually uttered was "verb".
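To see how a recognizer that trusts transitional probabilities can override the acoustics, here is a toy sketch. All of the probabilities below are invented for illustration; Dragon Dictate's actual model is proprietary and surely far more sophisticated than this:

```python
# Toy illustration of a recognizer that weighs transitional (bigram)
# probabilities against acoustic evidence. All numbers are invented.

def best_hypothesis(acoustic_scores, bigram_probs, prev_word):
    """Pick the word maximizing P(acoustics | word) * P(word | previous word)."""
    return max(
        acoustic_scores,
        key=lambda w: acoustic_scores[w] * bigram_probs.get((prev_word, w), 1e-6),
    )

# The acoustics slightly favor "verb"...
acoustic_scores = {"verb": 0.6, "four herbs": 0.4}

# ...but in a general-purpose corpus, "the verb" is vanishingly rare.
general_bigrams = {("the", "verb"): 0.0001, ("the", "four herbs"): 0.01}
print(best_hypothesis(acoustic_scores, general_bigrams, "the"))  # four herbs

# After training on linguistics papers, "verb" dominates the transitions.
linguistics_bigrams = {("the", "verb"): 0.05, ("the", "four herbs"): 0.0001}
print(best_hypothesis(acoustic_scores, linguistics_bigrams, "the"))  # verb
```

The point of the sketch: the acoustic scores never change, yet the winning hypothesis flips once the transition statistics come from the right corpus.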

Needless to say, it doesn't do very well with technical terms from semantic and syntactic theory, either.

The upshot is that I spend so much time correcting Dragon Dictate's mistakes that it is not clear I wouldn't be better off just typing the document to begin with (you can correct using voice commands, but it is so cumbersome that I usually type instead). Dragon Dictate has a function where you can feed it various documents, and the documentation appears to imply that it can learn the relevant word frequencies and transitional probabilities from them. I have been feeding it papers I have written, in the hopes that this will help out. So far there has been limited improvement, but I am not sure just how large a corpus it needs. I will keep you updated.

(Written using DragonDictate plus hand correction.)


I'm doing my periodic re-certification on research ethics. One of the questions on one of the quizzes is as follows:
TRUE/FALSE: A good alternative to the current peer review process would be web logs (BLOGS) where postings where [sic] papers would be posted and reviewed by those who have an interest in the work.
Apparently, the correct answer is "false". Presumably because we have much better technology for this kind of thing, rather than using a simple blog? 

Nature, Nurture, and Bayes

I generally have very little good to say about the grant application process, but it does force me to catch up on my reading. I just finished several papers by Amy Perfors, who I think does some of the more interesting computational models of language out there.*

A strange sociological fact about language research is that people generally come in two camps: a) those who don't (really) believe language is properly characterized by hierarchical phrase structure and also don't believe in much innate structure but do believe in powerful innate learning mechanisms, and b) those who believe language is properly characterized by *innate* hierarchical phrase structure and who don't put much emphasis on learning mechanisms. But there's no logically necessary connection between being a Nativist and believing in hierarchical phrase structure or being an Empiricist and believing in relatively simple syntactic forms. In the last few years, Perfors has been staking out some of that (largely) unclaimed territory where hierarchical phrase structure and Empiricism meet.

In "The learnability of abstract syntactic principles," she and her colleagues consider the claim by (some) Nativists that children must have an innate expectation that language be something like a hierarchical context-free grammar because there isn't enough data in the input to rule out alternative grammars. (Empiricists often buck the whole question by saying language is no such thing.) Perfors et al. show that, in fact, with some relatively simple assumptions and a powerful (Bayesian) learning device, the learner would conclude that the most likely representation of English is a hierarchical context-free grammar, based on relatively little input (reproducing what happened in linguistics, where linguists came to the same conclusion). You do have to assume that children have the innate capacity to represent such grammars, but you don't need to assume that they prefer such grammars.
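A toy sketch may make the Bayesian logic concrete. The numbers below are invented (the real model in Perfors et al. evaluates actual grammars against child-directed speech), but they show how a strong prior bias toward a simpler grammar can be overturned by a modest amount of data that a hierarchical grammar fits better:

```python
import math

# Toy Bayesian comparison of two grammar hypotheses. The per-sentence
# likelihoods and priors are invented for illustration only.

def log_posterior(log_prior, log_like_per_sentence, n_sentences):
    """Unnormalized log posterior after n independent sentences."""
    return log_prior + n_sentences * log_like_per_sentence

# Give the simpler (flat) grammar a huge prior advantage, but let the
# hierarchical grammar fit each observed sentence slightly better.
flat = dict(log_prior=math.log(0.99), log_like=math.log(0.10))
hier = dict(log_prior=math.log(0.01), log_like=math.log(0.12))

for n in (10, 50, 200):
    f = log_posterior(flat["log_prior"], flat["log_like"], n)
    h = log_posterior(hier["log_prior"], hier["log_like"], n)
    print(n, "hierarchical wins" if h > f else "flat wins")
```

With these made-up numbers, the flat grammar wins at 10 sentences but the hierarchical grammar wins by 50: a small per-sentence likelihood advantage compounds, so even a heavily disfavored hypothesis can win on relatively little input.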

"Joint acquisition of word order and word reference" presents some interesting data bearing on a number of questions, but following the theme above, she notes that her model does not require very much data to conclude that the typical word-order in English is subject-verb-object. She and her colleagues note: "The fact that word order can be acquired quickly from so [little data] despite the lack of bias [for a particular word order] may suggest no need to hypothesize that children are born with strong innate constraints on word ordering to explain their rapid acquisition."

I'm sympathetic to all these points, and I think they bring an important perspective to the question of language learning (one that is not, I should say, unique to Perfors, but certainly a minority perspective). What I can't help wondering is this: she (and others) show that you could learn the structure of language based on the input without (certain) innate assumptions that the input will be of a particular sort. Fine. But why is the input of that particular sort across (most? all?) languages? One thing the Nativist positions Perfors argues against have going for them is that they give a (more or less) principled explanation. Empiricists (typically) do not. (I am aware that some try to give explanations in terms of optimal information structure. What I have seen of this work has not struck me as overwhelmingly convincing, but I admit I haven't read enough of it and that I am willing to be convinced, though my prior on this line of argumentation is fairly low).

*My quasi-journalistic training always makes me want to disclose when I know personally the people I am writing about. But psycholinguistics is a small world. It would be safe for the reader to assume that I know *all* of the people I write about to one degree or another.

Perfors, A., Tenenbaum, J. B., & Regier, T. (2010). The learnability of abstract syntactic principles. Cognition. PMID: 21186021

Maurits, L., Perfors, A., & Navarro, D. (2009). Joint acquisition of word order and word reference. Proceedings of the 31st Annual Conference of the Cognitive Science Society, 1728-1733.

Statistics for Idiots

Republicans in the House are proposing to cut funding for food safety programs, despite a rise in food-borne illness. Congressman Jack Kingston explains that the nation's food supply is "99.99 percent safe". Politifact says, "That sounds great, but is it true?"

Actually, it doesn't sound that good to me. Suppose Kingston means that you only have a 0.01% chance of getting ill any particular time you eat (which seems to be the case). And let's say people eat 3 times a day. That gives you a 10.4% chance of getting sick any given year. I'd rather not get sick at all, particularly when many of the illnesses are easily preventable.
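A quick back-of-the-envelope check of that arithmetic:

```python
# A 0.01% chance of illness per meal, three meals a day, for a year.
p_meal = 0.0001
meals_per_year = 3 * 365
p_sick_in_year = 1 - (1 - p_meal) ** meals_per_year
print(f"{p_sick_in_year:.1%}")  # 10.4%
```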

NSF fellows can teach again

I reported last month that NSF was no longer allowing its graduate fellows to teach. According to an email I received earlier today, they are reconsidering the issue:

Each Fellow is expected to devote full time to advanced scientific study or work during tenure. However, because it is generally accepted that teaching or similar activity constitutes a valuable part of the education and training of many graduate students, a Fellow may undertake a reasonable amount of such activities, without NSF approval. It is expected that furtherance of the Fellow's educational objectives and the gain of substantive teaching or other experience, not service to the institution as such, will govern these activities. Compensation for such activities is permitted based on the affiliated institution’s policies and the general employment policies outlined in this document.

New editor at Cognition (eventually)

There are no doubt many psychologists who don't count Cognition as their favorite journal. I just don't happen to know very many of them. Whenever the topic of favorite journal comes up, Cognition it is. One would think that would argue in favor of continuity; whatever they're doing is working.

That's apparently not how the for-profit publishers of Cognition (Elsevier) feel, as they've decided to find a new editor, apparently without consulting anyone in the field about it. I hope they know what they are doing.

Above average!

It's often repeated that the median study is cited 0 times. I haven't been able to find a citation for that, but if it is true, all my papers are now above the median. My birth order paper has now been cited. Actually, it was cited last year, but I didn't notice for a while. Granted, it was cited in a paper appearing in the Journal of Language, Technology & Entrepreneurship in Africa, which is apparently not a high-impact journal, but a citation is a citation.

For rather boring reasons not related to the data or the review process itself, the birth order paper appeared in a journal that is not widely read by researchers, which probably has reduced its visibility. Certainly, plenty has been published on the topic in the last few years. This is a lesson for the future: it really does matter which journal you publish in, despite the widespread use of search engines.

For more on my birth order research, click here.

Survey on Replication

Are you a researcher working in psychology or related domains (neuroscience, linguistics, etc.)? A colleague and I are conducting a survey on replication in these fields, for inclusion in an upcoming special issue of Frontiers in Computational Neuroscience. You can fill out the survey here.

Photo credit here.

The pace of review

One of my manuscripts will shortly enter its 7th month of being under review. Apparently one of the three reviewers keeps promising to send in a review but never does. Now the 4+ months a different manuscript languished under review seems speedy.

Ray Kurzweil is convinced that the pace of science is increasing exponentially and will continue to do so. I think he's neglected one rate-limiting step in the process.

What the Best College Teachers Do: A Review of a Vexing Book

What the Best College Teachers Do is not a bad book. It is engaging and reasonably well-written. The topic is both evergreen and timely, and certainly of interest to college teachers at the very least (as well as to people who rate college quality and to people who use those ratings to decide where to go to school). My issue with this book is that it is incapable of answering the question it sets out for itself.

A problem of comparison

The book is based primarily on extensive research by the author, Ken Bain, and his colleagues. The appendix spells out in detail how they identified good college teachers (a combination of student evaluations, examples of student work, department examinations, etc.) and how they collected information about those gifted individuals (interviews, taped class sessions, course materials, etc.). They analyzed these data to determine what these best college teachers did.

Even assuming that (a) their methods successfully identified superior teachers, and (b) they collected the right information about those teachers' practices, this is only half of a study. Without even looking at their data, I can easily rattle off some things all these teachers had in common:

1. They were all human beings.
2. They were all taller than 17 inches.
3. They all spoke English, at least to some degree (the study was conducted in the USA).
4. Most were either male or female.

Commonalities are not limited to attributes of the teachers; they extend to what the teachers do in the classroom:

5. Most showed up to at least half of the class periods for a given course.
6. None of them habitually sat, silent and unmoving, at the front of the classroom for the duration of class.
7. They did not assign arbitrary grades to their students (e.g., by rolling dice).
8. Very few spoke entirely in blank verse.

While these statements are almost certainly true of good college teachers, they are equally true of bad ones. Since Bain and colleagues did not include a comparison group of bad teachers, we cannot know whether their findings distinguish the good teachers from the bad ones.

Science -- like teaching -- requires training

A good test of teaching ability should pick out all the good teachers. It should also pick out only the good teachers. (A somewhat different cut of the issues is to consider test reliability and test validity.) What the Best College Teachers Do focuses entirely on the first issue. As my reductio ad absurdum above shows, having only half of a good test does not give you a test that is 50% right; it gives you a useless test.

It's unfortunate that Bain and his colleagues failed in this fundamental aspect of scientific inquiry. Although Bain is now the director of the Center for Teaching Excellence at New York University, he was trained as a historian. This comes out in the discussion of the study methods: "Like any good historians who might employ oral history research techniques, we subsequently sought corroborating evidence, usually in the form of something on paper..." (p. 187).

I would hope that any good historian doing comparative work would know to include a comparison group, but designing a scientific study of human behavior is hard. Even psychologists screw it up. And that's the focus of our training, whereas historians are mostly learning things other than experimental design (I assume).

Circular Definitions

Of course, failing to include a control group is not the only way to ruin a study. You can also make it circular.

Chapter 3 focuses on how excellent teachers prepare for their courses:
At the core of most professors' ideas about teaching is a focus on what the teacher does rather than on what the students are supposed to learn. In that standard conception, teaching is something that instructors do to students, usually by delivering truths about the discipline. It is what some writers call a 'transmission model.' ... 
In contrast, the best educators thought of teaching as anything they might do to help and encourage students to learn. Teaching is engaging students, engineering an environment in which they learn.
Here is what the appendix says about how the teachers were chosen for inclusion in the study:
All candidates entered the study on probation until we had sufficient evidence that their approaches fostered remarkable learning. Ultimately, the judgment to include someone in the study was based on careful consideration of his or her learning objectives, success in helping students achieve those objectives, and ability to stimulate students to have highly positive attitudes toward their studies.
It seems that perhaps teachers were included as being "excellent teachers" if they focused on student learning and on motivating students. The researchers then "found" that excellent teachers focus on student learning and on motivating students.

Vagueness and Ambiguity

Or maybe not. I'm still not entirely sure what it means to -- in the first quote -- focus on "what the teacher does" rather than on "what the students are supposed to learn." For instance, Bain poses the following thought problem on page 52:

"How will I help students who have difficulty understanding the questions and using evidence and reason to answer them?"

Is that focusing on what the teacher does or focusing on what the students are supposed to learn? How can we tell? By what metric?

My confusion here may merely mark me as one of those people expecting "a simple list of do's and don'ts" who are "greatly disappointed." Bain adds (p. 15), "The ideas here require careful and sophisticated thinking, deep professional learning, and often fundamental conceptual shifts." That's fine. But if there is no metric I can use to find out whether I'm following these best practices or not, what good does this book do me?

(Also, without knowing what exactly Bain means by these vague statements, there is no way to ensure that his study wasn't circular, as described in the previous section. I gave only one example, but the general problem is clear: Bain defined great teachers by one set of criteria and then analyzed their behavior in order to extract a second set of criteria. If both sets of criteria are loosely and vaguely defined, there's no way even in principle to know whether he isn't just measuring the same thing both times.)

Credible Reviews

So if we don't trust Bain's study, is there anything else in the book worth reading? Maybe. What the Best College Teachers Do is not myopically focused on Bain's own research. He reviews the literature, citing the conclusions from other studies of teaching quality, broadening the scope of the framework outlined in the book. However, this raises its own problem.

In writing a review, the reviewer is supposed to survey the literature, find all the relevant research, determine what the best research is, and then synthesize everything into a coherent whole (or at least, into something as coherent as the current state of the field allows). The reviewer generally does not describe the studies in sufficient detail to allow the reader to evaluate them directly; only a brief overview is provided, with a focus on the conclusions.

If you trust the reviewer, this is fine. That's why reviews from the most respected researchers in the field are typically highly valued, so much so that publishers and editors often solicit reviews from these researchers. Obviously, a review of the latest research on underwater basket weaving by a fifth-grader would not be so highly prized, because (a) we don't believe the fifth-grader did a particularly thorough review, and (b) we don't trust the fifth-grader's ability to sort the wheat from the chaff -- that is, identify which studies are flawed and which are to be believed.

Bain is clearly very smart. He has clearly read a lot. But I do not trust his ability to read scientific literature critically. The only evidence I have of his abilities is in the design of his own study, which is deeply flawed, as described above. If he can't design a study, why should I trust his analysis of other people's studies?

Building a better mousetrap

Criticizing a study is easy, but it's not much of a critique if you can't identify what a better study would look like. Clearly from my discussion above, I would want (a) clear criteria for defining good teaching, (b) clearly-defined measures of teacher behavior, and (c) a group of good teachers and a group of bad teachers for comparison, and probably a group of average teachers as well (otherwise, any differences between good and bad teachers could be driven by bad habits of the bad teachers rather than good habits of the good teachers).

After a set of behaviors that are typical of good teachers -- and which are less frequent or absent in average or bad teachers -- has been identified, one would then identify a new group of good, average, and bad teachers and replicate the results. (The risk otherwise is one of over-fitting the data: the differences you found between good teachers and the rest were just the result of random chance. This actually happens quite a lot more often than many people realize.)

At the end of this process, we should have a set of behaviors that really are particular to the best teachers, assuming that the criteria we used to define good teachers are valid (not an assumption to be taken lightly).

Becoming a good teacher

Whether or not this information would be of any use to those aspiring to be good teachers is unclear. To find out that, we'd actually need to do a controlled study, assigning one set of teachers to emulate this behavior and another set to emulate behavior typical of average teachers. Ideally, we'd find that the first group ended up teaching better. I'm unsure whether that's particularly likely to happen, for a number of reasons.

First, consider Bain's summary of the habits of the best teachers (summarizing, with some direct quotations, from pp. 15-20):

1. Outstanding teachers know their subjects extremely well.
2. Exceptional teachers treat their lectures, discussion sections, problem-based sessions, and other elements of teaching as serious intellectual endeavors, as intellectually demanding and important as their research and scholarship.
3. They avoid objectives that are arbitrarily tied to the course and favor those that embody the kind of thinking and acting expected for life.
4. The best teachers try to create an environment in which people learn by confronting intriguing, beautiful, or important problems, authentic tasks that will challenge them to grapple with ideas, rethink their assumptions, and examine their mental models of reality.
5. Highly effective teachers tend to reflect a strong trust in students.
6. They have some systematic program to assess their own efforts and to make appropriate changes.

Much of this list looks like a combination of intelligence and discipline. That is clearly true for #1, and probably true for #2 and #3. To the extent that #4 is hard to do, it probably takes intelligence. And #6 is just a good idea, more likely to occur to smart people and only pulled off by disciplined people. I'm not sure what #5 really means.

If the key to being a good teacher is to be smart and disciplined, this news will be of little help to teachers who are neither (though it may be helpful to people who are trying to select good teachers). In other words, even if we determine what makes a good teacher, that doesn't mean we can make good teachers.

The best teachers

Of course, even if the strategies that good teachers use are ones you can use yourself, that doesn't mean you can use them correctly.

There is an old parable about two young women. One was exceptionally beautiful. She used to sit at her window and gaze out over the field, looking forlorn and sighing with melancholy. Villagers passing by would stop and stare, struck by her heavenly beauty. One such villager was another young woman, who was the opposite of beautiful. Nonetheless, on seeing this example, she went home, sat at her own window, gazed out over the field and sighed. Someone walked by, saw her, and promptly vomited.

Objectification of female beauty and strange fetishization of melancholy aside, the point of this parable is that just because something works for someone else doesn't mean it'll work for you. When I think about the very best teachers I've known, one thing that stands out is how idiosyncratic their methods and abilities have been. One is a high-energy lecturer who runs and jumps during his lectures (yes, math lectures), who is somehow able to turn linear algebra into a discussion class. Another, in contrast, faded into the background. He rarely lectured, preferring to have students work (in groups or individually) on carefully-crafted questions. A third is a gifted lecturer and the master of the anecdote. While others use funny anecdotes merely to keep a lecture lively, when he uses an anecdote, it is because it illustrates the point at hand better than anything else. Over at the law school, there are a number of revered professors famous for their tendency to humiliate students. This humiliation serves a purpose: to show the students how much they have to learn. The students, rather than being alienated, strive to win their professors' approval.

These methods work for each of them, but I can't imagine them swapping styles round-robin. Their teaching styles are outgrowths of their personalities. Many are high-risk strategies that, if they fail, fail disastrously (don't humiliate your students unless you have the right kind of charisma first).

Are there strategies that will work for everyone? Is there a way of determining which strategies will work for you, with your unique set of strengths and weaknesses? I'd love to find out. But it won't be by reading What the Best College Teachers Do.

The missing linking hypothesis

Science just published a paper on language evolution to much fanfare. The paper, by Quentin Atkinson, presents analysis suggesting that language was "invented" just one time in Africa. That language first appeared in Africa would be of little surprise, since that's where we evolved. That there was only one point at which it evolved is somewhat more controversial, and also trivially false if one includes sign languages, at least some of which have appeared de novo in modern times (and one could make a case for including spoken creoles in the list of de novo languages).

What still boggles my mind is the analysis that supports these conclusions. In many ways, it seems brilliant -- but I can't escape the feeling that there is something amiss with the argument. The problem, as we'll see, is a series of missing linking hypotheses.

The Data

The primary finding is that the further you go from Africa (very roughly following plausible migration paths), the fewer phonemes the local language has. Hawai'ian -- the language spoken farthest from our African point of origin -- has only 13 phonemes. Some languages in Africa have more than 100.

To support the claim that this demonstrates that language evolved in Africa, one must add some additional data and hypotheses. One datum is that languages spoken by more people have more phonemes. Atkinson argues that whenever a new population migrated away from the parent population, it would necessarily be a smaller group ... and thus their language would have fewer phonemes than the parent group. Keep this up and over time, you end up with just a few phonemes left.

Population genetics

This argument seems to derive a lot of its plausibility from well-known phenomena in population genetics. Whenever a new population branches off (migrates away), it will almost by definition have less genetic diversity than the mother population. And in fact Africa has greater genetic diversity than other continents.

Atkinson tries to apply the same reasoning to phonemes:
Where individuals copy phoneme distinctions made by the most proficient speakers (with some loss), small population size will reduce phoneme diversity. De Boer models the evolution of vowel inventories using a different approach, in which individuals copy any members of their group with some error, and finds the same population size effect.
I see the logic, but then phonemes aren't genes. When ten people leave home to start a new village, they can only take ten sets of genes with them, and even some of that diversity may be lost because of vagaries of reproduction. Those alleles, once gone, are not easily reconstructed.

As far as I can tell, to apply the same logic to phonemes we have to assume a fair percentage of children fail to learn all the phonemic contrasts in their native language. For some reason, this does not prevent them from communicating successfully. In a large population, the fact that many people lack this or that phonemic contrast doesn't matter, as on average, most people know any given phonemic contrast, and thus it is transmitted across the generations. When a small group leaves home, however, it's quite possible that by accident there will be a phonemic contrast that few (or none) of them use. The next generation is then unlikely to use that contrast.
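The mechanism can be made concrete with a toy founder-effect simulation. Everything here is invented for illustration; in particular, p_known (the chance that a given speaker commands a given contrast) is exactly the kind of quantity I would want real data on:

```python
import random

# Toy founder-effect simulation of phoneme loss. Assumes, as the
# argument requires, that each speaker commands only a random subset
# of the community's phonemic contrasts. All parameters are invented.

def surviving_contrasts(n_contrasts, group_size, p_known=0.5, seed=0):
    """Count contrasts used by at least one member of a migrating group."""
    rng = random.Random(seed)
    survivors = set()
    for _ in range(group_size):
        for c in range(n_contrasts):
            if rng.random() < p_known:
                survivors.add(c)
    return len(survivors)

# A large founding group preserves essentially all 50 contrasts;
# a tiny one can lose several purely by sampling accident.
print(surviving_contrasts(50, group_size=500))
print(surviving_contrasts(50, group_size=5))
```

The design mirrors the genetic analogy: each contrast dies only if every founder happens to lack it, which is vanishingly unlikely in a large group but quite possible in a small one.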

This may be true, but I don't find its plausibility so overwhelming that I'm willing to accept it on face value. I'd actually like to see data showing that many or most speakers of a given language do not use all the phonemic contrasts (beyond the fact that of course some dialects are missing certain phonemes, as in the fact that Californians do not distinguish between cot and caught; dialectical variation probably cannot support Atkinson's argument, but I leave the proof to the reader ... or to the comment section).

Phonemes and Population Size

Atkinson reports being inspired by the relatively recent finding that languages spoken by more people have more phonemes. Interestingly, the authors of that paper note that "we do not have well-developed theoretical arguments to offer about why this should be." It seems to me that Atkinson's analyses depend crucially on the answer to this puzzle, though as I mentioned at the outset, I haven't been able to quite work out all the details yet.

Atkinson's analysis crucially depends on (among other things) the following supposition: the current population size of any language community is roughly predicted by the number of branching points (migrations) since the original language (which arose somewhere between 50,000 and 100,000 years ago). I'm still on the fence as to whether this claim is preposterous or very reasonable.

It is certainly very easy to construct scenarios under which this supposition would be false. Civilizations expand and contract rapidly (consider that English was confined to only one part of Great Britain half a millennium ago, or that Celtic languages were spoken across Europe only 2,000 years ago). Relative population size today seems to be driven more by poverty, access to birth control and education, etc., than anything else. Atkinson only needs there to be a mid-sized correlation, but 50,000 years is a very, very long time.

Atkinson also needs it to be the case that the further from Africa a language is spoken, the more branching points there have been. The problem we have is that there is a lot of migration within already-settled areas (Indo-European expansion, Mandarin expansion, Bantu expansion, etc.). So we need it to be the case that most of the branching of language groups happened going into new, unsettled areas, and relatively little of it is a result of invading already-populated areas. That may be true, but consider that all of Africa, Europe, Asia and the Americas were settled by 10,000 years ago, which leaves a lot of time for language communities to move around.


Atkinson put together a very interesting dataset that needs to be explained. His explanation may well be the right one. However, it requires a number of conjectures for which he offers little support. They may all be true, but this is a dangerous way to build theories. It's a little like playing Six Degrees of Kevin Bacon where you are allowed to conjecture the existence of movies and co-stars. It should be obvious that with those rules, you can connect Kevin Bacon to anyone, including yourself.


The super-lame New Yorker review of the recent Broadway revival of Stoppard's "Arcadia" moved me to do a rare thing: write a letter to the editor. They didn't publish it, despite the fact that -- and I think I'm being objective here -- my letter was considerably better than the review. Reviews are no longer free on the New Yorker website (you can see a synopsis here), but I think my letter covers the main points. Here it is below:

Hilton Als ("Brainstorm", Mar 28) writes about the recent revival of "Arcadia" that Stoppard's "aim is not to show us people but to talk about ideas." Elsewhere, Als calls the show unmoving and writes that Stoppard does better with tragicomedies.
"Arcadia" is not a show about ideas. It is about the relationship people have with ideas, particularly their discovery. Anyone who has spent any amount of time around academics would instantly recognize the characters as people, lovingly and realistically depicted. (Als singles out Billy Crudup's "amped-up characterization of the British historian Bernard Nightingale" as particularly mysterious. As Ben Brantley wrote in the New York Times review, "If you've spent any time on a college campus of late, you've met this [man].")
As an academic, the production was for me a mirror on my own life and the people around me. Not everyone will have that experience. The beauty of theater (and literature) is that it gives us a peek into the inner lives of folk very different from ourselves. It is a shame Als was unable to take advantage of this opportunity.
Where the play focuses most closely on ideas is in the theme of an idea (Thomasina's) stillborn before its time. If one feels no pathos for an idea that came too soon, translate "idea" into "art" and "scientist" into "artist" and consider the tragedies of artists unappreciated in their time and quickly forgotten. Even a theater critic can find the tragedy in that.

Graduate School Rankings

There have been a number of interesting posts in the last few days about getting tenure (1, 2, 3). One thing that popped out at me was the use of the National Research Council graduate school rankings in this post. I am surprised that these continue to be cited, due to the deep flaws in the numbers. Notice I said "numbers", not "methodology". I actually kind of like their methodology. Unfortunately, the raw numbers that they use to determine rankings are so error-ridden as to make the rankings useless.

For those who didn't see my original posts cataloging the errors, see here and here.

Annoyed about taxes

It's not that I mind paying taxes per se. In fact, I consider it everyone's patriotic duty to pay taxes. I just wish it wasn't so damn complicated.

The primary confusion I have to deal with every year is that Harvard provides a series of mini-grants for graduate students, which they issue as scholarships. Scholarships are taxable as income unless they are used to pay for tuition or required course supplies. I'm a graduate student, which means that the four courses I take every semester are "independent research," and obviously doing research is required. On the other hand, the IRS regulations specifically state that any scholarships used to pay for research are taxable. So if I use the mini-grant to pay for my research, is it taxable or not?

I actually asked an IRS representative a few years ago, and she replied that something counts as "required for coursework" only if everyone else taking that course has to buy it. If "everyone else" includes everyone else in the department doing independent research, then it's trivially the case that they are not required to do my research (though that would be really nice!), nor are they actually required to spend anything at all (some people's research costs more than others), so the mini-grant is taxable. If "everyone else" is only me -- this is independent research, after all -- then the mini-grant is not taxable. This of course all hinges on whether or not "independent research" is actually a class. My understanding is that the federal government periodically brings action against Harvard, claiming that independent research is not a class.

Some people occasionally deduct the mini-grant expenditures as business expenses. This is not correct. According to the IRS, graduate students are not employees and have no business, and thus we have no business expenses (this reasoning also helps prevent graduate student unions -- you can't form a union if you aren't employed). And in any case, as I mentioned, we are specifically forbidden to write off the cost of doing research.

It's not just that the rules are confusing; they don't make sense. Why does the government want to tax students for the right to do research? How is that a good idea? Research benefits the public at large, and comes at a high opportunity cost for the researcher already (one could make more doing just about anything else). Why make us pay for it?

(It probably should be pointed out that Harvard could cough up the taxes itself, or they could administer the mini-grants as grants rather than as scholarships, though that would cost them more in terms of administrative overhead. Instead, Harvard specifically forbids using any portion of the mini-grant to pay for the incurred taxes. Though since they don't ask for any accounting, it's quite possible nobody pays any attention to that rule.)

Feds to College Students: "We don't want your professors to know how to teach"

The National Science Foundation just changed the rules for their 3-year graduate fellowships: no teaching is allowed. Ostensibly, this is to ensure that fellows are spending their time doing research. This is different from the National Defense Science & Engineering graduate fellowship Vow of Poverty: you can teach as much as you want, so long as you don't earn money from it.*

Consider that, ideally, PhD programs take 5 years, and the final year is spent on (a) writing the dissertation, and (b) applying for jobs. This means that NSF graduate fellows may have as little as one year in which to get some teaching experience.

Presumably, NSF was thinking one of three things:

1) They're trying to make it harder for their fellows to get jobs at universities that care about teaching.
2) They honestly don't believe teaching experience is important.
3) They weren't thinking at all.

I'm curious what will happen at universities that require all students to teach, regardless of whether they have outside fellowships or not. Will they change that rule, or will they forbid students to have NSF fellowships? Given the current financial situation, I'm guessing they'll go with the former, but it's hard to say.

*The exact NDSEG rule is that your total income for any year should be no more than $5,000 in addition to the fellowship itself. Depending on the university, this can be less than what one would get paid for teaching a single class.

Another problem with statistical translation

In the process of writing my latest article for Scientific American Mind, I spent a lot of time testing out automatic translators like Google Translate. As I discuss in the article, these programs have gotten a lot better in recent years, but on the whole they are still not very good.

I was curious what the Italian name of one of my favorite arias meant. So I typed O Soave Fanciulla into Google Translate. Programs like Google Translate are trained by comparing bilingual documents and noting, for a given word in one language, what word typically appears in the other language in the same place. Not surprisingly, Google Translate translated O Soave Fanciulla as O Soave Fanciulla -- no doubt because, in the bilingual corpora GT was trained on, sentences containing the phrase o soave fanciulla in Italian also contained o soave fanciulla in the English translation.

I was reduced to translating the words one at a time: soave -> sweet, fanciulla -> girl. GT thinks o means or, but I expect that's the wrong reading in this context ("or sweet girl"?).
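The pass-through behavior falls right out of the training scheme described above. Here is a toy sketch of co-occurrence-based translation in Python -- the corpus, words, and scoring are all invented for illustration, and real systems (including Google Translate) are vastly more sophisticated:

```python
from collections import Counter, defaultdict

# Toy corpus of (Italian, English) sentence pairs. These pairs are
# made up for illustration; real systems train on millions of pairs.
corpus = [
    ("la fanciulla canta", "the girl sings"),
    ("la fanciulla dorme", "the girl sleeps"),
    ("una fanciulla soave", "a sweet girl"),
    ("una voce soave", "a sweet voice"),
    ("musica soave", "sweet music"),
]

# Count how often each source word co-occurs with each target word.
cooc = defaultdict(Counter)
for src, tgt in corpus:
    for s in src.split():
        for t in tgt.split():
            cooc[s][t] += 1

def translate_word(word):
    # Pick the target word that co-occurs most often. If we have
    # never seen the word, pass it through untranslated -- analogous
    # to what happened with "o soave fanciulla".
    if word not in cooc:
        return word
    return cooc[word].most_common(1)[0][0]

print(translate_word("fanciulla"))  # girl
print(translate_word("soave"))      # sweet
print(translate_word("ignoto"))     # ignoto (unseen, passes through)
```

The same logic explains why a phrase that appears untranslated in the training data (as opera titles often do in English texts) comes back untranslated.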

Blogger Spam Filter: Not Totally Useless

For the first time ever, Google/Blogger's spam filter actually caught a spam comment. Usually, it lets the spam go right through unmolested and only traps legitimate comments.

We can hope this is the start of a trend.

Overheard: Converting common knowledge into scientific knowledge

Because they are so familiar, it is easy to assume that category labels drawn from everyday language are self-evidently the correct way to describe emotion. However, transforming everyday categorical descriptions into an effective research tool is at the least a challenge.

Cowie & Cornelius (2003) Describing the emotional states that are expressed in speech. Speech Communication 40, 5-32.

New experiment: Mind Reading Quotient

Language requires a lot of inference. Consider the following three conversations:

A: Are there lots of people at the party?
B: Well, most people have left already.

A: How long has the party been going on?
B: Well, most people have left already.

A: Is it a good party?
B: Well, most people have left already.

In each of these cases, B's statement literally means the same thing, but the interpretation is different. Explaining (a) why this should be the case, and (b) how people figure out the implicit meanings is a very active area of research in modern linguistics and psycholinguistics.

The Mind Reading Quotient

Basically, understanding conversations like the ones above seems to require a certain amount of "mind reading" -- that is, guessing what the speaker (B, in this case) means to say. If you've ever wondered "what did she mean by that?", you were engaged in this kind of mind reading.

I just posted a new experiment -- the Mind Reading Quotient -- which consists of several short tests of this kind of mind-reading ability. A couple of the tests look specifically at trying to work out what somebody is saying. A couple look at similar skills in the non-linguistic domain.

My favorite of the non-linguistic tasks is a coordination game. Thomas Schelling won a Nobel Prize in part for pioneering work on the topic. He found that people are very good at guessing what another person is thinking under certain conditions. For instance, if you tell two people they must meet up in New York City -- but without communicating with each other in any way -- they are actually fairly likely to succeed. Most likely, they would both show up on the corner of Times Square (or in one of a very small number of likely locations). The Mind Reading Quotient includes several such problems.

The goal of this study in part is to get a sense of how good people are at such tasks. There are a lot of thought experiments out there, but not nearly enough data. I will also be looking to see if people who are better at one of these tasks are also better at the others -- that is, is there a single underlying "mind reading ability," or does each task require a separate set of skills?

Reports so far are that the experiment runs 20-25 minutes. Because this is broken up into 7 separate activities, it should seem faster than that. And a lot of the tasks are fun (at least, I think so). Plus, at the end of the experiment, you'll be able to see your scores on many of the different sub-tasks. In two cases (a vocabulary test and an empathy test), I also have percentile scores already worked out, so you can see how you compare to average.

Follow this link to the study.

For previous posts about pragmatics and other linguistic inferences, check out this one, this one and this one.

image CC by Ignacio Conejo.

Missing Words

My dictionary lists several Chinese words for disdain, but none for discourage. The government in Orwell's 1984 would have loved this, as they -- along with many contemporary writers (I'm talking about you, Bill Bryson) -- believed that if you don't have a word for something, you can't think about it. I guess China has no need for the motivational speaker industry.
You can't be discouraged if you don't have a word for it.

Unfortunately for the government of Oceania, there's very little evidence this is true. The availability of certain words in a language may have effects on memory or speeded recognition, but probably does nothing so drastic as making certain thoughts inaccessible. I think examples like the one above make it clear just how unlikely the hypothesis was to be true to begin with.

photo credit here.

New Experiment: Drama Queen

The latest experiment in my quest to understand how people use emotion verbs is now posted. You will be introduced to a character -- Susan -- who is, as the name of the game implies, a drama queen. She has many fraught relationships with her friends. You will meet a number of those friends, learn how Susan feels about each one, and try out a new verb that describes that relationship. Enjoy.

Love, Marriage & Race

People who have been following this blog know that birth order affects who you are friends with and who you marry. Here's some comprehensive evidence on race. It probably won't come as a surprise, but it's nice to have numbers.

Talking about Love

Much of my work is on verbs that describe emotion, called "psych verbs." The curious thing about psych verbs is that they come in two varieties, those that put the experiencer of the emotion in subject position (Mary likes/hates/fears John) and those that put the experiencer of the emotion in object position (Mary delights/angers/frightens John).

These verbs have caused a four-decades-long headache for theorists trying to explain how people know what should be the subject and what should be the object of a given verb. Many theorists would like to posit theories on which you put the "do-er" in subject position and the one "done to" in object position. But some psych verbs seem to go one way and some the other.

There are basically only three theoretical possibilities:

a) There's no general rule that will tell you whether the experiencer of an emotion should be the subject or object of a given verb.

b) There's a general rule that tells you the experiencer should be the subject (or, on other theories, the object), and then there are some exceptions.

c) There are no exceptions. There are two kinds of psych verbs that actually mean very different things. Each group follows a particular rule: one sends the experiencer to subject; the other, to object.

I started out as a fan of theory (b). The results of my own work have pushed me in the direction of (c). The only theory that I'm pretty sure is wrong is (a). There are a lot of reasons I think (a) is wrong. One has to do with Broca's aphasia.

Broca's aphasia

People with Broca's aphasia -- typically caused by a stroke or brain injury -- have difficulty with grammar but are relatively good at remembering what individual words mean. Classically, Broca's aphasia was thought to result from damage to Broca's area, though I've heard that association is not as solid as once believed.
Some well-known language-related areas of the brain.

Either way, Maria Mercedes Pinango published a study in 2000 looking at how well Broca's aphasics understand psych verbs. She found that they had particular trouble with experiencer-object verbs (delights/angers/frightens) ... unless the verbs were in passive form (Mary is delighted/angered/frightened by John), in which case they had more trouble with the experiencer-subject verbs.

There are a lot of reasons this could be. The main aspect of the finding that interests me here is that this is *not* what you'd expect on theory (a), since on that theory, all psych verbs are more or less the same and there's no particular reason Broca's aphasia or anything else should impact one more than the other.

One worry one might have about this study was that it was published as a book chapter and not in a journal, and book chapters don't (usually) undergo the same review process. I don't personally know that much about aphasia or how one goes about testing aphasics, so it's hard for me to review Pinango's methods. More importantly, there weren't many participants in the study (these participants are not easy to find), so one would like replication.


As it happens, Cynthia Thompson and Miseon Lee recently published just such a replication (well, they published it in 2009, but one doesn't always hear about papers right away). It's a nice study with 5 Broca's aphasics, published in the Journal of Neurolinguistics. They tested both sentence comprehension and sentence production, finding that while passive sentences were harder overall, experiencer-subject verbs (like/hate/fear) were easier in the active form and experiencer-object verbs (delight/anger/frighten) were easier in the passive form. This effect was much more pronounced in sentence production than comprehension (in the latter case, it was not strictly significant), most likely because comprehension is easier.

Again, these are not the results you expect if the rules that tell you who should be a subject and who should be an object are verb-by-verb, since then there's no reason brain damage should affect one class of verbs as opposed to another (since there are no verb classes).* What exactly it does mean is much trickier. Give me another 20-30 years, and hopefully I'll have an answer.

*Actually, I can come up with a just-so story that saves theory (a). But it's certainly not what you would expect, and I believe there are a lot of other data from other paradigms that speak against theory (a).


Thompson CK, and Lee M (2009). Psych verb production and comprehension in agrammatic Broca's aphasia. Journal of neurolinguistics, 22 (4), 354-369 PMID: 20174592

New York Times, You Can't Handle the Truth.

Earlier today I wrote about the research behind an opinion article at the New York Times. When I looked at the sources cited, I was unable to find any information supporting the claims made in the article. In fact, what I found directly contradicted those claims. I finished by saying that while I was willing to believe these claims, I'd like to know what data support them. In passing, I mentioned that I had submitted an abbreviated version of this analysis as a comment on the Times website.

That comment was not published. I figured maybe there had been a computer error, so I submitted another one later in the day. That one was also not published. Finally, at 6:13pm, I submitted an innocuous and useless comment under an assumed name:
I agree with Pat N. It's nice to hear from someone who has some optimism (@ Dr. Q).
This comment was published almost immediately.

The Times states that "comments are moderated and generally will be posted if they are on-topic and not abusive." Since the moderators didn't publish the comment, we can conclude one of two things:

1) Discussion of the empirical claims made in a New York Times article is not "on topic."
2) Pointing out a mistake made in a New York Times article is a kind of abuse.

Do students at selective schools really study less?

*Updated with More Analysis*

So says Philip Babcock in today's New York Times. He claims:
Full-time college students in the 1960s studied 24 hours per week, on average, and their counterparts today study 14 hours per week. The 10-hour decline is visible for students from all demographic groups and of all cognitive abilities, in every major and at every type of college.
The claim that this is true for "every type of college" is important because he wants to conclude that schools have lowered their standards. The alternative is that there are more low-quality schools now, or that some schools have massively lowered their standards. These are both potentially problems -- and are probably real -- but are not quite the same problem as all schools everywhere lowering their standards.

So it's important to show that individual schools have lowered their standards, and that this is true for the selective schools as well as the not-selective schools. The article links to this study by Babcock. This study analyzes a series of surveys of student study habits from the 1960s to the 2000s, and thus seems to be the basis of his argument, and in fact the introduction contains almost the identical statement that I have quoted above. Nonetheless, despite these strong conclusions, the data that would support them appear to be missing.
SAT scores and size are not available in the early years, so study time by college selectivity is not reported. 
He goes on to say that he can look at selectivity in the more recent surveys: specifically, matched 1988-2003 surveys. These do show a decrease in study time of on the order of 1-2 hours for high-, medium- and low-selectivity schools (I cannot find how selectivity was defined). Whether this is even statistically significant is unclear, as he does not report any statistics or confidence intervals. In any case, it is not a 10-hour difference.

What Babcock might have meant, and more problems with the data

It is possible that when Babcock was saying that the decrease in study time was true of all types of schools, he meant that when you look at all types of schools in 2003/4, students at all levels report studying less than the average student reported in 1961. The problem is that, for all we know, the schools in his sample were more selective in 1961 than they were in 2003/4.

Moreover, there is something worrisome about his selectivity data. Whenever analyzing data, many researchers like to do what is called a "sanity check": they make sure that the data contain results that are known to be true. If you were looking at a study of different types of athletes, you might make sure that the jockeys are shorter than the basketball players, lighter than the football players and chew less tobacco than the baseball players. If you find any of these things do not hold, you might go back and make sure there isn't a typo somewhere in your data-entry process.

I worry that Babcock's data fail the sanity check. Specifically, look at the number of hours studied according to selectivity of school in 2003:

highly selective: 13.47 hours
middle: 14.68 hours
non-selective: 16.49 hours

Note that this effect is larger than the decline in number of hours studied between 1988 and 2003, so in terms of this dataset, this is a large effect (again, I cannot tell if it is significant, because the relevant statistical information is not provided) and it's not in the direction one would think. I will admit that it is possible that students at highly selective schools really do study less than the folks at JuCo, but that conflicts heavily with my pretty extensive anecdotal database. So either a) the world is very different from how I thought it was -- in which case, I want more evidence than just this survey -- b) Babcock has defined selectivity incorrectly, or c) there is something wrong with these data.
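A sanity check is easy to express in code: before trusting an analysis, assert facts you believe you already know about the data. Here is a minimal Python sketch using the 2003 figures above; the expectation encoded in the check is mine, not Babcock's:

```python
# The 2003 study-hours figures quoted in the post, by selectivity.
hours_2003 = {
    "highly selective": 13.47,
    "middle": 14.68,
    "non-selective": 16.49,
}

def sanity_check(hours):
    # The expectation going in: students at more selective schools
    # study at least as much as students at less selective ones.
    return hours["highly selective"] >= hours["non-selective"]

print(sanity_check(hours_2003))  # False -- the data fail the check
```

A failed check doesn't tell you which explanation is right (the world is surprising, selectivity was mis-defined, or the data are wrong), but it tells you to stop and look.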

One last worrisome fact

I considered the possibility that the data Babcock was quoting were in a different paper. The only other paper on Babcock's website that looked promising was this American Enterprise Institute report. This is not a research paper, but rather summarizes research. Specifically, according to footnote #2, it summarizes the research in the paper I just discussed. Strangely, this paper does have a graph (Figure 4) breaking down study habits of students in the 1960s based on selectivity of the school they are attending: the very data he states do not exist in the later paper.

I'm not really sure what to make of that, and have nothing further to say on the topic. At the very least, I would be hesitant to use those graphs as evidence to support the general claim that study habits have changed even at the selective schools, since it's unclear where the data came from, or if in fact they even exist (to be clear: it's Babcock who says they don't exist, not me).


To summarize, there seems to be very little evidence to support Babcock's conclusion that study time has decreased by 10 hours from the 1960s to the modern day even at selective schools. That is, he has a survey from 1961 in which students studied 25 hrs/week, two surveys in the 1980s in which students studied 17 hrs/week, and two surveys in the 2000s in which students studied 14-15 hrs/week, but these surveys are all based on different types of students at different schools, so it's hard to draw any strong conclusions. If I compared the weight of football players from Oberlin in 1930 and Ohio State in 2005, I'd find a great increase in weight, but in fact the weight of football players at Oberlin probably has not increased much over that time period.

Moreover, there are aspects of these data that deserve some skepticism. When I report to people who went to selective schools that these data suggest students at such schools study 13 hrs/week, the response is usually something like, "Do you mean per day?"

Finally, since no statistics were run, it's quite possible that none of the results in this study are significant.

I want to be clear that I'm not saying that Babcock's claims aren't true. I'm just saying that it's not clear he has any evidence to support them (which is not to say I think it's a bad study: it was a good study to have done and clearly took a lot of work, but I find it at best suggestive of future avenues of research and certainly not conclusive).

New tags

Rather than write a new blog post (or my nearly-due BUCLD proceedings paper), I decided to revamp the post tags on this blog. Their usage has been inconsistent, which is making it harder and harder to find old blog posts that I want to link to.

Hopefully the new and improved tags will also be useful for you, dear reader. Now if you want to find any of my articles on the academic career path, on animal language or on universal grammar -- just to give a few examples -- they are only a mouse click away.

In addition to standard tags, there are also a series of tags beginning with the preposition "on". These appear on most posts now and are more meta-thematic than the others.

Learning What Not to Say

A troubling fact about language is that words can be used in more than one way. For instance, I can throw a ball, I can throw a party, and I can throw a party that is also a ball.

These cats are having a ball.

The Causative Alternation

Sometimes the relationship between different uses of a word is completely arbitrary. If there's any relationship between the different meanings of ball, most people don't know it. But sometimes there are straightforward, predictable relationships. For instance, consider:

John broke the vase.
The vase broke.

Mary rolled the ball.
The ball rolled.

This is the famous causative alternation. Some verbs can be used with only a subject (The vase broke. The ball rolled) or with a subject and an object (John broke the vase. Mary rolled the ball). The relationship is highly systematic. When there is both a subject and an object, the subject has done something that changed the object. When there is only a subject, it is the subject that undergoes the change. Not all verbs work this way:

Sally ate some soup.
Some soup ate.

Notice that Some soup ate doesn't mean that some soup was eaten, but rather has to mean nonsensically that it was the soup doing the eating. Some verbs simply have no meaning at all without an object:

Bill threw the ball.
*The ball threw.

In this case, The ball threw doesn't appear to mean anything, nonsensical or otherwise (signified by the *). Try:

*John laughed Bill.
Bill laughed.

Here, laughed can only appear with a subject and no object.

The dative alternation

Another famous alternation is the dative alternation:

John gave a book to Mary.
John gave Mary a book.

Mary rolled the ball to John.
Mary rolled John the ball.

Once again, not all verbs allow this alternation:

John donated a book to the library.
*John donated the library a book.

(Some people actually think John donated the library a book sounds OK. That's all right. There is dialectical variation. But for everyone there are verbs that won't alternate.)

The developmental problem

These alternations present a problem for theory: how do children learn which verbs can be used in which forms? A kid who learns that all verbs that appear with both subjects and objects can appear with only subjects is going to sound funny. But so is the kid who thinks verbs can only take one form.
The trick is learning what not to say

One naive theory is that kids are very conservative. They only use verbs in constructions that they've heard. So until they hear "The vase broke," they don't think that break can appear in that construction. The problem with this theory is that lots of verbs are so rare that it's possible that (a) the verb can be used in both constructions, but (b) you'll never hear it used in both.

Another possibility is that kids are wildly optimistic about verb alternations and assume any verb can appear in any form unless told otherwise. There are two problems with this. The first is that kids are rarely corrected when they say something wrong. But perhaps you could just assume that, after a certain amount of time, if you haven't heard e.g. The ball threw then threw can't be used without an object. The problem with that is, again, that some verbs are so rare that you'll only hear them a few times in your life. By the time you've heard that verb enough to know for sure it doesn't appear in a particular construction, you'll be dead.

The verb class hypothesis

In the late 1980s, building on previous work, Steven Pinker suggested a solution to this problem. Essentially, there are certain types of verbs which, in theory, could participate in a given alternation. Verbs involving caused changes (break, eat, laugh) in theory can participate in the causative alternation, verbs involving transfer of possession (roll, donate) in theory can participate in the dative alternation, and this knowledge is probably innate. What a child has to learn is which verbs actually do participate in a given alternation.

For reasons described above, this can't be done one verb at a time. And this is where the exciting part of the theory comes in. Pinker (building very heavily on work by Ray Jackendoff and others) argues that verbs have core aspects of their meaning and some extra stuff. For instance, break, crack, crash, rend, shatter, smash, splinter and tear all describe something being caused to fall to pieces. What varies between the verbs is the exact manner in which this happens. Jackendoff and others argue that the shared meaning is what is important to grammar, whereas the manner of falling to pieces is extra information which, while important, is not grammatically central.

Pinker's hypothesis was that verb alternations make use of this core meaning, not the "extra" meaning. From the perspective of the alternation, then, break, crack, crash, rend, shatter, smash, splinter and tear are all the same verb. So children are not learning whether break alternates; they learn whether the whole class of verbs alternates. Since there are many fewer classes than there are verbs (my favorite compendium, VerbNet, has only about 270), the fact that some verbs are very rare isn't that important. If you know what class a verb belongs to, then as long as the class itself is common enough, you're golden.
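To make the logic concrete, here is a toy Python sketch contrasting a verb-by-verb learner with a class-based one. The class assignments and the "heard" data are invented examples, and this is my gloss on the idea, not Pinker's actual model:

```python
# Invented class assignments for a handful of verbs.
verb_classes = {
    "break": "change-of-state", "shatter": "change-of-state",
    "smash": "change-of-state", "rend": "change-of-state",
    "laugh": "expression", "giggle": "expression",
}

# Suppose the child has heard only these verbs alternate (used both
# transitively and intransitively).
heard_alternating = {"break", "smash"}

def verb_by_verb(verb):
    # Conservative learner: licenses only verbs actually heard
    # alternating.
    return verb in heard_alternating

# Class-based learner: if any verb in a class alternates, assume
# every verb in that class does.
alternating_classes = {verb_classes[v] for v in heard_alternating}

def class_based(verb):
    return verb_classes[verb] in alternating_classes

# "rend" is rare: the verb-by-verb learner never licenses it...
print(verb_by_verb("rend"))   # False
# ...but the class-based learner does, because "break" vouches for
# the whole change-of-state class.
print(class_based("rend"))    # True
print(class_based("giggle"))  # False: no expression verb alternates
```

The payoff is exactly the one in the paragraph above: a rare verb inherits its behavior from a common class, so sparse input stops being fatal.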

Testing the theory

This particular theory has not been tested as much as one might expect, partly because it is hard to test. It is rather trivial to show that verbs do or don't participate in alternations as a class, partly because that's how verb classes are often defined (that's how VerbNet does it). Moreover, various folks (like Stefanowitsch, 2008) argue that although speakers might notice the verb classes, that doesn't prove that people actually do use those verb classes to learn which verbs alternate and which do not.

The best test, then, is to teach people -- particularly young children -- new verbs that either belong to a class that does alternate or to a class that does not, and see if they think those new verbs should or should not alternate. Very few such studies have been done.

Around the time Pinker's seminal Learnability and Cognition, which outlines the theory I described above, came out in 1989, a research team led by his student Jess Gropen (Gropen, Pinker, Hollander, Goldberg and Wilson, 1989) published a study of the dative alternation. They taught children new verbs of transfer (such as "moop," which meant to move an object to someone using a scoop), which in theory could undergo the dative alternation. The question they asked was whether kids would be more likely to use those verbs in the alternation if the verbs were monosyllabic (moop) or bisyllabic (orgulate). They were more likely to do so for the monosyllabic verbs, and in fact in English monosyllabic verbs are more likely to alternate. The issue of how many syllables a verb has did come up in Learnability and Cognition, but it wasn't -- at least to me -- the most compelling part of the story (which is why I left it out of the discussion so far!).

Ambridge, Pine and Rowland (2011)

Ben Ambridge, Julian Pine and Caroline Rowland of the University of Liverpool have a new study in press which is the only study to have directly tested whether verb meaning really does guide which constructions a child thinks a given verb can be used in, at least to the best of my knowledge -- and apparently to theirs, since they don't cite anyone else. (I've since learned that Brooks and Tomasello, 1999, might be relevant, but the details are sufficiently complicated and the paper sufficiently long that I'm not yet sure.)

They taught children two novel verbs, one of which should belong to a verb class that participates in the causative alternation (a manner of motion verb: bounce, move, twist, rotate, float) and one of which should not (an emotional expression: smile, laugh, giggle). Just to prove to you that these classes exist, compare:

John bounced/moved/twisted/rotated/floated the ball.

The ball bounced/moved/twisted/rotated/floated.

*John smiled/laughed/giggled Sally.
Sally smiled/laughed/giggled.

Two groups of children (5-6 years old and 9-10 years old) were taught both types of verbs with subjects only. After a lot of training, they were shown new sentences with the verbs and asked to rate how good the sentences were. In the case of the manner of motion verb, they liked the sentences whether the verb had a subject and an object or only a subject. That is, they thought the verb participated in the causative alternation. For the emotion expression verb, however, they thought it sounded good with a subject only; when it had both a subject and an object, they thought it did not sound good. This was true both for the older kids and the younger kids.

This is, I think, a pretty nice confirmation of Pinker's theory. Interestingly, Ambridge and colleagues think that Pinker is nonetheless wrong, but based on other considerations. Partly, our difference of opinion comes from the fact that we interpret Pinker's theory differently. I think I'm right, but that's a topic for another post. Also, there is some disagreement about a related phenomenon (entrenchment), but that, too, is a long post, and the present post is long enough.

Gropen, J., Pinker, S., Hollander, M., Goldberg, R., and Wilson, R. (1989). The learnability and acquisition of the dative alternation in English. Language, 65(2). DOI: 10.2307/415332

Ambridge, B., Pine, J. M., and Rowland, C. F. (2011). Children use verb semantics to retreat from overgeneralization errors. Cognitive Linguistics.

For picture credits, look here and here.

New Experiment: EmotionSense

I just posted a new experiment on the website: EmotionSense. I have lately gotten very interested in verb-learning, specifically how we decide which of the participants in an event should be the grammatical subject, which the grammatical object, etc. (see this post and this one). In this experiment, you'll answer some questions about different types of emotions. I'll use this information to help design some upcoming verb-learning experiments.

As usual, the experiment is short and should take 3-5 minutes.

[snappy caption goes here]

photo credit here

Learning the passive

If Microsoft Word had its way, passive verbs would be excised from the language. That would spare children some trouble, because passive verbs are more difficult to learn than one might think: not all verbs passivize. Consider:

*The bicycle was resembled by John.
*Three bicycles are had by John.
*Many people are escaped by the argument.

The bicycle was resembled by John: A how-to guide.

So children must learn which verbs have passives and which don't. I recently sat down to read Pinker, Lebeaux and Frost (1987), a landmark study of how children learn to passivize verbs. This is not a work undertaken lightly. At 73 pages, Pinker et al. (1987) is not Steve Pinker's longest paper -- that honor goes to his 120-page take-down of Connectionist theories of language, Pinker and Prince (1988) -- but it is long, even for psycholinguistics. It's worth the read, both for the data and because it lays out the core of what became Learnability and Cognition, one of the books that has had the most influence on my own work and thinking.

The Data

The authors were primarily interested in testing the following claim: that children are conservative learners and only passivize verbs that they have previously heard in the passive. This would prevent them from over-generating passives that don't exist in the adult language.

First, the authors looked at a database of transcriptions of child speech. A large percentage of the passive verbs they found were passives the children couldn't possibly have heard before because they aren't legal passives in the adult language:

It's broked? (i.e., is it broken?)
When I get hurts, I put dose one of does bandage on.
He all tieded up, Mommy.

Of course, when we say that the child couldn't have heard such passives before, we can't really be sure what the child heard. It just seems unlikely. To more carefully control what the child had heard, the authors taught children of various ages (the youngest group was 4 years old) made-up verbs. For instance, they might demonstrate a stuffed frog jumping on top of a stuffed elephant and say, "Look, the frog gorped the elephant." Then they would show the elephant jumping on top of a mouse and ask the child, "What happened to the mouse?"

If you think "gorp" has a passive form, the natural thing to do would be to say "The mouse was gorped by the elephant." But a child who only uses passive verbs she has heard before would refuse to utter such a sentence. However, across a range of different made-up verbs and across four different experiments, the authors found that children were willing -- at least some of the time -- to produce these new passive verbs. (In addition to production tests, there were also comprehension tests where the children had to interpret a passivization of an already-learned verb.)

Some Considerations

These data conclusively proved that children are not completely conservative, at least not by 4 years of age (there has been a lot of debate more recently about younger children). With what we know now, we know that the conservative child theory had to be wrong -- again, at least for 4 yos -- but it's worth remembering that at the time, this was a serious hypothesis.

There is a lot of other data in the paper. Children are more likely to produce new passive forms as they get older (higher rates for 5 year-olds than 4 year-olds). They taught children verbs where the agent is the object and the patient is the subject (that is, where The frog gorped the elephant means "the elephant jumped on top of the frog"). Children had more difficulty passivizing those verbs. However, a lot of these additional analyses are difficult to interpret because of the small sample sizes (16 children and only a handful of verbs per experiment or sub-experiment).


Fair warning: the rest of this post is pretty technical.

What excites me about this paper is the theoretical work. For instance, the authors propose a theory of linking rules that have strong innate constraints and yet still some language-by-language variation.
The linkages between individual thematic roles in thematic cores and individual grammatical functions in predicate-argument structures is in turn mediated by a set of unmarked universal linking rules: agents are mapped onto subjects; patients are mapped onto objects; locations and paths are mapped onto oblique objects. Themes are mapped onto any unique grammatical function but can be expressed as oblique, object or subject; specifically, as the 'highest' function on that list that has not already been claimed by some other argument of the verb.
With respect to passivization, what is important is that only verbs which have agents as subjects are going to be easily passivized. The trick is that what counts as an 'agent' can vary from language to language.
It is common for languages to restrict passivized subjects to patients affected by an action ... The English verbal passive, of course, is far more permissive; most classes of transitive verbs, even those that do not involve physical actions, have the privilege of passivizability assigned to them. We suggest this latitude is possible because what counts as the patient of an action is not self-evident ... Languages have the option of defining classes in which thematic labels are assigned to arguments whose roles abstractly resemble those of physical thematic relations...
This last passage sets up the core of the theory to be developed in Learnability and Cognition. Children are born knowing that certain canonical verbs -- ones that very clearly have agents and patients, like break -- must passivize, and that a much larger group of verbs in theory might passivize, because they could be conceived of as metaphorically having agents and patients. What they have to learn is which verbs from that broader set actually do passivize. Importantly, verbs come in classes of verbs with similar meanings. If any verb from that set passivizes, they all will.

This last prediction is the one I am particularly interested in. A later paper (Gropen, Pinker, Hollander, Goldberg & Wilson, 1989) explored this hypothesis with regard to the dative alternation, but I don't know of much other work. In general, Learnability and Cognition got less attention than it should have, perhaps because by the time it was published, the Great Past Tense Debate had already begun. I've often thought of continuing this work, but teaching novel verbs to children in the course of an experiment is damn hard. Ben Ambridge has recently run a number of great studies on the acquisition of verb alternations (like the passive), so perhaps he will eventually tackle this hypothesis directly.

Pinker, S., Lebeaux, D. S., and Frost, L. A. (1987). Productivity and constraints in the acquisition of the passive. Cognition, 26(3), 195-267. PMID: 3677572

Mendeley -- Not quite ready for prime time

Prompted by Prodigal Academic, I decided to give Mendeley a shot. That is, instead of working on a long-overdue draft of a paper.

Mendeley is two things. First, it is a PDF library/reader. Second, it is a citation manager.

Currently, I use Papers for the first and Endnote for the second. Both work well enough -- if not perfectly -- but it is a pain that I have to enter every paper I want to cite into two different programs.

(Don't tell me I could export my Papers citations library to Endnote. First, I'd have to do that every time I update my library, which is annoying. Second, Papers was created by someone who clearly never cites books, book chapters, conference proceedings, etc. So I'd have to fix all of those in Endnote ... every time I export.)

(Also, don't tell me about Zotero. Maybe it's gotten better in the last year since I tried it, but it was seriously feature-deficient and buggy beyond all belief.)

First glance

At first, I was pleasantly surprised. Unlike Papers, Mendeley is free so long as you don't want to use their Cloud functionality much (I don't). Papers is convinced there are people named Marc Hauser, Marc D Hauser, M D Hauser, and M Hauser. Mendeley can be led astray but has some nice options to allow you to collapse two different author records -- or two different keywords.

(On that note, my Papers library has implicit causality, Implicit causality and Implicit Causality all as different keywords. Once Papers has decided the keyword for a paper is, say, Implicit Causality, nothing on G-d's green Earth will convince it to switch to implicit causality. And its searches are case sensitive. Mendeley has none of these "features.")

Also, Mendeley will let you annotate PDFs and export the PDFs with your annotations in a format readable by other PDF viewers (if, for instance, you wanted to share your annotated PDF with someone). That's a nice feature.

These would all be nice additional features if the core functionality of Mendeley were there. I'm sorry to say that the product just doesn't seem to be ready for prime time.
I typed "prime time" into Flickr, and this is what it gave me. Not sure why.
photo credit here.

Second glance

The first disappointment is that Mendeley does not have smart collections. Like smart playlists in iTunes, smart collections are collections of papers defined by various search terms. If you have a smart collection that indexes all articles with the keywords "implicit causality," "psych verbs" and "to read", then whenever you add a new paper with those keywords, they automatically go into the smart collection. This is very handy, and it's an excellent feature of Papers (except that, as mentioned above, my smart folder for implicit causality searches for the keywords "implicit causality," "Implicit causality" OR "Implicit Causality").

I suspect Mendeley doesn't have smart collections because it doesn't have a serious search function. You can search for papers written by a given author or with a given keyword, but if you want to search for papers written by the conjunction of two authors, or for any paper on "implicit causality" written by Roger Brown, you're out of luck. Or rather, it'll perform the search; it just won't find the right papers.

Third glance

That might be forgivable if the citation function in Mendeley was usable. The idea is that as you write a manuscript, when you want to cite, say, my paper on over-regularization (18 citations and counting!), you would click on a little button that takes you to Mendeley. You find my paper in your PDF library, click another button, and (Hartshorne & Ullman, 2006) appears in your Word document (or NeoOffice or whatever) and the full bibliographic reference appears in your manuscript's bibliography. You can even choose what citation style you're using (e.g., APA).

Sort of. Let's say you want to cite two different papers by Roger Brown and Deborah Fish, both published in 1983 (which, in fact, I did want to do). Here's what it looks like:
Implicit causality effects are found in both English (Brown & Fish, 1983) and Mandarin (Brown & Fish, 1983).
At least in APA style, those two papers should be listed as (Brown & Fish, 1983a) and (Brown & Fish, 1983b), because obviously otherwise nobody has any idea which paper you are citing.

This gets worse. Suppose you wrote:
Implicit causality effects have been found in multiple languages (Brown & Fish, 1983; Brown & Fish, 1983).
Correct APA 5th Ed. style is, I believe, (Brown & Fish, 1983a, 1983b). Actually, I'm not sure what exactly the correct style is, because Endnote always takes care of it for me.
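For what it's worth, the disambiguation rule Mendeley is missing is mechanical: when the same author(s) have more than one work in the same year, append letter suffixes in publication order. Here is a minimal sketch of that rule (simplified: real APA ordering also alphabetizes same-year works by title, which I skip here):

```python
from collections import Counter
from string import ascii_lowercase

def apa_year_labels(refs):
    """Given (author, year) pairs for distinct works, return citation
    labels with letter suffixes (1983a, 1983b, ...) whenever the same
    author(s) have more than one work in the same year.
    A simplified sketch of the APA disambiguation rule."""
    totals = Counter(refs)   # how many works share each (author, year)
    seen = Counter()         # how many we've labeled so far
    labels = []
    for author, year in refs:
        key = (author, year)
        if totals[key] > 1:
            suffix = ascii_lowercase[seen[key]]  # 'a', 'b', 'c', ...
            seen[key] += 1
            labels.append(f"{author}, {year}{suffix}")
        else:
            labels.append(f"{author}, {year}")
    return labels
```

Two Brown & Fish 1983 papers come out as "Brown & Fish, 1983a" and "Brown & Fish, 1983b", while a lone work keeps its bare year. The point is that this is a few lines of bookkeeping, not a hard problem.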

There are other issues. Mendeley doesn't have a mechanism for suppressing the author. So you end up with:
As reported by Brown and Fish (Brown & Fish, 1983; Brown & Fish, 1983), verbs have causality implicit in their meaning.
instead of
As reported by Brown and Fish (1983a, 1983b), verbs have causality implicit in their meaning.
Nor does Mendeley know about "et al.":
Hauser, Chomsky and Fitch (Hauser, Chomsky, & Fitch, 2001) put forward a new proposal... blah blah has been reported several times in the literature (Hauser, Chomsky, & Fitch, 2001; Brown & Fish, 1983; Brown & Fish, 1983).
That is, the second time you cite a paper with more than two authors, it doesn't contract to (Hauser et al., 2001). Unfortunately, there is no work-around for any of these problems. In theory, you can edit the citations to make them match APA style. Within a few seconds, a friendly dialog box pops up and asks you if you really want to keep your edited citation. You can click "OK" or click "Cancel," but either way it just changes your carefully edited citation back to its default -- at least it does on my Mac (the forums suggest that this works for some people).

It's possible that people who don't use APA won't have as many of these problems. Numbered citations, for instance, probably work fine. I've never submitted a paper anywhere that used numbered citations, though. So I either need to switch professions or continue using Endnote to write my papers.


One can hope that Mendeley will solve some of these issues. I found discussions on their "suggested features" forum going back many months for each of the problems discussed above, which suggests I may be waiting a while for these fixes. I do understand that Mendeley is technically in beta testing. But it's been in beta testing for over two years, so that's not really an excuse at this point.

Alternatively, maybe Papers will add a good citation feature (and discover books). Or maybe Zotero will confront its own demons. I'm going to have to wait and see.

It makes one appreciate Endnote. Yes, it's a dinosaur. No, it hasn't added any really usable features since I started using it in 2000. But it worked then, and it still works now. There's something to be said for that.