Field of Science

Caveat emptor: Is academia a pyramid scheme?

That's the question on the blogs this week (see here and here). The question arises because each professor will have some number of students during their career (10-20 is common among the faculty I know), whereas the number of professorships increases very slowly. So the number of PhDs being produced far exceeds the number of academic positions.

As pointed out elsewhere, this neglects the fact that many PhD students have no intention of going into academia. Even so, it's clear the system is set up to produce more graduates who want academic jobs than there are jobs available. Prodigal Academic wonders if that's any different from any profession -- generally, there are more people who want the best jobs than there are best jobs to go around. Unlike PA, who doesn't think there's a problem, Citation Needed thinks most people entering graduate school aren't aware of how unlikely it is that they will get a tenure-track job, partly because it isn't in the schools' interest to mention this.

It depends

I largely agree with these fine posts, but I think they overgeneralize. Not all PhD programs are the same. Different fields vary wildly in terms of number of students produced, the likelihood of getting an industry job, etc., and also in terms of the caliber of the program. For instance, nearly every graduate of the psych program at Harvard goes on to get a tenure-track job. A sizable percentage get tenure-track jobs at the top institutions (Harvard, Yale, UChicago, etc.).

On the other hand, at even highly-respected but lower-ranked schools, getting a tenure-track job seems to be the exception. Here I have less personal experience, but a friend from Harvard who is a post-doc at a well-known state school was surprised to discover that basically none of the students in that program expected to get an academic job. I've heard similar stories from a few other places.

A common problem

This isn't unique to academia. Many people believe lawyers earn a lot of money. Much fuss is made in the New York Times about how the starting salary at a major law firm is around $170,000/year (or was, prior to the Great Recession). While basically anyone who graduates from the top three law schools and wants such a job can get one (some go into lower-paying public-interest or public-service work), at most law schools few if any graduates land such jobs, and most lawyers never earn anywhere near that much. As a first approximation, nobody who graduates from law school lands a big-firm job, just as, as a first approximation, nobody with a PhD gets a tenure-track job at a top research institution.

From my vantage point, the problem is that the media (newspapers, movies, etc.) fixate on the prosperous tip of the iceberg. Newspapers do this because their target audience (rather, the target audience of many of the advertisers in newspapers) is people who themselves graduated from Harvard or Yale and for whom getting a tenure-track job or making partner at a major law firm is a reasonably common achievement. Movies and television shows do it for the same reason everyone on screen is beautiful and rich -- nobody ever said Hollywood was realistic.

This is fine as far as it goes, but it can get people into trouble when they don't realize (a) that the media are presenting the outliers, not the norm, and/or (b) just where their own school/program fits into the grand scheme of things. As Citation Needed points out, it's not necessarily in the interest of less successful schools to warn incoming students that their chances of a job are poor. And, particularly in the realm of undergraduate education, there are certainly schools that cynically accept students knowing that their degree is so worthless that the students will almost certainly default on their loans.

What to do

Obviously the real onus is on the student (caveat emptor) to make sure they know, prior to matriculating, what their chances are of getting the job they want -- and this is true for every degree, not just PhDs. For most schools -- undergraduate and particularly graduate -- you can get data on how graduates fare in the marketplace. This can help determine not only which school to go to but whether it's worth going to school at all (it may not be). But to the extent it is in society's interest that people not waste time and money (often as not, taxpayer money), it is worth considering how, as a society, we can make sure not only that the information is available, but that people know it's available and where to get it.

Do professors teach?

Luis Von Ahn has an excellent discussion on his blog about the teaching/research balance at major research universities. The comments are worth a read as well, especially this one and Von Ahn's response.

Overnight data on lying and bragging

Many thanks to all those who responded to my call for data last week. By midnight, I had enough data to be confident of the results, and the results were beautiful. I would have posted about them here on Friday, but in the lead-up to this presentation, I did so much typing I burned out my wrists and have been taking a much-needed computer break.

The study looked at the interpretation of the word some. Under some conditions, people interpret some as meaning some but not all, but other times it means simply not none. For instance, compare John did some of his homework with If you eat some of your green beans, you can have dessert. Changing some to some-but-not-all doesn't change the meaning of the first sentence, but (for most people) it changes the interpretation of the second.

This phenomenon, called "scalar implicature," is one of the hottest topics in pragmatics -- a subfield of linguistics. The reasons for this are complex -- partly it's because Ira Noveck and his colleagues turned out a series of fascinating studies that captured a lot of people's attention. Partly it's because scalar implicature is a relatively easily-studied test case for several prominent theories. Partly it's other reasons.

Shades of meaning

On most theories, there are a few reasons some might or might not be interpreted as some-but-not-all. The usual intuition is that part of why we assume John did some of his homework means some-but-not-all is that if it were true that John did all of his homework, the speaker would have just said so ... unless, of course, the speaker doesn't know whether John did all his homework, or does know but has a good reason to obfuscate.

At least, that's what many theorists assume, but proving it has been hard. Last year, Bonnefon, Feeney & Villejoubert published a nice study showing that people are less likely to interpret some as some-but-not-all in so-called "face-threatening" contexts -- that is, when the speaker is being polite. For instance, suppose you are a poet and you send 10 poems to a friend to read. Then you ask the friend what she thinks, and she says, "Some of the poems need work." In this case, many people suspect that the friend actually means all of the poems need work, but is being polite.

The study

In this quick study, I wanted to replicate and build on Bonnefon et al's work. The experiment was simple. People read short statements and then answered a question about each one. The first two statement/question pairs were catch trials -- trials with simple questions and obvious answers. The small number of participants who got those wrong were excluded (presumably, they misunderstood the instructions or simply weren't paying attention).

The critical trial was the final one. Here's an example:
Sally: 'John daxed some of the blickets.'
'Daxing' is a neutral activity, neither good nor bad.
Based on what Sally said, how likely is it that John daxed ALL the blickets?
As you can see, the sentence contained unknown words ('daxing', 'blickets'), and participants were presented with a partial definition of one of them (that daxing is a neutral activity). The reason to do this was that it allowed us to manipulate the context carefully.

Each participant was in one of six conditions. Either Sally said "John daxed some...," as in the example above, or she said "I daxed some..." Also, "daxing" was described as either a neutral activity, as in the example above, or a negative activity (something to be ashamed of), or a positive activity (something to be proud of).


As shown in the graph, whether daxing was described as positive, negative or neutral affected whether participants thought all the blickets were daxed (e.g., that some meant at least some rather than some-but-not-all) when Sally was talking about her own actions ("I daxed some of the blickets").

This makes sense: if 'daxing' is something to be proud of, then if Sally daxed all of the blickets, she'd say so. Since she didn't, people assume she daxed only some of them (far right blue bar in the graph). Whereas if daxing is something to be ashamed of, then even if she daxed all of them, she might prefer to say "I daxed some of the blickets" as a way of obfuscating -- it's technically true, but misleading.

Interestingly, this effect didn't show up if Sally was talking about John daxing blickets. Presumably this is because people think the motivation to brag or lie is less strong when talking about a third person.

Also interestingly, people weren't overall more likely to interpret some as meaning some-but-not-all when the sentence was in the first person ("I daxed..."), which I had predicted would be the case. As described above, many theories assume that some should only be interpreted as some-but-not-all if we are sure the speaker knows whether or not all holds. We should be more sure when the speaker is talking about her own actions than someone else's. But I didn't find any such effect. This could be because the theory is wrong, because the effect of using first person vs. third person is very weak, or because participants were already at floor (most people in all 6 conditions thought it was very unlikely that all the blickets were daxed, which can make it hard to detect an effect -- though it didn't prevent us from finding the effect of the meaning of 'daxing').


I presented these data at a workshop on scalar implicature that I organized last Thursday. It was just one experiment of several dozen included in that talk, but it was the one that seemed to have generated the most interest. Thanks once again to all those who participated.

Bonnefon, J., Feeney, A., & Villejoubert, G. (2009). When some is actually all: Scalar inferences in face-threatening contexts. Cognition, 112(2), 249-258. DOI: 10.1016/j.cognition.2009.05.005

Need data by morning

In preparing a talk for a workshop I've organized tomorrow, I realized there was one simple experiment that would tie several pieces together neatly. Unfortunately, I hadn't run it. But I figured, hey, I can get data quick on Amazon Mechanical Turk.

Turk let me down. I don't know why, but today there aren't a lot of fish biting. So I turn to my usual, pre-Turk subject pool: you. The experiment takes 1-2 minutes -- it's really just 3 questions. As an added inducement to get people to run the experiment now, I'll be posting the results later this week or early next week.

Ask a stupid question, get a stupid answer

This morning, Slate is running a bizarre feature on transportation. Cities and transportation are in crisis, we're told, and we need new ideas to solve problems of traffic, efficiency, greenhouse gas emissions, etc., "and so we need new visions for the city."

Alternatively, we could just build what so many cities around the world already have. The article mentions free Wi-Fi in buses (don't we already have that?). A few months ago, I had lunch with a professor from a university in Switzerland. He had recently moved to that university from another university a good 1-2 hours drive away. With his kids in school and a wife with a job, he didn't want to move them. Luckily, there was a high speed train with working Wi-Fi (take that, Bolt Bus!) that only took about an hour each way, so he was commuting in, working on the train both directions.

In Hong Kong, you can check your luggage in at a station in the city center up to 24 hours before your flight. You only have to hop on the express train to the airport just before your flight, so you can enjoy downtown without your luggage on the day you head out of the city. Also downtown in Hong Kong, incidentally, they've built a pedestrian street one floor above the vehicular street, so that pedestrians can walk to and from offices and shops without getting in the way of the traffic -- safer and more convenient.

Anyone who has spent much time traveling abroad knows that the US transportation system is a good half century (or more) behind the more developed parts of the world. Even where our transportation is technologically on par with another country's (e.g., of places I've lived, Spain and Russia), theirs often works better, runs faster and has more geographic coverage. There are many ideas out there -- most of them not new -- that work very well in other countries. So any real discussion would be not about finding clever new ideas, but about figuring out how to implement existing ones in the US.


This brings up a different question: what happened to Slate? Some years ago, it was the first place I turned for news, but the quality has steadily declined. Some of it is just attrition (e.g., the sublime and irreplaceable movie critic David Edelstein was "replaced" by Dana Stevens). The science coverage has been turned over largely to William Saletan, who -- bless his heart -- tries very hard but simply doesn't know enough about science to understand what he's writing about, leading to articles that are either shallow or just wrong (see here and here). Not that shallow science writing is a problem specific to Slate.

Slate's travel writing used to be incredible, written by interesting folks with deep, deep knowledge of the places they were visiting. So several years ago, when I pitched a piece to Slate about the Trans-Siberian railway, I assumed I never got a response because, despite a couple years in Russia, I wasn't up to their level of expertise. Recently, though, Slate's ad critic (generally one of my favorite writers) posted an article about his trip on the Trans-Siberian, written with detailed horror of life in Russia (which he can only observe from a distance, since he's afraid to ride in platskart, which he describes as a "P.O.W. camp" but which is more accurately called "a party which begins in Moscow and ends 7 days later in Vladivostok"). Though, in Stevenson's defense, the article wasn't nearly so bad or so clueless as Daniel Gross's description of his visit to Japan, during which he discovered (wow!) that the Japanese really like things written in English.

Seriously, Slate -- I expect better.

Cognitive Science, March 2010

In my continuing series on the past year in Cognitive Science: March, 2010.

Once again, the discussion of some of these papers will be technical.


Baroni, Murphy, Barbu, Poesio. Strudel: A corpus-based semantic model based on properties and types.

You are who your friends are. A number of computational linguists have been interested in just how much you can learn about a word based on the other words it tends to appear with. Interestingly, if you take a word (e.g., dog) and look at the words it tends to co-occur with (e.g., cat), those other words often describe properties or synonyms of the target word. A number of researchers have suggested that this might be part of how we learn the meanings of words.
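The basic co-occurrence idea can be sketched in a few lines of Python. The toy corpus and window size here are my own illustrative assumptions, not anything from the paper:

```python
from collections import Counter

def cooccurrences(corpus, target, window=3):
    """Count how often each word appears within `window` positions of `target`."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                counts.update(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return counts

corpus = [
    "the dog chased the cat",
    "the dog wore a leash",
    "a cat and a dog played",
]
counts = cooccurrences(corpus, "dog")
# 'cat' and 'leash' both surface as neighbors of 'dog', even in this tiny corpus
```

On a real corpus, the high-frequency neighbors of dog (once function words like "the" are filtered out) are largely its semantic associates, which is what the models in this literature exploit.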

Baroni et al. are sympathetic to that literature, but they point out that such models learn only that dog and cat are somehow related. So they don't actually tell you what the word dog means. Moreover, dog is also related to leash, but not in the same way it's related to cat, which is something those models ignore. Their paper covers a new model, Strudel, which attempts to close some of the gap.

The model also keeps track of what words co-occur with a target word. It additionally tracks how those words are related (e.g., dogs and cats is considered to be different from dogs chase cats). The more different types of constructions that connect the target word and a given "friend", the more important that friend is thought to be.

This model ends up doing a better job than some older models at finding semantic associates of target words. It also can cluster different words (e.g., apple, banana, dog, cat) into categories (fruit, animal) with some success. Moreover, with some additional statistical tricks, they were able to clump the various "friends" into different groups based on the type of constructions they appear in. Properties, for instance, often appear in constructions involving X has Y. Conceptually-similar words appear in other types of constructions (e.g., X is like Y).

This presents some clear advantages over previous attempts, but it has some of the same limitations as well. The model discovers different types of features of a target word (properties, conceptually-similar words, etc.), but the label "property" has to be assigned by the researchers. The model doesn't know that has four legs is a property of dog and that like to bark is not -- it only knows that the two facts are of different sorts.

Perruchet & Tillman. Exploiting multiple sources of information in learning an artificial language: human data and modeling. 

Over the last 15 years, a number of researchers have looked at statistically-based word segmentation. After listening to a few minutes of speech in an unknown language, people can guess which sequences of phonemes are more likely to be words in that language.
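The standard statistic in this literature (the Saffran-style segmentation studies; this sketch is mine, not the authors' model) is the transitional probability between adjacent syllables: transitions within a word are high-probability, while low-probability transitions suggest word boundaries.

```python
from collections import Counter

def transitional_probs(syllables):
    """P(next syllable | current syllable) for each adjacent pair."""
    pairs = Counter(zip(syllables, syllables[1:]))
    firsts = Counter(syllables[:-1])
    return {(a, b): n / firsts[a] for (a, b), n in pairs.items()}

# A toy stream built by concatenating the "words" bidaku and golatu:
stream = "bi da ku go la tu bi da ku bi da ku go la tu".split()
tps = transitional_probs(stream)
# Within-word transitions (bi->da, da->ku, go->la, la->tu) have TP 1.0;
# transitions spanning word boundaries (ku->go, ku->bi, tu->bi) are lower.
```

A learner positing boundaries at the dips in transitional probability would recover bidaku and golatu as the words of this toy language.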

It turns out that some sequences of phonemes just sound more like words, independent of any learning. The authors check to see whether that matters. Participants were assigned to learn one of two languages: a language in which half of the words a priori sounded like words, and a language in which half the words a priori sounded particularly not like words. Not only did participants do better in the first condition on the words that sound like words, they did better on the "normal" words, too -- even though those were the same as the "normal" words in the second condition. The authors argue that this is consistent with the idea that already knowing some words helps you identify other words.

They also find that the fact that some words a priori sound more like they are words is easy to implement in their previously-proposed PARSER model, which then produces data somewhat like the human data from the experiment.

Gildea & Temperley. Do grammars minimize dependency length?

Words in a sentence are dependent on other words. In secondary school, we usually used the term "modify" rather than "depend on." So in The angry butcher yelled at the troublesome child, "the angry butcher" and "at the troublesome child" both modify/depend on yelled. Similarly, "the angry" modifies/depends on butcher. Etc.

This paper explores the hypothesis that people try to keep words close to the words they depend on. The authors worked through the Wall Street Journal corpus and calculated both the actual dependency length of each sentence (for each word in the sentence, count all the words between that word and the word it depends on, then sum) and also the shortest possible dependency length. They found that actual dependency lengths were much closer to the optimum in both the WSJ corpus and the Brown corpus than would be expected by chance. However, when they looked at two corpora in German, dependency lengths were shorter than would be expected at random, but the effect was noticeably smaller. The authors speculate this is because German has relatively free word order, because German has some verb-final constructions, or for other reasons, or some combination of these.
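The dependency-length calculation itself is simple; here's a sketch (the dependency analysis of the example sentence is my own rough one, not taken from the paper):

```python
def total_dependency_length(heads):
    """heads[i] is the index of the word that word i depends on (None = root).
    A dependency's length = the number of words strictly between a word
    and its head; total length = the sum over all dependencies."""
    total = 0
    for i, h in enumerate(heads):
        if h is not None:
            total += abs(i - h) - 1
    return total

# "The angry butcher yelled at the troublesome child"
#   0    1      2       3    4   5       6       7
# the->butcher, angry->butcher, butcher->yelled, at->yelled,
# child->at, the->child, troublesome->child
heads = [2, 2, 3, None, 3, 7, 7, 4]
print(total_dependency_length(heads))  # → 4
```

Finding the word order that minimizes this sum for a given dependency tree is what gives the "shortest possible" baseline the authors compare against.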

Mueller, Bahlmann & Friederici. Learnability of embedded syntactic structures depends on prosodic cues. 

Center-embedded structures are hard to process and are also difficult to teach people in artificial grammar learning studies that don't provide feedback. The authors exposed participants to A1A2B1B2 structures with or without prosodic cues. Participants largely failed to learn the grammar without prosodic cues. However, if a falling contour separated each 4-syllable phrase (A1A2B1B2) from the next, participants learned much more. They did even better if a pause was added between the 4-syllable phrases in addition to the falling contour. Adding a further pause between the As and Bs (in order to accentuate the difference between them) did not provide any additional benefit.

Cognitive Science, January 2010

In my continuing series on the past year in Cognitive Science: January, 2010.

Once again, the discussion of some of these papers will be technical.


Lee & Sarnecka. A model of knower-level behavior in number concept development.

Children learn the full meanings of number words slowly, one word at a time. The authors present a Bayesian model of number word acquisition -- or, more specifically, of performance on the famous Give-a-Number task. The model assumes that each child has a certain baseline preference to give certain numbers of items more than others. It also assumes that the child knows certain number words and not others. If the child, say, knows one and two, the child will give the right number of items when asked for one or two of something, and will avoid giving those quantities when asked about a number word she doesn't know (e.g., three), even though she doesn't know what that other word means.
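A non-Bayesian caricature of that behavioral assumption looks like this (the preference numbers and the "two-knower" example are invented for illustration, not taken from the paper):

```python
import random

def give_n(request, known_words, base_prefs, rng=random):
    """Simplified knower-level behavior: give the requested number if the
    word is known; otherwise sample from baseline preferences, avoiding
    quantities that correspond to number words the child already knows."""
    if request in known_words:
        return request
    candidates = {n: p for n, p in base_prefs.items() if n not in known_words}
    total = sum(candidates.values())
    r = rng.random() * total
    for n, p in candidates.items():
        r -= p
        if r <= 0:
            return n
    return max(candidates)

# A hypothetical "two-knower" who prefers grabbing a handful or everything:
base_prefs = {1: 0.05, 2: 0.05, 3: 0.3, 4: 0.2, 5: 0.1, 10: 0.3}
print(give_n(2, known_words={1, 2}, base_prefs=base_prefs))  # → 2
```

The actual model works in the other direction: given a child's responses, it infers which number words the child knows and what her baseline preferences are.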

The model was then fed data on the actual performance of a set of real children, and it estimated which words each child knew and what each child's baseline preferences were. The model learned that children prefer either to give a handful of items or to give all the available items, which accords well with what has been seen over the years. It also seemed to do a reasonable job on several other measures.

None of this was necessarily surprising, in the sense that the model modeled well-known data correctly. That said, psychological theories are often complex. Theorists (often) state them in loose terms and then make claims about what predictions the theory makes about behavior in different tasks. Without specifying the theory in a formal model, though, it's not always clear that those are in fact the predictions the theory makes. This paper represents, among other things, an attempt to take a well-known theory and show that it does in fact account for the observed data. To the extent it gets things wrong, the model presents a starting point for further refinement.

There has been a movement in some quarters to make and test more explicit models. This is undoubtedly a good thing. The question is whether there are many behaviors that we understand sufficiently well to produce reasonable models ... that aren't so simplistic that the formal model itself doesn't really tell us anything we don't know. That seems to be a point one could argue. One thing I like about this particular model is that the authors attempt to capture fine-grained aspects of individual subjects' performances, which is something we ultimately want to be able to do.

Estigarribia. Facilitation by variation: Right-to-left learning of English yes/no questions

The syntax of questions has played a key role in the development of modern linguistics. In particular, a great deal of ink has been spilled about auxiliary inversion. Compare That is a soccer ball with Is that a soccer ball. Well-known theories of English posit that the auxiliary is is generated in its normal declarative position (that is...) and must be moved to the front of the sentence to form a question (is that...).

Estigarribia argues that many theories have assumed parents model auxiliary-inverted questions for their children. A (smallish) corpus analysis reveals that in fact ~20% of parental yes/no questions with auxiliaries are non-auxiliary-initial (that is a soccer ball?). Of all yes/no questions, canonical auxiliary-first questions make up less than half, with sentence fragments being quite common (soccer ball?).

Looking at a corpus of six young children, Estigarribia finds that the children begin by producing the simplest, fragment questions (a soccer ball?). Next, they begin producing what Estigarribia calls subject-predicate questions (that a soccer ball?). Full-on auxiliary-inverted questions appear relatively late (is that a soccer ball?). Estigarribia finds this consistent with a learning mechanism in which children learn the ends of sentences better than the beginnings, similar to the MOSAIC model.

One limitation is that children have difficulty producing long sentences, and the data are consistent with children simply producing shorter sentences first and progressively longer sentences later. Estigarribia shows that he finds the same order of acquisition even in children who have somewhat longer MLUs at the beginning of the study (that is, who produce longer sentences), but one can still worry. The fact that children selectively produce the ends of sentences rather than the beginnings could be due to the fact that the end of a question (a soccer ball?) is a lot more informative than the beginning (is that a?).

It might be somewhat more impressive if children produced non-inverted questions (that is a soccer ball?) before inverted questions, but Estigarribia does not analyze those types of sentences. What I find most compelling about this study is in fact the adult data. As Estigarribia points out, we don't want to think of language acquisition as a process in which children ultimately eliminate non-canonical questions (that is, those without inverted auxiliaries), since in fact adults produce many such sentences.

Nakatani & Gibson. An on-line study of Japanese nesting complexity.

Mary met the senator who attacked the reporter who ignored the president is easier to understand than The reporter who the senator who Mary met attacked ignored the president, even though the latter sentence is grammatical (of sorts) and means the same thing. Why this is the case has been a focus of study in psycholinguistics for many years.

The authors lay out a couple of hypotheses. On one, the second sentence is harder to interpret because the relevant nouns are far from their verbs, making it harder to integrate ignored with the reporter. On other hypotheses, all the nested relative clauses (who...) generate expectations about what verbs are coming up. The more expectations, the more has to be kept in memory, and the harder the sentence is.

These hypotheses (and a similar surprisal hypothesis) are tested using the self-paced reading methodology in Japanese, a language with a few nice properties like relatively free word order, which makes controlling the stimuli slightly easier than it is in English. The results ultimately support the expectancy hypotheses over the retrieval hypotheses.

One of the interesting things about this paper is that one well-known retrieval hypothesis is actually Gibson's. So is one of the expectancy hypotheses, which he developed after he (apparently) decided the original theory was probably wrong. The willingness to abandon a cherished theoretical position in the face of new evidence is a trait more prized than seen in academia, and it's something to be admired -- and something very typical of Gibson.

Mirman, Strauss, Dixon & Magnuson. Effect of representational distance between meanings on recognition of ambiguous spoken words.

The authors looked at word recognition using two different paradigms (lexical decision and eye-tracking). All the words could be nouns. Some had only a strong noun meaning (acorn, lobster). Some were homophones with two common noun meanings (chest -- chest of drawers or a person's chest), and some were homophones with a common noun meaning and a common verb meaning (bark -- the dog barked or the tree's bark).

Participants were fastest to interpret the unambiguous words (acorn, lobster), next fastest at recognizing the noun-verb words (bark), and slowest at the noun-noun words (chest). The authors take this in the context of previous research showing that words with two closely related meanings are faster to interpret than words with two very different meanings. In this study, the semantic relatedness of the two meanings of the noun-verb homophones was no closer than that of the noun-noun homophones. So the authors suggest that syntactic distance matters as well -- two meanings of the same syntactic type (e.g., noun) interfere with one another more than two meanings of different types (e.g., noun and verb).

An alternative explanation of these data is one of priming. 2/3 of the stimuli in this study were unambiguously nouns. This may have primed the noun meanings of the noun-verb homophones and helped automatically suppress the verb meaning. Thus, participants processed the noun-verb homophones more like unambiguous, non-homophonic words. The way to test this, of course, would be to run a similar study with unambiguous verbs, verb-verb homophones, and the same noun-verb homophones.

Video games, rotted brains, and book reviews

Jonah Lehrer has an extended discussion of his review of The Shallows, a new book claiming that the Internet is bad for our brains. Lehrer is skeptical, pointing out that worries about new technology are as old as time (Socrates thought books would make people stupid, too). I am skeptical as well, but I'm also skeptical of (parts of) Lehrer's arguments. The crux of the argument is as follows:
I think it's far too soon to be drawing firm conclusions about the negative effects of the web. Furthermore, as I note in the review, the majority of experiments that have looked directly at the effects of the internet, video games and online social networking have actually found significant cognitive benefits.
That, so far as it goes, is reasonable. My objection is to some of the evidence given:
A 2009 study by neuroscientists at the University of California, Los Angeles, found that performing Google searches led to increased activity in the dorsolateral prefrontal cortex, at least when compared with reading a "book-like text." Interestingly, this brain area underlies the precise talents, like selective attention and deliberate analysis, that Carr says have vanished in the age of the Internet. Google, in other words, isn't making us stupid -- it's exercising the very mental muscles that make us smarter.
This cuts several ways. Extra activation of a region in an fMRI experiment is interpreted different ways by different researchers. It could be evidence of extra specialization ... or evidence that the brain network in question is damaged and so needs to work extra hard. Lehrer is at least partially aware of this problem:
Now these studies are all imperfect and provisional. (For one thing, it's not easy to play with Google while lying still in a brain scanner.)
This is the line I have a particular issue with. If the question is whether extra Internet use makes people stupid, why on Earth would anyone need to use a $600/hr MRI machine to answer it? We have loads of cheap psychometric tests of cognition. All methodologies have their place, and a behavioral question is most easily answered with behavioral methods. MRI is far more limited.

Lehrer's discussion of the 2009 study above underscores this point: the interpretation of the brain images rests on our understanding of what behaviors the dorsolateral prefrontal cortex has shown up with in other studies. The logic is: A correlates with B correlates with C, thus A correlates with C. This is, as any logician will tell you, an unsound conclusion. When you add that using MRI can cost ten thousand dollars for a single experiment, it's a very expensive middleman!

Which isn't to say that MRI is useless or such studies are a waste of time. MRI is particularly helpful in understanding how the brain gives rise to various types of behavior, and it's sometimes helpful for analyzing behavior that we can't directly see. Neither applies here. If the Internet makes us dumb in a way only detectable with super-modern equipment, I think we can breathe easy and ignore the problem. What we care about is whether people in fact are more easily distracted, have worse memory, etc. That doesn't require any special technology -- even Socrates could run that experiment.

Lehrer does discuss a number of good behavioral experiments. Despite my peevishness over the "Google in the scanner" line, the review is more than worth reading.

Cognitive Science, April 2010

This week I was tasked by the lab with checking the last year's worth (or so) of issues of Cognitive Science to see what papers might be of interest to folks in the lab (other people are covering other journals). There are of course many good papers not on the list below; I focused largely on the psycholinguistics articles. There are a lot of articles, so I'm going to break the issues up into separate posts.

Fair warning: my discussion of these articles is brief and so somewhat technical.

April 2010

Szymanik & Zajenkowski. Comprehension of simple quantifiers: empirical evaluation of a computational model.

Different quantifiers seem to require different amounts of computation. Formal logic suggests that checking the truth of Some of the cars are blue simply requires finding at least one blue car (or failing to find any). Most of the cars are blue probably requires something like working out what half the number of cars would be and then checking whether more than that many are blue. That's harder.
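To make the contrast concrete, here is a toy sketch (my illustration, not the authors' actual model) of the two verification procedures. The "some" check can stop at the first blue car; the "most" check has to count everything first.

```python
def some_blue(cars):
    # "Some of the cars are blue": succeed at the first blue car found.
    return any(color == "blue" for color in cars)

def most_blue(cars):
    # "Most of the cars are blue": count every blue car, then compare
    # against half the total -- more work than a single existence check.
    return sum(color == "blue" for color in cars) > len(cars) / 2

cars = ["blue", "red", "blue", "green", "blue"]
print(some_blue(cars))  # True
print(most_blue(cars))  # True (3 of 5)
```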

S&Z had people evaluate the truth value of sentences like those in the examples. People were slower for the "harder" quantifiers. This suggests people are actually running through something like the formal math theorists use to describe quantifiers.

The only odd thing about the results is that a ton of research (e.g., Bott & Noveck) has suggested that evaluating sentences with some can be very slow, presumably because doing so involves a scalar implicature, whereas in this study some was one of the fastest quantifiers. This suggests either that for some reason people weren't computing implicatures in this study, or that the other quantifiers were really slow (or that Polish, the language they used, is just different).

Matthews & Bannard. Children's production of unfamiliar word sequences is predicted by positional variability and latent classes in a large sample of child-directed speech.

Two- and three-year-olds were asked to repeat back four-word sequences. Several things were varied, such as how predictable the final word was given the first three in the sequence (e.g., jelly probably commonly appears after peanut butter and ... ) and whether the words that do commonly appear as the fourth word in such a sequence are semantically related (e.g., pretty much everything following I drive a ... is going to be some kind of vehicle).

Importantly, in the actual sequences presented to the children, the final word was one that hardly ever appears in that sequence (e.g., I hate green boxes). Kids were better at repeating the sequences when (1) entropy on the 4th word was high (e.g., many different words commonly follow the first three in the sequence, as in I drive a rather than peanut butter and), and when most words that typically appear in that 4th position are semantically related (I drive a truck/car/bus/Toyota/Ford). 
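For concreteness, here's a hypothetical sketch of the entropy measure, using invented corpus counts (not the authors' data). A frame like peanut butter and ... , where one continuation dominates, has low entropy; a frame like I drive a ... , where many continuations are plausible, has high entropy.

```python
import math
from collections import Counter

def entropy(counts):
    # Shannon entropy (in bits) of the distribution over 4th words.
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Invented counts for illustration only.
after_peanut_butter_and = Counter({"jelly": 95, "toast": 3, "bananas": 2})
after_i_drive_a = Counter({"car": 30, "truck": 25, "bus": 20, "van": 15, "Toyota": 10})

print(entropy(after_peanut_butter_and))  # low: the 4th word is nearly fixed
print(entropy(after_i_drive_a))          # high: many plausible continuations
```

On the authors' account, kids found it easier to substitute a novel 4th word into the high-entropy frame.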

The authors (tentatively) suggest that such results are predicted by theories on which young children's grammars are very item-specific, involving many narrow sentence templates (I drive a + Vehicle), rather than theories on which young children's grammars involve broad abstract categories (e.g., Noun + Verb).

However, my labmate and collaborator Timothy O'Donnell has been working on a computational model that involves abstract grammatical categories but nonetheless stores high-frequency constructions (something allowed but not specifically explained by many grammatical theories, such as Pinker's Words & Rules theory, that are the traditional alternatives to item-based theories). One consequence of his model is that if a particular construction appears very frequently with little variation (peanut butter and jelly; 653,000 hits on Google), the model finds slight alternatives to that construction (peanut butter and toast; 120,000 hits on Google) extremely unlikely.

Casasanto, Fotakopoulou & Boroditsky. Space and time in the child's mind: Evidence for a cross-dimensional asymmetry.

4-5 year-olds and 9-10 year-olds watched movies of two animals traveling along parallel paths for different distances or durations and judged which one went longer temporally or spatially. As has previously been shown in adults, the children's judgments of temporal length were affected by spatial length (e.g., if animal A went farther than B but in a shorter amount of time, children sometimes erroneously said A took more time) more than their judgments of spatial length were affected by temporal length (that is, if animal A went farther than B but in less time, children were not as likely to be confused about which animal went the farthest).

One obvious confound, which the authors consider, is that the stimuli stayed on the screen until the children responded, which meant that information about physical distance was available at response time, but children had to remember the duration information. The authors point to a previous study with adults that controlled for this confound and got the same results, but they have not yet run that version with children (since I haven't read the study they refer to and the method isn't described, I can't really comment).

These results are taken as evidence for a theory on which our notion of time depends on our representations of space, but not vice versa.

Fay, Garrod, Roberts & Swoboda. The interactive evolution of human communication systems.

People played what amounted to repeated games of Pictionary. The games were played over and over with the same words, and the question was how the pictorial representations changed over repeated games. Participants were assigned either to pairs or to communities of 8. The pairs played against each other only. In the communities, games were still played in pairs, but each person played first against one member of the community, then against another, and so on until they had played with everyone.

The people in the pairs condition rapidly created abstract visual symbols for the different target words, as has happened in previous research. What was interesting was that in the communities condition, participants also created similarly abstract symbols that were rapidly shared throughout the community, such that participants who had never played against one another could communicate with abstract symbols that they had each learned from others.

The study is meant to be a model of vocabulary development and dispersal, and it certainly made for a good read (I've been a fan of Garrod's work in other contexts as well). I don't know much about theories of language evolution, so it's difficult for me to say what the theoretical impact of this work is. One obvious question is whether it matters that people in a single community know they are in a single community. That is, did they continue to use the abstract symbols they'd learned because they reasonably thought the other person might know them, or was it simply habit after having used those symbols many times?

Dothraki -- a response

The Language Creation Society has officially responded to my open letter requesting that they embed some useful experiments in Dothraki, a language they are creating for a new HBO show. You can read the response at Scientific American.

This formal response follows a series of informal emails between myself and both David Peterson (the author of the response) and Sai Emrys (the LCS president). It was a fun conversation, and while they're not taking me up on my suggestion -- at least not for this language -- I did learn a great deal from them, some of which makes it into their letter, which I recommend reading.

Overheard: The Prodigal Academic

I recently started reading The Prodigal Academic, a blog by a professor recently returned to academia after 7 years away. Lately she's written a number of useful posts about academia as a career. See these posts on spousal hiring, search committee dynamics, interviewing for tenure-track jobs, and women in science.

Lie detection: Part 2

I wrote recently about whether fMRI should be used for lie detection in court. US Magistrate Judge Tu Pham says "no". Science reports:
But while Judge Pham agreed that the technique had been subject to testing and peer review, it flunked on the other two points suggested by the Supreme Court to weigh cases like this one: the test of proven accuracy and general acceptance by scientists.
What I find interesting about this argument, as noted in my previous post, is that it's not clear that commonly-accepted "evidence" passes those tests: fingerprinting and eyewitness testimony are two good examples.