Living in an Imperfect World: Psycholinguistics Edition

You, sir, have tasted two whole worms. You have hissed all my mystery lectures and been caught fighting a liar in the quad. You will leave Oxford by the next town drain. -- Reverend Spooner.

There is an old tension in psycholinguistic (or linguistic) theory, which boils down to two ways of looking at language comprehension. When somebody says something to you, what do you do with that linguistic input? Is your goal to decode the sentence and figure out what the sentence means, or do you try to figure out what message the speaker intended to convey? The tension comes in because presumably we do a bit of both.

Suppose a young child says, "Look! A doggy!" while pointing to a cat. Most people will agree that technically, the child's sentence is about a dog. But most of us can still work out that the child probably meant to talk about the cat; she used the word doggy either due to lack of vocabulary, confusion about the distinction between dogs and cats, or a simple speech error. Similarly, if your friend says at 7pm, "Let's go have lunch," technically your friend is suggesting having the midday meal, but you probably charitably assume he is just very hungry and so made a mistake in saying "lunch" instead of "dinner".

For a variety of reasons, linguistics and psycholinguistics have focused mostly on decoding sentences rather than on recovering intended meanings. This is important work on an important problem, but, as we saw above, it's only half the story. PNAS just published a paper by Gibson, Bergen, and Piantadosi that addresses the second half. Gibson and Bergen are at M.I.T., and Piantadosi recently graduated from M.I.T. Like much of the work coming out of Eastern Cambridge lately, the paper takes a Bayesian perspective on the problem: the probability that the speaker intended to convey a particular message m, given that they said sentence s, is proportional to the prior probability that the speaker would want to convey m times the probability that they would say sentence s when intending to convey m.
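
Written out as an equation, that sentence is just Bayes' rule. The symbols m and s are the ones used above; the sum over alternative messages m' in the denominator is the standard normalizing term, added here for completeness rather than taken from the paper.

```latex
P(m \mid s) \;=\; \frac{P(m)\, P(s \mid m)}{\sum_{m'} P(m')\, P(s \mid m')} \;\propto\; P(m)\, P(s \mid m)
```

Here P(m) is the prior (how plausible the message is) and P(s | m) is the likelihood (how likely the speaker is to produce exactly s, errors included, when they mean m).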

This ends up accounting for the phenomenon brought up in Paragraph #2: If the literal meaning of the speaker's sentence isn't very likely to be what they intended to say ("Let's go have lunch", spoken at 7pm), but there is some other sentence that contains roughly the same words but has a more plausible meaning ("Let's go have dinner"), then you should infer that the intended message is the latter one and that the speaker made an error.

So far, this is not much more than a restatement of our intuitive theory in Paragraph #2. But Gibson, Bergen and Piantadosi point out that a few non-trivial predictions come out of this. One is that you should assume that deletions (dropping a word) are more likely than insertions (adding a word). The reason is that there are only so many words that can be dropped from a particular sentence, so even if the probability of accidentally dropping a word is low, the probability of accidentally dropping a particular word isn't all that much lower. So if the intended sentence was "The ball was kicked by the girl", and the speaker accidentally dropped two words, the probability that the speaker happened to drop "was" and "by", resulting in the grammatical but unlikely sentence "The ball kicked the girl", is not so bad. However, suppose the intended sentence was "The girl kicked the ball". What are the chances that the speaker accidentally adds "was" and "by", resulting in the grammatical but unlikely sentence "The girl was kicked by the ball"? Pretty much zilch, since English contains hundreds of thousands of words: There is pretty much no chance that those particular words would be inserted in those particular locations.
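
To get a feel for the size of that asymmetry, here is a back-of-the-envelope sketch in Python. It is a toy noise model of my own, not the model from the paper: the error rates and the vocabulary size are made-up numbers, each word is assumed to be dropped independently, and an inserted word is assumed to be drawn uniformly from the vocabulary.

```python
# Toy noise model. Every number below is an illustrative assumption.
P_DELETE = 0.01        # chance that any given intended word is dropped
P_INSERT = 0.01        # chance that a stray word is added at any given gap
VOCAB_SIZE = 100_000   # an inserted word is drawn uniformly from the vocabulary


def p_specific_deletions(n_words, n_dropped):
    """Probability that exactly the n_dropped target words vanish and the rest survive."""
    return P_DELETE ** n_dropped * (1 - P_DELETE) ** (n_words - n_dropped)


def p_specific_insertions(n_gaps, n_inserted):
    """Probability that the target words land in the target gaps and no other gap is filled."""
    p_right_word = P_INSERT / VOCAB_SIZE  # an insertion happens AND it is the right word
    return p_right_word ** n_inserted * (1 - P_INSERT) ** (n_gaps - n_inserted)


# "The ball was kicked by the girl" (7 words) -> drop "was" and "by"
deletion = p_specific_deletions(n_words=7, n_dropped=2)

# "The girl kicked the ball" (5 words, 6 gaps) -> insert "was" and "by"
insertion = p_specific_insertions(n_gaps=6, n_inserted=2)

print(f"two specific deletions:  {deletion:.2e}")
print(f"two specific insertions: {insertion:.2e}")
print(f"ratio: {deletion / insertion:.2e}")
```

Even with identical per-word error rates, the insertion route is penalized twice by the 1-in-VOCAB_SIZE chance of picking the right word, which is where the many-orders-of-magnitude gap comes from.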

The authors present some data to back up these and some other predictions. For instance, if listeners are given reason to suspect that the speaker makes lots of speech errors, they are then even more likely to "correct" an unlikely sentence to a similar sentence with a more likely meaning.
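
The error-rate prediction falls out of the same arithmetic. The sketch below compares just two candidate messages, the literal one and a more plausible near neighbor. The prior and the error probabilities are invented for illustration, and the function name is hypothetical; none of these values come from the paper.

```python
def posterior_nonliteral(prior_literal, p_error, p_this_error):
    """Posterior that the speaker meant the plausible alternative, not the literal message.

    prior_literal: prior probability of the literal (implausible) message
    p_error:       probability that the speaker garbles a sentence at all
    p_this_error:  given an error, probability it is exactly the one that
                   turns the alternative sentence into the one we heard
    """
    likelihood_literal = 1 - p_error          # said what they meant, no error
    likelihood_alt = p_error * p_this_error   # meant the alternative, slipped
    weight_literal = prior_literal * likelihood_literal
    weight_alt = (1 - prior_literal) * likelihood_alt
    return weight_alt / (weight_alt + weight_literal)


# Same sentence, same prior, two speakers who differ only in how error-prone they are.
careful = posterior_nonliteral(prior_literal=0.05, p_error=0.01, p_this_error=0.1)
sloppy = posterior_nonliteral(prior_literal=0.05, p_error=0.20, p_this_error=0.1)

print(f"careful speaker: {careful:.3f}")
print(f"sloppy speaker:  {sloppy:.3f}")
```

Raising p_error shifts probability toward the "they misspoke" reading, which is the direction of the effect described above.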

There's plenty more work to be done. There are plenty of speech errors out there besides insertions and deletions, such as substitutions and the various phonological errors that made Rev. Spooner famous (see quote above). Work on phonological errors shows that speakers are more likely to make errors that result in real words (train->drain) than non-words (train->frain). Likely, the same is true of other types of errors. Building a full theory that incorporates all the complexity of speech processes is a ways off yet. But the work just published is an important proof of concept.

Gibson, E., Bergen, L., and Piantadosi, S. (2013). Rational integration of noisy evidence and prior semantic expectations in sentence interpretation. Proceedings of the National Academy of Sciences. DOI: 10.1073/pnas.1216438110

