
Cognitive Science, March 2010

In my continuing series on the past year in Cognitive Science: March, 2010.

Once again, the discussion of some of these papers will be technical.

March


Baroni, Murphy, Barbu, Poesio. Strudel: A corpus-based semantic model based on properties and types.

You are who your friends are. A number of computational linguists have been interested in just how much you can learn about a word based on the other words it tends to appear with. Interestingly, if you take a word (e.g., dog) and look at the words it tends to co-occur with (e.g., cat), those other words often describe properties or synonyms of the target word. A number of researchers have suggested that this might be part of how we learn the meanings of words.
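
To make the distributional idea concrete, here is a minimal sketch of the kind of window-based co-occurrence counting these models start from. The toy corpus, window size, and raw counts are all placeholders for illustration; real models use large corpora and more sophisticated weighting.

    from collections import defaultdict

    def cooccurrence_counts(sentences, window=2):
        """Count how often each word appears within `window` words of every other word."""
        counts = defaultdict(lambda: defaultdict(int))
        for sentence in sentences:
            words = sentence.lower().split()
            for i, target in enumerate(words):
                for j in range(max(0, i - window), min(len(words), i + window + 1)):
                    if j != i:
                        counts[target][words[j]] += 1
        return counts

    # Toy corpus: "dog" ends up co-occurring with "cat" and "leash", among others.
    corpus = ["the dog chased the cat", "the dog tugged the leash"]
    print(dict(cooccurrence_counts(corpus, window=3)["dog"]))
    # {'the': 4, 'chased': 1, 'cat': 1, 'tugged': 1, 'leash': 1}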

Baroni et al. are sympathetic to that literature, but they point out that such models only learn that dog and cat are somehow related. So they don't actually tell you what the word dog means. Moreover, dog is also related to leash, but not in the same way it's related to cat, which is a distinction those models ignore. Their paper presents a new model, Strudel, which attempts to close some of that gap.

Like earlier models, Strudel keeps track of which words co-occur with a target word. It additionally tracks how those words are connected (e.g., dogs and cats is treated as different from dogs chase cats). The more different types of constructions that connect the target word and a given "friend", the more important that friend is taken to be.
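
Here is a rough sketch of that scoring intuition, assuming the connecting constructions have already been extracted. The pattern labels and the triples below are invented for illustration; the real model mines them from a large parsed corpus with a carefully chosen pattern inventory.

    from collections import defaultdict

    def pattern_sets(triples):
        """Record which connecting patterns have been seen for each (target, friend) pair.
        `triples` is an iterable of (target, friend, pattern), e.g. ("dog", "cat", "X and Y")."""
        seen = defaultdict(set)
        for target, friend, pattern in triples:
            seen[(target, friend)].add(pattern)
        return seen

    def friend_score(seen, target, friend):
        # The Strudel-like intuition: a friend linked to the target by many
        # *different* construction types is a more important friend.
        return len(seen[(target, friend)])

    triples = [
        ("dog", "cat", "X and Y"),
        ("dog", "cat", "X chase Y"),
        ("dog", "cat", "X is like Y"),
        ("dog", "leash", "X on a Y"),
    ]
    seen = pattern_sets(triples)
    print(friend_score(seen, "dog", "cat"), friend_score(seen, "dog", "leash"))  # 3 1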

This model ends up doing a better job than some older models at finding semantic associates of target words. It can also cluster different words (e.g., apple, banana, dog, cat) into categories (fruit, animal) with some success. Moreover, with some additional statistical tricks, the authors were able to clump the various "friends" into different groups based on the types of constructions they appear in. Properties, for instance, often appear in constructions involving X has Y, whereas conceptually similar words appear in other types of constructions (e.g., X is like Y).
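
The word-clustering result rests on the familiar move of comparing words by the similarity of their "friend" vectors. A toy illustration with invented vectors follows (the paper's own clustering setup is more elaborate):

    import math

    def cosine(u, v):
        """Cosine similarity between two sparse vectors (dicts mapping friend -> weight)."""
        dot = sum(w * v[k] for k, w in u.items() if k in v)
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v)

    # Invented friend vectors; in the real model these come from pattern counts.
    vectors = {
        "apple":  {"sweet": 3, "eat": 5, "tree": 2},
        "banana": {"sweet": 2, "eat": 4, "peel": 3},
        "dog":    {"bark": 4, "leash": 2, "pet": 3},
    }
    print(cosine(vectors["apple"], vectors["banana"]))  # high: likely same cluster
    print(cosine(vectors["apple"], vectors["dog"]))     # 0.0: no shared friends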

This presents some clear advantages over previous attempts, but it has some of the same limitations as well. The model discovers different types of features of a target word (properties, conceptually-similar words, etc.), but the label "property" has to be assigned by the researchers. The model doesn't know that has four legs is a property of dog and that like to bark is not -- it only knows that the two facts are of different sorts.

Perruchet & Tillman. Exploiting multiple sources of information in learning an artificial language: human data and modeling. 

Over the last 15 years, a number of researchers have looked at statistically based word segmentation: after listening to just a few minutes of speech in an unknown language, people can guess which sequences of phonemes are more likely to be words in that language.
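
The usual statistic behind these findings is the transitional probability between adjacent syllables: high within a word, low across a word boundary. The sketch below implements that classic idea (not the specific model in this paper), with a made-up two-word language and an arbitrary boundary threshold:

    from collections import Counter

    def transitional_probs(syllables):
        """Estimate P(next syllable | current syllable) from a list of syllables."""
        pair_counts = Counter(zip(syllables, syllables[1:]))
        first_counts = Counter(syllables[:-1])
        return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}

    def segment(syllables, threshold=0.8):
        """Posit a word boundary wherever the transitional probability dips below threshold."""
        tps = transitional_probs(syllables)
        words, current = [], [syllables[0]]
        for a, b in zip(syllables, syllables[1:]):
            if tps[(a, b)] < threshold:
                words.append("".join(current))
                current = []
            current.append(b)
        words.append("".join(current))
        return words

    # A continuous stream built from the made-up words "golabu" and "tupiro".
    stream = "go la bu tu pi ro go la bu go la bu tu pi ro tu pi ro go la bu tu pi ro".split()
    print(segment(stream))  # recovers golabu and tupiro, in order of occurrence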

It turns out that some sequences of phonemes just sound more like words than others, independent of any learning. The authors check whether that matters. Participants were assigned to learn one of two languages: a language in which half of the words a priori sounded like words, and a language in which half the words a priori sounded particularly unlike words. Not only did participants in the first condition do better on the word-like words, they did better on the "normal" words, too -- even though those were the same as the "normal" words in the second condition. The authors argue that this is consistent with the idea that already knowing some words helps you identify other words.

They also find that an a priori preference for word-like sound sequences is easy to implement in their previously proposed PARSER model, which then produces data somewhat like the human data from the experiment.
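
I don't know exactly how the authors folded word-likeness into PARSER, so the sketch below is only a loose, much-simplified illustration of the general mechanism: candidate chunks are strengthened when re-encountered, decay otherwise, and a priori word-like chunks get a head start via a larger initial weight. The real PARSER also lets strong chunks serve as perceptual units and models interference, which this toy omits.

    import random

    def toy_chunker(syllables, steps=5000, decay=0.05, gain=1.0, wordlike_bonus=None):
        """A toy PARSER-inspired chunk learner (not the published model)."""
        wordlike_bonus = wordlike_bonus or {}          # chunk -> extra starting weight
        lexicon, pos = {}, 0
        for _ in range(steps):
            size = random.randint(1, 3)                # read a percept of 1-3 syllables
            chunk = "".join(syllables[pos:pos + size])
            pos = (pos + size) % (len(syllables) - 3)  # wrap around without slicing past the end
            for key in lexicon:                        # every known chunk decays a little
                lexicon[key] -= decay
            if chunk in lexicon:                       # re-encountered: strengthen it
                lexicon[chunk] += gain
            else:                                      # new candidate; word-like chunks start higher
                lexicon[chunk] = gain + wordlike_bonus.get(chunk, 0.0)
            lexicon = {k: w for k, w in lexicon.items() if w > 0}   # forget dead chunks
        return sorted(lexicon.items(), key=lambda kv: -kv[1])[:5]

    random.seed(1)
    stream = "go la bu tu pi ro".split() * 100
    print(toy_chunker(stream, wordlike_bonus={"golabu": 2.0}))  # five strongest chunks and weights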

Gildea & Temperley. Do grammars minimize dependency length?

Words in a sentence are dependent on other words. In secondary school, we usually used the term "modify" rather than "depend on." So in The angry butcher yelled at the troublesome child, "the angry butcher" and "at the troublesome child" both modify/depend on yelled. Similarly, "the angry" modifies/depends on butcher. Etc.

This paper explores the hypothesis that people try to keep words close to the words they depend on. The authors worked through the Wall Street Journal corpus and calculated both the actual dependency lengths in each sentence (for each word, count all the words between it and the word it depends on, then sum over the sentence) and the shortest dependency length that sentence could possibly have. They found that actual dependency lengths in both the WSJ corpus and the Brown corpus were much closer to the optimum than would be expected by chance. However, in two German corpora, dependency lengths were again shorter than would be expected by chance, but the effect was noticeably smaller. The authors speculate that this is because German has relatively free word order, because German has some verb-final constructions, or for some other reason or combination of these reasons.
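
The actual-length half of that computation is simple to state. Below is a sketch using the butcher sentence from above, with head indices assigned by hand according to one reasonable dependency analysis; the paper, of course, takes its dependencies from parsed corpora and also computes the optimal ordering, which is the harder part.

    def total_dependency_length(heads):
        """Following the description above: for each word, count the words strictly
        between it and the word it depends on, then sum over the sentence."""
        return sum(abs(i - h) - 1 for i, h in enumerate(heads) if h is not None)

    # "The angry butcher yelled at the troublesome child"
    #   0    1      2      3    4   5      6        7
    # heads[i] is the index of the word that word i depends on (None marks the main verb).
    heads = [2, 2, 3, None, 3, 7, 7, 4]
    print(total_dependency_length(heads))  # 4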

Mueller, Bahlmann & Friederici. Learnability of embedded syntactic structures depends on prosodic cues. 

Center-embedded structures are hard to process and also difficult to teach people in artificial grammar learning studies that don't provide feedback. The authors exposed participants to A1A2B1B2 structures with or without prosodic cues. Participants largely failed to learn the grammar without prosodic cues. However, if a falling pitch contour separated each four-syllable phrase (A1A2B1B2) from the next, participants learned much more. They did even better if a pause was added between phrases on top of the falling contour. Adding a further pause between the As and the Bs (in order to accentuate the difference between them) provided no additional benefit.
