Field of Science

Citizen Science Project: Likely Events

VerbCorner was our first step towards opening up the rest of the process. I have just opened up a new segment of the website called “Experiment Creator”, which is our second endeavor.

Experiment Creator

One of the most important parts of language experiments is choosing the stimuli. For many types of research, such as in many low-level or mid-level vision projects, the experimenter has free rein to design essentially whatever stimuli they like. Language researchers are constrained by the fact that some words exist and other words don't, and each word that has the properties you want may also have other properties that you don't want along for the ride. For instance, you might want to compare nouns and verbs, which don't just differ in terms of part of speech but also in frequency (there are many very low-frequency nouns) and length (in some languages, nouns will be systematically longer than verbs; in other languages, it will be the reverse).

Typically, we have to run one or more “norming” experiments to choose stimuli that are controlled for various nuisance factors. These are not really experiments. There is no hypothesis. The purpose of the experiment is indirect (it's an experiment to create another experiment). So I usually do not post them on the main website, which recruits people who want to participate in experiments.

The new Experiment Creator project changes this. The tasks posted there will be meta-experiments, used to choose stimuli for other experiments. I just posted the first one, Likely Events.

Likely Events

One of the big discoveries about language in the last few decades is that when we are listening to someone talk or reading a passage, we are actively predicting what will come next. If you hear “John needed money, so he went to the…” you probably expect the next word to be “ATM,” not “hibernate.” There are two reasons: 1) "the" is usually followed by a noun, not a verb, and 2) "hibernate" is a relatively rare word.

Much of this research has focused on word frequency and what words follow what other words. We are developing several projects to look more carefully at predictions based not on what word follows what word but on what event is likely to follow what event. In general, "the street" is a more common sequence of words than "the ATM" and "street" is more common than "ATM", but you probably didn't think that the example sentence above was likely to end with "street" for a simple reason: That's not (usually) where you go when you need money.
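To make the word-based kind of prediction concrete, here is a minimal sketch in Python. It assumes a tiny toy corpus that I invented purely for illustration (it is not real frequency data), and the helper names `unigram_prob` and `bigram_prob` are hypothetical. It estimates how common a word is on its own and how often it follows a given previous word.

```python
from collections import Counter

# Toy corpus, invented for illustration only; real work would use a large corpus.
corpus = (
    "he crossed the street . she walked down the street . "
    "they live on the street . he went to the atm ."
).split()

# How often each word occurs, and how often each word pair occurs in sequence.
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
# How often each word occurs with some word after it (every token but the last).
prev_counts = Counter(corpus[:-1])


def unigram_prob(word):
    """Estimate how frequent a word is overall, from raw counts."""
    return unigrams[word] / len(corpus)


def bigram_prob(prev, word):
    """Estimate P(word | prev): how often `word` follows `prev`."""
    if prev_counts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / prev_counts[prev]


print(bigram_prob("the", "street"))  # 0.75
print(bigram_prob("the", "atm"))     # 0.25
print(unigram_prob("street") > unigram_prob("atm"))  # True
```

In this toy corpus, both word frequency and word-sequence counts favor "street" over "atm" after "the". Nothing in these counts captures that needing money makes a trip to the ATM the likelier event, which is the kind of knowledge the new projects are meant to measure.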

To do this research, we need to have sequences of events and vary how likely it is that one event would follow the other, as well as how likely each event is to happen on its own. And we need many, many such sequences. If you would like to help us out, you can do so here.

On the theory that the people interested in these projects will be more committed, Likely Events takes a bit longer than our typical project (in order to make up for the smaller number of volunteers). I expect participation will take on the order of half an hour. We will see how this goes and how many people are interested. Feedback is welcome.
