
Results (Round 1): Crowdsourcing the Structure of Meaning & Thought

Language is a device for moving a thought from one person's head into another's. This means that to have any real understanding of language, we also need to understand thought. This is what makes work on language exciting. It is also what makes it hard.

With the help of over 1,500 Citizen Scientists working through our VerbCorner project, we have been making rapid progress.

Grammar, Meaning, & Thought

You can say Albert hit the vase and Albert hit at the vase. You can say Albert broke the vase but you can't say Albert broke at the vase. You can say Albert sent a book to the boarder [a person staying at a guest house] or Albert sent a book to the border [the line between two countries], but while you can say Albert sent the boarder a book, you can't say Albert sent the border a book. And while you say Albert frightened Beatrice -- where Beatrice, the person experiencing the emotion, is the object of the verb -- you must say Beatrice feared Albert -- where Beatrice, the person experiencing the emotion, is now the subject.

How do you know which verb gets used which way? One possibility is that it is random, and this is just one of those things you must learn about your language, just like you have to learn that the animal in the picture on the left is called a "dog" and not a "perro", "xiaogou," or "sobaka." This might explain why it's hard to learn language -- so hard that non-human animals and machines can't do it. In fact, it results in a learning problem so difficult that many researchers believe it would be impossible, even for humans (see especially work on Baker's Paradox).

Many researchers have suspected that there are patterns in terms of which verbs can get used in which ways, explaining the structure of language and how language learning is possible, as well as shedding light on the structure of thought itself. For instance, the difference (it is argued) between Albert hit the vase and Albert hit at the vase is that the latter sentence means that Albert hit the vase ineffectively. You can't say Albert broke at the vase because you can't ineffectively break something: It is either broken or not. The reason you can't say Albert sent the border a book is that this construction means that the border owns the book, which a border can't do -- borders aren't people and can't own anything -- but a boarder can. The difference between Albert frightened Beatrice and Beatrice feared Albert is that the former describes an event that happened in a particular time and place (compare Albert frightened Beatrice yesterday in the kitchen with Beatrice feared Albert yesterday in the kitchen).


When researchers look at the aspects of meaning that matter for grammar across different languages, many of the same aspects pop up over and over again. Does the verb describe something changing (break vs. hit)? Does it describe something only people can do (own, know, believe vs. exist, break, roll)? Does it describe an event or a state (frighten vs. fear)? This is too consistent a pattern to be accidental. Researchers like Steven Pinker have argued that language cares about these aspects of meaning because these are basic distinctions our brain makes when we think and reason about the world (see Stuff of Thought). Thus, the structure of language gives us insight into the structure of thought.

The Question

The theory is very compelling and is exciting if true, but there are good reasons to be skeptical. The biggest one is that there simply isn't that much evidence one way or the other. Although a few grammatical constructions have been studied in detail (in recent years, this work has been spearheaded by Ben Ambridge of the University of Liverpool), the vast majority have not been systematically studied, even in English. Although the evidence so far suggests that which verbs go in which grammatical constructions is driven primarily or entirely by meaning, skeptics have argued that this is because researchers have so far focused on exactly those parts of language that are systematic, and that if we looked at the whole picture, we would see that things are not so neat and tidy.

The problem is that no single researcher -- nor even an entire laboratory -- can possibly investigate the whole picture. Checking every verb in every grammatical construction (e.g., noun verb noun vs. noun verb at noun, etc.) for every aspect of meaning would take one person the rest of her life.

Crowdsourcing the Answer

Last May, VerbCorner was launched to solve this problem. For the first round of the project, we posted questions about 641 verbs and six different aspects of meaning. By October 18th, 1,513 volunteers had provided 117,584 judgments, which works out to 3-4 people per sentence per aspect of meaning. That was enough data to start analyzing.
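For readers who like to check the arithmetic, here is a back-of-the-envelope version of that figure. It is a sketch only: the roughly 5,000 sentences come from the analysis section below, and the exact count is approximate.

    # Rough check of "3-4 people per sentence per aspect of meaning".
    # Assumes ~5,000 sentences (the approximate figure given in the analysis
    # section below) and the six meaning tasks from Round 1.
    total_judgments = 117584
    n_sentences = 5000      # approximate
    n_tasks = 6

    per_sentence_per_task = total_judgments / (n_sentences * n_tasks)
    print(round(per_sentence_per_task, 1))  # ~3.9 judgments per sentence per task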

As predicted, there is a great deal of systematicity in the relationship between meaning and grammar (for details on the analysis, see the next section). These results suggest that the relationship between grammar and meaning may indeed be very systematic, helping to explain how language is learnable at all. They also give us some confidence in the broad project of using language as a window into how the brain thinks and reasons about the world. This is important, because the mind is not easy to study, and if we can leverage what we know about language, we will have learned a great deal. As we test more verbs and more aspects of meaning -- I recently added an additional aspect of meaning and several hundred new verbs -- that window will become clearer and clearer.

Unless, of course, it turns out that not all of language is so systematic. While our data so far represent a significant proportion of all the research done to date, they cover only a tiny fraction of English. That is what makes research on language so hard: there is so much of it, and it is incredibly complex. But with the support of our volunteer Citizen Scientists, I am confident that we will be able to finish the project and launch a new phase of the study of language.

That brings up one additional aspect of the results: they show that this project is possible. Citizen Science is rare in the study of the mind, and many of my colleagues doubted that amateurs could provide reliable results. In fact, by the standard measures of reliability, the information our volunteers contributed is very reliable.

Of course, checking for a systematic relationship between grammar and meaning is only the first step. We'd also like to understand which verbs and grammatical constructions have which aspects of meaning and why, and to leverage that knowledge into understanding more about the nature of thought. Right now, we still don't have enough data to draw exciting new conclusions (for exciting old conclusions, see Pinker's Stuff of Thought). I expect I'll have more to say about that after we complete the next phase of data collection.

Details of the Analysis

Here is how we did the analyses. If meaning determines which grammatical constructions a given verb can appear in, then you would expect that all the verbs that appear in the same set of frames should be the same in terms of the core aspects of meaning discussed above. So if one of those verbs describes, for instance, physical contact, then all of them should.

Helpfully, the VerbNet project -- which was built on earlier work by Beth Levin -- has already classified over 6,000 English verbs according to which grammatical constructions they can appear in. The 641 verbs posted in the first round of the VerbCorner project consisted of all the verbs from 11 of these classes.

So is it the case that in a given class, all the verbs describe physical contact or all of them do not? One additional complication is that, as I described above, the grammatical construction itself can change the meaning. So what I did was count what percentage of verbs from the same class have the same value for a given aspect of meaning for each grammatical construction, and then I averaged over those constructions.

The "Explode on Contact" task in VerbCorner asked people to determine whether a given sentence (e.g., Albert hugged Beatrice) described contact between different people or things. Were the results for a given verb class and a given grammatical construction? Several volunteers checked each sentence. If there was disagreement among the volunteers, I used whatever answer the majority had chosen.

This graph shows the degree of consistency by verb class (the classes are numbered according to their VerbNet number), with 100% being maximum consistency. You can see that all eleven classes are very close to 100%. Obviously, exactly 100% would be more impressive, but that's extremely rare to see when working with human judgments, simply because people make mistakes. We addressed this in part by having several people check each sentence, but there are so many sentences (around 5,000) that, simply by bad luck, sometimes several people will all make a mistake on the same sentence. So this graph looks as close to 100% as one could reasonably expect. As we get more data, it should get clearer.

Results were similar for other tasks. Another one looked at whether the sentence described someone applying force (pushing, shoving, etc.) to something or someone else:
Maybe everything just looks very consistent? We actually had a check for that. One of the tasks measures whether the sentence describes something that is good, bad, or neither. There is no evidence that this aspect of meaning matters for grammar (again, the hypothesis is not that every aspect of meaning matters -- only certain ones that are particularly important for structuring thought are expected to matter). And, indeed, we see much less consistency:
Notice that there is still some consistency, however. This seems to be mostly because most sentences describe something that is neither good nor bad, so there is a fair amount of essentially accidental consistency within each verb class. Nonetheless, this is far less consistency than what we saw for the other five aspects of meaning studied.

Citizen Science in Harvard Magazine

A nice, extended article on recent Citizen Science projects, covering a wide range -- including GamesWithWords.org. Check it out.

Science Mag studies science. Forgets to include control group.

Today's issue of Science carries the most meta sting operation I have ever seen. John Bohannon reports a study of open access journals that reveals lax peer review standards: he sent 304 fake articles with obvious flaws to 304 open access journals, and more than half of the articles were accepted.

The article is written as a stinging rebuke of open access journals. Here's the interesting thing: There's no comparison to traditional journals. For all we know, open access journals actually have *stricter* peer review standards than traditional journals. We all suspect not, but suspicion isn't supposed to count as evidence in science. Or in Science.

So this is where it gets meta: Science -- which is not open access -- published an obviously flawed article about open access journals publishing obviously flawed articles.

It would be even better if Bohannon's article had run in the "science" section of Science rather than in the news section, where it actually ran. But hopefully we can agree that Science can't absolve itself of checking its articles for factualness and logical coherence just by labeling them "news".

Titling


I have never been good at coming up with titles for articles. When writing for newspapers or magazines, I usually leave it up to the editor. There is some danger that comes with this, however.

Last week, I wrote a piece for Scientific American about similarities across languages. This piece was then picked up by Salon, which re-ran the article under a new title:
Chomsky's "Universal Language" is incomplete. Chomsky's theory does not adequately explain why different languages are so similar.
I agree that this is snappier than any title I would have come up with. It's also perhaps a bit snappier than the one Scientific American used. It's also dead wrong. For one, there is no such thing as Chomsky's "Universal Language." Or if there is, presumably it is love. Or maybe mathematics. Or maybe music. The term is "Universal Grammar."

If you squint, the subtitle isn't exactly wrong. In the article, I do claim that standard Universal Grammar theory's explanation of similarities across languages isn't quite right. But the title implies that UG suggests that languages are not that similar, whereas the real problem with UG is that -- at least on standard interpretations -- it suggests that languages should be more similar than they actually are.

I sent in a letter to "corrections" at Salon, and the title has now been switched to something more correct. The moral of the story? Apparently writing good titles really is just very hard.

GamesWithWords on Scientific American

Over the last week, ScientificAmerican.com has published two articles by me. The most recent, "Citizen Scientists decode meaning, memory and laughter," discusses how citizen science projects -- science projects involving collaborations between professional scientists and amateur volunteers -- are now being used to answer questions about the human mind.

Citizen Science -- projects which involve collaboration between professional scientists and teams of enthusiastic amateurs -- is big these days. It's been great for layfolk interested in science, who can now not just read about science but participate in it. It has been great for scientists, with numerous mega-successes like Zooniverse and Foldit. Citizen Science has also been a boon for science writing, since readers can literally engage with the story.
However, the Citizen Science bonanza has not contributed to all scientific disciplines equally, with many projects in zoology and astronomy but fewer in physics and the science of the mind. It is no great surprise that there have been few Citizen Science projects in particle physics (not many people have accelerators in their back yards!), but the fact that there has been very little Citizen Science of the mind is perhaps more remarkable.

The article goes on to discuss three new mind-related citizen science projects, including our own VerbCorner project.

The second, "How to understand the deep structures of language," describes some really exciting work on how to explain linguistic universals -- work that was conducted by colleagues of mine at MIT.
In an exciting recent paper, Ted Gibson and colleagues provide evidence for a design-constraint explanation of a well-known bias involving case endings and word order. Case-markers are special affixes stuck onto nouns that specify whether the noun is the subject or object (etc.) of the verb. In English, you can see this on pronouns (compare "she talked with her"), but otherwise, English, like most SVO languages (languages where the typical word order is Subject, Verb, Object) does not mark case. In contrast, Japanese, like most SOV languages (languages where the typical word order is Subject, Object, Verb) does mark case, with -wa added to subjects and -o added to direct objects. "Yasu saw the bird" is translated as "Yasu-wa tori-o mita" and "The bird saw Yasu" is translated as "Tori-wa Yasu-o mita." The question is why there is this relationship between case-marking and SOV word order.
The article ran in the Mind Matters column, which invites scientists to write about the paper that came out in the last year that they are most excited about. It was very easy for me to choose this one.

Language and Memory Redux

One week only: If you did not do our Language and Memory task when it was running earlier this year, now is your chance. We just re-launched it to collect some additional data.

I expect we'll have enough data within a week to finish this line of studies, rewrite the paper (this is a follow-up experiment that was requested by peer reviewers), and also post the full results here.

Вы понимаете по-русски?

У нас новый русский эксперимент. Большинство психолингвистов занимаются английским. Мы хотим узнать больше об остальных. Не волнуйтесь -- я не сам перевёл эксперимент. Перевела его настоящая русскоязычная!

If you didn't understand that, that's fine. We're recruiting participants for a new experiment in Russian. Apparently you aren't eligible. :)

Much of the research on language is done on a single language: English. In part, that's because many researchers happen to live in English-speaking countries. The great thing about the Internet is we are freed from the tyranny of geography.

One week left to vote


There is less than a week left to vote for our panel at SXSW -- or to leave comments (apparently comments are weighted more heavily than mere votes). So if you want to support our work in improving psychology and the study of the mind & language, please go vote.


Go to this link to create an SXSW account:

https://auth.sxsw.com/users/sign_up
Then go to this link and click on the thumbs-up (on the left, under “Cast Your Vote”) to vote for us:
You can read more about our proposal at the SXSW site, as well as here.

Who knows more words? Americans, Canadians, the British, or Australians?

I have been hard at work on preliminary analyses of data from the Vocab Quiz, a difficult 32-word vocabulary test. Over 2,000 people from around the world have participated so far, so I was curious to see which of the English-speaking nationalities was doing best.

Since the test was made by an American (me), you might expect Americans to do best (maybe I chose words or definitions of words that are less familiar to those in other countries). Instead, Americans (78.4% correct) are near the bottom of the heap, behind the British (79.8%), New Zealanders (82.2%), the Irish (80.1%), South Africans (83.9%), and Australians (78.6% -- OK that one is close). At least we're beating the Canadians (77.4%).


A fluke?

Maybe that was just bad luck. Plus, some of those samples are small -- there are fewer than 10 folks from New Zealand so far. So I pulled down data from the Mind Reading Quotient, which also includes a (different) vocabulary test. Since the Mind Reading Quotient has been running longer, there are more participants (around 3,000). The situation was no better: This time, we weren't even beating the Canadians. 

Maybe this poor showing was due to immigrants in America who don't know English well? Sorry -- the above results only include people whose native language is English. 
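For the curious, here is roughly how such an analysis could be run. This is a sketch only, with made-up file and column names rather than the actual dataset: filter to native English speakers, then average percent correct by country.

    import pandas as pd

    # Hypothetical layout: one row per participant, with self-reported country,
    # native language, and percent correct on the 32-item quiz.
    df = pd.read_csv("vocab_quiz_results.csv")

    # Restrict to native English speakers, then average scores by country.
    native_english = df[df["native_language"] == "English"]
    scores_by_country = (
        native_english
        .groupby("country")["percent_correct"]
        .agg(["mean", "count"])
        .sort_values("mean", ascending=False)
    )
    print(scores_by_country)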

I also considered the possibility that Americans are performing poorly because I designed the tests to be hard, inadvertently including words that are rare in America but common elsewhere. But the consistency of results across the other countries makes that seem unlikely: What do the British, New Zealanders, Irish, South Africans, and Australians all know that we don't? This hypothesis also predicts that the poor showing by Americans is due to one or two items in particular. Right now there isn't enough data to do item-by-item analyses, but there will be once we have more. Which brings me to...

Data collection continues

If you want to check how good your vocabulary is compared to everyone else who has taken the test -- and if you haven't done so already -- you can take the Vocab Quiz here. At the Mind Reading Quotient, you can test your ability to understand other people -- to read between the lines.

Update:

Phytophactor asks whether these results are significant. In the MRQ data, all the comparisons are significant, with the exception of US v. Canada (which went the other direction in the Vocab Quiz data anyway). The comparison with Australia is a trend (p=.06). See comments below for additional details. I did not run the stats for Vocab Quiz.
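The post doesn't specify which test was run, but for readers who want to try something similar, here is one simple way such a comparison could be done -- a sketch with illustrative file and column names, using Welch's t-test on per-participant scores.

    import pandas as pd
    from scipy import stats

    # Hypothetical layout: one row per participant with country, native language,
    # and percent correct. Restrict to native English speakers, as in the post.
    df = pd.read_csv("mrq_results.csv")
    native_english = df[df["native_language"] == "English"]

    us = native_english.loc[native_english["country"] == "USA", "percent_correct"]
    au = native_english.loc[native_english["country"] == "Australia", "percent_correct"]

    # Welch's t-test (does not assume equal variances or equal sample sizes).
    t_stat, p_value = stats.ttest_ind(us, au, equal_var=False)
    print(f"US vs. Australia: t = {t_stat:.2f}, p = {p_value:.3f}")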

Children don't always learn what you want

Someone has not been watching his/her speech around this little girl.



It's clear she has some sense of what the phrase means, though she's got the words wrong. Notably, she is treating the phrase as compositional (notice how she switches between "his" and "my").

One of my younger brothers went around for a couple months saying "ship" whenever anything bad happened. But unfortunately we don't have that on video.

Taking research out into the wild

Like others, we believe that science is a little bit WEIRD -- much of the research is based on a certain type of person, from a very specific social, cultural, and economic background (WEIRD stands for Western, Educated, Industrialized, Rich, and Democratic; Henrich, Heine, & Norenzayan, 2010). We want to use the web and the help of citizen scientists to start changing that. In the next few months, we will be launching an initiative called Making Science Less Weird (stay tuned).
As part of Making Science Less Weird, we have proposed a panel presentation at the SXSW conference next year. Here, "we" includes not only the team at gameswithwords.org but also the teams at testmybrain.org and labinthewild.org.
In order to be selected, however, *we need votes*. To support Making Science Less Weird and help us increase diversity in human research, please go to this link to create an SXSW account:
Then go to this link and click on the thumbs-up (on the left, under “Cast Your Vote”) to vote for us:
Thanks for your support!

What makes interdisciplinary work difficult

I just read "When physicists do linguistics." Yes, I'm late to the party. In my defense, it only just appeared in my twitter feed. This article by Ben Zimmer describes work published earlier this year, in which a group of physicists applied the mathematics of gas expansion to vocabulary change. This paper was not well received. Among the experts discussed, Josef Fruehwald, a University of Pennsylvania graduate student, compares the physicists to Intro to Linguistics students (not favorably).

Part of the problem is that the physicists seem not to have understood the dataset they were working with and were in any case confused about what a word is -- which is a problem if you are studying words! Influential linguist Mark Liberman wrote, "The paper's quantitative results clearly will not hold for anything that a linguist, lexicographer, or psychologist would want to call 'words.'"

Zimmer concludes that
Tensions over [the paper] may really boil down to something simple: The need for better communication between disciplines that previously had little to do with each other. As new data models allow mathematicians and physicists to make their own contributions about language, scientific journals need to make sure that their work is on a firm footing by involving linguists in the review process. That way, culturomics can benefit from an older kind of scholarship -- namely, what linguists already know about how humans shape words and words shape humans.
Beyond pointing out that linguists and other non-physicists do already apply sophisticated mathematical models to language -- there are several entire fields devoted to exactly this work, such as computational linguistics and natural language processing -- I respectfully suggest that involving linguists at the review stage is way too late. If the goal is to improve the quality of the science, bringing in linguists to point out that a project is wrong-headed after the project is already completed doesn't really do anyone much good. I guess it's good not to publish something that is wrong, but it would be even better to publish something that is right. For that, you need to make sure you are doing the right project to begin with.

This brings me to the difficulty with interdisciplinary research. The typical newly-minted professor -- that is, someone just starting to do research on his/her own without regular guidance from a mentor/advisor -- has studied his or her field for several years as an undergraduate, 5+ years as a graduate student, and several more years as a post-doc. In fact, in some fields even newly-minted professors aren't considered ready to be released into the wild and are still working with a mentor. What this tells me is that it takes as much as 10 years of training and guidance before you are ready to be fully on your own. (This will vary somewhat across disciplines.)

Now maybe someone who has already mastered one scientific field can master a second one more quickly. I'm frankly not sure that's true, but it is an empirical question. Either way, it seems very unlikely that anyone, no matter how smart or how well trained in their first field, is ready to tackle big questions in a new field without at least a few years of training and guidance from an experienced researcher in that field.

This is not a happy conclusion. I'm getting a taste of this now, as I cross-train in computational modeling (my background is pure experimental). It is not fun to go from being regarded as an expert in your field to suddenly being the least knowledgeable person in your laboratory. (After a year of training, it's possible I'm finally a more competent computational modeler than at least the incoming graduate students, though it's a tough call -- they, at least, typically have several years of relevant undergraduate coursework.) And I'm not even moving disciplines, just sub-disciplines within cognitive science!

So it's not surprising that some choose the "shortcut" of reading a few papers, diving in, and hoping for the best, especially since the demands of the career mean that nobody really has time to take a few years off to learn a new discipline. But it's not clear that this is a particularly effective strategy. All the best interdisciplinary work I have seen -- or been involved in -- involved an interdisciplinary team of researchers. This makes sense. It's hard enough to be an expert in one field. Why try to be an expert in two fields when you could just collaborate with someone who has already done the hard work of becoming an expert in that discipline? Just sayin'.

VerbCorner (and others) on SciStarter.Com

There is a brief profile of our crowd-sourcing project VerbCorner on SciStarter.com, with a number of quotes from yours truly.

SciStarter profiles a lot of Citizen Science / Crowd-sourced Science projects. Interestingly, most are in the physical sciences, with only one project listed under psychology (which, as it happens, is also a language project).

This is not so much a feature of SciStarter as a feature of Citizen Science in general. The Scientific American database only lists two projects under "mind and brain" -- and I'm pretty sure they didn't even have that category last time I checked. This is interesting, because psychologists have been using the Internet to do research for a very long time -- probably longer than anyone else. But we've been very late to the Citizen Science party.

Not, of course, that you shouldn't want to participate in non-cognitive science projects. There are a bunch of great ones. I've personally mostly only done the ones at Zooniverse, but SciStarter lists hundreds.

Peaky performance

Right now there is a giant spike of traffic to GamesWithWords.org, following Steve Pinker's latest tweet about one of the experiments (The Verb Quiz). I looked back over the five years since I started using Google Analytics, and you can see that in general traffic to the site is incredibly peaky.
The three largest single-day peaks account for over 10% of all the visitors to the site over that time period.

Moral of the story: I need Pinker to tweet my site every day!

Findings: GamesWithWords.org at DETEC2013

I recently returned from the inaugural Discourse Expectations: Theoretical, Experimental, and Computational Perspectives workshop, where I presented a talk ("Three myths about implicit causality") which ties together a lot of the pronoun research that I have been doing over the last few years, including results from several GamesWithWords.org experiments (PronounSleuth, That Kind of Person, and Find the Dax).