Field of Science

Showing posts with label Web-based research. Show all posts

Fractionating IQ

Near the dawn of the modern study of the mind, the great psychological pioneer Charles Spearman noticed that people who are good at one kind of mental activity tend to be good at most other mental activities. Thus, the notion of g (for "general intelligence") was born: the notion that there is some underlying factor that determines -- all else equal -- how good someone is at any particular intelligent task. This of course fits folk psychology quite well: g is just another word for "smarts".

The whole idea has always been controversial, and many people have argued that there is more than one kind of smarts out there (verbal vs. numeric, logical vs. creative, etc.). Enter a recent paper by Hampshire and colleagues (Hampshire, Highfield, Parkin & Owen, 2012) which tries to bring both neuroimaging and large-scale Web-based testing to bear on the question.

In the neuroimaging component, they asked sixteen participants to carry out twelve difficult cognitive tasks while their brains were scanned, then applied principal components analysis (PCA) to the results. PCA is a sophisticated statistical method for grouping things.

A side note on PCA

If you already know what PCA is, skip to the next section. Basically, PCA is a very sophisticated way of sorting things. Imagine you are sorting dogs. The simplest thing you could do is have a list of dog breeds and go through each dog and sort it according to its breed.

What if you didn't already have a dog breed manual? Well, German shepherds are more similar to one another than any given German shepherd is to a poodle. So by looking through the range of dogs you see, you could probably find a reasonable way of sorting them, "rediscovering" the various dog breeds in the process. (In more difficult cases, there are algorithms you could use to help out.)

That works great if you have purebreds. What if you have mutts? This is where PCA comes in. PCA assumes that there are some number of breeds and that each dog you see is a mixture of those breeds. So a given dog may be 25% German Shepherd, 25% border collie, and 50% poodle. PCA tries to "learn" how many breeds there are, the characteristics of those breeds, and the mixture of breeds that makes up each dog -- all at the same time. It's a very powerful technique (though not without its flaws).
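
To make the mutt analogy a little more concrete, here is a minimal sketch in Python (using NumPy). Everything in it is invented for illustration: three hypothetical "breeds" described by four made-up traits, and 200 dogs that are random mixtures of them. One caveat about the analogy: PCA does not hand back the breeds themselves, only the main directions along which the dogs vary. Because the mixing proportions sum to 100%, three breeds produce just two such directions, so the first two components should soak up nearly all of the variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three hypothetical "pure breeds", each described by four made-up traits
# (think height, weight, coat length, snout length, in arbitrary units).
breeds = np.array([
    [60.0, 30.0, 5.0, 10.0],
    [40.0, 20.0, 12.0, 7.0],
    [25.0, 6.0, 3.0, 4.0],
])

# Each of 200 dogs is a random mixture of the breeds (rows sum to 1),
# plus a little measurement noise.
weights = rng.dirichlet([1.0, 1.0, 1.0], size=200)
dogs = weights @ breeds + rng.normal(scale=0.1, size=(200, 4))

# PCA: center the data, then take the singular value decomposition.
centered = dogs - dogs.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

# Share of the total variance captured by each component.
explained = singular_values**2 / np.sum(singular_values**2)
print(np.round(explained, 3))
```

The rows of `components` are the directions PCA found, and `explained` is the kind of number behind statements like "two components accounted for about 90% of the variance."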

Neuroimaging intelligence

Analysis focused only on the "multiple demands" network previously identified as being related to IQ and shown in red in part A of the graph below. PCA discovered two underlying components that accounted for about 90% of the variance in the brain scans across the twelve tasks. One was particularly important for working memory tasks, so the authors called it MDwm (see part B of the graph below), and it involved mostly the IFO, SFS and ventral ACC/preSMA (see part A below for locations). The other was mostly involved in various reasoning tasks and involved more IFS, IPC and dorsal ACC/preSMA.


Notice that all tasks involved both factors, and some tasks (like the paired associates memory task) involved a roughly equal portion of each.

Sixteen subjects isn't very many

The authors put versions of those same twelve tasks on the Internet. They were able to get data from 44,600 people, which makes it one of the larger Internet studies I've seen. The authors then applied PCA to those data. This time they got three components, two of which were quite similar to the two components found in the neuroimaging study (they correlated at around r=.7, which is a very strong correlation in psychology). The third component seemed to be particularly involved in tasks requiring language. Most likely that did not show up in the neuroimaging study because the neuroimaging study focused on the "multiple demands" network, whereas language primarily involves other parts of the brain.

The factors dissociated in other ways as well. Whereas people's working memory and reasoning abilities start to decline about the time people reach the legal drinking age in the US (coincidence?), verbal skills remain largely undiminished until around age 50. People who suffer from anxiety had lower than average working memory abilities, but average reasoning and verbal abilities. Several other demographic factors similarly had differing effects on working memory, reasoning, and verbal abilities.

Conclusions

The data in this paper are very pretty, and it was a particularly nice demonstration of converging behavioral and neuropsychological methods. I am curious what the impact will be. The authors are clearly arguing against a view on which there is some unitary notion of IQ/g. It occurred to me as I wrote this that while I've read many papers lately discussing the different components of IQ, I haven't read anything recent that endorses the idea of a unitary g. I wonder if there is anyone, and, if so, how they account for this kind of data. If I come across anything, I will post it here.


------
Hampshire, A., Highfield, R., Parkin, B., & Owen, A. (2012). Fractionating Human Intelligence. Neuron, 76 (6), 1225-1237. DOI: 10.1016/j.neuron.2012.06.022

Small World of Words

A group of researchers in Belgium is putting together a very large word association network by asking volunteers to say which words are related to which other words. They are hoping to recruit around 300,000 participants, which makes it my kind of study! (Technically, I've never tried 300,000 participants -- I think we've never gone beyond about 50,000, though we have some new things in the pipeline...)

It looks interesting. To participate, go to www.smallworldofwords.com. You can read more about the project here.

New Experiment: Drama Queen

The latest experiment in my quest to understand how people use emotion verbs is now posted. You will be introduced to a character who is, as the name of the game implies, a drama queen. She has many fraught relationships with her friends. You will be introduced to a number of friends, how Susan feels about each friend, and a new verb that you will try to use to describe that relationship. Enjoy.

New Experiment: EmotionSense

I just posted a new experiment on the website: EmotionSense. I have lately gotten very interested in verb-learning, specifically how we decide which of the participants in an event should be the grammatical subject, which the grammatical object, etc. (see this post and this one). In this experiment, you'll answer some questions about different types of emotions. I'll use this information to help design some upcoming verb-learning experiments.

As usual, the experiment is short and should take 3-5 minutes.


[snappy caption goes here]


-----
photo credit here

Crowdsourcing My Data Analysis

I just finished collecting data for a study. Do you want to help analyze it?

Puns

What makes a pun funny? If you said "nothing," then you should probably skip this post. But even admirers of puns recognize that while some are sublime, others are ... well, not.

Over the last year, I've been asking people to rate the funniness of just over 2300 different puns. (Where did I get 2300 puns? The user-submitted site PunoftheDay. PunoftheDay also has funniness ratings, but I wanted a bit more control over how the puns were rated and who was rating them.)

Why care what makes puns funny?

There are three reasons I ran this experiment. I do mostly basic research, and while I believe in its importance and think it's fun, the idea of doing a project I could actually explain to relatives was appealing. I was partly inspired by Zenzi Griffin's 2009 CUNY talk reporting a study she ran on why parents call their kids by the wrong names (typically, calling younger children by elder children's names), work which has now been published in a book chapter.

Plus, I was just interested. I mean: puns!

Finally, I was beginning a line of work on the interpretation of homophones. One of the best-established facts about homophones is that we very rapidly suppress context-irrelevant meanings of words -- in fact, so rapidly that we rarely even notice. If your friend said, "I'm out of money, so I'm going to stop by the bank," would you really even notice considering that bank might mean the side of a river?

A river bank. 
photo: Istvan, creative commons 

A successful pun, on the other hand, requires that at least two meanings be accessed and remain active. In some sense, a pun is homophone processing gone bad. By better understanding puns, I thought I might get some insight into language processing.

Puntastic


As already mentioned, my first step down this road was to collect funniness ratings for a whole bunch of puns. I popped them into a Flash survey, called it Puntastic, and put it on the Games With Words website. The idea was to mine the data and try to find patterns which could then be systematically manipulated in subsequent experiments.

It turns out that there are a lot of ways that 2300 puns can be measured and categorized. So while I have a few ideas I want to try out, no doubt many of the best ones have not occurred to me. Data collection was crowdsourced, and I see no reason why the analyses shouldn't be as well.

I have posted the data on my website. If you have some ideas about what might make one pun funnier than another -- or just want to play around with the data -- you are welcome to it. Please post your findings here.

If you are a researcher and might use the data in an official publication, please contact me directly before beginning analysis (gameswithwords$at*gmail.com) just so there aren't misunderstandings down the line. Failure to get permission to publish analyses of these data may be punished by extremely bad karma and/or nasty looks cast your way at conferences.

The results so far...

Unfortunately for the crowd, I've already done the easiest analyses. The following are based on nearly 800 participants over the age of 13 who listed English as both their native and primary languages (there weren't enough non-native English speakers to conduct meaningful analyses on their responses).

The average was 2.6 stars out of 7 (participants could choose anywhere from 1 to 7 stars, as well as "I don't get it," which was scored as -1 for these analyses), which says something either about the puns I used or the people who rated them.

First I looked at differences between participants to see if I could find types of people who like puns more than others. There was no significant difference in overall ratings by men or women.



I also asked participants if they thought they had good or poor social skills. There was no significant difference there, either.



I also asked them if they had difficulty reading or if they had ever been diagnosed with any psychiatric illnesses, but neither of those factors had any significant effect either (got tired of making graphs, so just trust me on this one).

The effect of age was unclear.


It was the case that the youngest participants produced lower ratings than the older participants (p=.0029), which was significant even after a conservative Bonferroni correction for 15 possible pairwise comparisons (alpha=.0033). However, the 10-19 year-olds' ratings were also significantly lower than the 20-29 year-olds' (p=.0014) and the 30-39 year-olds' (p=.0008), but obviously this was not true of the 40-49 year-olds' or 50-59 year-olds' ratings. So it's not clear what to make of that. Given that the overall effect size was small and that this is an exploratory analysis, I wouldn't make much of the effect without corroboration from an independent data set.
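
For anyone wondering where alpha=.0033 comes from, the arithmetic is short enough to sketch in Python. Two things here are assumptions rather than anything stated above: that the overall error rate being protected is the conventional .05, and that the 15 comparisons come from six age groups (six groups do give exactly 15 pairs). The p-values are the three reported in the paragraph above.

```python
from math import comb

age_groups = 6                        # assumption: six age bins yield 15 pairs
n_comparisons = comb(age_groups, 2)   # number of pairwise comparisons
family_alpha = 0.05                   # assumption: conventional overall level
per_test_alpha = family_alpha / n_comparisons  # Bonferroni-corrected threshold

# p-values as reported in the text above.
p_values = {
    "youngest vs. older participants": 0.0029,
    "10-19 vs. 20-29": 0.0014,
    "10-19 vs. 30-39": 0.0008,
}

for label, p in p_values.items():
    verdict = "significant" if p < per_test_alpha else "not significant"
    print(f"{label}: p={p} is {verdict} at alpha={per_test_alpha:.4f}")
```

Bonferroni is conservative because it simply divides the overall alpha by the number of tests, whether or not the tests are independent.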

The funniest puns

The only factor I've looked at so far that might explain pun funniness is the length of the joke. I considered only the 2238 puns for which I had at least 5 ratings (which was most of them). I asked whether there might be a relationship between the length of the pun and how funny it was. I could imagine this going either way, with concise jokes being favored (short and sweet) or long jokes having a better lead-up (the shaggy dog effect). In fact, the correlations between pun ratings and length, whether in number of characters (r=.05) or in number of words (r=.05), were both so small I didn't bother to do significance tests.
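
If you want to try this kind of analysis yourself, the following Python sketch shows one way to measure length both ways and compute a Pearson correlation against mean ratings. The puns and ratings in it are placeholders I made up for illustration, not the Puntastic data, and the variable names are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Placeholder puns with made-up mean ratings (NOT the real data).
toy_puns = [
    ("placeholder pun one", 2.9),
    ("a slightly longer placeholder pun", 2.4),
    ("short one", 2.6),
    ("a placeholder pun with quite a long and winding lead-up", 2.7),
]

ratings = [rating for _, rating in toy_puns]
chars = [len(text) for text, _ in toy_puns]          # length in characters
words = [len(text.split()) for text, _ in toy_puns]  # length in words

r_chars = pearson_r(chars, ratings)
r_words = pearson_r(words, ratings)
print(f"r (characters) = {r_chars:.2f}, r (words) = {r_words:.2f}")
```

With the real data you would replace `toy_puns` with the pun texts and their average ratings.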

I broke up the puns into five groups according to length to see if maybe there was a bimodal effect (shortest and longest jokes are funniest) or a Goldilocks effect (average-length jokes are best). There wasn't.

In short, I can't tell you anything about what makes some people like puns more than others, or why people like some puns more than others. What I can tell you is which puns people did or didn't like. Here are the top 5 and bottom 5 puns:

1. He didn't tell his mother that he ate some glue. His lips were sealed.
2. Cartoonist found dead in home. Details are sketchy.
3. Biologists have recently produced immortal frogs by removing their vocal cords. They can't croak.
4. The frustrated cannibal threw up his hands.
5. Can Napoleon return to his place of birth? Of Corsican.
...
2234. The Egyptian cinema usherette sold religious icons in the daytime. Sometimes she got confused and called out, 'Get your choc isis here!'
2235. Polly the senator's parrot swallowed a watch.
2236. Two pilgrims were left behind after their diagnostic test came back positive.
2237. In a baseball season, a pitcher is worth a thousands blurs.
2238. He said, "Hones', that is the truth', but I knew elide.

Ten points to anyone who can even figure out what those five puns are about. Most participants rated these as "I don't get it."

----------------------
BTW Please don't take from this discussion that there haven't been any serious studies of puns. There have been a number, going back at least as far as Sapir of the Sapir-Whorf hypothesis, who wrote a paper on "Two Navaho Puns." There is a well-known linguistics paper by Zwicky & Zwicky and at least one computer model that generates its own puns. However, I know a lot less about this literature than I would like to, so if there are any experts in the audience, please feel free to send me links.

Do you speak Japanese?




If so, I've got an experiment for you. A while back I presented some results from a project comparing pronoun processing in English, Spanish, Mandarin and Russian. We're also testing Japanese. So if you speak Japanese and have a few minutes, please follow this link. Even better, if you know someone who is a fluent Japanese speaker -- or, better still, a native Japanese speaker -- please send him/her the link.

If you speak English -- and you probably do if you're reading this post -- and have never participated in any of my English pronoun experiments, you can follow this link. These experiments usually take less than 5 minutes.

Huh? Pronoun processing?

For those of you wondering what I could possibly be studying, the interesting thing about pronouns is that their meaning changes wildly depending on context. Given the right context, she can refer to any female (and some things that aren't actually female, like ships). That isn't true of proper names (Jane Austen can only be used to refer to one person).

Some theories state that we learn language-specific cues that help us figure out what a given pronoun in a given context means. Other theories state we use general intelligence to pull off the feat. On the second theory, if you use the same contexts in different languages, people should interpret pronouns the same way. On the first theory, that isn't necessarily the case.

(Obviously I'm being cagey here in terms of how exactly we're manipulating context in the experiment, since I don't want to bias any potential participants.)


More posts on pronouns: here, here and here.

New Language Experiment for Bilinguals

I'm not sure I've ever blogged about a conference past the first day. I'm usually too tired by the second day. BUCLD is particularly grueling, running over 12 hours on the first day and near 12 hours on the second. Plus the parties.

I do want to point folks to one thing: Thomas Roeper, Barbara Zurer Pearson and Margaret Grace, all of the University of Massachusetts, are running an interesting study on quantifiers (words like all, some, each, and most). One interesting thing about this study is that while language researchers very often exclude non-native speakers and bilinguals, the researchers are very interested in comparing results from native and non-native speakers of English. Right now, they're looking for people who learned some language other than English prior to learning English.

The study is here. They are particularly interested right now in getting data from non-native English speakers. There is a raffle that participants can win (details are on the site).

Does Global Warming Exist, and Other Questions We Want Answered

This week, I asked 101 people on Amazon Mechanical Turk both whether global temperatures have been increasing due to human activity AND what percentage of other people on Amazon Mechanical Turk would say yes to the first question. 78% said yes to the first question. Here are the answers to the second, broken down by whether the respondent did or did not believe in man-made global warming:

Question: How many other people on Amazon Mechanical Turk believe global temperatures have been increasing due to human activity?

                 Average     1st Quartile-3rd Quartile
Believers        72%         60%-84%
Denialists       58%         50%-74%
Correct          78%         ------

Notice that those who believe global warming is caused by human activity are much better at estimating how many other people will agree than are those who do not. Interestingly, the denialists' answer is much closer to the average for all Americans than to the average for Turkers (who are mostly but not exclusively American, and are certainly a non-random sample).

So what?

Why should we care? More importantly, why did I do this experiment? A major problem in science/life/everything is that people disagree about the answers to questions, and we have to decide who to believe. A common-sense strategy is to go with whatever the majority of experts says. There are two problems, though: first, it's not always easy to identify an expert, and second, the majority of experts can be wrong.

For instance, you might ask a group of Americans what the capital of Illinois or New York is. Although in theory, Americans should be experts in such matters (it's usually part of the high school curriculum), in fact the majority answer in both cases is likely to be incorrect (Chicago and New York City, rather than Springfield and Albany). This was even true in a recent study of, for instance, MIT or Princeton undergraduates, who in theory are smart and well-educated.

Which of these guys should you believe?

So how should we decide which experts to listen to, if we can't just go with "majority rules"? A long chain of research suggests an option: ask each of the experts to predict what the other experts would say. It turns out that the people who are best at estimating what other people's answers will be are also most likely to be correct. (I'd love to cite papers here, but the introduction here is coming from a talk I attended earlier in the week, and I don't have the citations in my notes.) In essence, this is an old trick: ask people two questions, one of which you know the answer to and one of which you don't. Then trust the answers on the second question that come from the people who got the first question right.

This method has been tested on a number of questions and works well. It was actually tested on the state capital problem described above, and it does much better than a simple "majority rules" approach. The speaker at the talk I went to argued that this is because people who are better able to estimate the average answer simply know more and are thus more reliable. Another way of looking at it though (which the speaker mentioned) is that someone who thinks Chicago is the capital of Illinois likely isn't considering any other possibilities, so when asked what other people will say guesses "Chicago." The person who knows that in fact Springfield is the capital probably nonetheless knows that many people will be tricked by the fact that Chicago is the best-known city in Illinois and thus will correctly guess lots of people will say Chicago but that some people will also say Springfield. 
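
Here is a toy Python sketch of the general idea, using the Illinois question. To be clear, this is my own simplified version, not the procedure from the talk: the respondents and their numbers are hypothetical, and trusting only the two best predictors is an arbitrary cutoff chosen for illustration.

```python
from collections import Counter

# Hypothetical respondents: (own answer, predicted share of people who
# will answer "Chicago"). All numbers are invented for illustration.
responses = [
    ("Chicago", 1.00),
    ("Chicago", 0.95),
    ("Chicago", 0.90),
    ("Springfield", 0.65),
    ("Springfield", 0.55),
]

answers = [answer for answer, _ in responses]

# Plain "majority rules".
majority_answer = Counter(answers).most_common(1)[0][0]

# Share of respondents who actually said "Chicago".
actual_share = answers.count("Chicago") / len(answers)

# Rank respondents by how close their prediction came to the actual share.
ranked = sorted(responses, key=lambda r: abs(r[1] - actual_share))

# Trust only the best predictors (the cutoff of 2 is arbitrary).
best = ranked[:2]
trusted_answer = Counter(answer for answer, _ in best).most_common(1)[0][0]

print("majority says:", majority_answer)
print("best predictors say:", trusted_answer)
```

In this made-up sample the majority answers Chicago, but the two people who best predicted how others would answer both say Springfield, which is the pattern described above.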

Harder Questions

I wondered, then, how well it would work for a question where everybody knows that there are two possible answers. So I surveyed Turkers about Global Warming. Believers were much better at estimating how many believers there are on Turk than were denialists.

Obviously, there are a few ways of interpreting this. Perhaps denialists underestimate both the proportion of climate scientists who believe in global warming (~100%) and the percentage of normal people who believe in global warming, and thus they think the evidence is weaker than it is. Alternatively, denialists don't believe in global warming and thus have trouble accepting that other people do and thus lower their estimates. The latter proposal, though, would suggest that believers should over-estimate the percentage of people who believe in global warming, though that is not in fact the case.

Will this method work in general? In some cases, it won't. If you asked expert physicists in 1530 about quantum mechanics, presumably none of them would believe it and all would correctly predict that none of the others would believe it. In other cases, it's irrelevant (near 100% of climatologists believe in man-made global warming, and I expect they all know that they all believe in it). More importantly, the method may work well for some types of questions and not others. I heard in this talk that researchers have started using the method to predict product sales and outcomes of sports matches, and it actually does quite well. I haven't seen any of the data yet, though.


------
For more posts on science and politics, click here and here.

Findings: The Causality Implicit in Language

Finding Causes

Consider the following:

(1) Sally hates Mary.
a. How likely is this because Sally is the kind of person who hates people?
b. How likely is this because Mary is the kind of person whom people hate?

Sally hates Mary doesn't obviously supply the relevant information, but starting with work by Roger Brown and Deborah Fish in 1983, numerous studies have found that people nonetheless rate (b) as more likely than (a). In contrast, people find Sally frightens Mary more indicative of Sally than of Mary (the equivalent of rating (a) higher than (b)). Sentences like Sally hates Mary are called “object-biased,” and sentences like Sally frightens Mary are called “subject-biased.” There are many sentences of both types.

Brown and Fish, along with many of the researchers who followed them, explain this in terms of an inference from knowledge about how the world works:

Consider the two verbs flatter and slander… Just about everyone (most or all persons) can be flattered or slandered. There is no special prerequisite. It is always possible to be the object of slander or flattery … By sharp contrast, however, not everyone, by any means, not even most or, perhaps, many are disposed to flatter or to slander… [Thus] to know that one party to an interaction is disposed to flatter is to have some basis for predicting flattery whereas to know only that one party can be flattered is to know little more than that that party is human. (Brown and Fish 1983, p. 265)

Similar results are found by using other ways of asking about who is at fault:

(2) Sally hates Mary.
a. Who is most likely responsible?   Sally or Mary?



(The photo on the right came up on Flickr when I searched for pictures about "causes". It turns out Flickr is not a good place to look for pictures about "hating," "frightening," or "causes". But I liked this picture.)


Understanding Pronouns


Now consider:

(3) Sally hates Mary because she...
(4) Sally frightens Mary because she...

Most people think that "she" refers to Mary in (3) but Sally in (4). This is a bias -- not absolute -- but it is robust and easy to replicate. Again, there are many verbs which are "object-biased" like hates and many which are "subject-biased" like frightens. Just as in the causal attribution effect above, this pronoun effect seems to be a systematic effect of (at least) the verb used. This fact was first discovered by Catherine Garvey and Alfonso Caramazza in the mid-70s and has been studied extensively since.

The typical explanation of the pronoun effect is that the word "because" implies that you are about to get an explanation of what just happened. Explanations usually refer to causes. So you expect the clause starting with she to refer to the cause of the first part of the sentence. Therefore, people must think that Mary caused Sally hates Mary but Sally caused Sally frightens Mary.

Causes and Pronouns

Both effects are called "implicit causality," and researchers have generally assumed that the causal attribution effect and the pronoun effect are basically one and the same. An even stronger version of this claim would be that the pronoun effect relies on the causal attribution effect. People resolve the meaning of the pronouns in (3) and (4) based on who they think the cause of the first part of the sentence is. The causal attribution task in (1) and (2) is supposed to measure exactly that: who people think the cause is.

Although people have been doing this research for around three decades, nobody seems to have actually checked whether this is true -- that is, are verbs that are subject-biased in terms of causal attribution also subject-biased in terms of pronoun interpretation?

I recently ran a series of three studies on Amazon Mechanical Turk to answer this question. The answer is "no."



This figure shows the relationship between causal attribution biases (positive numbers mean the verb is subject-biased, negative means it's object-biased) and pronoun biases (100 = completely subject-biased, 0 = completely object-biased). Though there is a trend line in the right direction, it's essentially artifactual. I tested four different types of verbs (the details of the verb classes take longer to explain than they are interesting), and it happens that none of them were subject-biased in terms of pronoun interpretation but object-biased in terms of causal attribution (good thing, since otherwise I would have had nowhere to put the legend). There probably are some such verbs; I just only tested a few types.

I ran three different experiments using somewhat different methods, and all gave similar results (that's Experiment 2 above).

More evidence


A number of previous studies showed that causal attribution is affected by who the subject and object are. For instance, people are more object-biased in interpreting The employee hated the boss than in interpreting The boss hated the employee. That is, they seem to think that the boss is more likely to be the cause, whether the boss is the one hating or hated. This makes some sense: bosses are in a better position to affect employees than vice versa.

I was able to find this effect in my causal attribution experiments, but there was no effect on pronoun resolution. That is, people thought "he" referred to the employee in (5) and the boss in (6) at pretty much the same rate.

(5) The boss hated the employee because he...
(6) The employee hated the boss because he...

Conclusion

This strongly suggests that these two effects are two different effects, due to different underlying mechanisms. I think this will come as a surprise to most people who have studied these effects in the past. It also is a surprise in terms of what we know about language processing. There is lots of evidence that people use any and all relevant information when they are interpreting language. Why aren't people using the conceptualization of the world as revealed by the causal attribution task when interpreting pronouns? And what are people doing when they interpret pronouns in these contexts?

I do have the beginnings of an answer to the latter question, but since the data in this experiment don't speak to it, that will have to wait for a future post.


---------
Brown, R., & Fish, D. (1983). The psychological causality implicit in language. Cognition, 14 (3), 237-273. DOI: 10.1016/0010-0277(83)90006-9

Garvey, C., & Caramazza, A. (1974). Implicit causality in verbs. Linguistic Inquiry, 5, 459-464.

Picture: Cobalt123.

Thank you, Amazon!

As regular readers know, I've been brushing up my CV in anticipation of some application deadlines. This mostly means trying to move papers from the in prep column to the in submission column. (I'd love to get some new stuff into the in press column, but with the glacial pace of review, that's unlikely to happen in the time frame I'm working with).

This means, unfortunately, I'm working during a beautiful Saturday morning. This would be more depressing if it weren't for the wonder that is Amazon Mechanical Turk. I ran two experiments yesterday (48 subjects each), have another one running right now, and will shortly put up a fourth. The pleasure of getting new data -- of finding things out -- is why I'm in this field. It's almost as fun as walking along the river on a beautiful Saturday morning.

Almost.


How I feel about Amazon Mechanical Turk -- the psycholinguist's new best friend.


photo: Daniel*1977

Thank you, Oprah

Oprah's magazine linked to my collaborator's web-based lab. I'm a little miffed at the lack of the link love, but I still got something out of it -- we now have over 20,000 participants in the experiment we've been running on her site. So thank you, Oprah.

Busy analyzing...

Help Games with Words get a job!

As job application season comes around, I'm trying to move some work over from the "in prep" and "under revision" columns to the "submitted" column (which is why I'm working on a Sunday). There is one old project that's just waiting for more data before resubmission. I've already put up calls here for readers to participate, so you've probably participated. But if anyone is willing to pass on this call for participation to their friends, it would be much appreciated. I personally think this is the most entertaining study I've run online, but for whatever reason it's never attracted the same amount of traffic as the others, so progress has been slow.

You can find the experiment (The Video Test) here.

Overnight data on lying and bragging

Many thanks to all those who responded to my call for data last week. By midnight, I had enough data to be confident of the results, and the results were beautiful. I would have posted about them here on Friday, but in the lead-up to this presentation, I did so much typing I burned out my wrists and have been taking a much-needed computer break.

The study looked at the interpretation of the word some. Under some conditions, people interpret some as meaning some but not all, but other times, it means simply not none. For instance, compare John did some of his homework with If you eat some of your green beans, you can have dessert. Changing some to some-but-not-all doesn't change the meaning of the first sentence, but (for most people) changes the interpretation of the second.

This phenomenon, called "scalar implicature," is one of the hottest topics in pragmatics -- a subdivision of linguistic study. The reasons for this are complex -- partly it's because Ira Noveck and his colleagues turned out a series of fascinating studies capturing a lot of people's attention. Partly it's because scalar implicature is a relatively easily-studied test case for several prominent theories. Partly it's other reasons.

Shades of meaning

On most theories, there are a few reasons some might be interpreted as some-but-not-all or not. The usual intuition is that part of why we assume John did some of his homework means some-but-not-all is that if it were true that John did all of his homework, the speaker would have just said so ... unless, of course, the speaker doesn't know if John did all his homework, or the speaker does know but has a good reason to obfuscate.

At least, that's what many theorists assume, but proving it has been hard. Last year, Bonnefon, Feeney & Villejoubert published a nice study showing that people are less likely to interpret some as some-but-not-all in so-called "face-threatening" contexts -- that is, when the speaker is being polite. For instance, suppose you are a poet and you send 10 poems to a friend to read. Then you ask the friend what she thinks, and she says, "Some of the poems need work." In this case, many people suspect that the friend actually means all of the poems need work, but is being polite.

The study

In this quick study, I wanted to replicate and build on Bonnefon et al.'s work. The experiment was simple. People read short statements and then answered a question about each one. The first two statement/question pairs were catch trials -- trials with simple questions and obvious answers. The small number of participants who got those wrong were excluded (presumably, they misunderstood the instructions or simply weren't paying attention).

The critical trial was the final one. Here's an example:
Sally: 'John daxed some of the blickets.'
'Daxing' is a neutral activity, neither good nor bad.
Based on what Sally said, how likely is it that John daxed ALL the blickets?
As you can see, the sentence contained unknown words ('daxing', 'blickets'), and participants were presented with a partial definition of one of them (that daxing is a neutral activity). The reason to do this was that it allowed us to manipulate the context carefully.

Each participant was in one of six conditions. Either Sally said "John daxed some...," as in the example above, or she said "I daxed some..." Also, "daxing" was described as either a neutral activity, as in the example above, or a negative activity (something to be ashamed of), or a positive activity (something to be proud of).

Results


As shown in the graph, whether daxing was described as positive, negative or neutral affected whether participants thought all the blickets were daxed (i.e., that some meant at least some rather than some-but-not-all) when Sally was talking about her own actions ("I daxed some of the blickets").

This makes sense: if 'daxing' is something to be proud of, then if Sally daxed all of the blickets, she'd say so. Since she didn't, people assume she daxed only some of them (far right blue bar in graph). Whereas if daxing is something to be ashamed of, then even if she daxed all of them, she might prefer to say "I daxed some of the blickets" as a way of obfuscating -- it's technically true, but misleading.

Interestingly, this effect didn't show up if Sally was talking about John daxing blickets. Presumably this is because people think the motivation to brag or lie is less strong when talking about a third person.

Also interestingly, people weren't overall more likely to interpret some as meaning some-but-not-all when the sentence was in the first-person ("I daxed..."), which I had predicted to be the case. As described above, many theories assume that some should only be interpreted as some-but-not-all if we are sure the speaker knows whether or not all holds. We should be more sure when the speaker is talking about her own actions than someone else's. But I didn't find any such effect. This could be because the theory is wrong, because the effect of using first-person vs. third-person is very weak, or because participants were at floor already (most people in all 6 conditions thought it was very unlikely that all the blickets were daxed, which can make it hard to detect an effect -- though it didn't prevent us from finding the effect of the meaning of 'daxing').

Afterword

I presented these data at a workshop on scalar implicature that I organized last Thursday. It was just one experiment of several dozen included in that talk, but it was the one that seemed to have generated the most interest. Thanks once again to all those who participated.

--------------
Bonnefon, J., Feeney, A., & Villejoubert, G. (2009). When some is actually all: Scalar inferences in face-threatening contexts. Cognition, 112(2), 249-258. DOI: 10.1016/j.cognition.2009.05.005

Need data by morning

In preparing a talk for a workshop I've organized tomorrow, I realized there was one simple experiment that would tie several pieces together neatly. Unfortunately, I hadn't run it. But I figured, hey, I can get data quick on Amazon Mechanical Turk.

Turk let me down. I don't know why, but today there aren't a lot of fish biting. So I turn to my usual, pre-Turk subject pool: you. The experiment takes 1-2 minutes -- it's really just 3 questions. As an added inducement to get people to run the experiment now, I'll be posting the results later this week or early next week.

Your age

Who participates in Web-based experiments? I recently analyzed preliminary results from about 4,500 participants in Keeping Things In Mind, an experiment I'm running in collaboration with a colleague and friend at TestMyBrain.org.

One of the things we're interested in is the age of people who participate. Here is the breakdown:


Not surprisingly, the bulk are college age (particularly freshmen). There are still a sizable number in their 30s, 40s and early 50s, but by the 60s it drops off considerably.

And then there are the few jokers who claim to be 3 or 100.

This is pretty similar to the breakdown I usually see at GamesWithWords.org, except that I usually have fewer tweens and more people in their 60s. But the mode is usually 18.

What this means for the experiment is that people in their 50s on up are woefully underrepresented. We're continuing to run the experiment in the hopes that more will participate.

Web Experiment Tutorial: Chapter 10, Recruiting Participants

Several years ago, I wrote a tutorial for my previous lab on how to create Web-based experiments in Flash. This is the final chapter in the original tutorial.


10. Recruiting participants online


1.     Overview

So now you have an experiment implemented on the Web. All you need are participants. Where do you get them?

If you need only very small numbers of subjects (50-100), this part is easy. If you want larger numbers of subjects, or if you want to run several experiments under the same URL (so as to prevent the same subject from participating in multiple versions of the experiment), this may be the most challenging part of Web-based experiments.

There are several methods you can use. I recommend using all of them. Each will be discussed below in turn, but briefly: you can list the experiments on experiment portal pages, you can recruit from within your own social network, you can buy ads, you can promote the experiments in online forums, you can blog, you can swap links with other researchers, and you can get media attention.

Media attention, if you can get it, is far more valuable than all those other methods combined.

2.     Experiment portal pages

There are several web sites that list online experiments. By far the one that has provided the most subjects to vacognition is:


The second-most useful is:


Others, much less useful, include:



Another place you can list is:


In the first 3 weeks of May, 2007, vacognition (my previous site) received 251 hits from psych.hanover, 67 from genpsylab-wexlist, 15 from onlinepsychresearch, and only 2 from language-experiments.

Here are some other lists I have not used, which may or may not be useful:



3.     Your social network

Your own friends and family are the most likely to be convinced to do your experiments. Some of them may pass along the URL to their friends and family. Every time I have sent out requests to my F&F, I get about 40-50 participants in various studies.

You can also use Web-based social networking. For instance, I have an account on Facebook.com. My page lists vacognition. A friend of mine created a Facebook “group” called “Harvard Studied My Brain.” We invited all our friends to join (about 200), and anybody on Facebook could in theory join if they found the group in a search. 35 people did join, and many more have clicked on the link.

To make the group more enticing, I created a “certificate of membership,” which members can download. Generally, it is good to think about why anybody would want to participate. What can you do to make it more fun?

Other social networking sites include Stumbleupon.com, Reddit.com and Digg.com. “vacognition” has accounts for all of those. Every time a new webpage mentions your website, it is a good idea to “vote” for that website on Stumbleupon, Reddit and Digg. This increases the likelihood people will surf to that page, and then to your page. However, these services only have an effect if a fairly large number of people vote for the site, and the traffic may or may not be high-quality. At one point, a number of people voted for vacognition on StumbleUpon. In the space of an hour, we suddenly got 150 visitors. However, most did not participate in any experiments, and the traffic died down within 90 minutes. This is likely due to the fact that users of StumbleUpon are randomly sent to the website. In contrast, users of Digg or Reddit know what sort of website they are going to and are more likely to actually be interested. Visitors we have gotten through Digg have been highly likely to participate in an experiment.

You can also add a link to your website as part of your email signature. Ask your labmates to do so as well. Ask your friends to link to your website from their websites.

4.     Purchasing ads

You can also purchase ads. One obvious place to put ads is Google. I have never tried this.

I did, however, buy ads on Facebook. For $5/day, Facebook promised to display my ad to at least 10,000 people on the Oberlin network. For another $5/day, it was displayed to another 10,000 people in the Harvard network.

I bought $20 worth of ads as an experiment. Vacognition got about 80 hits. That’s 4/$1. This is not very impressive, but it may be worth it. Also, my ads may not have been very good. (Keep in mind that a hit to the website does not mean that the person completed or even started an experiment!)
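The arithmetic behind that figure can be sketched in a few lines. The spend and hit counts come from the paragraph above; the completion rate is a made-up placeholder (the tutorial reports no such number) included only to show how quickly cost per actual participant rises once you account for visitors who never start an experiment.

```python
# Figures from the text: $20 of Facebook ads produced about 80 hits.
ad_spend_dollars = 20
hits = 80

hits_per_dollar = hits / ad_spend_dollars   # hits bought per dollar
cost_per_hit = ad_spend_dollars / hits      # dollars paid per hit


def cost_per_participant(spend, hits, completion_rate):
    """Dollars per completed experiment, given the fraction of hits that finish.

    completion_rate is hypothetical; it is not reported in the tutorial.
    """
    completed = hits * completion_rate
    if completed == 0:
        return float("inf")
    return spend / completed


print(f"{hits_per_dollar:.0f} hits per dollar, ${cost_per_hit:.2f} per hit")
# Assumed 25% completion rate, purely for illustration.
print(f"${cost_per_participant(ad_spend_dollars, hits, 0.25):.2f} per participant")
```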

5.     Online forums

Another way to recruit participants is to mention the website or a particular experiment in an online forum. Here, it is particularly important to make the post relevant to the forum discussion. Otherwise, you are spamming and may (not unfairly) receive hate mail.

There are many psychology or science forums. It may be perfectly fair to write a post called “Please help me finish this experiment.” Another option is to write about the topic you are studying (“Visual short-term memory is very limited. We are trying to find out exactly how limited. Please do this experiment.”). You can also be very oblique about it. Post something interesting about your area of research, and just mention your website (“It turns out 1/100 people have prosopagnosia from a young age, not as a result of stroke. This is something we’ve found through our online experiments at www.faceblind.org. In fact, blah blah blah.”)

You can also pick targeted forums. If you are studying reading, you can post on dyslexia or reading education forums. (“My colleagues and I are trying to better understand reading. The results may eventually help us better understand how to teach reading to children. We need volunteers for our short, 3-4 minute experiments. I thought that participants in this forum might be particularly interested…”)

Because I use Flash for my experiments, I have also posted on forums dealing with Flash programming (“You may be interested in this other application of Flash technology…”). Also, sometimes I have a question about Flash, and I post the question, with a link.

There are also website creation forums where you are encouraged to showcase your website.

The best success I have had with this method is when my experiment has been set up as a type of quiz. My visual short-term memory experiment gives people a score at the end. So I posted the experiment on several forums where people advertise quizzes (“How good is your short-term memory for what you see?”). Normally, a forum post generates only a small amount of traffic (0-10 hits), but these posts on quiz forums produced as many as 100 each.

Vacognition has accounts on many, many message forums.

6.     Swapping links

Reciprocal advertising is an easy-to-use but very limited strategy. Vacognition has a “links” page, where we link to other websites, mostly other Web-based experiments. In return, those websites link to ours.

This serves two purposes. First, visitors to those other sites may click on the link and come to our website. This is extremely rare.

Second, the more links there are to a website, the better its “page rank” – that is, the higher it appears in the list of search results. Swapping links improves your page rank, and thus you are easier to find through Google, etc. My data suggest that visitors that come via Google tend to be low-quality visitors – that is, they tend not to participate in experiments. However, a few do, and it doesn’t hurt.

Usually I arrange these link swaps by emailing the webmasters of websites that I think may be interested. Most do not respond, but some do.

7.     Media attention

By far the most effective method is media attention. Extremely successful online labs (like faceblind.org or the Moral Sense Test) get a lot of mainstream media attention, and they also get huge numbers of participants.

Media attention is hard to orchestrate. Ideally, your research will be so interesting that reporters will come to you. However, you can contact reporters yourself. The university can put out a press release. I got a fair amount of media attention after Georgetown wrote a press release about a paper I wrote.

In the end, though, you have to have work that is interesting to reporters and the public (see “The most important thing,” below).

8.     Blogging

Bloggers are more approachable members of the media. Bloggers of many shades and stripes may be interested in showcasing your experiments. And they are much more likely to respond to an email. Some blogs produce disappointing traffic. I guest-blogged for The New Scientist, whose blog gets a thousand hits a day, but I only got a few dozen hits out of it. However, Skepchic blogged (without my contacting her) about one of my experiments, and I got about 300 hits.

You can also write your own blogs. This will be of minimal help if you don’t attract a following, but even a blog with little following and only one new post every month or so can generate some traffic. The links from the blog to your website can also help your page rank (see “Swapping links”).



9.     Email list

We maintain an email list. On several parts of the website, visitors are encouraged to join a Google Groups email list, which now has over 100 members. The list is emailed when new experiments are posted or results have been posted, although I try to keep this to a minimum. If you overuse an email list, people tend not to read the messages and/or withdraw from the list.

Setting up a Google Groups email list is simple, and it can be set up so that anyone can join. Vacognition’s list can be found here:


10.  The most important thing

When recruiting participants, you should always keep in mind one question:

Why would anybody want to participate in this experiment?

Participants are expending resources (time, energy, and sometimes money) in order to participate. What is the product that you are selling them?

This is particularly important when trying to generate media attention – whether newspapers or bloggers. You may get your brother-in-law to blog about your experiment as a favor (mine did), but most bloggers aren’t going to write about something if they don’t find it interesting. Make it interesting. Testmybrain.org is a great example of a site that is fun, and not surprisingly it gets tons of traffic.

However, this issue is important even when using online experiment lists. Anyone who visits an online experiment list is already interested in doing online experiments. However, these lists post many experiments. No visitor is going to do all of them. So how do they choose which one(s) to do? Presumably, this is partly a function of how interesting the experiment looks. Compare:

“This experiment investigates the role of proactive interference in estimates of visual short-term memory capacity.”

with

“How much of what you see can you remember? Probably less than you thought. Take this 5 minute quiz to see how many visual objects you can remember. Typical scores are between 1 and 3 objects.”

Which experiment sounds more interesting? They are the same experiment.

You will want to craft your pitch to your audience. If you are posting on a forum for vision scientists, the 2nd description above may come across as patronizing. However, if you are posting to a forum about online quizzes and games, the 1st description will probably get you banned from the forum for spamming.

The design of the website itself also matters. An ugly, unprofessional-looking website will turn away visitors. Many participants are participating because they are interested in science. Make sure they learn something. Post results. Have pages that discuss the research topics. Make sure the debriefing is informative. Many participants find seeing their own results very motivating, so if possible, try to incorporate that into the experiment.

You can also experiment in your advertising. Try different pitches. See which work the best. Modify the website and see if the number of visitors who actually participate in experiments increases or decreases.

11. Where does GamesWithWords.org get its traffic?

I currently use Google Analytics to track my web traffic. Here is what it shows for the top 10 referrers from Dec 1, 2009 through April 21, 2010:

As you can see, the biggest chunk of traffic comes from people simply typing in the name of the site. Word of mouth seems to do a great deal. One thing to consider, however, is also the average time on site and the bounce rate. By these measures, the direct traffic is better than those who come via Google.

I should note that this traffic is dwarfed by weeks in which I get media attention. I can easily get several thousand visits per day when the site is mentioned in a prominent news source (which has not happened in the last few months, unfortunately). Notice also that while there are many other sources of traffic beyond the top 10 listed here, all the rest combined only contributed 1,385 visits.

Games with Words: New Web lab launched

The new Lab is launched (finally). I was a long ways from the first to start running experiments on the Web. Nonetheless, when I got started in late 2006, the Web had mostly been used for surveys, and there were only a few examples of really successful Web laboratories (like the Moral Sense Test, FaceResearch and Project Implicit). There were many examples of failed attempts. So I wasn't really sure what a Web laboratory should look like, how it could best be utilized, or what would make it attractive and useful for participants.

I put together a website known as Visual Cognition Online for the lab I was working at. I was intrigued by the possibility of running one-trial experiments. Testing people involves a lot of noise, so we usually try to get many measurements (sometimes hundreds) from each participant, in order to get a good estimate of what we're trying to measure. Sometimes this isn't practical.
The best analogy that comes to mind is football. A lot of luck and random variation goes into each game, so ideally, we'd like each team to play each other several times (like happens in baseball). However, the physics of football makes this impractical (it'd kill the players).

Running a study on the Web makes it possible to test more participants, which means we don't need as many trials from each. A few studies worked well enough, and I got other good data along the way (like this project), so when the lab moved to MN and I moved to graduate school, I started the Cognition and Language Lab along the same model.
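The trade-off between trials and participants can be illustrated with a small calculation. This is a minimal sketch under a simple assumed model, not an analysis from any of the studies mentioned: each participant has a true score that varies between people (sd_between), and each trial adds independent measurement noise (sd_within). Both standard deviations and the sample sizes are invented for illustration.

```python
import math


def standard_error(n_participants, trials_each, sd_between=1.0, sd_within=3.0):
    """Standard error of the group mean under the assumed two-level noise model.

    Between-person variance shrinks only with more participants;
    trial noise shrinks with the total number of trials collected.
    """
    between = sd_between ** 2 / n_participants
    within = sd_within ** 2 / (n_participants * trials_each)
    return math.sqrt(between + within)


# Hypothetical lab study: few participants, many trials each.
lab = standard_error(n_participants=20, trials_each=100)
# Hypothetical Web study: many participants, one trial each.
web = standard_error(n_participants=2000, trials_each=1)

print(f"lab: {lab:.3f}  web: {web:.3f}")
```

Under these assumptions, the one-trial Web design gives the tighter estimate of the group mean, because adding trials per person does nothing to reduce the variation between people.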

Web Research blooms

In the last two years, Web research has really taken off, and we've all gotten a better sense of what it was useful for. The projects that make me most excited are those run by the likes of TestMyBrain.org, Games with a Purpose, and Phrase Detectives. These sites harness the massive size of the Internet to do work that wasn't just impossible before -- it was frankly inconceivable.

As I understand it, the folks behind Games with a Purpose are mainly interested in machine learning. They train computer programs to do things, like tag photographs according to content. To train their computer programs, they need a whole bunch of photographs tagged for content; you can't test a computer -- or a person -- if you don't know what the correct answer is. Their games are focused around doing things like tagging photographs. Phrase Detectives does something similar, but with language.

The most exciting results from TestMyBrain.org (full disclosure: the owner is a friend of mine, a classmate at Harvard, and also a collaborator) have focused on the development and aging of various skills. Normally, when we look at development, we test a few different age groups. An extraordinarily ambitious project would test some 5 year olds, some 20 year olds, some 50 year olds, and some 80 year olds. By testing on the Web, they have been able to look at development and aging from the early teenage years through retirement age (I'll blog about some of my own similar work in the near future).

Enter: GamesWithWords.org

This Fall, I started renovating coglanglab.org in order to incorporate some of the things I liked about those other sites. The project quickly grew, and in the end I decided that the old name (Cognition and Language Lab) just didn't fit anymore. GamesWithWords.org was born.

I've incorporated many aspects of the other sites that I like. One is simply to make the site more engaging (reflected, I hope, in the new name). It's always been my goal to make the Lab interesting and fun for participants (the primary goal of this blog is to explain the research and disseminate results), and I've tried to adopt the best ideas I've seen elsewhere.

Ultimately, of course, the purpose of any experiment is not just to produce data, but to produce good data that tests hypotheses and furthers theory. This sometimes limits what I can do with experiments (for instance, while I'd love to give individualized feedback to each participant for every experiment, sometimes the design just doesn't lend itself to feedback). Of the two experiments that are currently live, one offers feedback and one doesn't.

I'll be writing more about the new experiments over the upcoming days.

Recruiting Laboratory Participants

I am in the process of revamping the Internet laboratory, as I'm trying to increase the number of participants. Some very successful websites recruit ~500/day. I have been averaging about 30/day -- still respectable, but it limits what I can do.

In this context, I read recent reports from the folks behind Phrase Detectives with interest. Phrase Detectives, it appears, gets a slightly greater amount of traffic than I do. What I focused on was their method of advertising and how well it works. They noted that their traffic comes in the following forms:

direct: 46%
website link: 29%
search: 12%
Facebook advertisement: 13%

Then they looked at the bounce rate (the percentage of visitors who arrive at the home page then scoot away) for each of these sources:

direct: 33%
link: 29%
search: 44%
Facebook advertisement: 90%

It appears that paid advertisements -- the only one of these sources that actually costs money -- aren't worth much. In the end, only 4% of visitors who didn't bounce came through the paid advertisement.
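Combining the two lists above is straightforward: weight each source's share of traffic by the fraction of its visitors who stay, then renormalize. The sketch below does this with the rounded percentages quoted in this post. It is an illustration rather than the Phrase Detectives team's own calculation, and because the inputs are rounded, the result for the paid advertisement lands in the low single digits without necessarily matching their reported figure exactly.

```python
# Percentages as quoted in the post (share of all traffic, and bounce rate).
traffic_share = {"direct": 46, "link": 29, "search": 12, "facebook_ad": 13}
bounce_rate = {"direct": 33, "link": 29, "search": 44, "facebook_ad": 90}


def engaged_share(traffic_share, bounce_rate):
    """Percentage of non-bouncing visitors contributed by each source."""
    engaged = {
        source: traffic_share[source] * (100 - bounce_rate[source]) / 100
        for source in traffic_share
    }
    total = sum(engaged.values())
    return {source: 100 * value / total for source, value in engaged.items()}


shares = engaged_share(traffic_share, bounce_rate)
for source, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{source}: {pct:.1f}% of non-bouncing visitors")
```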

Are Web-Based Experiments Reliable? The Data Say 'Yes.'

After a few months, I'm back to the task of getting the Video Test experiments published. As I mentioned last year, the paper had run aground partly due to reviewers' skepticism about Web-based experiments.

I sat down to improve the section of the paper that justifies using Web-based experiments. That required looking for other published experiments. I've done this haphazardly over the years, but this time I was much more systematic. I knew there were a fair number of published Web surveys, but I was surprised to discover there are many, many more published Web-based experiments than I thought. I also turned up a fairly large number of studies in which researchers directly compared Web-based and lab-based studies, typically finding the former to be as reliable as the latter.

In fact, I found so much I almost felt silly writing the justification. It seems strange to be justifying what has become essentially a well-established method. In fact, many researchers who write up Web-based experiments don't even bother to do so.

The Data
Without further ado, here is a draft of that justification:

Internet-based experiments have become increasingly popular in recent years, with at least 21% of APA journals having published at least one paper relying on Internet-based methods (Skitka & Sargis, 2006). In cognitive and perceptual research, domains in which the methodology has been particularly productive include face perception (inter alia, Bestelmeyer, Jones, DeBruine, Little & Welling, in press; Boothroyd, Jones, Burt, Cornwell, Little, Tiddeman & Perrett, 2005; Feinberg, DeBruine, Jones & Little, 2008; Feinberg, Jones, DeBruine, Moore, Smith, Cornwell, Tiddeman, Boothroyd & Perrett, 2005; Fessler & Navarrete, 2003; Little, Burriss, Jones, DeBruine & Caldwell, 2008; Little, Jones & Burriss, 2007; Little, Jones, Burt & Perrett, 2007; Little, Jones & DeBruine, 2008; Little, Jones, DeBruine & Feinberg, 2008; Smith, Jones, DeBruine & Little, in press; Welling, Jones & DeBruine, 2008; Wilson & Daly, 2004) and reaction-time based studies of implicit social biases (inter alia, Bar-Anan, Nosek & Vianello, in press; Graham, Haidt & Nosek, in press; Lindner & Nosek, 2009; Nosek & Hansen, 2008; Ranganath & Nosek, 2008; Schwartz, Vartanian, Nosek & Brownell, 2006).

A number of researchers have directly compared the results of Internet-based and laboratory-based studies, finding that the former are highly reliable and the two methods produce similar results, both within and between subjects (Buchanan & Smith, 1999; Gosling, Vazire, Srivastava & John, 2004; Linnman, Carlbring, Ahman, Andersson & Andersson, 2004; McGraw, Tew, & Williams, 2000; Meyerson & Tryon, 2003; Ollesch, Heineken & Schulte, 2006; Srivastava, John, Gosling & Potter, 2003). Importantly for the present work, a recent study of VWM found converging results from Internet-based and laboratory-based methods (Hartshorne, 2008).




Bar-Anan, Y., Nosek, B. A., & Vianello, M. (in press). The sorting paired features task: A measure of association strengths. Experimental Psychology.

Bestelmeyer, P. E. G., Jones, B. C., DeBruine, L. M., Little, A. C., & Welling, L. L. M. (in press). Face aftereffects demonstrate interdependent processing of expressions and the invariant characteristics of sex and race. Visual Cognition.

Boothroyd, L. G., Jones, B. C., Burt, D. M., Cornwell, R. E., Little, A. C., Tiddeman, B. P., & Perrett, D. I. (2005). Facial masculinity is related to perceived age but not perceived health. Evolution and Human Behavior, 26, 417-431.

Buchanan, T., & Smith, J. L. (1999). Using the Internet for psychological research: Personality testing on the World Wide Web. British Journal of Psychology, 90, 125-144.

Feinberg, D. R., DeBruine, L. M., Jones, B. C., & Little, A. C. (2008). Correlated preferences for men’s facial and vocal masculinity. Evolution and Human Behavior, 29, 233-241.

Feinberg, D. R., Jones, B. C., DeBruine, L. M., Moore, F. R., Smith, M. J. L., Cornwell, R. E., Tiddeman, B. P., Boothroyd, L. G., & Perrett, D. I. (2005). The voice and face of woman: One ornament that signals quality? Evolution and Human Behavior, 26, 398-408.

Fessler, D. M. T., & Navarrete, C. D. (2003). Domain-specific variation in disgust sensitivity across the menstrual cycle. Evolution and Human Behavior, 24, 406 – 417.

Gosling, S. D., Vazire, S., Srivastava, S., & John, O. P. (2004). Should we trust web-based studies? A comparative analysis of six preconceptions about Internet questionnaires. American Psychologist, 59, 93-104.

Graham, J., Haidt, J., & Nosek, B. A. (in press). Liberals and conservatives rely on different sets of moral foundations. Journal of Personality and Social Psychology.

Hartshorne, J. K. (2008). Visual working memory capacity and proactive interference. PLoS ONE, 3(7), e2716.

Lindner, N. M., & Nosek, B. A. (2009). Alienable speech: Ideological variations in the application of free-speech principles. Political Psychology, 30, 67-92.

Linnman, C., Carlbring, P., Ahman, A., Andersson, H., & Andersson, G. (2004). The Stroop effect on the Internet. Computers in Human Behavior, 22, 448-455.

Little, A. C., Burriss, R. P., Jones, B. C., DeBruine, L. M., & Caldwell, C. C. (2008). Social influence in human face preference: men and women are influenced for long-term but not short-term attractiveness decisions. Evolution and Human Behavior, 29, 140-146.

Little, A. C., Jones, B. C., & Burriss, R. P. (2007). Preferences for masculinity in male bodies change across the menstrual cycle. Hormones and Behavior, 52, 633-639.

Little, A. C., Jones, B. C., Burt, D. M., & Perrett, D. I. (2007). Preferences for symmetry in faces change across the menstrual cycle. Biological Psychology, 76, 209-216.

Little, A. C., Jones, B. C., & DeBruine, L. M. (2008). Preferences for variation in masculinity in real male faces change across the menstrual cycle. Personality and Individual Differences, 45: 478-482.

Little, A. C., Jones, B. C., DeBruine, L. M., & Feinberg, D. R. (2008). Symmetry and sexual-dimorphism in human faces: Interrelated preferences suggest both signal quality. Behavioral Ecology, 19: 902-908.

Meyerson, P. & Tryon, W. W. (2003). Validating Internet research: a test of the psychometric equivalence of Internet and in-person samples. Behavior Research Methods, Instruments & Computers, 35, 614-620.

Nosek, B. A., & Hansen, J. J. (2008). The associations in our heads belong to us: Searching for attitudes and knowledge in implicit evaluation. Cognition and Emotion, 22, 553-594.

Ollesch, H., Heineken, E., & Schulte, F. P. (2006). Physical or virtual presence of the experimenter: Psychological online-experiments in different settings. International Journal of Internet Science, 1(1), 71-81.

Ranganath, K. A., & Nosek, B. A. (2008). Implicit attitude generalization occurs immediately, explicit attitude generalization takes time. Psychological Science, 19, 249-254.

Schwartz, M. B., Vartanian, L. R., Nosek, B. A., & Brownell, K. D. (2006). The influence of one's own body weight on implicit and explicit anti-fat bias. Obesity, 14(3), 440-447

Smith, F. G., Jones, B. C., DeBruine, L. M., & Little, A. C. (in press). Interactions between masculinity-femininity and apparent health in face preferences. Behavioral Ecology.

Srivastava, S., John, O. P., Gosling, S. D., & Potter, J. (2003). Development of personality in early and middle adulthood: Set like plaster or persistent change? Journal of Personality and Social Psychology, 84, 1041-1053.

Welling, L. L. M., Jones, B. C., & DeBruine, L. M. (2008). Sex drive is positively associated with women’s preferences for sexual dimorphism in men’s and women’s faces. Personality and Individual Differences, 44(1): 161-170.

Wilson, M., & Daly, M. (2004). Do pretty women inspire men to discount the future? Proceedings of the Royal Society of London B: Biological Sciences, 271, S177-S179.