Field of Science


Citizen Science in Harvard Magazine

A nice, extended article on recent projects, covering a wide range -- including GamesWithWords.org. Check it out.

Science Mag studies science. Forgets to include control group.

Today's issue of Science carries the most meta sting operation I have ever seen. John Bohannon reports a study of open access journals, showing lax peer review standards. He sent 304 fake articles with obvious flaws to 304 open access journals; more than half of the journals accepted them.

The article is written as a stinging rebuke of open access journals. Here's the interesting thing: There's no comparison to traditional journals. For all we know, open access journals actually have *stricter* peer review standards than traditional journals. We all suspect not, but suspicion isn't supposed to count as evidence in science. Or in Science.

So this is where it gets meta: Science -- which is not open access -- published an obviously flawed article about open access journals publishing obviously flawed articles.

It would be even better if Bohannon's article had run in the "science" section of Science rather than in the news section, where it actually ran. But hopefully we can agree that Science can't absolve itself of checking its articles for factual accuracy and logical coherence just by labeling them "news".

GamesWithWords on Scientific American

Over the last week, ScientificAmerican.com has published two articles by me. The most recent, "Citizen Scientists decode meaning, memory and laughter," discusses how citizen science projects -- science projects involving collaborations between professional scientists and amateur volunteers -- are now being used to answer questions about the human mind.

Citizen Science -- projects which involve collaboration between professional scientists and teams of enthusiastic amateurs -- is big these days. It's been great for layfolk interested in science, who can now not just read about science but participate in it. It has been great for scientists, with numerous mega-successes like Zooniverse and Foldit. Citizen Science has also been a boon for science writing, since readers can literally engage with the story.
However, the Citizen Science bonanza has not contributed to all scientific disciplines equally, with many projects in zoology and astronomy but fewer in physics and the science of the mind. It is perhaps no surprise that there have been few Citizen Science projects in particle physics (not many people have accelerators in their back yards!), but the fact that there has been very little Citizen Science of the mind is all the more remarkable.

The article goes on to discuss three new mind-related citizen science projects, including our own VerbCorner project.

The second, "How to understand the deep structures of language," describes some really exciting work on how to explain linguistic universals -- work that was conducted by colleagues of mine at MIT.
In an exciting recent paper, Ted Gibson and colleagues provide evidence for a design-constraint explanation of a well-known bias involving case endings and word order. Case-markers are special affixes stuck onto nouns that specify whether the noun is the subject or object (etc.) of the verb. In English, you can see this on pronouns (compare "she talked with her"), but otherwise English, like most SVO languages (languages where the typical word order is Subject, Verb, Object), does not mark case. In contrast, Japanese, like most SOV languages (languages where the typical word order is Subject, Object, Verb), does mark case, with -wa added to subjects and -o added to direct objects. "Yasu saw the bird" is translated as "Yasu-wa tori-o mita," and "The bird saw Yasu" is translated as "Tori-wa Yasu-o mita." The question is why this relationship between case-marking and SOV word order exists.
The article ran in the Mind Matters column, which invites scientists to write about the paper that came out in the last year that they are most excited about. It was very easy for me to choose this one.

Professor -- The Easiest Job in the World

There has been a small kerfuffle over Susan Adams's article at Forbes, titled "The least stressful jobs of 2013":
University professors have a lot less stress than most of us. Unless they teach summer school, they are off between May and September and they enjoy long breaks during the school year, including a month over Christmas and New Year's and another chunk of time in the spring. Even when school is in session they don't spend too many hours in the classroom ... Working conditions tend to be cozy and civilized and there are minimal travel demands...
She also mentions the great job prospects ("Universities are expected to add 305,700 adjunct and tenure-track professors by 2020").

To her credit, Adams has added a sizable addendum to her article, correcting -- but not apologizing for -- her mistakes. Unfortunately, this is far from the first time this kind of article has appeared in a major publication. Some time back, a columnist for the New York Times wrote an article suggesting that the solution to rising costs of higher education was to make professors work more than a few hours a week. An article in the New Yorker casually noted that the new head of a particular company was concerned that his employees worked "the hours of college professors" (I initially assumed they meant "way too hard" and that the boss wanted them to take a break!). What gives?

Scicurious suggests it's the curse of half-knowledge:
The vast majority of us aren't teachers or professors, but we've all been students, right? ... We thought that, because of what we saw of them in our classes, we knew what they did ... Because of this half-knowledge, people make assumptions about our jobs, assumptions that can really affect how we are perceived as people...
 That is no doubt part of it, but it also requires that people not think very hard. If I heard that someone made a pretty good living working only a few hours a week, it would immediately set off my implausibility alarm. I mean, what are the chances? And you'd only have to think for a moment to realize this can't be true.

Adams got hundreds of comments and letters pointing out that professors, in addition to giving a few lectures a week, also grade papers, advise students, write papers and books, go to conferences, give invited talks, etc. Adams presents this as if it came as a surprise, but that seems equally implausible. I'm going to assume she's read one or two articles about medicine or science, in which the people discussed are inevitably professors. In fact, articles about politics occasionally cite professors as well. If she went to college, she knows that professors have office hours and grade papers. Many of the books on science and politics in the bookstore are written by faculty, as are essentially all college textbooks.

Even if she had never attended college, never interacted with a professor, and didn't read articles about higher education, a few minutes of Googling prior to writing her article would have corrected that mistake. My guess is that she didn't really think about her article before writing it and didn't consult either her own memory or Google because she -- and the others who write similar articles -- wanted this crazy claim about the lazy professor to be true. The interesting question is why she wanted it to be true. Anti-intellectualism? A desire to believe that such cushy jobs really exist? Or is this just an example of one of those ideas that are crazy enough that they inspire belief (like one of those many apocryphal "weird facts")?

*I do realize that some professors do very little work. Some people in all professions do very little work.

Is Psychology a science?: Redux

The third-most read post on this blog is "Is Psychology a science?". I was a few years younger then and still had strong memories of one of my friends complaining, when we were both undergraduates, that he had to take a psychology course as part of his science distributional requirements. "Psychology isn't a science," he said, "because they don't do experiments." Since he was telling me this over AIM as I was sitting in my psychology laboratory, analyzing an experiment, it didn't go over well.

It's been a popular post, but I haven't written about the subject much since, in part because I started to suspect that the "psychology isn't a science" bias might actually be a thing of ignorant undergraduates and a few cranks. It's not something I've heard much in the last few years, and there's no need to write diatribes against a non-existent prejudice.

In retrospect, maybe I haven't come across these opinions because I mostly hang out with other psychologists. A colleague recently forwarded me this blog post ("Keep Psychology out of the science club"), which links to a few other similar pieces on blogs and in newspapers. So it seems the issue is alive and well.

Some articles one comes across are of the "psychologists don't do experiments" variety; these are easily explained by ignorance and an inability to use Google. But some folks raise some real concerns which, while I think they are misplaced, really are worth thinking about.


Psychology is too hard

One common theme that I came across is that psychology is simply too difficult. We'll never understand human behavior very well, so maybe we shouldn't even try. For instance, Gary Gutting, writing in the Opinionator at the New York Times, said:
Social sciences may be surrounded by the "paraphernalia" of the natural sciences, such as technical terminology, mathematical equations, empirical data and even carefully designed experiments. But when it comes to generating reliable scientific knowledge, there is nothing more important than frequent and detailed predictions of future events ... while the physical sciences produce many detailed and precise predictions, the social sciences do not ... Because of the many interrelated causes at work in social systems, many questions are simply "impervious to experimentation" ... even when we can get reliable experimental results, the causal complexity restricts us...
In a Washington Post editorial, Charles Lane wrote:
The NSF shouldn't fund any social science. Federal funding for mathematics, engineering and other "hard" sciences is appropriate. In these fields, researchers can test their hypotheses under controlled conditions; then those experiments can be repeated by others. Though quantitative methods may rule economics, political science and psychology, these disciplines can never achieve the objectivity of the natural sciences. Those who study social behavior -- or fund studies of it -- are inevitably influenced by value judgments, left, right, and center. And unlike hypotheses in the hard sciences, hypotheses about society usually can't be proven or disproven by experimentation. Society is not a laboratory.
Alex Berezow at the Newton Blog agrees:
Making useful predictions is a vital part of the scientific process, but psychology has a dismal record in this regard.
Is that a fair critique?

These writers don't entirely miss the mark. It really is true that psychology does not make as precise or as accurate predictions as, say, physics. That is not the same thing as saying that we can't make any predictions. Berezow complains about happiness research:
Happiness research is a great example of why psychology isn't a science. How exactly should "happiness" be defined? The meaning of the word differs from person to person, and especially between cultures. What makes Americans happy doesn't necessarily make Chinese people happy. How does one measure happiness? Psychologists can't use a ruler or a microscope, so they invent an arbitrary scale. Today, personally, I'm feeling about a 3.7 out of 5. How about you? ...  How can an experiment be consistently reproducible or provide any useful predictions if the basic terms are vague and unquantifiable?
That's a great question! Let's start with the facts. It is true that we don't know exactly what it means to be a 3.7 on a scale of 1-5. But we do know a few interesting things.

People's predictions of how happy they will rate themselves in the future are systematically biased. People say that good things (like getting tenure) will make them very happy (a 5 out of 5) and that bad things (like not getting tenure) will make them very sad (a 1 out of 5), but when you ask those same people to rate their happiness a little while after the event, they generally rate themselves as not nearly so happy or unhappy as they predicted. (Similarly, people who lose a limb usually rate themselves as about as happy afterwards as before, provided you give them a little time to adjust.) People who have children normally see a drop in how happy they rate themselves, and they only start to recover after their children leave the nest. There is also the "future anhedonia" effect: people think that good things (e.g., an ice cream sundae) will make them happier now (on our 1-5 scale) than those same good things would make them in the future, and conversely for bad things (e.g., doing my homework won't feel so bad if I do it tomorrow rather than today). And so on. (These and many other examples can be found in Dan Gilbert's Stumbling on Happiness.)

These and other findings are highly reliable, despite the fact that we don't have a direct, objective measurement of happiness. In fact, as Dan Gilbert has pointed out, we would only consider that "direct" measurement to be a measurement of happiness if it correlated really well with how happy people said they were. To the extent it diverged from how happy people claim to be, we would start to distrust the "direct" measurement.

I personally am glad that we know what we know about happiness, though I wish we knew more. I picked happiness to defend because I've noticed that even those who defend psychology in comments sections give up happiness research as a lost cause. I think it's pretty interesting, useful work. It would be even easier to defend, for instance, low-level vision research, which makes remarkably precise predictions, has clear theories of the relationship between the psychological phenomena and the neural implementations, etc. (See also this post for some psychology success stories.)

Just how good do you need your predictions to be?

Still, it is true that we can't always make the precise predictions that can be made in some other fields. Of course, other fields can't always make precise predictions, either. While physicists are great at telling you what will happen to rigid objects moving through vacuums, predicting the motions of real objects in the real world has traditionally been a lot harder, and understanding fluid dynamics has been deeply problematic (though I understand this has been getting a lot better in recent years). And that's without pulling out the Heisenberg Uncertainty Principle, which should cause anyone who wants precise, deterministic predictions to declare physics a non-science.

Also, some parts of psychology are able to make much more precise predictions than others. Anything amenable to psychophysics tends to be much more precise, and vision researchers, as already noted, have remarkably well worked-out theories of low- and mid-level vision.

This line of discussion also raises an interesting question: when exactly did physics become a science? Was it a science in Newton's day, when we still knew squat about electromagnetism -- much less elementary particles -- and couldn't make even rough predictions about turbulent air or fluid systems? And to people 350 years from now, will the physics of today seem like a "real" science (my guess: no)?

Worries

Berezow ends his post with the following caution:
To claim [psychology] is a "science" is inaccurate. Actually, it's worse than that. It's an attempt to redefine science. Science, redefined, is no longer the empirical analysis of the natural world; instead, it is any topic that sprinkles a few numbers around. This is dangerous, because, under such a loose definition, anything can qualify as science. And when anything qualifies as science, science can no longer claim to have a unique grasp on secular truth.
I have a different worry. My worry is that someone gets ahold of a time machine, goes back to 1661, and convinces Newton to lay off that non-scientific "physics" crap. Pre-Newtonian physics was a hodgepodge of knowledge, little resembling what we think of as science today. Making precise predictions about the messy, physical world we live in no doubt seemed an impossible pipe-dream to many. Luckily, folks like Newton kept plugging away, and three and a half centuries later, here we are.

We should keep in mind that the serious study of the mind only began in the mid-1800s; physics has a significant head-start. And, as the anti-psychology commentators are happy to point out, psychology is much, much harder than physics or chemistry. But the only reason I can see to pull the plug is if we are sure that (a) we have learned nothing in the last 150 years, and (b) we will never make any further progress. These are empirical claims and so subject to test (I think the first one has already been falsified). So here's a proposed experiment: psychologists keep on doing psychology, and people who don't want to don't have to. And we'll wait a few decades and see who knows more about the human mind.

Survey results: Where do you get your science news?

The last poll asked people where they get their science news. Most folks reported that they get science news from blogs, which isn't surprising, since they were reading this blog. Interestingly, less than 10% reported getting science news from newspapers. This fits my own experience; once I discovered science blogs, I stopped reading science news in newspapers altogether.

I would report the exact numbers for the poll, but Blogger ate them. I can tell that it still has all the data (it remembers that I voted), but is reporting 0s for every category. I'll be switching to Google Forms for the next survey.

Arcadia

The super-lame New Yorker review of the recent Broadway revival of Stoppard's "Arcadia" moved me to do a rare thing: write a letter to the editor. They didn't publish it, despite the fact that -- and I think I'm being objective here -- my letter was considerably better than the review. Reviews are no longer free on the New Yorker website (you can see a synopsis here), but I think my letter covers the main points. Here it is:

Hilton Als ("Brainstorm", Mar 28) writes about the recent revival of "Arcadia" that Stoppard's "aim is not to show us people but to talk about ideas." Elsewhere, Als calls the show unmoving and writes that Stoppard does better with tragicomedies.
"Arcadia" is not a show about ideas. It is about the relationship people have with ideas, particularly their discovery. Anyone who has spent any amount of time around academics would instantly recognize the characters as people, lovingly and realistically depicted. (Als singles out Billy Crudup's "amped-up characterization of the British historian Bernard Nightengale" as particularly mysterious. As Ben Brantley wrote in the New York Times review, "If you've spent any time on a college campus of late, you've met this [man].")
As an academic, I found the production a mirror on my own life and the people around me. Not everyone will have that experience. The beauty of theater (and literature) is that it gives us a peek into the inner lives of folk very different from ourselves. It is a shame Als was unable to take advantage of this opportunity.
Where the play focuses most closely on ideas, it is on the theme of an idea (Thomasina's) stillborn before its time. If one feels no pathos for an idea that came too soon, translate "idea" into "art" and "scientist" into "artist" and consider the tragedies of artists unappreciated in their time and quickly forgotten. Even a theater critic can find the tragedy in that.

New York Times, You Can't Handle the Truth.

Earlier today I wrote about the research behind an opinion article at the New York Times. When I looked at the sources cited, I was unable to find any information supporting the claims made in the article. In fact, what I found directly contradicted those claims. I finished by saying that while I was willing to believe these claims, I'd like to know what data support them. In passing, I mentioned that I had submitted an abbreviated version of this analysis as a comment on the Times website.

That comment was not published. I figured maybe there had been a computer error, so I submitted another one later in the day. That one was also not published. Finally, at 6:13pm, I submitted an innocuous and useless comment under an assumed name:
I agree with Pat N. It's nice to hear from someone who has some optimism (@ Dr. Q).
This comment was published almost immediately.


The Times states that "comments are moderated and generally will be posted if they are on-topic and not abusive." Since the moderators didn't publish the comment, we can conclude one of two things:

1) Discussion of the empirical claims made in a New York Times article is not "on topic."
2) Pointing out a mistake made in a New York Times article is a kind of abuse.

And for my next trick, I'll make this effect disappear!

In this week's New Yorker, Jonah Lehrer shows once again just how hard it is to do good science journalism if you are not yourself a scientist.

His target is the strange phenomenon that many high profile papers are failing to replicate. This has been very much a cause celebre lately, and Lehrer follows a series of scientific papers on the topic as well as an excellent Atlantic article by David Freedman. At this point, many of the basic facts are well-known: anecdotally, many scientists report repeated failures to replicate published findings. The higher-profile the paper, the less likely it is to replicate, with around 50% of the highest-impact papers in medicine failing to replicate. As Lehrer points out, this isn't just scientists failing to replicate each other's work, but scientists failing to replicate their own work: a thread running through the article is the story of Jonathan Schooler, a professor at UC-Santa Barbara who has been unable to replicate his own seminal graduate student work on memory.

Lehrer's focus in this article is shrinking effects.



Some experimental effects seem to shrink steadily over time:
In 2001, Michael Jennions, a biologist at the Australian National University, set out to analyze "temporal trends" across a wide range of subjects in ecology and evolutionary biology. He looked at hundreds of papers and forty-four meta-analyses (that is, statistical syntheses of related studies), and discovered a consistent decline effect over time, as many of the theories seemed to fade into irrelevance.
As described, that's weird. But there is a good explanation for such effects, and Lehrer brings it up. Some results are spurious. It's just one of those things. Unfortunately, spurious results are also likely to be exciting. Let's say I run a study looking for a relationship between fruit-eating habits and IQ. I look at the effects of 20 different fruits. By chance, one of them will likely show a significant -- but spurious -- effect. So let's say I find that eating an apple every day leads to a 5-point increase in IQ. That's really exciting because it's surprising -- and the fact that it's not true is integral to what makes it surprising. So I get it published in a top journal (top journals prefer surprising results).

Now, other people try replicating my finding. Many, many people. Most will fail to replicate, but some -- again by chance -- will replicate. It is extremely difficult to get a failure to replicate published, so only the replications get published. After time, the "genius apple hypothesis" becomes part of established dogma. Remember that anything that challenges established dogma is exciting and surprising and thus easier to publish. So now failures to replicate are surprising and exciting and get published. When you look at effect-sizes in published papers over time, you will see a gradual but steady decrease in the "effect" of apples -- from 5 points to 4 points down to 0.
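
This dynamic is easy to simulate. Below is a minimal sketch in Python (the sample sizes and number of labs are invented purely for illustration): many labs study a truly null "genius apple" effect, and only the positive, "significant" results get published.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_per_group, n_labs = 25, 1000

    published_effects = []
    for _ in range(n_labs):
        apple = rng.normal(0, 1, n_per_group)    # apple-eaters; the true effect is zero
        control = rng.normal(0, 1, n_per_group)  # non-apple-eaters
        t, p = stats.ttest_ind(apple, control)
        if p < 0.05 and t > 0:                   # journals only see positive, "significant" results
            published_effects.append(apple.mean() - control.mean())

    print(f"published: {len(published_effects)} of {n_labs} studies")
    print(f"mean published effect: {np.mean(published_effects):.2f} sd (true effect: 0)")

The published record reports a sizable effect where none exists, and honest replications will regress toward zero -- which, in aggregate, looks exactly like a mysterious decline.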

Where I get off the Bus

So far so good, except here's Lehrer again:
While the publication bias almost certainly plays a role in the decline effect, it remains an incomplete explanation. For one thing, it fails to account for the initial prevalence of positive results among studies that never even get submitted to journals. It also fails to explain the experience of people like Schooler, who have been unable to replicate their initial data despite their best efforts.
Huh? Lehrer seems to be suggesting that it is publication that makes a result spurious. But that can't be right. Rather, there are just lots of spurious results out there. It happens that journals preferentially publish spurious results, leading to biases in the published record, and eventually the decline effect.

Some years ago, I had a bad habit of getting excited about my labmate's results and trying to follow them up. Just like a journal, I was interested in the most exciting results. Not surprisingly, most of these failed to replicate. The result was that none of them got published. Again, this was just a matter of some results being spurious -- disproportionately, the best ones. (Surprisingly, this labmate is still a friend of mine; personally, I'd hate me.)

The Magic of Point O Five

Some readers at this point might be wondering: wait -- people do statistics on their data and only accept results that are extremely unlikely to have happened by chance. The cut-off is usually 0.05 -- a 5% chance of a false positive. And many studies that turn out later to have been wrong pass even stricter statistical tests. Notes Lehrer:
And yet Schooler has noticed that many of the data sets that end up declining seem statistically solid -- that is, they contain enough data that any regression to the mean shouldn't be dramatic. "These are the results that pass all the tests," he says. "The odds of them being random are typically quite remote, like one in a million. This means that the decline effect should almost never happen. But it happens all the time!"
So there's got to be something making these results look more unlikely than they really are. Lehrer suspects unconscious bias:
Theodore Sterling, in 1959 ... noticed that ninety-seven percent of all published psychological studies with statistically significant data found the effect they were looking for ... Sterling saw that if ninety-seven per cent of psychology studies were proving their hypotheses, either psychologists were extraordinarily lucky or they published only the outcomes of successful experiments

and again:
The problem seems to be one of subtle omissions and unconscious misperceptions, as researchers struggle to make sense of their results.
I expect that unconscious bias is a serious problem (I illustrate some reasons below), but this is pretty unsatisfactory, as he doesn't explain how unconscious bias would affect results, and the Schooler effect is a complete red herring. 


I wasn't around in 1959, so I can't speak to that time, but I suspect the numbers are similar today. In fact, Sterling was measuring the wrong thing. Nobody cares what our hypotheses were. They don't care what order the experiments were actually run in. They care about the truth, and they have very limited time to read papers (most papers are never read, only skimmed). Good scientific writing is clear and concise. The mantra is: Tell them what you're going to tell them. Tell them. And then tell them what you told them. No fishing excursions, no detours. When we write scientific papers, we're writing science, not history.

And this means we usually claim to have expected to find whatever it is that we found. It just makes for a more readable paper. So when a scientist reads the line, "We predicted X," we know that really means "We found X" -- what the author actually predicted is beside the point.

Messing with that Point O Five

So where do all the false positives come from, if they should be less than 5% of conducted studies? There seem to be a number of issues.

First, it should be pointed out that the purpose of statistical tests (and the magic .05 threshold for significance) is to make a prediction as to how likely it is that a particular result will replicate. A p-value of .05 means roughly that there is a 95% chance that the basic result will replicate (sort of; this is not technically true but is a good approximation for present purposes).

But statistics are estimates, not facts. They are based on a large number of idealizations. For instance, many require that measurement error is normally distributed, meaning that the bulk of measurements are very close to the true measurement, and that a measurement is as likely to be larger than the true number as it is to be smaller. In fact, most data is heavily skewed, with measurements more likely to be too large than too small (or vice versa).

For instance, give someone an IQ test. IQ tests have some measurement error -- people will score higher or lower than their "true" score due to random factors such as guessing answers correctly (or incorrectly), being sleepy (or not), etc. But it's a lot harder to get an IQ score higher than your true score than lower, because getting a higher score requires a lot of good luck (unlikely) whereas there are all sorts of ways to get a low score (brain freeze, etc.). 

Most statistical tests make a number of assumptions (like normally distributed error) that are not true of actual data. That leads to incorrect estimates of how likely a particular result is to replicate. The truth is most scientists -- at the very least, most psychologists -- aren't experts in statistics, and so statistical tests are misapplied all the time.

I don't actually think that issues like the ones I just discussed lead to most of the difficulties (though I admit I have no data one way or another). I bring these issues up mainly to point out that statistical tests are tools that are either used or misused according to the skill of the experimenter. And there are lots of nasty ways to misuse statistical tests. I discuss a few of them below:

Run enough experiments and...

Let's go back to my genius fruit experiment. I ask a group of people to eat an apple and then give them an IQ test. I compare their IQ scores with scores from a control group that didn't eat an apple. Now let's say in fact eating apples doesn't affect IQ scores. Assuming I do my statistics correctly and all the assumptions of the statistical tests are met, I should have only a 5% chance of finding a "significant" effect of apple-eating.

Now let's say I'm disappointed in my result. So I try the same experiment with kiwis. Again, I have only a 5% chance of getting a significant result for kiwis. So that's not very likely to happen either.

Next I try oranges....

Hopefully you see where this is going. If I try only one fruit, I have a 5% chance of getting a significant result. If I try 2 fruits, I have a 1 - .95*.95 = 9.8% chance of getting a significant result for at least one of the fruits. If I try 4 fruits, now I'm up to a 1 - .95*.95*.95*.95 = 18.5% chance that I'll "discover" that one of these fruits significantly affects IQ. By the time I've tried 14 fruits, I've got a better than 50% chance of an amazing discovery. But my p-value for that one experiment -- that is, my estimate that these results won't replicate -- is less than 5%, suggesting there is only a 5% chance the results were due to chance.
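
The arithmetic is easy to check; here are a few lines of Python simply restating the formula from the text:

    # Probability of at least one spurious "significant" result when testing
    # k independent fruits, each at the alpha = .05 level
    alpha = 0.05
    for k in (1, 2, 4, 14):
        print(f"{k:2d} fruits: {1 - (1 - alpha) ** k:.1%}")
    # 1: 5.0%, 2: 9.8%, 4: 18.5%, 14: 51.2% -- matching the numbers above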

While there are ways of statistically correcting for this increased likelihood of false positives, my experience suggests that it's relatively rare for anyone to do so. And it's not always possible. Consider the fact that there may be 14 different labs all testing the genius fruit hypothesis (it's suddenly very fashionable for some reason). There's a better than 50% chance that one of these labs will get a significant result, even though from the perspective of an individual lab, they only ran one experiment.

Data peeking

Many researchers peek at their data. There are good reasons for doing this. One is curiosity (we do experiments because we really want to know the outcome). Another is to make sure all your equipment is working (don't want to waste time collecting useless data). Another reason -- and this is the problematic one -- is to see if you can stop collecting data.

Time is finite. Nobody wants to spend longer on an experiment than necessary. Let's say you have a study where you expect to need -- based on intuition and past experience -- around 20 subjects. You might check your data after you've run 12, just in case that's enough. What usually happens is that if the results are significant, you stop running the study and move on. If they aren't, you run more subjects. Now maybe after you've got 20 subjects, you check your data. If it's significant, you stop the study; if not, you run some more. And you keep on doing this until either you get a significant result or you give up.

It's a little harder to do back-of-the-envelope calculations on the importance of this effect, but it should be clear that this habit has the unfortunate result of increasing the relative likelihood of a false positive, since false positives lead you to declare victory and end the experiment, whereas false negatives are likely to be corrected (since you keep on collecting more subjects until the false negative is overcome). I read a nice paper on this issue that actually crunched the numbers a while back (for some reason I can't find it at the moment), and I remember the result was a pretty significant increase in the expected number of false positives.
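
For readers who want to see the numbers themselves, here is a minimal simulation of that stopping rule (the starting sample, batch size, and cap are made-up values for illustration; this is a sketch, not the analysis from the paper I mentioned):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    def peeking_experiment(start=12, step=4, cap=40, alpha=0.05):
        """Run a null experiment, testing after every batch; stop at p < alpha."""
        a = list(rng.normal(0, 1, start))
        b = list(rng.normal(0, 1, start))
        while True:
            if stats.ttest_ind(a, b).pvalue < alpha:
                return True                   # "significant" -- declare victory and stop
            if len(a) >= cap:
                return False                  # give up
            a.extend(rng.normal(0, 1, step))  # run a few more subjects per group
            b.extend(rng.normal(0, 1, step))

    runs = 10_000
    rate = sum(peeking_experiment() for _ in range(runs)) / runs
    print(f"false-positive rate with peeking: {rate:.1%}")  # well above the nominal 5%

Even though there is no real effect and every individual test uses the .05 threshold, the repeated peeks push the overall false-positive rate to roughly double the nominal rate.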

Data massaging

The issues I've discussed so far are real problems, but they are pretty common and not generally regarded as ethical violations. Data massaging is at the borderline.

Any dataset can be analyzed in a number of ways. Once again, if people get the result they were expecting with the first analysis they run, they're generally going to declare victory and start writing the paper. If you don't get the results you expect, you try different analysis methods. There are different statistical tests that can be used. There are different covariates that could be factored out. You can throw out "bad" subjects or items. This is going to significantly increase the rate of false positives.

It should be pointed out that interrogating your statistical model is a good thing. Ideally, researchers should check to see if there are bad subjects or items, check whether there are covariates to be controlled for, and check whether different analysis techniques give different results. But doing this affects the interpretation of your p-value (the estimate of how likely it is that your results will replicate), and most people don't know how to appropriately control for that. And some are frankly more concerned with getting the results they want than with doing the statistics properly (this is where the "borderline" comes in).

Better estimates

The problem, at least from where I stand, is one of statistics. We want our statistical tests to tell us how likely it is that our results will replicate. We have statistical tests which, if used properly, will give us just such an estimate. However, there are lots and lots of ways to use them incorrectly.

So what should we do? One possibility is to train people to use statistics better. And there are occasional revisions in standard practice that do result in better use of statistics.

Another possibility is to lower the p-value that is considered significant. The choice of p=0.05 as a cutoff was, as Lehrer notes, arbitrary. Picking a smaller number would decrease the number of false positives. Unfortunately, it also decreases the number of real positives by a lot. People who don't like math can skip the next section.

Let's assume we're running studies with a single dependent variable and one manipulation, and that we're going to test for significance with a t-test. Let's say the manipulation really should work -- that is, it really does have an effect on our dependent measure. Let's say that the effect size is large-ish (Cohen's d of .8, which is large by psychology standards) and that we run 50 subjects. The chance of actually finding a significant effect at the p=.05 level is 79%. For people who haven't done power analyses before, this might seem low, but actually an 80% chance of finding an effect is pretty good. Dropping our significant threshold to p=.01 drops the chance of finding the effect to 56%. To put this in perspective, if we ran 20 such studies, we'd find 16 significant effects at the p=.05 level but only 11 at the p=.01 level. (If you want to play around with these numbers yourself, try this free statistical power calculator.)
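
Those power numbers can also be reproduced in code; here is a quick sketch using statsmodels, assuming the 50 subjects are split into two groups of 25:

    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for alpha in (0.05, 0.01):
        power = analysis.power(effect_size=0.8, nobs1=25, alpha=alpha, ratio=1.0)
        print(f"alpha = {alpha}: power = {power:.2f}")
    # alpha = 0.05: power ~ 0.79; alpha = 0.01: power ~ 0.56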

Now consider what happens if we're running studies where the manipulation shouldn't have an effect. If we run 20 such studies, 1 of them will nonetheless give us a false positive at the p=.05 level, whereas we probably won't get any at the p=.01 level. So we've eliminated one false positive, but at the cost of nearly 1/3 of our true positives.

No better prediction of replication than replication

Perhaps the easiest method is to just replicate studies before publishing them. The chances of getting the same spurious result twice in a row are vanishingly small, and most of the issues I outlined above -- other than data massaging -- won't help a spurious result survive a replication. Test 14 different fruits to see if any of them increase IQ scores, and you have over a 50% chance that one of them will spuriously do so. Test that same fruit again, and you've only got a 5% chance of repeating the effect. So replication decreases your false positive rate 20-fold. Similarly, data massaging may get you that coveted p<.05, but the chances of the same massages producing the same result again are very, very low.

True positives aren't nearly so affected. Again, a typical power level is 0.80 -- 80% of the time that an effect is really there, you'll be able to find it. So when you try to replicate a true positive, you'll succeed 80% of the time. So replication decreases your true positives by only 20%.

So let's say the literature has a 30% false positive rate (which, based on current estimates, seems quite reasonable). Attempting to replicate every positive result prior to publication -- and note that it's extremely rare to publish a null result (no effect), so almost all published results are positive results -- should decrease the false positives 20-fold and the true positives by 20%, leaving us with a 2.6% false positive rate. That's a huge improvement.
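
Spelled out, using the rates assumed above (false positives replicate ~5% of the time, true positives ~80% of the time):

    # 100 positive results: 30 false, 70 true; replicate each before publishing
    false_pos, true_pos = 30, 70
    surviving_false = false_pos * 0.05   # false positives that replicate anyway
    surviving_true = true_pos * 0.80     # true positives that replicate
    rate = surviving_false / (surviving_false + surviving_true)
    print(f"false-positive rate after replication: {rate:.1%}")  # ~2.6%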

So why not replicate more?

So why don't people replicate before publishing? If 30% of your own publishable results are false positives, and you eliminate them, you've just lost 30% of your potential publications. You've also lost 20% of your true positives as well, btw, which means overall you've decreased your productivity by 43%. And that's without counting the time it takes to run the replication. Yes, it's nice that you've eliminated your false positives, but you also may have eliminated your own career!

When scientists are ranked, they're largely ranked on (a) number of publications, (b) number of times a publication is cited, and (c) quality of journal that the publications are in. Notice that you can improve your score on all of these metrics by publishing more false positives. Taking the time to replicate decreases your number of publications and eliminates many of the most exciting and surprising results (decreasing both citations and quality of journal). Perversely, even if someone publishes a failure to replicate your false positive, that's a citation and another feather in your cap.

I'm not saying that people are cynically increasing their numbers of bogus results. Most of us got into science because we actually want to know the answers to stuff. We care about science. But there is limited time in the day, and all the methods of eliminating false positives take time. And we're always under incredible pressure to pick up the pace of research, not slow it down.

I'm not sure how to solve this problem, but any solution I can think of involves some way of tracking not just how often a researcher publishes or how many citations those publications get, but how often those publications are replicated. Without having a way of tracking which publications replicate and which don't, there is no way to reward meticulous researchers or hold sloppy researchers to account.

Also, I think a lot of people just don't believe that false positives are that big a problem. If you think that only 2-3% of published papers contain bogus results, there's not a lot of incentive to put in a lot of hard work learning better statistical techniques, replicating everything, etc. If you think the rate is closer to 100%, you'd question the meaning of your own existence. As long as we aren't keeping track of replication rates, nobody really knows for sure where we are on this continuum.

That's my conclusion. Here's Lehrer's:
The decline effect is troubling because it reminds us how difficult it is to prove anything. We like to pretend that our experiments define the truth for us. But that's often not the case. Just because an idea is true doesn't mean it can be proved. And just because an idea can be proved doesn't mean it's true. When the experiments are done, we still have to choose what to believe.
I say it again: huh?

Words and non-words

"...the modern non-word 'blogger'..." -- Dr. Royce Murray, editor of the journal Analytic Chemistry.

"209,000,000 results (0.21 seconds)" -- Google search for the "non-word" blogger.


------------
There has been a lot of discussion about Royce Murray's bizarre attack on blogging in the latest JAC editorial (the key sentence: I believe that the current phenomenon of "bloggers" should be of serious concern to scientists).

Dr. Isis has posted a nice take-down of the piece, focusing on the age-old testy relationship between scientists and journalists. My bigger concern with the editorial is that it is clear Murray has no idea what a blog is, yet he feels justified in writing an article about blogging. Here's a telling sentence:
Bloggers are entrepreneurs who sell “news” (more properly, opinion) to mass media: internet, radio, TV, and to some extent print news. In former days, these individuals would be referred to as “freelance writers”, which they still are; the creation of the modern non-word “blogger” does not change the purveyor.
Wrong! Wrong! Wrong! A freelance writer does sell articles to established media entities. Bloggers mostly write for their own blog (hence the "non-word" blog-ger). There are of course those who are hired to blog for major media outlets like Scientific American or Wired, but then they are essentially columnists (in fact, many of the columnists at The New York Times have NYTimes blogs at the request of the newspaper).
This magnifies, for the lay reader, the dual problems in assessing credibility: a) not having a single stable employer (like a newspaper, which can insist on credentials and/or education background) frees the blogger from the requirement of consistent information reliability ... Who are the fact-checkers now?
Wait, newspapers don't insist on credentials and don't fact-check the stories they get from freelancers? Why is Murray complaining about bloggers, then? In any case, it's not like journals like Analytical Chemistry do a good job of fact-checking what they publish, or that they stop publishing papers by people whose results never replicate. Journal editors living in glass houses...

This focus on credentials is a bit odd -- I thought truth was the only credential a scientist needed -- and in any case seriously misplaced. I challenge Murray to find a popular science blog written by someone who is neither a fully-credentialed scientist writing about his/her area of expertise, nor a well-established science journalist working for a major media outlet.

Are there crack-pot bloggers out there? Sure. But most don't have much of an audience (certainly, their audience is smaller than the fact-checked, establishment media-approved Glenn Beck). Instead, we have a network of scientists and science enthusiasts discussing, analyzing and presenting science. What's to hate about that?

Thank you, Oprah

Oprah's magazine linked to my collaborator's web-based lab. I'm a little miffed at the lack of link love, but I still got something out of it -- we now have over 20,000 participants in the experiment we've been running on her site. So thank you, Oprah.

Busy analyzing...

Slate's Report on Hauser Borders on Fraud

Love, turned sour, is every bit as fierce. I haven't written about the Hauser saga for a number of reasons. I know and like the guy, and I find nothing but sadness in the whole situation. Nonetheless, I've of course been following the reports, and I wondered why my once-favorite magazine had so long been silent.

Enjoying my fastest Wi-Fi connection in weeks here at the Heathrow Yotel, I finally found Slate's take on the scandal, subtitled What went wrong with Marc Hauser's search for moral foundations. The article has a nice historical overview of Hauser's work, in context, and neatly describes several experiments. The article is cagey, but you could be excused for believing that (a) Hauser has done a lot of moral cognition research with monkeys, and (b) that work was fraudulent. The only problem is that nobody, to my knowledge, has called Hauser's moral cognition research into question -- in fact, most people have gone out of their way to say that that work (done nearly exclusively with humans) replicates very nicely. There was some concern about some work on intention-understanding in monkeys, which is probably a prerequisite for some types of moral cognition, but that's not the work one thinks of when talking about Hauser's Moral Grammar hypothesis.

I can't tell if this was deliberately misleading or just bad reporting, and I'm not sure which is more disturbing.

Slate's science reporting has always been weak (see here, here, here, and especially here), and the entire magazine has been on a steady decline for several years. Sigh. I need a new magazine.

I liked "Salt," but...

What's with movies in which fMRI can be done remotely? In an early scene, the CIA does a remote brain scan of someone sitting in a room. And it's fully analyzed, too, with ROIs shown. I want that technology -- it would make my work so much easier!

UPDATE: I'm not the only one with this complaint. Though Popular Mechanics goes a bit easy on the movie by saying fMRI is "not quite at the level Salt portrays." That's a bit like saying space travel is not quite at the level Star Trek portrays. There may someday be a remote brain scanner, but it won't be based on anything remotely like existing fMRI technology, which requires incredibly powerful, supercooled, and loud magnets. Even if you solved the noise problem, there's nothing to be done about the fact that the knife embedded in the Russian spy's shoe (yes -- it is that kind of movie) would have gone flying to the center of the magnetic field, along with many of the other metal objects in the room.

Recent Findings Don't Prove there's a Ghost in the Machine (Sorry Saletan)

When I took intro to psychology (way too long ago), the graduate instructor posed the following question to the class: Does your brain control your mind, or does your mind control your brain? At first I thought this was a trick question -- coming from most neuroscientists or cognitive scientists it would be -- but she meant it seriously.

On Tuesday, William Saletan at Slate posed the same question. Bouncing off recent evidence that some supposedly vegetative patients are in fact still able to think, Saletan writes, "Human minds stripped of every other power can still control one last organ--the brain."

Huh?

Every neuroscientist I've talked to would read this as a tautology: "the brain controls the brain." Given the gazillions of feedback circuits in the brain, that's a given. Reading further, though, Saletan clearly has something else in mind:

We think of the brain as its own master, controlling or fabricating the mind ... If the brain controls the mind this way, then brain scanning seems like mind reading ... It's fun to spin out these neuro-determinist theories and mind-reading fantasies. But the reality of the European scans is much more interesting. They don't show the brain controlling the mind ... The scans show the opposite: the mind operating the brain.

Evidence Mind is Master

As I've already mentioned above, the paragraph quoted above is nonsensical in modern scientific theory, and I'll get back to why. But before that, what evidence is Saletan looking at?

In the study he's talking about, neuroscientists examined 54 patients who show limited or no awareness and no ability to communicate. Patients' brains were scanned while they were asked to think of motor activities (swinging a tennis racket) or navigation activities (moving around one's home town). 5 of the 54 were able to do this. The researchers also tried to ask the patients yes-no questions. If the answer was 'yes', the patient was to think about swinging a tennis racket; if 'no', moving around one's home town. One patient was able to do this successfully.

Note that the brain scans couldn't see the patient deciding 'yes' or 'no' -- actually, they couldn't see the patient deciding at all. This seems to be why Saletan thinks this is evidence of an invisible will controlling the physical brain: "On the tablet of your brain, you can write whatever answer you want."

The Mistake

The biggest problem with this reasoning is a misunderstanding of the method the scientists used. FMRI detects very, very small signals in the brain. The technology tracks changes in blood oxygenation levels, which correlates with local brain activity (though not perfectly). A very large change is on the order of 1%. For more complicated thoughts, effect sizes of 0.5% or even 0.1% are typical. Meanwhile, blood oxygen levels fluctuate a good deal for reasons of their own. This low signal-to-noise ratio means that you usually need dozens of trials: have the person think the same thoughts over and over again and average across all the trials. In the fMRI lab I worked in previously, the typical experiment took 2 hours. Some labs take even longer.
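
To see why averaging over many trials matters, here is a toy Python simulation (the signal and noise values are invented, chosen only to echo the percentages above): the noise in the averaged estimate shrinks roughly as the square root of the number of trials.

    import numpy as np

    rng = np.random.default_rng(2)
    signal = 0.5      # hypothetical BOLD effect, in percent signal change
    noise_sd = 3.0    # trial-to-trial fluctuation, much larger than the signal

    for n_trials in (1, 10, 50, 100):
        # 10,000 simulated experiments, each averaging n_trials noisy measurements
        trials = signal + rng.normal(0, noise_sd, (10_000, n_trials))
        estimates = trials.mean(axis=1)
        print(f"{n_trials:3d} trials: sd of estimate = {estimates.std():.2f} (signal = {signal})")

With one trial the noise swamps a 0.5% signal; only after averaging dozens of trials does the estimate become usable. That is why a single-question, single-shot readout demands an imagery task with an unusually loud signal.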

To use fMRI for meaningful communication between a paralyzed person and their doctors, you need to be able to detect the response to an individual question. Even if we knew where to look in the brain for 'yes' or 'no' answers -- and last I heard we didn't, but things change quickly -- it's unlikely we could hear this whispering over the general tumult in the brain. The patients needed to shout at the top of their lungs. It happens that motor imagery produces very nice signals (I know less about navigation, but presumably it does, too, or the researchers wouldn't have used it).

Thus, the focus on motor and navigation imagery rather than more direct "mind-reading" was simply an issue of technology.

Dualism

The more subtle issue is that Saletan takes dualism as a starting point: the mind and brain are separate entities. Thus, it makes sense to ask which controls the other. He seems to understand modern science as saying the brain controls the mind.

This is not the way scientists currently approach the problem -- or, at least, not any I know. The starting assumption is that the mind and brain are two ways of describing the same thing. Asking whether the mind can control the brain makes as much sense as asking whether the Senate controls the senators or senators control the Senate. Talking about the Senate doing something is just another way of talking about some set of senators doing something.

Of course, modern science could be wrong about the mind. Maybe there is a non-material mind separate from the brain. However, the theory that the mind is the brain has been enormously productive. Without it, it is extremely difficult to explain just about anything in neuroscience. Why does brain trauma lead to amnesia, if memories aren't part of the brain? Why can strokes leave people able to see but unaware that they can see?

Descartes' Error

A major problem with talking about the mind and brain is that we clearly conceive of them differently. One of the most exciting areas of cognitive science in the last couple decades has looked at mind perception. It appears humans are so constructed that we are good at detecting minds. We actually over-detect minds; otherwise puppet shows wouldn't work (we at least half believe the puppets are actually thinking and acting). Part of our concept of mind is that it is non-physical but controls physical bodies. While our concept of mind appears to develop during early childhood, the fact that almost all humans end up with a similar concept suggests that either the concept or the propensity to develop it is innate.

Descartes, who produced probably the most famous defense of dualism, thought the fact that he had the concept of God proved that God exists (his reasoning: how can an imperfect being have the thought of a perfect being, unless the perfect being put that thought there?). Most people would agree, however, that just because you have a concept doesn't mean the thing the concept refers to exists. I, for instance, have the concept of Cylons, but I don't expect to run into any.

Thus, even as science becomes better and better at explaining how a physical entity like the brain gives rise to our perceptions, our experience of existing and thinking, the unity of mind and brain won't necessarily make any more intuitive sense. This is similar to the problem with quantum physics: we have plenty of scientific evidence that something can be both a wave and a particle simultaneously, and many scientists work with these theories with great dexterity. But I doubt anyone really has a clear conception of a wave/particle. I certainly don't, despite a semester of quantum mechanics in college. We just weren't set up to think that way.

For this reason, I expect we'll continue to read articles like Saletan's long into the future. This is unfortunate, as neuroscience is becoming an increasingly important part of our lives and society, in a way quantum physics has yet to do. Consider, for instance, insanity pleas in the criminal justice system, lie detectors, and so on.

Origin of Language Pinpointed

Scientists have long debated the evolution of language. Did it emerge along with the appearance of modern homo sapiens, 130,000-200,000 years ago? Or did it happen as late as 50,000 years ago, explaining the cultural ferment at that time? What are we to make of the fact that Neanderthals may have had the ability to produce sounds similar to those of modern humans?

In a stunning announcement this morning, freelance writer Joshuah Bearman announced that he had pinpointed the exact location, if not the date, of the origin of modern language: Lake Turkana in Kenya.

WTF?

Actually, what Bearman says is
This is where Homo sapiens emerged. It is the same sunset our ancestors saw, walking through this very valley. To the north is Lake Turkana, where the first words were spoken. To the south is Laetoli, where, in 1978, Mary Leakey's team was tossing around elephant turds when they stumbled across two sets of momentous footprints: bipedal, tandem, two early hominids together...
Since this is in an article about a wedding, I suspect that Bearman was not actually intending to floor the scientific establishment with an announcement; he assumed this was common knowledge. I can't begin to imagine where he got this idea, though. I wondered if perhaps this was some sort of urban legend (like all the Eskimo words for snow), but Googling has turned up nothing, though of course some important archaeological finds come from that region.

Oops

Probably he heard it from a tour guide (or thought he had heard something like that from a tour guide). Then neither he nor his editor bothered to think through the logic: how would we know where the first words were spoken, given that there can be no archaeological record? It's unlikely we'll ever even find the first human(s), given the low likelihood of fossilization.

I have some sympathy. Bearman was simply trying to provide a setting for his story. In one of my first published travel articles, I similarly mentioned in passing that Lake Baikal (the topic of my story) was one of the last strongholds of the Whites in the Russian Revolution. I have no idea where I got that idea, since it was completely untrue. (Though, in comparison with the Lake Turkana hypothesis, at least my unfounded claim was possible.)

So I'm sympathetic. I also had to write a correction for a subsequent issue. Bearman?

Alzheimer's, Autism & the NCAA: Science News for 3/17

Do vaccines give Somalis autism? Can diabetes give you Alzheimer's? Does losing make you win? Anyone scanning the science news articles this week would know the answers to these questions.

First, Freakonomics has a discussion of a recent paper showing that NCAA basketball teams are more likely to win if they are 1 point behind at halftime than if they are 1 point ahead. It seems that teams that are slightly behind at halftime work harder in the second half, relative to teams that are way behind, slightly ahead, or way ahead.
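
To make the comparison concrete, here's a toy sketch of the computation (my own illustration with made-up records; the paper's actual analysis is far more careful):

```python
# Toy sketch (my own illustration, not the paper's analysis): compare win
# rates for teams trailing by 1 point vs. leading by 1 point at halftime.
# The game records below are invented purely to show the computation.

games = [
    (-1, True), (-1, True), (-1, False),   # (halftime margin, won?)
    (+1, True), (+1, False), (+1, False),
]

def win_rate(records, margin):
    """Fraction of games won by teams with the given halftime margin."""
    outcomes = [won for m, won in records if m == margin]
    return sum(outcomes) / len(outcomes)

print(f"down 1 at half: {win_rate(games, -1):.2f}")  # 0.67 with these toy records
print(f"up 1 at half:   {win_rate(games, +1):.2f}")  # 0.33 with these toy records
```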

Second, the New York Times (byline: Donald McNeil Jr.) discusses the abnormally high rate of autism among Somali immigrants in Minneapolis. The article offers several hypotheses (including a statistical fluke), but it spends a lot of time on the "possibility" that these cases of autism were caused by vaccinations. The article's failure to mention that this hypothesis is simply absurd is glaring (though it does note that "some children" showed autistic tendencies before being vaccinated). More interesting is that many of these kids appear to have had seizures, something mentioned only in passing.

Finally, Amanda Schaffer at Slate discusses the possible relationship between insulin and Alzheimer's (Diabetes of the Brain: Is Alzheimer's disease actually a form of diabetes?).

Who are you calling a neuroscientist: Has neuroscience killed psychology?

The Chronicle of Higher Education just produced a list of five young scholars to watch who combine neuroscience and psychology. The first one listed is George Alvarez, who was just hired by Harvard.

Alvarez should be on anybody's top five list. The department buzzed for a week after his job talk, despite the fact that many of us already knew his work. What is impressive is not only the quantity of his research output -- 19 papers at last count, with 6 under review or revision -- but the number of truly ground-breaking pieces of work. Several of his papers have been very influential in my own work on visual working memory.

He is also one of the best exemplars of classical cognitive psychology I know. His use of neuroscience techniques is minimal, and currently appears to be limited to a single paper (Battelli, Alvarez, Carlson & Pascual-Leone, in press). To be clear, that is not a criticism.

Neurons vs. Behavior

Alvarez's inclusion is particularly odd in the context of the accompanying article, which tries to explore the relationship between neuroscience techniques and psychology. Although there is some balance, including a look at how neuroscience drains money away from traditional cognitive science, I read the article as promoting the notion that the intersection of neuroscience and psychology is not just the place to be, but the only place to be.

Alvarez is one of the best examples of the opposite claim: that there is still a lot of interesting cognitive science to be done that doesn't require neuroimaging. I say all this as a fan of neuroscience, and as somebody currently designing both ERP and fMRI experiments.

EEG vs. fMRI

One more thing before I stop beating up on the Chronicle (which is actually one of my favorite publications). The article claims that EEG (the backbone of ERP) offers less detailed information about the brain than fMRI does. The truth is that EEG offers less detailed information about spatial location, but its temporal resolution is far greater. If the processes you are studying are lightning-fast and the theories you are testing make strong claims about the timing of specific computations, fMRI is not ideal. I suspect this is why fMRI has had less impact on the study of language than on some other areas.

For instance, the ERP study I am working on looks at complex interactions between semantic and pragmatic processes that play out over a few hundred milliseconds. I have seen some very inventive fMRI work on primary visual cortex that managed that kind of precision, but such work is rare (and probably succeeded only because the layout of the visual areas of the brain, in contrast with the linguistic areas, is fairly well established).

Do Bullies like Bullying?

Although Slate is my favorite magazine, and usually the first website I check each day, I've been known to complain about its science coverage, which typically lacks the insight of its other features. A much-too-rare exception is the occasional article by Daniel Engber (full disclosure: I have tried, unsuccessfully, to convince Engber, a Slate editor, to run articles by me).

Yesterday, he wrote an excellent piece about a recent bit of cognitive neuroscience on bullies and how they respond to bullying. Researchers scanned the brains of "bullies" while they watched videos of bullying and reported that pleasure centers in the brain activated.

In a cheeky fashion typical of Slate, Engber questions the novelty of these findings:

Bullies like bullying? I just felt a shiver run up my spine. Next we'll find out that alcoholics like alcohol. Or that overeaters like to overeat. Hey, I've got an idea for a brain-imaging study of child-molesters that'll just make your skin crawl!
Obviously, I was a sympathetic reader. But Engber does not stop there:

OK, OK: Why am I wasting time on a study so lame that it got a write-up in the Onion? Hasn't this whole fMRI backlash routine gotten a bit passé?
Engber goes on to detail a number of limitations of the study, including how the kids were classified as "bullies" (some appear to be rapists, for instance) and how "pleasure center" was defined (the area in question is also related to anxiety, so one could just as reasonably conclude that bullies find bullying worrisome, not pleasurable).

The second half of the article is a plea for better science reporting, one that I hope is widely read. Read it yourself here.

Sorry, New York Times, cognitive dissonance still exists

Earlier this week, New York Times columnist John Tierney reported a potential flaw in a classic psychology experiment. It turns out that the experimental finding -- cognitive dissonance -- is safe and sound (see below). But first, here are the basic claims:

Cognitive dissonance generally refers to changing your beliefs and desires to match what you do. That is, rather than working hard for something you like, you may believe you like something because you worked so hard for it. 

Laboratory experiments (of which there have been hundreds, if not thousands) tend to be of the following flavor (adapted from the Tierney blog post): Have someone rate several different objects (such as different-colored M&Ms) in terms of how much she likes them. From that set, choose three (say, red, blue, and green) that she likes equally well. Then let her choose between two of them (the red and the blue M&M).

Presumably (and this is the catch) she chooses randomly, since she likes both equally. Say she chooses the red M&M. Then let her choose between red and green. You would predict that she would again choose randomly, since she likes the two colors equally, but she will nearly invariably choose the red M&M. This is taken as evidence that her originally random choice of the red M&M actually changed her preferences, such that she now likes red better than either blue or green.

The basic problem with this experiment, according to M. Keith Chen of Yale and as reported by Tierney, is that we don't really know that the person didn't prefer red all along. She may have rated the three colors similarly, but she chose red over blue. The math works out such that if she in fact already preferred red to blue, she probably also prefers red to green (if the three underlying preferences are independent, the odds are two in three).
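
Here's a minimal Monte Carlo sketch of Chen's point (my own illustration, not his actual model): suppose the three true preferences are independent draws and the subject simply picks whichever color she genuinely prefers. Conditioning on her round-one choice is then enough to push her round-two choice above chance, with no change of preferences anywhere.

```python
import random

# Monte Carlo sketch of Chen's critique (my own illustration): preferences
# never change, yet conditioning on the first choice still produces the
# pattern the classic design reads as cognitive dissonance.
trials = 1_000_000
chose_red_over_blue = 0
also_chose_red_over_green = 0

for _ in range(trials):
    # Hidden true preferences for red, blue, green: independent draws.
    red, blue, green = random.random(), random.random(), random.random()
    if red > blue:                       # round one: she picks red over blue
        chose_red_over_blue += 1
        if red > green:                  # round two: does she pick red again?
            also_chose_red_over_green += 1

print(also_chose_red_over_green / chose_red_over_blue)  # ~0.667, not 0.5
```

A simulated subject who never changes her mind still picks red over green about two-thirds of the time.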

Tierney calls this a "fatal flaw" in cognitive dissonance research and asks: "choice rationalization has been considered one of the most well-established theories in social psychology. Does it need to be reconsidered?"

Short answer: No.

First, it is important to point out that Chen has shown only that if the original preferences were measured incorrectly, then this type of experiment might suggest cognitive dissonance even where there is none. He has not shown that the original measurements were in error.

However, even if the original measurements were in error, that would not mean that cognitive dissonance does not exist. The inference commits a classic logical fallacy, denying the antecedent, which has the following form: If Socrates is a woman, then Socrates is mortal. Socrates is not a woman. Therefore, Socrates is not mortal. Likewise: if the classic experiments were sound, they would demonstrate cognitive dissonance; the experiments are (supposedly) flawed; therefore cognitive dissonance does not exist. The premises can all be true while the conclusion is false.
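
For the formally inclined, a tiny brute-force truth-table check (my own illustration) confirms the form is invalid: there is an assignment that makes both premises true and the conclusion false.

```python
from itertools import product

# Brute-force check that "P implies Q; not P; therefore not Q" is invalid:
# look for a truth assignment where the premises hold but the conclusion fails.
for p, q in product([False, True], repeat=2):
    premise_implication = (not p) or q   # P -> Q
    premise_not_p = not p                # not P
    conclusion_not_q = not q             # not Q
    if premise_implication and premise_not_p and not conclusion_not_q:
        print(f"counterexample: P={p}, Q={q}")  # prints P=False, Q=True
```

The counterexample (P false, Q true) is Socrates himself: not a woman, mortal all the same.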

In any case, cognitive dissonance has been shown in studies that do not fall under Chen's criticisms. Louisa Egan and collaborators solved the problem by having their subjects choose between items they couldn't see. Since the subjects knew nothing about the items, they couldn't possibly have had a pre-existing preference. Even so, they showed the classic pattern of results.

By all appearances in the Tierney article, Chen is unaware of this study (which, to be fair, has not yet been published): "I wouldn't be completely surprised if [cognitive dissonance] exists, but I've never seen it measured correctly." This is hard to believe, since Chen not only works at the same university as Egan, he is a close collaborator of Laurie Santos (Egan's graduate advisor). It's not clear why he would neglect to mention this study, particularly since a blanket critique of cognitive dissonance research in the New York Times is embarrassing to Egan and Santos at a time when Egan is on the job market (and it appears to have upset a lot of people).

Thus, it's puzzling that Chen claims that no existing study unambiguously shows cognitive dissonance. He might, however, be able to make the weaker claim that some studies that have been said to show cognitive dissonance in fact do not. That is a reasonable claim and worth testing; indeed, Chen reports that he is testing it now. For the time being, though, Chen has only an untested hypothesis. It's an intriguing and potentially valuable hypothesis, but there isn't yet any evidence that it is correct.

See the original article here.

Snake oil and Neuroscience

Readers of this blog know how I feel about neuroscience reporting (here, here and here). One consistent problem is that reporters enthusiastically relate findings that involve brain scans, while ignoring the original and groundbreaking behavioral work.

A truism in psychology, however, is to never trust your impressions of a situation. All too often our intuitions (e.g., that the average American wouldn't torture an innocent bystander to death just because someone in a lab coat told them to) turn out to be completely incorrect. So I was very happy to hear that a group at Yale actually tested the hypothesis that people believe basic behavioral findings (like the existence of cognitive dissonance) more readily if brain-related words are mentioned.

In brief, it appears that the average non-expert does, in fact, find an explanation more convincing if irrelevant neuroscience information is included. Neuroscience experts, however, are not only immune to this effect; they actually find explanations padded with gratuitous references to the brain less compelling. So perhaps this research explains not only why the average reader (and reporter) likes the typical neuroscience reportage, but also why people like myself (and Dan Gilbert) dislike it.

Cognitive Daily has an excellent in-depth description of the article here.



Weisberg, D. S., Keil, F. C., Goodstein, J., Rawson, E., & Gray, J. R. (2008). The seductive allure of neuroscience explanations. Journal of Cognitive Neuroscience, 20(3), 470-477.