Biases at the level of selection of experiments.

Flexibility in which experiments to include in a paper or

publication, how many subjects to run and when you stop running subjects.

So if you just run subjects till you find a significant result then that

induces a bias.
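To make the stopping-rule point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not something from the lecture: the function name `false_positive_rate`, a one-sample z-test with known unit variance, and checking the data every 5 subjects from 10 up to 50. There is no true effect, so every significant result is a false positive.

```python
import numpy as np

def false_positive_rate(peek, n_sims=2000, n_start=10, n_max=50, step=5, seed=0):
    """Share of null experiments declared significant (two-sided z-test, alpha = .05)."""
    rng = np.random.default_rng(seed)
    z_crit = 1.96  # two-sided critical value for alpha = .05
    hits = 0
    for _ in range(n_sims):
        data = rng.standard_normal(n_max)  # true mean is 0, so there is no effect
        # Either test after every batch of subjects, or only once at the planned n.
        checks = range(n_start, n_max + 1, step) if peek else [n_max]
        for n in checks:
            z = data[:n].mean() * np.sqrt(n)  # assumes a known sd of 1
            if abs(z) > z_crit:
                hits += 1
                break  # stop running subjects as soon as the result is significant
    return hits / n_sims

fixed = false_positive_rate(peek=False)
peeking = false_positive_rate(peek=True)
print(f"fixed n: {fixed:.3f}, stop when significant: {peeking:.3f}")
```

With a fixed sample size the rate should sit near the nominal 5 percent, while stopping at the first significant look should push it well above that.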

The third one is flexibility in the model, so model selection bias.

So if you have exercised flexibility in which outcome you choose,

which covariates you include, how you divide up the outcomes,

like a median split or other kinds of divisions and which procedures you employ,

and you pick the models that look the best at the end, you're also inducing a bias.
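A small sketch of that model selection bias, under assumptions that are mine rather than the lecture's: 10 candidate outcomes that are independent of one another, 20 subjects, and a z-test with known unit variance. None of the outcomes has a true effect. The sketch compares fixing the outcome in advance with reporting whichever outcome happens to reach significance.

```python
import numpy as np

rng = np.random.default_rng(3)
n_sims, n_subjects, n_outcomes = 5000, 20, 10  # all assumed, illustrative values

# Null data: no outcome has a true effect (mean 0, known sd 1).
data = rng.standard_normal((n_sims, n_subjects, n_outcomes))
z = data.mean(axis=1) * np.sqrt(n_subjects)  # one z statistic per study and outcome
significant = np.abs(z) > 1.96               # two-sided alpha = .05

rate_first = significant[:, 0].mean()       # outcome chosen before seeing the data
rate_best = significant.any(axis=1).mean()  # report whichever outcome "worked"
print(f"pre-specified outcome: {rate_first:.3f}, best of {n_outcomes}: {rate_best:.3f}")
```

For independent tests the second rate should be close to 1 - 0.95^10, about 0.40. Correlated outcomes, which are more realistic, would inflate the rate less, but still above 5 percent.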

And finally, a voxel selection bias, or more generally, a test selection bias.

If you run many, many tests like we do in imaging,

thousands of tests, hundreds of thousands of tests,

then we often end up picking out the significant ones to look at, to focus on.

And that both increases the false positive rate and

inflates the apparent effect sizes of the winning voxels, the ones we focus on.

So we talked about those things in brief and some solutions to them.

And we also talked about some positive responses by the community,

in terms of homes and funding for replications and null findings,

changes in journal policies, other new platforms for conducting research.

In this module, we're going to home in on fMRI biases and

we're going to talk more about voxel selection bias.

We'll talk about circularity, regression to the mean, and along the way,

kind of familiarize you with some of the important terminology and effects.

Effects like the decline effect and publication bias and

p-hacking. So let's look at voxel selection bias.

This is at the heart of the voodoo correlation paper and debate.

And here's a hypothetical study.

This is a correlation, then,

between reward sensitivity in the brain and behavior.

So it might be a decision making task, and this is a pretty typical finding,

there's the nucleus accumbens right there, and I see this correlation here between

reward sensitivity, behaviorally and in the brain, of 0.82, it looks really great.

Right, great finding, great paper. And the problem with this

is that this is a null hypothesis simulation.

There are no true effects.

And how many times did I have to repeat the brain analysis before I found this

null finding that looks really good? Once.

[LAUGH] It happens all the time.

So here, in this case, here are the correlations across the whole brain.

All the effect sizes, and you can see that they're centered on zero, and

that's what happens if there's no true effect, and the distribution is symmetric.

So it's really just the null hypothesis distribution.

There are just many tests, and we can pick out the ones that look the best by chance.

And this is what the map looks like at p is less than .005 uncorrected,

which is a pretty common threshold for reporting findings in published papers,

and what we see here is at .005 uncorrected,

there are several hundred significant voxels.

So I see lots of findings, and what I did is I picked out the nucleus accumbens

finding and I zoomed in on it, and it looked great.

So the problem with uncorrected thresholds is that you find something every time,

so this is something you have to guard against.
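Here is a rough sketch of a null simulation along these lines. It is not the code behind the lecture's figures. The assumptions are mine: 50,000 voxels, independent Gaussian noise in every voxel (real images are spatially smooth), and a two-sided threshold. The critical t value is taken from a standard t table.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_voxels = 20, 50_000  # voxel count is an assumed, illustrative value

behavior = rng.standard_normal(n_subjects)           # behavioral reward sensitivity
brain = rng.standard_normal((n_subjects, n_voxels))  # pure noise: no true effects

# Pearson correlation between behavior and every voxel at once.
bz = (behavior - behavior.mean()) / behavior.std()
vz = (brain - brain.mean(axis=0)) / brain.std(axis=0)
r = bz @ vz / n_subjects

# Convert r to t and apply an uncorrected threshold.
df = n_subjects - 2
t = r * np.sqrt(df / (1 - r**2))
t_crit = 3.197  # two-sided p < .005 with 18 degrees of freedom
n_sig = int(np.sum(np.abs(t) > t_crit))

best = int(np.argmax(np.abs(r)))  # the "winning" voxel we would zoom in on
print(f"mean r: {r.mean():.4f}, significant voxels: {n_sig}, best r: {r[best]:.2f}")
```

The correlations should average out to about zero, a few hundred voxels should still pass the threshold, and the best voxel should show a correlation large enough to look like a publishable finding.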

So here's a series of simulations, null hypothesis simulations, and I'll do

the same thing as I did before, but I'm going to repeat it 10 times just so

you get a feeling for what these maps look like.

So what we'll see is a display that looks like this, and

on the top panel there, we see what the whole brain looks like at 0.005 uncorrected, so

we'll see how many blobs there are and where they are.

Then on the bottom left, we'll see the distribution of those brain behavior

correlation values across the brain, and they'll always be centered on zero.

We'll see the region with the strongest correlation

in the bottom middle panel, that's the maximally correlated region, and

then what the correlation looks like in the best region.

And this just illustrates that when we pick out the winners,

the findings look really good.

So let's repeat this ten times.

So there's a positive correlation of 0.8,

next time here's a negative correlation of about the same strength.

Here negative 0.86, and you can kind of get a sense that no matter which run I do,

I can do this again and again, I'll always find something.

It's about 0.8 by chance, and this is with 20 subjects.

If you increase the sample size,

then the chance correlations will go down to some degree, so

it won't look quite as good, but the smaller the study, the more likely it is

you're going to find something that looks really great by chance.
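The sample-size point can be checked with a short sketch. The helper name `max_chance_correlation`, the 20,000 independent noise voxels, and the particular sample sizes are all assumptions made for illustration.

```python
import numpy as np

def max_chance_correlation(n_subjects, n_voxels=20_000, seed=0):
    """Largest absolute brain-behavior correlation when there is no true effect."""
    rng = np.random.default_rng(seed)
    behavior = rng.standard_normal(n_subjects)
    brain = rng.standard_normal((n_subjects, n_voxels))  # independent noise voxels
    bz = (behavior - behavior.mean()) / behavior.std()
    vz = (brain - brain.mean(axis=0)) / brain.std(axis=0)
    r = bz @ vz / n_subjects  # Pearson r for every voxel
    return float(np.abs(r).max())

results = {n: max_chance_correlation(n) for n in (20, 40, 80, 160)}
for n, r_max in results.items():
    print(f"n = {n:3d}: best chance correlation = {r_max:.2f}")
```

The best chance correlation should shrink as the sample grows, but it does not disappear: picking the maximum over many voxels keeps it above zero at any sample size.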

And I've illustrated this problem with correlations here, but

it applies to all kinds of effect sizes as well, or any kind of statistical test.

So, this is the idea:

voxel selection bias is what inflates those observed effect sizes.

So, here's another representation of a brain map with some results,

20 subjects, p is less than .001.

And there are some true areas here that are in purple,

and the red areas are false positives.

And the true correlation looks like this.

In this simulation, in the areas that are truly activated,

the purple areas, this is the true correlation on average: 0.5.

So that's the ground truth.

But I'll always see something stronger.

A typical significant voxel will show a correlation of about 0.8.

So there's some signal there, but it's not as strong as it looks.
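A minimal sketch of that inflation, assuming bivariate normal data with a true correlation of 0.5, 20 subjects, and a two-sided p < .001 threshold (critical t from a standard t table). Each column is an independent simulated study of one truly active voxel. The setup is my assumption, so the selected average need not match the lecture's 0.8 exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
n_subjects, n_sims, true_r = 20, 50_000, 0.5  # n_sims is an assumed value

# Build brain/behavior pairs whose population correlation is true_r.
x = rng.standard_normal((n_subjects, n_sims))
noise = rng.standard_normal((n_subjects, n_sims))
y = true_r * x + np.sqrt(1 - true_r**2) * noise

# Sample correlation in each simulated study.
xz = (x - x.mean(axis=0)) / x.std(axis=0)
yz = (y - y.mean(axis=0)) / y.std(axis=0)
r = (xz * yz).mean(axis=0)

# Smallest sample correlation that reaches significance at this threshold.
df = n_subjects - 2
t_crit = 3.922  # two-sided p < .001 with 18 degrees of freedom
r_crit = t_crit / np.sqrt(t_crit**2 + df)

selected = r[r > r_crit]  # only the voxels that would be reported
print(f"mean r, all voxels: {r.mean():.2f}")
print(f"critical r: {r_crit:.2f}, mean r among significant voxels: {selected.mean():.2f}")
```

Across all simulated voxels the average sits close to the true 0.5, but a voxel can only pass the threshold if its sample correlation exceeds the critical value, so the reported voxels are necessarily much stronger than the ground truth.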