Hi. Anderson Smith here.
Today, we're going to talk about construct validity, which is
one variable that determines whether or not your experiment is a good experiment.
But there are others that really determine the quality of an experiment.
They are reliability, whether the results that you're getting are consistent,
whether there's consistency in the findings.
And in fact, there can be two kinds of reliability.
There can be test-retest reliability, simply asking,
if I measure the behavior that I'm interested in at different points in time,
do I get the same measure?
Then there's also internal consistency, that is,
within the experiment itself,
are the different things that I'm measuring
all measuring the same thing?
So, consistency.
And then, the second is validity,
which is not consistent results but accurate results.
Am I measuring what I say I want to measure?
The accuracy of the measure to the construct.
So, let's first look at reliability.
First, external reliability, and sometimes that's just simply test-retest reliability.
I measure the behavior I'm interested in at one time and then I measure it again,
do I get the same results?
An example of that would be here where I have
the score at time one and have the score at time two.
And I see this very positive relationship
saying that I get the same results at different points in time.
So I have test-retest consistency or external reliability.
Then there's also internal reliability, and sometimes
that's just a split-half measure, where I take half the questions that are asked,
or half the measures that I make, and then compare them to
the other half of the measures within the same experiment.
So, I might compare even-numbered items and
odd-numbered items and then show that I get the same relationship between them,
showing that I have internal consistency or internal reliability.
So, reliability is the consistency of the results.
The second variable is validity.
And we have to understand that we can have a very reliable measure,
but it might not be valid at all.
Let me just make up an example.
Let's say that I read the Pinocchio story to my grandchildren and so I say,
I want to test that.
I want to see whether the length of the nose really
measures honesty as it did in the Pinocchio tale.
So, I might first look at my hypothesis,
longer noses mean less honesty.
I first look at reliability and I could come up with
a very reliable measure of measuring the length of the nose,
and I get differences among people and their nose length.
But we know there's really no validity in that.
There's no theory or data we could draw on for a hypothesis that says
the length of the nose measures honesty.
So I have no validity but a very reliable measure.
Sometimes, we call that face validity.
Because I read the story of Pinocchio,
I have this belief that the nose length determines honesty.
So on the surface,
it might look like I have a valid hypothesis, but in fact, there's no reason to believe it.
There's really no theory or other data that would even suggest that.
It's not valid. And we can measure validity just like we can then measure reliability.
So face validity means that, on the surface,
it looks like it's measuring what it's intended to measure,
but it's a very weak measure of accuracy,
because we've got to have some reason to believe that
that measure really measures what it is that we're looking at.
It's really based on intuitions, on subjective beliefs,
what we've learned in reading children's stories,
but we know many times that can be wrong and that can lead us to come up
with psychological facts that might not be accurate.
And remember, validity measures accuracy.
And sometimes, we don't even want to have face validity.
We want to have the minimum of face validity, because sometimes people
will give what they think is the correct answer,
an answer that on the surface looks correct but doesn't
really measure the underlying thing we're looking at.
An example of that would be the Implicit Association Test,
developed at Harvard,
which measures people's attitudes about things.
But on the surface, the test-takers can't tell what attitude is being measured.
For example, we might be measuring prejudice.
What is your attitude towards African-Americans?
And if I simply asked you the question,
do you have any prejudice against African-Americans?
You're probably going to say, "No," even though you might have that prejudice.
So you have to have tests that have no face validity, so you can disguise
what it is you're really measuring and get an accurate answer.
So we have face validity.
There are other kinds of validity.
There's external validity whether what we are
measuring now really generalizes to the population as a whole.
We'll talk about that later.
And we have construct validity,
just like the nose length doesn't measure honesty.
Does what we're measuring really reflect the construct that we're looking at?
And there are three ways to look at that: content validity,
predictive validity, and discriminant validity.
And we'll talk about those now.
So first, let's talk about content validity.
First, psychological constructs can't be observed directly.
I can't look at memory or perception or attitudes.
I've got to make an inference about that particular construct by looking at
the behavior that I assume is an operational definition of it.
Second, psychological constructs are very
complex and often require multiple measures, not just a single measure.
For example, we might want to measure something that's very complex, like
an attitude, but then we have to have lots of
questions or lots of measures that together capture what that attitude is.
So we need to have a precise, complete,
and clear definition of the construct that we want to
study so we can measure it accurately,
have validity to our measures.
Let me give you an example of my own research.
You know I study memory,
and I study aging and memory,
and somebody might ask me, "Well, what have you learned?
Does aging affect memory?"
Well, that should be an easy question for me to
answer given that I study memory and I study aging.
But it's a very difficult question to answer because I've got to first know,
well, what kind of memory are you talking about?
Because memory is not a single construct.
It's a very complex construct, and we know now that it
is constructed from many different kinds
of memory that are actually located in different parts of the brain.
For example, we can have sensory memory.
We can have primary memory.
We can have working memory.
We can have episodic memory.
We can have semantic memory.
And we can have procedural memory.
Different kinds of memory that we can measure.
So when I am asked the question,
does aging affect memory?
I've got to first say, what kind of memory are you talking about?
Be precise in what it is you want to measure, because I have studied aging and memory.
Sensory memory, yes, there is an effect of aging. Primary memory, no.
Working memory, yes, and it declines with aging.
Episodic memory, yes, it declines with aging.
Semantic memory, yes, but it increases with aging.
And procedural memory, no.
Very different answers to my simple question,
does aging affect memory?
It all depends upon what it is I really want to measure,
what precise,
clear definition you want me to use when I deal with memory.
Okay, so how do we select a measure?
Well first, we can use existing measures from other research,
research that's already been validated
through the peer review system in publishing and journals.
So we look at what other people do when they
measure what it is they're trying to measure.
That's where I would learn,
for example, if I didn't know much about memory.
Well, there are different kinds of memory.
Now, I've got to know what kind of memory you're talking about.
Or we can develop a new measure,
our own measure of what the construct is.
But if we do that, we have to first show that
the new measure is both reliable and valid,
the two characteristics of a good measure, and of a good experiment.
Now we have predictive validity,
a second way of looking at validity,
and it's sometimes called criterion validity because we're
looking at how valid a measure is according to some criterion.
And validity here is determined by how well the measure predicts something.
Like for example, the SAT was designed to predict success in the first year of college.
So I could say it has validity,
that it's measuring what it's supposed to measure, by looking at
the SAT score and success in the first year of college,
and showing that they have the expected relationship.
Or, I have a new measure of test anxiety.
In order to show that it really is a valid measure,
I have to show that it actually predicts performance on tests.
Notice that the SAT score would predict a higher level of success in the first year of
college, and the test anxiety test would predict lower performance on tests,
so the prediction can be either a positive or a negative effect.
But I have to show that it predicts what I say it predicts,
that it meets the criterion that I designed the particular measure for.
And then the third is discriminant validity,
sometimes called divergent validity,
and that says that my measure measures what I say it's measuring,
which means it doesn't measure things that are not related to it.
If I show that my measure relates to things it should relate to,
I call that convergent validity.
If I show that it does not relate to things it shouldn't relate to,
that's discriminant, or divergent, validity.
But I have to measure whether or not I get those effects.
An example would be, let's say,
I have a measure that I say measures
one thing, and I have to show that it's not measuring other things.
In this example,
I have a sociability scale that I've developed, and I have to show that it does
converge with things like number of friends and hours spent alone,
with both a positive and a negative correlation.
But I want to show it's not related to other constructs associated with personality
like neuroticism and conscientiousness and that would be discriminant validity.
So, my new measure of sociability shows both convergent validity and
discriminant validity, depending on what it's supposed to measure and
what it's not supposed to measure.
So, we have internal validity,
whether or not there are confounding variables.
We talked about that earlier.
Now, we have construct validity,
and we know that there are different kinds of construct validity,
whether the content of the measure really measures the content of the construct;
predictive validity, whether it meets
the criterion it's supposed to reach;
and discriminant validity, whether it
discriminates between what I want to measure and what I do not want to measure.
Then the third kind of validity that we'll talk about later is external validity.
Validity and reliability,
the two things that make an experiment a good experiment. Thank you.