Welcome back. In the previous lecture, we talked about how we could use models to become clearer thinkers. In this lecture, what we're gonna do is talk about how we can use models with data. And this is an important reason why people use models. In fact, when you talk to scientists about why they use models, whether they are social scientists or natural scientists, what they'll typically say is, well, we use models to take them to data, to basically use and understand data in better ways. What I am going to do is unpack that in several directions. I wanna give some specific reasons or ways in which people use models with data.

Alright, so the first real reason is just to understand some basic patterns in the data. So what do I mean? Well, you could look at data and it could just be a flat line, where nothing changes. So if you look at a system where no energy enters or leaves, we know that energy is neither lost nor gained, so energy is a constant. And we have a model that explains why we see energy being a constant. Alternatively, we could see something that's a straight, increasing line, and we might have a model that explains that. And then we also talked about how we can see patterns in data. So we could see things that go up and down slowly like this, like business cycles, and we have models that tell us why we see these kinds of cyclic curves. We could have something that's much more spiky, and we could have a model that explains that. So, again, we talked about how there's this sort of hairball of data, this firehose of data. There's tons of data out there. That data's gonna have patterns to it. And what we can do is use models to understand why we see those particular patterns.

Okay. In addition to the
patterns, there's also the use of models to predict specific points. So suppose you are looking for a house and you see this house that's for sale and you're wondering, I wonder how much that house is gonna cost. Well, you can have a model that says, okay, the price of the house depends on its size. So here's sort of the size of the house in square feet. And here's the price. We just put a dollar sign there for price. And maybe you get a linear model. And your linear model says basically, for every, you know, additional square foot, the price of the house goes up $100 or $200 or something like that. Well then, if this is your model, and you've got a house that's got this many square feet, say it's 2,000 square feet, right, and you go up here and find the point, and it's $100 per square foot, then your model would predict that the house is $200,000. So we can use just a simple model to make some sort of prediction about, just in ballpark, how much a particular house would cost. So this is, again, a common use of models: to construct a model and, from that model, predict a point value.

Okay, third reason why we use
models. It's not so much to predict points, but to produce bounds. So suppose you're the economic advisor to the president, not a job you'd necessarily want, [laugh], but suppose you are. And the president comes to you and says, what's inflation gonna be next year or next month? Well, you know, inflation doesn't move that quickly. You might be able to say to the president, well, you know, I think it's gonna be 1.2%. And you might be pretty confident that it's 1.2%. But suppose the president says, you know what? I'm just doing some long-range forecasts, so what's inflation gonna be ten years from now? Well, who knows what inflation is going to be ten years from now? So you may have some fairly sophisticated models, but they're not going to give you a point estimate. So, instead, what they might say is, I can tell you with pretty high probability that it's going to be between zero and three percent. So it gives you a range. Right?
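Here is a minimal sketch of why the ten-year answer comes out as a range rather than a point. It assumes, purely for illustration, that inflation follows a random walk with independent yearly shocks. That assumption, the `forecast_range` function, and the 0.5-point shock size are all hypothetical and not from the lecture; only the 1.2% starting figure comes from the example above.

```python
import random
import statistics


def forecast_range(current_rate, yearly_shock_sd, years,
                   n_paths=10_000, coverage=0.90, seed=0):
    """Simulate many possible futures and return (low, median, high).

    Assumption (not from the lecture): each year the rate moves by an
    independent normal shock with standard deviation yearly_shock_sd.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    finals = []
    for _ in range(n_paths):
        rate = current_rate
        for _ in range(years):
            rate += rng.gauss(0.0, yearly_shock_sd)
        finals.append(rate)

    finals.sort()
    tail = (1.0 - coverage) / 2.0  # probability left out on each side
    low = finals[int(tail * (n_paths - 1))]
    high = finals[int((1.0 - tail) * (n_paths - 1))]
    return low, statistics.median(finals), high


# Hypothetical inputs: start at 1.2% with 0.5-point yearly shocks.
for horizon in (1, 10):
    low, mid, high = forecast_range(1.2, 0.5, years=horizon)
    print(f"{horizon:>2} year(s) out: roughly {low:.1f}% to {high:.1f}% "
          f"(median {mid:.1f}%)")
```

Under these assumptions the band for ten years out is much wider than the band for one year out, which is the sense in which the model gives bounds instead of a point estimate.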

So your model won't tell you exactly what's going to happen, 'cause there's too many contingencies out there, there's too much complexity, too much uncertainty. You can't say for sure, but your model might give you some bounds on what's going to happen, and that can be really useful for making policy decisions.

Okay. Reason four: retrodiction. What do I mean by that? Well, you can use models with data to predict the past. Now there's a couple reasons you might do this. One reason is you might not have data from the past, so you might want to sort of use models to figure out, what do we think the past was like? And this is something, you know, geologists do, biologists do, anthropologists do, archaeologists do. They use models and data to try and figure out, what do we think, you know, temperature was like, how many animals do we think there were, what were these civilizations like, those sorts of things. If you do have the data, then you can use it to see how good your models are: you can retrodict the data to see if in fact your model would've worked. Let me explain what that means. Suppose we're looking at some data streams. Perhaps it's, let's say, unemployment. Suppose the unemployment data looks like this for some period of time. Right. And now what you're doing is you're saying, okay, we've got a model. We're gonna ask how well that model would do. So what you do is you sort of fix that. You give that model data up to here. So it's fitting pretty well. And then at this point, right here, you say, hey, let's see how our model would predict from here on out. If you run your model and it sort of goes like this, if it goes like that, you can say, you know, if we had been using this same model in the past, it wouldn't have worked. And so that makes you fairly dubious about whether the model's gonna work now. So retrodiction, going back and testing against past data, is a good way to test how well your model really works.

Fifth reason: predicting
other stuff. So you might construct a model for one reason. Let's suppose you're really interested in the unemployment rate. You know, you construct a model to predict the unemployment rate. But out of that pops the inflation rate, so you get something else. This is a good way to tell, you know, how strong your model is, 'cause typically you construct a model for one reason and it gives you other stuff. There's another type of predicting other stuff that's way cool about models. So when they developed the first models of the solar system, right? The heliocentric model, the sun in the center, right? So you've got the sun sitting here in the center, and the planets orbiting. The math didn't quite work out right. And they figured out, there must be a big planet out here that's causing the orbits of the other planets to be skewed a little bit. And the big planet was Neptune. They couldn't see it. But their model predicted it. So the model predicted something, something else, something other, that was evident in the data. So models can predict stuff other than what you expect them to predict. Which is really cool.

Alright, six: to inform data collection. So let's suppose that you're
interested in educational reform, which is something I'm interested in. You want to think, okay, how do we make better schools? Well, remember in our last lecture about being a clearer, better thinker: one thing models force us to do is name the parts. So, I want to think, how do we make better schools? Well, there's a lot of data out there on school performance. So what I want is some sort of model that explains why students do poorly and why students do well. So you think, well, what are the parts of that model? Well, it might be things like teacher quality, we call that TQ, right? There might be parental status, we call that PS: whether your parents went to college, whether they got high school degrees, whether they're doctors, lawyers, that sort of thing. There might be total spending in the school district, that might matter, right? Things like class size, just put CS for class size. Class size probably matters a lot. Right? You might argue that, you know, technology matters: is there technology in the classroom? You might even argue, you know, there's general health: is health a big consideration? And you can even, you know, ask, what are the other students like in the school? What are the peer effects? What is the effect of what other students do? So if you don't have a model, you don't even know what data to go get. So models help you to figure out, okay, what data should we get, what data should be included, and what data should we go out there and find. So that use of models can be very valuable, since it tells you what data to go out there and get.

Our last two reasons for why we model are a little bit different, but
they're similar to one another. And that is that we can use data, right, to sort of tell us more about the model, and then we can use the model to tell us more about the world. So let me explain what I mean a little bit. [inaudible] confused. So, one thing is that these models let us estimate hidden parameters in the model. So, here's sort of a classic model from epidemiology, the study of disease. It's called the SIR model. So there's three types of people: there's susceptible people, there's infected people, and there's recovered people. So if there's a disease, you could be susceptible to it, you could be infected, or you could be recovered, and when you're recovered, then you're immune. You're not gonna get it again. So let's suppose that, you know, you work for the Centers for Disease Control, and suddenly you see, oh my gosh, people are getting sick. There's some sort of flu going on, but you're not quite sure how this is spreading. Is it airborne, right? Is this virus spreading, you know, through mucus or something? You're not sure. And you're also not sure how virulent it is, so you're not sure how many people are gonna get the disease. What you've got, let's draw a little graph where you've got time on this axis, and you've got the number of people who have the disease. And what you can do is you can sort of see, over time, exactly how many people are getting the disease. Well, if you can see over time how many are getting it, from that data you can estimate how virulent the disease is. Like, how likely it is to pass from one person to the other. And that's gonna allow you to figure out, is the disease gonna go like this, or is it gonna go like that? And so, from that data, you can estimate hidden parameters, right? Namely, how virulent the disease is. Like, you can't tell just by looking at the world how likely one person is to get it from another. But by looking at how many people get it, you can go back and estimate that parameter. You can figure it out. That's what's really cool.

Alright? Last reason,
calibration. So calibration refers to sort of constructing a model and then calibrating it as close as possible to the real world. Let me give an example here. So suppose I want to write a model of forest fires. So I'm going to draw some really bad trees here. Here's a tree. Here's another tree, right. And I want to know what's the probability, these are horrible trees, what's the probability that the fire moves, right, from this tree to this tree. How fast does it move, and all that sort of stuff. Well, what I can do, what I can gather, if the data exists, is tons of data about past forest fires, and with that past data I can calibrate a really accurate model of forest fires. How likely are they to spread? How does their speed depend on how dry the trees are, how much precipitation there's been, what the wind speed is, all that sort of stuff? Once I've got all that data, that would allow me then to figure out, you know, how dangerous are particular forests? Right? I could say, oh my gosh, northern New Mexico hasn't had rain in over two years. Here's how dry the soil is. Here's how dry the trees are. Here is, you know, how many acres of forest we have. Here's what the wind speed is. And you can know exactly how dangerous a particular forest happens to be at that particular moment in time. So you use all sorts of past indexes to calibrate a particular model, you know, your big model, and then you can use that model to construct policy.

And that's what we're going to talk about in the next lecture, right: how do we use models to make decisions, to strategize, right, and to design things. Thank you.
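As a rough illustration of the calibration idea from the forest fire example, the sketch below estimates a tree-to-tree spread probability from past records and then uses it to score a forest. Everything named here is hypothetical: the `FireRecord` fields, the dry versus wet split, the sample counts, and the simplifying assumption that each jump between trees is independent. None of it comes from the lecture or from real fire data.

```python
from dataclasses import dataclass


@dataclass
class FireRecord:
    """One past observation (hypothetical schema, for illustration only)."""
    dry: bool            # were conditions dry when this was recorded?
    opportunities: int   # times a burning tree stood next to an unburned one
    spreads: int         # how many of those times the fire actually jumped


def calibrate_spread_probability(records):
    """Return {dry_flag: spread probability} estimated from past records."""
    totals = {}
    for rec in records:
        opp, spr = totals.get(rec.dry, (0, 0))
        totals[rec.dry] = (opp + rec.opportunities, spr + rec.spreads)
    # Observed frequency of spreading under each condition.
    return {dry: spr / opp for dry, (opp, spr) in totals.items() if opp > 0}


def chance_fire_crosses(p_spread, n_gaps):
    """Chance the fire makes n_gaps consecutive jumps, assuming independence."""
    return p_spread ** n_gaps


# Made-up counts standing in for "tons of data about past forest fires".
past_fires = [
    FireRecord(dry=True, opportunities=100, spreads=80),
    FireRecord(dry=True, opportunities=100, spreads=70),
    FireRecord(dry=False, opportunities=200, spreads=50),
]

calibrated = calibrate_spread_probability(past_fires)
for dry, p in sorted(calibrated.items()):
    label = "dry" if dry else "wet"
    print(f"{label}: spread probability {p:.2f}, "
          f"chance of crossing 5 gaps {chance_fire_crosses(p, 5):.3f}")
```

A model used for real policy would condition on many more inputs, such as wind speed and precipitation, but the shape is the same: past data sets the parameters, and the calibrated model is then applied to current conditions.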
