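This course is designed to shape the way you think about turning data into better decisions. Recent dramatic improvements in data-collection technology have changed the way firms make effective decisions.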

A course from the University of Pennsylvania

Operations Analytics

From this lesson

Introduction, Descriptive and Predictive Analytics

In this module you’ll be introduced to the Newsvendor problem, a fundamental operations problem of matching supply with demand in uncertain settings. You'll also cover the foundations of descriptive analytics for operations, learning how to use historical demand data to build forecasts for future demand. Over the week, you’ll be introduced to underlying analytic concepts, such as random variables, descriptive statistics, common forecasting tools, and measures for judging the quality of your forecasts.

- Senthil Veeraraghavan, Associate Professor of Operations, Information and Decisions, The Wharton School

- Sergei Savin, Associate Professor of Operations, Information and Decisions, The Wharton School

- Noah Gans, Anheuser-Busch Professor of Management Science, Professor of Operations, Information and Decisions, The Wharton School

Hi everyone.

Welcome to Operations Analytics.

I'm very excited to have you in the course.

I'm Senthil Veeraraghavan.

I'm an associate professor in the Department of Operations, Information and

Decisions at the Wharton School.

I'll be introducing you to the Operations Analytics course.

In week one you're going to focus on descriptive analytics.

Uncertainty about future events is a key feature of the world that we live in.

Uncertainty plays a big role in many business decisions.

Operations is often about making good decisions in such uncertain settings.

First, I'm going to introduce you to one of the most central problems in

operations, a problem of matching demand with supply

in uncertain settings, called the newsvendor problem.

We need to be able to describe uncertainty in our data

in order to make better decisions.

Hence, we're going to go over methods to forecast future outcomes.

We're going to learn how to think about forecasting when there is trend or

seasonal variation.

The descriptive analytics that we will cover in this week, as you will see,

will provide a strong conceptual basis for predictive and prescriptive analytics.

In the next three weeks, you'll learn how to use the analytics toolkit

to evaluate different courses of action, to optimize, and

to choose the best possible action.

Once again, I'm very excited to have you here.

Hello, welcome to Operations Analytics.

This is the first week of Operations Analytics,

where we are going to be covering descriptive analytics.

In descriptive analytics, I'm going to be covering all four sessions.

Here is how we are going to split the material we are going to cover in

four sessions.

In the first week, in the first Session, I'm going to

talk about an operational decision problem called the newsvendor problem.

After I introduce the newsvendor problem, I'm going to talk about

random variables and I'm going to talk about demand distributions.

And that leads us to Session two.

In Session two,

I'm going to be talking about forecasting using past historical data.

Then I will talk about moving averages,

then I will talk about exponential smoothing in the advanced material.

In Session three, I'm going to be talking about

forecasting when the data shows trend or seasonality.

Finally in Session four, I'm going to be talking about forecasting for

new products and about fitting demand distributions.

And these are the four sessions we will be covering in the first week of our course.

In the first week, again, we'll focus on descriptive analytics.

Now let's jump right into Session one,

where I'm going to talk about an interesting operational decision problem.

Before we dive into analyzing data,

let's take a look at a fundamental problem that firms face.

It's an operations problem on how much to produce.

To answer this question, we need to know or estimate the cost of the product,

the price of the product and some data on the demand of the product.

Let's explore a problem to get started.

Here's a fundamental problem in operations.

I'll give you an example.

Suppose that you're making operations decisions for

a retailer, who orders a product from a supplier, and sells it to customers.

The ordered product items are received and placed on a store shelf.

Suppose there is a large customer population, and

each customer in the population may choose to buy or not buy the product.

If the customer chooses to buy, he arrives at the store to buy the product.

He buys it as long as it is available on the shelf.

However, you have to order the product before you see the customer demand

since you have to have the items available on the shelf.

Suppose that you only get one chance to order, that is,

you cannot really change your purchase order after you make your decision.

Let's take a look at the costs in our operations problem.

You order the product from the supplier at some cost, three talers per item.

Talers are just some currency units that we're going to use for this example.

After your order is received and placed on the shelves, some demand occurs.

The product on the shelf sells at price 12 talers an item.

All unsold items at the end of the season or at the end of the day are salvaged.

You get no money from salvaging, that is the salvage value is zero talers per item.

You lose all the money if you buy and don't sell it.

Now, let's look at a timeline of events to understand the problem better.

Here's a timeline of events.

This is what happens in every period.

You submit an order to your supplier.

And the cost of purchase is three talers an item.

You receive all the ordered items, so whatever you order, you will receive it.

You receive these items almost immediately.

Very small time window passes before you receive them.

And you store them and you shelve them immediately.

Once you shelve them, there is some uncertain demand.

Customers choose to come to the store and as they come they

see the items on the shelf and they buy them as long as they're available.

Let's say the selling price is 12 talers an item.

And the key factor here is, the demand is uncertain.

So, you really do not know how many customers are exactly going

to turn up in this store and buy your item.

If you sell all the items that you have bought, good.

But sometimes, if you have leftover items that remain unsold,

they have to be salvaged.

In this example we're gonna assume the salvage value is zero talers an item.

So whatever is left is given away, and you lose all the money.

And that's it.

The problem ends, and then the next period you have to make an order and

meet the demand in the next period and so on and so forth.

Let us focus on an important point I mentioned before.

The demand is uncertain.

That is, demand could be anything.

It could be high, low, and so on.

Suppose you bought ten items.

Let us look at a high demand scenario.

Let us suppose the demand is 100.

You will sell all ten items.

Even though the demand was 100, you could sell only ten because that's all you got.

You sell all ten items and make a profit on all those ten items, 10*(12-3),

you bought it at three and you sold it at 12, so 10*(12-3) is 90 talers.

That's your profit.

Even though your demand was 100, you sold only ten and that's the profit you made.

There's also a low demand scenario: suppose there's no demand.

In this case you bought 10 items and the demand is nothing.

So you sell nothing and you lose all the money spent buying those items:

you bought ten items at 3 talers each, and therefore you lose 30 talers.

This is because there is no salvage value for items that are left over.
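The two scenarios above follow one rule: you earn the selling price on whatever you sell, and you pay the purchase cost on everything you ordered. Here is a minimal Python sketch of that rule. The function name `newsvendor_profit` and its parameter names are illustrative choices, not something from the lecture slides; the default numbers are the ones used in the example.

```python
def newsvendor_profit(order_qty, demand, price=12.0, cost=3.0, salvage=0.0):
    """Profit for one selling period, in talers."""
    units_sold = min(order_qty, demand)   # cannot sell more than is on the shelf
    leftover = order_qty - units_sold     # unsold items are salvaged
    return price * units_sold + salvage * leftover - cost * order_qty


# High demand scenario from the lecture: order 10, demand 100.
print(newsvendor_profit(10, 100))  # 90.0
# Low demand scenario from the lecture: order 10, demand 0.
print(newsvendor_profit(10, 0))    # -30.0
```

Keeping `salvage` as a parameter is a design choice that makes it easy to explore what happens when leftover items are worth something, even though it is zero in this example.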

With this timeline in mind, let's look at the problem.

Based on the information before, let's recap the problem.

To recap, you don't know what the demand is going to be, and

you have to decide on the number of units to order from the supplier

before seeing the customer demand.

In this case, what could help?

Past demand data could be helpful.

Fortunately, we have demand data from the past 100 periods.

Here's the past demand information.

In the graph that you see,

you see demands that were observed in the past 100 periods.

As you can notice, there's a lot of variability in demand.

In the first period of the observations, you see the demand was 29.

In the last period of observations you see the demand was 41.

Let's understand the demand part some more.

Here's some more information about the past demand data.

From the observations over the past 100 such periods

we see that the maximum demand observed was 81.

And the minimum demand observed was 15.

You can even calculate the arithmetic average of

those 100 observations and that is 52.8.

Based on the data I'm going to ask you to go through an exercise.

An exercise on deciding how much to order.

Before you make your decision, let's go through the following points.

First, there is no penalty for a wrong answer, or conversely,

no extra course credit for the right answer.

It's honor based, but you get one attempt at making your decision.

The objective of the exercise is not to test you or

to grade you, but to set an initial baseline for

how to think about these problems as we start the course.

So I ask you the following.

Write down your answer on a sheet of paper or a post-it note and

keep the sheet or the note throughout the course.

We will see the best answer in the course and you will then get a chance to

compare your answers and calibrate the learning progress.

How much would you order?

That is the question.

Suppose you're a manager contemplating the question of how many items to order from

the supplier.

Choose the quantity (Q) that you will order.

Once you select Q, the market will produce 50 random demand instances

from a distribution of demand similar to the figure I showed you.

Each random demand instance will correspond to the demand value

you may face in the coming selling season.

Your objective is to select Q to maximize total profit that you

will earn when faced with these 50 random demand values.

Now, take a moment and write down your answer

on a sheet of paper or a post-it note.

Once you have written down the answer,

we are now ready to move on to the next slide.

The problem you just saw is called a Newsvendor problem.

Its characteristics are the following.

You have an objective, usually maximize profits, minimize costs, or

improve market share, etc.

You have to make one decision.

Usually, how much to buy or how much to plan for, and

this happens before you see the future demand.

Then demand occurs, profits and costs are realized.
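To see this structure in action, one could simulate it: fix an order quantity, draw a set of random demands, and add up the realized profits. The sketch below does that for 50 demand draws and tries every candidate quantity from 0 to 100. It assumes, purely for illustration, that demand is uniform on the integers 15 to 81 (the observed minimum and maximum); that is a placeholder and not the distribution behind the course exercise, so the quantity it prints is not the answer to the exercise. The helper names are hypothetical.

```python
import random


def profit(order_qty, demand, price=12, cost=3):
    """One-period profit with zero salvage value, in talers."""
    return price * min(order_qty, demand) - cost * order_qty


def total_profit(order_qty, demands):
    """Total profit of a fixed order quantity across many demand draws."""
    return sum(profit(order_qty, d) for d in demands)


# ASSUMPTION: demand is uniform on the integers 15..81.
# This is a placeholder, not the distribution used in the course exercise.
rng = random.Random(0)
demands = [rng.randint(15, 81) for _ in range(50)]

# Try every candidate order quantity and keep the most profitable one.
candidates = range(0, 101)
best_q = max(candidates, key=lambda q: total_profit(q, demands))
print(best_q, total_profit(best_q, demands))
```

A brute-force search over candidate quantities is enough here because there is a single decision to make; the course develops more principled ways of choosing the quantity.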

This is called a Newsvendor problem because it is similar to

a vendor who sells newspapers.

You buy too much, and you may be left with unsold newspapers.

Or you buy too little, and you'll forego revenue opportunity.

In this course, we will show you how to think about this problem and

how to analyze this problem.

Now I'm going to show you an application of the Newsvendor problem

at Time Magazine, which is itself a newsvendor.

In the Time Magazine supply chain they had the following problem.

The stores were either selling out inventories,

which means they had too little inventory.

Or they sold only a small fraction of the allocation,

which means they had too much inventory.

So, this is a Newsvendor problem.

For every issue they printed, Time Magazine evaluated and adjusted the following.

One, the national print order,

which is the total number of copies printed and shipped.

Two, the wholesale allotment structure,

which is how those copies were allocated to different wholesalers.

Three, the store distribution,

which is the final distribution of the magazines to the stores.

Note, the above three decisions are made before the actual demand for

the weekly issue is realized.

Therefore, they need to analyze past data and

they have to be able to forecast future demand.

In fact, Time Magazine reports saving

about $3.5 million annually from tackling the Newsvendor problem.

This story is captured in a white paper by Koschat

in Interfaces Magazine in the year 2003, volume 33.

Other than the Time Magazine Newsvendor problem,

let's look at some broad applications of the Newsvendor problem.

Here are some of them.

Every year, governments order flu vaccines before the flu season begins, and

they make this decision before the extent or the nature of the flu strain is known.

One question is, how many vaccines to order?

This is a Newsvendor problem because you have to know how to make your decision

before the demand is known.

Smartphone users buy mobile data plans before they know their

actual future usage.

In this case, what's the right plan for you?

This is a Newsvendor problem again,

because you have to make a decision before future demand is known.

Consumers buy healthcare insurance

plans before they know their actual health expenditures.

Again, how to think about the right plans.

This is also a Newsvendor problem.

For all the above examples that we saw, some forecast of

future demand is essential.

Let's think about how to do this.

It is essential to forecast future demand.

So, let's understand what forecasting is all about.

What's forecasting?

The primary function of forecasting is to predict the future.

Why are we interested in predicting the future?

Because it dictates the kind of decisions we make today.

If we know something about the future, we can make better decisions today.

Who uses forecasting in their jobs?

A lot of jobs use forecasting.

Typically, generally speaking,

we forecast demand for products, we forecast demand for services.

We forecast inventory needs, we forecast capacity needs daily and so

on and so forth.

But what makes a good forecast?

First, a forecast should be timely, it should be reliable,

it should be as accurate as possible, and it should be in meaningful units.

The forecasting method should be easy to use, and be understood in practice.

Let's look at characteristics of forecasts.

Point forecasts are usually wrong.

In fact, this is the first rule of forecasting.

Why?

Let me give you a couple of examples.

I forecast in December 2015, there will be 37 centimeters of snow.

I forecast we will sell 314 umbrellas during the rains next week.

It is very likely that these forecasts will turn out to be wrong.

For example, you could have 37.5 centimeters of snow in December.

Or you could sell 317 umbrellas

during the rains next week or we could deviate even more.

This happens because the demand could be a random variable.

It could deviate from your forecast.

Therefore, a good forecast should be more than a single number.

Typically we provide mean and standard deviation.

You could also provide a range, high and low for example.

For example, TV weather forecasts provide the high temperature tomorrow and

the low temperature tomorrow.

That's a way of providing more than a single number.
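As a small illustration of reporting more than a single number, the Python sketch below summarizes a list of past observations by its mean, its standard deviation, and its low-to-high range. The sales figures are hypothetical, invented only for this example; they do not come from the lecture.

```python
import statistics

# ASSUMPTION: hypothetical weekly umbrella sales, invented for illustration.
past_sales = [298, 325, 310, 290, 340, 315, 305, 322]

mean_sales = statistics.mean(past_sales)
std_sales = statistics.stdev(past_sales)  # sample standard deviation

print(f"forecast: mean {mean_sales:.1f}, standard deviation {std_sales:.1f}")
print(f"observed range: low {min(past_sales)}, high {max(past_sales)}")
```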

We have to think about modeling an uncertain future.

Usually we can model the future through probability distributions.

Let's think about this further.

We often do not control purchasing behavior.

As a result, we cannot predict future demand with certainty.

So how do we describe uncertain future demand?

We can try to decide what future demand scenarios are possible and for

each scenario, estimate the likelihood of its realization.

So where do scenarios come from?

They could come from your past data or they could come from expert estimates.

Let's look at an example of a model of future demand.

Let's start by looking at a small number of scenarios, let's say, three scenarios.

Let's call them high demand scenario, ordinary demand scenario,

and low demand scenario.

Let's say that high demand scenario corresponds to a demand value of 80.

Ordinary demand scenario to the value of 50.

And low demand scenario to a value of 20.

For each scenario, a likelihood of its occurring must be estimated.

In our example model of future demand,

we have to estimate how likely each scenario is.

Where do the estimates of the likelihood come from?

They come from statistical analysis of past data.

Suppose that after analyzing the past data and

using some subjective inputs, we estimated that the scenarios

have the following likelihoods of being realized in the next selling season.

Likelihood of high demand is 20%.

Likelihood of normal or ordinary demand is 70% and likelihood of low demand is 10%.

In our scenario analysis, the projected demand is not

equal to a single number with a sure probability of one, but

rather can take any of the three values with the corresponding probabilities.

In essence, we have just created a probability distribution for

the future demand.

Demand could be 80 with probability p1 = 0.2.

Demand could be 50 with probability p2 = 0.7.

Demand could be 20 with probability p3 = 0.1.

Probability distributions like the one I just described

are defined by a number of distinct scenarios, each with an attached probability.

And such probability distributions are called discrete probability distributions.

Finally, note that the probabilities, 0.2, 0.7, and 0.1,

are all greater than 0, and they all add up to 1.

That is, 0.2 plus 0.7 plus 0.1 is equal to 1.

In this slide, I show you what the probability distribution looks like.

The scenarios are shown here, 20, 50, and 80 and

the corresponding probabilities are shown.

Probability distributions are typically described by mean and standard deviation.

For any probability distribution,

even a simple one reflecting three demand scenarios,

which we just saw, two useful descriptive quantities are often calculated,

mean, which is also known as the expected value, and standard deviation.

Let's try and describe them.

For a discrete probability distribution, the mean is just defined

as the sum of the products of scenario values and their probabilities.

For a demand distribution, the mean, represented by D bar

will be p1 times D1 + p2 times D2, + p3 times D3 or

0.2 times 80 + 0.7 times 50 +

0.1 times 20, which gives us 53.

How do we interpret this? The mean of 53

reflects the demand value that we will get on average in a selling season if

we keep drawing observations over an infinite number of selling seasons.

In other words, if you keep drawing observations over infinite selling

seasons, the average value, or your expectation, should be 53.
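The mean calculation can be checked with a few lines of Python. This is only a sketch: storing the distribution as a dictionary from demand value to probability is an illustrative choice, and the variable names are not from the lecture.

```python
# Three-scenario demand distribution from the lecture: demand value -> probability.
demand_dist = {80: 0.2, 50: 0.7, 20: 0.1}

# Probabilities must be positive and add up to 1.
assert all(p > 0 for p in demand_dist.values())
assert abs(sum(demand_dist.values()) - 1.0) < 1e-9

# Mean (expected value): sum of value * probability over all scenarios.
mean_demand = sum(d * p for d, p in demand_dist.items())
print(round(mean_demand, 2))  # 53.0
```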

I showed you the distribution before, now on that graph let me show you the mean.

The red line, the red vertical line, represents the mean or

the expected value of the distribution.

Now we will look at the standard deviation.

Standard deviation describes, roughly speaking,

how far away the actual random variable values are from the mean, on average.

In other words, in the colloquial sense,

it describes how spread out your distribution is around its mean.

How can you calculate standard deviation?

Standard deviation is defined as the square root of

the sum of the following terms:

the products of the scenario probabilities

with the squares of the differences between the scenario values and the mean value.

Let me say that again.

You take the scenario value and the mean value, take the difference.

Square the difference.

Multiply it by the scenario probability.

Do this for every scenario.

And then add them up.

And then take the square root.

For example, for the three-scenario demand distribution we considered,

the standard deviation is calculated as follows.

Take the difference between the scenario demand and the average.

Square that.

Multiply it by the probability.

And do this for every scenario.

Take the scenario demand and its difference from the average.

Square that and multiply it by the corresponding probability and

do this for every scenario.

You get 0.2 * (80 - 53) squared + 0.7 * (50 - 53) squared +

0.1 * (20 - 53) squared, which is 261. Taking the square root,

we get 16.16 as your standard deviation.
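The same steps can be traced in Python as a quick check. This is a sketch under the same illustrative representation, a dictionary from demand value to probability, with each scenario paired with its own probability (80 with 0.2, 50 with 0.7, 20 with 0.1).

```python
import math

demand_dist = {80: 0.2, 50: 0.7, 20: 0.1}  # demand value -> probability

mean_demand = sum(d * p for d, p in demand_dist.items())

# Variance: probability-weighted squared differences from the mean.
variance = sum(p * (d - mean_demand) ** 2 for d, p in demand_dist.items())
std_demand = math.sqrt(variance)

print(round(variance, 2), round(std_demand, 2))  # 261.0 16.16
```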

I showed you the graph of the probability distribution in green before.

I also showed you the mean,

which is represented by the vertical red line, mean of 53.

Now, I show you roughly what the standard deviation looks like.

The standard deviation is a measure of how

spread out your probability distribution is around the mean.

Knowledge of the mean and standard deviation values helps us build

a general intuition about the nature of the random variable.

What if we have more than three scenarios?

It is somewhat straightforward to do this.

So let's think about the following n scenarios.

Demand 1 with probability p1, demand 2 with probability p2,

demand 3 with probability p3.

And so on up to demand n with probability pn.

Now all these probabilities are positive and they all add up to 1.

Which is p1 + p2 + p3 + and so on and so forth up to pn = 1.

Now, how do we calculate the mean and

standard deviation of this demand distribution with n scenarios?

It's again a straightforward extension of the three-scenario case.

For the mean, or the expected value, D bar,

we calculate p1 times D1 + p2 times D2 + p3 times D3,

and so on, up to pn times Dn.

For standard deviation,

we calculate the difference between the scenario and the average.

D1 minus D bar, square it, and multiply it by the corresponding probability p1.

Do the same for the second scenario:

p2 * (D2 - D bar) squared.

And so on and so forth, up to the last scenario,

where the term is Dn minus the average, the whole thing squared, multiplied by pn.

Once you have the sum of all these terms,

take the square root of the entire sum and that gives you the standard deviation.
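Written compactly, the n-scenario formulas described in words above look as follows. The symbol \sigma for the standard deviation is a notational assumption here; the mean is written \bar{D} to match "D bar".

```latex
\bar{D} = \sum_{i=1}^{n} p_i D_i,
\qquad
\sigma = \sqrt{\sum_{i=1}^{n} p_i \left(D_i - \bar{D}\right)^2},
\qquad
p_i > 0, \quad \sum_{i=1}^{n} p_i = 1.
```

With n = 3 and the values from the earlier example, these give \bar{D} = 53 and \sigma \approx 16.16.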

So far we have looked at a discrete probability distribution

with a number of future scenarios.

Each scenario with some attached probability.

But what will happen to a discrete probability picture when

the random variable being modeled has a really large number

of scenarios on any small interval of values,

and the probability that any one scenario is realized is really small?

Think of examples such as stock prices or the amount of rainfall in a region.

For example, there are very many possibilities and

very many scenarios of rainfall being between 37 centimeters and 39 centimeters,

and the probability of the rainfall being exactly one scenario,

let's say 37.1 centimeters, is really low.

In cases like this, it makes sense to describe such probability distributions

using groups of scenarios rather than focusing on each individual scenario.

Distributions like this are called continuous distributions.

In the picture below I show you the distribution of a random variable X.

The values of X are on the x-axis, and

the corresponding probability densities are on the y-axis.

In the case of continuous distributions,

we're going to look at groups of scenarios rather than a single scenario.

Again, the light green area shows you the probability that

random variable X takes values between a minimum of X1 and a maximum of X2.

And the area under the entire curve is equal to 1.

If I ask you, what's the probability that random variable X can take any value

between the lowest possible point and

the highest possible point, the probability must be equal to 1.

And therefore, the area under the entire curve is equal to 1.

One of the most popular examples of a continuous probability distribution

is the normal distribution.

Normal distribution allows for the random variable X to take any value from negative

infinity to positive infinity as you see in the graph.

The nice thing about normal distribution is that it is completely characterized by

two parameters, the mean mu and the standard deviation sigma.

Normal distribution looks like a bell curve.

And it's likely the most commonly encountered distribution.

There exist statistical formulas, also implemented in Excel,

that calculate a probability that a normal random variable with given mean and

given standard deviation, will produce a value between X minimum and

X maximum in this interval.

And this can be calculated on Excel.
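The same interval probability can be sketched outside Excel as well. The Python example below uses `NormalDist` from the standard library's `statistics` module and takes the difference of the cumulative distribution at the two ends of the interval. The mean, standard deviation, and interval are assumptions chosen for illustration, not values from the lecture. In Excel, the cumulative form of `NORM.DIST` plays the role that `cdf` plays here.

```python
from statistics import NormalDist

# ASSUMPTION: illustrative parameters, not values given in the lecture.
mu, sigma = 53.0, 16.0
x_min, x_max = 37.0, 69.0  # one standard deviation either side of the mean

demand = NormalDist(mu, sigma)
# Area under the curve between the bounds.
prob = demand.cdf(x_max) - demand.cdf(x_min)
print(round(prob, 4))  # 0.6827
```

The result is the familiar fact that about 68% of a normal distribution lies within one standard deviation of its mean.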

Other than the normal distribution,

there exist a large number of other popular continuous probability distributions,

such as the exponential distribution, the beta distribution, etc.,

with easily computable mean and standard deviation or variance.

Each of those distributions is often used to describe

a specific uncertain setting or quantity.

For example, normal distribution is often used to describe the distribution

of future percentage changes in the values of stocks or in FX rates.

Another example: exponential distribution can be used to characterize

the time between successive arrivals of customers in a service system,

such as a call center.
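To make the call center example concrete, here is a hedged Python sketch that draws times between successive arrivals from an exponential distribution using the standard library's `random.expovariate`. The arrival rate of 2 calls per minute is an assumption picked only for illustration.

```python
import random

# ASSUMPTION: an arrival rate of 2 calls per minute, chosen only for illustration.
rate_per_minute = 2.0
rng = random.Random(42)

# Times between successive arrivals, drawn from an exponential distribution.
gaps = [rng.expovariate(rate_per_minute) for _ in range(20_000)]

average_gap = sum(gaps) / len(gaps)
print(f"average gap: {average_gap:.3f} minutes (theoretical mean {1 / rate_per_minute})")
```

With many draws, the average gap should land close to one over the arrival rate, which is the mean of an exponential distribution.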

Let's return to characteristics of forecasts.

Point forecasts are usually wrong.

Why?

This is because the demand could be a random variable.

In the past few slides, we've been looking at how to describe demand distribution and

how to characterize random variables.

Using that information a good forecast should be more than a single number.

Forecasts should include some distribution information, typically the mean and

standard deviation.

It is also worth remembering that aggregate forecasts are usually more

accurate, and the accuracy of forecasts erodes as we go further and

further into the future.

Therefore long term forecasts are less accurate than short term forecasts.

Finally, don't exclude known information in your forecasting process

unless you have a good reason to do so.

Let’s examine some subjective forecasting methods.

First, composites. Composites are a way of aggregating forecasts that come

from different locations, or different people, or different geographies.

For example, there are sales force composites.

Sales force composites are found by aggregation of sales personnel's

estimates of demand.

There are election polling composites:

there are websites that aggregate polling data and put them together.

There are customer surveys.

Customers provide the subjective evaluation of different services or

demand and that's put together.

There's the jury of executive opinion,

which collects informed opinions from executives and puts them together.

Then finally, there is the Delphi Method in which the individual opinions

are compiled and reconsidered.

And you compile and reconsider and

repeat the process until a group consensus is hopefully reached.

We'll get into subjective forecasting methods near the end of week one,

during our last session.

Let's look at some subjective forecasting methods.

The methods we're gonna look at are composites.

Composites could be an aggregation of data.

For example, there's a sales force composite where

the estimates from sales personnel are collected and aggregated together.

Sales personnel are commonly in touch with customers.

And they collect customer data and they collect forecast information and

that's aggregated to come up with a subjective forecast.

Similarly, election polling composites exist and these are done by

websites that aggregate polling data that is collected from customers.

Similarly, customer surveys.

Or a jury of executive opinion where there is a limited number of experts

put together in a single location who come up with a forecast and

then combine the forecast together.

Delphi Method.

In Delphi Method the individual opinions are compiled and reconsidered.

The collected opinions are put together and

it is examined whether a group consensus has evolved.

If there is no consensus then the individual opinions are recompiled and

reconsidered and this process is repeated until consensus is reached.

There are many ways of doing subjective forecasting.

We will return to subjective forecasting methods at the end of week one,

in our last session when we have limited demand data and see how we can do better.

For now, we're going to focus on objective forecasting methods.

How do we forecast objectively using past data?

We can leverage past data to come up with forecasts.

And two primary methods that are used for

forecasting are causal models and time series models.

Causal models are models that are explained through causal analysis.

Let's see what it means.

Let D be the demand or future outcome that you need to predict and assume

that there are n variables, or n root causes, that will influence the demand.

A causal model is one in which the demand D

is formulated as a function of those n root causes.
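In symbols, a causal model has the general shape below. The names x_1, ..., x_n for the root causes and f for the function are notational assumptions, and the linear form in the second line is just one common choice shown for illustration, not a model used in this course.

```latex
D = f(x_1, x_2, \ldots, x_n)
% One common illustrative special case (an assumption, not from the lecture):
% D = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon
```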

As you can imagine, causal models are generally intricate and

complex and therefore need advanced tools in addition to domain expertise.

In this course, we will focus mainly on time series based models.

What's a time series method?

A time series is just a collection of past values of

the variable that's being predicted.

In fact, it can be considered as a naive method.

The goal is to isolate patterns in the past data

to come up with good predictions about the future using the past data.

The past data might have characteristics such as trend,

seasonality, or cycles in the data, or just randomness in your data.

We use these patterns to come up with descriptive statistics,

which are then useful in coming up with a prediction or a forecast.

And that's what we will do in the next few sessions.

Continuing in the same vein,

we're gonna be doing forecasting, leveraging the past historical data.

In particular, I'm gonna be looking at two methods.

One, the moving averages method, and

two, exponential smoothing, which I will cover in the advanced slides.

In this session, we saw one of the fundamental problems in operations,

called the Newsvendor Problem.

I showed you an example of the Newsvendor Problem,

in which you have to make an operational decision under uncertainty.

We also saw an application of the Newsvendor Problem

at a well known magazine firm.

I've emphasized the importance of making good decisions in the face of uncertainty.

In this course we hope to guide you to making better decisions.

To make better decisions,

first we need to be able to describe the uncertainty in the data we collect.

And we also need to use these data to forecast future events.

These are exciting concepts.

We'll see more of these concepts in the next session.