你是否好奇数据可以告诉你什么？你是否想在关于机器学习促进商业的核心方式上有深层次的理解？你是否想能同专家们讨论关于回归，分类，深度学习以及推荐系统的一切？在这门课上，你将会通过一系列实际案例学习来获取实践经历。在这门课结束的时候，

Loading...

From the course by University of Washington

机器学习基础：案例研究

7080 ratings

你是否好奇数据可以告诉你什么？你是否想在关于机器学习促进商业的核心方式上有深层次的理解？你是否想能同专家们讨论关于回归，分类，深度学习以及推荐系统的一切？在这门课上，你将会通过一系列实际案例学习来获取实践经历。在这门课结束的时候，

From the lesson

Deep Learning: Searching for Images

You’ve probably heard that Deep Learning is making news across the world as one of the most promising techniques in machine learning. Every industry is dedicating resources to unlock the deep learning potential, including for tasks such as image tagging, object recognition, speech recognition, and text analysis.<p>In our final case study, searching for images, you will learn how layers of neural networks provide very descriptive (non-linear) features that provide impressive performance in image classification and retrieval tasks. You will then construct deep features, a transfer learning technique that allows you to use deep learning very easily, even when you have little data to train the model.</p>Using iPhython notebooks, you will build an image classifier and an intelligent image retrieval system with deep learning.

- Carlos GuestrinAmazon Professor of Machine Learning

Computer Science and Engineering - Emily FoxAmazon Professor of Machine Learning

Statistics

[MUSIC]

Great. We talked about an application of finding

cool shoes or dresses just based on image features.

The technique that we're gonna use today is called deep learning.

And in particular it's based on something called neural networks.

But before we get there, let's talk about data representation.

We discussed things like TFIDF and

bag of word models, but how do you really represent data when it comes to images?

These are called features, and is a key part of machine learning.

So typically, when you talk about machine learning, we're given some input.

Let's say we're doing classification, we talked about sentimental analysis.

You're giving a sentence, it goes through a classifier model and

we decided that sentence has positive or negative sentiment.

In image classification, the goal is to go from an image,

this is my input, the pixels of the images, to a classification.

In this case, this is my dog and I want to classify it as

a Labrador retriever as opposed to other kinds of dogs.

So as we discussed, features are the representation of our data

that's use to feed into the classifier.

There are many representations, so for example text,

we talked about bag of words and TFIDF.

With images there's a lot of other representations.

And we'll discuss a few more of those in this module.

But today we're really gonna focus on neural networks,

which provide the non-linear representation for the data.

Now,let's go back to classification.

Let's do a little review.

We discuss linear classifiers which create this line or

linear decision boundary between say the positive class and and the negative class.

And the boundary is stated by the score, w0 +

w1 x the first feature, x1 x2 x the second feature and so on.

On one side, on the positive side, the score is greater than zero and

on the negative side the score is less than zero.

So, if I have this nice score function,

I can separate the positives from the negatives.

In neural networks we're going to represent such classifiers using graphs.

And so, here we have a node for each feature x1, x2, all the way

to the dth feature, xd, and a node for the output y, which we are trying to predict.

So, the first feature x1 is multiplied by the weight w1, so

I'm putting that weight on the edge.

X2 is multiplied the second weight, w2 I'm gonna put it on that edge,

all the way to xd, which is multiplied the weight wd, to put in the last edge.

And the last weight, w0 doesn't get multiplied by any feature, but it gets

multiplied by the number one, so we put a little one at the top multiplying w0.

And so, if you imagine multiplying the weights, w0 through wd

with the features x1 through xd and the coefficient one, you get the score.

And when the score is greater than zero, we declared the output to be one and

when it scores less than zero, we declare the output to be zero.

This is an example of a small, one layer, neural network.

So, we describe the small linear classifiers as a neural network,

as a one layer neural network.

What can this one layer neural network represent?

Let's take the function x1 or x2.

Can we represent that using a small neural network like this?

Well, let's define the function a little bit more formally.

So, what we have is variables x1,

x2, and the output y.

And there's some possibilities.

X1 can be zero, x2 can be zero, and since it's x1 or x2,

the output y would be zero in this case.

When x1 is one, and x2 is zero, the output is one.

When x is zero, when x2 is one, the output is one,

and similarly, when they're both one, the output is 1.

So, we want to define the score function such that the value is greater

than zero for the last three rows, but it is less than zero for the first row.

How do we do that?

So for example, there are many ways of doing it actually, but

if I put a weight of one.

When each one of the edges x1 and x2, and we think about the score, the score

of the first row is zero and the score of the other rows are greater than zero.

So, we wanna add a little bit of separation, so

we might put a negative value on the first edge.

Let's say -0.5 and so let's see what happens to the score.

When x1 is zero and x2 is zero, then the score becomes -0.5 and

yey I have something less than zero and my score is correct, my output is correct.

When x1 is one and x2 is zero, I get the score of 0.5.

Similarly, when x1 is zero and x2 is one.

And finally when they're both one, I get a score of 1.5.

So, with this simple weights on the edges, I represent the function x1 or x2.

Now, can we represent the function x1 and x2?

Well, similarly we can put weights one and one on the edges x1 and

x2 but in this case, we only want to turn it on when both x1 and x2 have value one.

So, instead of putting -0.5 on that top edge, we put -1.5.

And, if you fill out the table just like we did with the first example,

you'll notice that we now represented the function X1 and

X2 using a simple neural network.

So, a one layer neural network is basically the same as the standard

linear classifiers we've been learning about in this course so far.

So what can linear classifiers not represent,

we said it can represent x1 or x2.

It can represent x1 and x2 but what's a function,

a very simple function it cannot represent?

Well, here's an example.

There is no line that separates the plusses and minus in this example.

This function's called the XOR.

I like to call it the counter example to everything.

So, whenever you're trying to find the counter example,

the first thing to try is XOR.

Now, for this case, the linear features we described are not enough and

we need some kind of non-linear features, and this is when

you're networks come to play for reals and we'll see an example of that.

So let's review what XOR is.

So, XOR has value one when either X1 is true and

x2 is false so not x2 or not x1, so

x1 is false or zero and x2 has value 1.

So, how can we represent this with a neural network?

Well, let's call this first term z1 and the second term z2.

What we're gonna do is build a neural network to represent

not directly the inputs x1 and x2 to predict y, but

don't predicts intermediate values z1 and z2, and then those are going to predict y.

So let's take z1.

How do we represent a neural network, only a neural network that can predict z1.

We discussed that a little bit, but here goes.

It's a little bit different than the end case from before.

Since, we have to negate, we say not x2, we put a minus 1 on that edge and

we put a plus 1 on x1 and a minus 0.5 in there.

We now have our representation for z1.

Similarly for z2, we put a minus sign on the edge x1,

this edge here, x1 to z2, we put a plus one on the edge x2 to z2 and

then -0.5 on the constant edge, and now it represents z2.

And the last step, if Z1 is it to exist, all we have to do is or them.

And we already know how to or to bullien variables.

It's just one, 1- 0.5.

And this is an amazing time for us.

We've now built out first deep neural network,

not super deep, has two layers but is a little exciting.

We just built our first neural network.

It was a two layer neural network, but in general neural networks about,

there's layers and layers of transformations of your data.

And we use these transformations to create these non linear features, and

we'll see some examples of that in computer vision.

Now, neural networks have been around for

about 50 years, about as long as machine learning's been around.

However, they fell is disfavor around the 90's because

folks are having a hard time getting good accuracy in neural networks.

But everything changed about ten years ago,

because of two things that came about.

First, it was a lot more data and because neural networks have so many, many,

many more layers.

So, many layers you need a lot of data to be able to train all those layers.

They have a lot of parameters.

We'll see a really exciting neural network of 60 million parameters.

So, we need a lot of data to train them.

So, we recently have come about lots and lots and

lots of data from a variety of sources, especially the web.

Now, the second really great change that made deep neural networks possible

is advances in computing resources.

Because we have to deal with bigger neural networks and more data,

we need fast computers and GPUs which were originally designed for

accelerating graphics for computer games.

Turned out to be exactly the right tool to build and

use neural networks with lots of data.

So because of GPUs and

because of these deep neural networks, everything's changed.

And now we've had a lot of impact in the real world.

[MUSIC]

Coursera provides universal access to the world’s best education,
partnering with top universities and organizations to offer courses online.