>> In this module, we're going to review matrices.

We'll define matrices, and we're going to define operations such as transposes and

inverses. We're going to talk about linear

functions, and how they are related to matrices.

We're going to define concepts such as rank, which will play a role later on in

the course in defining complete markets, and hedging instruments.

To start with the very basics, what is a matrix?

A matrix is simply a rectangular array of real numbers, and we describe its size

by the number of rows and the number of columns.

So I'll walk you through some examples. So this matrix a, has 2 rows, row 1, row

2. It has 3 columns, 1, 2, and 3, and

therefore, we're going to call it a 2 by 3 matrix.

2 rows, 3 columns, 2 by 3 matrix. Its elements, I'm going to index by the

row index and the column index. So if I'm talking about an

element called a 1 2, the first index is going to be the row

index. The second index is going to be the column

index. So, it's going to be row 1 column 2.

So that's the number that I am talking about.

So, a 1 2 is equal to 3. Similarly, if I talked about another

element a 2 3, remember the first index refers to the row, so we're talking about

the second row. The second index is going to refer to the

column, the third column, so that's this element

down here, so a 2 3, is equal to 5. So every matrix is a rectangular array, we

say a matrix is an m by n matrix, if it has m rows and n columns, so that's down

here. So this is the general a that I'm talking

about. This particular matrix has m rows.

So if you look at the elements over here, the first column starts at a 1 1.

And remember, the row index always comes first.

So it's a 1 1, a 2 1, and so on, where the dot, dot, dot means and so on,

down to a m 1. It has n columns, so the column index comes second.

The first row is a 1 1, a 1 2, a 1 3, and so on, up to a 1 n, and the last element is a m n.
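Written out, the general m by n matrix described above has the following layout. This is a LaTeX reconstruction of what the slide presumably shows, not a copy of it.

```latex
A =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\in \mathbb{R}^{m \times n}
```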

Here's another example; b, it has 1 row and 3 columns, that's a row vector.

In the module on vectors, we noticed that there were two kinds of

vectors: column vectors and row vectors.

Row vectors have 1 row and any number of columns.

Column vectors have 1 column and several rows.

So here's an example of a column vector; w equals, 3, 4, 1.

3 rows, 1 column, so I can think of it either as a vector in r 3, or as a

matrix in r 3 times 1. The 3 says it's got 3 rows and the 1 says it's got 1

column. So the first thing that I want to do, is

introduce an operation called transpose. Transpose really takes rows to columns and

columns to rows. So, the easiest way to start thinking

about it is to think in terms of column vectors and row vectors.

I've got a column vector here, v. It's got 1 column and 3 rows, 1, 2, 3

rows. If I take its transpose, if I put the

operation transpose on this vector v, I'll go from column to row.

So v transpose is simply this row vector. Now, we want to do the same thing, to

matrices. I want to pretend that the matrix is

nothing but a collection of columns. I'll take every column and flip it and

make it into a row, and that's what transpose does.

I have a column 2, 1 over here, I'm going to flip it, and make it a row, 2 1.

Same thing that I did with the vector. I have 2 6 4, 2 6 4, 2 1, 2 1.

I take the second column, and I flip it, I get the second row.

I take the third column and I flip it, I get the third row.

So columns go to rows and vice versa, rows go to columns.

It's symmetric in that sense. More generally, here's a matrix a which is in

r m times d. To remind you once more, m is

the number of rows, d is the number of columns. So it has m rows and d columns.

If I put the transpose operator, that's what this is doing, I'm going to take this

column and transform it into a row, and that's what I'm going to do to every one of

them. So now, after the transposition is done,

you end up getting that the number of columns becomes the number of rows, and

the number of rows becomes the number of columns.

And so a matrix which is in r m times d, ends up being a matrix in r d times m.
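As a minimal sketch of the transpose, here is a plain Python version with a matrix stored as a list of rows. The helper name `transpose` is hypothetical, and the entries of `a` are assumed from the numbers read aloud in the lecture.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    # zip(*matrix) collects the first entry of every row, then the second,
    # and so on, so each original column becomes a row.
    return [list(column) for column in zip(*matrix)]


# Assumed 2 by 3 example: 2 rows, 3 columns.
a = [[2, 3, 7],
     [1, 6, 5]]

a_t = transpose(a)
print(a_t)                       # [[2, 1], [3, 6], [7, 5]]
print(len(a_t), len(a_t[0]))     # 3 2, so a 3 by 2 matrix
```

Transposing twice gives back the original matrix, which is one way to read the remark that the operation is symmetric.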

In the module on vectors, we had learned, that the inner product between 2 vectors,

involves a transpose and a multiplication, so we figured out the transpose part.

It's going to take a column vector into a row vector.

Now we're going to see how to multiply matrices.

So the next concept is that of matrix multiplication. So I have a matrix a which is in r m times d.

So the row index is m, and the column index, I'm going to put them in red

just to emphasize the fact, is d. The row index

for the matrix b that I'm going to multiply with it must be the same as the

column index of a. The inner dimensions should

be the same, otherwise I cannot multiply these matrices.

And the column index of b could be anything, say p in this particular case.

So when you multiply these matrices, you get a new matrix.

So the inner index that was there, this d index, disappears, and the new matrix c

that you end up getting is going to be in r m times p.
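In symbols, the dimension rule just described can be summarized as follows. The set notation is a standard convention assumed here, not something taken from the slide.

```latex
A \in \mathbb{R}^{m \times d}, \quad
B \in \mathbb{R}^{d \times p}
\quad \Longrightarrow \quad
C = AB \in \mathbb{R}^{m \times p}
```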

So the m in the row index, becomes the row index here, and p which is the column

index, becomes a column index there. So you end up getting a matrix which is in

r m times p. The inner dimensions have to be the same

in order for the multiplication to happen, and when the multiplication happens, that inner dimension

disappears. There's a general formula for how to get

the elements of c, but before going there, let me give you some examples, so that

this idea becomes clearer. So I've got 2 matrices down here.

I've got this matrix, which is in r, 2 times 3, 2 rows, 3 columns.

This is a vector, but I'm going to treat it as a matrix which is in r 3 times 1, 3

rows and 1 column. Now, if I apply the general rule that I

just talked about, I should end up getting a matrix which is in r 2 times 1,

because the inner dimension, which is the same for

the matrix a and the matrix b, will disappear, and the outer

dimensions are what are going to define the product.

So this is the matrix c that I'm going to get, that's in r, 2 times 1, as I

expected. So, how do I get the elements of this

matrix c? What I do, is I take the rows, and

multiply them to the corresponding columns.

So in order to get c 1, 1, so I have, this is the row index, this is the column

index. In order to get this element c 1, 1, I'm

going to take the first row of a, and multiply it to the first column of b.

I've got the first row, I've got the first column, and what does it mean to multiply

a row and a column? I multiply them component by component and add

them up: a sum product. So I take the 2 and multiply it by that 2.

I take the 3 and multiply it by 6, take the 7 and multiply it by 4.

And that's what I have written up here. 2 times 2, plus 3 times 6, plus 7 times 4.

To make it clear what I'm doing here, the elements in the brackets

correspond to the column, the elements outside the brackets correspond to the

row, and the first component ends up being 50.

So let's look at c 2 1. Second row, first column, same story.

So I've got the second row here, and that's going to multiply the same column,

and I'll get the terms 1 times 2, 6 times 6, and 5 times 4.

And if you add all of that together you get 58, and that's how matrix

multiplication works. So let's now go back and look at the more

general case of a matrix a, which is in r m times d.

I've got a matrix b which is in r d times p, and their inner dimension disappears.

I get a matrix c, which is in r m times p. If I'm looking at a particular element c i

j, this is the row index, that's the column index.

How do I get this element? I take the ith row of a and multiply it to

the jth column of b. So the ith row of a is a i 1, a i 2, up through a i d; the

row index remains the same all the way through.

Over here, I have b 1 j, b 2 j, up through b d j.

So these last indices actually should be d, not n.

If you multiply them together using the rule that we just used, I'm going to

multiply this element with that element, component by component, and then add it up.

We'll end up getting the sum over l going from 1 through d of

a i l times b l j, which is exactly what we have done down here.
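The row times column rule can be sketched in a few lines of plain Python. The function name `matmul` is hypothetical, and the matrices are the 2 by 3 and 3 by 1 examples from the lecture. Each entry of the product is the sum over l of a[i][l] times b[l][j].

```python
def matmul(a, b):
    """Multiply an m x d matrix by a d x p matrix, both stored as lists of rows."""
    m, d = len(a), len(a[0])
    if len(b) != d:
        # The inner dimensions must agree, otherwise the product is undefined.
        raise ValueError("inner dimensions do not match")
    p = len(b[0])
    # Entry (i, j) is row i of a times column j of b, summed over the inner index l.
    return [[sum(a[i][l] * b[l][j] for l in range(d)) for j in range(p)]
            for i in range(m)]


a = [[2, 3, 7],
     [1, 6, 5]]   # 2 x 3
b = [[2],
     [6],
     [4]]         # 3 x 1

c = matmul(a, b)
print(c)          # [[50], [58]], a 2 x 1 matrix
```

The inner dimension 3 disappears and the result keeps the outer dimensions, 2 rows and 1 column.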

So once we know how to multiply matrices, I can start simplifying a lot of things.

The L 2 norm: remember, in the module on vectors we talked about the L 2 norm?

We had said that the L 2 norm is the square root of the sum of the squared components.

We'd also shown that this is nothing but the square root of the dot product of the vector with itself.

Now I'm going to show you a different interpretation for it.

So I've got a vector 1, minus 2. Its L 2 norm is the square root of 1 squared plus minus 2

squared. I'm going to write what's under the square root as 1, minus 2,

that's a row vector, times 1, minus 2, which is a column vector, and why do I do that?

Because if I write out the expression for what this multiplication means, it's the

first component times the first component, plus the second component times the second

component. So that's 1 squared plus minus 2 squared,

the same thing as down here. Now this row vector, I can also write it

as this vector 1, minus 2, which is a column vector, transposed; the transpose takes

it to a row. Now we have the same vector, the vector 1,

minus 2, transposed, times itself, then the square root, and that's what is written down

here. The inner product between 2 vectors is

nothing but this: take the first vector, take its transpose, and multiply it with the

second vector. As a reminder, we had said in

the module on vectors that if I don't specify it, every vector is a column

vector. So, w is a column vector, v by itself is a

column vector, I take its transpose, I get a row vector, and a row vector times a

column vector always gives me a real number.

And that's why the inner product turns out to be the real number.
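A small sketch of the inner product and the L 2 norm for the vector 1, minus 2 from the lecture. The function names `inner` and `l2_norm` are hypothetical, and vectors are stored as plain lists.

```python
import math


def inner(v, w):
    """Inner product: v transposed times w, a single real number."""
    return sum(v_i * w_i for v_i, w_i in zip(v, w))


def l2_norm(v):
    """L2 norm: square root of the inner product of v with itself."""
    return math.sqrt(inner(v, v))


v = [1, -2]
print(inner(v, v))   # 5, that is 1 squared plus (-2) squared
print(l2_norm(v))    # the square root of 5
```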

Alright. We know what matrices are now:

rectangular arrays of numbers. We know how to take a transpose, we know

how to multiply them. Now we want to take the next step, and try

to figure out what matrices can do, and how matrices and vectors are connected to each

other. And their connection turns out to

come from linear functions. So that's our next component of this

module, linear functions. I'm going to call a function linear, if it

has the following property. I take a vector x, and I take a vector y,

I multiply this vector by a number alpha, and I multiply the vector y with the

number beta, beta and alpha are real numbers.

Back in the module on vectors, we had said that alpha x plus beta y is

another vector. All we do is multiply every

component of x by alpha, every component of y by beta, and add them component

by component. So, now alpha x plus beta y is a new

vector. I'm going to evaluate the function at this new

vector. The function is linear if it turns out that for any choice of

x, any choice of y, any choice of alpha and beta, the function evaluated at this

combination vector is nothing but the same combination of the function evaluated at x and at

y. So what's important in terms of

linearity is that in one case I'm taking the linear combination inside the bracket,

in the other case I'm taking the linear combination outside the bracket, and the 2

answers are the same. The function evaluated at the vector alpha x plus beta y is

nothing but the function evaluated at the vector x multiplied by alpha, plus the

function evaluated at the vector y multiplied by beta.

If this is true for all x, y, alpha, beta the function is linear.
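In symbols, the linearity property reads as follows. The domain and codomain notation is assumed here, matching the dimensions used elsewhere in the module.

```latex
f(\alpha x + \beta y) = \alpha f(x) + \beta f(y)
\qquad \text{for all } x, y \in \mathbb{R}^{d},\ \alpha, \beta \in \mathbb{R},
\qquad f : \mathbb{R}^{d} \to \mathbb{R}^{m}
```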

There is a simple theorem, we won't get into it, that a function is linear, if and

only if, I can write that function as the multiplication of the vector by a matrix.

So f of x is a linear function, if and only if, I can find some matrix a, such

that f of x is nothing but a times x. It's just a multiplication by a matrix a,

and that's why this is the next component of understanding what matrices

can do. So, if I want to have a linear function

from r 3 to r, I have to take vectors in r 3, and I have to get a number in r.

So what should a be? In this multiplication, if a is going to

be in r m times d, and x is going to be in r d, then a times x is going to be a

vector in r m. Now, I want to map r 3 to r, and

therefore the number of rows of a should be 1, and the number of columns should be exactly

equal to d, which is 3 here. In effect, a should be a row vector.

So here's a particular row vector 2 3 4, just as an example. If you look at the

multiplication of 2 3 4 with the vector x1,

x2, x3, you end up getting 2 x1, plus 3 x2, plus 4 x3.

It takes vectors and maps them to real numbers.

Take it one step further, here's another matrix a.

Now this matrix has 2 rows and 3 columns, it's going to multiply a vector with 3

rows. And you'll end up getting another vector

which has 2 components. Why 2?

Because the number of rows is 2. So again, each row multiplies the vector component by

component. 2 times x1, plus 3 times x2, plus 4 times

x3, that gives you the first component. 1 times x1, plus 0 times x2, plus 2 times x3, that

gives you the second component, and that's what linear functions are.

Linear functions take vectors, multiply them by a matrix, and give another vector.
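To see linearity at work, the sketch below uses the 2 by 3 matrix from the lecture as the function f(x) = A x and checks the property for one choice of x, y, alpha and beta. The test vectors and scalars are arbitrary values picked for illustration, and a single check like this illustrates the property rather than proving it.

```python
A = [[2, 3, 4],
     [1, 0, 2]]   # maps vectors in R^3 to vectors in R^2


def f(x):
    """The linear function x -> A x: each row of A times the vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def combine(alpha, x, beta, y):
    """Component by component linear combination alpha*x + beta*y."""
    return [alpha * x_i + beta * y_i for x_i, y_i in zip(x, y)]


# Arbitrary illustrative inputs (not from the lecture).
x, y = [1, 2, 3], [0, -1, 4]
alpha, beta = 2, -3

inside = f(combine(alpha, x, beta, y))          # combination inside the bracket
outside = combine(alpha, f(x), beta, f(y))      # combination outside the bracket
print(inside, outside)                          # [1, -10] [1, -10]
```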

So, in most of this course, we won't really be interested in just functions,

we'll be interested in constraints, we'll be interested in sets of vectors that are

defined by functions. These might be portfolios, these might be

values of options, these might be other kinds of things that are, random variables

and so on. So we're going to talk about 2 different

kinds of constraints. A linear equality means all those vectors x such

that a x equals b: a x is a linear function, and it's equal to

some given vector b. We'll also talk about linear inequalities,

which means all vectors x such that a x is less than or equal to b.

When I say less than or equal to here, I mean component by component.

So I'm going to say a vector 2, 3 is less than or equal to a vector 4, 5, because

component by component, 2 is less than 4, 3 is less than 5.

But the same vector 2, 3 is not less than or equal to the vector 4, 1.

Why? Because the first component, 2, is less than

4, but the second component, 3, is not less than 1. So this vector is not less than or equal to

4, 1, but this vector 2, 3 is less than or equal to 4, 5.
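The component by component comparison can be written as a short Python check. The function name `leq` is hypothetical, and the vectors are the ones from the lecture.

```python
def leq(u, v):
    """True if u <= v holds in every component."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return all(u_i <= v_i for u_i, v_i in zip(u, v))


print(leq([2, 3], [4, 5]))   # True: 2 <= 4 and 3 <= 5
print(leq([2, 3], [4, 1]))   # False: 3 <= 1 fails
```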

So if I say a vector a x, meaning the vector obtained by multiplying a with x, is

less than or equal to b, I mean that component by component that

vector should be less than or equal to b. Why did I only show you 1 inequality?

Because if you had an inequality which is going the other way, a x greater than or

equal to b, that's nothing but minus a x less than or equal to minus b.

So, without loss of generality, I can just look at 1 side of the inequality. It turns

out that it becomes easier to keep track of various things if I just look at 1

side of the inequalities. Alright.

I've got linear functions, I've got the notion of linear constraints.

Now the next thing that I want to know about matrices is what can linear

functions do? How complicated a set can a linear

function generate? And that's going to be important when we

start talking about spans of matrices, and how we can think in terms of what these

spans do. So the next concept that we're going to

learn is that of the rank of a matrix. There are 2 notions, the column rank of a

matrix and the row rank of a matrix. Let's look through an example, and we'll

come back and look at more general ideas. Another concept related to rank is

the range of a matrix, and we'll try to make all of this clearer by looking

at an example. So, down here is an example, I've got a

matrix a, which is a 2 by 3 matrix, 2 rows, 3 columns.

We know that this matrix induces a linear function, and what that linear function

does is it takes a vector x which is in r3, so x is in r3, meaning it has three

components. If you multiply this vector by the matrix

a, you end up getting a vector ax, which is in r2.

So, it maps 3 dimensional vectors into 2 dimensional vectors.

This concept is important when we talk about ranges.

But before we get there, let's start talking about the column rank.

For the column rank, I want to look at the columns of this matrix, and ask myself,

how many linearly independent columns are there?

How many columns can I pick so that they are still linearly independent?

So I know for a fact that because the columns are 2 dimensional, meaning that

each column is in r 2, I can get at most 2 linearly independent vectors.

Back in the module on vectors, we had talked about linear independence, and we

said that in r 2, 2 vectors can be linearly independent, and any third vector

will be linearly dependent on them. So at most, I can get 2 columns that are

linearly independent, but it turns out that for this particular matrix, only 1

column is linearly independent. Why?

Because if I take the first column, 1 2, I can get the second column simply by

multiplying the first column by 2. I can get the third column by simply

multiplying the first column by 3. So the second and the third column are not

linearly independent of the first column. So the column rank, which is the number of

linearly independent columns, is 1, because I can only get 1 column.
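The dependence between the columns can be checked directly. In this sketch the matrix entries are inferred from the lecture's description (first column 1, 2, with the other columns 2 and 3 times it), and the helper name `is_multiple` is hypothetical.

```python
A = [[1, 2, 3],
     [2, 4, 6]]

# Pull out the columns of A as lists: [1, 2], [2, 4], [3, 6].
columns = [list(col) for col in zip(*A)]
first = columns[0]


def is_multiple(col, base, factor):
    """True if col equals factor times base, component by component."""
    return col == [factor * b for b in base]


print(is_multiple(columns[1], first, 2))   # True: second column = 2 * first
print(is_multiple(columns[2], first, 3))   # True: third column = 3 * first
```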

Now let's do the same thing for the rows, and we end up getting a concept of a row

rank. So, here's my row 1, now it turns out,

that if I take that row and multiply it by 2, I get the second row.

So the row rank is also equal to 1. That's not a coincidence: there is a

theorem which says that the row rank and the column rank are always equal, for any

matrix. So it turns out for this matrix the row

rank is equal to 1, the column rank is equal to 1 and the rank itself is just

equal to 1. So, next let's look at this notion of

range. So, what we want to do, is we want to

understand what does this matrix a do to the vectors in r3.

So, now I want to think of this matrix not really as a matrix, but as a function

taking vectors in r3 and mapping them into vectors in r2.

What kind of vectors can I get? What is the largest set of vectors that I

can generate by this transformation? So, I'm going to multiply a by a vector with

components x1, x2, x3. The first component of the vector a x is going to be equal to x1 plus 2 times x2

plus 3 times x3. The second component is going to be equal

to 2 times x1, plus 4 times x2, plus 6 times x3.

That is what this vector a x is going to be.

So, x1 plus 2 times x2 plus 3 times x3, all of that times 1, gives me the first component, and the same x1

plus 2 times x2 plus 3 times x3, times 2, gives me the second component.

So every vector that I can generate by multiplying a by a vector x is of the

form some real number, let's just call it lambda, times the vector 1 2, and that's

exactly what is written down here. The range of a, the set of all vectors

that I can generate by multiplying a on the right by some vector x, is

all the vectors of the form lambda times 1 2.
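A quick numerical illustration of the range: for a handful of input vectors, A x always comes out as lambda times (1, 2), with lambda = x1 + 2 x2 + 3 x3. The function name `apply` and the sample inputs are assumptions made for this sketch.

```python
A = [[1, 2, 3],
     [2, 4, 6]]


def apply(matrix, x):
    """Compute the matrix-vector product matrix * x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in matrix]


# Arbitrary sample inputs in R^3 (illustrative only).
samples = [[1, 0, 0], [0, 1, 0], [1, 1, 1], [2, -1, 0]]

results = []
for x in samples:
    lam = x[0] + 2 * x[1] + 3 * x[2]
    y = apply(A, x)
    # Every output lies on the line through (1, 2): y == lam * (1, 2).
    results.append((y, [lam * 1, lam * 2]))
    print(x, "->", y, "lambda =", lam)
```

Whatever x is chosen, the second component of the output is twice the first, so no output ever leaves that line.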

What does that mean? Going back to the picture, r2 is this

plane. If I choose the second component equal to 2 and the first component equal

to 1, that's this vector 1 2. All the

vectors that are possible are represented by this r2 plane.

But this matrix a can only generate vectors on this straight line; nothing else

can be generated by multiplying it by a vector x, and this is exactly what it

means to have rank 1. Rank 1 means that although the vector a x is

sitting in r2, meaning that it has 2 components, really it's a 1 dimensional

thing; that's what this 1 dimensional line tells me.

On the other hand if this matrix a had rank 2, then that would tell me, that I

should be able to generate everything in r2.

Because it would mean that there are 2 independent vectors that I could

generate, and I know that in r2, using 2 independent vectors, I can generate

anything. In this course, we will never get

down to the details of how to compute the rank, and so on.
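For readers who are curious anyway, one standard way to compute a rank is Gaussian elimination: reduce the matrix row by row and count the pivots. The sketch below is not part of the course material. The function name `rank` is hypothetical, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) via Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivots = 0  # number of pivots found so far; this is the rank at the end
    for col in range(n_cols):
        # Look for a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(pivots, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # this column depends on the earlier ones
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        # Eliminate the entries below the pivot.
        for r in range(pivots + 1, n_rows):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [x - factor * p for x, p in zip(rows[r], rows[pivots])]
        pivots += 1
        if pivots == n_rows:
            break
    return pivots


A = [[1, 2, 3],
     [2, 4, 6]]
A_t = [list(col) for col in zip(*A)]

print(rank(A))     # 1
print(rank(A_t))   # 1, the rank of the transpose is the same
```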

We'll work mostly with the idea of what a rank is.

And the idea that I want you to keep in mind is that the rank tells you how rich the linear function is.

Higher rank means that you can get a lot more things out of this linear function,

lower rank means that you get less out of this linear function.

In the next module, we're going to start talking about hedging, and bring in an

optimization problem. And there we'll notice that the rank of

the matrix will tell me how many different payoffs I can hedge.

Before we finish this module on matrices, there's 1 last concept, and that's the

inverse. Suppose I've got a matrix a, which is a square

matrix, n by n, so the number of rows and the number of columns are the same.

If the rank of the matrix is n, meaning that all the columns are linearly

independent and all the rows are linearly independent, then this matrix is

invertible. What does that mean?

It means that there exists a matrix a inverse, such that a inverse times a is

the same as a times a inverse, which is the same as the identity.

So this i, remember, is the identity matrix.
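As a small illustration of the inverse, the sketch below uses the standard closed-form inverse of a 2 by 2 matrix and checks that multiplying on either side gives the identity. The example matrix and the function names `matmul` and `inverse_2x2` are assumptions for this sketch, not taken from the lecture.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(a):
    """Inverse of a 2 x 2 matrix via the determinant formula."""
    (p, q), (r, s) = a
    det = p * s - q * r
    if det == 0:
        # Zero determinant means the rank is below 2, so there is no inverse.
        raise ValueError("matrix is not invertible")
    det = Fraction(det)
    return [[s / det, -q / det],
            [-r / det, p / det]]


# Hypothetical example with rank 2.
A = [[2, 1],
     [1, 1]]
A_inv = inverse_2x2(A)
identity = [[1, 0],
            [0, 1]]

print(matmul(A_inv, A) == identity)   # True
print(matmul(A, A_inv) == identity)   # True
```

By contrast, a rank 1 matrix such as the one with rows 1, 2 and 2, 4 has determinant zero, so `inverse_2x2` raises an error for it.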