Welcome to the next video in the Introduction to Data Exploration unit.

In this video, I will focus on Vector Spaces.

We will learn what vector spaces are,

we will learn what distance measures in a vector space are,

and we'll also discuss what we call similarity measures.

If you remember from the last videos,

we have seen that one way to represent data,

one way to represent multidimensional data,

is to map it into a vector space.

For example, in this slide we see that we have a set of images and what we do is

we take all of the images and now we somehow map them into what we call a vector space,

and within this vector space we can then go and

measure similarities or distances between these images.

And this can provide a way to

design data retrieval systems as well as data exploration systems.

So, this vector space becomes a representation

in which we can actually explore data, explore complex data.
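As a rough illustration of measuring distances and similarities once objects have been mapped into a vector space, here is a minimal sketch. It assumes each image has already been reduced to a short list of numbers; the three-value feature vectors and the function names are made up for illustration. Euclidean distance and cosine similarity are two common choices, not necessarily the ones this course settles on.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical feature vectors standing in for three images.
image_a = [3.0, 4.0, 0.0]
image_b = [6.0, 8.0, 0.0]
image_c = [0.0, 0.0, 5.0]

print(euclidean_distance(image_a, image_b))  # 5.0
print(cosine_similarity(image_a, image_b))   # 1.0 (same direction)
print(cosine_similarity(image_a, image_c))   # 0.0 (orthogonal)
```

Note that image_a and image_b are five units apart yet have a cosine similarity of one, so the two measures can rank the same pair of objects quite differently.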

An important question, of course, when we are

designing a vector space for exploring data,

is how to define the vector space itself.

And as we see in this slide,

to define a vector space,

we need to identify what we call basis vectors.

So, these are the vectors in the vector space that you use to represent

other objects, and you also define in

the same space what we call distance and similarity functions.

So, in this lecture,

we will first focus on the definition of basis vectors,

we will define what a basis vector is.

We will also discuss some of the good features of basis vectors.

So, how do we select these basis vectors?

How do we decide the vectors that we use to represent our complex data?

And in the upcoming units of the data exploration course,

we will also focus on the second question:

how many features do we need to represent our vector space?

How many basis vectors do we need?

How do we select them?

And so on.

In this lecture, we are primarily focusing on the definition of

the basis vectors and also some of the critical properties.

So, what's a vector space?

A vector space is actually nothing but a set of objects,

a set of objects.

A set of objects make up a vector space.

So, this set of objects could be anything,

could be images, could be audio,

could be video, could be social media data,

could be records in a database, could be anything.

However, for us to call this set of objects a vector space,

they need to satisfy certain conditions.

That is, the vector space essentially needs to have

a set of properties that enable us to operate on it.

One of these is what we call the addition operation.

That is if I have two objects that I represent as vectors,

I should be able to add them and the result should also be an object in the same set.

So, for example,

if I have two objects,

say two images, and I combine them,

the result should be another image object.

A second thing that is critical to define

a vector space is what we call the scaling operation.

That is, I should be able to take a vector,

multiply it with a real number,

positive or negative, and the result should also be a vector in the same space.

That is, I can take an image and I can scale the image.

For example, I can scale the number of

pixels in the image and the result is still an image.

So, scaling is another required operation in the vector space.
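A minimal sketch of these two operations, assuming each image is stored as a flat list of pixel intensities and that scaling means multiplying every intensity by a real number. Both are illustrative assumptions, and the helper names `add` and `scale` are hypothetical.

```python
def add(u, v):
    """Component-wise sum of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a + b for a, b in zip(u, v)]

def scale(c, v):
    """Multiply every component of v by the real number c."""
    return [c * a for a in v]

# Hypothetical 2x2 grayscale images, flattened to 4 pixel intensities.
image_a = [0.0, 0.5, 1.0, 0.25]
image_b = [1.0, 0.5, 0.0, 0.75]

combined = add(image_a, image_b)   # still a list of 4 numbers
dimmed = scale(0.5, image_a)       # still a list of 4 numbers
negated = scale(-1.0, image_a)     # negative scaling is allowed too

print(combined)  # [1.0, 1.0, 1.0, 1.0]
print(dimmed)    # [0.0, 0.25, 0.5, 0.125]
print(len(negated) == len(image_a))  # True
```

In this sketch the results stay in the same set only because any list of four real numbers counts as an object. If intensities were restricted to a displayable range such as 0 to 1, negative scaling would leave that range, so the set of objects has to be chosen with the operations in mind.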

The next requirement is that you should have a specific vector,

a specific object in the vector space, that we call the zero object.

This object essentially is

a special object where if we add the zero object to another object,

we get the same object itself.

So, the zero object essentially

has no effect under the addition operation.

In the same way, if we scale an object by the number one,

the real value one, we get the same object itself.

So, that is essentially it.

So, if we define any set of objects, and if we can define addition

and scaling on those objects, and we can identify at least one of the objects,

actually exactly one of the objects, as the zero object,

then we have a vector space, that's it.

This is essentially the definition.
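The two "no effect" properties, adding the zero object and scaling by one, can be checked on a concrete example. This is a sketch under the assumption that objects are flat lists of real numbers; the names `add`, `scale`, and `zero_vector` are hypothetical helpers, not part of any library.

```python
def add(u, v):
    """Component-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]

def scale(c, v):
    """Multiply every component of v by the real number c."""
    return [c * a for a in v]

def zero_vector(n):
    """The zero object for vectors with n components."""
    return [0.0] * n

# Hypothetical object: four pixel intensities.
image = [0.2, 0.4, 0.6, 0.8]
zero = zero_vector(len(image))

# Adding the zero object leaves the object unchanged.
print(add(image, zero) == image)   # True

# Scaling by the real value one leaves the object unchanged.
print(scale(1.0, image) == image)  # True
```

A check on one example does not prove the properties hold for the whole set; it only shows what they look like for a concrete object.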

So, the question is how do we use these definitions,

addition, scalar multiplication, and the zero object, to define a vector space?

Because this still doesn't say

how we use this vector space for representing

our objects, or how to define distance or similarity measures in the vector space.

So, we'll discuss that next.