Hi. All the time spent with me we've been talking about various things like,

ways to represent graph memory, counting shuffles,

and features of GraphFrames library.

So I think it's time to relax and speak about music.

In this video, I'm going to present to you with a set of algorithms used for

music recommendations and personalization in a general purpose social network, "ok.ru."

The second largest social network in the commonwealth of independent states,

visited by more than 40 million users today.

The success of nearly all internet product nowadays,

highly depends on how much value they can

provide for the user requiring endless supports as possible.

Recommender systems are one of the approaches used to increase value for fort rate.

By analyzing user's activity in the past,

a good recommender selects the items which are most relevant to the user's current needs,

increasing user's satisfaction, loyalty,

and the revenue of the product.

For example, in this slide,

right now you can see my radio tab where I can choose from four radio stations;

pop rock, Indie rock, Russian pop and metal.

Specially created for me,

basing on the songs I've listened to before.

Guess which station is my favorite?

Continue. The core component of the recommender system is a taste graph,

containing information about different entities, users,

tracks, singers, and music genres,

and relations between them.

For example, user Natasha likes song,

Take Me Out the certainty one-quarter.

Track, Love me Tender created by Elvis Presley.

Group Placebo is similar to group Green Day with certainty one-eighth, and so on.

Using the graph, it's possible to select tracks a user would most probably like,

to arrange them in a way that they match each other well,

to estimate which items from a fixed list are most relevant for the user and more.

Thus, soon, you will know one more useful application of the graphs.

Taste graph is used in order to combine all the information mined from the system,

including the history of user's activity and content information.

Then, this graph is analysed by different algorithms

in order to construct recommendations and personalized output.

The taste graph is an oriented graph with waves associated to it's edges,

capturing all the information mined from the system.

Formally, taste graph G can be defined

as a foreign tuple.

Where, V is a field known to a set of vertices.

Tv is a field known entry set or vertex types.

And, Tv is a mapping function measuring each lyrics to it's type.

For example, the type for vertex Natasha is a user,

vertex Green Day is a singer,

and for vertex rock,

the type is genre.

Having finished with this,

we are moving to data.

Data belonging to V is a zero balancing vertex used to

compensate impact of the vertices with a small amount of outgoing edges.

We will in detail talk about it later.

Ready for a new step?

Here we go.

E is a field non-empty set of edges.

T sub E is a field non-empty set of edge types.

And T sub E is a median function measuring each edge to its type.

For example, from vertex Natasha to vertex Love me Tender,

there is only one step, really.

For edge from vertex Natasha to vertex Love me Tender the type is LIKE.

For the edge for vertex Every You and Every Me to vertex Placebo,

the type is CREATED BY.

For the edge from vertex Green Day to vertex rock, type is BELONGS TO.

Shall we go further and think about R?

R is a function matching each edge to its start vertex and end vertex.

Edges leading to data are zero balancing edges.

One of the most important parts of taste graph is omega.

Omega is an great function matching each edge to it's weight.

A set of outgoing edges of a type T from a vertex V is defined as out of V in two.

Taste graph G must satisfy the following conditin.

For all belongs to a set of vertices and all belongs to a set of edge types.

Sum of mega of E as [inaudible] of V and T is equal to one.

Assuming that a graph has all edges of the same type,

T, you can define it as stochastic. Why is it important?

Because, the graph being stochastic has necessary condition for every vertex

to have a fixed probablity to get to it after quite a big amount of steps.

And this probability doesn't depend on the vertex from which you start.

These probabilities of getting finally to every vertex

of the graph are called stationary distribution.

We already talked about it before in the lesson devoted to paid rank.

Taste graph G is called partly stochastic.

In order to get a full stochastic graph,

balancing function beta is used.

The balancing function must satisfy the following condition.

Giving the balancing and function beta,

it's possible to define a balanced wave function omega sub beta

S. It's clear that under the weight function,

omega sub beta graph G is a stochastic graph.

First of all, the role of introducing non-standard construct should be explained.

Split of the vertices and edges to different types and partial stochastic,

allows different parts of the taste graph to be

constructed independently and then combined.

For example, belonging of tracks to different genres edges of type,

track A is a rock and roll one is constructed of later data received from partners.

While edges of track type B as weight peaks are

constructed by an aggregate function from the overall track ratings.

Furthermore, different parts of the graph can be updated at different frequency

depending on the complexity of the update and the natural dynamics of the part.

Balancing function beta is used to balance

the impact of different factors on the overall result.

For example, by increasing the weight of user-track links,

you increase the impact of collaborative recommendations

while decreasing the impact of content based recommendations.

Zero balancing vertex data plays an interesting role.

In the taste graph the number of outgoing edges might vary from vertex to vertex.

For example, Track 21 guns might have

five similar tracks identified while tracks Clouds in Camarillo only one.

Smaller number of outgoing edges typically indicates

the lack of statistics and lower certainty of the list.

But, obsolete weights on the edges from Clouds in

Camarillo would be greater than on the edges from 21 Guns.

Increasing the impact of Clouds in Camarillo on the recommendations.

In order to compensate this impact,

the zero balancing vertex is used.

This vertex does not match any of the object and has only a self outgoing edge.

But each time when the amount of outgoing edges of type T vertex V is

below limit and edge for V to theta is edited to drain weight from the existing edges.

To sum up, after watching this video you can answer the question,

which parts does the taste graph consist of.