0:01

Welcome to the reference material section on Transforming Data.

So, many times when we're trying to built up a constraint

which represents part of the problem.

We'll have the information that we need to built the constraint packaged in

a different way than, than is natural for building constraints.

So, we may need to create a new data representation in order to

build constraints.

So, let's look at an example problem.

We've got n agents and we need to pair them up and

each agent has a list of preferred agents.

And then that can only be paired with one of their preferred agents.

And we need the pairing to be stable.

And what this means is,

if agent i is paired with another agent j, but prefers agent k,

then agent k must be paired with an agent l that they prefer to i.

So if that's not the case, then basically it's advantageous for i and k to join

each other and leave their current roommates because they'll both be happy.

So in the stable roommates problem, we do appear in such that there is no

pair that would be happier if they lived together to be in a new room.

So, here's an example of data.

So, we've got six people and here's their rankings.

So rank, person one prefers person two, and four, and five, in that order.

Person two prefers six, one, three metal.

Three prefers two and four, etc, etc.

So, I welcome you to try and generate a stable pairing.

2:17

Now, we should look at this some unstable pair.

So if we look at the pair one and four,

then one is currently its third preference and this would be a second preference.

And four is currently there with its second preference and

then this could its first preference.

So we can pair those up, right, and if I break these two pairs here,

pair those up and indeed once we do that,

we can also extend the pairing between six and five.

Right, so this is a better matching because one and four,

we've got one less unstable pair.

So, we can look at this further and we'll also find another unstable pair.

So we look at two and six, and both each other's first preference and

two currently has its third preference and six has its second preference.

So that is another unstable pair, so we can change to that matching.

And we end up with three and five with no one left to pair with, but in fact,

there's no better matching.

So, this is a stable matching for this example.

3:52

And given that input, we can calculate for each agent the number of preferences

they have by just summing up the number of times that this agent of a,

i is greater than zero.

And then we should check that our data that's in the format that we assume it is.

And that's saying, okay, for every i, j in the number of preferences of a,

we make sure that they're not the same version.

So we kind of have two preferences for this same a other agent, and

we also want that for every of the first elements in the list.

This agent from 1 to npref[a],

the agent that we're talking about here is not zero, it's a real agent.

If that's not the case, then the preference data for that agent is wrong.

So, here's our example data written this way.

You can see the preference array.

We have 2, this is the first preference for number 1, and 4 and then 5.

And then we're panning out with zeros until we've run out

of the preferences.

And for 2, 6 was their first preference, then 1, then 3, and

then no more preferences.

We can calculate the number of preference for each thing and obviously we get 3,

3, 2, 3, 2, 3.

And we also check that all the zeros appear at the end because

basically we're going to check that from one to three in this array,

they're all greater than three.

So, what do we actually have to do for decisions?

Basically, we're going to have to pair each agents with another.

Now, some agents may not be paired, so we're going to basically assign the pair

of the agent as another agent or this extra agent zero.

5:32

Now obviously, the pairing has to be possible.

That is, we can only pair an agent with someone who appears in their preferences.

So, we can work out the possible set of agents that an agent can be paired with by

basically looking on the preference list from 1 to npref.

So, these are the actual real preferences of the dummy zero preferences and

boolean at set.

And then we can say, well, the pair, whoever we pair a with has to be

in this possible set or it could be zero if a ends up not being paired.

The other critical constraint about the pairing is of course,

that pairing must agree.

That if we pair a with b, then we must pair b with a.

Now of course, if we pair a with zero, then this constraint is false, right?

Because pair of zero will be a, this is an out of bounds error and

the other half of the relation will make that false as well.

So, if you think about what happens when one of these pairings are zero.

7:29

We'll look at three different approaches to the we can add it to the data farm.

So just add it as a new data, that's one way of doing it.

We can use constraints to build,

we can write constraints to force the rank [a, b].

They used to take the rank arrays or we can use more complex comprehensions.

So, adding it to the data file is fairly straightforward.

So here, is our two-dimensional array of agents and it's basically saying,

okay, agent one ranks number two as the first preference.

Number four as the second preference and number five, it's the third preference.

Agent two ranks number one as the second preference,

number three is the third preference and number six is the first preference.

So, we can see a different view of the data.

Now, this is advantageous and the reason is we can use any program we want

to create this other view of the data.

But when we do this we have to be very careful.

We have to ensure we don't have to add to our model sign.

It's going to check that the preference information and

the rank information actually agree, right?

So we know that if we have two agents and

i is in the preference set with a, and we know that the i's

preference of a is b means it must be that the rank a gets to b is i.

If that's not the case, then something has gone wrong and

your ranking doesn't agree for those pair of agents.

We also have to do something else, we have to see if a ranks b is zero, so

that means it has no ranking for it, then it shouldn't be the case.

But there's something in its number of preferences,

some i, the i's preference of a is b.

So, we have to check these things to make sure that the preference and

the rank array agree on the data.

And if we're using two views of the data that are both coming from the outside.

It's essential that we check the date, give us the same year.

So, another way of doing the rank array is to use constraints.

So now we have to make this a variable array,

because we're going to use constraints to evaluate it.

And then we can basically wipe down the constraints which define it.

So we know for two agents, if the preference and i and

the number of actual preferences are made,

then if i's preference of a is b and the rank of ab should be equal to i.

So basically that's exactly the definition,

we have to also ensure that the other ones, so this one is applied.

So if a and b, if there's no preference between a and b,

then this one forced at the rank of a, b, zero.

So, we have to do that as well.

So we're looking two agents and for none of the preferences of a there's

an i which gives us agent b then we're going to set the rank of a b equals zero.

So this can be very easy, because basically, often the relationship

between these two views are very simple to write down with constraints.

But there's strong disadvantages that, when we use this model,

the rank is not fixed in translation time.

So remember, when we gave them the data these were not variables,

these were just integers from zero to n-1 those ranks.

So, all the ranks were known at translation time.

So in this model, the ranks are not known.

So wherever we use them, we're going to be using variables.

And it gives it more resolve to do.

So another thing we can do, is we can just build the rankings and comprehension.

And now we have to be a bit clever about how we do that.

So, we know that this is a two-dimensional array for pairs of agents.

Now, how can we work out to put the right value i in this a,

b position for a pair of agents?

So this is the expression that's fixing a in i, and

we're going to sum up for i in the number of preferences of a.

We find that preference at the i-th position of a is is b,

then this will evaluate to true, and so we'll add up i.

12:17

So now we've got our rank array, we can build our stability definition.

So remember, if agent i is paired with agent j but prefers agent k,

then agent k must be paired with an agent l, they prefer i.

So basically if i ranks k better,

right, than the person they're paired

to then either that k ranks i not at all.

So, k will not allow itself to be paired with i.

Or k ranks i worse than the person they currently paired with,

So that's what that constraint effectively says.

Okay, so you have to be a bit careful.

What if, so we have to make sure that if i ranks k then k doesn't rank i,

like in this case, that's allowed.

All right, so

we have to add that extra position, otherwise the stability might fall.

13:21

So, another thing we can do here, to simplify this thing is really

if i ranks k, and k doesn't rank i, then they can never be paired.

So, we could actually build a better version of the ranking way by getting rid

of rankings which would never be allowed.

So here, what we have done is whenever we are about

to add an i because the I expectance of a to b.

So, we are about to say that the rank of a of b is i, we add this check.

That there is some ranking that b is to a.

Some position j, where b is a preference of a.

If that's not the case, then this whole bool2int will evaluate to false, and

we'll get zero.

So basically, it replaces rankings which don't have a paired ranking by zero.

14:36

So in overview, real combinatorial problems require data.

And MiniZinc only has limited structures for representing data, basically,

those which are common.

So, you need to learn how to use MiniZinc to encode data of your problem and

to create new dependent data.

Because often you would want the form in a different data than you were given to

build a constraint the right way.

So, this could lead to some challenging declarative programming,

we cannot do all in.

But in the worst case, you can generate the model with the data from outsite.

That's something else to re-write, the data to a different form but to do that,

you need to add the assertions to the model to check that

the two views of the data you're given actually agree.

That's a pretty standard set.