Discrete optimization, the next-to-last lecture on constraint programming, okay? So, one of the things that I told you in the beginning is that there are two aspects which are really critical in a constraint programming system.

One of them is constraint propagation, and we spent a lot of time looking at that in the past lectures. The other one is search, and that's what we're going to talk about in the next two lectures. It's also a very active area of research. What I'm going to talk about today is basically an introduction to some of the concepts — a basic introduction, once again, to make you curious, right? So, the key idea in constraint programming is that search is based on feasibility information, okay?

So what the constraint programming system is doing very well is basically pruning the search space, based on constraints and feasibility — making sure that all these constraints can still be satisfied. And the search, of course, is going to exploit that information, because that's the big information that we have in a constraint programming system.

Okay? So what does it mean to exploit this feasibility information? Well, we have a very high-level principle that we want to apply, which is called the first-fail principle. It sounds very counterintuitive, but actually it makes a lot of sense. The principle tells you: try first where you are the most likely to fail, okay?

The way to think about this is that you have a bunch of tasks that you have to accomplish, and you get a reward only if you accomplish them all. And one of them is really hard. Don't postpone it — focus on it. You can do the easy stuff, but you are wasting your time if you can't do the hard stuff. That's what this principle is telling you. Try first where you are the most likely to fail, because you want to fail early, okay? Essentially, it means that you don't want to spend time doing the easy stuff, then go to the tough stuff, see that you can't do the tough stuff, and go back and redo the easy stuff in another way, in the hope that the hard stuff is going to be a little bit easier. That's not what we want to do, okay? We want to start with the hard stuff as quickly as possible. And of course, by starting with the hard stuff, what we hope is that we'll have a very small search tree, and the easy stuff we can take care of at the end. So, that's the first-fail principle: you want to start first where you are the most likely to fail.

Okay? Now, what does it mean? Well, look at this beautiful queens example that we have, okay? What does it mean to look at something which is tough, where you are most likely to fail? Now, all these queens have different sets of values, okay? And what you want to do is start with one which has as few values as possible. This one has only one value: you know that you have to place the queen there, okay? Now, if some queens have four values and some have two, you want to start with the ones which have only two values. They're harder to place, and that's where you want to start from. You're also constructing a smaller search space, because you only have a branching factor of two instead of four, okay? Now, sometimes all the variables have the same set of values at the beginning. Okay?

So this is what we see in this particular graph coloring example. At the beginning, I can use all the colors for all these countries, right? So which one do I start with first? Now, I could say, oh, I'm going to start with Denmark. Okay? But Denmark, in this particular case, is easy, right? It's only connected to Germany, so there is only one color that it won't be able to take. Therefore, I don't want to start there: I'm going to give it a color, and that's going to remove one color from Germany, but that doesn't mean anything, right? What I have to start from is someplace which is connected to many, many different countries, because that's where it's going to be the most difficult to color. In this particular case, guess which country it is, right? So, we'll start with Belgium, okay?
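The choice of Belgium can be sketched with a tiny degree heuristic. A minimal sketch, assuming a deliberately incomplete, illustrative subset of the borders (not the full map from the slide): the country with the most neighbors is the hardest to color, so that's where first-fail starts.

```python
# Hypothetical, incomplete adjacency for illustration only.
neighbors = {
    "Belgium": {"France", "Netherlands", "Germany", "Luxembourg"},
    "Denmark": {"Germany"},
    "France": {"Belgium"},
    "Germany": {"Belgium", "Denmark", "Netherlands"},
    "Luxembourg": {"Belgium"},
    "Netherlands": {"Belgium", "Germany"},
}

def most_constrained(graph):
    # Highest degree = most constrained = most likely place to fail first.
    return max(graph, key=lambda country: len(graph[country]))
```

On this toy subgraph, `most_constrained(neighbors)` picks Belgium, while Denmark, with its single neighbor, would be the worst possible starting point.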

For instance, I want to illustrate the first-fail principle on a very interesting example. It's a puzzle called the Euler knight problem, okay? It's basically taking a knight in chess and making sure that you visit all the squares of the chessboard, and visit them exactly once, following the rules of chess. Okay? Now, it's pretty tough to do that manually. I know only one person who actually did it. She was in a computer science class — computer architecture — and she was completely bored, so she was playing with this during class, and at some point she said, yeah! She had found it, okay? So if you can't sleep at night, try this. It's not easy.

Okay? Now, it's an interesting problem because it's connected to a lot of problems in routing; it's kind of an abstraction for these problems, okay? Essentially, what I want to show you is the first-fail principle on this particular example, because it's very intriguing. So this is the model — you see a very simple model for this. The decision variables are the jump variables: for every position on the chessboard, you have to know where the knight is going to jump next. So you basically have 64 variables, one for every position on the chessboard, and each one tells you where you jump next. Once you have those, you can follow the path of the knight. And the only constraint that you have is that all these variables have to make up a circuit, okay? So, if you start from one place and follow this path, you're going to come back to exactly the same place, and you will visit every position on the chessboard exactly once. Okay? Now, this is a tough constraint to implement. You can just think about it — we'll come back to that later on, okay? But this is essentially how the model is stated here.
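The model just described can be sketched as follows — a hypothetical encoding, not the lecture's actual syntax. `knight_jumps` generates the legal knight moves from a square (the "boring set of positions" mentioned next), and `check_circuit` verifies that an assignment of the jump variables forms a single circuit visiting every square exactly once, which is the role the circuit constraint plays in the model.

```python
N = 8  # board size; squares are numbered 0..63

def knight_jumps(p):
    """Legal knight moves from square p, following the rules of chess."""
    r, c = divmod(p, N)
    deltas = [(1, 2), (2, 1), (2, -1), (1, -2),
              (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
    return [(r + dr) * N + (c + dc)
            for dr, dc in deltas
            if 0 <= r + dr < N and 0 <= c + dc < N]

def check_circuit(jump):
    """True iff jump[p] = successor of p forms one circuit over all squares."""
    seen, p = set(), 0
    while p not in seen:
        seen.add(p)
        p = jump[p]
    # back at the start, having visited every square exactly once
    return p == 0 and len(seen) == N * N
```

Note that a corner square such as 0 has only two possible jumps, while a central square has eight — exactly the asymmetry the first-fail principle is about to exploit.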

Now, one more thing that we have to do is specify, for every position, where the knight can jump, following the rules of chess — and that is this boring set of positions that you see over there. Don't spend your time understanding it; you just have to know that it can be done, okay? Now, what I want to show you is essentially how the first-fail principle works on this particular example. So let's look at the chessboard.

This is the chessboard at the beginning, okay? So, where do we start? Are we starting in the middle? Are we starting here? Are we starting over there? Well, the first-fail principle is telling you: start first where you are the most likely to fail. Where are the positions from which it's difficult to jump, okay? Well, these are the corners, right? When you look at this corner square here, there are only two positions where it can jump. And as soon as it jumps to one of them, you also know where the previous position was. So essentially, there are only two positions, and as soon as you choose one, you also fix the value of another variable. Okay? That's where you're going to start. These corners are tough, because there are only two places where they can jump. If you're in the middle, there are these eight places where you can jump — many more positions. Okay? So, we know where to start.

Okay? So, the key question here is: where do we go next? Every time I've taught a class on combinatorial optimization and asked this question, people tell me, well, you probably have to take this square next, because it has very few positions. Okay? But this is not what the system is going to do. Once again, the system is going to look at all the positions and find the one which is the toughest to assign. And where is that? Well, that's in the other corner, okay? Because, once again, there are only two positions there, whereas for this square there are at least three or four positions — four for that particular one, okay? So, what is the system going to do? It builds small paths all over the place, and the circuit constraint is essentially there to make sure that you can connect them into a big circuit, at every stage and after every assignment. So this is a partial state of the system: you see we've built a couple of interesting paths all over the place. And this is a solution at the end. Beautiful, right?

And so this is what the first-fail principle is actually doing on the Euler knight problem: finding the variables which are the toughest to assign. They have a small domain — that means they are tough to assign, okay? So let's put this principle in perspective now, and let's look at how you program this search and how you can express it, okay? I'm going to start with the eight-queens problem again, so that I can illustrate a little bit how you write this search procedure. What you see here at the bottom is a search procedure for the queens problem, okay?

And there are various things that you can see there. The first instruction is a forall instruction, and basically it's telling the system: iterate over every one of the queens. I want to assign a value to every queen, so I have to iterate over all the queens. The second instruction is actually pretty interesting: it's the tryall instruction, and essentially that instruction is a non-deterministic instruction — it's not a loop. What it's trying to do is assign a value to that particular variable: it chooses a value in a non-deterministic fashion, in this particular case. Okay? It's going to try to do that — find one particular value. But it's not going to try them all at this point, okay? It's just going to pick one value, non-deterministically, and then execute whatever the body is below. Okay?

But if, later on, the execution fails, it's going to come back to this instruction and try another one. So the forall loop iterates over all the variables; a tryall knows that it has many possibilities to actually do something, and it's going to pick one. And if later on you end up in failure mode, you're going to come back and try another one, okay? That's a non-deterministic instruction. The last instruction that you see in this search procedure is this: as soon as you have selected a value for the variable, you basically add a constraint to the constraint store, stating that the variable is assigned to that value, okay? So those are the three components that we'll see in a lot of the search procedures in this lecture: you iterate over the variables, you non-deterministically choose a value for them, and you add a constraint to the constraint store, okay? Remember the computational paradigm: what you have is basically a constraint store that's pruning the search space.

And then a search, okay? What the search is doing is always adding constraints to the constraint store. The search adds a constraint to the constraint store, and the constraint store basically sends an answer back to the search saying: okay, good — success. Or sometimes the constraints say: failure — you can't satisfy your constraints. When something like that happens, you go back to the non-deterministic instruction and try another value for the variable, okay? So in a sense, every time a constraint fails, you go back to a non-deterministic instruction and try a value which has not been tried yet. If there is no possibility left for that non-deterministic instruction, you go back to another one — the previous one — and try another value for that non-deterministic instruction, okay? I'm going to go into the details of this a little bit more, so that you see how the syntax corresponds to the semantics I'm talking about, okay?

So when you look at the forall instruction, essentially, you can unfold this loop. What you're really doing is a bunch of tryall instructions: for the first variable, for the second variable, and so on. Okay? Now, if at any point you fail for one of these, you're going to go back to the previous one and try another value. If there are no values left, you go back to the one before that, try another value, and so on. That's what the system is doing: basically, non-deterministically selecting some of these values. If at some point you fail, you go back to the previous one and try another value for that one. If there are none left, you go back one more level up, okay? So essentially, that's what the system is doing there. For every queen, there will be this non-deterministic instruction, and you are trying to see which values you can assign to that queen. Okay?
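The forall/tryall pattern just described can be sketched in Python — a hypothetical rendering, not the actual syntax of a constraint programming system. The recursion plays the role of the forall, the loop over values plays the role of the tryall (each iteration is one non-deterministic choice), and returning `False` models a failure that sends control back to the previous choice point.

```python
def solve_queens(n=8):
    """Backtracking search for n-queens; row[c] = row of the queen in column c."""
    row = []

    def consistent(r):
        # the posted constraints: no two queens share a row or a diagonal
        c = len(row)
        return all(row[i] != r and abs(row[i] - r) != c - i
                   for i in range(c))

    def search():                      # plays the role of the forall
        if len(row) == n:
            return True
        for r in range(n):             # plays the role of the tryall
            if consistent(r):
                row.append(r)          # "add the constraint to the store"
                if search():
                    return True
                row.pop()              # failure below: undo and try next value
        return False                   # no value left: fail back to caller

    return row if search() else None
```

Here the "constraint store" is just the `row` list plus the `consistent` check; a real system would propagate much more aggressively after each choice.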

Now, let's look in a little more detail at how the tryall instruction works. It's a non-deterministic instruction which can potentially explore all the values, but it only does so when it is reactivated because of a failure, right? So in a sense, this tryall instruction is a big try instruction which is basically telling you: try first the value one, then the value two, then the value three, and so on and so forth, okay? In this particular case, if you fail for the value 1 — if, for instance, when you put the value 1 inside the constraint system, the constraint system tells you, oh, I have a failure — you will move to the next alternative in the try, okay? And try the second value, and so on and so forth, until you arrive at the bottom. If none of these values are successful, you will go back to a previous tryall instruction and select the next alternative in the list, where you had left off at the previous execution of that tryall. Okay? So that's essentially the way these try and tryall instructions interact, and the way non-deterministic computations are basically computed. Okay?

So, what I want to talk about now are the various ways you can use these primitives to do various kinds of search. I'm going to talk about variable/value labeling. I'm going to talk about domain splitting. I'm going to show how you can focus some of the search on the objective function. And I'm going to talk about symmetry breaking during search, and about randomization and restarts. Okay? In this lecture, we'll do only the first one, which is basically variable and value labeling; we'll also look at the value of the objective function at the very end of the lecture.

Okay? So variable/value labeling is probably the simplest search procedure that you can design in a constraint programming system. It basically has two steps: first you choose a variable that you want to assign, and then you choose a value. The first step is iterating over the variables; the second step is a non-deterministic step. Now, once again, you want to apply the first-fail principle, which means that you want to order these variables in a very specific way. You want to start with the ones which are the toughest to give a value to. And typically, the size of the domain of the variable is going to be a good indicator of what is tough and what is not. Okay? The variable which has the smallest domain means that a lot of values have already been removed from that variable. It means that it is probably interacting with a lot of other constraints; it's probably a variable which is very difficult to instantiate — to give a value to. Okay? So the ordering that we want, in general, is going to be dynamic. You just don't want to fix the ordering of the variables up front. You want to choose a variable which has the smallest domain, then propagate all the constraints — that reduces the search space again — and then look for the next variable with the smallest domain. Okay?

And that variable may not have been the second most attractive variable the first time around, because now more values have been removed. Okay? So what you want to do is iterate this: choosing the variable with the smallest domain, propagating; choosing the variable with the smallest domain, propagating, okay? And this is essentially what you can do in a constraint programming system.

The forall instruction, though, can capture an order. In this particular case, what I do is look at the variable row[r] and take the size of its domain, and that basically tells me, at this particular point in time, what the domain size of that variable is. Okay? So, in a sense, you have to view this iteration as: look at all the variables which have not been given a value so far, take the one which has the smallest domain, and execute the body. The body will non-deterministically assign a value to that variable. Then you go back to the loop, look at all the variables that you have not selected at this point, and take the one which has the smallest domain now. Okay? Why am I saying now? Because the constraint that we added has started constraint propagation in the constraint engine — in the constraint propagation part of the system — and that has reduced the size of the domains, okay? So, compared to what we have seen before, what we are doing now is choosing the order of the variables that we're going to give values to in a dynamic fashion, taking the one which has the smallest domain every time, okay?
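The dynamic ordering can be sketched as follows — a hypothetical first-fail labeling for n-queens that re-selects, at every step, the unassigned variable with the smallest current domain, and propagates by removing conflicting values from the other domains (a simple stand-in for the much stronger propagation a real constraint engine performs).

```python
def first_fail_queens(n=8):
    """n-queens with dynamic smallest-domain variable ordering."""
    domains = {c: set(range(n)) for c in range(n)}   # column -> candidate rows
    assignment = {}

    def propagate(doms, c, r):
        # remove values inconsistent with queen (c, r); fail on an empty domain
        new = {}
        for c2, dom in doms.items():
            pruned = {r2 for r2 in dom
                      if r2 != r and abs(r2 - r) != abs(c2 - c)}
            if not pruned:
                return None                          # a domain wiped out: failure
            new[c2] = pruned
        return new

    def label(doms):
        if not doms:
            return True
        c = min(doms, key=lambda v: len(doms[v]))    # first-fail: smallest domain
        for r in doms[c]:
            rest = {v: d for v, d in doms.items() if v != c}
            pruned = propagate(rest, c, r)
            if pruned is not None:
                assignment[c] = r
                if label(pruned):
                    return True
                del assignment[c]                    # backtrack
        return False

    return assignment if label(domains) else None
```

The key line is the `min(..., key=len of domain)` inside the loop: the "best" variable is recomputed after every propagation step, so the order adapts as domains shrink.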

So essentially, what we are seeing is two steps. The first step is choosing the variable — typically the variable which has the smallest domain. That's the first-fail principle. Now, at any point in time, you may have several variables which have the smallest domain. Which one do you choose? One good principle that you can try to apply is: choose the one which is the most constrained. Remember the graph coloring at the beginning: all the variables — all the countries — had exactly the same set of possible colors. Which one did we choose? We chose a country which was connected to many other ones. It was very constrained. That's what we can do. Okay?

Now, in the particular case of the queens problem, for instance, we can choose the queen which has the smallest domain, and if there are ties, we can choose the one which is as close to the middle of the board as possible. Why? Because when you place a queen in the middle, it has a tendency to remove more values than when you place a queen on the sides of the chessboard. Okay, so how do you do that? Well, essentially this is easy: you take the forall construction that you've seen before, and now you use a lexicographic criterion. You take the variable which has the smallest domain and, in case of ties, the one which is closer to the middle of the chessboard, okay? So, essentially, the same thing as before, except that now, instead of choosing the variable with one criterion, I'm basically using two criteria: first the smallest domain, and then the variable which is the most constrained.
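That lexicographic choice can be sketched in a couple of lines — a hypothetical selection over queen indices 0..n-1, ordered first by current domain size and then by distance to the middle of the board.

```python
def select_queen(unassigned, domains, n=8):
    """Smallest domain first; among ties, closest to the middle of the board."""
    middle = (n - 1) / 2
    # tuples compare lexicographically: domain size breaks ties by centrality
    return min(unassigned, key=lambda q: (len(domains[q]), abs(q - middle)))
```

Because Python compares tuples lexicographically, the second component only matters when the first is tied — exactly the behavior described above.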

Okay? Now, remember also one of the things that we had on the queens problem: the dual modeling, where you had variables associated with the columns and variables associated with the rows, okay? That's a good example to show how you can choose both a variable ordering in a dynamic fashion and a value ordering in a dynamic fashion. What you see here is that we look at the row variables as the ones that we want to assign. We choose the one which has the smallest domain, and then we have to assign a value — and assigning a value is essentially equivalent to choosing a column. So we can try the values in an ordering which also follows the first-fail principle, which says: choose the value which corresponds to the column variable which has the smallest domain, because that column is tough to instantiate as well. And that's what you are doing here, okay? You choose a dynamic order for the variables, then you choose a dynamic order for the values. In both cases, you're actually applying the first-fail principle, okay? So that's the kind of thing that you can do in a constraint programming system: choose a dynamic ordering for the variables, choose a dynamic ordering for the values.
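The dynamic value ordering from the dual model can be sketched like this — a hypothetical helper that, given the candidate columns for the row variable being labeled, tries them starting with the column whose own (dual) variable has the smallest domain.

```python
def value_order(row_domain, col_domains):
    """Order candidate columns by the domain size of their dual column variable."""
    # smallest dual domain first: that column is the toughest to instantiate
    return sorted(row_domain, key=lambda c: len(col_domains[c]))
```

A tryall over `value_order(...)` would then explore the toughest columns first, applying first-fail to the values as well as to the variables.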

A very interesting thing that you can do. So in general, when you have a variable/value labeling, what you're going to do is first choose the most constrained variable — that basically means the variable with the smallest domain, the one that fails very often. That's the kind of thing that you want to do. The value ordering is generally going to be a little bit different: there, what you want to do is try to maximize the probability of finding a solution. So we're going to try the value which leaves as many options as possible open in the future, okay? Because that helps you find a solution: you leave flexibility for the variables. So in a sense, we are trying to look at the region which is the toughest, but also trying to find a feasible solution. Of course, this is just a heuristic — it's not a rule that you have to apply universally — but it's typically something which is useful in practice, in many applications. Okay?

So essentially, what I've shown you so far is that in constraint programming, most of the time, we try to use feasibility information for branching. And why? Because that's essentially what a constraint programming system is doing: it's basically reasoning about feasibility, pruning the search space, and that gives you a lot of information about which variables are tough, which variables are central to the problem that we are trying to solve. But sometimes we have an optimization problem: you're trying to find an optimal solution for an objective function, and it's important to look at the value of the objective function as well. Now, as I told you before, when we solve an optimization problem, the objective function becomes a constraint every time you find a solution.

So the feasibility information is still very important, but we can do better than just exploiting it, by also reasoning about the objective function itself. What I want to do now is illustrate that on a very interesting problem, which is generally very challenging for optimization systems — and constraint programming is really nice for solving these problems. It's called serializable data services, and it's essentially about implementing a protocol that gives you reliable software. This is something that we did with some colleagues at UConn, and the protocol that we were trying to implement is called eventually serializable data services.

The way to think about this is that you have a set of software components that you have to map onto a set of machines — a set of hardware. To do that, you have to satisfy some reliability conditions: there will be some components that have to be located on the same machine, and some components that have to be located on different machines. But you want this protocol to be as efficient as possible, so you basically want to minimize the communication traffic between these various components. If you place them on different machines, it's going to cost you more to actually exchange data between them. So when you formalize this problem in a mathematical programming sense, it becomes what is called a generalized quadratic assignment problem. This is the data that you have: a communication frequency matrix which tells you how much two components are interacting with each other — that's the data f.

You also have a matrix which tells you the distance between the various hardware components — this is the distance matrix H; it's essentially the number of hops that you have to do when you're moving from one piece of hardware to another one. Then the decision variables are going to be this vector x, and x tells you, for every component, on which hardware that particular software component is going to be placed. C is the set of components, and then you have two types of constraints: the colocation constraints and the separation constraints. The separation constraints are telling you that two components cannot be on the same machine — if they were, the protocol wouldn't be reliable. And then you have the colocation constraints saying the opposite: these two things have to be on the same machine. And the objective function is going to be minimizing the traffic in the network.

So you have the frequency matrix that you see there, and then you have the number of hops, and x[a] and x[b] are basically the decision variables: where do you place component a, and where do you place component b? What you are trying to do is minimize the communication between all the components, okay? And that, of course, depends on how much they communicate and how far away from each other they are placed on the hardware. So this is the model, okay?
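The objective just described can be sketched as follows — a hypothetical evaluation of the traffic for a given placement x, with f the communication frequency matrix and H the hop-distance matrix (names taken from the lecture; the tiny instance is made up).

```python
def traffic(f, H, x):
    """Total communication cost of placement x: sum of f[a][b] * H[x[a]][x[b]]."""
    n = len(x)
    return sum(f[a][b] * H[x[a]][x[b]]
               for a in range(n) for b in range(n))

# made-up instance: 3 components, 2 machines one hop apart
f = [[0, 5, 1],
     [5, 0, 0],
     [1, 0, 0]]
H = [[0, 1],
     [1, 0]]
```

With this data, the placement `[0, 0, 1]` that colocates the two heavily-communicating components costs far less than `[0, 1, 0]`, which separates them — exactly the trade-off the objective captures.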

This is the model that we have for this particular generalized quadratic assignment problem. The model is essentially a straightforward implementation of what I just presented on the previous slide. You see the objective function, which is a direct translation of the objective function that you saw there. You see the matrix H, indexed by two variables — that's the element constraint that we saw a couple of lectures back. Then you see the two types of constraints: the colocation constraints, which are basically stating that two components which have to be colocated are assigned to the same machine; and for the separation constraints, what you have is basically an alldifferent constraint. What is really interesting in this particular example is the search procedure, because it's going to focus on the objective function. So this is what you do.

At every step, until all of the components have been placed, you take a component i which has not been assigned so far and which is interacting with as many components as possible — essentially, the one with the highest communication frequency with another component, okay? That's what you see there: select the i which has the maximum frequency of communication with another component. Why do you want to focus on that particular component? Because it's interacting heavily with somebody else, and therefore it will have a big effect on the objective function once you place it, okay? That's why you want to focus on this one: you will gain a lot of information as soon as you place that component. The way you place it is, obviously, by trying to make sure that you don't make the objective function increase too much. You basically want to find a value such that you minimize the communication — the number of hops — with respect to the other components that it's interacting with, and that's what you see in this particular instruction. Then you assign that particular component to that particular location. So, in a sense, what is important here is that you are using information on the objective function to drive the search, and this is a critical aspect of this problem. If you don't do that, the execution time is going to increase by an order of magnitude, okay?
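That search procedure can be sketched as a greedy labeling — a deliberately simplified, hypothetical version that ignores the colocation/separation constraints and keeps only the two objective-driven choices: pick the unplaced component with the highest communication frequency, then the machine that adds the least traffic with the components already placed.

```python
def place_components(f, H):
    """Greedy placement driven by the objective (constraints omitted)."""
    n, machines = len(f), range(len(H))
    x = {}
    while len(x) < n:
        # variable choice: unplaced component with the highest frequency
        # of communication with any other component
        i = max((c for c in range(n) if c not in x),
                key=lambda c: max(f[c][j] for j in range(n) if j != c))
        # value choice: machine minimizing the added traffic with placed ones
        # (on the first iteration the sum is empty, so machine 0 is chosen)
        x[i] = min(machines,
                   key=lambda m: sum(f[i][j] * H[m][x[j]] for j in x))
    return x
```

In a real constraint program this would be the heuristic inside the forall/tryall pair, with propagation and backtracking enforcing the colocation and separation constraints around it.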

So let me summarize what we have seen in this lecture. We have seen one type of search procedure which is very frequently used in practice: the variable/value labeling procedure. We saw the first-fail principle, which tells you to try first where it's hot — where you are most likely to fail. The first-fail principle typically uses feasibility information — which variables have the smallest domain — but it can also use information from the objective function, by looking at which components are difficult to place and would increase the objective function, or would make some of the other constraints much more difficult to satisfy, okay? We'll see more sophisticated search procedures in the next lecture, but this gives you a first taste of what you can do in a constraint programming system.

Thank you.