Now extractors typically take as input something called a salt, and a salt just

like in a salad, it kind of adds flavor to things, what it does is basically kind of

jumbles things around, so that no matter what the input distribution is, the output

distribution is still going to be indistinguishable from random. So a salt

basically, what is it? It's a non-secret string, so it's publicly known. It doesn't

matter if the adversary knows what the salt is, and it's fixed forever. The only

point is that when you chose it, you chose one at random. And then the hope is that

the funny distribution that you're trying to extract from kinda doesn't inherently

depends on the salt that you chose and hence as a result using your salt, you

will actually get a distribution that looks indistinguishable from random. So

essentially the salt, you know, you can just bang it the keyboard a couple of

times when you generate it but it just needs to be something that's random

initially but then it's fixed forever, and it's fine if the adversary knows what

it is and nevertheless the extractor is able to extract the entropy and output a

uniformly random string K. In some sense the salt is only there to defend against

adversarially bad distributions that might mess up our extractor. Okay, so now that

we have extracted a pseudo random key. Now, we might as well just use it in a KDF

that we just saw using a secure pseudo random function to expand the key

into as many bits as we need to actually secure the session. Okay, so there are

these two steps. The first one is we extract a pseudo-random key, and then once

we have a pseudo-random key we already know how to extend it into as many keys as

we need using a pseudo-random function. So the standardized way of doing this is

called HKDF. This is a KDF, a key derivation function that's built from HMAC.

And here HMAC is used both as the PRF for expanding and an extractor for extracting

the initial pseudo-random key. So let me explain how this works. So in the extract

step, we're gonna use our salt which you remember is a public value just happened to

be generated at random at the beginning of time. And we use this salt as the HMAC

key. And then the source key we're gonna use as the HMAC data. So we're kind of

using a public value as a key. And nevertheless, one can argue that HMAC has

extraction properties, such that, when we apply HMAC, the resulting key is going to

look indistinguishable from random, assuming that the source key actually has

enough entropy to it. And now that we have the pseudo random key we're simply going

to use HMAC as a PRF to generate a session key of you know as many bits as we

need for the session keys. Okay. So that basically concludes our discussion of

HKDF. And I just want you to remember that, once you obtain a source key, either

from hardware or from a key exchange protocol, the way you convert it into

session keys is not by using that sample directly. You would never use a source key

directly as a session key in a protocol. What you would do is you will run the

source key through a KDF. And the KDF would give you all the keys and output

that you need, for, the randomness, for the random keys to be used in your

protocol. And a typical KDF to use is HKDF, which is actually getting quite a

bit of traction out there. Okay. The last topic I wanna talk about in this segment

is, how do you extract keys from passwords. These are called password based

KDFs or PBKDFs. The problem here is that passwords have relatively low

entropy. In fact, we're gonna talk about passwords later on in the course when we

talk about user authentication. And so, I'm not gonna say too much here. I'll just

say passwords generally have very little entropy estimated on the order of twenty

bits of entropy, say. And as a result, there is simply not enough entropy to

generate session keys out of a password. And yet we still need to do it very

frequently. We still need to derive encryption keys and MACs and so on out of

passwords, so the question is how to do it. The first thing is, you know, for this

kind of purpose, don't use HKDF. That's not what it's designed for. What will

happen is that the derived keys will actually be vulnerable to something called

a dictionary attack, which we're gonna talk about much later in the course when

we talk about user authentication. So, the way PBKDFs defend against this low entropy

problem that results in a dictionary attack is by two means. First of all, as

before they use a salt, a public, random value that's fixed forever. But in

addition, they also use what's called a slow hash function. And let me describe

kind of the standard approach to deriving keys from passwords. This is called PKCS#5,

and in particular, the version I'll describe is what's called PBKDF1. And I

should say that this mechanism is implemented in most crypto libraries so

you shouldn't have to implement this yourself. All you would do, you know, you

would call a function, you know, derived key from password. You would give the

password in as input, and you would get a key as output. But you should be aware of

course that this key is not going to have high entropy so in fact it will be

guessable. What these PBKDFs try to do is make the guessing problem as hard as

possible. Okay. So the way they work, first of all, is, as we said, they

basically hash, the concatenation of the password and the salt. And then the hash

itself is designed to be a very slow hash function. And the way we build a slow hash

function is by taking one particular hash function, say, SHA-256, and we

iterate it many, many times, C times. So you can imagine 1000 times, perhaps even a

million times. And what do I mean by iterating it? So, well, we take the

password and the salt. And we put them inside of one input to the hash function.

And then we apply the hash function, oops, let me write it like this. And then we

apply the hash function and we get an output, and then we apply the hash

function again and we get another output. And we do this again and again and again

maybe a thousand times or a million times depending on how fast your processors are

and then finally we get the final output that we actually output as, as the key

output of this key derivation function. Now what is the point here? Iterating a

function 10,000 times or even a million times is going to take very little time on

a modern CPU, and as a result, it doesn't really affect the user's experience. The

user types in his password, it gets hashed a million times and then it gets output.

And maybe that could even take, you know a tenth of a second and the user wouldn't

even notice it. However the attacker, all he can do is he can try all the passwords

in the dictionary, because we know people tend to pick passwords in dictionaries,

and so he could just try them one by one, remember the SALT is public, so he knows

what the SALT is. And so he can just try this hash one by one. However because the hash

function is slow, each attempt is gonna take him a tenth of second. So if he needs

to run through a dictionary, you know, with, with a 200 billion passwords in it,

because the hash function is slow, this is gonna take quite awhile. And by doing

that, we slow down the dictionary attack, and we make it harder for the attacker to

get our session keys. Not impossible, just harder. That's all this is trying to do.

Okay, so this is basically what I wanted to say about password based KDFs. As I

said, this is not something you would build yourself. All crypto libraries have

an implementation of a PKCS#5 mechanism. And you would just call the

appropriate function to convert a password into a key, and then use the resulting

key. Okay, in the next segment, we're gonna see how to use symmetric encryption

in a way that allows us to search on the cipher texts.