In the previous video, you saw how to tokenize words and sentences, building up a dictionary of all the words to make a corpus. The next step will be to turn your sentences into lists of values based on these tokens. Once you have them, you'll likely also need to manipulate these lists, not least to make every sentence the same length; otherwise, it may be hard to train a neural network with them. Remember when we were doing images, we defined an input layer with the size of the images that we were feeding into the neural network. In cases where images were differently sized, we would resize them to fit. Well, you're going to face the same thing with text. Fortunately, TensorFlow includes APIs to handle these issues, and we'll look at those in this video.

Let's start by creating a list of sequences: the sentences encoded with the tokens that we generated. I've updated the code that we've been working on to this. First of all, I've added another sentence to the end of the sentences list. Note that all of the previous sentences had four words in them, so this one's a bit longer. We'll use that to demonstrate padding in a moment. The next piece of code is this one, where I simply call texts_to_sequences on the tokenizer, and it will turn the sentences into a set of sequences for me. So if I run this code, this will be the output. At the top is the new dictionary, with new tokens for my new words like 'amazing', 'think', 'is', and 'do'. At the bottom is my list of sentences that have been encoded into integer lists, with the tokens replacing the words. So, for example, 'I love my dog' becomes 4, 2, 1, 3.

One really handy thing about this that you'll use later is the fact that the texts_to_sequences call can take any set of sentences, and it will encode them based on the word set that it learned from the sentences that were passed into fit_on_texts. This is very significant if you think ahead a little bit. If you train a neural network on a corpus of text, and the text has a word index generated from it, then when you want to do inference with the trained model, you'll have to encode the text that you want to infer on with the same word index; otherwise it would be meaningless. So if you consider this code, what do you expect the outcome to be? There are some familiar words here, like 'love', 'my', and 'dog', but also some previously unseen ones. If I run this code, this is what I would get. I've added the dictionary underneath for convenience. 'I really love my dog' would still be encoded as 4, 2, 1, 3, which is 'I love my dog', with 'really' being lost because that word is not in the word index, and 'my dog loves my manatee' would get encoded to 1, 3, 1, which is just 'my dog my'.
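The notebook cells themselves aren't reproduced in this transcript, so here is a minimal sketch of the kind of code the video walks through. The exact sentence list, the num_words value, and the test sentences are assumptions reconstructed from the words and encodings mentioned above, not code taken from the video.

```python
from tensorflow.keras.preprocessing.text import Tokenizer

# Training sentences -- the last one is longer than four words,
# which is used later to demonstrate padding (assumed wording).
sentences = [
    'I love my dog',
    'I love my cat',
    'You love my dog!',
    'Do you think my dog is amazing?'
]

tokenizer = Tokenizer(num_words=100)   # num_words=100 is an assumption
tokenizer.fit_on_texts(sentences)      # build the word index from the corpus
word_index = tokenizer.word_index

# Encode the training sentences as lists of token values
sequences = tokenizer.texts_to_sequences(sentences)
print(word_index)   # e.g. {'my': 1, 'love': 2, 'dog': 3, 'i': 4, ...}
print(sequences)    # 'I love my dog' -> [4, 2, 1, 3]

# Sentences containing words the tokenizer has never seen
test_data = [
    'i really love my dog',
    'my dog loves my manatee'
]

# Unseen words ('really', 'loves', 'manatee') are simply dropped,
# so these should encode to [4, 2, 1, 3] and [1, 3, 1]
test_seq = tokenizer.texts_to_sequences(test_data)
print(test_seq)
```

Because texts_to_sequences only uses the word index learned in fit_on_texts, any word outside that index disappears from the encoded sentence, which is exactly the 'really' and 'manatee' behaviour described above.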