Let's recap how far you have come on your journey to learn about feature engineering. Predictive models are constructed using supervised learning algorithms, where classification or regression models are trained on historical data to predict future outcomes. Feature engineering is a crucial step in the process of predictive modeling. It involves the transformation of a given feature, with the objective of reducing the modeling error for a given target. The underlying representation of the data is crucial for the learning algorithm to work effectively. The training data used in machine learning can often be enhanced by extracting features from the raw data collected. In most cases, appropriate transformation of the data is an essential prerequisite step before model construction.

Feature engineering can be defined as a process that attempts to create additional relevant features from the existing raw features in the data, and so increase the predictive power of the learning algorithm. It can also be defined as the process of combining domain knowledge, intuition, and data science skill sets to create features that make models train faster and provide more accurate predictions. Machine learning models, such as neural networks, accept a feature vector and provide a prediction. These models learn in a supervised fashion, where a set of feature vectors with the expected output is provided. From a practical perspective, many machine learning models must represent the features as real-number vectors, because the feature values must be multiplied by the model weights. In some cases, the data is raw and must first be transformed into feature vectors.

Features, the columns of your data frame, are what machine learning models learn from. Better features result in faster training and more accurate predictions. As the diagram shows, data is input into the model not as raw values, but as feature columns. Engineering new features from a provided feature set is a common practice. Such engineered features either augment or replace portions of the existing feature vector; they are essentially calculated fields based on the values of the other features. As you will see later in the labs, feature columns can be numerical, categorical, bucketized, crossed, and hashed; a short sketch of a few of these column types follows below.

Engineering features is primarily a manual, time-consuming task. The process involves brainstorming features, delving into the problem, looking at a lot of data, studying feature engineering on other problems, leveraging domain-specific engineered features, and devising features either manually, automatically, or both. There is no well-defined recipe for effective feature engineering. It involves domain knowledge, intuition, and, most of all, a lengthy process of trial and error. The key learning here is that different problems in the same domain may need different features, and it depends on you and your subject matter expertise to determine which fields you want to start with for your hypothesis or your problem.

Some values in a feature set need to be normalized or re-scaled before they are used by an ML model. Here, re-scaling means changing a real-valued feature, like a price, to a range from zero to one using the min-max formula shown on the slide. Re-scaling can be done for many reasons, but most of the time it is done to improve the performance of gradient descent. A small sketch of this computation also follows below.
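To make those column types concrete, here is a minimal sketch using the TF 1.x tf.feature_column API. The feature names, boundaries, and bucket sizes are illustrative assumptions, not values from the course labs. Note that crossed_column takes the raw string key for a hashed feature rather than the hashed column itself.

    import tensorflow as tf

    # Numeric column for a raw real-valued feature.
    sqft = tf.feature_column.numeric_column('square_feet')

    # Bucketized column: discretizes the numeric feature at the
    # given (illustrative) boundaries.
    sqft_buckets = tf.feature_column.bucketized_column(
        sqft, boundaries=[500, 1000, 2500])

    # Hashed categorical column: maps strings into a fixed number
    # of hash buckets, with no vocabulary required.
    zip_code = tf.feature_column.categorical_column_with_hash_bucket(
        'zip_code', hash_bucket_size=1000)

    # Crossed column: a feature cross of the bucketized column and
    # the raw 'zip_code' string key, hashed into a fixed space.
    sqft_x_zip = tf.feature_column.crossed_column(
        [sqft_buckets, 'zip_code'], hash_bucket_size=10000)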
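And here is a minimal sketch of the min-max re-scaling just described, assuming the feature's minimum and maximum have already been found in a pre-processing pass over the dataset; the function name and the price bounds are illustrative.

    def rescale(value, feature_min, feature_max):
        """Min-max re-scaling: maps a real-valued feature to [0, 1]."""
        return (value - feature_min) / (feature_max - feature_min)

    # Example: a price of 250 with dataset-wide bounds [100, 600]
    # re-scales to (250 - 100) / (600 - 100) = 0.3.
    scaled_price = rescale(250.0, feature_min=100.0, feature_max=600.0)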
Notice that, to compute the re-scaling formula shown on the screen, you need to know the minimum and maximum values for a feature, which means pre-processing the entire dataset to find those values. Pre-processing can also be useful for categorical values in your datasets, like the names of cities, as in the code snippet at the bottom of the slide. For example, to use a one-hot encoding technique in TensorFlow, which represents different cities as binary-valued features in your feature set, you can use the categorical column with vocabulary list method from the feature column API. To use this method, you need to pass it a list of values, which in this example are different city names; a sketch of this follows below. Building this dictionary of values for a key also requires a pre-processing step over the entire dataset.

In this course, you also learned about three technologies that help you implement pre-processing: BigQuery, Apache Beam, and TensorFlow, which were used to process the full input dataset prior to training. You learned that you can exclude some data points from the training set, and also compute statistics and vocabularies over the entire input dataset. You also saw that you can compute time-window statistics for use as input features. You learned that things commonly done in pre-processing, such as scaling of numeric features, splitting and lowercasing of textual features, resizing of input images, and normalizing volume levels in input audio, can be pre-processed one data point at a time, and can be implemented either in TensorFlow directly or using Beam.

You learned that tf.Transform does batch processing, but also emits a TensorFlow graph that can be used to repeat these transformations during serving; a second sketch of this follows below. By combining this graph with the trained model graph into a single serving graph, you can guarantee that the same operations that were applied to the training data will be applied to each request during serving, before the transformed result is passed to the trained model graph.
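Here is a minimal sketch of that one-hot encoding, assuming a hypothetical 'city' feature and an illustrative vocabulary; in practice, the vocabulary list is built by a pre-processing pass over the entire dataset.

    import tensorflow as tf

    # Vocabulary built in a pre-processing step over the whole dataset
    # (the city names here are illustrative).
    city = tf.feature_column.categorical_column_with_vocabulary_list(
        key='city', vocabulary_list=['London', 'Paris', 'Tokyo'])

    # Wrapping the column as an indicator column yields the binary,
    # one-hot representation that a deep model can consume.
    city_one_hot = tf.feature_column.indicator_column(city)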
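And here is a minimal sketch of a tf.Transform preprocessing function, assuming hypothetical 'price' and 'city' input features. The analyzers make a full pass over the dataset during the batch phase (for example, with Apache Beam), and the emitted transform graph replays the identical operations at serving time.

    import tensorflow_transform as tft

    def preprocessing_fn(inputs):
        """Batch pre-processing; tf.Transform also emits this as a TF graph."""
        return {
            # Full-pass analyzer: finds the dataset-wide min and max,
            # then re-scales the feature to [0, 1].
            'price_scaled': tft.scale_by_min_max(inputs['price']),
            # Full-pass analyzer: computes a vocabulary over all cities,
            # then maps each city string to its integer index.
            'city_id': tft.compute_and_apply_vocabulary(inputs['city']),
        }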
This concludes our introduction to feature engineering course. We hope you have found value in our content, labs, readings, discussions, and quizzes.