In this video, we will give an overview of what facial attributes are, give a little bit of background on their detection, and show how a deep convolutional neural network can be applied to the problem of their recognition.

Facial attributes such as gender, race, age, or the presence of facial hair such as a mustache arise when examining the performance of face recognition algorithms, particularly those that aim at searching for specific people in large databases, a common case in video surveillance applications. There, not only might the facial pose differ, but so might, for instance, the expression and hairstyle. As a result, in these applications men might get confused for women, young people for old, or Asians for Caucasians. All of these are examples of facial attributes: high-level visual traits describing the appearance of a face image. In general, attributes are considered insensitive to pose, illumination, expression, and other imaging conditions.

One more application of visual attributes is image search. Suppose one wants to find images with a specified facial appearance, such as "smiling Asian man with glasses." When using annotation-based image search, one is confronted with sparsely annotated data that is time-consuming to label and prone to incorrect or misleading annotations. Image retrieval then might yield a plethora of irrelevant or incomplete results, indicating that annotation-based image search does not scale. Alternatively, an attribute-based image retrieval system recognizes attributes and automatically tags images with attribute labels to make them searchable.

A facial attribute detection pipeline typically involves localizing the face or facial landmarks in a high-resolution image. This step is performed using an object detection or a keypoint regression convolutional neural network, which we will discuss further in the course.
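The two-stage pipeline just described can be sketched in a few lines. This is a minimal illustration, not the course's actual implementation: `detect_face` and `classify_attributes` are hypothetical stand-ins for the trained detection and attribute classification networks.

```python
import numpy as np

def detect_and_classify(image, detect_face, classify_attributes):
    """Sketch of the two-stage pipeline: localize the face, normalize the
    crop, then run the attribute classifier. Both callables are hypothetical
    stand-ins for trained convolutional neural networks."""
    x, y, w, h = detect_face(image)          # stage 1: face localization
    crop = image[y:y + h, x:x + w]           # crop the detected face region
    crop = crop.astype(np.float32) / 255.0   # normalize pixels to [0, 1]
    return classify_attributes(crop)         # stage 2: attribute prediction

# Dummy stand-ins so the sketch runs end to end.
dummy_detector = lambda img: (10, 10, 64, 64)
dummy_classifier = lambda crop: {"smiling": crop.mean() > 0.5}

image = np.full((128, 128, 3), 200, dtype=np.uint8)
print(detect_and_classify(image, dummy_detector, dummy_classifier))
```

In a real system the dummy callables would be replaced by the detector and classifier models discussed below; the structure of the pipeline stays the same.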
The detected face is normalized and passed to an attribute classification convolutional neural network to predict one or more face attributes. Facial attribute recognition is typically evaluated using a simple prediction accuracy metric, corresponding to the fraction of correctly predicted facial attributes.

Attribute recognition methods are generally subdivided into two broad groups: global and local methods. Global methods extract features from the entire object, so accurate locations of object parts or landmarks are not required; however, they are not robust to deformations of objects. More recently, local methods were proposed that first detect object parts and extract features from each part. These local features are then concatenated to train classifiers. Let us describe both approaches in greater detail.

Gathering a large labeled training set for attributes such as age and gender from social image repositories requires either access to personal information on the subjects appearing in the images, which is often private, or tedious and time-consuming manual labeling. Datasets for age and gender estimation from real-world social images are therefore relatively limited in size, and presently no match in size for the much larger image classification datasets such as ImageNet. Overfitting is a common problem when machine-learning-based methods are used on such small image collections. This problem is exacerbated for deep convolutional neural networks due to their huge number of trainable model parameters. Care must therefore be taken to avoid overfitting under such circumstances. Ways to avoid overfitting include employing aggressive data augmentation strategies, such as random cropping and mirroring of the input data, as well as injecting multiple normalization and dropout layers into the network to stabilize learning.
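The random-crop-plus-mirroring augmentation mentioned above can be sketched as follows. This is an illustrative NumPy version under assumed conventions (square crops, mirroring with probability 0.5), not the course's actual training code.

```python
import numpy as np

def augment(image, crop_size, rng):
    """Random square crop followed by random horizontal mirroring,
    the two augmentations described in the lecture."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop_size + 1)    # random vertical offset
    left = rng.integers(0, w - crop_size + 1)   # random horizontal offset
    crop = image[top:top + crop_size, left:left + crop_size]
    if rng.random() < 0.5:                      # mirror half of the time
        crop = crop[:, ::-1]
    return crop

rng = np.random.default_rng(0)
face = np.arange(100 * 100 * 3, dtype=np.uint8).reshape(100, 100, 3)
crop = augment(face, 80, rng)
print(crop.shape)
```

Applying such transforms on the fly during training effectively multiplies the size of a small dataset, which is exactly what limited face attribute datasets need.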
For certain attribute types, age group classification rather than precise age estimation may improve performance, because it allows training with the cross-entropy loss instead of the usual L2 regression loss. One may adopt a multistage attribute recognition procedure and train separate convolutional neural network models: a face detector and an attribute classifier. While the detector is trained in a relatively straightforward way, and we will get to this during the next week, the attribute classifier is pre-trained on a large collection of images for face verification. By pre-training on the face verification task, the attribute classifier learns to extract face-sensitive convolutional features. It is later fine-tuned to adapt to the final task of attribute recognition. Research shows that pre-training on the face verification task significantly improves classifier performance. During inference, accuracy is further improved by computing predictions on multiple crops.

To summarize, attributes serve as describable traits of visual appearance, and they are important for face recognition and face retrieval. There are local and global attribute recognition approaches, which differ in how they perform feature extraction from facial images and interpret the results. As usual, employing a convolutional neural network for facial image classification requires careful tuning of network hyperparameters, pre-training, and data augmentation.
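The multi-crop inference trick mentioned above can be sketched as averaging the classifier's outputs over several crops. The code below is a minimal illustration under assumed conventions (four corner crops plus a center crop); `predict` and `dummy_predict` are hypothetical stand-ins for a trained attribute classifier.

```python
import numpy as np

def multicrop_predict(image, predict, crop_size):
    """Average per-attribute probabilities over five crops: the four
    corners plus the center. `predict` is a hypothetical stand-in for a
    trained classifier returning an array of attribute probabilities."""
    h, w = image.shape[:2]
    c = crop_size
    offsets = [(0, 0), (0, w - c), (h - c, 0), (h - c, w - c),
               ((h - c) // 2, (w - c) // 2)]
    probs = [predict(image[t:t + c, l:l + c]) for t, l in offsets]
    return np.mean(probs, axis=0)  # averaging smooths crop-position noise

# Dummy classifier so the sketch runs: "probability" is the mean pixel value.
dummy_predict = lambda crop: np.array([crop.mean() / 255.0])
image = np.full((128, 128, 3), 128, dtype=np.uint8)
print(multicrop_predict(image, dummy_predict, 64))
```

Mirrored versions of each crop are often added as well, doubling the set to ten predictions per image at a modest extra inference cost.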