Hi. Welcome to Introduction to Map/Reduce. My name is Paul Rodriguez. I work here at SDSE helping folks with different kinds of data analysis problems. In this module you will learn the concepts behind the Map/Reduce framework and strategies for using Map/Reduce. You will also go through the details of some Map/Reduce examples, as well as their execution on Hadoop. In a previous module you learned about the architecture of Hadoop, and in a previous course you learned about the challenges of big data. This module will start putting those things together.

In this first lecture, I want to set up the context and motivate the need for Map/Reduce. Let's recall what the problem is. Imagine you have a large amount of data, and suppose the data is growing. It's perhaps unstructured, and you somehow want to process it. Having big data means you're going to need lots of hard drives, and maybe that data is already spread out among many hard drives. Imagine you're a company, or you have a project, that collects Internet data. You need to process that data, but you don't want to worry about all the details of parallelizing, communicating between processes, and all the potentially messy details that that entails. In fact, this is like the problem Google faced with Internet searching, and they developed an approach to solve it: bring the computation to the data. Moreover, as we will see, they wanted to make it easy to develop code without worrying about all the messy details.

Before we go further, let's be clear on something. If you have a lot of data spread out over many disks, and your data is transactional, meaning you have, say, a lot of customer records that get retrieved, updated, processed for billing, and so on, then you might want to go with a traditional database scheme: a database management system where you build indices, set schemas, and organize the data into tables. But if you need to make a lot of sweeps through the data and perform some relatively simple processing, then it would be better to have a system that helps you apply functions to pieces of the data that are spread out and then organize the output. That, in a nutshell, is the Map/Reduce framework. It is a layer of software that helps you bring computation to the data and organize the output. The next video will get into the framework in detail.
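As a small preview of that framework, here is a minimal, Hadoop-free sketch in plain Python of what "apply functions to pieces of the data and then organize the output" can look like. The word-count task, the function names (`map_piece`, `reduce_group`), and the sample strings are illustrative assumptions, not anything from the lecture; in a real cluster, Hadoop handles the splitting, grouping, and parallel execution that are only simulated here.

```python
# A minimal sketch of the map/reduce idea in plain Python (no Hadoop involved):
# the data is split into pieces, a map function is applied to each piece
# independently, the intermediate outputs are grouped by key, and a reduce
# function combines each group. Names and data below are illustrative only.
from collections import defaultdict

def map_piece(piece):
    # Emit (word, 1) pairs for one piece of the data.
    return [(word, 1) for word in piece.split()]

def reduce_group(word, counts):
    # Combine all counts that share the same key (word).
    return word, sum(counts)

if __name__ == "__main__":
    pieces = [
        "big data needs big storage",
        "map reduce brings computation to the data",
    ]

    # Map step: each piece could, in principle, be processed on a different machine.
    mapped = [pair for piece in pieces for pair in map_piece(piece)]

    # Shuffle step: group intermediate (word, count) pairs by key.
    groups = defaultdict(list)
    for word, count in mapped:
        groups[word].append(count)

    # Reduce step: combine each group into a final result.
    results = dict(reduce_group(w, c) for w, c in groups.items())
    print(results)  # e.g. {'big': 2, 'data': 2, 'needs': 1, ...}
```

The point of the sketch is only the shape of the computation: you write the two small functions, and the framework takes care of distributing the pieces and organizing the output.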