[MUSIC] Formulating an analytics problem. I'm going back to problem formulation, back to where we started in session one, for a good reason: most of the data that we are going to use here will not be primary in form. It will be digital, it will be secondary. So let's see this in action. Here's an example.

The background: the Earth's population is slated to rise from 7.3 billion people now to 9.7 billion people by 2050. Average incomes will also rise, which basically means that people will want to eat more and eat better, which basically means more meat, which basically means more resources needed to raise livestock. So the question then becomes, how can the world be fed in the future without putting irreparable strain on the Earth's soil and oceans? The FAO, which is the Food and Agriculture Organization, a part of the UN, says in its 2009 report that by 2050, agri production will have to rise by something like 70% to meet projected demand. There is no way out of this other than the industrialization of agriculture. And most land suitable for farming has already been farmed, which basically means this growth has to come from higher yields, right? Higher yields have already happened once. A lot of agriculture has undergone that shift, so think about the Green Revolution in the 60s and 70s, but yields since then have plateaued. So to go beyond that, we are now going to leverage technology and, yes, analytics. Let's see this.

Consider a farmer's challenges, right, the sources of risk or uncertainty. Take a minute and type what you think a farmer's challenges are. You are a farmer. What are the sources of risk, of uncertainty, that you face? Just write it down in a minute or two and then we will proceed.

Okay, so there are a number of risks, and what I have here is an incomplete list. The biggest one is probably the weather. Then the soil's moisture levels, the soil's nutrient content, the competition to crops from weeds. These are all risks and uncertainties.
There is the threat from pests and from diseases, and the cost of taking action should something like this happen, right? All of them will, in some sense, weigh on farmers' minds.

So think of the farmer's problem as a matrix, okay, with rows and columns. The inputs are the variables, and they are the columns. The rows are individual plants. We will see this in action, just hold on. Two questions arise at this point, right? One, how do you cost effectively populate this matrix? If I have a matrix where each row is a single plant and each column is something like the weather, the soil's moisture level, the soil's nutrient content, and the number of weeds around it, I can actually populate the entire matrix, okay. It's going to be time consuming, so the question is how to do it cost effectively. And two, once it is populated, how do you analyze the data to optimize yields and maximize profits? Once you have populated the matrix, analytics will do the rest.

Populating the matrix: how would you cost effectively populate it with data on, let's say, inter-plant distances, data on weed growth, data on soil conditions? How? What I want you to do is take a minute and think about it. What is the easiest, best, fastest, cheapest way to fill this matrix up? Type a one-line answer and then we'll proceed.

Well, there are two ways this can be done, okay. The first is you take the aerial route: drones or planes flying over the fields, right? So they overfly the field, and what do they do? Well, that's what a field looks like, in some sense, when a drone or plane flies over it. You can photograph the fields in high definition, or you could use, even better, something called multispectral analysis. So you have these cameras that see beyond the visible spectrum, both into infrared and into ultraviolet. What do they do? They figure out crop density, they figure out crop health, they figure out soil conditions.
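Going back to the matrix framing for a moment, here is a minimal sketch of what it could look like in code. The column names and every number in it are made up for illustration; they are assumptions, not data from the lecture. Each row is one plant and each column is one input variable.

```python
# Hypothetical sketch: column names and values are invented for illustration.
COLUMNS = ["soil_moisture", "soil_nutrient", "weed_count", "inter_plant_distance_cm"]

# Each row is one plant; each entry lines up with the name at the same
# position in COLUMNS.
plant_matrix = [
    [0.31, 0.62, 2, 18.0],
    [0.27, 0.58, 0, 20.5],
    [0.35, 0.60, 5, 17.5],
]


def column(matrix, name):
    """Return one input variable (one column) across all plants."""
    idx = COLUMNS.index(name)
    return [row[idx] for row in matrix]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


if __name__ == "__main__":
    print("plants:", len(plant_matrix), "variables:", len(COLUMNS))
    print("weeds per plant:", column(plant_matrix, "weed_count"))
    print("mean soil moisture:", round(mean(column(plant_matrix, "soil_moisture")), 2))
```

Populating the matrix then just means filling in one such row per plant, which is what the aerial and land routes are meant to do cheaply.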
I can tell you the moisture content at the surface by using multispectral analysis. Why? Because wet soil tends to reflect a different wavelength than dry soil. Can I identify weeds? Yes, because their leaves reflect a different wavelength than those of the crop. I can do all of that with one overpass, one flyby. You can also do accurate contour mapping. We'll not go there, but all of that is possible, right?

Alternatively, you could take the land route. So basically you have these machines that do sowing, seeding, harvesting and so on, and they can be GPS enabled. In fact, at John Deere, all their machines are GPS enabled. They basically say, we are an information company too. You can control inter-seed distances while sowing. You can manage water and fertilizer supply based on local soil conditions, based on contours, based on all of that. And you can, in some sense, record harvested quantities. That is important, because that becomes the y variable: the yield map for the whole field.

Okay, I was talking about machine vision, multispectral images and so on. Yes, advances in machine vision and image analysis have made this possible. And here's the link with digital media: all this data is digital in form, right? The applications are there, from geo mapping to imaging using satellite data. Enormous possibilities there. Once the datasets are ready from image analysis, you can just open the analytics toolbox.

Here is an example of just how good machine vision has gotten of late. That is the error rate in image analysis, and you can see it coming down, down, down, until in 2015 it crossed a great milestone: it came below the human error rate. All right, and from here, it just gets better for the machines.

Consider an example of what the datasets might look like now, right? I have soil conductivity. I have crop yield.
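With soil conductivity as an input and crop yield as the outcome, the analytics step could be sketched roughly as follows. This is a minimal sketch under loud assumptions: the numbers are invented, a single input is used, and a straight-line fit stands in for whatever model one would actually choose. The idea it shows is the usual one: fit on some plots, then check the predictions on plots the model has not seen.

```python
# Hypothetical sketch: the numbers are invented for illustration, not field data.
# Each pair is (soil_conductivity, crop_yield) for one plot.
plots = [
    (10.0, 2.6), (14.0, 2.9), (18.0, 3.1), (22.0, 3.4),
    (26.0, 3.5), (30.0, 3.9), (12.0, 2.7), (24.0, 3.5),
]


def fit_line(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_abs_error(pairs, intercept, slope):
    """Average absolute gap between actual and predicted yield."""
    return sum(abs(y - (intercept + slope * x)) for x, y in pairs) / len(pairs)


# Train on the first six plots; keep the last two as a holdout sample.
train, holdout = plots[:6], plots[6:]
intercept, slope = fit_line(train)
holdout_error = mean_abs_error(holdout, intercept, slope)

if __name__ == "__main__":
    print("intercept:", round(intercept, 3), "slope:", round(slope, 4))
    print("mean absolute error on holdout:", round(holdout_error, 3))
```

The holdout error is the number to watch, because it says how well the fitted line predicts plots it was not trained on.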
This is basically what the images look like. It's hard to make out from here, but you put a machine to work on this and it will be able to give you different matrices, with individual columns populated based on these kinds of pictures, right: a set of inputs and a set of outcomes. All right, so if you go back to what we had and how analytics works, I have a set of outcomes, I have a set of inputs, and I train my model, I train the machine, right. We calibrate it, we test it on holdout samples, and after that, on unknown fresh samples, we are able to get it to predict. That, in some sense, sums up everything that we've done thus far.

So we've actually come to the end of this introductory course. The terms business analytics and digital media have wide usage. They mean different things to different people. We've defined them in a simple way for our own purposes. We started with problem formulation in session one. We then moved on, in session two, to the basic approaches for solving the problems we formulated. In session three, if you remember, we applied those approaches to customer analytics problems. And finally, in session four, we put everything into a digital context, both social and non-social. With that, this course comes to an end. [MUSIC]