[MUSIC] So going back to the grid, we can highlight those systems that were based on MapReduce itself: the MapReduce paper in 2004, and these language layers on top, Pig and Hive in 2008, where Hive is SQL and Pig is a relational-algebra-looking language that we'll talk about in some detail in the next few segments. And Tenzing, which is also SQL, and Impala, which is also SQL, where Tenzing is from Google and Impala is from a company called Cloudera, which is a pretty eager evangelist of MapReduce and Hadoop-based technologies in general, okay?

So one trend I think you see is that these declarative languages on top of the parallel processing primitive of MapReduce are really here to stay, right? So people that were relatively against these kinds of languages from [INAUDIBLE] doing it. Now it's also fair to say that enterprises in general have made a pretty significant investment in SQL expertise. So even if they are attracted to the advantages that [INAUDIBLE] brings, they are pretty much demanding SQL. So this may be a response to this inertia from having invested in SQL in the past. I think that's certainly true; however, it's also true that the desire for declarative languages is reasonably well founded, for reasons that we've already talked about.

So you put these systems on this timeline, and the only point I wanna make about this is that there's a bit of a gap between the paper in 2004 and these systems in 2008. But then as soon as you had Hadoop, the system itself, developed at Yahoo and released as an Apache open source project, you sort of immediately see an ecosystem of extensions to it start to emerge, particularly adding these languages on top. And so I think that the need for a high-level interface is shown by how quickly they came around as soon as Hadoop was out. And, again, it didn't stop, with these later systems arriving a few years later. Okay.
And actually, not on this page, there are potentially hundreds of extensions to MapReduce, if you include research projects based on it. There are really, really a lot, okay? So these are some of the most popular ones. All right.

So another subset of this grid that you can look at is just the NoSQL systems. Now the whole last few segments have ostensibly been about NoSQL, but I've also included these kinds of analytics systems in here: MapReduce-based systems and a few others, for example, Dremel and Spark and Shark. Dremel is a system from Google that is the back end of a query-as-a-service system called Google BigQuery, which is pretty nice. I recommend taking a look at it. You can sort of upload data and put it in there, and it doesn't matter how big it is, and you can query it with very low latency.

Spark and Shark come from the AMPLab at Berkeley, and they're part of the Berkeley Data Analytics Stack, BDAS, or Bad Ass. And Spark is a language layer on top of, it's not MapReduce, but on top of a parallel processing system, and Shark is a SQL layer on top of that. Okay. And so a couple of the distinctive features of Spark are that it loads everything into memory and processes everything there when possible, writing things out to disk only for fault tolerance reasons, so much less often than MapReduce. And it also supports iterative processing, which is pretty important, and we're going to come back to that later in the course. And then Shark, again, is just SQL on top of this.

So within these NoSQL systems, this diagram is to point out that there's been sort of a Cambrian explosion. First we had memcached, which is, again, just a caching layer for the real system, and the real system was a bunch of MySQL databases that really weren't working all that well for all the requirements they were being used for. But you could bring things into memory and keep them there, looking them up by name, and it was just sort of a performance enhancement.
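To make that caching idea concrete, here's a minimal sketch of the look-it-up-by-name pattern. This is not memcached's actual API: a plain dict stands in for the in-memory cache, another dict stands in for the MySQL databases, and the names `cache`, `backing_store`, `read_from_database`, and `get` are all made up for illustration.

```python
# Hypothetical stand-ins: one dict plays the role of the in-memory cache,
# another plays the role of the "real system" (the databases behind it).
backing_store = {"user:1": "alice", "user:2": "bob"}
cache = {}
stats = {"db_reads": 0}  # counts how often we had to go to the real system


def read_from_database(key):
    """Pretend this is an expensive query against the real system."""
    stats["db_reads"] += 1
    return backing_store.get(key)


def get(key):
    """Look the value up by name in memory first; go to the database on a miss."""
    if key in cache:
        return cache[key]
    value = read_from_database(key)
    if value is not None:
        cache[key] = value  # keep it in memory for the next lookup
    return value


first = get("user:1")   # miss: goes to the database, then caches the value
second = get("user:1")  # hit: served from memory, no database read
```

The point of the sketch is just that the cache sits in front of the real system rather than replacing it: a repeated lookup never touches the database.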
It was a free performance enhancement, if you invested in the system. The real approach of throwing out everything you had and replacing it with a NoSQL system came a little later.

And there are only a couple of points I wanna make: there's been kind of a Cambrian explosion of different systems around this time, and this space in here is nowhere near as empty as it looks. I picked out a few systems here, but really, the emergence of new systems in this space hasn't slowed down much at all. So people have really been filling out the design space here since around 2006. And the only other point I'd make is that these quite popular document-oriented systems, CouchDB and MongoDB, have actually been around for quite a while. Their original releases were in 2005 and 2007. So I think they seem like new systems, but they have some maturity. Okay. [MUSIC]