Welcome to Cubes, Rollups, and Materialized Views and Tables. After watching this video, you will be able to: Relate what a data cube is in terms of star schema. Discuss the terms slice, dice, drill up or down, roll up, and pivot in terms of data cubes. Describe what a materialized view is. And, recall two use cases for materialized views. Let’s use an example to illustrate the concept of a data cube. Here is a cube generated from an imaginary star schema for a Sales OLAP (online analytical processing system). The coordinates of the cube are defined by a set of dimensions, which are selected from the star schema. In this illustration, we are only showing three dimensions, but data cubes can have many dimensions. We have the Product categories corresponding to the items sold, the State or Province the items were sold from, and the Year these products were sold in. The cells of the cube are defined by a fact of interest from the schema, which could be something like “total sales in thousands of dollars.” Here the “243” indicates “243 thousand dollars” for some given Product, State, and Year combination. There are many operations you can perform on data cubes, such as slicing, dicing, drilling up and down, pivoting, and rolling up. Let’s go over some examples of these operations, starting with slicing. Slicing a data cube involves selecting a single member from a dimension, which yields a data cube that has one dimension less than the original. For example, you can slice this sales cube by selecting only the year 2018 from the year dimension, allowing you to analyze sales totals for all sales states and all products for the year 2018. Similarly, dicing a cube involves selecting a subset of values from a dimension, effectively shrinking it. For example, you can dice this sales cube by selecting only “Gloves”, “T-shirts”, and “Jeans” from the Product-Type dimension, allowing you to restrict your view to just those product types. In snowflake schema, you will find hierarchies, or subcategories within some of your dimensions that you can drill into. Thus, for example, you can “drill down” into a particular member of the “Product category” dimension, such as “T-shirts,” resulting in this view, which may include more specific “product groups” such as “Classic,” “Slim fit,” and “Regular fit.” Drilling up is just the reverse process, which would take you back to the original data cube. Pivoting data cubes is straightforward. It involves a rotation of the data cube. In this case, the year and product dimensions have been interchanged, while the State dimension has been fixed "as is." Pivoting doesn’t change its information content; it just changes the point of view you may choose to analyze it from. Rolling up means summarizing along a dimension. You can roll up a dimension by applying aggregations, such as COUNT, MIN, MAX, SUM, and AVERAGE. For example, you could calculate the average selling price of Classic, Slim fit, and Regular fit T-shirts by summing horizontally over the three US states and dividing by three. A “materialized view” is essentially a local, read-only copy, or snapshot, of the results of a query. They can be used to replicate data, for example to be used in a staging database as part of an ETL process, or to precompute and cache expensive queries, such as joins or aggregations, for use in data analytics environments. Materialized views also have options for automatically refreshing the data, thus keeping your query up-to-date. Because materialized views can be queried, you can safely work with them without worrying about affecting the source database. Materialized Views can be set up to have different refresh options, such as: Never: they are only populated when created, which is useful if the data seldom changes. Upon request: manually refresh, for example, after changes to the data have been made, or scheduled refresh, for example, after daily data loads. Immediately: automatically refresh after every statement. Let’s look at an example. Here is how you might create a materialized view in Oracle using SQL statements. Start by creating and naming a “materialized view” object called “My underscore Mat underscore View”, Specify the refresh type as fast, which means “incrementally refresh the data”. Specify today as the start date, and Refresh the view every day. The final statement selects all data from my underscore table underscore name. Here is how you might create a materialized view in PostgreSQL to replicate a table. Start by creating a “materialized view” object called “My underscore Mat underscore View”, Specify some parameters, Specify the source tablespace, say “tablespace underscore name”, and Select all rows and columns from “table underscore name.” In PostgreSQL you can only refresh materialized views manually, using the “refresh material view” command. In Db2, materialized views are called MQTs, which stands for "materialized query tables." Here’s an example, from IBM’s online documentation, of creating a system-maintained “immediate refresh” MQT. The table, which is named “emp,” is based on the underlying tables: “Employee” and “Department” from the “Sample” database. The table will be created according to the query formed by these SQL statements, which selects columns from both tables. The “data initially deferred” clause means that data will not be inserted into the table as part of the “create table” statement, while the “refresh immediate” clause specifies that the query should refresh automatically. The “immediate checked” clause specifies that the data is to be checked against the MQT’s defining query and refreshed. Lastly, the “not incremental” clause specifies that integrity checking is to be done on the whole table. A query executed against the “emp” materialized query table shows that it is fully populated with data. In this video, you learned that: A data cube represents a star or snowflake schema’s dimensions as coordinates, plus a fact from the schema to populate its cells with values. Many operations can be applied to data cubes, such as: drilling down into hierarchical dimensions, slicing, dicing, and rolling up. Materialized views can be used to replicate data or to precompute expensive queries. And finally, modern enterprise data warehouse tools, such Oracle and Db2, allow you to automatically keep your material views up-to-date.