The one of the most important topics is safeguarding your data from unintended access. In this next module, we'll cover a range of methods from securing individual columns to encryption to limiting access to specific roles through authorized views. We'll then come to the default access control roles that are available within BigQuery and how those mapped to your overall Google Cloud Platform account level roles. So the first topic that we're going to cover in data access is actually doing a little bit of encryption on some of the columns you might have. So, here, you see the farm fingerprint function, much like other functions that you've seen in the past, is wrapping this literal string called secure. And this is just a one-way encryption function that you can throw on to your field values and have them be encrypted one-way. Thus, it not meant for passwords. This is meant for potentially hiding fields from different access groups, depending upon what level of information you want to see in your data. As an example, if you had an analytics and marketing group, the analytics group, for example, could see the telephone numbers of all the users in your account. But, say, you wanted to set up a view where only the marketers can see every field that's available, but the phone numbers have been hashed away or encrypted so that they can't be read by the marketers. You could set up a view that has this farm fingerprint function thrown on and then authorize the view to access your underlying source data. And we'll look at how to create those authorized views in just a few slides. And just as we introduced, here's the concept of authorized views, which help you limit that role level access and even column level access to your underlying source data. So BigQuery views, again, review are logical, meaning that the data underneath is not stored or materialized. What that means is that your query is going to be rerun every single time against your underlying source data. And one of the really cool things that you can do is invoke some BigQuery reserved keywords like session user, which you can then pull the email address of the person that's logged in, and assuming that you have your data structured in such a way where you have a column called allowed viewer or allowed group that you can then maps that email address on, you can set up some pretty neat role level filtering on your underlying data. We'll cover how to create those authorized views in just a minute. And one of the core principles about datasets in BigQuery is that permissions are granted at the dataset level, not at the individual table level. So two key areas to really focus on for access permission or at the project level, which includes BigQuery and your cloud storage buckets and everything else and then individually at the BigQuery dataset level. And you can access and share datasets by clicking on that triangle beneath each of your different datasets. Pull up a window that looks exactly like this. And then what you can do is you can add people but not just people, you could add authorized views from other different datasets to look into your source data and then get access to that. So a lot of different flexible ways depending upon what your access control use cases to limit and restrict access to your underlying data. So what we're going to look at right now is a quick demo of how to set up an authorized view to pre-filter that data between different user groups. So we talked a lot about authorized views and using those as a really good mechanism to limit the access control to your dataset, but let's look at a real example where you might see those in play. So we have our project ID. In this case, this is the auto-generated Qwiklabs project ID. It's going be different for you. In here, this is the final solution. We will show you how to create exactly what we just did. We have the goal of a table that has all the charities present, but we only want certain users, let's say users from New York, to only be able to see the charities that are from New York. We want the same underlying source dataset, but we want to restrict the amount of roles that are returned or even columns that are returned for users that are in that authorized group. So, to do that, let's start off with a few basics. What we need to do is since BigQuery controls data access at the dataset level, it's not at the data table level, this is where we first need to start. So the first thing that we need to do is make sure that we have source dataset. That is exactly what we want to begin sharing with folks. So, here, I'm just going to preview the data. If you remember the IRS EIN organizational charity names table, this is the exact same thing as that, except that I've just limit it to the EIN number, the charity name and state so you can use that as part of a filter. So it's got a source dataset, and it's in a project that I control. So we're just going to look at the dataset permissions so far. So, as project editors, project owners, this is my auto-generated ID, and then project viewers, and then we have something that's called qwiklabs, ny_charities, ny_charities_view. And what this is, this is an authorized view from a separate dataset. So let's take a look at what that actually means. So we've got all of our data, but we don't want everyone to see all of our roles. So what we need to do first is, again, because BigQuery handles permissions at that dataset level, and you can create as many datasets as you like within the project, create a brand new dataset. In this case, we're just calling it New York charities, and you can do that on the project level just by selecting the drop-down, creating new dataset. And then, within there, you want to create a brand new view. Now, we're going to create a view. You just type in the SQL query, and click save view. So it looks like it's got the same columns. But let's see what we're filtering the dataset on. So in the details for this view, scrolling down, we'll see the actual query. So it's pretty much selecting all the columns that are available, except for the notable distinction, that line 9 here has where state is filtering on New York states only. So if you're actually to run this query, you'll see the results. I'll just throw a limit 10. Now, you can only see the states that belong to New York. Now, this is well and good if you are the admin, and you have access to see both datasets, and you can query data as you like. But keep in mind that you can be having special users, individual users, or email addresses, or groups of users that you want actually to have external access to your datasets. So what do we need to do there? So the fundamental basics of the view means that it queries the underlying data source each time with that query that we specified. In this particular case, that's just the same query with the New York filter. And then what we can do, instead of sharing access to the dataset that is all the charities, this is the really fun and tricky part, you share access to the dataset that has the view inside of it. Now, how you do that is just click on the dataset name, much like you would normally, open the drop-down, click on the shared dataset. And then, here, I've just added an email address. This is a simple Gmail email address. You could add a Google group or whatever you see fit to have restricted access to that data, and then you save that. You just add in people here as you'd like, and then click Add. And then once you've done that, now, the really interesting part is you need to make sure that this view in this dataset, again, because permissions are walled off between those two datasets, needs to have access to the underlying source data itself. So here's the fun part. Clicking on all charities, now, we're not going to give that person's email address, that Gmail email address you saw, read access to this dataset. What we are going to give access to this dataset is what we call that authorized view. And that's exactly what this is. And this is the icon that you see right there. So what you can do, if he knows this little box down here, and this is individual people, right? If you click on that, you scroll all the way down to the bottom. You click on authorized view. Now, you know it's the button actually changes here on the right side. This is saying, okay, you've got your source dataset of all charities, choose another view to have access to your underlying source data. And, here, it's actually not within all charities dataset. It's the one with differing permissions, again, with the one that has a single e-mail address that's been added. And then here, we're going talk in our table ID. That is the view, and you can see it's already been created. It's already been added here. And then if that was new, we would see those changes. So let's review exactly what happened. It means if I was logged in as this user, this Gmail address user, I would be able to come in and then execute queries against this view only. I wouldn't even see this of the dataset because I don't have access to even see all charities dataset. And then because the view, and only the view, not underlying users, not underlying groups, but the authorized view has permission to query against the source dataset, then we're allowed to actually execute and run the query for the view, and then the users will be able to see the results of their queries much like what we showed you earlier with the filter that's attached. Okay. So let's review three of the key concepts. The first thing is permissions are at the dataset level. So you share datasets as such. Second, you can have multiple datasets within the same project ID, and that's how you can do neat things like creating authorized views as bridges between different datasets. And third, you can actually share datasets with groups, users, authorized views, or even entire domains as well. So you can get as granular as you would like. So there you have it. That's how to set up an authorized view, and your queries can get as complex as you like by limiting certain columns or even doing hash functions on those columns to limit the data that's returned. You can just edit these views to your heart's content, and then just make sure you give them underlying access to that all data that you see there. Back in this lines, and just for your reference, these are some of the predefined roles or pre-canned roles that are available as part of the BigQuery. You have your data viewer, your data editor, your data owner, the user, the job user, and then the overall admin. Then here's a list of things that those particular users can do. So you can get as granular as you would like. If you want users just to be able to view your data or users to actually own your data and be able to create tables but not actually create queries, or if you wanted to give full administrative access to all the different members. So you want to pick the most granular role possible or the most limited role possible when it comes to permissions for each user that you're adding to make sure that nobody has more permissions than they should. Now, another one of the key concepts that we want to cover is inherited permissions at the BigQuery dataset level from your overall Google Cloud projects, permissions, and roles. So, as you see all the on the left, what we call that primitive role is the viewer, the editor, and then the owner at the Google Cloud Project level. If you add new users to your overall Google Cloud Project, based on the role that you grant them to the overall Google Cloud Project, they'll automatically inherit a particular role down at BigQuery. So if you had another co-owner to your Google Cloud Project as a whole, it will also be an owner or an editor, as you see in that third row, for your BigQuery datasets, and same goes for viewer and editor. Now, if you didn't want to make an overall project viewers, editors, or owners to have access to your underlying datasets that are in BigQuery, you can then go into BigQuery, and then automatically change the inheritance of those permission levels from what their automatically defaulted out to something a little bit more granular. Now, data access is not something that you set up once and you just forget about. As you see in the last bullet point there, this is something that you need to monitor and audit periodically to make sure that users that join your project initially still need the same level of access that they did at the beginning, or if they're no longer with your project organization, you have an automatic or regular way of phasing out those users to make sure access doesn't fall into the wrong hands. Now, for the second bullet point, datasets in storage are cheap within BigQuery. So a lot of organizations do is you'll add main different datasets mirrored across each other in development, QA, or production environments, and then have a different matrix of permissions that are associated with each of those. So your analytics and development team can have full access to all three resources, but maybe your marketing team or some other teams only have access to certain tables within production or vice versa. As we covered a little bit earlier, your dataset users share the minimum required permissions to do their jobs. It's time for a quick recap. Setting up proper access controls to your data, this probably one of the most important topics in data analysis. As you've seen, Google Cloud platform provides you with a variety of roles at the project level that can then be inherited down to the BigQuery dataset level. It's ultimately up to you to determine which individuals and which groups should have access to which parts of your data. Be sure to set up regular access control audits, and look at those stack driver logs to spot any strange usage patterns. Lastly, as we mentioned before, consider using authorized views in tandem with a where clause filter on current user to limit row-level access to a particular table based on who's logged in.