Talking To Secure Hadoop Clusters

Mubashir Kazia shows how to connect to a secured Hadoop cluster using Active Directory:

The primary form of strong authentication used on a secure cluster is Kerberos. Kerberos supports credentials delegation where a server process to which a user has authenticated, can perform actions on behalf of the user. This involves the server process accessing databases or other web services as the authenticated user. Historically the form of delegation that was supported by Kerberos is now called “full delegation”. In this type of delegation, the Ticket Granting Ticket (TGT) of the user is made available to the server process and server can then authenticate to any service where the user has been granted authorization. Until recently most Kerberos Key Distribution Center(KDC)s other than Active Directory supported only this form of delegation. Also Java until Java 7 supported only this form of delegation. Starting with Java 8, Java now supports Kerberos constrained delegation (S4U2Proxy), where if the KDC supports it, it is possible to specify which particular services the server process can be delegated access to.

Hadoop within its security framework has implemented impersonation or proxy support that is independent of Kerberos delegation. With Hadoop impersonation support you can assign certain accounts proxy privileges where the proxy accounts can access Hadoop resources or run jobs on behalf of other users. We can restrict proxy privileges granted to a proxy account to act on behalf of only certain users who are members of certain groups and/or only for connections originating from certain hosts. However we can’t restrict the proxy privileges to only certain services within the cluster.

What we are discussing in this article is how to setup Kerberos constrained delegation and access a secure cluster. The example here involves Apache Tomcat, however you can easily extend this to other Java Application Servers.

This is a good article showing specific details on using Kerberos in applications connecting to Hadoop.

One CLR Solution

Solomon Rutzky continues his SQL Server 2017 CLR security series:

This new requirement prevents the technique described towards the end of Part 1 from working. That technique uses a SAFE Assembly as an indirect means of creating the Asymmetric Key to create the Login from. That worked perfectly prior to SQL Server 2017, but now even SAFE Assemblies require that the signature-based Login be created first, which now puts us in a whole chicken-egg paradox.

Before proceeding to the solution, it should be noted that yes, Microsoft has, as of RC2 (released on 2017-08-02), provided a kinda/sorta “fix” for this that allows for creating an Assembly without having the signature-based Login. HOWEVER, that “fix” is absolutely horrible, convoluted, and unnecessary. It should not be used by anyone. Ever! In fact, it should be completely removed and forgotten about. In no uncertain terms: it is not an option! To help clarify, I am being intentionally vague about that new feature here (and in Part 1) so as not to distract from these two solutions (this post and Part 3) that do not promote bad practices; it will be covered starting in Part 4.

Solomon outlines one approach to dealing with CLR security changes, though it’s a bit lengthy.

Thoughts On CLR Strict Security

Solomon Rutzky has started a series on CLR in SQL Server 2017 and lays down a gauntlet:

What all of that means is that, assuming clr strict security is “1” (i.e. enabled), and TRUSTWORTHY is “OFF” for the Database in which an Assembly is being created, then in order to create any Assembly you first need to:

  1. Sign the Assembly with a strong-name key or a certificate
  2. Create an Asymmetric Key or Certificate in master from whatever you signed the Assembly with
  3. Create a Login based on that Asymmetric Key or Certificate
  4. Grant that Login the UNSAFE ASSEMBLY permission

Is that really so bad? Aren’t many of us (hopefully!) already doing that?

Solomon’s not very happy with the way that CLR security works in 2017, but he does have solutions of his own in mind.

Row-Level Security In Power BI

Paul Turley has a video showing how to use row-level security with Power BI:

The best method to implement row-level security in a published Power BI model or SSAS Tabular model consumed from the Power BI service will depend on will depend on your data and requirements.  The method I demonstrate here is one of the most flexible approaches and one that I commonly use in my projects.

Click through to watch the video.

Indexed Views And Row-Level Security

Louis Davidson wants to figure out how indexed views work when you have row-level security enabled:

I was giving my Row Level Security session last weekend for the Richmond User Group (@RichmondSQL and a question came up (ok, so I technically asked the question). How does the feature work with indexed views?

Generally speaking, the answer turned out to be obvious. You can apply a RLS Security Policy to a view that has an index on it, but not to a table that is referenced by an indexed view.

Read on for Louis’s demonstration.

Inference Attacks

Phil Factor explains that your technique for pseudonymizing data doesn’t necessarily anonymize the data:

It is possible to mine data for hidden gems of information by looking at significant patterns of data. Unfortunately, this sometimes means that published datasets can reveal sensitive data when the publisher didn’t intend it, or even when they tried to prevent it by suppressing any part of the data that could enable individuals to be identified

Using creative querying, linking tables in ways that weren’t originally envisaged, as well as using well-known and documented analytical techniques, it’s often possible to infer the values of ‘suppressed’ data from the values provided in other, non-suppressed data. One man’s data mining is another man’s data inference attack.

Read the whole thing.  One big problem with trying to anonymize data is that you don’t know how much the attacker knows.  Especially with outliers or smaller samples, you might be able to glean interesting information with a series of queries.  Even if the application only returns aggregated results for some N, you can often put together a set of queries where you slice the population different ways until you get hidden details on individual.  Phil covers these types of inference attacks.

Whitelisting SQL CLR Assemblies

Niels Berglund walks through the process of whitelisting a CLR assembly in SQL Server 2017:

What Microsoft introduces in SQL Server 2017 RC1, is something I refer to as whitelisting. It is somewhat similar to the TRUSTWORTHY setting, where you indicate that a database is to be trusted. But instead of doing it on the database level, you do it per assembly.

To whitelist in SQL Server 2017 RC1, you use the system stored procedure sys.sp_add_trusted_assembly. As the name implies the procedure adds an assembly to a list of “trusted” assemblies. By marking an assembly as trusted, SQL Server will allow it to be loaded when clr strict security is on (on by default), even if:

  • the assembly is not signed, and

  • the database where you want to deploy it to is not TRUSTWORTHY.

With the elimination of the CAS model finally hitting CLR, this is probably going to be one of the easier ways for DBAs to move forward with CLR in the future.

Certificate Copying

Brian Carrig shows how to create certificates from binary:

Sometimes it is necessary to copy a certificate from one database to another database. The most common method I have seen to do this is involves taking a backup of the certificate to disk from one database and then restoring the certificate to the other database.

There is however, a lesser known alternative option available, provided you are working with SQL Server 2012 and above. Sadly despite it being 2017, this is not as foregone a conclusion for SQL Server DBAs as it should be. This alternate option is known as CREATE CERTIFICATE FROM BINARY. There are a few caveats with this option. Chief among them is that you cannot use a variable for the binary value, so you will likely end up needing to use some dynamic SQL.

One of the nice aspects to this feature from an administration and a security perspective is that you do not need to worry about accidentally leaving a copy of your certificate on a disk somewhere or having to remember to delete it after you have imported it into your user database.

Read on to see it in action.  Also, it’s about time that Brian started blogging.

TDE + AG = Higher CPU Utilization

Ginger Keys has an analysis stress testing CPU load when Transparent Data Encryption is on and a database is in an Availability Group:

Microsoft says that turning on TDE (Transparent Data Encryption) for a database will result in a 2-4% performance penalty, which is actually not too bad given the benefits of having your data more secure. There is even more of a performance hit when enabling cell level or column level encryption. When encrypting any of your databases, keep in mind that the tempdb database will also be encrypted. This could have a performance impact on your other non-encrypted databases on the same instance.

In a previous post I demonstrated how to add an encrypted database to an AlwaysOn group in SQL2016. In this article I will demonstrate the performance effects of having an encrypted database in your AlwaysOn Group compared to the same database not-encrypted.

The results aren’t surprising, though the magnitude of the results might be.

Azure SQL Database Security Basics

Arun Sirpal explains some of the security features exposed in Azure SQL Database:

You do not really have to use all of them, this is down to you and your requirements but at least you have decisions to make. I will mention TDE (Transparent Data Encryption), I know a lot of people will opt for TDE in Azure. The big advantage of TDE in Azure over the earthed flavour is that Microsoft does a lot of the work for you, especially around the key management side of things. Also assuming your database is in a geo-replication partnership it will be protected by a different key on each of the servers. Microsoft will also rotate your certificate at least every 90 days, doing this with a local based SQL Server can be quite manual and fiddly (well I think it is).

Read the whole thing if you’re thinking of moving forward with Azure SQL Database, or if you already have a database up in Azure and haven’t checked the latest offerings yet.


August 2017
« Jul