Press "Enter" to skip to content

Category: Security

Securing Spark Shuffle

Cheng Xu uses Apache Commons Crypto to secure data when Spark shuffles off to disk:

The basic steps can be described as follows:

  1. When a Spark job starts, it will generate encryption keys and store them in the current user’s credentials, which are shared with all executors.

  2. When shuffle happens, the shuffle writer will first compress the plaintext if compression is enabled. Spark will use the randomly generated Initial Vector (IV) and keys obtained from the credentials to encrypt the plaintext by using CryptoOutputStream from Crypto.

  3. CryptoOutputStream will encrypt the shuffle data and write it to the disk as it arrives. The first 16 bytes of the encrypted output file are preserved to store the initial vector.

  4. For the read path, the first 16 bytes are used to initialize the IV, which is provided to CryptoInputStreamalong with the user’s credentials. The decrypted data is then provided to Spark’s shuffle mechanism for further processing.

Once you have things optimized, the performance hit is surprisingly small.

Comments closed

Azure SQL Threat Detection

Ron Matchoro discusses use cases for Azure SQL Threat Detection:

Thanks to SQL Threat Detection, we were able to detect and fix code vulnerabilities to SQL injection attacks and prevent potential threats to our database. I was extremely impressed how simple it was to enable threat detection policy using the Azure portal, which required no modifications to our SQL client applications. A while after enabling SQL Threat Detection, we received an email notification about ‘An application error that may indicate a vulnerability to SQL injection attacks’.  The notification provided details of the suspicious activity and recommended concrete actions to further investigate and remediate the threat.  The alert helped me to track down the source my error and pointed me to the Microsoft documentation that thoroughly explained how to fix my code.  As the head of IT for an information technology and services company, I now guide my team to turn on SQL Auditing and Threat Detection on all our projects, because it gives us another layer of protection and is like having a free security expert on our team.”

Anything which helps kill SQL injection for good makes me happy.

Comments closed

Audit Select Statements

Jason Brimhall shows how to build an extended event session which audits all SELECT statements:

I have to be a little honest here. Prior to somebody asking how they could possibly achieve a statement audit via extended events, I had not considered it as a tool for the job. I would have relied on Audit (which is Extended Event related), or some home grown set of triggers. In this particular request, Audit was not fulfilling the want and custom triggers was not an option. Another option might have included the purchase of third party software but there are times when budget does not allow for nice expensive shiny software.

So, with a little prodding, I hopped into the metadata and poked around a bit to see what I could come up with to achieve this low-budget audit solution.

Read the whole thing.

Comments closed

Managing Power BI Group Workspace Members

Melissa Coates shows how to mange Power BI groups with larger numbers of members:

Dozens or hundreds of users in a group is what is prompting me to write this post. Manually managing the members within the Power BI workspace is just fine for groups with a very small number of members – for instance, your team of 8 people can be managed easily. However, there are concerns with managing members of a large group for the following reasons:

  • Manual Maintenance. The additional administrative effort of managing a high number of users is a concern.
  • Risk of Error. Let’s say there is an Active Directory (A/D) group that already exists with all salespersons add to the group. System admins are quite accustomed to centrally managing user permissions via A/D groups. Errors and inconsistencies will undoubtedly result when changes in A/D are coordinated with other applications, but not replicated to the Power BI Group.Depending on how sensitive the data is, your auditors will also be unhappy.

To avoid the above two main concerns, I came up with an idea. It didn’t work unfortunately, but I’m sharing what I learned with you anyway to save you some time.

Even though Melissa’s plan didn’t work, it’s a good concept, so I recommend reading.

Comments closed

Failed Logins

Kevin Hill discusses failed logins:

We’ve all seen them.

Login failed for user ‘MyDomain\Bob’ (password issue)
Login failed for user ‘MyDomain\Nancy’ (default database issue)
Login failed for user ‘blah, blah, blah…’

But what about Login Failed for user ‘Insert Chinese characters here’, Reason, An attempt to logon using SQL Authentication failed.

Wait…nobody in the company has a username with Chinese characters.   And we don’t have SQL Authentication turned on….

I generally agree with Kevin’s assessment, but have one big point of contention:  he recommends turning off successful login logging.  I think that’s not a great thing to do, particularly for a company with a mature security team.  Think about this scenario:  if you see four or five failed login attempts for sa, and you don’t use sa in your environment, you know somebody’s trying something sneaky.  If you see four or five failed login attempts for sa and then a successful login attempt for sa, you know they succeeded.  If you don’t log successful login attempts, you lose that critical piece of information.

Comments closed

Dropping Masking From A Column

Steve Jones shows how to drop Dynamic Data Masking from a column:

This is a quick one. As I experimented with Dynamic Data Masking for the Stairway to Dynamic Data Masking, and writing my Using SQL Compare with Dynamic Data Masking, I needed to remove masking from a column. I didn’t want to rebuild tables, and hoped there was an easy way to ALTER a column.

There is.

The more I’ve seen of DDM, the less I like it.  So I’m more a fan of scripts to remove it than scripts to add it…

Comments closed

Access Control Basics

Robert Sheldon gives an introductory-level overview of the basics of logins, users, roles, and permissions:

You can think of a role as a type of container for holding one or more logins, users, or other roles, similar to how a Windows group can hold multiple individual and group accounts. This can make managing multiple principals easier when those principals require the same type of access to SQL Server. You can configure each role with permissions to specific resources, adding or removing logins and users from these roles as needed.

SQL Server supports three types of roles: server, database, and application. Server roles share the same scope as logins, which means they operate at the server level and pertain to the database engine as a whole. As a result, you can add only server-level principals to the roles, and you can configure the roles with permissions only to server-level securables, not database-level securables.

These help form the foundation of a secure instance, so it’s vital to understand these concepts.

Comments closed

vNet Peering Within An Azure Region

Denny Cherry reports that there is a public preview of a feature to allow vNet peering without setting up a site-to-site VPN connection:

Up until August 1st if you had 2 vNets in the same Azure region (USWest for example) you needed to create a site to site VPN between them in order for the VMs within each vNet to be able to see each other.  I’m happy to report that this is no longer the case (it is still the default configuration).  On August 1st, 2016 Microsoft released a new version of the Azure portal which allows you to enable vNet peering between vNets within an account.

Now this feature is in public preview (aka. Beta) so you have to turn it on, which is done through Azure PowerShell. Thankfully it uses the Register-AzureRmProviderFeature cmdlet so you don’t need to have the newest Azure PowerShell installed, just something fairly recent (I have 1.0.7 installed). To enable the feature just request to be included in the beta like so (don’t forget to login with add-AzureRmAccount and then select-AzureRmSubscription).

Read the whole thing for details on how to enroll in this feature and how to set it up.

Comments closed

Impersonation

Kenneth Fisher shows how to use impersonation to perform tasks without being explicitly granted permissions:

A developer wants to be able to truncate a table.

This isn’t an unreasonable request right? She’s writing a piece of code that loads a bunch of data into a staging table. She want’s to be able to truncate the table once the load is over. Unfortunately the permission required to do this is ALTER on the table. That’s not just going to let her truncate the table, it’s going to let her change the structure of the table. Not acceptable in a production environment. A development environment sure. Not a production one. So what do we do?

We use impersonation.

Check out the post to see how to do this.

Comments closed

Securing The Data Plane

Michael Schiebel gives an overview of security architecture inside a data lake:

Existing platform based Hadoop architectures make several implicit assumptions on how users interact with the platform such as developmental research versus production applications.  While this was perfectly good in a research mode, as we move to a modern data application architecture we need to bring back modern application concepts to the Hadoop ecosystem.  For example, existing Hadoop architectures tightly couple the user interface with the source of data.  This is done for good reasons that apply in a data discovery research context, but cause significant issues in developing and maintaining a production application.  We see this in some of the popular user interfaces such as Kibana, Banana, Grafana, etc.  Each user interface is directly tied to a specific type of data lake and imposes schema choices on that data.

Read the whole thing.  Also, “Securing the data plane” sounds like a terrible ’90s action film.

Comments closed