Press "Enter" to skip to content

Category: Security

Using Azure Key Vault with Azure Databricks

Jason Bonello shows how easy it is to integrate Azure Key Vault into Azure Databricks:

In Azure Key Vault we will be adding secrets that we will be calling through Azure Databricks within the notebooks. First and foremost, this is for security purposes. It will ensure usernames and passwords are not hardcoded within the notebook cells and offer some type of control over access in case it needs to be reverted later on (assuming it is controlled by a different administrator). In addition to this, it will offer a better way of maintaining a solution, since if a password ever needs to be changed, it will only be changed in the Azure Key Vault without the need to go through any notebooks or logic.

If you don’t use Key Vault, Databricks does include its own secrets storage, so there’s really no reason to keep them in plaintext.

1 Comment

Authentication in Hadoop with Apache Ozone

Xiaoyu Yao explains how we can use Apache Ozone to perform service account authentication for a Hadoop cluster:

Like Hadoop delegation tokens, Ozone security token has a token identifier along with a signed signature from the issuer. Ozone manager issues delegation token and block tokens for users or client applications authenticated with Kerberos. The signature of the token can be validated by token validators to verify the identity of the issuer. This way, a valid token holder can use the token to perform operations against the cluster services as if they have Kerberos tickets of the issuer. 

Read on for the high-level overview.

Comments closed

Network Security Changes Around Azure SQL DB

Rohit Nayak announces some changes to Azure SQL Database’s connectivity and network security:

Now in general availability, Private Link enables users to have private connectivity from a Microsoft Azure Virtual Network to Azure SQL Database.

This feature creates a private endpoint which maps a private IP Address from the Virtual Network to your Azure SQL Database.

From security perspective, Private Link provides you with data exfiltration protection on the login path to SQL Database. Additionally, it does not require adding of any IP addresses to the firewall on Azure SQL Database or changing the connection string of your application.

Private Link is built on best of class Software Defined Networking (SDN) functionality from the Azure Networking team. Clients can connect to the Private endpoint from within the same Virtual Network, peered Virtual Networking the same region, or via VNet-to-VNet connection across regions. Additionally, clients can connect from on-premises using ExpressRoute, private peering, or VPN tunneling. More information can be found here

Click through to see what else they’ve been working on.

Comments closed

Power BI Security Features

James Serra takes us through different ways to secure your Power BI dashboards and reports:

Row-Level Security: With Row-level security (RLS) you are given the ability to publish a single report to your users but expose the data differently to each person. So instead of creating multiple copies of the same report in order to limit the data, you can just create one report that will only show the data the logged in user is allowed to see. This is done with filters, which restrict data access at the row level, and you define filters within roles. For example, creating a role called “United States” that filters the data in a table where the Region = “United States”. You then add members (user, security group, or distribution list) who can only see data for the United States to the “United States” role (the assignment of members can only be done within the Power BI Service). If a user should not have access to a report, then just don’t include that person in any of the roles for that report, so they would always see a blank report.

Click through for several more options and links to additional resources.

Comments closed

Sqoop Scheduling and Security

Jon Moirsi continues a series on Sqoop:

In previous articles, I’ve walk through using Sqoop to import data to HDFS.  I’ve also detailed how to perform full and incremental imports to Hive external and Hive managed tables.

In this article I’m going to show you how to automate execution of Sqoop jobs via Cron.

However, before we get to scheduling we need to address security.  In prior examples I’ve used -P to prompt the user for login credentials interactively.  With a scheduled job, this isn’t going to work.  Fortunately Sqoop provides us with the “password-alias” arg which allows us to pass in passwords stored in a protected keystore.

That particular keystore tie-in works quite smoothly in my experience.

Comments closed

Secure Azure Data Source Access from Databricks

Bhavin Kukadia, Abhinav Garg, and Michal Marusan show us the right way to access Azure data sources from Azure Databricks:

Enterprise Security is a core tenet of building software at both Databricks and Microsoft, and thus it’s considered as a first-class citizen in Azure Databricks. In the context of this blog, secure connectivity refers to ensuring that traffic from Azure Databricks to Azure data services remains on the Azure network backbone, with the inherent ability to whitelist Azure Databricks as an allowed source. As a security best practice, we recommend a couple of options which customers could use to establish such a data access mechanism to Azure Data services like Azure Blob StorageAzure Data Lake Store Gen2Azure Synapse Data WarehouseAzure CosmosDB etc. Please read further for a discussion on Azure Private Link and Service Endpoints.

This is more about network configuration rather than things like “store your credentials and other secrets in Azure Key Vault,” which is also a good idea.

Comments closed

Security Changes in ML Services

Dennes Torres goes over some of the security changes with Machine Learning Services in SQL Server 2019:

I have a confession to make. Why, in my last article about shortest_path in SQL Server 2019, have I used Gephi in order to illustrate the relationships, instead of using a script in for the same purpose and demonstrate Machine Learning Services as well?

The initial plan was to use an R script; however, the R script which works perfectly in SQL Server 2017 doesn’t work in SQL Server 2019.

The change is a positive one from the standpoint of security, but it also makes life more difficult. I found this particularly tricky when installing TensorFlow and Keras in R via ML Services.

Comments closed

SQL Server 2019 and sys.syslogins Changes

Taryn Pratt goes into a change in the sys.syslogins system view in SQL Server 2019:

Sigh ok, something is really broken because this was working before we failed over.

The code for the login replication basically does the following via a cursor (yeah, I know, but it works…normally):

1. Select from the primary via OPENQUERY to query the logins and passwords
2. Using sp_hexadecimal convert the varbinary password to a string value
3. Create a string to be executed, i.e. dynamic SQL that runs a CREATE LOGIN

Read on for the whole story and how you can protect yourself as you upgrade to SQL Server 2019.

Comments closed

Tying a Database User Back to a Login

Dave Bland shows how to figure out which database users tie back to which logins:

Over the years I have had to provide information about logins and database users, most of the time per a request of an auditor.  Many times this is very easy to accomplish because the login name matches the name of the database user account. If you look at the “New User” screen you can see that I am able to enter a different User Name.

Because of this, I can have a User Name that doesn’t match the Login Name.  From an audit perspective this can create some confusion.  More importantly, it can make it difficult to provide accurate information to the auditors when asked.

Read on to see how you can tie these together even if the names don’t match.

Comments closed