Press "Enter" to skip to content

Category: Cloud

Granular Permissions for Dynamic Data Masking

John Martin reviews a change:

All the way back with SQL Server 2016, Microsoft released the Dynamic Data Masking feature in the database engine. It seemed like a huge step forward and promised so much, but there were severe limitations around the way that we could control who sees what masked data. It was a case of either you got to see masked data wherever it was configured, or you saw clear data; there was no granularity. I wrote about this and a few other things to do with Dynamic Data Masking all the way back in August of 2016, when I was at SentryOne. You can check that post out here. Also, back then I created several Connect items (blast from the past there), one of which was pulled over to the UserVoice replacement, where I was asking for the UNMASK securable to be made more granular; you can check that out here.

So, why am I writing this post? Well, it seems that our (my?) request has been granted. At least in Azure SQL Database. On March the 17th this year, a little announcement slipped out stating “General availability: Dynamic data masking granular permissions for Azure SQL and Azure Synapse Analytics”. So, has this delivered on what we wanted, to really help this feature live up to its promise?

Read on to see how it works and what John thinks of the whole thing.
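As a taste of the granularity, here is a minimal sketch of the grant syntax now available; the schema, table, column, and user names are all hypothetical.

```sql
-- Hypothetical masked column plus the new granular UNMASK grants.
ALTER TABLE Sales.Customers
    ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

-- The old, all-or-nothing grant at the database level still works.
GRANT UNMASK TO SupportManager;

-- New granularity: an entire schema, a table, or a single column.
GRANT UNMASK ON SCHEMA::Sales TO SupportManager;
GRANT UNMASK ON Sales.Customers TO SupportAgent;
GRANT UNMASK ON Sales.Customers(Email) TO SupportAgent;
```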

Comments closed

Azure ML and Azure SQL DB

I remembered that I had another blog and actually wrote something technical:

Not too long ago, I worked through an interesting issue with Azure Machine Learning. The question was, what’s the best way to read from Azure SQL Database, perform model processing, and then write results out to Azure SQL Database? Oh, by the way, I want to use a service principal rather than SQL authentication. Here’s what I’ve got.

This turned out to be a lot more work than I first expected.
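One piece of the puzzle, for example, sits on the database side: before a service principal can connect, Azure SQL Database needs a contained user created from that Azure AD identity. A rough sketch, with a made-up principal name:

```sql
-- Run against the target Azure SQL Database as an Azure AD admin.
-- 'aml-service-principal' is a placeholder for the app registration's display name.
CREATE USER [aml-service-principal] FROM EXTERNAL PROVIDER;

-- Enough to read source tables and write scored results back.
ALTER ROLE db_datareader ADD MEMBER [aml-service-principal];
ALTER ROLE db_datawriter ADD MEMBER [aml-service-principal];
```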

Comments closed

Building a Friendship Lamp

Drew Furgiuele is looking for mood lighting tips:

It did get me thinking, though: what if I could take this idea and change it up a bit to where people could send me messages WITHOUT the need for them to have a lamp (and thereby give them plausible deniability of being, in fact, my friend). How would that work? In absence of a lamp, would a web application work? And what if we could let people pick a color in lieu of an actual message? You could send a whole mood!

And just like that, my motivation was restored. Time to get to work.

Click through for the build process, which includes 3D printing components, wiring and soldering to circuit boards, writing software for the IoT device, building the front-end web app, and more. Also, I sent red but now I’m not sure if I regret that color choice based on re-reading the first paragraph above.

Comments closed

Loading Azure Synapse Analytics using PolyBase

Gauri Mahajan needs to load some data:

Azure Synapse Analytics is Microsoft’s data warehousing offering on Azure Cloud. It supports three types of runtimes – SQL Serverless Pool, SQL Dedicated Pool, and Spark Pools. As there is a wide variety of data sources on Azure, there can be varying types and volumes of data that have to be loaded into Azure Synapse pools. There are three major data ingestion approaches that can be used to load data into Synapse. The COPY command is the most flexible and elaborate mechanism; you execute it from a SQL pool to load data from supported data repositories, and it is convenient for ad hoc and small to medium-sized loads. The second method is Bulk Insert, whose name is self-explanatory. To ingest data from supported repositories into dedicated SQL pools, PolyBase is as efficient as, and at times even more efficient than, the COPY command. This article walks through the process of ingesting data into Azure Synapse Analytics using PolyBase.

Click through for the process.
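If you want the rough shape of the PolyBase pattern before clicking through, here is a sketch against a dedicated SQL pool; the storage account, credential, and table names are all placeholders, not the exact steps from the article.

```sql
-- Sketch of a PolyBase load into a dedicated SQL pool; every name here is a placeholder.

-- A database master key is required before creating a scoped credential.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

CREATE DATABASE SCOPED CREDENTIAL StagingCredential
WITH IDENTITY = 'user',   -- identity name is not used for storage key auth
     SECRET = '<storage-account-access-key>';

CREATE EXTERNAL DATA SOURCE StagingStorage
WITH (TYPE = HADOOP,
      LOCATION = 'wasbs://staging@mystorageaccount.blob.core.windows.net',
      CREDENTIAL = StagingCredential);

CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2));

-- External table over the staged files, then a CTAS to land the data physically.
CREATE EXTERNAL TABLE stg.SalesExternal
(
    SaleID     INT,
    SaleAmount DECIMAL(18, 2),
    SaleDate   DATE
)
WITH (LOCATION = '/sales/', DATA_SOURCE = StagingStorage, FILE_FORMAT = CsvFormat);

CREATE TABLE dbo.Sales
WITH (DISTRIBUTION = HASH(SaleID), CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM stg.SalesExternal;
```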

Comments closed

Continuous Backup with Cosmos DB

Hasan Savran reviews a new bit of functionality in Cosmos DB:

Azure Cosmos DB announced Continuous Backup in March 2021. This feature is currently in public preview mode and is not recommended for use in production. It gives you more options for your backup requirements. You might be using Azure Data Factory to handle your custom backup needs; Azure Data Factory is the SSIS of the cloud, and ETL jobs can be problematic. Backing up a database is half of the problem; the other half is restoring a database. Until now, we had to call Microsoft to restore Cosmos DB databases/accounts.

By using Continuous Backup, you can easily back up and restore your database. For now, this option is available only for the SQL API and Mongo API. There are many limitations in this public preview version. I am sure many of these limitations will go away when it becomes generally available to everybody.

Click through for more details about the offering, as well as how to enable it. We’ll have to wait until it’s out of public preview to see how much it will cost, but it does look interesting.

Comments closed

Tracking Azure Resources with Tags

Jess Pomfret explains the value of tags:

One of the vital parts of this learning and experimenting needs to be cleaning up after myself. We all know the risks of leaving things running in Azure – it’s likely to drain your training budget pretty quickly. To be fair, this is also a good lesson for real-world scenarios. Getting used to turning off or scaling down resources based on need is a good way to reduce your Azure spend.

This brings me to one morning last week. I logged in to the portal and got a pop up that my credit was down to under $5, which is not what I was expecting. I started looking around and wondering what I’d left running – it isn’t always easy to spot though.

Read on to see how tags can help with this, as well as other forms of cloud governance. If you remember to set them, that is.

Comments closed

Early Thoughts on Dremio

Meagan Longoria gives us a review of Dremio:

I’ve been working on a project for the last few months with a client who has chosen to implement Dremio in Azure. Dremio is a data lake engine that creates a semantic layer and supports interactive queries.

It uses Apache Arrow, Gandiva, and Parquet files under the hood. It runs on either Linux VMs or Kubernetes containers. Like most big data systems, there is at least one coordinator node and one or more executor nodes. These nodes communicate and are managed using Apache Zookeeper. Client applications connect to Dremio via ODBC, JDBC, REST APIs, or Arrow Flight. Dremio can read from storage accounts, external databases, and a few other sources.

Read on for good and bad aspects of the product.

Comments closed

Creating Parquet Files from SQL Server Data

Andy Leonard answers a challenge:

I searched and found some promising Parquet SSIS components available from CData Software and passed that information along. I shared my inexperience in exporting to parquet format and asked a few friends how they’d done it.

I thought: How many times have I demonstrated Azure Data Factory and clicked right past file format selection without giving Parquet a second thought? Too many times. It was time to change that.

Another route is to use PolyBase. If you’re okay with writing the results to Azure Blob Storage, you can insert the results of a SQL query directly into Parquet files. If that sounds interesting, here are posts on connecting to Azure Blob Storage via PolyBase and inserting into Azure Blob Storage. I insert in CSV format to make it easier for people to follow, but swap the file format to Parquet and it works all the same.
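For a sense of what the Parquet swap looks like, here is a sketch for SQL Server 2019 with PolyBase installed and configured for Azure Blob Storage connectivity; the storage account, container, and table names are placeholders.

```sql
-- Sketch of exporting query results to Parquet files in Azure Blob Storage via PolyBase.
-- All names are placeholders; assumes PolyBase is installed and configured.

-- PolyBase export (INSERT into an external table) is off by default.
EXEC sp_configure 'allow polybase export', 1;
RECONFIGURE;

-- Master key, credential, and external data source for the target container.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

CREATE DATABASE SCOPED CREDENTIAL ExportCredential
WITH IDENTITY = 'user', SECRET = '<storage-account-access-key>';

CREATE EXTERNAL DATA SOURCE ExportStorage
WITH (TYPE = HADOOP,
      LOCATION = 'wasbs://exports@mystorageaccount.blob.core.windows.net',
      CREDENTIAL = ExportCredential);

-- The only real change from the CSV demos: a Parquet file format.
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);

-- The external table defines the output location; the INSERT writes the files.
CREATE EXTERNAL TABLE dbo.SalesExport
(
    SaleID     INT,
    SaleAmount DECIMAL(18, 2),
    SaleDate   DATE
)
WITH (LOCATION = '/sales/', DATA_SOURCE = ExportStorage, FILE_FORMAT = ParquetFormat);

INSERT INTO dbo.SalesExport
SELECT SaleID, SaleAmount, SaleDate
FROM dbo.Sales;
```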

Comments closed

Deploying Azure Data Services via Terraform

Chris Adkin has started a series on deploying Azure Arc enabled Data Services. Part 1 serves as an introduction:

One of the most significant things to change the landscape for Azure data professionals will be the general release of Azure Arc enabled Data Services. To provide an expedient means of experiencing all that Azure Arc has to offer, Microsoft has come up with Jumpstart – a collection of GitHub repos for deploying Arc in different scenarios. Last Christmas I had a few vacation days and took the opportunity to try out Jumpstart for Azure Arc enabled data services on AWS. AWS was my choice because it made a certain amount of sense to try out Azure SQL Managed Instances and Postgres Hyperscale on a cloud where they are not natively available. After all, the whole point of Azure Arc enabled Data Services is to bring Azure to you on your terms if for any reason you cannot use the Azure cloud.

Part 2 gives us an introduction to Terraform:

Before diving into what the various Terraform modules that make up the Arc-PX-VMware-Faststart repo do, I’m going to provide an introduction to Terraform in this blog post. Terraform comes from HashiCorp; it is a tool that works on the principle of infrastructure-as-code. Resources are specified in what are called configuration files, written in the HashiCorp Configuration Language (HCL) in a declarative manner, i.e. you state what you want and, to the best of its ability, Terraform attempts to create those resources for you. ‘Providers’ are used to create resources for particular types of entity; for example, you might use the local file, Helm (the Kubernetes package manager), Azure, or VMware providers. Using providers requires plugins, most of which are provided by HashiCorp, but third parties can also write their own.

Check out the first two posts in what promises to be an interesting series.

2 Comments

Bad Request when Debugging an Azure Data Factory Pipeline

Ed Elliott ran into a problem:

Now, whenever I am troubleshooting something in Azure and I come to the activity logs, I am always hopeful but also always disappointed that they don’t show more details. The bit that really annoys me is that I know Microsoft sees more detailed error information, as I have been screen sharing with a support tech who used log explorer to see more detailed error messages than I see – grrrr, just show us the data! Anyway, I digress – so in the activity log, does it give a clue as to what is wrong?

No, in a word no it doesn’t. 

Read on for the conclusion, which rates as “Should have been an easy fix but the error message was completely unhelpful.”

Comments closed