Press "Enter" to skip to content

Curated SQL Posts

Deploying Custom Docker Images in Azure ML

Tsuyoshi Matsuzaki shows us how to deploy an Azure ML model via custom Docker image:

In my early post, I have showed you how to bring your own custom docker image in training with Azure Machine Learning.
On the contrary, here I’ll show you how to bring custom docker image in model deployment.

In Azure Machine Learning, the base docker image in deployment includes the inferencing assets, such as, Flask server, etc. So you should use AML compliant image for base image, even when you use your own custom docker image.
The list of these maintained AML images is available in https://github.com/Azure/AzureML-Containers .

Read on for an example.

Comments closed

Representing Dates in Power BI: Date or Integer?

Marco Russo and Alberto Ferrari share their take on a classic debate:

A question that is often asked during the design of a Power BI data model is whether it is better to use an Integer or a Datetime column to link a fact table with the Date dimension. Historically, using Integers has always been a better choice in database design. However, Tabular is an in-memory columnar database, and its architecture is quite different from the relational databases we might be used to working with.

Indeed, in Tabular there are no technical differences between using a Datetime or an Integer to create a relationship. The database size, the query speed, and any other technical detail are absolutely identical. Therefore, the choice is not related to technical aspects, but rather on the convenience of the design. Depending on the specific requirements of your model, you might favor one data type against the other. In the most common scenarios, a Datetime proves to be better because it provides more possibilities to compute values on dates without having to rely on relationships. With that said, if your model uses Integers and you do not need to perform calculations on the dates represented in the table, then you can choose the most convenient data type – that is, the one already used in the original data source.

The remaining part of the article aims to prove the previous sentences, and to provide you with the technical details about how we tested the respective performance of the two options.

Click through for Marco and Alberto’s analysis, noting that “date” here does not include time of day, so it would have the same cardinality as the integer date key. This was a more important thing fifteen years ago, before columnstore technologies (like columnstore indexes and VertiPaq) were readily available and that 4-byte integer was considerably smaller than an 8-byte DATETIME.

Comments closed

Finding and Removing Custom Roles, Schemas, and Users from a Database

Thomas Williams wants to go back to square one:

I’m a fan of the built-in database roles like db_datareader to standardise & simplify permissions (sorry Dr. Greg Low!) and recently I needed to do just that in a database created using SQL Server 2000, and remove old defaults and a lot of custom roles, schemas and users.

I wrote the set of queries below to generate scripts to remove non-built-in roles, schemas and users, when compared to the model database on a new SQL Server 2019 server.

After running the script generated by the queries, I added back users and gave appropriate roles (like db_datareader).

Read on for the script.

Comments closed

Deploying Datasets in Azure Analysis Services and Power BI PPU

Gilbert Quevauvilliers continues a series on migrating from Azure Analysis Services to Power BI Premium Per User:

Welcome to part 8, where in this blog post, I am going compare deploying datasets.

For those people who are not exactly sure what deployments are, what this means is when you are using Power BI Desktop and you click on Publish, you are effectively deploying your changes to the Power BI Service (Which could also be a server in the cloud).

In this blog post I will show the differences when completing a deployment from AAS and then PPU.

Read on to see several techniques for deploying for each technology.

Comments closed

Ways to avoid the MERGE Operator

Michael J. Swart has important bullet points:

Aaron Bertrand has a post called Use Caution with SQL Server’s MERGE Statement. It’s a pretty thorough compilation of all the problems and defects associated with the MERGE statement that folks have reported in the past. But it’s been a few years since that post and in the spirit of giving Microsoft the benefit of the doubt, I revisited each of the issues Aaron brought up.

Some of the items can be dismissed based on circumstances. I noticed that:

– Some of the issues are fixed in recent versions (2016+).

– Some of the issues that have been marked as won’t fix have been fixed anyway (the repro script associated with the issue no longer fails).

– Some of the items are complaints about confusing documentation.

– Some are complaints are about issues that are not limited to the MERGE statement (e.g. concurrency and constraint checks).

Spoilers: some + some + some + some is still a lot less than all. Read the whole thing.

Comments closed

An Overview of Amazon Athena

Aveek Das takes us through the basics of Amazon Athena:

Serverless. Since Amazon Athena is offered as a fully managed cloud service, customers do not need to take the pain of installing and maintaining separate infrastructures for this. You can start by logging into the AWS Web console and proceeded to Amazon Athena.

Pay Per Query. You only pay for queries you execute. This is very cost-effective, as you can easily figure out your monthly expenses based on your usage pattern. On average, users pay 5 USD for each terabyte of data scanned. This can be further optimized by creating partitions or compressing your dataset.

Interactive Performance. We do not need to worry about the resources that work behind the scenes. When a query is executed, Athena automatically runs the query in parallel across multiple resources, bringing the results faster.

Read on to see an example of Athena in action.

Comments closed

Azure Functions and (Lack of) F# Support

Jamie Dixon has a shaggy dog tale:

When Azure Functions first came out, F# had pretty good support – templates, the ability to run a .fsx file, cool examples written by Don… Fast forward to 2021 and we have bupkis. I recently wrote a dictionary of amino acid weights that would be perfect as a service: pass in a kmer length and weight, get all hits from amino acids.

I first attempted to create a function app, select a .fsx template, and write the code in my browser. Alas, only .csx scripting is available.

Not to be too cutesy about it, but it would be nice if the product which allows for the execution of functions in a cloud service would support the .NET language which most explicitly embraces the notion of functions. If you feel similarly, there is an open feedback ticket.

Comments closed

Fun with Hash Tables in Powershell

Robert Cain continues a series on Powershell data structures:

Hash tables are powerful, flexible items. In some languages, hash tables are also called dictionaries. In this post we’ll start with a few basics, then get into some fun things you can do with hash tables.

A quick note, some documentation calls them a hash table, others read hashtable, one word. For this post I’ll use the two word hash table, but it’s the same thing no matter what documentation or blog you read.

Click through for plenty of examples of what you can do with them.

Comments closed

Finding Bad (Worse?) NOLOCK Statements across Instances

Aaron Bertrand powers up for about six episodes straight, but the results are amazing:

In Part 1 of this series, I showed how to use a Visitor pattern to walk through one or more T-SQL statements to identify a problematic pattern where NOLOCK hints are applied to the target of an update or delete. The method in my initial examples was very manual, though, and won’t scale if this problem might be widespread. We need to be able to automate collecting a potentially large number of statements across an entire environment, and then try to eliminate false positives without manual intervention.

Read on to see how you can take what Aaron wrote last time and make it scalable.

Comments closed

Executing T-SQL with a Proxy Account

Tom Collins answers a question:

I have some t-sql code added to a job step on a SQL Server Agent job. The problem is I need to run the code as RUNAS . I though of executing the job with a proxy account – so progressed with the Credential & proxy set up.    But I still can’t view the Proxy\Credential in the RunAs list . Is there a way around this problem?

Read on to learn why and for the answer.

Comments closed