Curated SQL Posts

Installing ML Services on SQL Server 2022

Tomaz Kastrun notices a change to the SQL Server installer:

Machine Learning Services and language extensions is available under Database Engine Services, and if you want to use any of these languages, check this feature. During the installation process, the R, Python or Java will not be installed (nor asked for permissions), but you will install your own runtime after the installation. This will bring you more convenience with the installation of different R/Python/Java runtimes.

Read on to see how you can install and work with languages like R, Python, and Java in SQL Server 2022.

The Purpose of Data Encryption

Matthew McGiffen thinks through the benefits of encryption:

On the face of it, this is a very obvious question with a very obvious answer. We want to prevent data from falling into the wrong hands. In practice, it gets a little more complicated.

Exactly what types of attacks do you wish to be protected against? It’s good if we make sure our data is encrypted where it is stored on the disk, but that doesn’t help us if an attacker gains direct access to write queries against the database. We might encrypt data held in columns, but does that still protect us if the unencrypted data is being passed back across the network to our application and an attacker is intercepting our network traffic?

I did a Ctrl-F for “compliance” and didn’t find anything, nor anything about checking boxes to keep regulators off our backs. It seems Matthew is going for the good answers here.

Solving Common CALCULATE Filter Argument Errors

Marco Russo and Alberto Ferrari catalog some errors:

The expression contains columns from multiple tables, but only columns from a single table can be used in a True/False expression that is used as a table filter expression.

This error is seen when the predicate includes column references from more than one table. For example, if we need a measure that returns the sales made to customers living in the same country as the store, we could try to write the following measure:

Read on for several examples and solid guidance on how to resolve these common issues.

Roll Your Own Row-Level Security for the Serverless SQL Pool

Randheer Parmar wants row-level security:

Row Level Security is a very key requirement for most database or data lake applications. Most of the databases are having natively build row-level security but Synapse serverless SQL pool doesn’t support this inbuilt functionality. In this article, we will see how to implement it.

Row-level security has always seemed like a great idea to me, but not one I can implement, because its performance cost is always too high.
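
To make the general idea concrete, here is a minimal Python sketch of roll-your-own row filtering: a mapping from user to permitted regions, applied to every read. It is not Randheer's implementation and does not use any Synapse API; the `SalesRow` type, the `USER_REGION_ACCESS` mapping, and the e-mail addresses are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesRow:
    """Hypothetical fact row; only the region matters for filtering."""
    region: str
    amount: float


# Hypothetical access mapping: which regions each login may read.
USER_REGION_ACCESS = {
    "alice@example.com": {"EMEA"},
    "bob@example.com": {"EMEA", "APAC"},
}


def rows_visible_to(user, rows, access=USER_REGION_ACCESS):
    """Return only the rows whose region the given user is allowed to see.

    Users missing from the mapping see nothing (deny by default).
    """
    allowed = access.get(user, set())
    return [row for row in rows if row.region in allowed]


SAMPLE_ROWS = [
    SalesRow("EMEA", 100.0),
    SalesRow("APAC", 250.0),
    SalesRow("AMER", 75.0),
]
```

In a database, the same check would typically live inside a view or function that every query goes through, which is also where the per-row lookup cost shows up.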

Distinct Counts in KQL

Robert Cain continues a series on KQL:

In an earlier post in this series, Fun With KQL – Count, you saw how to use the count operator to count the number of rows in a dataset.

Then we learned about another operator, distinct, in the post Fun With KQL – Distinct. This showed how to get a list of distinct values from a table.

While we could combine these, it would be logical to have a single command that returns a distinct count in one operation. As you may have guessed by the title of this post, such an operator exists: dcount.

Read on to see how you can use dcount in queries, including how you can perform speed versus accuracy trade-offs.
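
As a rough illustration of the speed versus accuracy trade-off behind an approximate distinct count, here is a small Python sketch. It does not reproduce how `dcount` works internally; it swaps in a simple k-minimum-values estimator as a stand-in, and the function names and the `k` parameter are assumptions made for this example. A larger `k` keeps more state and gives a tighter estimate.

```python
import hashlib
import heapq


def exact_distinct_count(values):
    """Exact answer: remembers every distinct value seen."""
    return len(set(values))


def _hash_to_unit_interval(value):
    """Map a value to a pseudo-random float in [0, 1) using SHA-256."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def approx_distinct_count(values, k=256):
    """Estimate the distinct count from the k smallest distinct hashes.

    Stand-in technique (k-minimum values), not the dcount algorithm.
    Memory is bounded by k regardless of how many values are scanned.
    """
    kept = set()   # the hashes currently held, for duplicate checks
    heap = []      # max-heap of kept hashes, stored as negatives
    for value in values:
        h = _hash_to_unit_interval(value)
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # New hash beats the largest kept hash: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        # Fewer than k distinct hashes seen, so the count is exact.
        return len(heap)
    kth_smallest = -heap[0]
    return round((k - 1) / kth_smallest)
```

For small inputs the estimator simply returns the exact count; it only starts approximating once more than `k` distinct values have been seen.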

InvalidAbfsRestOperationException in Synapse Managed VNet

Kamil Nowinski goes down a rabbit hole:

This happens on the customer’s Synapse workspace where we have a public network disabled, so only private endpoint and managed VNET are available. Additionally, you probably spotted, that it took over 3 minutes to actually get this message. Hence, as a next step, in order to minimize the potential causes I simplified the query to make sure I have access to the Storage, by listing the files:

Click through for a story of pain, followed by glorious resolution.

Semi-Colons in Snowflake

Kevin Wilkie punctuates the statement:

With our last blog post, we started discussing Snowflake and the SELECT statement. Now, if you remember, there is this great thing called a semi-colon.

The main reason you should use the semicolon is to terminate all of your queries. Snowflake does this great thing by default, letting you run one query at a time.

I remember when Microsoft deprecated T-SQL statements that did not end with semicolons. It was fun to speculate for about five minutes about the carnage that would follow if they ever acted on the deprecation notice, not least in Microsoft-developed code.

Tracking Database Errors with Extended Events

Eitan Blumin is watching you:

But interestingly enough – we would be getting an added benefit here. Even if there is no SQL injection attack, it’s still possible that such errors would be raised by the application – simply due to bugs.

Furthermore, these errors in the database may be happening without anyone even noticing! How could that be, you ask? Well, it could be due to bad error handling that “swallows” the error entirely, or because the errors are logged but no one is bothering to look at the logs, or maybe because the errors are caught but an undetailed error message is logged/displayed to the user (I can’t even count how many times I encountered “general database error” messages in applications), or because the developers simply decided to mark this as a “known issue” that they didn’t bother to fix and they didn’t think to ask their DBA about it… The reasons are numerous and varying.

Click through for the scripts. I built something similar about a decade ago: a simple WPF app that watched for errors. I once messaged a developer with something like “You missed a comma in that IN clause” and saw him pop up from his cubicle and look around, trying to figure out how I could peek over his shoulder and see the query.
