Curated SQL – Page 568 – A Fine Slice Of SQL Server

We have some data we can query using the serverless SQL pools in Azure Synapse Analytics. For this blog post, I’m querying data that is stored in Azure Cosmos DB. Read the blog post How to Store Normalized SQL Server Data into Azure Cosmos DB to learn more about how that data got there.

Suppose I now want to read the data using Azure Data Factory. You can read data from Cosmos DB directly, but let’s pretend I want to do some transformations first using my favorite language: SQL. How can we do this?

Read on to learn how.

Hyperconverged Storage and Trace Flags

Published 2022-11-28 by Kevin Feasel

David Klee has a tip for us:

We all (should) know that running SQL Server in hyperconverged virtual environments, both on-premises and in the cloud, has some interesting trade-offs. The biggest is write latency from the hyperconverged storage platform underneath the database. We find that write latency suffers compared to traditional all-flash storage, even if the hyperconverged layer is all-flash as well, due to how the hyperconverged layer handles the larger block write that the SQL Server engine drops on it.

Read on for a trace flag which could help here.

Just Enough Administration and Granting Access to SQL Server

Published 2022-11-28 by Kevin Feasel

Andrew Pruski tries out a tool:

We’ve all been there as DBAs…people requesting access to the servers that we look after to be able to view certain things.

I’ve always got, well, twitchy with giving access to servers tbh…but what if we could completely restrict what users could do via PowerShell?

Enter Just Enough Administration. With JEA we can grant remote access via PowerShell sessions to servers and limit what users can do.

Click through to see how it works.

Partitioning Data in Power BI

Published 2022-11-28 by Kevin Feasel

Paul Turley continues a series on working with large amounts of data in Power BI:

You don’t have to have massive tables to benefit from partitioning. Even tables with a few hundred thousand records can benefit from partitioning, to improve data refresh performance and to detect source data changes. There is little maintenance overhead, so the benefits usually outweigh the cost, in terms of effort and management.

Click through for Paul’s thoughts on the topic.

Performance-Killing Pre-Emptive Waits

Published 2022-11-28 by Kevin Feasel

Sean Gallardy finds the real killer:

If you haven’t already read up on cooperative and preemptive scheduling or aren’t sure what those are, please read the intro to that first; otherwise you’ll be lost.

Much as I’ve discussed before, SQL Server uses a cooperative scheduling model. This doesn’t mean that Windows does, nor does it mean Windows will schedule whatever SQL Server schedules. In fact, much of the time there are many other threads that run before the ones from SQL Server; figuring that out is the job of the operating system. Because SQL Server uses cooperative scheduling, there needs to be a mechanism so that, when a resource not under SQL Server’s control needs interaction, the scheduler can keep on scheduling and threads can switch in and out (in SQL Server, that is; Windows does what Windows wants). Enter preemptive status and associated waits.

Click through for a deep dive on the topic.

RCSI and Blocking

Published 2022-11-28 by Kevin Feasel

Michael J. Swart says don’t worry, be happy:

What’s the best way to avoid most blocking issues in SQL Server? Turn on Read Committed Snapshot Isolation (RCSI). That’s it.

Do check out Erik Darling’s comment as well for one thing to keep in mind if you turn on RCSI.

The other thing to keep in mind is that, if you have WITH(NOLOCK) hanging around everywhere in your code, you won’t get as much of a benefit with RCSI until you remove them.
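
For reference, turning on RCSI is a single database-level setting. Here is a minimal sketch in Python that only builds the T-SQL text; the helper names and the SalesDemo database are hypothetical, and actually running the statements is left to whatever connection library you use.

```python
def quote_identifier(name: str) -> str:
    """Wrap a SQL Server identifier in brackets, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def enable_rcsi_statement(database: str, rollback_immediate: bool = False) -> str:
    """Build the ALTER DATABASE statement that turns on RCSI.

    rollback_immediate adds WITH ROLLBACK IMMEDIATE, which rolls back open
    transactions in that database so the change does not wait on them.
    Leave it off unless you are sure that is acceptable.
    """
    statement = (
        f"ALTER DATABASE {quote_identifier(database)} "
        "SET READ_COMMITTED_SNAPSHOT ON"
    )
    if rollback_immediate:
        statement += " WITH ROLLBACK IMMEDIATE"
    return statement + ";"


def check_rcsi_query() -> str:
    """Query text that lists each database and whether RCSI is on."""
    return "SELECT name, is_read_committed_snapshot_on FROM sys.databases;"


# "SalesDemo" is a made-up database name for illustration.
statement = enable_rcsi_statement("SalesDemo")
print(statement)
print(check_rcsi_query())
```

Test this in a non-production environment first, since row versioning adds load to tempdb.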

Happy Thanksgiving

Today is Thanksgiving in the United States. To celebrate, Curated SQL will take today and tomorrow off. We’ll be back on Monday with more links to interesting blog posts from across the data platform space.

REST APIs for Synapse Spark Pools

Published 2022-11-23 by Kevin Feasel

Abid Nazir Guroo looks at some endpoints:

Azure Synapse Analytics Representational State Transfer (REST) APIs are secure HTTP service endpoints that support creating and managing Azure Synapse resources using Azure Resource Manager and Azure Synapse web endpoints. This article provides instructions on how to set up and use Synapse REST endpoints and describes the Apache Spark Pool operations supported by REST APIs.

Read on to see some of the Spark pool management options available to you via the REST API.

Spark RDD Transformations

Published 2022-11-23 by Kevin Feasel

Meenakshi Goyal walks us through the transformation functions available to you when using a Spark RDD:

The role of a transformation in Spark is to create a new dataset from an existing one. Transformations are lazy: they are computed only when an action requires a result to be returned to the driver programme.

Transformations are not carried out right away; they execute when we call an action. Two of the most common transformations are map() and filter().

The resulting RDD is always distinct from the parent RDD. It could be smaller (e.g. filter(), distinct(), sample()), bigger (e.g. flatMap(), union(), cartesian()), or the same size (e.g. map()).

Read on to learn more about transformations, including examples of how each works. Even if you’re using the DataFrames API for Spark, it’s still important to understand that transformations are lazy.
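
To make the laziness point concrete, here is a small plain-Python analogy. It is not the Spark API: LazyDataset is a hypothetical toy class that only mimics the idea that map and filter record work, while an action such as collect actually runs it.

```python
from typing import Callable, Iterator, List


class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not Spark's implementation)."""

    def __init__(self, make_iter: Callable[[], Iterator[int]], log: List[str]) -> None:
        self._make_iter = make_iter  # builds a fresh iterator on demand
        self.log = log               # shared record of work actually performed

    @classmethod
    def from_list(cls, items: List[int]) -> "LazyDataset":
        return cls(lambda: iter(items), [])

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        def make() -> Iterator[int]:
            for item in self._make_iter():
                self.log.append(f"map {item}")
                yield fn(item)
        # Returns a new dataset; nothing has been computed yet.
        return LazyDataset(make, self.log)

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        def make() -> Iterator[int]:
            for item in self._make_iter():
                self.log.append(f"filter {item}")
                if pred(item):
                    yield item
        return LazyDataset(make, self.log)

    def collect(self) -> List[int]:
        """The 'action': pulling results forces the whole chain to run."""
        return list(self._make_iter())


numbers = LazyDataset.from_list([1, 2, 3, 4])
pipeline = numbers.map(lambda x: x * 10).filter(lambda x: x > 20)

work_before_action = len(pipeline.log)  # still zero: only a plan exists
result = pipeline.collect()             # the action triggers the work
work_after_action = len(pipeline.log)

print(work_before_action, result, work_after_action)
```

The same shape applies to DataFrames: building up a chain of operations costs almost nothing, and the real work happens when something asks for results.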

Array Expansion and Pivoting in KQL

Published 2022-11-23 by Kevin Feasel

Robert Cain continues a series on learning KQL:

In the previous article, Fun With KQL – Make_Set and Make_List, we saw how to get a list of items and return them in a JSON array. In this article we’ll see how to break that JSON array into individual rows of data using the mv-expand operator.

Read on to learn more about mv-expand.
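
If you think better in code than in query operators, the idea of turning one row with an array into several rows can be sketched in plain Python. This is only an analogy: the mv_expand function and the sample rows below are hypothetical, and the sketch does not try to reproduce how KQL treats empty or null arrays.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


def mv_expand(rows: List[Row], column: str) -> Iterator[Row]:
    """Yield one output row per element of the list stored in `column`.

    All other columns are copied unchanged onto every output row.
    Simplification for this sketch: an empty list produces no rows, and a
    non-list value is passed through as-is.
    """
    for row in rows:
        values = row[column]
        if not isinstance(values, list):
            yield dict(row)
            continue
        for value in values:
            yield {**row, column: value}


# Made-up sample data: one row per computer, with an array of counter names.
rows = [
    {"Computer": "web01", "Counters": ["cpu", "memory"]},
    {"Computer": "db01", "Counters": ["disk"]},
]

expanded = list(mv_expand(rows, "Counters"))
for row in expanded:
    print(row)
```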
