Curated SQL – Page 455 – A Fine Slice Of SQL Server

Eager Index Spooling

Published 2022-09-20 by Kevin Feasel

Chad Callihan has a great analogy for eager spools:

Think of a time when you went to someone for help on a problem. Rather than walk you through how to solve the problem, that someone just groans, speeds through solving it for you, and sends you on your way without any explanation.
Did you learn anything? No.
Will you need help again next time that same problem comes up? You bet.

I also like Erik Darling’s explanation that an eager spool is SQL Server’s passive-aggressive way of telling you that you need an index but no, you’re just too busy to create one so I’ll just keep working here all day and do you even call your poor mother anymore?

Comments closed

Azure Data Explorer September 2022 Updates

Published 2022-09-19 by Kevin Feasel

Shaf Mahmood has a few updates for us:

The optimized autoscale feature has been in place and using reactive logic. It has been helping ADX users by adjusting the cluster size when there is an increase in resources due to ingestion or query load. The optimized autoscale feature has been further improved with predictive logic. This logic monitors the same metrics as the reactive logic and over time builds up the cluster usage pattern and uses this to forecast and plan the size of the cluster. The reactive logic is still used to ensure any forecast anomalies or usage pattern changes are still autoscaled appropriately.

There are a few cost-related updates but also updates to visuals, data ingestion, and more.

Comments closed

Setting a Loading Message for a DataTable in RMarkdown

Published 2022-09-19 by Kevin Feasel

Thomas Williams wants you to wait patiently:

Waits are inevitable, whether getting data from an API or database, or manipulating data in an interactive R Markdown document. Showing a “loading” or “updating” message is a beneficial incremental improvement to users’ experience.
The code at https://github.com/thomasswilliams/r-markdown-snippets/blob/main/loading-message-datatable.Rmd, when run from RStudio, demonstrates using CSS and pseudo-elements to display text in a DataTables (DT package).

Read on to see an example of how to use this.

Comments closed

Loading the Synapse Data Explorer Pool

Published 2022-09-19 by Kevin Feasel

Gauri Mahajan loads some event data for analysis:

In my previous article, Getting started with Data Explorer pools in Azure Synapse, we learned how to create Data Explorer pools in Azure Synapse and the unique value that Data Explorer brings to semi-structured and free-text data. The creation of the Data Explorer pool is the first step in the process. After the pool is created, one can create data structures, ingest data and then use it for consumption. Data Explorer pool interface provides different ways of ingesting data into the pool including one-click ingestion.

Some of these techniques are really straightforward. Others—especially if you’re talking about large amounts of data—do require installing and working with local tooling.

Comments closed

Basic Tier Performance in Azure SQL DB

Published 2022-09-19 by Kevin Feasel

Reitse Eskens keeps things basic:

When you look at the documentation, the basic tier has low CPU (at max less than one), 1 to 4 IOPS per DTU (translating to 5-20 IOPS in total), a latency of 5ms read and 10 ms write and a maximum of 7 days backup retention. Even though it’s advertised as a production database, I wouldn’t store essential data there. However, I can use it for metadata storage in a Data Factory or Synapse Analytics environment.
About the IOPS, one IOP is a read of a disk cluster, usually 4 Kb. In this case, this means that the database is capable of reading 20 to 80 Kb of data per second.

When they say Basic, they aren’t kidding.

Comments closed

Importing Delta Tables into a Synapse Dedicated SQL Pool

Published 2022-09-19 by Kevin Feasel

Mark Pryce-Maher does a bit of integration:

In June, Databricks announced that they are open sourcing Delta Lake 2.0. Delta Lake is quickly becoming the format of choice in data science and data engineering.
To import Delta Lake into a Synapse dedicated SQL Pool you would need Azure Data Factory/Synapse Pipelines or Spark to handle the Delta Lake files.

This is not ideal because it adds extra overheads of complexity, time, and costs.
As an intellectual challenge, I wondered if it’s possible to import Delta Lake files directly into the dedicated SQL Pool and support features like time-travel. It turned out to be a great little project and a great way of learning about Delta Lake.

This turned out to be a bit more difficult than I would have imagined. Click through for the script and check the comments as well for a preview of upcoming attractions.

Comments closed

sp_helpExpandView

Published 2022-09-19 by Kevin Feasel

Andy Yun cleans up some nested views:

Yeah, that thing was small. But the processing was a horrific case study in T-SQL worst practices. And the architect that created it LOVED nested views (and scalar functions… and MERGE… on Standard edition).
I spent a good amount of effort trying to unravel those as part of my efforts to improve performance, and as a result, decided to create my own community tool to help with unraveling them – sp_helpExpandView.

I also occasionally deal with a group loving nested views and now I’m going to need to use Andy’s stored procedure because tracking that on my own gets really painful.

Comments closed

Using EvaluateAndLog to Diagnose DAX Performance Problems

Published 2022-09-19 by Kevin Feasel

Chris Webb does a bit of sleuthing:

The Switch function is often a cause of performance problems in DAX. Last year a number of optimisations were made that fixed the most common issues (and as a result you should ignore older blog posts that you may find on this subject) but some scenarios still remain where performance can be bad; Marco and Alberto wrote a great post recently outlining some of them that I encourage you to read. How do you even know whether your measure is slow because of your use of the Switch function though? Trying to read query plans, as Marco and Alberto do in their post, isn’t usually practical and commenting out branches in a Switch to see if there is a performance change can be very time-consuming. Luckily, the new EvaluateAndLog DAX function can help you diagnose Switch-related performance issues.

Click through to see how.

Comments closed

SQL Server Health Checks with SQLMonitor

Published 2022-09-19 by Kevin Feasel

Ajay Dwivedi performs a server health check:

Working as a DBA, I often get pulled into issues where application teams complain about “database server is slow”.
This general statement “server is slow” requires an equally robust approach that can help us figure out if there is an issue with CPU, memory, io, or regressed query. At this point, I prefer to use my SQLMonitor dashboard. This tool is entirely free and open source, and can be deployed on SQL Server 2014+ irrespective of any environment or edition.

Read on to see how it works and check out the GitHub repo as well as a one-off script you can run if you don’t have SQLMonitor set up.

Comments closed

Structuring an API Project in R

Published 2022-09-16 by Kevin Feasel

Jamie Owen begins a series on building APIs as R packages:

At Jumping Rivers we were recently tasked with taking a prototype application built in {shiny} to a public facing production environment for a public sector organisation. During the scoping exercise it was determined that a more appropriate solution to fit the requirements was to build the application with a {plumber} API providing the interface to the Bayesian network model and other application tools written in R.
When building applications in {shiny} we have for some time been using the “app as a package” approach which has been popularised by tools like {golem} and {leprechaun}, in large part due to the convenience that comes with leveraging the testing and dependency structure that our R developers are comfortable with in authoring packages, and the ease with which one can install and run an application in a new environment as a result. For this project we looked to take some of these ideas to a {plumber} application. This blog post discusses some of the thoughts and resultant structure that came as a result of that process.

Read on for the first post in the series, dealing with some of the prep work.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Curated SQL Posts