Kevin Feasel – Page 771

Fixing Parallel Deadlocks

Published 2020-11-09 by Kevin Feasel

Erik Darling hits on an interesting issue:

You’ll see the exchange event, and you’ll also see the same query deadlocking itself.
This is an admittedly odd situation, but one I’ve had to troubleshoot a bunch of times.

In other words, parallel threads within the same query cause the query to deadlock on itself. Click through to learn what you can do about it.


The Unbearable Slowness of Full Text Queries

Published 2020-11-09 by Kevin Feasel

Brent Ozar explains why full-text search in SQL Server can be so slow:

SQL Server’s full text search is amazing. Well, it amazes me at least – it has so many cool capabilities: looking for prefixes, words near each other, different verb tenses, and even thesaurus searches. However, that’s not how I see most people using it: I’ve seen so many shops using it for matching specific strings, thinking it’s going to be faster than LIKE ‘%mysearch%’. That works at small scale, but as your data grows, you run into a query plan performance problem.
When your query uses CONTAINS, SQL Server has a nasty habit of doing a full text search across all of the rows in the table rather than using the rest of your WHERE clause to reduce the result set first.

Read on for the full impact as well as some alternatives. I agree that those alternatives come with costs (whether that be monetary or conceptual), but I’ve used both n-grams and Elasticsearch with some success.
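To give a feel for the n-gram alternative, here is a minimal in-memory sketch of a trigram index for substring matching. It is an illustration only, not the approach from Brent's post: the `TrigramIndex` class and its method names are hypothetical, and a real implementation would keep the trigrams in an indexed table rather than in Python dictionaries.

```python
from collections import defaultdict


def trigrams(text: str) -> set[str]:
    """Return the set of lowercase 3-character substrings of text."""
    lowered = text.lower()
    return {lowered[i:i + 3] for i in range(len(lowered) - 2)}


class TrigramIndex:
    """Toy inverted index (hypothetical): trigram -> ids of rows containing it."""

    def __init__(self) -> None:
        self._rows: dict[int, str] = {}
        self._postings: dict[str, set[int]] = defaultdict(set)

    def add(self, row_id: int, text: str) -> None:
        self._rows[row_id] = text
        for gram in trigrams(text):
            self._postings[gram].add(row_id)

    def search(self, term: str) -> list[int]:
        """Return sorted ids of rows whose text contains term (case-insensitive)."""
        needle = term.lower()
        grams = trigrams(needle)
        if not grams:
            # Terms shorter than three characters cannot use the index; scan.
            candidates = set(self._rows)
        else:
            # A row can only match if it contains every trigram of the term.
            candidates = set.intersection(
                *(self._postings.get(gram, set()) for gram in grams)
            )
        # Trigram overlap is necessary but not sufficient, so verify each candidate.
        return sorted(r for r in candidates if needle in self._rows[r].lower())
```

The point of the design is that the index narrows the candidate set cheaply, and the expensive substring check only runs against rows that survive that filter.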


Auto-Checking Azure Data Factory Setup

Published 2020-11-09 by Kevin Feasel

Paul Andrew is at it again:

Building on the work done and detailed in my previous blog post (Best Practices for Implementing Azure Data Factory) I was tasked by my delightful boss to turn this content into a simple check list of what/why that others could use…. I slightly reluctantly did so. However, I wanted to do something better than simply transcribe the previous blog post into a check list. I therefore decided to break out the Shell of Power and attempt to automate said check list.
Sure, a check list could be picked up and used by anyone – with answers manually provided by the person doing the inspection of a given ADF resource. But what if there was a way to have the results given to you on a plate, inferring things that aren’t always easy to spot via the Data Factory UI?

Paul uses an ARM template rather than hitting your Data Factory directly, so there’s a little more work for you, the user, but Paul explains why it’s both necessary and proper.


Dynamic M Parameters and Multi-Select

Published 2020-11-09 by Kevin Feasel

Chris Webb shows off a method for handling multi-select using dynamic M parameters:

Even though the documentation for dynamic M parameters does mention how to handle multi-select in the M code for your Power Query queries, I thought it would be useful to provide a detailed example of how to do this and explain what happens behind the scenes when you use multi-select.

Click through for that explanation and example.


Querying Data Lake Files in Power BI through Synapse Analytics

Published 2020-11-09 by Kevin Feasel

Wolfgang Strasser shows us how to integrate Azure Synapse Analytics and Power BI:

Sometimes, however, wouldn’t it be nice to access the data lake in Direct Query mode – to get the most up to date information for every report view? I would say: yes … but how can you achieve this? The options natively provided by ADLS Gen2 and Power BI are not sufficient to solve this requirement. But there are options to achieve this and, in this post, I would like to show you the possibilities using Azure Synapse Analytics to build a query layer on top of an ADLS Gen2 storage account.

Click through for a step-by-step walkthrough.


From JSON to SQL Server

Published 2020-11-06 by Kevin Feasel

Phil Factor has some helper functions for us when working with JSON data:

If you know the structure and contents of a JSON document, then it is possible to turn this into one or more relational tables, but even then I dare you to claim that it is easy to tap in a good OpenJSON SELECT statement to do it. If you don’t know what’s in that JSON file, then you’re faced with sweating over a text editor trying to work it all out. You long to just get the contents into a relational table and take it on from there. Even then, you’ve got several struggles before that table appears in the result pane. You must get the path to the tabular data correct, you have to work out the SQL Datatypes, and you need to list the full panoply of keys. Let’s face it: it is a chore. Hopefully, all that is in the past with these helper functions.
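The three chores Phil lists (finding the path to the tabular data, working out the data types, and listing the keys) can be illustrated with a rough sketch. This is not one of Phil's helper functions, which are written in T-SQL; the function names and the type mapping below are assumptions made purely for illustration.

```python
import json


def sql_type(values: list) -> str:
    """Pick a SQL Server type wide enough for every non-null value in a column.

    The mapping is an assumption for this sketch, not an official rule.
    """
    present = [v for v in values if v is not None]
    if present and all(isinstance(v, bool) for v in present):
        return "bit"
    if present and all(isinstance(v, int) and not isinstance(v, bool) for v in present):
        return "bigint"
    if present and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    ):
        return "float"
    # Fall back to text for strings, mixed types, nested values, or all-null columns.
    return "nvarchar(max)"


def describe_table(document: str, path: list[str]) -> tuple[dict[str, str], list[tuple]]:
    """Walk `path` to an array of objects; return column types and rows."""
    node = json.loads(document)
    for key in path:
        node = node[key]
    # Collect keys in first-seen order so the column list is stable.
    columns: dict[str, None] = {}
    for item in node:
        for key in item:
            columns.setdefault(key, None)
    rows = [tuple(item.get(col) for col in columns) for item in node]
    types = {col: sql_type([item.get(col) for item in node]) for col in columns}
    return types, rows
```

Calling `describe_table(doc, ["result", "people"])` on a document whose `result.people` member is an array of objects returns the inferred column types alongside the rows, with `None` standing in for any key an object lacks.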

Click through for those functions.


Automating Hadoop Workflows with Spark and Oozie

Published 2020-11-06 by Kevin Feasel

Prashanth Jayaram walks us through automating a sample data transfer with tools like Sqoop, Spark, and Oozie:

In the process of building a data product, one ends up applying many resource-intensive analytical operations to a medium-to-large data set in an efficient way. Apache Spark is the best bet in this scenario for faster job execution, caching data in memory and enabling parallelism in a distributed data environment.
Components involved in a Spark implementation:
1. Initialize a Spark session using a Scala program
2. Ingest data from the data lake through Hive queries
3. Apply business logic using Scala constructs or Hive queries
4. Load data into HDFS or Hive targets
5. Execute Spark programs through spark-submit
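As a small illustration of the last step, the sketch below assembles a `spark-submit` command line in Python. The jar name, class name, and defaults are hypothetical placeholders rather than values from Prashanth's article; only the `--class`, `--master`, `--deploy-mode`, and `--conf` options are standard `spark-submit` flags.

```python
import shlex


def spark_submit_command(jar, main_class, master="yarn", deploy_mode="cluster",
                         conf=None, app_args=()):
    """Build the argument list for a spark-submit call (nothing is executed)."""
    cmd = [
        "spark-submit",
        "--class", main_class,
        "--master", master,
        "--deploy-mode", deploy_mode,
    ]
    # Sort the settings so the generated command is deterministic.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    # The application jar comes after the options, followed by its own arguments.
    cmd.append(jar)
    cmd.extend(app_args)
    return cmd


def as_shell(cmd):
    """Render the argument list as a single, safely quoted shell string."""
    return shlex.join(cmd)
```

Keeping the command as a list until the last moment makes it easy to hand to a scheduler action or a subprocess call without worrying about shell quoting.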

Read on for a sample flow.


DBCC CHECKDB on Large Databases

Published 2020-11-06 by Kevin Feasel

Aaron Bertrand shares some thoughts on CHECKDB:

We have a lot of data. Some of that data is stored in large databases (dozens of terabytes each). In some shops, this is an excuse to not run integrity checks. We are not one of those shops.
But we don’t run full CHECKDB operations in production; we have a set of servers dedicated to testing our restores and running checks. We follow a lot of the guidance in these articles:
– CHECKDB From Every Angle: Consistency Checking Options for a VLDB
– Minimizing the impact of DBCC CHECKDB : DOs and DON’Ts
– Minimize performance impact of SQL Server DBCC CHECKDB

Read the whole thing, even if you aren’t dealing with 30+ TB databases.


Creating Users in Azure SQL Database

Published 2020-11-06 by Kevin Feasel

Kenneth Fisher takes us through a nuance in adding users to Azure SQL Database:

Awesome! I did say I preferred code didn’t I? I am noticing a slight problem though. I don’t actually have a login yet. So I look in object explorer and there is no instance level security tab. On top of that when I try to create a login with code I get the following error:
Msg 5001, Level 16, State 2, Line 1
User must be in the master database.
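The error comes from where the statement runs: logins live at the server level, so `CREATE LOGIN` has to be issued from a connection to master, while `CREATE USER` is issued from a connection to the user database (Azure SQL Database does not let you switch databases with `USE`). The sketch below only generates the two T-SQL batches; the helper names are hypothetical and this is not necessarily the exact process Kenneth walks through.

```python
def quote_name(name: str) -> str:
    """Bracket-quote an identifier, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"


def login_and_user_batches(login: str, password: str) -> dict[str, str]:
    """Return T-SQL keyed by the database each batch must run in."""
    # Double single quotes so the password is a valid T-SQL string literal.
    secret = password.replace("'", "''")
    return {
        # Logins are server-level, so this batch needs a connection to master.
        "master": f"CREATE LOGIN {quote_name(login)} WITH PASSWORD = '{secret}';",
        # The user is created from a separate connection to the user database.
        "user_database": (
            f"CREATE USER {quote_name(login)} FOR LOGIN {quote_name(login)};"
        ),
    }
```

Each batch would then be sent over its own connection, one opened against master and one against the target database.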

Read on for the whole process.


Deploying SQLWATCH to SQL Server via Azure DevOps

Published 2020-11-06 by Kevin Feasel

Kevin Chant shows off some of the power in Azure DevOps:

I saw that you can install sqlwatch easily using PowerShell. However, I also saw in the ‘Alternative Installation’ section that you can install from the source.
With this in mind I thought I would try deploying using Azure DevOps straight from its GitHub Repository.

There’s a lot to like in Azure DevOps.



Author: Kevin Feasel