Curated SQL – Page 1120 – A Fine Slice Of SQL Server

Tuning YARN

Published 2019-09-06 by Kevin Feasel

Dmitry Tolpeko helps us tune YARN settings:

Sometimes it may take a few iterations to find the proper container size, but usually it helps and the query succeeds.
But what if you set the container size to 4096 MB or 8192 MB, but the query could have completed successfully even with 2048 MB?
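To make the over-allocation question concrete, here is a toy sketch of the "iterate until it works" idea. It is not Dmitry's method: `smallest_working_size`, `over_allocation_mb`, and `fake_query` are hypothetical names, and `fake_query` merely pretends the query needs 2048 MB rather than running anything on YARN.

```python
from typing import Callable, Optional, Sequence


def smallest_working_size(
    sizes_mb: Sequence[int],
    succeeds: Callable[[int], bool],
) -> Optional[int]:
    """Try candidate container sizes from smallest to largest and
    return the first one for which the query succeeds."""
    for size in sorted(sizes_mb):
        if succeeds(size):
            return size
    return None  # no candidate was large enough


def over_allocation_mb(configured_mb: int, sufficient_mb: int) -> int:
    """Memory per container that is reserved but not actually needed."""
    return max(configured_mb - sufficient_mb, 0)


# Hypothetical stand-in for running the real query with a given size.
def fake_query(size_mb: int) -> bool:
    return size_mb >= 2048


best = smallest_working_size([8192, 2048, 4096, 1024], fake_query)
print(best)                            # 2048
print(over_allocation_mb(8192, best))  # 6144
```

In this made-up case, a container configured at 8192 MB would reserve 6144 MB more than the query needs.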

Read on to learn more.

On-Prem Data, Azure Apps

Published 2019-09-06 by Kevin Feasel

Jamie Wick helps us figure out how to keep our data local while using Azure services:

One of the challenges many organizations face when beginning to work with Azure applications (PowerBI, PowerApps, Flow, etc.) is that their data is on-premise and the applications are hosted in the cloud. Moving the data to the cloud is often cost-prohibitive and there can be operational requirements that prevent the data, or the systems hosting it, from being relocated to the cloud.
So, how can on-prem data be used with Azure apps?

Read on for more.

Processing JSON in Biml

Published 2019-09-06 by Kevin Feasel

Bill Fellows takes us through a library which (seemingly by law) must be in every .NET project:

#sqlhelp #biml I would have the metadata in a Json structure. How would you parse the json in the C# BIML Script? I was thinking use Newtonsoft.Json but I don’t know how to add the reference to it
Adding external assemblies is a snap, but here I’ll show how to use the Newtonsoft.Json library to parse a JSON-based metadata structure and then use that in our Biml.

Click through to learn how.

Using Azure Storage Explorer

Published 2019-09-06 by Kevin Feasel

Arun Sirpal takes us through Azure Storage Explorer:

I only ever use Storage Explorer when managing the blobs, files, and queues within my storage accounts. It is your single access point for all your storage needs, and I totally recommend downloading it and using it (https://azure.microsoft.com/en-gb/features/storage-explorer/).

Why do I like using it? I am sure there are more reasons, but these are personal to me.

Click through for Arun’s reasons as well as installation basics.

Dealing with Thousands of Databases

Published 2019-09-06 by Kevin Feasel

Andy Levy has some Q&A about dealing with large numbers of databases on a single server. Part one:

What was the most difficult challenge faced initially with a large environment and how does that challenge relate to now?
For me personally, it was just getting a handle on how to deal with this many databases because I didn’t “grow up” with the system. I walked into an environment with a lot of established tools and procedures for performing tasks and had to learn how those all fit together while also not breaking anything. You don’t want to be the person who walks in the door, says “why are you doing things like this, you should be doing it this other way” and then falls victim to hubris. If something seems unusual, there’s probably a reason for that and you need to understand the “why” before trying to change anything.

Part 2 is also up:

How large is the team that manages the databases? Is the knowledge shared and everyone can work on everything or do these people fill niches?
There are two of us. We each have a few specialties but we aren’t “territorial” and we try to share as much as possible. If we aren’t both directly involved in a given project, we keep each other in the loop as it progresses.

Stay tuned for part 3.

PolyBase and Dockerized Hadoop

Published 2019-09-05 by Kevin Feasel

I have a solution to a problem which vexed me for quite some time:

Quite some time ago, I posted about PolyBase and the Hortonworks Data Platform 2.5 (and later) sandbox.
The summary of the problem is that data nodes in HDP 2.5 and later are on a Docker private network. For most cases, this works fine, but PolyBase expects publicly accessible data nodes by default—one of its performance enhancements with Hadoop was to have PolyBase scale-out group members interact directly with the Hadoop data nodes rather than having everything go through the NameNode and PolyBase control node.

Click through for the solution.

Null Checks in Spark DataFrames

Published 2019-09-05 by Kevin Feasel

Bipin Patwardhan gives us four techniques for validating whether data in Spark exists:

The task at hand was pretty simple — we wanted to create a flexible and reusable library of classes that would make the task of data validation (over Spark DataFrames) a breeze. In this article, I will cover a couple of techniques/idioms used for data validation. In particular, I am using the null check (are the contents of a column ‘null’). In order to keep things simple, I will be assuming that the data to be validated has been loaded into a Spark DataFrame named “df.”
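For a flavor of what a null check over a DataFrame can look like, here is a minimal sketch. It is not one of Bipin's four techniques: `null_condition` and `count_nulls` are hypothetical helpers, and the only Spark behavior assumed is that `DataFrame.filter` accepts a SQL condition string and `count()` returns the number of matching rows.

```python
def null_condition(column: str) -> str:
    """Build a Spark SQL condition string that matches rows where
    the given column is null. Backticks quote the column name, and
    any backtick inside the name is escaped by doubling it."""
    quoted = "`" + column.replace("`", "``") + "`"
    return f"{quoted} IS NULL"


def count_nulls(df, column: str) -> int:
    """Count rows of a Spark DataFrame (assumed to be `df`, as in the
    excerpt) whose `column` is null."""
    return df.filter(null_condition(column)).count()


print(null_condition("customer_id"))  # `customer_id` IS NULL
```

With a real DataFrame loaded, `count_nulls(df, "customer_id") == 0` would then serve as a simple validation rule; `customer_id` is a made-up column name.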

Click through for those techniques.

strace and SQL Server Containers

Published 2019-09-05 by Kevin Feasel

Anthony Nocentino tries using strace to diagnose SQL Server process activity in a container:

We’re attaching to an already running docker container running SQL. But what we get is an idle SQL Server process. This is great if we have a running workload we want to analyze, but my goal for all of this is to see how SQL Server starts up, and this isn’t going to cut it.

My next attempt was to stop the sql19 container and quickly start the strace container, but the strace container still missed events at the startup of the sql19 container. So I needed a better way.

Don’t worry—Anthony finds a better way.

Port Forwarding in Kubernetes

Published 2019-09-05 by Kevin Feasel

Andrew Pruski shows off Kubernetes port forwarding and how to connect to a SQL Server instance:

The load balanced service’s IP can usually be used to connect into the SQL instance running in the pod, but what if we’re unable to connect? Does the issue lie with the service or the pod?
In order to narrow this down, port forwarding can be used to directly connect to the pod:
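As a rough sketch of what that looks like (not Andrew's exact commands), the snippet below builds a `kubectl port-forward` invocation. The pod name `sqlserver-0`, local port 15789, and the `default` namespace are made-up values, 1433 is simply SQL Server's default port, and `port_forward_command` is a hypothetical helper.

```python
from typing import List


def port_forward_command(
    pod: str,
    local_port: int,
    remote_port: int = 1433,     # SQL Server's default port
    namespace: str = "default",  # assumed namespace
) -> List[str]:
    """Build the argument list for forwarding a local port straight
    to a pod, bypassing the load balanced service."""
    return [
        "kubectl", "port-forward",
        f"pod/{pod}",
        f"{local_port}:{remote_port}",
        "--namespace", namespace,
    ]


cmd = port_forward_command("sqlserver-0", 15789)
print(" ".join(cmd))
# kubectl port-forward pod/sqlserver-0 15789:1433 --namespace default
```

Pass the list to `subprocess.run` to start the tunnel, then point your SQL client at `127.0.0.1,15789`. If that connects while the service IP does not, the problem is more likely the service than the pod.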

Read the whole thing.

dbatools: the Book

Published 2019-09-05 by Kevin Feasel

Chrissy LeMaire has an exciting announcement:

After nearly 10 months of work, early access to Learn dbatools in a Month of Lunches is now available from our favorite publisher, Manning Publications!
For years, people have asked if any dbatools books are available and the answer now can finally be yes, mostly. Learn dbatools in a Month of Lunches, written by me and Rob Sewell (the DBA with the beard), is now available for purchase, even as we’re still writing it. And as of today, you can even use the code bldbatools50 to get a whopping 50% off.

They’re in active book development, so buy a copy now and watch as the book evolves.
