
Curated SQL Posts

Installing ML Services With Python Support In SQL Server 2019

Rich Brenner walks us through installing SQL Server 2019 and enabling Python support:

First things first, you’ll want to choose your version of SQL Server. Python is available on 2017 and greater. For this demo I’ll be using SQL Server 2019 Developer Edition (CTP 2.2).
With 2019 CTP2.2 they’ve increased the requirement of your OS too. In my example I had a spare VM with Windows Server 2012 lying around, but I needed to update it to Server 2016. Check the relevant documentation for the version you’re using.

Click through for a step by step guide with plenty of screenshots.

Comments closed

SOS_SCHEDULER_YIELD And Hypervisors

Jonathan Kehayias has an interesting article on generating SOS_SCHEDULER_YIELD waits due to CPU over-subscription on the host:

The test environment that I used for this is a portable lab I’ve used for demos for VM content over the last eight years teaching our Immersion Events. The ESX host has 4 cores and 8GB RAM and hosts three virtual machines: a 4vCPU SQL Server with 4GB RAM, and two 2vCPU Windows Server VMs with 2GB RAM that are used strictly to run Geekbench to produce load. Within SQL Server, I have a reproducible workload that drives parallelism and is repeatable consistently, that I have also used for years in teaching classes.
For the tests, I first ran a baseline where the SQL Server VM is the only machine executing any tasks/workload at all, but the other VMs are powered on and just sitting there on the host. This establishes a base metric in the host and had 1% RDY time average during the workload execution, after which I collected the wait stats for SOS_SCHEDULER_YIELD.

From there, Jonathan starts cranking up the load on the application servers and sees what it does to SQL Server ready time. This is a great reason not to over-subscribe on CPUs on mission-critical hosts.
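
To put a number on the over-subscription in that lab, here is a minimal Python sketch that computes the ratio of allocated vCPUs to physical cores from the figures quoted above (a 4-core host, one 4-vCPU VM, and two 2-vCPU VMs). The function name is my own and does not come from the article, and the ratio is only a rough indicator: actual ready time depends on how busy each VM is, not just on how many vCPUs are allocated.

```python
def vcpu_to_core_ratio(physical_cores: int, vcpus_per_vm: list[int]) -> float:
    """Return allocated vCPUs divided by physical cores on the host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


# Figures from the quoted lab setup: a 4-core ESX host running
# one 4-vCPU SQL Server VM and two 2-vCPU load-generation VMs.
lab_ratio = vcpu_to_core_ratio(physical_cores=4, vcpus_per_vm=[4, 2, 2])
print(f"vCPU:core ratio = {lab_ratio:.1f}:1")
```

With all three VMs busy, the host is being asked to schedule twice as many vCPUs as it has cores, which is the condition under which ready time climbs.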


Data Professional Salary Survey Results Released

Brent Ozar has a new year of data professional salary results:


How much do database administrators, analysts, architects, developers, and data scientists make? We asked, and 882 of you from 46 countries answered this year. Y’all make a total of $84,114,940 USD per year! Hot diggety. (And at first glance, it looks like on average, y’all got raises this year.)

Download the 2019, 2018, & 2017 results in Excel.

Read on for some notes about the data and start playing around.


Emailing SSIS Errors

Peter Schott improves upon Kevin Hill’s script:


Recently, Kevin Hill posted on getting package errors from the SSIS catalog in a single query as opposed to clicking through the SSIS Reports and digging through pages. I took that and ran with it a little bit. The first pass needed an additional index on the catalog to increase performance. Kevin’s included that at the bottom of his query on the post above. (You probably don’t need the included “message” column, though.)
I wanted to take this and run with it a little bit to report on all errors for a given folder within the last day, then e-mail that in an HTML formatted e-mail. To that end, I wrote up a quick stored procedure that should take the Folder or Package or Project name and a “to” e-mail address to send an e-mail through DBMail.

Click through for the script.


ARITHABORT And SET Options

Jason Brimhall explains an error message he received:

INSERT failed because the following SET options have incorrect settings: ‘ARITHABORT’. 
Verify that SET options are correct for use with 
indexed views 
and/or indexes on computed columns 
and/or filtered indexes 
and/or query notifications 
and/or XML data type methods 
and/or spatial index operations.
[SQLSTATE 42000] (Error 1934). The step failed.


Immediately I started looking at my creation scripts for the tables.  Nope, no XML indexes, no spatial indexes, no filtered indexes, no indexes on computed columns (not even any computed columns),  and no query notifications.

In Jason’s case, it was an oddity around SQL Agent jobs, but there are a few reasons this could pop up, and Jason explains some of the most common.
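
As a rough illustration of what error 1934 is checking, the sketch below compares a session's SET options against the values SQL Server documents as required for indexed views, indexes on computed columns, and filtered indexes. It is a plain Python comparison, not Jason's method: the function name and the `agent_session` example are hypothetical, and you would need to populate the input dictionary yourself from whatever session you are troubleshooting.

```python
# Session SET options required for indexed views, indexes on computed
# columns, and filtered indexes (True = ON, False = OFF).
REQUIRED_SET_OPTIONS = {
    "ANSI_NULLS": True,
    "ANSI_PADDING": True,
    "ANSI_WARNINGS": True,
    "ARITHABORT": True,
    "CONCAT_NULL_YIELDS_NULL": True,
    "NUMERIC_ROUNDABORT": False,
    "QUOTED_IDENTIFIER": True,
}


def incorrect_set_options(session_options: dict) -> list:
    """Return the option names whose value differs from the required one.

    Options missing from session_options are reported as incorrect.
    """
    return sorted(
        name
        for name, required in REQUIRED_SET_OPTIONS.items()
        if session_options.get(name) != required
    )


# Hypothetical session: everything is set correctly except ARITHABORT.
agent_session = dict(REQUIRED_SET_OPTIONS, ARITHABORT=False)
print(incorrect_set_options(agent_session))
```

Any name this returns is a candidate for the option listed in the error message.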


The Basics Of Docker For R Users

Colin Fay explains some of the core principles behind Docker, containerizing some R code along the way:

Docker is designed to enclose environments inside an image / a container. What this allows, for example, is to have a Linux machine on a Macbook, or a machine with R 3.3 when your main computer has R 3.5. Also, this means that you can use older versions of a package for a specific task, while still keeping the package on your machine up-to-date.
This way, you can “solve” dependencies issues: if ever you are afraid dependencies will break your analysis when packages are updated, build a container that will always have the software versions you desire: be it Linux, R, or any package.

Click through for the details. H/T R-bloggers


Generating Test Data In Kafka

Yeva Byzek takes us through the Kafka Connect Datagen connector:

Short of using real data from a real source, you do have a few options on how to generate more interesting test data for your topics. One option is to write your own client. Kafka has many programming language options—you choose: Java, Python, Go, .NET, Erlang, Rust—the list goes on. You can write your own Kafka client applications that produce any kind of records to a Kafka topic, and then you’re set.
But wouldn’t it be great if you could generate data locally to just fill topics with messages? Fortunately, you’re in luck! Because we have those data generators.

Click through for a demonstration.
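
For comparison, the "write your own client" option mentioned in the excerpt might start with something like the following Python sketch. This is not the Datagen connector and not code from Yeva's post: the record fields and the function name are made up for illustration, and the step of actually sending each message to a topic is left out because it depends on which Kafka client library you choose.

```python
import json
import random


def generate_test_records(count: int, seed=None):
    """Yield `count` JSON-encoded records with a hypothetical order schema."""
    rng = random.Random(seed)  # a seed makes the test data reproducible
    for order_id in range(count):
        record = {
            "order_id": order_id,
            "customer": rng.choice(["alice", "bob", "carol"]),
            "amount": round(rng.uniform(1.0, 500.0), 2),
        }
        yield json.dumps(record)


# Print a few messages; in practice you would hand each string
# to your Kafka producer of choice instead of printing it.
for message in generate_test_records(3, seed=42):
    print(message)
```

Even a generator this small shows why a ready-made data generator is appealing: the schema, the value distributions, and the producer plumbing are all things you would otherwise have to write and maintain yourself.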


Containerizing Python And MySQL

Allison Tharp walks us through containerizing a Python-based game she had created:

I’m really amazed at how easy creating the container was. It took only 11 lines to spin up a Linux environment on my own machine. The majority of the commands (7 of the 11) are simply adding the files and dependencies. I’m also pretty shocked that I didn’t have to do anything to my Python script to get this to work. I had assumed I would need to do something, but I didn’t. Very cool! Also, by using the following command while my Python script is running, I see that this is only taking up 1.3 GB!

Click through for scripts and important lessons learned along the way.


Unattended SQL Server Installations

Michal Sadowski walks us through unattended installations:

Imagine a situation where you are tasked with installing a dozen SQL Server instances. Using the approach described in the previous blog post, installation can take a significant amount of resources and time, and there is also a chance of errors due to typos (e.g. in directory paths).
There are two options that can be used for the installation:
1. Using ConfigurationFile.ini
2. Using command line parameters to setup.exe file
Below there are step-by-step instructions on how to install SQL Server using both options.

If you only have a couple of SQL Server instances in your environment, unattended installation may seem a bit overwhelming. But as you have to deal with larger and larger numbers of instances, it’s the only way to keep your sanity. It’s also the only way to install on Windows Server Core.
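
To show how the two options scale to many instances, here is a minimal Python sketch that builds the `setup.exe` argument list for each one. It only assembles arguments and does not run the installer. The function names, instance names, and the `CONTOSO\DBAdmins` account are hypothetical, and the parameters shown are a small subset of the documented ones, not Michal's full scripts.

```python
def build_config_file_args(config_path):
    """Option 1: point setup.exe at a prepared ConfigurationFile.ini."""
    return [
        "setup.exe",
        "/Q",  # quiet mode, no UI
        "/IACCEPTSQLSERVERLICENSETERMS",
        "/CONFIGURATIONFILE=" + config_path,
    ]


def build_setup_args(instance_name, sysadmin_account, features=("SQLENGINE",)):
    """Option 2: pass every setting as a setup.exe command-line parameter."""
    return [
        "setup.exe",
        "/Q",
        "/ACTION=Install",
        "/IACCEPTSQLSERVERLICENSETERMS",
        "/FEATURES=" + ",".join(features),
        "/INSTANCENAME=" + instance_name,
        "/SQLSYSADMINACCOUNTS=" + sysadmin_account,
    ]


# Hypothetical instance names and sysadmin account, for illustration only.
for instance in ["SQL01", "SQL02", "SQL03"]:
    print(" ".join(build_setup_args(instance, "CONTOSO\\DBAdmins")))
```

Generating the arguments in a loop keeps every instance consistent, which removes the typo risk the excerpt describes.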


Cloudera And Hortonworks Officially Merged

Arun Murthy gives the used-to-be-Hortonworks perspective on the now-official merger of Cloudera and Hortonworks:

Our merger did not arise out of the blue. Our respective missions were well aligned, and together the new Cloudera has the scale it needs to service the constantly changing needs of the world’s most demanding organizations and to grow even more dominant in the market.
New open-source standards such as Kubernetes, container technology and the growing adoption of cloud-native architectures are major parts of Cloudera’s strategy.  Our primary initiative out of the gate is to deliver a 100-percent open-source unified platform, which leverages the best features of Hortonworks Data Platform (HDP) 3.0 and Cloudera’s CDH 6.0. Cloud-native and built for any cloud – with a public cloud experience across all clouds – the unified platform embodies our shared “cloud everywhere” vision.

I’m more a fan of the Hortonworks tooling like Ambari than I am of Cloudera’s alternatives, so it will be interesting to see what happens going forward. The good news for recalcitrant types like me is that HDP will be around for a couple of years yet.
