Author: Kevin Feasel

Workload Capture with WorkloadTools

Published 2019-03-14 by Kevin Feasel

Gianluca Sartori continues a series on WorkloadTools:

Last week I showed you how to use WorkloadTools to analyze a workload. As you have seen, using SqlWorkload to extract performance data from your workload is extremely easy and it just takes a few keystrokes in your favorite text editor to craft the perfect .json configuration file.
Today I’m going to show you how to capture a workload and save it to a file. If you’ve ever tried to perform this task with any other traditional benchmarking tool, like RML Utilities or Distributed Replay, your palms are probably sweaty already, but fear not: no complicated traces to set up, no hypertrophic scripts to create extended events captures. WorkloadTools makes it as easy as it can get.
Saving a workload to a file might look superfluous when you think that WorkloadTools has the ability to perform replays in real-time (I’ll discuss this feature in a future post), but there are situations when you want to replay the same exact workload multiple times, maybe changing something in the target database between each benchmark to see precisely what performance looks like under different conditions.

Gianluca’s technique does seem a lot less fussy than the Microsoft techniques.


SQL Server and Ubuntu 18.04

Published 2019-03-14 by Kevin Feasel

Randolph West confirms that SQL Server on Linux will run on Ubuntu 18.04 even though it is not (yet) supported:

Although these screenshots show SQL Server 2019 preview CTP 2.3, this also applies to SQL Server 2017 on 18.04.2, because that’s what I had installed before upgrading the SQL Server version. However, as my friend Jay Falck pointed out on Twitter, Microsoft has stated publicly that it is not yet certified for production use:
Important, this does not change the support state of SQL Server 2017 on Ubuntu 18.04. Work to certify Ubuntu 18.04 with SQL Server 2017 is planned and we will announce when it will be supported for production use on this page. Until such an announcement occurs, SQL Server 2017 on Ubuntu 18.04 should be considered experimental and for non-production use only.

Read on for Randolph’s thoughts on the issue.


R 3.5.3 Available

Published 2019-03-13 by Kevin Feasel

David Smith shares some info on R 3.5.3, released on Monday:

The R Core Team announced yesterday the release of R 3.5.3, and updated binaries for Windows and Linux are now available (with Mac sure to follow soon). This update fixes three minor bugs (to the functions writeLines, setClassUnion, and stopifnot), but you might want to upgrade just to avoid the “package built under R 3.5.3” warnings you might get for new CRAN packages in the future.

Click through for more info on this release, including where the name of each R release comes from.


Securely Accessing External Resources From Databricks AWS

Published 2019-03-13 by Kevin Feasel

Itai Weiss shows how you can securely hit external data sources when using Databricks for AWS:

For security purposes, Databricks Apache Spark clusters are deployed in an isolated VPC dedicated to Databricks within the customer’s account. In order to run their data workloads, there is a need to have secure connectivity between the Databricks Spark Clusters and the above data sources.
It is straightforward for Databricks clusters located within the Databricks VPC to access data from AWS S3, which is not a VPC-specific service. However, we need a different solution to access data from sources deployed in other VPCs such as AWS Redshift, RDS databases, streaming data from Kinesis or Kafka. This blog will walk you through some of the options you have available to access data from these sources securely and their cost considerations for deployments on AWS. In order to establish a secure connection to these data sources, we will have to configure the Databricks VPC with either one of the following two available options:

Read on for those two options.


Getting Started with Kubernetes

Published 2019-03-13 by Kevin Feasel

Praveen Sripati walks us through the Play with Kubernetes lab website:

There are many ways of installing K8S as mentioned here. It can be installed in the Cloud, on-premise and also locally on the laptop using virtualization. But installing K8S has never been easy. In this blog, we will look at one of the easiest ways to get started with K8S using Play with Kubernetes (PWK). With this, the whole K8S experience is within the browser and there is nothing to install on the laptop; everything is installed on the remote machine. PWK uses ‘Docker in Docker’ which is detailed here (1, 2).

This looks like a really useful way to get the hang of Kubernetes before trying it out on your own machines.


Finding Singleton Tables

Published 2019-03-13 by Kevin Feasel

Michael J. Swart wants to help you find single tables in your area:

What’s achievable? I want to identify tables to extract from the database that won’t take years. Large monolithic systems can have a lot of dependencies to unravel.
So what tables in the database have the least dependencies? How do I tell without a trustworthy data model? Is it the ones with the fewest foreign keys (in or out)? Maybe, but foreign keys aren’t always defined properly or they can be missing altogether.
My thought is that if two tables are joined together in some query, then they’re related or connected in some fashion. So that’s my idea. I can look at the procedure cache of a database in production to see where the connections are. And when I know that, I can figure out what tables are not connected.
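The idea in that last paragraph can be sketched outside the database. This is a minimal illustration in Python, not Michael's script: it assumes you have already extracted, for each cached query, the set of tables it references, and it treats any two tables appearing in the same query as connected. The function name and the sample table names are hypothetical.

```python
def find_unconnected_tables(all_tables, queries):
    """Return tables that never share a query with any other table.

    all_tables: every table name in the database (assumed input).
    queries: one collection of referenced table names per cached query
             (assumed to be extracted from the plan cache beforehand).
    """
    connected = set()
    for tables in queries:
        distinct = set(tables)
        # A query touching two or more tables connects all of them.
        if len(distinct) > 1:
            connected |= distinct
    return sorted(set(all_tables) - connected)


# Hypothetical sample data for illustration only.
all_tables = ["Orders", "OrderLines", "Customers", "AuditLog", "Settings"]
queries = [
    ["Orders", "OrderLines"],
    ["Orders", "Customers"],
    ["AuditLog"],
]
singletons = find_unconnected_tables(all_tables, queries)
print(singletons)
```

A table that shows up only in single-table queries (or in no cached query at all) comes back as a candidate; the plan cache only reflects what has run recently, so the result is a starting list rather than proof of independence.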

Click through for the script to help you do it.


Changing Constraints in Near-Zero Downtime Situations

Published 2019-03-13 by Kevin Feasel

I have part six of my interminable series on near-zero downtime deployments:

The locking story is not the same as with the primary and unique key constraints. First, there’s one extra piece: the transition will block access to dbo.LookupTable as well as the table we create the constraint on. That’s to keep us from deleting rows in our lookup table before the key is in place.
Second, the locks begin as soon as we hit F5. Even SELECT statements get blocked requesting a LCK_M_SCH_S lock. Bad news, people.
So what can we do to get around this problem? Two routes: the ineffectual way and the ugly way.

Despite my being a ray of sunshine here, you should still check this out. It’s shorter than the average Russian novel, at least.


Complexities with Binary Collations

Published 2019-03-13 by Kevin Feasel

Solomon Rutzky takes us through the nuances of binary collations:

Still, there are some complexities related to binary collations that you might not be aware of. To figure out what they are, we need to look at why there are so many binary collations in the first place. I mean, binary collations work on the underlying values of the characters, and comparing numbers doesn’t change between cultures or versions: 12 = 12, 12 > 11, and 12 < 13, always. So, then what is the difference between:
– Latin1_General_100_BIN2 and Hebrew_100_BIN2 (only the culture is different), or
– Latin1_General_100_BIN2 and Latin1_General_BIN2 (only the version is different), or
– Latin1_General_100_BIN2 and Latin1_General_100_BIN (only the binary comparison type is different)

Read on to find out.


Finding Missing Index Hints in Query Store

Published 2019-03-13 by Kevin Feasel

Grant Fritchey shows us another place where we can find missing index hints:

A couple of notes on the query. I cast the query_plan as xml so that I can use the XQuery to pull out the information. It is possible that the plan might be so large that you get an error because of the limit on nesting levels within XML. Also, I aggregate the information from the sys.query_store_runtime_stats. You may want to modify this to only look at limited ranges. I’ll leave that to you as an exercise.
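For a feel of what that XQuery pulls out, here is a rough Python equivalent that reads the missing index elements from plan XML on the client side. It is a sketch and not Grant's query: the sample plan is a hand-written, heavily trimmed fragment (real plans nest the MissingIndexes element much deeper), and the function name and output shape are my own.

```python
import xml.etree.ElementTree as ET

# Namespace used by SQL Server showplan XML.
NS = {"sp": "http://schemas.microsoft.com/sqlserver/2004/07/showplan"}

# Hand-written, trimmed fragment for illustration; not a real captured plan.
SAMPLE_PLAN = """<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <MissingIndexes>
    <MissingIndexGroup Impact="87.5">
      <MissingIndex Database="[Demo]" Schema="[dbo]" Table="[Orders]">
        <ColumnGroup Usage="EQUALITY">
          <Column Name="[CustomerID]" ColumnId="2" />
        </ColumnGroup>
        <ColumnGroup Usage="INCLUDE">
          <Column Name="[OrderDate]" ColumnId="3" />
        </ColumnGroup>
      </MissingIndex>
    </MissingIndexGroup>
  </MissingIndexes>
</ShowPlanXML>"""


def missing_index_hints(plan_xml):
    """Return one dict per missing index suggestion found in the plan XML."""
    root = ET.fromstring(plan_xml)
    hints = []
    # Search at any depth, since real plans nest these elements deeply.
    for group in root.iterfind(".//sp:MissingIndexGroup", NS):
        impact = float(group.get("Impact"))
        for index in group.iterfind("sp:MissingIndex", NS):
            # Map each usage type (EQUALITY, INEQUALITY, INCLUDE) to its columns.
            columns = {
                cg.get("Usage"): [c.get("Name") for c in cg.iterfind("sp:Column", NS)]
                for cg in index.iterfind("sp:ColumnGroup", NS)
            }
            table = ".".join(index.get(k) for k in ("Database", "Schema", "Table"))
            hints.append({"table": table, "impact": impact, "columns": columns})
    return hints


hints = missing_index_hints(SAMPLE_PLAN)
print(hints)
```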

Do read Grant’s warning in the conclusion.


Gaps and Islands with Dates

Published 2019-03-13 by Kevin Feasel

Bert Wagner hits one of my favorite topics:

In a traditional gaps and islands problem, the goal is to identify groups of continuous data sequences (islands) and groups of data where the sequence is missing (gaps).
Many people encounter gaps and islands problems when dealing with ranges of dates, and recently I did too, but with an interesting twist:
How do you determine gaps and islands of data that has overlapping date ranges?
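To make the problem concrete, here is a small Python sketch of one way to handle overlapping ranges: sort them, merge anything that overlaps or touches into an island, then read the gaps off the spaces between islands. This is not Bert's T-SQL solution. It assumes inclusive date ranges at day granularity, and the function name and sample dates are made up.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def islands_and_gaps(ranges):
    """Merge inclusive (start, end) date ranges into islands and list the gaps.

    Ranges that overlap, or that sit on consecutive days, join the same island
    (an assumption of this sketch).
    """
    islands = []
    for start, end in sorted(ranges):
        if islands and start <= islands[-1][1] + ONE_DAY:
            # Overlaps or touches the current island: extend its end if needed.
            islands[-1] = (islands[-1][0], max(islands[-1][1], end))
        else:
            islands.append((start, end))
    # Each gap runs from the day after one island to the day before the next.
    gaps = [
        (prev_end + ONE_DAY, next_start - ONE_DAY)
        for (_, prev_end), (next_start, _) in zip(islands, islands[1:])
    ]
    return islands, gaps


# Hypothetical sample ranges, including one fully contained in another.
ranges = [
    (date(2019, 1, 1), date(2019, 1, 10)),
    (date(2019, 1, 5), date(2019, 1, 12)),
    (date(2019, 1, 20), date(2019, 1, 25)),
    (date(2019, 1, 22), date(2019, 1, 23)),
]
islands, gaps = islands_and_gaps(ranges)
print(islands)
print(gaps)
```

Taking the maximum of the two end dates is what handles a range that sits entirely inside an earlier one; without it, the contained range would shorten the island.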

Check out Bert’s explanation of the solution; it’s a good one.

