Press "Enter" to skip to content

Curated SQL Posts

Explaining the HA/DR Licensing Changes

Kevin Chant goes into the fairly recent licensing changes for SQL Server:

Which surprised me a bit because these licensing changes have been in-place for a while now. With this in mind, I thought I would discuss them here to raise awareness about the changes.

To clarify, in SQL Server 2019 there have been some big licensing changes about what you can and can’t do on a passive fail-over instance. Especially if you have Software Assurance.

Which I have to admit I am really excited about. Because it opens up some new possibilities which I will explain below. Of course, there are other significant updates in the licensing guide as well.

Read on for the details.

Comments closed

Flink 1.10.0 Released

Marta Paes announces the release of Apache Flink 1.10.0:

The Apache Flink community is excited to hit the double digits and announce the release of Flink 1.10.0! As a result of the biggest community effort to date, with over 1.2k issues implemented and more than 200 contributors, this release introduces significant improvements to the overall performance and stability of Flink jobs, a preview of native Kubernetes integration and great advances in Python support (PyFlink).

Flink 1.10 also marks the completion of the Blink integration, hardening streaming SQL and bringing mature batch processing to Flink with production-ready Hive integration and TPC-DS coverage. This blog post describes all major new features and improvements, important changes to be aware of and what to expect moving forward.

Read on for the improvements and let me once more point out the validation of Feasel’s Law.

Comments closed

Installing Spark on Windows 10

Gopal Tiwari shows how you can install Apache Spark on Windows 10:

By default, Spark SQL projects do not run on Windows OS and require us to perform some basic setup first; that’s all we are going to discuss in this article, as I didn’t find it well documented anywhere over the internet or in books.

This article can also be used for setting up a Spark development environment on Mac or Linux as well. Just make sure you’ll downloading the correct OS-version from Spark’s website. 

You can refer to the Scala project used in this article from GitHub here: https://github.com/gopal-tiwari/LocalSparkSql.

I’ve seen (and written) installation guides for Spark. This is a good one, as it goes beyond installation and into kicking off a project and ensuring that it works.

Comments closed

Finding the Query Used in DirectQuery Mode

Kasper de Jonge shows us how we can find which query ran in DirectQuery mode to populate a Power BI data set:

When you are optimizing your DirectQuery model and you have done all the optimizations on the model already, you might want to run the queries generated by Power BI by your DBA. He then might be able to do some index tuning or even suggest some model changes. But how do you capture them? There are a few simple ways that I will describe here.

Read on for 3 1/2 such methods.

Comments closed

When Transactional Replication Makes Sense

Jonathan Kehayias has some good use cases for transactional replication:

Why in the world would you want to use Transactional Replication?  Isn’t that the thing that is impossible to manage, difficult to configure, and always prone to problems? As a consultant, I see a lot of scenarios where every problem becomes a nail because all you have is a hammer. Sometimes another technology can provide a solution to problems that exist but little is known about the technology, and Transactional Replication tends to fall into this scenario in my experience. In this post we are going to take a look some of the more common Transactional Replication use cases as well as some scenarios where it can be used to solve a business problem in a different way. As with anything, Transactional Replication is just another tool to have in your toolbox.

To be fair, merge replication is much more impossible to manage, difficult to configure, and prone to problems…

Comments closed

Azure Data Studio February Release

Alan Yu announces the February 2020 release of Azure Data Studio:

Azure Data Studio is a multi-database, cross-platform desktop environment for data professionals using the family of on-premises and cloud data platforms on Windows, MacOS, and Linux. To learn more, visit our Github.

The key highlights to cover this month include:

– Improved Azure sign in support.
– Find in Notebook support.
– Visual Studio Code merge to 1.42.
– Bug fixes.

Read on for more details on each topic.

Comments closed

Loading the SQL Server Error Log into a Table

Jeff Mlakar shows how you can load the SQL Server error log into a table:

Why Not Just Use the File System?

When possible I prefer to go into the file system and open the SQL Server error logs in a text editor e.g. Notepad++ or UltraEdit. However, there are sometimes you may have access to a SQL Server instance but not be able to RDP or otherwise scan the file system.

That’s where this comes in – straight T-SQL.

Click through for demos.

Comments closed

Monitoring for Distribution Changes

Nina Zumel explains how we can track if something has changed by monitoring its distribution:

A client recently came to us with a question: what’s a good way to monitor data or model output for changes? That is, how can you tell if new data is distributed differently from previous data, or if the distribution of scores returned by a model have changed? This client, like many others who have faced the same problem, simply checked whether the mean and standard deviation of the data had changed more than some amount, where the threshold value they checked against was selected in a more or less ad-hoc manner. But they were curious whether there was some other, perhaps more principled way, to check for a change in distribution.

The answer is, of course, that there is. Click through to see a few of the techniques.

Comments closed

Publishable Adverse Event Tables in R

Inge Christoffer Olsen shows how to clean up tables in R for publication:

The summary of Adverse Events is a nice table just summing up the adverse events in the trial. Note the “[N] n (%)”-format which is the number of events, number of patients with events and percentage of patients with event.

This particular example is about adverse events, but the key concepts in the code apply to many kinds of tables you want to make look a bit nicer. H/T R-Bloggers

Comments closed

Handling Azure SQL Database Scale Changes

Arun Sirpal shows us how to handle scaling events in Azure SQL Database:

For some reason I have friends / colleagues telling me that when scaling (up and down for this example) that no downtime occurs. Well, not only does Microsoft documentation say differently, I will show it. So let’s test it out. Before the practical test, this is the official stance. “There is a switch over period where connectivity is lost to the database for a short amount of time, which can be mitigated using retry logic”.

Retry logic is an important part of any application. We tend to forget about it with on-prem applications talking to on-prem databases, but that’s a mistake.

Comments closed