Author: Kevin Feasel

T-SQL Formatters

Kenneth Fisher links to a set of T-SQL formatters:

I have a hard time working with code that’s so badly formatted I can’t read it. Normally I’d just format it myself, but this was a good dozen pages long. So what did I do? I ran to Twitter screaming for help.

And I got a lot of responses! Here were the recommendations followed by who recommended them:

My favorite is DevArt’s SQL Complete, which is a paid product but covers our 30-some-page T-SQL formatting guideline almost perfectly.

Comments closed

Alerting On Missed Log Backups

Jamie Wick shows us how to build out an alert if we’ve exceeded a threshold for time since the last log backup:

The primary reason most transaction logs start to (unexpectedly) grow is that the transaction log hasn’t been backed up. This is usually caused by one of two situations: the log backup job failed, or the log backup job didn’t start. Most options for backing up SQL transaction logs (3rd party software, SQL Agent job, or maintenance plan) will include an alert/notification mechanism for when the job fails. Where some backup options are lacking is in notifying that a (log) backup that should have been taken, wasn’t. If the backup (or management) software doesn’t include an alert for missing backups, SQL Server (2008 & newer) has a solution. Policy Based Management (PBM) includes a policy (Last Log Backup Date) that will send an email notification when the last backup date/time exceeds a specified threshold.
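
The threshold check at the heart of such an alert can be sketched in a few lines of Python. This is not the Policy Based Management policy itself, only a minimal illustration of the comparison it performs. The function name, database names, timestamps, and the 30-minute threshold are all made up for the example; in practice the last backup times would come from SQL Server's backup history (for instance, log backups recorded in msdb.dbo.backupset).

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def overdue_log_backups(
    last_backups: Dict[str, Optional[datetime]],
    now: datetime,
    threshold: timedelta,
) -> List[str]:
    """Return the databases whose newest log backup is missing or too old."""
    overdue = []
    for name, last in last_backups.items():
        # A database with no log backup at all is treated as overdue.
        if last is None or now - last > threshold:
            overdue.append(name)
    return sorted(overdue)


# Hypothetical sample data: database name -> time of the last log backup.
now = datetime(2018, 7, 10, 12, 0)
last_backups = {
    "Sales": datetime(2018, 7, 10, 11, 50),  # 10 minutes ago: fine
    "HR": datetime(2018, 7, 10, 9, 0),       # 3 hours ago: overdue
    "Staging": None,                         # never backed up: overdue
}
overdue = overdue_log_backups(last_backups, now, timedelta(minutes=30))
print(overdue)
```

The point of checking elapsed time rather than job status is that it catches both failure modes described above: a job that failed and a job that never started.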

A good use for a woefully underappreciated tool.

SSMS: Analyze Actual Execution Plan

Grant Fritchey shows us something that the SSMS tools team snuck in on us:

One of the many new sets of functionality introduced in SQL Server Management Studio 17 is the new option “Analyze Actual Execution Plan.” If Microsoft continues down this path, there will be a lot of useful functionality at some point. If you haven’t yet looked at Analyze Actual Execution Plan, well, read on.

I hope they do expand this out.  I can see it being very beneficial, but it needs to look at a lot more than just cardinality estimations.

Azure SQL Data Warehouse Restore Points

Arun Sirpal explains how backups work with Azure SQL Data Warehouse:

The question is how are backups done with Azure SQL DW?

It is very different from Azure SQL DB (which you would expect). Azure SQL DW has a totally different architecture to its classic database counterpart. Restore points are the key here. Automatic ones are taken throughout the day and are kept for seven days only. In the worst case, the time between restore points is eight hours, giving an eight-hour RPO (Recovery Point Objective).

You can also create manual restore points, as Arun shows.

Implementation Matters: CTEs In Postgres And SQL Server

Brent Ozar looks at a couple of places where Postgres and SQL Server differ in implementation details:

In SQL Server, if you write this query:

SQL Server builds a query plan for the entire operation at once, and passes the WHERE clause filter into the CTE. The resulting query plan is efficient, doing just a single clustered index seek.

In Postgres, CTEs are processed separately first, and subsequent WHERE clauses aren’t applied until later. That means the above query works just fine – but performs horribly. You’ll get much better results if you include your filters inside each CTE, like this:

That’s less than ideal.

The comments are valuable here as well.
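
The cost difference described in the excerpt can be illustrated with a toy model in Python. This is not database code and does not reproduce the queries from the original post; it only contrasts "build the whole CTE result, then filter" with "apply the filter while reading." The table contents, the id column, and both function names are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def cte_then_filter(table: List[Row], wanted_id: int) -> Tuple[List[Row], int]:
    """Materialize the whole 'CTE' first, then filter: every row is read."""
    materialized = list(table)
    rows_read = len(materialized)
    matches = [row for row in materialized if row["id"] == wanted_id]
    return matches, rows_read


def filter_inside_cte(index: Dict[int, Row], wanted_id: int) -> Tuple[List[Row], int]:
    """Apply the filter while reading, as an index seek would: one lookup."""
    row = index.get(wanted_id)
    return ([row] if row is not None else []), 1


# Hypothetical table of 100,000 rows keyed by id.
table = [{"id": i, "name": f"user{i}"} for i in range(1, 100_001)]
index = {row["id"]: row for row in table}

slow_rows, slow_reads = cte_then_filter(table, 42)
fast_rows, fast_reads = filter_inside_cte(index, 42)
print(slow_reads, fast_reads)
```

Both approaches return the same row; the difference is only in how much data is touched to get there, which is the gap the excerpt attributes to the two engines.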

SSL Provider Error: 31 With SQL Server In Docker

Andrew Pruski walks us through fixing a connection error:

I recently bought a Dell XPS 13 running Ubuntu 16.04 and ran into an issue when connecting SQL Operations Studio (version 0.31.4) to SQL 2017 CU9 running in a Docker container. Other people seem to be encountering this issue as well, so I am posting it in case it helps someone in the future.

The error generated was: –

System.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: SSL Provider, error: 31)

The full error can be viewed here

Read on for the solution.

Combining Apache Kafka With TensorFlow

Kai Waehner has an example of an application which uses Apache Kafka to stream car sensor data to TensorFlow on Google ML Engine:

A great benefit of Confluent MQTT Proxy is simplicity for realizing IoT scenarios without the need for a MQTT Broker. You can forward messages directly from the MQTT devices to Kafka via the MQTT Proxy. This reduces efforts and costs significantly. This is a perfect solution if you “just” want to communicate between Kafka and MQTT devices.

If you want to see the other part of the story (integration with sink applications like Elasticsearch / Grafana), please take a look at the GitHub project “KSQL for streaming IoT data”. This realizes the integration with Elasticsearch and Grafana via Kafka Connect and the Elastic connector.

Check it out and then take a gander at Kai’s GitHub repo.

Writing Better Jupyter Notebook Code

Henk Griffioen shows how to write Python code in your IDE of choice and then synchronize a Jupyter Notebook with the results:

How can we get the interactivity back and see our changes immediately in our Notebook? Add %autoreload at the top of your Notebook:

# Load the extension
%load_ext autoreload
# Autoreload all modules
%autoreload 2

%autoreload is a Jupyter extension that reloads modules before executing your code. Functions and classes loaded in notebooks get their functionality updated every time you execute a cell. This means that when new code is saved in the editor, the changes are immediately loaded in your Notebook if you run a cell.
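
What "reloading a module" means can be sketched outside a notebook with the standard library's importlib.reload. This is only an approximation of the idea, not how %autoreload is implemented: the extension decides when to reload automatically, whereas here the reload is triggered by hand. The module name scratch_helpers and the function greet are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import Tuple


def demo_reload() -> Tuple[str, str]:
    """Import a throwaway module, edit it on disk, and reload it by hand."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # keep stale .pyc files out of this demo
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "scratch_helpers.py"  # hypothetical module
        module_path.write_text("def greet():\n    return 'v1'\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("scratch_helpers")
            before = module.greet()
            # Simulate saving an edit in the IDE.
            module_path.write_text("def greet():\n    return 'version 2'\n")
            # Re-execute the module's source so the new definition takes effect.
            module = importlib.reload(module)
            after = module.greet()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("scratch_helpers", None)
            sys.dont_write_bytecode = old_flag
    return before, after


before, after = demo_reload()
print(before, after)
```

Without the reload step, the second call would still return the old value, because Python caches imported modules for the life of the process. That cached state is what makes edits in an external editor invisible to a running notebook.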

Using %autoreload bridges the gap between Notebook and IDE. You gain all the benefits of an IDE, but you’re still as flexible as before! See the GIF at the top as an example.

That’s a useful trick.  I’ve tended to use notebooks more for post-hoc work, where I’ve already structured my code and want to formalize it for others to use.

Dual Storage Mode In Power BI

Teo Lachev takes us through the Dual storage mode now available in Power BI:

As the name implies, the dual storage mode is a hybrid between Import and DirectQuery. Like importing data, the dual storage mode caches the data in the table. However, it leaves it up to Power BI to determine the best way to query the table depending on the query context. Consider the following schema where all tables come from the same data source. Let’s assume the following configuration:

  • FactInternetSales is imported

  • FactResellerSales is DirectQuery

  • DimDate and DimProduct are Dual
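
One way to picture how this configuration plays out is the small Python sketch below. It assumes a simplified rule: a Dual table is served from the cache when the query touches only Import or Dual tables, and is queried at the source when the query also touches a DirectQuery table. Power BI's actual decision logic is internal and is not spelled out in the excerpt, so treat the rule and the function name resolve_dual as assumptions made for illustration.

```python
from enum import Enum
from typing import Dict, List


class StorageMode(Enum):
    IMPORT = "Import"
    DIRECT_QUERY = "DirectQuery"
    DUAL = "Dual"


# The configuration from the list above.
MODEL: Dict[str, StorageMode] = {
    "FactInternetSales": StorageMode.IMPORT,
    "FactResellerSales": StorageMode.DIRECT_QUERY,
    "DimDate": StorageMode.DUAL,
    "DimProduct": StorageMode.DUAL,
}


def resolve_dual(tables_in_query: List[str]) -> Dict[str, StorageMode]:
    """Guess how each table in a query is served under the assumed rule."""
    uses_direct_query = any(
        MODEL[name] is StorageMode.DIRECT_QUERY for name in tables_in_query
    )
    resolved = {}
    for name in tables_in_query:
        mode = MODEL[name]
        if mode is StorageMode.DUAL:
            # Assumed rule: follow the fact table's mode within the query.
            mode = StorageMode.DIRECT_QUERY if uses_direct_query else StorageMode.IMPORT
        resolved[name] = mode
    return resolved


cached = resolve_dual(["FactInternetSales", "DimDate"])
live = resolve_dual(["FactResellerSales", "DimDate"])
print(cached["DimDate"].value, live["DimDate"].value)
```

Under this assumption, the same DimDate table behaves like an imported table next to FactInternetSales and like a DirectQuery table next to FactResellerSales.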

Read on for more.

Azure Data Factory V2 Now In General Availability

Chris Seferlis covers some of the improvements in Azure Data Factory V2:

With ADF V2 you get a browser-based interface using drag and drop technology; V1 was primarily done in the Visual Studio IDE. It also added triggers for scheduling, so you can schedule your jobs when required and in additional ways (which I’ll discuss further in a bit).

Some other features of ADF V2 that came out as it became generally available:

  • Lift and Shift operations for your SSIS packages, so if you have SSIS packages local, you can now Lift and Shift those into compute with the integration runtime service in Data Factory.

  • This also allows for cloud to cloud, cloud to prem, prem to prem and some third-party tools are supported within that as well.

  • Control flow activities like link branching, looping, conditional execution and parameterization.

  • Integration with HD Spark and Databricks for big data workloads and data science.

There’s more where that came from, too.
