Press "Enter" to skip to content

Curated SQL Posts

Understanding Data Integration Lifecycle Management

Andy Leonard explains DILM:

Data Integration Lifecycle Management (DILM) is not about data integration development.

DILM is about everything else:

  • Configurations Management
  • Version Management
  • Deployment
  • Execution

Although DILM is not about development, implementing DILM will impact the design of SSIS solutions.

This is the first part in a multi-part series and covers some of the conceptual basics behind DILM.
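
To make the Execution piece a bit more concrete, here’s a minimal T-SQL sketch of kicking off a package that has been deployed to the SSIS catalog; the folder, project, and package names are made up for illustration.

DECLARE @execution_id BIGINT;

-- Create an execution for a deployed package (folder, project, and package names are hypothetical)
EXEC SSISDB.catalog.create_execution
    @folder_name = N'Finance',
    @project_name = N'LoadWarehouse',
    @package_name = N'DailyLoad.dtsx',
    @execution_id = @execution_id OUTPUT;

-- SYNCHRONIZED = 1 makes start_execution wait until the package finishes
EXEC SSISDB.catalog.set_execution_parameter_value
    @execution_id = @execution_id,
    @object_type = 50,
    @parameter_name = N'SYNCHRONIZED',
    @parameter_value = 1;

EXEC SSISDB.catalog.start_execution @execution_id = @execution_id;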


Reporting Services Mobile Reports

Kathi Kellenberger shows off mobile reports in SQL Server Reporting Services:

Mobile Reports are dashboards that will run on most modern mobile devices as well as within the web portal. They are supported on iOS 9 and later, Android 4.4 or better, and Windows 10. To run Mobile Reports on these devices, the Power BI mobile application must be installed.

At first glance, they are simple to create. There is a new tool to use, the SQL Server Mobile Report Publisher. The tool will look familiar to you if you have worked with Datazen in the past. Microsoft purchased Datazen in 2015.

This is the first major Reporting Services update since 2008 (unless you consider sparkline support in R2 a major update), and could be a good business justification for upgrading to SQL Server 2016.


Columnstore Query Patterns

Niko Neugebauer gives a couple hints on how to make columnstore queries as fast as possible:

On my VM with 4 cores it takes 33 seconds to execute this query on SQL Server 2016 with Service Pack 1, while it burns almost 48 seconds of CPU time.
The relevant part of the execution plan can be found below, showing many of the performance problems this query suffers from, such as the INNER LOOP JOIN and INDEX SPOOL, plus an even worse part that is hidden and identifiable only once you open the properties of any iterator in the lower tree (left side of the LOOP JOIN): it all runs in Row Execution Mode.

To show you the problem: on the left side you will find the properties of the sort iterator in the lower (left) part of the LOOP JOIN, which was executed around 770,000 times in Row Execution Mode, effectively taking away any chance of this query executing quickly. One might argue that it could be more effective to do the loop part in Row Mode, but given that we are sorting around 3.1 million rows there, for me there is no doubt that it would be faster in Batch Execution Mode. Consulting the last sort iterator in the execution plan (TOP N SORT), you will find that it runs in Batch Execution Mode, even though it is processing around 770,000 rows.

There’s some valuable information here.
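
If you want to experiment along those lines, one hedged option in the spirit of Niko’s advice is to steer the optimizer away from the row-mode nested loop and toward a hash join, which is eligible for batch execution mode against a columnstore table. The table and column names below are invented.

-- Hypothetical schema: FactSales has a clustered columnstore index
SELECT d.CalendarYear,
       SUM(f.SalesAmount) AS TotalSales
FROM dbo.FactSales AS f
    INNER JOIN dbo.DimDate AS d
        ON d.DateKey = f.DateKey
GROUP BY d.CalendarYear
OPTION (HASH JOIN); -- nudge the plan away from a row-mode loop join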


Changing TDE Certificates

Tom Norman has a script to change the certificate used for TDE:

When you create a TDE certificate, by default the certificate is only valid for a year. Depending upon your auditor, you may be asked to change this TDE certificate yearly. In this blog post, you can see how to watch for TDE certificate expiration using Policy-Based Management: https://www.mssqltips.com/sqlservertip/2388/internal-sql-server-certificates-have-expired-is-this-an-issue/.

Click through for the script.
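
As a rough sketch of what the rotation looks like (the certificate name, database name, file paths, password, and expiry date are all placeholders), you create the new certificate in master, re-encrypt the database encryption key with it, and back the new certificate up.

USE master;

-- Create the replacement certificate with an explicit expiry date
CREATE CERTIFICATE TDECert_2018
    WITH SUBJECT = 'TDE certificate',
         EXPIRY_DATE = '2018-12-31';

-- Point the database encryption key at the new certificate
USE MyEncryptedDB;

ALTER DATABASE ENCRYPTION KEY
    ENCRYPTION BY SERVER CERTIFICATE TDECert_2018;

-- Back up the new certificate and its private key right away
USE master;

BACKUP CERTIFICATE TDECert_2018
    TO FILE = 'C:\Backup\TDECert_2018.cer'
    WITH PRIVATE KEY (FILE = 'C:\Backup\TDECert_2018.pvk',
                      ENCRYPTION BY PASSWORD = '<strong password here>');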


Uses For Binary Data Types

Daniel Hutmacher explains what binary data types are and one use case:

Because binary values are essentially strings, they easily convert to and from character strings, using CAST or CONVERT. To convert the binary value of 0x41 to a plain-text character value, try

SELECT CAST(0x41 AS char(1)); --- 'A'

The binary value 0x41 is equivalent to decimal 65, and CHAR(65) is the letter “A”. Note that I haven’t placed any quotes around 0x41 – that’s because it’s a numeric value (albeit in hex notation) and not a string.

A few use cases I’ve seen are creating hashes (SHA1 or MD5) for change detection, storing password hashes, and encrypted columns; Always Encrypted uses varbinary data types to store encrypted information, for example.
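
As a quick illustration of the change detection case (the table and column names here are hypothetical), HASHBYTES returns a varbinary value you can store alongside the row and compare on later loads.

-- SHA1 returns varbinary(20); SHA2_256 is also supported and returns varbinary(32)
SELECT CustomerID,
       HASHBYTES('SHA1',
                 CONCAT(FirstName, '|', LastName, '|', Email)) AS RowHash
FROM dbo.Customer;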


Getting A Handle On Isolation Levels

Robert Bishop explains some of the basics behind transaction isolation levels:

Lost Updates

It is possible for two or more transactions to modify the same row. While a subsequent transaction is reading the data, the first transaction commits its changes. Then the subsequent transaction makes its own changes and commits, possibly losing the changes committed by the first transaction. In this scenario, the last transaction always wins.

Dirty Read

A dirty read happens when a SELECT statement reads from a table that is currently being modified but not yet committed to the database. By default, SQL Server protects against this at every isolation level except Read Uncommitted, or when you use NOLOCK in your queries.

This is the first in a series.
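
If you want to see a dirty read for yourself, here’s a minimal two-session sketch using an invented table.

-- Session 1: modify a row but do not commit yet
BEGIN TRANSACTION;
UPDATE dbo.Account
SET Balance = Balance - 100
WHERE AccountID = 1;
-- ...transaction still open...

-- Session 2: READ UNCOMMITTED sees the in-flight change (a dirty read)
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Balance
FROM dbo.Account
WHERE AccountID = 1;

-- Session 1: roll back; session 2 has read a value that was never committed
ROLLBACK TRANSACTION;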


Password Expiration

Artemakis Artemiou hits a controversial topic in security today:

As you can see from the above screenshot, the maximum password age is set to 90 days. This means that if I have a login that uses SQL Server Authentication, or a contained database user with a password, and the login’s or user’s password was last changed 90 days (or more) ago, then the login or user will be prompted by SQL Server to change her password.

Even though this practice has been used for many years, not only in SQL Server but also at the Windows level, as well as in other systems and applications, recent studies argue that it should no longer be a recommended practice. On the contrary, these studies suggest that user passwords should not be changed regularly, but rather only when there is a specific and justified reason to do so. Furthermore, new security standards are being formulated that contain new recommendations on password changes.

Security is all about trade-offs.  If you make people change their passwords every 90 days, they’ll write the password down somewhere.
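
For reference, expiration only kicks in for SQL Server Authentication logins which opt into it via CHECK_EXPIRATION; a minimal sketch (the login name and password are placeholders) looks like this.

-- CHECK_EXPIRATION ties the login to the Windows maximum password age policy
CREATE LOGIN AppUser
    WITH PASSWORD = '<strong password here>',
         CHECK_POLICY = ON,
         CHECK_EXPIRATION = ON;

-- To stop enforcing expiration for an existing login:
ALTER LOGIN AppUser WITH CHECK_EXPIRATION = OFF;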


Reading Extended Events

Dave Mason writes a custom app to read extended events data:

It took me a while to make the transition from SQL Profiler to Extended Events. Eventually I got comfortable enough with it to use it 100% of the time. As I read more about the XEvents architecture (as opposed to just “using” XEvents), I gained a deeper appreciation of just how great the feature is. My only gripe is that there isn’t a way to handle the related events from within SQL Server using T-SQL. DDL triggers can’t be created for XEvents. And they can’t be targeted to Service Broker for Event Notifications (not yet, anyway). For now, the one way I know of to handle an XEvent is by accessing the event_stream target via the .NET framework. I’ll demonstrate with C#.

9/10, would have preferred F# but would read again.
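
The consuming side is C#, as Dave shows, but the session itself is still plain T-SQL. Here’s a minimal sketch of a session an application like his could stream from; the session name, event, and predicate are just an example.

-- A simple session capturing higher-severity errors; a streaming client attaches to its live event stream
CREATE EVENT SESSION [TrackErrors] ON SERVER
ADD EVENT sqlserver.error_reported
    (ACTION (sqlserver.sql_text, sqlserver.client_hostname)
     WHERE severity >= 16)
WITH (STARTUP_STATE = ON);

ALTER EVENT SESSION [TrackErrors] ON SERVER STATE = START;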


Pipelearner

Simon Jackson introduces pipelearner, a tool to help with creating machine learning pipelines:

This post will demonstrate some examples of what pipelearner can currently do. For example, the figure below plots the results of a model fitted to 10% to 100% (in 10% increments) of training data in 50 cross-validation pairs. Fitting all of these models takes about four lines of code in pipelearner.

Click through for some very interesting examples.


Monitoring Car Data With Spark And Kafka

Carol McDonald builds a model to determine where Uber cars are clustered:

Uber trip data is published to a MapR Streams topic using the Kafka API. A Spark streaming application, subscribed to the topic, enriches the data with the cluster Id corresponding to the location using a k-means model, and publishes the results in JSON format to another topic. A Spark streaming application subscribed to the second topic analyzes the JSON messages in real time.

This is a fairly detailed post, well worth the read.
