Curated SQL – Page 1416 – A Fine Slice Of SQL Server

Instead of giving a nice & neatly formatted pros & cons table where all the pros have a corresponding cons, let’s just discuss the major aspects: security & complexity.

Basically, in general, OAuth is more secure but more complex for both clients (i.e. consumer) and services.

Why is OAuth more secure? Relying parties never see credentials & secrets in an OAuth authentication scheme. They see a token. Token are revoked after a while ; often minutes, maximum a few hours.

Read on for more. My preference is OAuth, but it’s not always trivial to set up.

Comments closed

SQL Vulnerability Assessment

Published 2017-12-12 by Kevin Feasel

Ronit Reger shows off the new SQL Vulnerability Assessment, available in SQL Server 2012 and later:

Not only does VA expose some of the possible security flaws you have in your database system, it also provides remediation scripts to resolve issues within a couple of mouse clicks. In addition, you can accept specific results as your approved baseline state, and the VA scan report will be customized accordingly to expect these values.

Ronit and Alan Yu also mention it being available via the latest version of Management Studio, 17.4:

The VA service runs a scan directly on your SQL database or server. VA employs a knowledge base of rules that flag security vulnerabilities and deviations from best practices, such as misconfigurations, excessive permissions, and exposed sensitive data. The rule base grows and evolves over time, to reflect the latest security best practices recommended by Microsoft.

Results of the assessment include actionable steps to resolve each issue and provide customized remediation scripts where applicable. An assessment report can be customized for each customer environment and tailored to specific requirements. This process is managed by defining a security Baseline for the assessment results, such that only deviations from the custom Baseline are reported.

VA is supported for SQL Server 2012 and later, and can also be run on Azure SQL Database.

This looks like a good reason to upgrade SSMS.

Comments closed

More Collation Comparison Fun

Published 2017-12-12 by Kevin Feasel

Solomon Rutzky wraps up his collation comparison series:

Just in case you were wondering: No, the test directly above test does not prove the documentation (as quoted at the top of Part A) correct. Yes, the documentation did state that characters would be converted to the Code Page specified by the Collation of the Database or Column, which does appear to be what is going on here. But, the differences are:

The documentation states that the transformation happens if you reference a Unicode datatype, but what we just saw in the most recent test is the exact opposite:
- only the NVARCHAR columns of the “Subscript 2” row match because they are still the “Subscript 2” character, while the NVARCHAR columns of the other two rows do not match due to being either “2” or “?”.
- transformation did occur in the Latin1 and Hebrew VARCHAR columns, which is how it matched both rows with “2” in the Latin1 column and both rows with “?” in the Hebrew column.
In the scenario involving another column where it would help to prefix the string literal with a capital-“N” (i.e. a VARCHAR column using a Collation that specifies a different Code Page than the Database’s Code Page), the Collation of the Database is not used for the transformation; it is only the referenced column’s Collation. The assumption here is that the string literal without the capital-“N” is being used in a Database where the Collation specifies a Code Page that has all of the characters.
In the scenarios where the Database’s Collation, via its specified Code Page, did transform a string literal that was not prefixed with a capital-“N” (the two tests in the previous post), there would have already been unintended behavior no matter how the string literal was used.

I still think it’s a duck.

Comments closed

Using List.Accumulate

Published 2017-12-12 by Kevin Feasel

Reza Rad shows off a Power Query function which is not available in the UI:

List.Accumulate is a function that can easily save a number of steps in your Power Query transformations, instead of applying multiple steps, you can simply use List.Accumulate to overcome what you want. List.Accumulate function loops through the list and accumulate a value as a result. This function needs usually three parameters; the list itself, seed, and accumulator. Here are parameters explained in details;

list; the list that we want to apply the transformation to it.

seed; is an optional parameter. this is the initial value.

accumulator; is a function. this function determines what accumulation calculation happens on items of the list. the way that this function is defined is exactly the way that you write a function in Power Query M script using Lambda expressions.

best way to learn about seed and accumulator is through some examples, let’s apply some transformations with List.Accumulate and see how these two parameters are working.

Read on to see how to use it.

Comments closed

Optimized Bitmaps On Columnstore Indexes

Published 2017-12-12 by Kevin Feasel

Joe Obbish digs into bitmap filters and clustered columnstore indexes:

The position of the bitmap has changed so that it’s evaluated after the key lookup. That makes sense because the key lookup returns the column to be filtered against. However, the bitmap filter still reduces the estimated number of key lookups from 3000000 to 3000. This is impossible. The filter can only be applied after the key lookup, so it does not make sense for the bitmap to reduce the number of estimated executions of the key lookup.

Performance is significantly worse with the query now requiring 12199107 logical reads from the rowstore table and 13406 CPU time overall. We can see that the query did three million key lookups:

This is a fairly deep post, so you’ll probably want to check out the Paul White post on bitmaps first.

Comments closed

Units Of Measure In The ShowPlan Schema

Published 2017-12-12 by Kevin Feasel

Grant Fritchey shows off the ShowPlan Schema:

Because the showplan schema contains notes throughout stating what the units of measure are, what each of the values means. For instance, I can explain why SerialDesiredMemory, DesiredMemory, RequestedMemory are identical:

…Provide memory grant estimate as well as actual runtime memory grant information. Serial required/desired memory attributes are estimated during query compile time for serial execution. The rest of attributes provide estimates and counters for query execution time considering actual degree of parallelism. SerialRequiredMemory: Required memory in KB if the query runs in serial mode. The query will not start without this memory. SerialDesiredMemory: Memory estimated to fit intermediate results in KB if the query runs in serial mode. RequiredMemory: Required memory in KB for the chosen degree of parallelism. If the query runs in serial mode, this is the same as SerialRequiredMemory. …

That’s taken directly from the 2017 schema. The units of measure are KB.

I’d never seen this before, so that’s going on my to-read list.

Comments closed

Kafka Streams And Time-Based Batching

Published 2017-12-11 by Kevin Feasel

Vladimir Vajda provides a warning for people using Kafka Streams:

To completely understand the problem, we will first go into detail how ingestion and processing occur by default in Kafka Streams. For example purposes, the punctuate method is configured to occur every ten seconds, and in the input stream, we have exactly one message per second. The purpose of the job is to parse input messages, collect them, and, in the punctuate method, do a batch insert in the database, then to send metrics.

After running the Kafka Stream application, the Processor will be created, followed by the initmethod. Here is where all the connections are established. Upon successful start, the application will listen to input topic for incoming messages. It will remain idle until the first message arrives. When the first message arrives, the process method is called — this is where transformations occur and where the result is stored for later use. If no messages are in the input topic, the application will go idle again, waiting for the next message. After each successful process, the application checks if punctuate should be called. In our case, we will have ten process calls followed by one punctuate call, with this cycle repeating indefinitely as long as there are messages.

A pretty obvious behavior, isn’t it? Then why is one bolded?

Read on for more, including how to handle this edge case.

Comments closed

Kafka And GDPR

Published 2017-12-11 by Kevin Feasel

Ben Stopford has some ideas for using Kafka in a GDPR world:

The simplest way to remove messages from Kafka is to simply let them expire. By default, Kafka will keep data for two weeks, and you can tune this to arbitrarily large periods of time as required. There is also an Admin API that lets you delete messages explicitly if they are older than some specified time or offset. But what if we are keeping data in the log for a longer period of time, say for Event Sourcing architectures or as a source of truth? For this, you can make use of compacted topics, which allow messages to be explicitly deleted or replaced by key.

Data isn’t removed from compacted topics in the same way as in a relational database. Instead, Kafka uses a mechanism closer to those used by Cassandra and HBase where records are marked for removal then later deleted when the compaction process runs. Deleting a message from a compacted topic is as simple as writing a new message to the topic with the key you want to delete and a null value. When compaction runs the message will be deleted forever.

Click through for more information.

Comments closed

DevOps And Culture

Published 2017-12-11 by Kevin Feasel

Alex Yates points out a problem in how many people think of DevOps:

There is so much that is wrong with that conversation.

We could talk about the bottlenecks and the large amount of work in progress backed up in test – and the ways that could be fixed,
We could talk about the infrequent ‘big bang’ release in three months and the manual, error prone heroics that will probably be required to deliver it – and the ways that could be fixed,
We could talk about the misguided approach regarding branching strategies or the shared development database – and the ways they could be fixed,
We could talk about testing silos and the likelihood of drift and inconsistencies between different environments – and the ways they could be fixed,
We could talk about the word “DevOps-ing” – and why it should be burned along with anyone who uses it un-ironically. (And anyone who uses the word “irony” inappropriately or puts their commas at the end of the line.),

But I’m not going to talk about any of those things. I’m not going to talk about any of the things the customer said. I’m not going to talk about any of the technical issues or the possible solutions to those problems.

I’m going to talk about something much, much more important.

Read on to see what is much, much more important: culture.

Comments closed

Troubleshooting Client Disconnections

Published 2017-12-11 by Kevin Feasel

Mike Hays looks at a trace flag that can help you troubleshoot why client connections drop:

About once a month, I get support ticket regarding SQL Server dropping an application’s or user’s connection. The problem is SQL Server does not just randomly drop a connection and continue to work normally. Some force outside the control of SQL Server breaks the connection. By default, SQL Server does not record when this event occurs.

In my history of working with SQL Server, only in extreme situations have I ever seen SQL Server drop its connections. The most common example is when SQL Server is in the process of shutting down.

Click through for the trace flag and details.

Comments closed

Curated SQL Posts