Press "Enter" to skip to content

Author: Kevin Feasel

Bitwise Logic To Make Values Negative

David Fowler is working with arcane magic:

The short answer to the question is to simply apply a bitwise NOT to the number that you want to convert and add 1 to the result.  I wrote about bitwise logic in The Arcane Science of Bitwise Logic and SQL Server but didn’t mention the bitwise NOT.

If you want to apply a bitwise NOT to a value, simply use the ~

Give it a go, try converting 10 to a negative,

SELECT ~ 10 + 1

The result, -10.
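
To see the two's complement identity in action, here's a quick sketch (the sample values are arbitrary) comparing ~n + 1 against plain unary negation:

-- Compare bitwise negation (~n + 1) against unary minus for a few
-- arbitrary sample values; the two columns should always match.
SELECT n,
       ~n + 1 AS bitwise_negation,
       -n AS unary_negation
FROM (VALUES (10), (0), (-7), (255)) AS t(n);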

One of my rules of thumb (given how many I have, I need more thumbs) is that if you’re using bitwise logic in SQL Server, something has likely gone wrong somewhere along the way.  But read David’s explanation, which is quite clear.


What TDE Does To Query Performance

Matthew McGiffen has a few tests on using Transparent Data Encryption:

By the time it had been executed 5 times (with the memory flushed between each execution) each query read about 600,000 pages sized at 8KB each – just under 5GB. If it took 50 seconds on the decryption of those pages, then each page took about one-twelfth of a millisecond to decrypt – or alternatively, TDE decrypted about 12 pages per millisecond. Or in terms of disk throughput, about 100MB per second. These were tests on a server with magnetic spinning disks (not SSDs) and you can see from the above figures that the straight disk access took about 40 seconds on its own.

When pages are already in memory, TDE adds no overhead, but how do we quantify the overhead to queries when SQL Server does have to read from disk?

Matthew has some good advice here, and I’d be willing to say that his experience is within the norm for TDE and doesn’t directly contradict general guidelines by enough to shift priors.
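
If you want to try a cold-cache test along the lines of Matthew's, here's a minimal sketch. dbo.LargeTable is a placeholder for any table big enough to make the decryption cost visible, and flushing the buffer pool belongs on a test server only:

-- Confirm the database is actually encrypted (encryption_state = 3).
SELECT DB_NAME(database_id) AS database_name, encryption_state
FROM sys.dm_database_encryption_keys;

-- Flush clean pages so the next scan reads from disk and has to be
-- decrypted. Test servers only!
CHECKPOINT;
DBCC DROPCLEANBUFFERS;

-- Time a large scan; compare elapsed times with TDE enabled and disabled.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
SELECT COUNT(*) FROM dbo.LargeTable;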


Technical Debt, T-SQL Business Rules Edition

Paul Turley tells a story of technical debt:

They’ve been writing reports using some pretty complicated SQL queries embedded in SSRS paginated reports.  Every time a user wants a new report, a request is sent to the IT group.  A developer picks up the request and writes some gnarly T-SQL query with pre-calculated columns and business rules.  Complex reports might take days or weeks of development time.  I needed to update a dimension table in the data model and add a calculated column to differentiate case types.  It turned out that it wasn’t a simple addition, and the developer’s response was “I’ll just send you the SQL for that…you can just paste it”.  The dilemma here is that all the complicated business rules had already been resolved using layers of T-SQL common table expressions (CTEs), nested subqueries and CASE statements.  It was very well-written SQL and it would take considerable effort to re-engineer the logic into a dimensional tabular model to support general-use reporting.  After beginning to nod off while reading through the layers of SQL script, my initial reaction was to just paste the code and be done with it.  After all, someone had already solved this problem, right?

It’s the persistent battle between “don’t fix what isn’t broken” and “the process is broken, even if the code isn’t.”


Deploying A Container On Azure

Andrew Pruski shows us how to deploy an Azure Container Instance:

The code should be fairly self-explanatory. I’m using the username and password created earlier to access the ACR and then spinning up a container from the sqlserverlinuxagent:latest image. The container has 2 CPUs and 4GB of memory available to it, and it will be listening on a public IP address on port 1433 (be very careful with this).

At the time of writing, the only option available for ip-address is public; hopefully, further options will be available soon. I will update this blog if/when that happens.

Read on for a demo.


Spark: DataFrame To RDD For Data Cleansing

Gilad Moscovitch walks us through a common data cleansing problem with Spark data frames:

A problem can arise when one of the inner fields of the JSON has undesired non-JSON values in some of the records.
For instance, an inner field might contain HTTP errors that would be interpreted as a string rather than as a struct.
As a result, our schema would look like:
root
 |-- headers: struct (nullable = true)
 |    |-- items: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |-- requestBody: string (nullable = true)
Instead of:
root
 |-- headers: struct (nullable = true)
 |    |-- items: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |-- requestBody: struct (nullable = true)
 |    |-- items: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
When trying to explode a “string” type, we will get a type mismatch error:
org.apache.spark.sql.AnalysisException: Can't extract value from requestBody#10

Click through to see how to handle this scenario cleanly.


Visualization Over Kafka And KSQL

Shant Hovsepian shows off a data visualization tool which can read Kafka Streams data:

KSQL is a game-changer not only for application developers but also for non-technical business users. How? The SQL interface opens up access to Kafka data to analytics platforms based on SQL. Business analysts who are accustomed to non-coding, drag-and-drop interfaces can now apply their analytical skills to Kafka. So instead of continually building new analytics outputs due to evolving business requirements, IT teams can hand a comprehensive analytics interface directly to the business analysts. Analysts get a self-service environment where they can independently build dashboards and applications.

Arcadia Data is a Confluent partner that is leading the charge for integrating visual analytics and BI technology directly with KSQL. We’ve been working to combine our existing analytics stack with KSQL to provide a platform that requires no complicated new skills for your analysts to visualize streaming data. Just as they will create semantic layers, build dashboards, and deploy analytical applications on batch data, they can now do the same on streaming data. Real-time analytics and visualizations for business users have largely been an empty promise until now. For example, some architectures enabled visualizations for end users by staging Kafka data into a separate data store, which added latency. KSQL removes that latency to let business users see the most recent data directly in Kafka and react immediately.
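
For a sense of what that SQL interface looks like, here's a minimal KSQL sketch; the stream, topic, and column names are all made up:

-- Register an existing Kafka topic as a stream (hypothetical names).
CREATE STREAM pageviews (user_id VARCHAR, page VARCHAR)
  WITH (KAFKA_TOPIC = 'pageviews', VALUE_FORMAT = 'JSON');

-- A continuously updating aggregate that a BI tool could visualize.
CREATE TABLE views_per_user AS
  SELECT user_id, COUNT(*) AS view_count
  FROM pageviews
  GROUP BY user_id;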

Click through for a couple repos and demos.


Using The Public Role

Kenneth Fisher explains the public role in SQL Server:

A common misunderstanding is that the CONNECT permission lets you do more than just connect to a database. It doesn’t. Connection only. So how come there are some things that everyone can do once they are connected to a database? Well, it’s the public role. Everyone is a member and that can’t be changed. In fact, you can’t even disable it. Oh, and I should point out that every database has one.

So what does that mean? If you have a table that you want everyone to have read access to, you could grant the permission to public.
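
As a quick illustration (dbo.Lookup is a hypothetical table), granting everyone in the database read access is a one-liner:

-- Every principal in the database can now read this table.
GRANT SELECT ON dbo.Lookup TO public;

-- The grant can be undone the same way.
REVOKE SELECT ON dbo.Lookup FROM public;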

I never grant the public role any permissions, so for me it stays benign.  I strongly dislike database security tools which flag the public role as a risk, mostly because I once made the mistake of believing such a tool and ended up having to grant things like CONNECT to each new login.


Executing Dynamic SQL: EXEC vs sp_executesql

Tara Kizer gives a scenario where using sp_executesql can result in lower CPU utilization than EXEC:

For frequently executed queries, “EXEC sp_executesql @sql” is a good choice but at the risk of encountering parameter sniffing issues.

For queries that aren’t executed very often, “EXEC (@sql)” can be used, but you could end up with a bloated plan cache with lots of single-use execution plans.

Let’s compare the performance difference between “EXEC (@sql)” and “EXEC sp_executesql @sql” for a frequently executed query.

For application code, I tend to lean on sp_executesql very heavily to create parameterized queries.
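
To make the difference concrete, here's a sketch of the two patterns; dbo.Orders and @OrderID are illustrative:

DECLARE @sql NVARCHAR(MAX), @id INT = 42;

-- EXEC(@sql): the literal is baked into the string, so each distinct
-- value compiles its own single-use plan.
SET @sql = N'SELECT * FROM dbo.Orders WHERE OrderID = '
         + CAST(@id AS NVARCHAR(10)) + N';';
EXEC (@sql);

-- sp_executesql: the value stays a parameter, so the plan gets reused
-- across values (at the risk of parameter sniffing).
SET @sql = N'SELECT * FROM dbo.Orders WHERE OrderID = @OrderID;';
EXEC sp_executesql @sql, N'@OrderID INT', @OrderID = @id;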


Some Waits Just Need Ignoring

Paul Randal explains that not all SQL Server wait types are pernicious:

Wait statistics analysis is one of my favorite things to talk about because it’s so incredibly useful for performance tuning and can dramatically shorten the time it takes to zero in on the root cause of a performance problem. But you have to do it correctly. You can’t just do a SELECT * FROM sys.dm_os_wait_stats. Various people have published scripts online to aggregate and display wait statistics in an actionable way, and my script is one of the most popular (latest version is always in this post).

One question I’m often asked is why does my script have a list of wait types that it specifically filters out? The answer is that those wait types are what I call ‘benign’ – they’re usually not a problem but happen frequently enough from regular SQL Server operations that they would show up as the top waits and so would obscure the waits that you can do something about.
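
As a rough sketch of that filtering idea (the exclusion list below is a tiny, illustrative subset of the much longer list in Paul's script):

-- Aggregate waits, excluding a handful of wait types that occur as part
-- of normal SQL Server operation.
SELECT wait_type,
       wait_time_ms,
       signal_wait_time_ms,
       waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN
      (N'LAZYWRITER_SLEEP', N'SQLTRACE_BUFFER_FLUSH', N'XE_TIMER_EVENT',
       N'BROKER_TASK_STOP', N'SLEEP_TASK', N'WAITFOR')
  AND waiting_tasks_count > 0
ORDER BY wait_time_ms DESC;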

Read on for the rest of the story.


Scoping And PowerShell Script Blocks

Shane O’Neill looks at variable scope within script blocks in PowerShell:

Now what happens if we don’t want to use switch? What happens if we tried to do this with script blocks instead?

I’ve created 3 different script blocks here just for ease of use. If you want to create a single script block to do this then I encourage it! Let me know how you get on.

We’re also passing in the parameter into the script block by passing it in the brackets of the .InvokeReturnAsIs() method.

Read on and learn along with Shane.
