SET STATISTICS XML

Kevin Feasel

2016-05-23

Syntax

Daniel Hutmacher explains how SET STATISTICS XML will generate execution plans for certain segments of code:

But sometimes you want to run a series of statements or procedures where you only want the execution plan for some of the statements. Here’s how:

The actual execution plan is enabled by turning on SET STATISTICS XML., not unlike enabling STATISTICS IO or TIME. And just like SET NOCOUNT, the SET statements apply to the current context, which could be a stored procedure, a session, etc. When this context ends, the setting reverts to that of the parent context.

I see code snippets with STATISTICS IO and TIME fairly regularly, but almost never see STATISTICS XML; instead, I see people (including myself) hit Ctrl-M or select the “Include Actual Execution Plan” button when generating execution plans is desirable.

What To Do When Corruption Hits You

Kevin Feasel

2016-05-23

DBCC

Brent Ozar has a step-by-step guide explaining what to do when you CHECKDB reports corruption:

This sounds paranoid, but as an example, here’s a corruption case I had recently: shortly after detecting corruption, the team realized they’d have to revert to a backup of the database from a few days ago. Rather than telling users about that possibility, they let the users keep adding data into the already-corrupt database while the DBAs did troubleshooting. Several days later, as the corruption got worse, even Microsoft couldn’t repair the corruption – and the affected tables went completely offline, permanently. If the users would have been alerted earlier, they could have avoided even more data loss.

Good advice.  If you have Pluralsight, I recommend Paul Randal’s course on database corruption.  Watch that ideally before you have corruption…

Updates With Joins

Kevin Feasel

2016-05-23

Syntax

James Anderson shows a case in which an update with a join clause can lead to undesirable results:

That’s right; SQL Server will just pick any child. It will not update the parent row for each child. When the MERGE statement runs into this problem, it raises an error and rolls back the UPDATE but our query will silently pick any value and move on. We don’t want to update data like this, we want our query to dictate the logic explicitly. This will require some changes to our query.

I prefer to write these types of updates with the use of a CTE. This allows us to easily highlight the SELECT in the CTE and execute it to see which rows will be updated (@variables can cause problems here though). Adding a COUNT can help to identify the problem in the previous query.

Highlight and execute the code inside the CTE below to check for duplicates. If this returns 0 rows then you are good to remove the COUNT, GROUP BY and HAVING then add the name columns before executing the whole statement.

When writing T-SQL updates with joins, it’s important to consider whether the grain changes, and if that change can make a difference in your update set.

How WITH ENCRYPTION Works

Paul White digs into the WITH ENCRYPTION clause:

The core RC4 algorithm is well-known, and relatively simple. It would be better implemented in a .Net language for efficiency and performance reasons, but there is a T-SQL implementation below.

These two T-SQL functions implement the RC4 key-scheduling algorithm and pseudorandom number generator, and were originally written by SQL Server MVP Peter Larsson. I have a made some minor modifications to improve performance a little, and allow LOB-length binaries to be encoded and decoded. This part of the process could be replaced by any standard RC4 implementation.

Using WITH ENCRYPTION is a gentleman’s agreement that you won’t look at the underlying code.  In practice, it’s trivial to get around, and Paul shows exactly why.

Changing Computed Columns

Steve Jones shows a how to modify a computed column:

I was working with a computed column the other day, and realized I had the wrong definition. In this case, I was performing some large calculation, and the result was larger than an int. However the first part of the formula was an int, which resulted in an implicit conversion to an int.

I needed to change the formula, and then realized that plenty of people might not work with computed columns much, and not realize how you alter a computed column.

You don’t.

In fact, you need to drop the column and add it back. In my case, this was what I did.

Fortunately, this tends to be a pretty quick operation, especially if the computed column is non-persisted.

“Let” Expressions In M

Chris Webb explains “let” expressions  in M:

In the M language a let expression consists of two sections. After the let comes a list of variables, each of which has a name and an expression associated with it. In the previous example there are three variables: step1, step2 and step3. Variables can refer to other variables; here, step3 refers to both step1 and step2. Variables can be used to store values of any type: numbers, text, dates, or even more complex types like records, lists or tables; here, all three variables return numbers. The Query Editor is usually clever enough to display these variables as steps in your query and so displays then in the Applied Steps pane on the right-hand side of the screen

It’s a look at one of the fundamentals of an interesting language.

Thread-Local Storage

Ewald Cress digs into fundamentals:

As a teaser for where this is heading, I’ll reframe the problem as classic SQL Server examples. Firstly, when a latch wait occurs somewhere in the bowels of a LatchBase subclass instance, how does that latch method know to track the wait against an instance of a Worker, or make it known to the world that it is holding up that Worker? And secondly, at a much higher abstraction level, when a task executes a user query and needs to access a table, how does the access methods code know what security principal to do security checks against? We are taking the first steps towards answering these questions here.

I enjoy Ewald’s explanations because when I’m done, I really feel like I have a clue of what’s going on.  It all fades away as soon as I look away from the screen, but that’s on me, not him.

History Of Apache Storm

Kevin Feasel

2016-05-20

Hadoop

Taylor Goetz gives a history of Storm up to release 1.0:

Storm was originally created by Nathan Marz while he was at Backtype (later acquired by Twitter) working on analytics products based on historical and real-time analysis of the Twitter firehose. Nathan envisioned Storm as a replacement for the real-time component that was based on a cumbersome and brittle system of distributed queues and workers. Storm introduced the concept of the “stream” as a distributed abstraction for data in motion, as well as a fault tolerance and reliability model that was difficult, if not impossible, to achieve with a traditional queues and workers architecture.

Nathan open sourced Storm to GitHub on September 19th, 2011 during his talk at Strange Loop, and it quickly became the most watched JVM project on GitHub. Production deployments soon followed, and the Storm development community rapidly expanded.

Storm is an exciting technology in that it’s a key driver in making Hadoop more than just a batch processing framework.

Machine Learning Skepticism

Julia Evans gives reasons to tamp down expectations with machine learning:

When explaining what machine learning is, I’m giving the example of predicting the country someone lives in from their first name. So John might be American and Johannes might be German.

In this case, it’s really easy to imagine what data you might want to do a good job at this — just get the first names and current countries of every person in the world! Then count up which countries Julias live in (Canada? The US? Germany?), pick the most likely one, and you’re done!

This is a super simple modelling process, but I think it’s a good illustration — if you don’t include any data from China when training your computer to recognize names, it’s not going to get any Chinese names right!

Machine learning projects are like any other development projects, with more complex algorithms.  There’s no magic and there’s a lot of perspiration (hopefully figuratively rather than literally) involved in getting a program which behaves correctly.

Template Replacement

Andy Mallon shows SSMS template replacements:

In the above example, there’s not much value-add by using the template replacement. It’s probably easier to just use @variables and highlight-replace.

The template replacement really shines when you have examples where you’d otherwise need to use dynamic SQL. If you have object names or database names that need replacement, this is a great answer. If you work in a multi-tenant hosting environment, and a client name is part of the DB name, this can make your life a lot easier.

Templates work great with auto-replace (a feature several third-party toolkits include).  My favorite auto-replace that I’ve created is “die” which asks for an schema and procedure name and generates the DROP PROCEDURE script.  Naturally, I also have diet (table), diev (view), and dief (function).

Categories

September 2018
MTWTFSS
« Aug  
 12
3456789
10111213141516
17181920212223
24252627282930