
Month: November 2021

Version 12 of sp_WhoIsActive

Erik Darling answers the long-standing question “Who is active?” with “You is active!”:

– New parameter, @get_memory_info, that exposes memory grant information, both in two top-level scalar columns and a new XML-based memory_info column.

– Better handling of the newer CX* parallelism wait types that have been added post-2016

– A top-level implicit_transaction identifier, available in @get_transaction_info = 1 mode

– Added context_info and original_login_name to additional_info collection

– A number of small bug fixes

– Transition code to use spaces rather than tabs

Spaces rather than tabs? SQL should have tabs! But functional programming languages are great and they use spaces! I’m so conflicted!


SQL Server 2022 and Big Releases

Brent Ozar opines on an interesting topic:

The question, posed by Brent’s Tasty Beverage (nicely done), was:

My friends feel announcement from MS regarding SQL22 were only relatively small changes (since we didn’t see too much of multiple plans technically or demo), nothing groundbreaking or revolutionary. What are your thoughts?

Read on for Brent’s thoughts. I’ll say that I’m still expecting a few smaller surprises to come in as we get closer to CTPs.


Most Business Ideas Fail

Eric Colson, et al., have a humbling thought for us:

The introduction of data science into the business world has contributed far more than recommendation algorithms; it has also taught us a lot about the efficacy with which we manage our businesses. Specifically, data science has introduced rigorous methods for measuring the outcomes of business ideas. These are the strategic ideas that we implement in order to achieve our business goals. For example, “We’ll lower prices to increase demand by 10%” and “we’ll implement a loyalty program to improve retention by 5%.” Many companies simply execute on their business ideas without measuring if they delivered the impact that was expected. But, science-based organizations are rigorously quantifying this impact and have learned some sobering lessons:

1. The vast majority of business ideas fail to generate a positive impact.

2. Most companies are unaware of this.

3. It is unlikely that companies will increase the success rate for their business ideas.

Read the whole thing. It gives a lot of perspective to a difficult problem: there aren’t as many “free wins” in a business as you might expect. To paraphrase Adam Smith, there is a lot of ruin in a company…but that doesn’t mean you know what exactly it is or how exactly to fix it. Coming in with appropriate humility and a flexible mind (by which I mean a willingness to see reality even when it doesn’t comport with the mental model you’ve built over time) can help improve those odds.
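As a rough illustration of what measuring an idea's impact can look like, here is a minimal Python sketch of a two-proportion z-test on a made-up loyalty-program experiment. The function name and every number in it are my own assumptions for illustration, not anything from the article, and a real analysis would also need to account for experiment design, sample size, and multiple comparisons.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Compare the success rate of group B (treatment) against group A (control).

    Returns (lift, z, p_value), where lift is rate_b - rate_a and p_value is
    the two-sided p-value under the pooled-variance normal approximation.
    """
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value for a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


# Hypothetical data: 10,000 customers per group, counting how many were retained.
lift, z, p_value = two_proportion_z_test(1000, 10_000, 1050, 10_000)
print(f"lift={lift:.4f} z={z:.2f} p={p_value:.3f}")
```

With these invented numbers the observed lift is half a percentage point, but the p-value lands well above 0.05, so the experiment would not support the claim that the program improved retention. That is the kind of sobering result the authors are talking about.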


Graphing Three or More Dimensions

Mike Cisneros takes on a challenge:

When we have three or more dimensions to show, how do you recommend we do it? I worry that my audience might not be able to make sense of it all.

This is a great question. As analysts we are often asked to consider multiple dimensions at once, and investigate complex relationships among these variables. In doing so, we may use visual analyses to explore and find patterns and outliers. The graph types we use to do this tend to be complicated and less intuitive than a simple bar chart or line chart. They might make sense to a trained observer, but to an unfamiliar audience, they’re at best confusing and at worst impenetrable. 

Click through for a few techniques, none of which directly involves 3D graphs, as those are really difficult for humans to understand in most circumstances.


Table-Valued Parameters and Dapper (.NET Core Edition)

Randolph West hits on a timely question:

A customer I’ve been working with for a while now has a monolithic ASP.NET MVC web application which we are porting to .NET Core 3.1 (and then almost immediately to .NET 6). One of our biggest changes was getting rid of Entity Framework and replacing it with Dapper, because performance is a feature.

To deflect the ire of EF Core aficionados out there, the answer is still no.

Dapper is a micro-ORM in that it does not do as much “magic” as Entity Framework. This necessitates more work at the data access layer, but we have the trade-off of speed.

I say this is timely because my team is working through this exact thing right now. For future reference, anticipating what my team is working on and writing a blog post which answers a question we have is an outstanding way of getting noticed here.


Querying Delta Lake via Azure Synapse Analytics Serverless SQL Pool

Tony Truong uses T-SQL to query Delta Lake files:

How to query Delta Lake with SQL on Azure Synapse

As mentioned earlier, Azure Synapse has several compute pools for the evolving analytical workload. There is the Apache Spark pool for data engineers and serverless SQL pool for analysts. Let us break down how the two personas work together to query a shared Delta Lake.  

Read on for the setup and the payoff.


Enabling the Single Value Slicer Option in Power BI

Marco Russo and Alberto Ferrari hack the Gibson:

You cannot apply the same behavior to a column of a table you created or imported by using Power BI. However, Tabular Editor is your key to unlocking this feature. The article shows the user interface of the free version of Tabular Editor; the steps required are identical in the commercial version of Tabular Editor.

IMPORTANT DISCLAIMER:  The properties modifications suggested in the following description are not supported by Microsoft. You apply these changes at your own risk. You should always create a backup of the Power BI file before modifying it.

I’m pretty sure a disclaimer like that just makes me want to do it all the more.


Eliminate the DeWitt Clause

Justin Olsson and Reynold Xin throw down the gauntlet:

At Databricks, we often use the phrase “the future is open” to refer to technology; it reflects our belief that open data architecture will win out and subsume proprietary ones (we just set a new official record on TPC-DS). But “open” isn’t just about code. It’s about how we as an industry operate and foster debate. Today, many companies in tech have tried to control the narrative on their products’ performance through a legal maneuver called the DeWitt Clause, which prevents comparative benchmarking. We think this practice is bad for customers and bad for innovation, and it’s time for it to go. That’s why we are removing the DeWitt Clause from our service terms, and calling upon the rest of the industry to follow.

One measure of how influential you are is how many legal terms are named after you, which I’m pretty sure makes Dr. DeWitt the Steve Tasker of the database industry. So put David DeWitt in the Data Platform Hall of Fame.

And good of Databricks to eliminate their DeWitt Clause. Vendors put the clause in ostensibly to prevent rigged or invalid comparisons between products, but there’s a much better way to do this: publish the benchmark configuration and allow peer validation. If you put out garbage numbers (including by accident because you didn’t know the right way to do something), people are smart enough to catch that. And if people aren’t willing to publish the process, call for them to do it, and if they still don’t, ignore the results. 100 times out of 100, that’s the right way to do it…assuming that you’re looking for the truth and not just trying to hide inferiorities in your product *cough* Oracle *cough*.
