
Author: Kevin Feasel

Fill Factor and When It Matters

Raul Gonzalez has a confession to make:

I love SQL Server internals, I do and I just said it.

Why? Because thanks to all the tools, documentation, and community members who share their knowledge, folks like me can understand how a super complex piece of software like a relational database engine works (or at least a small part of it).

Click through for a discussion of fill factor and one area where Raul thinks it falls short. I’m not sure that I agree but would need to think about it to give a clear explanation as to why.


Auditing Data and Data Access Quiz

Kenneth Fisher has a pop quiz for us:

I was honored to speak at PASS Summit last week (thanks again, Redgate), and if you’ve ever been to one of my sessions you’ll know there is always a “quiz” at the end, i.e. a crossword puzzle. Well, here is the puzzle itself, and attached (at the bottom) is the answer key.

I answered “c” for all of the questions and it worked out really well. When in doubt, Charlie out!


When MAX becomes TOP

Forrest McDaniel tries out a few aggregations:

I can’t tell you how tempted I was to call this MAXTOP. Thinking about it still causes stray giggles to bubble up, as I gleefully envision sharing this info with eager young DBAs at a SQL Saturday, then ushering them off to a session on parallelism. Thankfully I’m not that evil.

The real conclusion is that SQL Server is programmed to be very clever. Despite the cleverness though, details matter, and the cleverness often falls short. What’s that warm glow I feel inside? Ah yes, job security.

Definitely worth a read, and Forrest also shares the repro scripts.


Viewing Site-to-Site VPN Logs in Azure

Denny Cherry troubleshoots a site-to-site VPN issue:

Recently I needed to view the logs from an Azure Site to Site VPN to see why it wasn’t working as expected. When Azure Site to Site VPNs aren’t working as expected the GUI falls apart quickly for troubleshooting.

Log Analytics is where this problem gets solved. Log Analytics is going to allow you to see basically everything that the Azure Network Gateway is doing. Setting the feed up to Log Analytics isn’t as straightforward as it could be, but it is documented in this post.

Read on for some sample queries.


Fundamentals of Query Store

Deepthi Goguri provides an intro to Query Store:

Introduced in SQL Server 2016, the Query Store feature stores query plans, queries, and resource consumption information along with runtime statistics for your queries. This feature is not automatically enabled on your databases. It is a database-level option only: you need to enable Query Store on a database to capture its queries and query information. You cannot enable Query Store on the master or tempdb databases. The information captured by Query Store is stored in the database you enabled the feature on, in internal tables created in the PRIMARY filegroup, and this cannot be changed. (Microsoft docs reference).

One thing I would mention that Deepthi didn’t cover is, if you tried out Query Store a while ago and ran into problems, try again on the latest CUs of SQL Server. There have been a lot of bugfixes and optimizations which have made it a much more useful product.


Monotonic Constraints on Random Forests

Michael Mayer has some interesting R and Python code for us:

On ML competition platforms like Kaggle, complex and unintuitively behaving models dominate. In this respect, reality is completely different. There, the majority of models do not serve as pure prediction machines but rather as a fruitful source of information. Furthermore, even if used as a prediction machine, the users of the models might expect a certain degree of consistency when “playing” with input values.

A classic example is statistical house appraisal models. An additional bathroom or an additional square foot of ground area is expected to raise the appraisal, everything else being fixed (ceteris paribus). The user might lose trust in the model if the opposite happens.

One way to enforce such consistency is to monitor the signs of coefficients of a linear regression model. Another useful strategy is to impose monotonicity constraints on selected model effects.

Certain types of regression algorithm make this easy, but random forest? Not so much. That’s where Michael steps in.
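
To make the ceteris paribus idea concrete, here is a minimal Python sketch of a monotonicity check. It does not enforce a constraint and it is not Michael's method; it only tests whether a model's prediction ever drops as one input rises while the others stay fixed. The function name, both toy appraisal models, and all of the numbers are made up for illustration.

```python
def is_monotone_increasing(predict, base_row, feature, grid):
    """Return True if predict never decreases as `feature` moves up `grid`,
    with every other input held at its value in base_row (ceteris paribus)."""
    preds = []
    for value in sorted(grid):
        row = dict(base_row)   # copy so the caller's row is left untouched
        row[feature] = value   # vary only the feature under test
        preds.append(predict(row))
    return all(later >= earlier for earlier, later in zip(preds, preds[1:]))


# Hypothetical appraisal models; the coefficients are invented.
def linear_appraisal(row):
    return 50_000 + 120 * row["sqft"] + 8_000 * row["bathrooms"]


def wiggly_appraisal(row):
    # Mimics an unconstrained model whose estimate dips at three bathrooms.
    bump = {1: 0, 2: 9_000, 3: 7_500, 4: 20_000}[row["bathrooms"]]
    return 50_000 + 120 * row["sqft"] + bump


house = {"sqft": 1500, "bathrooms": 2}
bathroom_grid = [1, 2, 3, 4]

print(is_monotone_increasing(linear_appraisal, house, "bathrooms", bathroom_grid))
print(is_monotone_increasing(wiggly_appraisal, house, "bathrooms", bathroom_grid))
```

The first model passes and the second fails, which is exactly the "extra bathroom lowers the appraisal" surprise the quote warns about. A check like this only diagnoses the problem for one starting row; building the constraint into the model is the harder part.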


Replacing p-values with Bootstrapped Confidence Intervals

Florent Buisson has an interesting post on avoiding p-value calculations:

And indeed, I worked with highly-skilled data scientists who had a very sharp understanding of statistics. But after years of designing and analyzing experiments, I grew dissatisfied with the way we communicated results to decision-makers. I felt that the over-reliance on p-values led to sub-optimal decisions. After talking to colleagues in other companies, I realized that this was a broader problem, and I set out to write a guide to better data analysis. In this article, I’ll present one of the biggest recommendations of the book, which is to ditch p-values and use Bootstrap confidence intervals instead.

I’m a committed Bayesian (or at least a Bayesian who should be committed—depends on who you ask), so I’d consider this a big step forward.
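
For a sense of what a bootstrap confidence interval involves, here is a minimal Python sketch of the percentile variant using only the standard library. It is not taken from Florent's post, which may use a different flavor of bootstrap; the function name, its defaults, and the sample data are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, level=0.90, seed=42):
    """Percentile bootstrap confidence interval for stat(sample)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Resample with replacement, recompute the statistic each time, then sort.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_resamples)]
    upper = estimates[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical data: minutes per session for 20 users in an experiment arm.
minutes = [12.1, 9.8, 14.3, 11.0, 10.5, 13.7, 8.9, 12.8, 11.6, 10.2,
           15.1, 9.4, 12.3, 11.9, 10.8, 13.2, 9.9, 12.6, 11.3, 10.1]

low, high = bootstrap_ci(minutes)
print(f"mean = {statistics.mean(minutes):.2f}, 90% CI = ({low:.2f}, {high:.2f})")
```

The appeal for decision-makers is that the result is a range of plausible values in the original units rather than a single probability that is easy to misread.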


Using ConcatenateX in Power BI

Reza Rad describes a DAX function:

It happens often in Power BI calculations and reports that you need to concatenate a list of values from a column. You can do this concatenation in Power Query or DAX. However, if the concatenation needs to be done dynamically, ConcatenateX is a very helpful DAX function to achieve such results. It is very helpful to understand what happens in the virtual tables in DAX too. In this article and video, I’ll explain what ConcatenateX is and how it works in Power BI and DAX.

Click through for a video, as well as a detailed explanation in blog post format.
