Press "Enter" to skip to content

Curated SQL Posts

Creating Power Query Custom Connectors

Eugene Meidinger shows how easy it can be to create a customer Power Query connector:

When I heard about custom data connectors for Power Query, I had assumed there would be a lot of work involved. While there is definitely quite a bit of work in implementing advanced features like query folding,  creating your very first connector is simple.

So, first you need Visual studio installed and the Power Query SDK installed as well. Once you do that, you will see Power Query as an option when creating a new project. Visual studio will also have support for .pq or Power Query files.

Click through for an example of the process in action.

Comments closed

Disable Lightweight Pooling

Randolph West explains why enabling lightweight pooling in SQL Server is almost always a bad idea:

When can I enable lightweight pooling then?
Don’t. But if you must, these are the conditions under which Microsoft suggests it may be useful:
– Large multi-processor servers are in use.
– All servers are running at or near maximum capacity.
– A lot of context switching occurs (greater than 20,000 per second).

We can measure context switching with a performance counter in Performance Monitor on Windows, so the last two items on this list can be monitored. Use the Context Switches/sec counter to track context switches.

I’m sure there were a few customers who benefited from this, but I’ve neither seen nor heard of a case where it did actually help.

Comments closed

Displaying Filters In Power BI Tooltips

Marco Russo shows how you can create a filter dump measure to show all active filters as part of your tooltip:

The Tooltips can display a string with multiple lines. This is useful for the DumpFilters measure that creates a new line for every column with a filter. You might wonder why the DumpFilters measure is required considering that Power BI can already display any filters and slicers affecting a visual. The reason is that the DumpFilters measure isolates the filters of a single cell and can show the effects of filters that are not visible in the standard visualization provided by Power BI.

This is interesting reading and a good way of sharing to users how they got to the current view of data.

Comments closed

Formatting Powershell Outputs

Jeffery Hicks shows us how we can format Powershell files through XML:

My PSScriptTools module (which you can install from the PowerShell Gallery) now includes a command called New-PSFormatXML. The command is designed to analyze an object and by default create a table view of all properties, although you can specify which properties to include. The format.ps1xml file will autosize the table but you can remove the directive and use the widths which are best guesses. Expect some trial and error when defining a new view.

Read on for a couple demonstrations.

Comments closed

Bayesian Modeling Of Hardware Failure Rates

Sean Owen shows how you can use Bayesian statistical approaches with Spark Streaming, using the example of hard drive failure rates:

This data doesn’t arrive all at once, in reality. It arrives in a stream, and so it’s natural to run these kind of queries continuously. This is simple with Apache Spark’s Structured Streaming, and proceeds almost identically.

Of course, on the first day this streaming analysis is rolled out, it starts from nothing. Even after two quarters of data here, there’s still significant uncertainty about failure rates, because failures are rare.

An organization that’s transitioning this kind of offline data science to an online streaming context probably does have plenty of historical data. This is just the kind of prior belief about failure rates that can be injected as a prior distribution on failure rates!

Bayesian approaches work really well with streaming data if you think of the streams as sampling events used to update your priors to a new posterior distribution.

Comments closed

Handling Definitional Changes In Predictive Variables

Vincent Granville explains how you can blend two different definitions of a variable of interest together:

The reasons why scores can become meaningless over time is because data evolves. New features (variables) are added that were not available before, the definition of a metric is suddenly changed (for instance, the way income is measured) resulting in new data not compatible with prior data, and faulty scores. Also, when external data is gathered across multiple sources, each source may compute it differently, resulting in incompatibilities: for instance, when comparing individual credit scores from two people that are costumers at two different banks, each bank computes base metrics (income, recency, net worth, and so on) used to build the score, in a different way. Sometimes the issue is caused by missing data, especially when users with missing data are very different from those with full data attached to them.

Click through for a description of the approach and links showing how it works in practice.

Comments closed

Access Violation Error In SQL Server 2016 SP2 CU4

Lonny Niederstadt tracked down an ugly bug in SQL Server 2016 SP2 CU4:

When I started investigating, the error was known only as an access violation, preventing some operations related to data cleansing or fact table versioning.

It occurred deep within a series of stored procedures.  The execution environment included cross-database DELETE statements, cross-database synonyms, lots of SELECT statements against system views, scalar UDFs, and lots and lots of dynamic SQL.

And… I don’t have access to the four systems where the access violation occurred.

I was able to have some investigation performed on those systems – we learned that by disabling ‘auto stats update’ for the duration of the sensitive operations, the error was avoided.  We also learned that reverting a system from SQL Server 2016 SP2 CU4 to SP2 CU2 avoided the errors.  On those systems, reverting to SP2 CU2 or temporarily disabling ‘auto stats update’ were sufficient temporary mitigations.

Very interesting sleuthing work. It also appears the issue might have been limited to SP2 CU4, as SP2 CU3 and SP2 CU5 return different results in Lonny’s repro.

Comments closed