Press "Enter" to skip to content

Killing Idle Analysis Services Sessions

Shabnam Watson shows us how to kill idle SQL Server Analysis Services sessions:

Think of this method as an emergency procedure only. As always, have database backups and try this on a development server first. Always take a backup of msmdsrv.ini before you modify any server properties. The default location of the file is this: C:\Program Files\Microsoft SQL Server\MSAS15.MSSQLSERVER\OLAP\Config

If you set the timeout values too low on a server that is under resource pressure, you may not be able to get to the server properties using SSMS and change them quickly enough within the time you set for the timeout. For this reason, I prefer the use of XMLA in this case, which makes the process faster.

Read on to see how to do this.
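
For context, the manual counterpart to the timeout approach is cancelling a session directly over XMLA. Here is a hedged sketch, not taken from Shabnam's post: the SPID is a placeholder you would first look up, for example in the $SYSTEM.DISCOVER_SESSIONS DMV.

    <!-- Find candidate sessions first, e.g.:
         SELECT SESSION_SPID, SESSION_USER_NAME, SESSION_IDLE_TIME_MS
         FROM $SYSTEM.DISCOVER_SESSIONS -->
    <Cancel xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <SPID>12345</SPID>                        <!-- placeholder session SPID -->
      <CancelAssociated>true</CancelAssociated> <!-- also cancel associated connections -->
    </Cancel>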

Getting Started with Jupyter Notebooks

Aveek Das takes us through the most popular name in notebooks:

In this article, I am going to explain what Jupyter Notebooks are and how to install them on your machine. Further, I will demonstrate how to use these notebooks with Visual Studio Code to perform data analysis and other development activities. Jupyter is an open-source platform with which you can create and share documents that contain live code, equations, and visualizations along with formatted text. Most importantly, these notebooks can be run in a web browser simply by starting a server. This open-source project is maintained by the team at Project Jupyter.

This is a fairly basic introduction to the topic, good if you have heard about notebooks but don’t know where to begin.
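
If you want a taste before reading, here is a minimal sketch: install and launch Jupyter, then run a first cell containing live code and an inline visualization. It assumes a standard Python setup with numpy and matplotlib installed.

    # Install and launch from a terminal:
    #   pip install notebook
    #   jupyter notebook
    # Then run this in a notebook cell (also works in VS Code's notebook editor):
    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(0, 2 * np.pi, 100)
    plt.plot(x, np.sin(x))    # the chart renders inline beneath the cell
    plt.title("A live, re-runnable plot")
    plt.show()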

Working with Central Management Servers in dbatools

Mikey Bronowski continues a series on dbatools:

This built-in feature of SSMS allows us to configure a group of SQL instances and run queries against multiple instances at once. With registered servers, you can also build a list of SQL Servers in one place so that everyone with access to the CMS can see them. First, we will start by creating registered servers and server groups.

This is an underrated set of functionality for SQL Server, and dbatools does a good job working with it.
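
As a flavor of what Mikey covers, here is a hedged sketch using dbatools' CMS commands; the instance, server, and group names are placeholders, and it is worth checking each command's help for the exact parameters in your dbatools version.

    # Create a server group on the Central Management Server:
    Add-DbaRegServerGroup -SqlInstance "CMS01" -Name "Production"

    # Register an instance inside that group:
    Add-DbaRegServer -SqlInstance "CMS01" -ServerName "SQL01" -Name "SQL01 (Prod)" -Group "Production"

    # List everything registered on the CMS:
    Get-DbaRegServer -SqlInstance "CMS01"

    # Run one query against every server in the group:
    Get-DbaRegServer -SqlInstance "CMS01" -Group "Production" |
        Invoke-DbaQuery -Query "SELECT @@SERVERNAME AS ServerName"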

Preparing Source Data in Power BI

Paul Turley continues a series on doing Power BI the right way:

Your options for transforming data at the source depend on the capabilities of the data source. In general, a query running closer to a data source will be more efficient than one that must cache a set of data and manipulate it downstream. If you are importing text files or Excel files, then your only choice is to import an entire file and then throw away what you don’t need – which is fine for small-ish data sets. On the right side of the previous diagram, some data morphing can be performed in DAX using calculated columns or calculated tables. In rare cases, there may be good reasons to do so, but if you can perform row-level calculations in a view or in Power Query, you are better off keeping all that query logic in the same layer of the solution.

Read on for some good advice, as well as information on query folding, filters, incremental refresh, and much more.
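
To make the "same layer" point concrete, here is a minimal Power Query sketch of a row-level calculation done in M rather than as a DAX calculated column; the file path and column names are made up for illustration.

    let
        // Import the whole file (the only option for Excel sources)...
        Source   = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"), null, true),
        Sales    = Source{[Item = "Sales", Kind = "Sheet"]}[Data],
        Promoted = Table.PromoteHeaders(Sales, [PromoteAllScalars = true]),
        // ...then keep the row-level logic here, in the query layer:
        AddLineTotal = Table.AddColumn(Promoted, "Line Total",
            each [Quantity] * [Unit Price], type number)
    in
        AddLineTotal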

Geometry and Geography Functions in Power BI

Chris Webb walks us through some new Power Query functionality:

In the August 2020 release of Power BI Desktop a couple of new Power Query functions were added: Geography.FromWellKnownText, Geography.ToWellKnownText, GeographyPoint.From, Geometry.FromWellKnownText, Geometry.ToWellKnownText and GeometryPoint.From. These functions (which are coming soon to Power Query in Excel too) make it easier to work with geographic and geometric data in the Well Known Text format. You can have all kinds of fun with these functions if you have a visual (like the Icon Map custom visual) that can display Well Known Text data, but I’ll leave that kind of thing for future blog posts. In this post I’ll explain the basics of how the functions actually work.

So far, it looks like it’s converting strings of latitude and longitude data (in the geography case) into individual elements for plotting, but no distance measures at this time.
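
Here is a hedged sketch of the round trip; the coordinates are arbitrary, and the longitude-then-latitude parameter order for GeographyPoint.From is an assumption worth verifying against Chris's post.

    let
        // Build a geography point from coordinates:
        Point = GeographyPoint.From(-0.1276, 51.5072),   // longitude, latitude
        // Serialize it to a Well Known Text string:
        WKT   = Geography.ToWellKnownText(Point),        // e.g. "POINT(-0.1276 51.5072)"
        // And parse a WKT string back into a geography value:
        Back  = Geography.FromWellKnownText(WKT)
    in
        WKT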

Covariance and Multicollinearity

Mattan Ben-Shachar gives us an intuitive understanding of multicollinearity and how it can affect an analysis:

The common and almost default approach is to fix age to a constant. This is really what our model does in the first place: the coefficient of height represents the expected change in weight while age is fixed and not allowed to vary. What constant? A natural candidate (and indeed emmeans’ default) is the mean. In our case, the mean age is 14.9 years. So the expected values produced above are for three 14.9 year olds with different heights. But is this data plausible? If I told you I saw a person who was 120cm tall, would you also assume they were 14.9 years old?

No, you would not. And that is exactly what covariance and multicollinearity mean – that some combinations of predictors are more likely than others.

I liked the explanation Mattan provides. Also be sure to read the warnings near the end of the post about other things to try. H/T R-bloggers.
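
To see the idea in code, here is a small R sketch with made-up numbers rather than Mattan's data: height and age are simulated to covary, so some of the height/age combinations the model will happily predict for are combinations the data almost never produces.

    set.seed(42)
    age    <- runif(200, 5, 25)
    height <- 80 + 5 * age + rnorm(200, sd = 8)             # height rises with age
    weight <- -30 + 0.4 * height + 1.2 * age + rnorm(200, sd = 5)
    fit    <- lm(weight ~ height + age)

    # Predict for three very different heights, all fixed at the mean age:
    predict(fit, data.frame(height = c(120, 150, 180), age = mean(age)))

    # The covariance is why a 120cm person at the mean age is implausible:
    cor(height, age)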

Classification Problems and Classification Rules

John Mount warns against simply returning a class in a classification problem:

This statement is a bit of word-play which I will need to unroll a bit. However, the concrete advice is that you often get better results using models that return a continuous score for classification problems. You should make that numeric score available to downstream business logic instead of making a class choice at model prediction time. Using the word “classifier” informally to mean “scoring procedure for classes” is not that harmful. Losing a numeric score is harmful.

Read the whole thing, as John lays out a good argument.
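
A minimal R illustration of the distinction, using simulated data rather than anything from John's post: predict(type = "response") hands downstream logic a continuous score, while the ifelse() line is the (often premature) class choice.

    set.seed(1)
    d <- data.frame(x = rnorm(200))
    d$y <- rbinom(200, 1, plogis(2 * d$x))
    model <- glm(y ~ x, data = d, family = binomial)

    scores <- predict(model, data.frame(x = c(-1, 0, 1)),
                      type = "response")   # continuous scores in [0, 1]
    scores                                 # pass these to downstream business logic...
    ifelse(scores >= 0.5, 1, 0)            # ...instead of hard-coding a cutoff here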

Performance Tuning SSIS Data Flows

Mark Broadbent reviews a SQLBits talk:

Yes, before you say it, I know SQL Server Integration Services is “old technology,” but a lot of people are still using it, and in many cases are either still developing against it or are looking to integrate/migrate with other burgeoning technologies such as Azure Data Factory. In other words, if you are not currently using SSIS then this post is probably not for you; otherwise, read on.

If you are one of the lucky ones still using SSIS, I thought it would be worth publishing these comprehensive notes, taken from a session titled “SSIS Data Flow Performance Tuning” delivered at SQLBits 8 (Brighton) by the then “SSIS guru” Jamie Thomson. The notes have timings (in minutes and seconds) against them, which correlate directly with the presentation times. The video is still available and can be downloaded from the SQLBits website, so you can watch it (if required) and use the timings to follow along.

It’s an asynchronous watch party with Mark.
