Month: August 2020

Geometry and Geography Functions in Power BI

Chris Webb walks us through some new Power Query functionality:

In the August 2020 release of Power BI Desktop a couple of new Power Query functions were added: Geography.FromWellKnownText, Geography.ToWellKnownText, GeographyPoint.From, Geometry.FromWellKnownText, Geometry.ToWellKnownText and GeometryPoint.From. These functions (which are coming soon to Power Query in Excel too) make it easier to work with geographic and geometric data in the Well Known Text format. You can have all kinds of fun with these functions if you have a visual (like the Icon Map custom visual) that can display Well Known Text data, but I’ll leave that kind of thing for future blog posts. In this post I’ll explain the basics of how the functions actually work.

So far, it looks like it’s converting strings of latitude and longitude data (in the geography case) into individual elements for plotting, but no distance measures at this time.

Covariance and Multicollinearity

Mattan Ben-Shachar gives us an intuitive understanding of multicollinearity and how it can affect an analysis:

The common and almost default approach is to fix age to a constant. This is really what our model does in the first place: the coefficient of height represents the expected change in weight while age is fixed and not allowed to vary. What constant? A natural candidate (and indeed emmeans’ default) is the mean. In our case, the mean age is 14.9 years. So the expected values produced above are for three 14.9 year olds with different heights. But is this data plausible? If I told you I saw a person who was 120cm tall, would you also assume they were 14.9 years old?

No, you would not. And that is exactly what covariance and multicollinearity mean – that some combinations of predictors are more likely than others.

I liked the explanation Mattan provides us. Also be sure to read the warnings near the end of the post around other things to try. H/T R-bloggers
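
To make the point about implausible predictor combinations concrete, here is a minimal Python sketch. The age and height values are made up for illustration and are not Mattan's data; `pearson` is a hypothetical helper written out by hand so the example needs nothing beyond the standard library.

```python
from math import sqrt

# Made-up illustrative data (NOT from the original post): ages in years, heights in cm.
ages = [6, 8, 10, 12, 14, 16, 18]
heights = [115, 128, 138, 149, 160, 168, 172]


def pearson(xs, ys):
    """Pearson correlation coefficient, written out by hand."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(ages, heights)
mean_age = sum(ages) / len(ages)

# Ages actually observed for people close to 120 cm in this toy data set.
ages_near_120cm = [a for a, h in zip(ages, heights) if abs(h - 120) <= 10]

print(f"correlation between age and height: {r:.2f}")
print(f"mean age: {mean_age}, ages seen near 120 cm: {ages_near_120cm}")
```

In this toy data the predictors are strongly correlated, so pairing the mean age with a height of 120cm describes a person the data never contained, which is the trap the excerpt describes.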

Classification Problems and Classification Rules

John Mount warns against simply returning a class in a classification problem:

This statement is a bit of word-play which I will need to unroll a bit. However, the concrete advice is that you often get better results using models that return a continuous score for classification problems. You should make that numeric score available to downstream business logic instead of making a class choice at model prediction time. Using the word “classifier” to informally mean “scoring procedure for classes” is not that harmful. Losing a numeric score is harmful.

Read the whole thing, as John lays out a good argument.
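
As a rough illustration of that advice, the sketch below keeps the numeric score and leaves the class decision to downstream logic. The scores, the thresholds, and the `score_to_class` helper are all hypothetical and are not taken from John's post.

```python
def score_to_class(score: float, threshold: float = 0.5) -> str:
    """Turn a numeric score into a class label at decision time."""
    return "positive" if score >= threshold else "negative"


# Hypothetical model scores; a real model would produce these.
scores = [0.15, 0.48, 0.52, 0.91]

# Because the scores were kept, two consumers can apply different rules.
default_calls = [score_to_class(s) for s in scores]
strict_calls = [score_to_class(s, threshold=0.9) for s in scores]

# Ranking is only possible when the score survives.
ranked = sorted(scores, reverse=True)

print(default_calls)
print(strict_calls)
print(ranked)
```

Had the model returned only the labels in `default_calls`, neither the stricter rule nor the ranking could be recovered afterwards.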

Performance Tuning SSIS Data Flows

Mark Broadbent reviews a SQLBits talk:

Yes, before you say it, I know SQL Server Integration Services is “old technology” but a lot of people are still using it, and in many cases are either still developing against it, or are looking to integrate/migrate with other burgeoning technologies such as Azure Data Factory. In other words, if you are not currently using SSIS then this post is probably not for you; otherwise, read on.

If you are one of the lucky ones still using SSIS, I thought it would be worth publishing these comprehensive notes taken from a session titled “SSIS Data Flow Performance Tuning” delivered at SQLBits 8 (Brighton) by the then “SSIS guru” Jamie Thomson. Notes have timings (in mins and seconds) against them, which correlate directly with the presentation times. The video is still available and can be downloaded from the SQLBits website so you can watch it (if required) and use the timings to follow along.

It’s an asynchronous watch party with Mark.

Splatting in PowerShell

Mark Wilkinson describes splatting in PowerShell and shows how you can use it to handle optional parameters:

I have to start off by saying I hate the name “splatting”. I didn’t come up with it, and I don’t like using it, but it’s the only word we have. Splatting is a way to pass parameter values to a function using a single array or hashtable. In this post we’ll be talking about hashtables because I think it is the more useful of the two.

Splatting is easy to explain in an example. 

And then that’s exactly what Mark gives us. Click through for the example as well as how you can set those optional parameters.
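
For readers who do not write PowerShell, a rough analogue (not PowerShell splatting itself, and not Mark's example) is Python's dictionary unpacking with `**`. In the sketch below, `connect` and `build_params` are hypothetical functions; the optional parameter is added to the dictionary only when the caller supplies it.

```python
def connect(server, database, timeout=30, encrypt=False):
    """Hypothetical function standing in for a cmdlet with optional parameters."""
    return f"{server}/{database} timeout={timeout} encrypt={encrypt}"


def build_params(server, database, timeout=None):
    """Collect arguments in one dict, adding optional ones only when set."""
    params = {"server": server, "database": database}
    if timeout is not None:
        params["timeout"] = timeout
    return params


# The ** operator unpacks the dict into keyword arguments in a single call.
default_call = connect(**build_params("db01", "sales"))
custom_call = connect(**build_params("db01", "sales", timeout=5))

print(default_call)
print(custom_call)
```

Leaving `timeout` out of the dictionary lets the function's own default apply, which is the same idea as omitting a key from a splatted hashtable.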

Blocking Classic Workspaces in Power BI

Adam Saxton points out something new in Power BI:

The ability to BLOCK classic workspaces from being created in Power BI is finally here! Adam shows you how to implement and what to consider. Create Microsoft Teams without the worry!

Click through for a video as well as the Power BI blog post describing this. You can also tell that Adam has the heart of a DBA based on the level of excitement around blocking something. DBAs and goalies, I tell you.

Overriding SSRS Authentication

Eitan Blumin doesn’t like the SSRS authentication prompt:

In this post, I hope to summarize the various methods that we have, in order to get rid of that annoying authentication prompt. Each method has its own advantages and disadvantages in terms of complexity of implementation, versatility, and the level of security that it provides. More specifically: the more secure and versatile a method is – the more complicated it is to implement.

Read on for four such techniques, as well as a bonus technique.

Web-Optimized ggplot2 Themes

Petr Baranovskiy shares a few new themes:

This will be a very short post compared to the detailed stuff I usually write. Just what it says on the tin – I made some tweaks to my three favorite {ggplot2} themes – theme_bw(), theme_classic(), and theme_void() – to make the graphics more readable and generally look better when posted online, particularly in blog posts. Please feel free to borrow and use.

Also, I will be frequently using these themes in subsequent posts, and I’d like to be able to point readers here with a hyperlink instead of repeatedly posting the whole theme_web_…() code every time I am writing a post.

Click through for the definition of each theme. H/T R-bloggers

Methods and Functions in Python

Sairam Uppugundla distinguishes methods from functions in Python:

Functions and methods look similar because they behave in almost the same way, but the key difference is the concept of ‘Class and its Object’. A function can be called by its name alone, as it is defined independently. A method cannot be called by its name alone; we need to invoke it through a reference to the class in which it is defined. That is, the method is defined within a class and hence depends on that class.

Read on to see how to create each, as well as more details on types of functions.
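
A minimal sketch of the distinction follows. The names `rectangle_area` and `Rectangle` are hypothetical and are not taken from Sairam's post.

```python
def rectangle_area(width, height):
    """A function: defined on its own and called by name."""
    return width * height


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        """A method: defined inside a class and called through an instance."""
        return self.width * self.height


rect = Rectangle(3, 4)

function_result = rectangle_area(3, 4)  # called by name alone
method_result = rect.area()             # called through the object
explicit_result = Rectangle.area(rect)  # same method, instance passed explicitly

print(function_result, method_result, explicit_result)
```

Calling `area()` by name alone would raise a `NameError`, because the method only exists as an attribute of `Rectangle`.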
