2018-10-16 – Curated SQL

Now we can run a single pipeline that combines data processing steps and ggplot plot construction.

data.frame(x = 1:20) %.>%
  mutate(., y = cos(3*x)) %.>%
  ggplot(., aes(x = x,  y = y)) %.>%
  geom_point() %.>%
  geom_line() %.>%
  ggtitle("piped ggplot2")

Check it out.

Comments closed

Capturing Implicit Conversions With Extended Events

Published 2018-10-16 by Kevin Feasel

Grant Fritchey shows how easy it is to build an extended event which captures implicit conversions:

Built right into the Extended Events is an event that captures conversions that would affect execution plans, plan_affecting_convert. This event will show both CONVERT and CONVERT_IMPLICIT warnings that you would normally only see within an execution plan. You can capture this event with others. Capturing events together along with causality tracking makes it very easy to track queries that have the issue. Here’s one example of how you might capture implicit conversions along with the query:

This kind of event can make system-level performance tuning much easier.

Comments closed

Using R To Hit Azure ML From Power BI

Published 2018-10-16 by Kevin Feasel

Leila Etaati shows how you can use R to hit an Azure ML endpoint to populate a data set in Power BI:

You need to create a model in Azure ML Studio and create a web service for it.

The traditional example in Predict a passenger on Titanic ship is going to survived or not?

we have a dataset about passengers like their age, gender, and passenger class, then we are going to predict whether they are going to survive or not

Open Azure ML Studio and follow the steps to create a model for predicting this. Navigate to Azure ML Studio.

Then download the dataset for titanic from here

Click through for the step-by-step instructions.

Comments closed

Transforming To Proper Case In Power BI, With Exceptions

Published 2018-10-16 by Kevin Feasel

Imke Feldmann wants to transform text into proper case, but certain words should have their own casing style:

The overall strategy is to convert everything to proper case first and then use a translation table to convert the keywords from the table back to their desired values. The following steps show how to do it:

It’s an interesting approach.

Comments closed

Power BI Data Profiling

Published 2018-10-16 by Kevin Feasel

Matt Allington takes a look at a new feature in Power BI:

The data profiling tools look at the first 1,000 rows in the preview data loaded an shows you the big picture of what the data “looks” like.

Currently the profiling tool only works on the top 1000 rows of data. It also takes some time to prepare the profile of the columns (as could be expected), however the benefits of getting this stuff right before moving on far outweigh the slower load times (IMO). I would love to see an option to profile the entire set of data for one or more columns. I am sure this will come.

Teo Lachev shares some thoughts on what it would take to make this a killer feature:

That’s all data profiling you get for now. Here is what it will take to make Power BI data profiling a killer feature:

Allow data profiling over all the values (understandably there will be performance impact).
Add more aggregates, such as Min/Max/Std/Median.
The ability to dynamically filter the preview data for the selected bar in the profile.

As it is, there’s enough here to see the potential of where it could go.

Comments closed

Testing TDE Performance

Published 2018-10-16 by Kevin Feasel

Eduardo Pivaral tests the performance of a database with Transparent Data Encryption versus that same database without encryption:

Transparent data encryption (TDE) helps you to secure your data at rest, this means the data files and related backups are encrypted, securing your data in case your media is stolen.
This technology works by implementing real-time I/O encryption and decryption, so this implementation is transparent for your applications and users.

However, this type of implementation could lead to some performance degradation since more resources must be allocated in order to perform the encrypt/decrypt operations.

On this post we will compare how much longer take some of the most common DB operations, so in case you are planning to implement it on your database, you can have an idea on what to expect from different operations.

These results fit in reasonably well with what I’d heard, but it’s nice to have someone run the numbers.

Comments closed

Using Containers To Build A Home Lab

Published 2018-10-16 by Kevin Feasel

Dmitri Korotkevitch walks us through creating a home lab with Docker containers:

Obviously, in the real life, we do not work with vanilla SQL Server installation. We need to customize it by changing SQL Server settings and logins, creating and/or restoring the databases and do other actions. There are a couple of ways how you can do that.

The first approach is customizing existing container manually and creating the image from it using docker container commit command. After that, you can start the new containers from created image the same way as we already discussed. We will cover a couple ways to move data to and from containers later.

There is the better way, however. You can automate this process by utilizing docker build command. The process is very simple. You just need to define DockerFile, which contains the reference to the main image and specifies the build actions. You can copy scripts and database backups into the image, run SQLCMD, BCP and PowerShell scripts there – you, pretty much, have the full control. Internally, Docker runs every command inside deployment containers (creating and destroying them during the process) saving the final one as the target image.

Read the whole thing.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

Day: October 16, 2018

Using wrapr For A Consistent Pipe With ggplot2

Capturing Implicit Conversions With Extended Events

Using R To Hit Azure ML From Power BI

Transforming To Proper Case In Power BI, With Exceptions

Power BI Data Profiling

Testing TDE Performance

Using Containers To Build A Home Lab