
Curated SQL Posts

Dealing with Big Ranges in a Graph

Alex Velez shows how we can work with a particular kind of problem:

Today’s post is about a common challenge: when one data series is so large relative to the others that a single scale makes it nearly impossible to see any details. Consider the following line graph. It displays state and local revenue by transportation mode, which I created using data from the Bureau of Transportation Statistics 2018 Report.

Alex has one solution. Another idea could be to change the Y axis to log scale, especially because you’re dealing with money. That would tighten up the series and allow for more information to be displayed on the single graph.
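To make the log-scale idea concrete, here is a minimal sketch of where each series would land on a linear axis versus a log axis. The revenue figures and axis limits are made up for illustration and are not taken from the Bureau of Transportation Statistics report.

```python
import math

# Hypothetical revenue figures in dollars (illustrative only, not from the report).
revenue = {
    "Highways": 120_000_000_000,
    "Transit": 18_000_000_000,
    "Air": 2_500_000_000,
    "Water": 400_000_000,
}

# Assumed axis limits shared by both scales.
AXIS_MIN = 1e8
AXIS_MAX = 1e12


def linear_position(value, lo=AXIS_MIN, hi=AXIS_MAX):
    """Fraction of the axis height a value reaches on a linear scale."""
    return (value - lo) / (hi - lo)


def log_position(value, lo=AXIS_MIN, hi=AXIS_MAX):
    """Fraction of the axis height a value reaches on a base-10 log scale."""
    return (math.log10(value) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))


if __name__ == "__main__":
    for mode, value in revenue.items():
        print(f"{mode:10s} linear={linear_position(value):.4f} log={log_position(value):.4f}")
```

On the linear axis the smaller series sit almost on the floor of the chart, while the log axis spreads them across the plot area.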


Generating Synthetic Data with R

Sidharth Macherla uses the conjurer package in R to generate synthetic data:

If you are building data science applications and need some data to demonstrate the prototype to a potential client, you will most likely need synthetic data. In this article, we discuss the steps to generating synthetic data using the R package ‘conjurer’. 

One of the toughest problems of generating data is making it look realistic enough. It’s one level of difficulty to build “steady-state” data, but if you want data to follow a combination of trend and random walk…that’s when things get dicey. H/T R-Bloggers
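As a rough sketch of what "trend plus random walk" means, the snippet below builds a series from a linear trend and a cumulative sum of Gaussian steps. This is not how the conjurer package works; the function name, parameters, and defaults are all assumptions chosen for illustration.

```python
import random


def synthetic_series(n, start=100.0, trend=0.5, step_sd=2.0, seed=42):
    """Return n points that follow a linear trend plus a random walk.

    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)  # seeded so the output is repeatable
    walk = 0.0
    series = []
    for t in range(n):
        walk += rng.gauss(0.0, step_sd)  # the random walk accumulates each step
        series.append(start + trend * t + walk)
    return series


if __name__ == "__main__":
    for value in synthetic_series(10):
        print(round(value, 2))
```

Setting `step_sd` to zero leaves only the trend, which is the "steady-state" case; raising it lets the walk wander further from the trend line.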


Find Installed ODBC Drivers with PowerShell

Jack Vamvas answers a question:

Question: I had a question from a developer who was troubleshooting an application to SQL Server connection via a DSN using ODBC. They wanted the SQL Server ODBC Driver names and platform (32 bit|64 bit) used to connect. They will use this information to check application compatibility.

Although I can just RDP onto the server and grab the information through the ODBC GUI – how can I use PowerShell to get these ODBC details?

Click through for the answer.


Building a Dual-Axis Line Chart in Power BI

Matt Allington shows how you can build a dual-axis line chart in Power BI:

Unfortunately, Power BI does not support a dual axis line chart as a standard visual at this time. The good news, however, is there is a custom visual called “Multiple Axes chart by xViz” that can do this in Power BI. This visual has been around for a while, but there have been some formatting issues (in my view) that prevented it from being a solution to this problem – that is now fixed. I will demonstrate how to set up a dual axis chart using the Adventure Works database and this visual.

Honestly, I’m pretty happy that Power BI does not support a dual-axis line chart. It is the cause of so many instances of spurious correlation that I’d err on the side of not including multiple axes.


Finding Events in a Trace

Erin Stellato is on a mission:

Yes…I’m writing a post about SQL Trace…specifically events in a trace. There are still lots of folks that use Profiler and run server-side traces, and when I work with those customers I like to understand how the traces are configured. I want to see what events are in a trace, but I also want to see the filters and if the traces are writing to a file or rowset provider. I wrote this query ages ago and went looking for it today, which means I really should write a post so it’s easier to find.

Click through to find out how you can determine which events are included in a particular SQL trace. That way, you can convert them to extended events sessions more easily…


Wait Stats: Necessary but not Sufficient

Greg Gonzalez explains how wait stats are not the only thing you should look at to determine system health:

Waits and Queues has been used as a SQL Server performance tuning methodology since Tom Davidson published the above article as well as the well-known SQL Server 2005 Waits and Queues whitepaper in 2006. When used in combination with resource metrics, waits can be valuable for assessing certain performance characteristics of the workload and aid in steering tuning efforts. Waits data is surfaced by many SQL Server performance monitoring solutions, and I’ve been an advocate of tuning using this methodology since the beginning. The approach was influential in the design of the SQL Sentry performance dashboard, which presents waits flanked by queues (key resource metrics) to deliver a comprehensive view of server performance.

However, some seem to have missed Davidson’s point regarding the importance of resources and rely almost entirely on waits to present a picture of query performance and system health. Wait stats come directly from the SQL Server engine and are easy to consume and categorize. Waiting queries mean waiting applications and users, and no one likes to wait! From a marketing standpoint this is pure gold for a SQL Server monitoring tools vendor – it is easier to evangelize waits analysis as a singular solution for making queries and applications faster than the full story, which is more involved.

Unfortunately, a waits-focused approach to the exclusion of resource analysis can mislead users, and worst-case leave them flying blind. SentryOne team members Kevin Kline and Steve Wright have previously touched on this here and here. In this post I’m going to take a deeper dive into some recent research made possible by Query Store that has shed new light on how deficient waits-focused tuning can truly be.

Interesting research and Greg does a great job of explaining it.


Building Custom Sort Orders in Power BI

Reza Rad shows us how to perform custom sorting in Power BI:

I have previously written about how to sort a column by another column, and I used Month Names as an example. However, still, many are unaware that the same technique with slight modifications can be applied to any other columns. You can have a text column in your slicer (product category for example), and sort it based on a different order than the normal alphabetical order. In this post, I am going to show you how to do a custom sort order for a column in Power BI.

There are a few important considerations, so check out Reza’s post.
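The underlying idea, sorting one column by the values of another, is not specific to Power BI. Here is a minimal sketch of it outside the tool; the category names and their order numbers are hypothetical, and this is not Reza’s implementation.

```python
# Hypothetical "sort order" column: each category maps to its display position.
sort_order = {
    "Bikes": 1,
    "Components": 2,
    "Clothing": 3,
    "Accessories": 4,
}


def custom_sort(values, order):
    """Sort values by their assigned position instead of alphabetically.

    Values missing from the order mapping go last, alphabetically among themselves.
    """
    fallback = len(order) + 1
    return sorted(values, key=lambda v: (order.get(v, fallback), v))


if __name__ == "__main__":
    categories = ["Accessories", "Bikes", "Clothing", "Components"]
    print(custom_sort(categories, sort_order))
```

One design point carries over from the sketch: each text value needs exactly one sort position, otherwise the order is ambiguous.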


Creating a UI in PowerShell

Michael Berthold walks us through a useful example of using POSHGUI’s UI editor:

Some time back, a customer and I were working with the SentryOne PowerShell Module. Our PowerShell Module lets you manage the targets you are monitoring with SentryOne using a script or command line rather than the UI. This is a great time saver when you’re administering performance monitoring for hundreds or thousands of database servers.

The customer and I worked together to type up the commands they wanted for their script. They mentioned how it would be great if there were a GUI for this. This seemed odd initially, because the reason we were doing this in the first place was to automate these actions outside of a GUI. We spoke on it for a bit, and their meaning became clear. They envisioned a simple GUI used to guide in defining the commands for the PowerShell Module. I agreed that this would be helpful in getting a head start on scripting automation. I decided to find a way to fill this need.

This post explores one way to create a GUI using PowerShell. I’m using the SentryOne PowerShell Module for this example, but this method can be used for any PowerShell script.

Click through to see the example.


Concepts in Support Vector Machines

Abhijit Telang takes us through the calculations involved in Support Vector Machines and then gives us an example in R:

So, let’s take that out and we are back to old, classical vector algebra. It’s like a person with a bunch of sticks to figure out which one to lay where in a 2-D plane to separate one class of objects from another, provided class definitions are already known. 

The problem is which particular shape and length must be chosen to show maximum contrast between classes.

We need to arrive at a function definition, in such a way that the value a given function takes changes drastically (e.g. from a large positive value to a large negative value).

SVM is often great for two-class classification problems, and different variants also work well for multi-class problems.
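To illustrate the "function whose value changes sign across the boundary" idea, here is a small sketch of a linear decision function of the standard form f(x) = w·x + b. The weights and bias are hand-picked assumptions rather than the result of training, and the code is not taken from Abhijit’s post.

```python
import math

# Hypothetical separating line in a 2-D plane: x1 - x2 = 0.
w = (1.0, -1.0)
b = 0.0


def decision(x, w, b):
    """Signed score: positive on one side of the line, negative on the other."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(x, w, b):
    """Assign class +1 or -1 from the sign of the decision function."""
    return 1 if decision(x, w, b) >= 0 else -1


def margin_width(w):
    """Width of the margin between the lines f(x) = +1 and f(x) = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


if __name__ == "__main__":
    for point in [(3.0, 1.0), (1.0, 3.0)]:
        print(point, decision(point, w, b), classify(point, w, b))
    print("margin width:", margin_width(w))
```

Training an SVM amounts to choosing `w` and `b` so that this margin is as wide as possible while the classes stay on their own sides.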


Log Aggregation with Apache Flink

Gyula Fora and Matyas Orhidi have started a series on log aggregation with Apache Flink:

There are several off-the-shelf solutions available on the market for log aggregation, which come with their own stack of components and operational difficulties. For example, notable logging frameworks that are widely used in the industry are ELK stack and Graylog. 

Unfortunately, there is no clear cut solution that works for every application, and different logging solutions might be more suitable for certain use cases. The log processing of real-time applications should for instance also happen in real-time, otherwise, we lose timely information that may be required to successfully operate the system.

In this blog post, we dive deep into logging for real-time applications.

This post is mostly understanding and setup, but it leads into processing and visualization.
