Press "Enter" to skip to content

Author: Kevin Feasel

SQL Assessment API Public Preview 2

Ebru Ersan announces public preview 2 of the SQL Assessment API:

SQL Assessment API is a new mechanism to evaluate the configuration of your SQL Server for best practices. The API methods are exposed through a SQL Server Management Objects (SMO) extension and new cmdlets in the SqlServer PowerShell module. The API is delivered with a ruleset that is highly customizable and extensible. It can be used to assess SQL Server versions 2012 and higher, on both Windows and Linux.

Read on for instructions on how to install and what has changed since public preview 1.


Reviewing Analysis Services DMVs in SSMS

Brett Powell has a project to show Analysis Services DMVs in SQL Server Management Studio:

Just like with registered servers in SSMS, if you’re a BI administrator or developer that regularly needs to access and analyze Analysis Services models, whether that’s Power BI Premium, Azure AS, or SSAS, then quick access to the available DMVs aids your productivity.

For example, you may want to quickly retrieve the DAX measures in a model including their expressions, the columns of a table, the largest columns in terms of memory consumption or cardinality, the row-level security roles and role memberships, etc. Obviously it wouldn’t be efficient to open the Power BI Desktop file or the Visual Studio solution to obtain this information, and a tool like the Tabular Model Schema Reference v2.0 might be more than you need for short-term and light ad hoc scenarios.

Click through for a link to Brett’s project as well as instructions on how to use it and a quick demo.
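
Brett’s project surfaces these DMVs inside SSMS, but the same queries run from any ADOMD-capable client. As a rough illustration (not part of Brett’s project), here’s a minimal Python sketch using the third-party pyadomd package to pull measure names and DAX expressions from the TMSCHEMA_MEASURES DMV; the data source and catalog are placeholders:

```python
# A minimal sketch of querying an Analysis Services DMV from Python.
# Assumes the third-party pyadomd package (a wrapper around the
# ADOMD.NET client) and its client libraries are installed; adjust the
# connection string for your Power BI Premium, Azure AS, or SSAS endpoint.
from pyadomd import Pyadomd

conn_str = "Data Source=localhost;Catalog=AdventureWorks"  # placeholder

# TMSCHEMA_MEASURES returns each measure's name and DAX expression
# for tabular models at compatibility level 1200 or higher.
query = "SELECT [Name], [Expression] FROM $SYSTEM.TMSCHEMA_MEASURES"

with Pyadomd(conn_str) as conn:
    with conn.cursor().execute(query) as cur:
        for name, expression in cur.fetchall():
            print(f"{name}: {expression}")
```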


SQL Server Monitoring with Grafana and Telegraf

Denzil Ribeiro shows how you can use Telegraf and Grafana to monitor Azure SQL Database databases:

SQL DB Storage
This is the dashboard for monitoring file size and space used, IO latency, and IO throughput, for each file in the database. When using Standard or General Purpose databases, which use data files in Azure blob storage, storage performance depends on several blob properties, exposed via the FILEPROPERTYEX() function.

Aside from a couple Azure SQL DB-specific steps, it’s basically the same process that Tracy Boggiano has for monitoring on-prem and IaaS SQL Server instances.
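
If you want to poke at the underlying numbers outside of Telegraf, the per-file I/O figures come from a standard DMV. Here is a hypothetical Python sketch with pyodbc (server, database, and credentials are placeholders) that samples the same kind of per-file latency and throughput counters a dashboard like this displays:

```python
# Illustrative only: sample per-file I/O stats from an Azure SQL Database,
# similar in spirit to the counters a Telegraf-fed dashboard charts.
# Connection details are placeholders; requires the pyodbc package and
# an ODBC Driver for SQL Server.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=yourserver.database.windows.net;"
    "DATABASE=yourdb;UID=youruser;PWD=yourpassword"
)

# Cumulative reads/writes, bytes, and stall (latency) per database file.
query = """
SELECT f.name AS file_name,
       vfs.num_of_reads, vfs.num_of_writes,
       vfs.num_of_bytes_read, vfs.num_of_bytes_written,
       vfs.io_stall_read_ms, vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
JOIN sys.database_files AS f ON vfs.file_id = f.file_id;
"""

for row in conn.cursor().execute(query):
    print(row.file_name, row.num_of_reads, row.io_stall_read_ms)
```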


Predicting Application Problems from the Database

Ed Pollack has a pattern for rooting out application problems based on database activity:

We can approach I/O file stats very similarly to how we handled row counts above: regularly collect data, store it in a reporting table, and then run analytics against it as needed. Since these database metrics are reset when SQL Server services restart, we need to collect a bit more often. We’ll also want to collect often enough to be able to correlate changes to ongoing application activity. Hourly is typically an acceptable collection frequency, but your environment may lend itself to more or less frequent collection.

What’s nice is that you can get a long way with heuristics and domain knowledge, even before applying data science techniques.
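
Ed’s collect-and-store pattern is easy to sketch. Here is a hypothetical Python example using pyodbc, where the reporting table dbo.FileStatsSnapshot and the connection details are made up for illustration:

```python
# Illustrative sketch of the collect-and-store pattern: snapshot
# sys.dm_io_virtual_file_stats into a reporting table on a schedule
# (e.g., hourly). Table and connection details are hypothetical.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=yourserver;DATABASE=DBAReporting;Trusted_Connection=yes"
)
cur = conn.cursor()

# Persist a timestamped snapshot; because the counters reset on service
# restart, downstream analysis should diff consecutive snapshots and
# treat a negative delta as evidence of a restart.
cur.execute("""
INSERT INTO dbo.FileStatsSnapshot
    (capture_time, database_id, file_id,
     num_of_reads, num_of_writes, io_stall_read_ms, io_stall_write_ms)
SELECT SYSUTCDATETIME(), database_id, file_id,
       num_of_reads, num_of_writes, io_stall_read_ms, io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL);
""")
conn.commit()
```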


Power BI Performance Tuning

Eugene Meidinger gives us a detailed guide to Power BI performance tuning:

As a report developer, it can be frustrating to know that something is slow but not be able to put your finger on it. In my mind, there are 4 main areas where there might be a slowdown:

1. Data refresh
2. Model calculations
3. Visualization rendering
4. Everything else

Identifying which one of these is the problem is the first step to improving performance. In most cases, if a report is slow it’s an issue with step 2, your data model.

Eugene has plenty of good advice here.


ADLS Gen2 Navigation in Power Query

Chris Webb shows off hierarchical navigation in Power Query against Azure Data Lake Storage Gen2:

While the documentation on how to import data from Azure Data Lake Gen2 Storage into Power BI is pretty detailed, the connector (which at the time of writing is in beta) that supports this functionality in the Power Query engine has some useful functionality that isn’t so obvious. If you look at the built-in documentation on the AzureStorage.DataLake M function in the Power Query Editor you’ll see there are a lot of options that aren’t in the documentation on the web yet:

Click through for an example.
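
The hierarchical navigation Chris describes is a feature of the Power Query connector, but if you want to walk the same folder structure programmatically, the Azure SDK for Python offers an analogous non-recursive listing. This is not the M function, just a rough equivalent; the account, key, and file system names below are placeholders:

```python
# Not the Power Query connector: an analogous hierarchical listing of an
# ADLS Gen2 file system using the azure-storage-file-datalake package.
# Account URL, credential, and file system name are placeholders.
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://youraccount.dfs.core.windows.net",
    credential="your-account-key",
)
fs = service.get_file_system_client("yourfilesystem")

# recursive=False returns only the immediate children of the path,
# mirroring folder-by-folder navigation rather than a flat listing.
for item in fs.get_paths(recursive=False):
    kind = "dir " if item.is_directory else "file"
    print(kind, item.name)
```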


Fun with Residual Plots

Nina Zumel explains why, when plotting residuals, you always put predictions on the X axis and residuals on the Y axis:

One reason that the proper residual graph (for a well fit model) should smooth out to the line y=0 is known as reversion to mediocrity, or regression to the mean.

Imagine that you have an ideal process that always produces a single value y. You don’t actually observe this “true value”; instead, what you observe is y plus (IID, zero mean) noise. You can build a “model” for this process that predicts the mean of the observations, in this case the value 0.1033149. Then you can calculate the residuals of your “model” in the usual way.

This post went in a direction I wasn’t expecting, and it was all the better for it.
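
To see Nina’s point numerically, a small simulation helps: with a mean-only “model,” the residuals are perfectly correlated with the observed values by construction, which is exactly the spurious trend you see when you plot residuals against actuals instead of predictions. A quick illustrative sketch (not from Nina’s post):

```python
# Simulate the ideal process Nina describes: a constant true value plus
# IID, zero-mean noise, "modeled" by predicting the mean of the observations.
import numpy as np

rng = np.random.default_rng(42)
true_value = 0.1
observed = true_value + rng.normal(0.0, 0.5, size=1000)

prediction = observed.mean()      # the mean-only "model"
residuals = observed - prediction

# Residuals vs. observed values show a strong (spurious) trend: every
# observation above the mean has a positive residual by construction.
print(np.corrcoef(observed, residuals)[0, 1])   # 1.0, exactly

# Plotted against the predictions (constant here, or the fitted values in
# a real model), the residuals scatter around the line y = 0 with no trend.
```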


Topic Modeling

Federico Pascual has an article on topic modeling and topic classification:

Topic modeling is an unsupervised machine learning technique that’s capable of scanning a set of documents, detecting word and phrase patterns within them, and automatically clustering word groups and similar expressions that best characterize a set of documents. It’s known as ‘unsupervised’ machine learning because it doesn’t require a predefined list of tags or training data that’s been previously classified by humans.

Since topic modeling doesn’t require training, it’s a quick and easy way to start analyzing your data. However, you can’t guarantee you’ll receive accurate results, which is why many businesses opt to invest time training a topic classification model.

The article is long but worth the read, with examples in Python and additional notes for R.
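
As a taste of how little code unsupervised topic modeling requires, here is a minimal scikit-learn LDA sketch over a toy corpus; it is purely illustrative and much simpler than the article’s examples:

```python
# Minimal unsupervised topic-modeling sketch with scikit-learn's LDA:
# no labels or training data, just raw documents in, word clusters out.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the database server crashed and lost the transaction log",
    "backup and restore the database to the new server",
    "the quarterly sales report shows revenue growth",
    "marketing spend drove sales and revenue this quarter",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(dtm)

# Show the top words that characterize each discovered topic.
words = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [words[j] for j in topic.argsort()[-5:][::-1]]
    print(f"topic {i}: {', '.join(top)}")
```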


Using the Power BI API

Jeff Pries takes us through the Power BI API with PowerShell:

The Power BI API is a way of essentially bypassing the web interface of powerbi.com and asking questions directly to the back-end of the service (without bypassing security). Using this allows you to issue a command, such as “Give me a list of all of the workspaces” and receive just the result data in a bulk data format. While outside the scope of this exercise, the API also allows reading actual business data from Power BI assets and even manipulating assets.

For my first attempt at interacting with the API, I decided to try the Microsoft-developed PowerShell cmdlets. These are a set of free PowerShell commands which can be installed on your workstation and will then query the Power BI API and return results in a standard way, without the need to write any code (C# or otherwise) or understand how to read a JSON response.

Jeff does more than just hit API endpoints, though, so read the whole thing.
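
For comparison, the “list all workspaces” question Jeff asks via the cmdlets maps to a single documented REST endpoint. Here is a bare-bones Python sketch; acquiring the Azure AD access token is a prerequisite and is elided here:

```python
# Ask the Power BI REST API for all workspaces (roughly what the
# Get-PowerBIWorkspace cmdlet does under the hood). Token acquisition
# via Azure AD is assumed to have happened already.
import requests

access_token = "<your-azure-ad-access-token>"  # placeholder

resp = requests.get(
    "https://api.powerbi.com/v1.0/myorg/groups",  # workspaces endpoint
    headers={"Authorization": f"Bearer {access_token}"},
)
resp.raise_for_status()

# The service returns JSON; each entry is a workspace with an id and name.
for workspace in resp.json()["value"]:
    print(workspace["id"], workspace["name"])
```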


Hammering Azure SQL DB

John Morehouse shows how we can configure HammerDB to run against Azure SQL Database:

Benchmarking your environment is an important step when introducing new hardware, and it is accomplished by running a test workload against that hardware. There are multiple ways to accomplish this and get SQL Server performance data. One of these methods is HammerDB, a free tool that provides TPC-standard benchmarking metrics for multiple database systems, including Microsoft SQL Server. These metrics are an industry standard and are defined by the Transaction Processing Performance Council (TPC). The results from benchmarking will help you ensure that the new infrastructure will be able to support the expected workload.

Azure introduces ways to quickly implement new hardware. However, if the Azure environment isn’t set up correctly, you can introduce issues that could potentially degrade performance. Thankfully, HammerDB is Azure-aware, which allows you to easily benchmark your cloud environment.

HammerDB isn’t as good as having your own perfectly created set of scripts that exactly replicates your environment, but I’ve never seen an environment with one of those.
