
Curated SQL Posts

Connecting To Power BI Report Server Using SSMS

Koen Verbeeck shows how to connect to a Power BI Report Server instance using SQL Server Management Studio:

Sometimes you want to connect to a report server instance using Management Studio, for example to create a new security role or modify an existing one. Recently I tried to log into our newly installed Power BI Report Server (March 2018 edition). I was greeted with the following error:

The Reporting Services instance could not be found.

Read on to see how to solve this problem.

Comments closed

Reading AWS Aurora Error Logs With PowerShell

Michael Bourgon has a PowerShell script which reads error logs from AWS Aurora:

Been working on monitoring. For some reason, when you tell Aurora to send error logs to CloudWatch, all it sends are the Audit Logs, which will tell you that code had changed, etc., but doesn’t (!?!?!??!!!) actually put your logs in CloudWatch. I don’t understand it, so I built this process to look through logs and return the data. The next step would be to format it and either upload to CloudWatch manually, or log it, or send email.

Click through for the script.


Simple Parameterization Isn’t

Erik Darling ran into an interesting case with simple parameterization:

The query plan shows us that we got a Trivial Plan, and that Simple Parameterization was attempted (as shown by the 100000 literal turning into the @1 parameter.)

Simple Parameterization is only attempted in a Trivial Plan.

The key word here is attempted.

And Grant Fritchey has more:

Normally, when you spot the change to your query string, go into the properties, and see both (and it has to be both) a Parameter Compiled Value and a Parameter Runtime Value, you’ve got simple parameterization going on.

Or do you?

Notice the final property on the sheet, StatementParameterizationType. Honestly, I never really paid attention to that property. I knew what kind of parameterization I was seeing. I’m not running Forced Parameterization. This isn’t a parameterized query. It’s Simple Parameterization. Of course it is. All the keys are there. Change to the code. Parameter List values. Done.

Narrator voice:  it wasn’t done.


Exposing Azure Data Lake Store Data With Power BI

Melissa Coates shows how you can use Power BI to access data in Azure Data Lake Store:

What can you query from ADLS?

You can connect to the data stored in Azure Data Lake Store. What you *cannot* connect to currently is the data stored in the Catalog tables/views/stored procedures within Azure Data Lake Analytics (hopefully connectivity to the ADLA Catalog objects from tools other than U-SQL is available soon).

You’re not sending a U-SQL query here. Rather, we’re sending a web API request to an endpoint.

With an ADLS data source, you have to import the data into Power BI Desktop. There is no option for DirectQuery.

In other words, this is for data that you’ve already prepped using U-SQL and want to display to the outside world. Click through for a demonstration as well as additional helpful information.


Installing R Tools for Visual Studio 2017

Tech Junkie Blog shows how to install R tooling within Visual Studio 2017:

In this post we are going to go over the steps to install R Tools For Visual Studio 2017.  RStudio has a development environment that is bare bones for the free version.  Visual Studio 2017 offers a more robust development environment if you download the R Tools feature.

Here are the steps to install R Tools for Visual Studio:

R Tools for Visual Studio 2015 is still a separate download.


Single-Node PySpark

Gengliang Weng, et al., explain that even a single Spark node can be useful:

It’s been a few years since Intel was able to push CPU clock rate higher. Rather than making a single core more powerful with higher frequency, the latest chips are scaling in terms of core count. Hence, it is not uncommon for laptops or workstations to have 16 cores, and servers to have 64 or even 128 cores. In this respect, these multi-core single-node machines resemble a distributed system more than a traditional single-core machine.

We often hear that distributed systems are slower than single-node systems when data fits in a single machine’s memory. By comparing memory usage and performance between Spark and Pandas using common SQL queries, we observed that is not always the case. We used three common SQL queries to show single-node comparison of Spark and Pandas:

Query 1. SELECT max(ss_list_price) FROM store_sales

Query 2. SELECT count(distinct ss_customer_sk) FROM store_sales

Query 3. SELECT sum(ss_net_profit) FROM store_sales GROUP BY ss_store_sk

To demonstrate the above, we measure the maximum data size (both Parquet and CSV) Pandas can load on a single node with 244 GB of memory, and compare the performance of three queries.
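To make the three queries concrete, here is a minimal sketch that runs them against a tiny in-memory table. It uses Python's built-in sqlite3 module rather than Spark or Pandas, and the rows in `ROWS` are made up for illustration; only the table and column names come from the quoted queries.

```python
import sqlite3

# Made-up sample rows: (ss_store_sk, ss_customer_sk, ss_list_price, ss_net_profit).
# These values are illustrative only and are not from the benchmark data.
ROWS = [
    (1, 101, 10.0, 2.0),
    (1, 102, 25.5, -1.0),
    (2, 101, 7.25, 3.5),
    (2, 103, 40.0, 4.0),
]

# The three queries exactly as quoted in the excerpt.
QUERIES = {
    "q1": "SELECT max(ss_list_price) FROM store_sales",
    "q2": "SELECT count(distinct ss_customer_sk) FROM store_sales",
    "q3": "SELECT sum(ss_net_profit) FROM store_sales GROUP BY ss_store_sk",
}


def run_queries(rows):
    """Load rows into an in-memory SQLite table and run each query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE store_sales ("
        "ss_store_sk INTEGER, ss_customer_sk INTEGER, "
        "ss_list_price REAL, ss_net_profit REAL)"
    )
    conn.executemany("INSERT INTO store_sales VALUES (?, ?, ?, ?)", rows)
    results = {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
    conn.close()
    return results


results = run_queries(ROWS)
for name, rows in results.items():
    print(name, rows)
```

Query 1 is a single aggregate over one column, Query 2 needs to track distinct values, and Query 3 produces one sum per store, so the three exercise progressively more bookkeeping in whichever engine runs them.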

Click through for the results.


Relationships Between Numerical Features

Stacia Varga continues her exploratory data analysis series using hockey data:

Let’s start with something easy and understandable to analyze. I put age on the horizontal axis and weight on the vertical axis. It’s a common practice to put an explanatory variable on the horizontal axis and a response variable on the vertical axis. In other words, I’m looking to see how an increase in age (explanation) affects – or not – weight (response) for all the hockey players in the current season, regardless of team.

If I put age on the horizontal axis – does this explain weight? Sort of – the combinations of age and weight have some groupings. It almost appears that there is a greater number of younger, heavier players than older, heavier players, but it’s hard to tell here how the age/weight combinations are distributed because I can’t see all the individual points.
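The overplotting problem described above, where several players share the same age and weight and collapse into one dot, can be checked by counting how often each combination occurs. This is a minimal Python sketch with made-up `(age, weight)` pairs; it is not the article's method, and the `players` list is hypothetical.

```python
from collections import Counter

# Hypothetical (age, weight in pounds) pairs; not taken from the hockey data set.
players = [
    (22, 210), (22, 210), (22, 210),
    (25, 195), (25, 195),
    (31, 210),
    (34, 188),
]


def overlap_counts(points):
    """Return how many players share each exact (age, weight) combination."""
    return Counter(points)


counts = overlap_counts(players)

# Combinations drawn as a single dot on a plain scatter plot
# even though they represent more than one player.
hidden = {pair: n for pair, n in counts.items() if n > 1}

for pair, n in sorted(hidden.items()):
    print(pair, n)
```

Counts like these could drive marker size or transparency so that dense combinations stand out instead of hiding behind one point.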

Read the whole thing, while keeping in mind that correlation does not imply causation.


Alerting On tempdb Growth

Lori Brown shows how to use a SQL Agent alert to warn you if tempdb grows beyond a certain size:

Lastly, create a SQL Alert to notify you as soon as tempdb grows past the threshold you stipulate. Using the GUI to create the alert, you need to fill out every field on the General page and make sure the Enabled checkbox is marked. Create a Name for the alert, then specify the Type as SQL Server performance condition alert. The Object should be Databases, the Counter is Data File(s) Size (KB), and the Instance will be tempdb. The alert will trigger if the counter rises above the value. The Value will depend upon the cumulative size of your tempdb files. In this case each tempdb file is 12 GB (or 12,288,000 KB), so the total size is 98,304,000 KB.
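The threshold arithmetic can be sketched in a few lines of Python. Two assumptions here are mine, not the article's: the file count of eight is inferred from dividing the quoted total by the quoted per-file figure, and the conversion uses binary units (1 GB = 1,048,576 KB). Under binary units 12 GB comes to 12,582,912 KB, slightly more than the quoted 12,288,000 KB, which matches 12,288 MB multiplied by 1,000 rather than 1,024.

```python
KB_PER_GB = 1024 * 1024  # binary units: 1 GB = 1,048,576 KB


def tempdb_threshold_kb(file_size_gb, file_count):
    """Cumulative size in KB of file_count tempdb files of file_size_gb each."""
    return file_size_gb * KB_PER_GB * file_count


# Exact binary conversion for one 12 GB file and for eight of them.
# The count of eight is an inference from the quoted totals.
per_file_kb = tempdb_threshold_kb(12, 1)
total_kb = tempdb_threshold_kb(12, 8)

# The figure quoted in the excerpt: 12,288 MB multiplied by 1,000.
quoted_per_file_kb = 12 * 1024 * 1000
quoted_total_kb = quoted_per_file_kb * 8

print(f"exact:  {per_file_kb:,} KB per file, {total_kb:,} KB total")
print(f"quoted: {quoted_per_file_kb:,} KB per file, {quoted_total_kb:,} KB total")
```

The quoted total is a little lower than the exact one, so an alert set to it would fire slightly before the files reach their full configured size.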

I liked the approach of only firing the SQL Agent job after the alert condition was met, rather than running a job which queries and then creates an e-mail afterward.


Default Displayed Properties In PowerShell

Claudio Silva explains the default displayed properties in PowerShell and how you can find non-default properties:

First, let me say that this person knows that Select-Object can be used to select the properties we want, so he tried to guess the property name using a trial/error approach.

The person tried:

Get-Service WinRM | Select-Object Startup, Status, Name, DisplayName

and also:

Get-Service WinRM | Select-Object StartupType, Status, Name, DisplayName

But all of them were just empty.

There is a better way.
