
Curated SQL Posts

Power BI: Parameterized Loading With Web.Contents()

Stacia Varga shows how to get Power BI automated refresh working when hitting dynamically generated endpoint URLs:

There’s just one problem. I’m using the Web.Contents() function in several queries to get the JSON data from the NHL API.

Actually, the function is not the problem. It’s the way I’m using it to dynamically construct URLs so that I can iterate through lists of players or teams or whatever to retrieve the data I want.

Clearly that technique works just fine because I have data. However, when I try to schedule the refresh in the Power BI service, I get the dreaded error: “You can’t schedule refresh for this dataset because one or more sources currently don’t support refresh.”

I explained how to solve this particular problem once upon a time. For the hockey data queries, I had to use variations on that theme to get the dynamic URLs to work. To do this, I had to fix up my queries (which are actually functions) in the Query Editor by opening up the Advanced Editor and fixing the code as noted below.

Read on for her examples.  Given some of the problems she ran into, it seems like it might be a good idea to pull that data into SQL Server (or somewhere) and thereby separate data retrieval from data processing.


Views Don’t Improve Performance

Grant Fritchey lays down the law on views:

One day, it’s going to happen. I’m going to hear some crazy theory about how SQL Server works and I’m going to literally explode. Instead of some long silly rant with oddball literary & pop culture references you’ll get a screed the size of Ulysses (and about as much fun to read). However, for the moment, like Robin Williams describing a dance move, I’m going to keep it all inside. Here’s our query:

No, no where clause because we have to compare this to this, our view:

Grant used up much of his strategic reserve of GIFs in that post, so check it out.
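Grant's actual query and view aren't reproduced above, but as a rough sketch of the underlying point (using a hypothetical Sales.Orders table, not his code): a view is just a saved query that the optimizer inlines, so querying the view and querying the base table directly should produce the same execution plan.

```sql
-- Hypothetical illustration: the view is expanded during optimization,
-- so both SELECTs below should compile to the same plan.
CREATE VIEW dbo.OrderSummary
AS
SELECT o.CustomerID,
       o.OrderDate,
       o.TotalDue
FROM Sales.Orders AS o;
GO

SELECT CustomerID, OrderDate, TotalDue
FROM dbo.OrderSummary
WHERE CustomerID = 42;

SELECT CustomerID, OrderDate, TotalDue
FROM Sales.Orders
WHERE CustomerID = 42;
```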


Power BI Custom Visuals In Excel

David Smith notes that Excel is getting a bit of an upgrade:

This week at the BUILD conference, Microsoft announced that Power BI custom visuals will soon be available as charts in Excel. You’ll be able to choose a range of data within an Excel workbook, and pass those data to one of the built-in Power BI custom visuals, or one you’ve created yourself using the API.

David’s point is that you can bring in R charts, but it extends to more than that.


Using The Spark Connector To Speed Up Data Loads

Denzil Riberio explains how you can use the Spark connector for Azure SQL DB and SQL Server to speed up inserting data from Spark into SQL Server by 15x over the native JDBC client:

Since the load was taking longer than expected, we examined the sys.dm_exec_requests DMV while the load was running, and saw that there was a fair amount of latch contention on various pages, which wouldn’t be expected if data was being loaded via a bulk API.

Examining the statements being executed, we saw that the JDBC driver uses sp_prepare followed by sp_execute for each inserted row; therefore, the operation is not a bulk insert. Examining the Spark JDBC connector source code further, we see that it builds a batch consisting of singleton insert statements and then executes the batch via the prep/exec model.

It’s the power of bulk insertion.
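If you want to reproduce that kind of diagnosis yourself, a generic DMV check along these lines (a sketch, not Denzil's exact query) works: watch the active requests while the load runs and look for repeated sp_prepare/sp_execute statements and page latch waits rather than a single bulk insert request.

```sql
-- Sketch: inspect active requests during the load. Row-by-row JDBC inserts
-- show up as repeated sp_prepare / sp_execute statements with page latch waits;
-- a bulk API load appears as a single bulk insert request instead.
SELECT r.session_id,
       r.status,
       r.command,
       r.wait_type,
       r.last_wait_type,
       t.text AS statement_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id > 50;  -- ignore system sessions
```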


Reducing Backup Sizes

Ned Otter gives us some options for reducing the size of full backups:

SPARSE columns

If a column contains mostly NULLs, then depending on the data type, you can achieve space savings by using the SPARSE property (documentation here). SPARSE columns can be used with filtered indexes to theoretically reduce storage space and increase query performance. But there are a boatload of gotchas, such as issues with query plan caching (filtered indexes), and the fact that if you use SPARSE columns, neither the table nor its indexes can have any form of compression (the documentation is clear about not supporting table compression, but does not mention index compression being an issue – but it is).

As the documentation clearly states, when converting a column from non-sparse to sparse, the following steps are taken:

  1. Adds a new column to the table in the new storage size and format
  2. For each row in the table, updates and copies the value stored in the old column to the new column
  3. Removes the old column from the table schema
  4. Rebuilds the table (if there is no clustered index) or rebuilds the clustered index to reclaim space used by the old column

For large tables with even a few columns that you wanted to convert to SPARSE, this process would take forever, because you must do this for each column you want to convert.
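For reference, the declaration side is straightforward; here is a minimal sketch with a hypothetical table, showing a mostly-NULL column marked SPARSE plus a filtered index over the rows that actually have values:

```sql
-- Hypothetical example: a mostly-NULL column declared as SPARSE, paired with
-- a filtered index that only covers the non-NULL rows.
CREATE TABLE dbo.Widget
(
    WidgetID         int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    WidgetName       varchar(100) NOT NULL,
    DiscontinuedDate date SPARSE NULL  -- NULL values take no space in the row
);

CREATE NONCLUSTERED INDEX IX_Widget_DiscontinuedDate
    ON dbo.Widget (DiscontinuedDate)
    WHERE DiscontinuedDate IS NOT NULL;
```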

I don’t like sparse columns at all, but I do like the rest of Ned’s options.


Updated DIML Catalog Browser

Andy Leonard announces a new version of his Catalog Browser utility:

Catalog Browser first displays the reference mapping in the context of the environment named DEV_Person. DEV_Person is a Catalog Environment that contains a collection of Catalog Environment Variables.

Catalog Browser next displays the reference mapping in the context of the SSIS Connection Manager named AdventureWorks2014.OLEDB that consumes the Reference between the DEV_Person environment and the Load_Person.dtsx SSIS package. Note that this Reference Mapping is displayed as <Property Name> –> <Environment Variable Name>, or “ConnectionString –> SourceConnectionString”. Why? Catalog Browser is displaying the Reference Mapping from the perspective of the Connection Manager property.

The third instance of Values Everywhere is shown in the Package Connection References node. Remember, a reference “connects” a package or project to an SSIS Environment Variable (learn more at SSIS Catalog Environments – Step 20 of the Stairway to Integration Services). From the perspective of the reference, the reference mapping is displayed as <Environment Variable Name> –> <Property Name>, or “SourceConnectionString –> ConnectionString”. Why? Catalog Browser is displaying the Reference Mapping from the perspective of the Reference.
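If you'd rather see the same mappings without a GUI, something along these lines against the standard SSISDB catalog views should surface them (a sketch only; it assumes the default catalog view names, and you may want to filter by folder or project):

```sql
-- Sketch: list parameter-to-environment-variable mappings, roughly what
-- Catalog Browser visualizes as <Property Name> --> <Environment Variable Name>.
SELECT p.name                      AS project_name,
       op.object_name              AS object_name,   -- package or project
       op.parameter_name           AS property_name,
       op.referenced_variable_name AS environment_variable_name,
       er.environment_name
FROM SSISDB.catalog.projects AS p
JOIN SSISDB.catalog.object_parameters AS op
    ON op.project_id = p.project_id
LEFT JOIN SSISDB.catalog.environment_references AS er
    ON er.project_id = p.project_id
WHERE op.value_type = 'R';  -- 'R' = the parameter references an environment variable
```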

Check it out.


Update Statistics After An Upgrade

Erin Stellato gives us some good life advice:

There are a variety of methods we use for helping customers upgrade to a new SQL Server version, and one question we often get asked is whether or not statistics need to be updated as part of the upgrade process.

tl;dr

Yes.  Update statistics after an upgrade.  Further, if you’re upgrading to 2012 or higher from an earlier version, you should rebuild your indexes (which will update index statistics, so then you just need to update column statistics).
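As a minimal sketch of what that might look like for one table (the table name is hypothetical, and this isn't necessarily Erin's preferred tooling):

```sql
-- Rebuild indexes (which refreshes index statistics with a full scan)...
ALTER INDEX ALL ON dbo.SalesOrderDetail REBUILD;

-- ...then update the remaining column statistics on the table.
UPDATE STATISTICS dbo.SalesOrderDetail WITH FULLSCAN, COLUMNS;

-- Or, to touch statistics across the whole database (default sampling rate):
EXEC sys.sp_updatestats;
```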

Make this one of your “Not too long; did read” posts of the day.


Choosing Between SQL, M, And DAX

Paul Turley chooses the form of his destructor:

A Date dimension table is an essential component in most any data warehouse or reporting database, so techniques to generate these tables have been around for a long time.  The foundation of a Date dimension table is a table containing one row per contiguous date in a range that includes every possible transaction date or fact record.  To make reporting easier, it is common practice to have multiple date dimensions in the semantic model.  For example, if sales transaction facts have an Order Date and a Delivery Date, and both are used independently for reporting, there may be an Order Date dimension and a Delivery Date dimension in the model.

A common practice for building the dimension table is to just populate a single Date type column with the sequential date values.  After these rows are inserted, date part functions may be used to populate additional columns by referencing the Date value in an expression.  Most every language includes, for example, a MONTH() and YEAR() function to convert a date value into these date parts.
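In T-SQL, for example, that pattern could look something like the sketch below (an arbitrary date range and column names, and only the SQL side of the comparison):

```sql
-- Sketch: generate one row per date in a range, then derive the date parts
-- from the Date column with standard date functions.
DECLARE @StartDate date = '2015-01-01',
        @EndDate   date = '2025-12-31';

WITH Dates AS
(
    SELECT @StartDate AS TheDate
    UNION ALL
    SELECT DATEADD(DAY, 1, TheDate)
    FROM Dates
    WHERE TheDate < @EndDate
)
SELECT TheDate,
       YEAR(TheDate)              AS CalendarYear,
       MONTH(TheDate)             AS MonthNumber,
       DATENAME(MONTH, TheDate)   AS MonthName,
       DATEPART(QUARTER, TheDate) AS CalendarQuarter
FROM Dates
OPTION (MAXRECURSION 0);  -- the range is longer than the default 100 recursions
```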

I’m hoping that Paul puts together several of these types of posts, where he contrasts building something in SQL, M, and DAX so we can see which language helps most where.


Error Processing SSIS Task When TargetServerVersion Is SQL Server 2016

Shabnam Watson diagnoses an error condition when trying to run an Analysis Services processing task inside SQL Server Integration Services:

I ran into this problem a while ago at a client. They upgraded from Visual Studio 2013 to 2015 and the SSAS processing tasks started to error out immediately. The solution turned out to be setting the TargetServerVersion to anything but SQL Server 2016. In this case, it was set to 2014 and that fixed the error.

Recently I ran into this post https://twitter.com/SQLKohai/status/994335086425399297 by Matt Cushing (@SQLKohai) and decided to dig in more.  Initially when I tested it, all was working fine. After I installed SSDT 2015 to test, I started getting the same error in SSDT 2017.  I played around with a DLL and got SSDT 2017 to work with all TargetServerVersions again. At the end I managed to break it again after I went through an uninstall and reinstall of all versions of SSDT. The reason I did the reinstall of SSDT was that I thought I might have had a broken registry entry that I was hoping the installation would fix. This did not work!

Read on for the solution and a detailed dive into the problem.


Non-Blocking Aggregations

Daniel Hutmacher tilts at windmills:

It’s not entirely uncommon to want to group by a computed expression in an aggregation query. The trouble is, whenever you group by a computed expression, SQL Server considers the ordering of the data to be lost, and this will turn your buttery-smooth Stream Aggregate operation into a Hash Match (aggregate) or create a corrective Sort operation, both of which are blocking.

Is there anything we can do about this? Yes, sometimes, like when those computed expressions are YEAR() and MONTH(), there is. But you should probably get your nerd on for this one.
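To make the setup concrete, here is a small sketch of the problem itself (a hypothetical dbo.Orders table with an index on OrderDate; this is not Daniel's solution, just the before-and-after shape of the grouping):

```sql
-- Grouping by computed expressions: SQL Server can no longer trust the index
-- ordering, so expect a Hash Match (aggregate) or an extra Sort operator.
SELECT YEAR(OrderDate)  AS OrderYear,
       MONTH(OrderDate) AS OrderMonth,
       SUM(TotalDue)    AS TotalSales
FROM dbo.Orders
GROUP BY YEAR(OrderDate), MONTH(OrderDate);

-- Grouping directly on the indexed column preserves the ordering, so a
-- non-blocking Stream Aggregate remains possible.
SELECT OrderDate,
       SUM(TotalDue) AS TotalSales
FROM dbo.Orders
GROUP BY OrderDate;
```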

There are many ways to solve a problem, and sometimes the best method is indirect.
