2018-05-14 – Curated SQL

Power BI Custom Visuals In Excel

Published 2018-05-14 by Kevin Feasel

David Smith notes that Excel is getting a bit of an upgrade:

This week at the BUILD conference, Microsoft announced that Power BI custom visuals will soon be available as charts with Excel. You’ll be able to choose a range of data within an Excel workbook, and pass those data to one of the built-in Power BI custom visuals, or one you’ve created yourself using the API.

David’s point is that you can bring in R charts, but it extends to more than that.

Comments closed

Using The Spark Connector To Speed Up Data Loads

Published 2018-05-14 by Kevin Feasel

Denzil Riberio explains how you can use the Spark connector for Azure SQL DB and SQL Server to speed up inserting data from Spark into SQL Server 15x over the native JDBC client:

Since the load was taking longer than expected, we examined the sys.dm_exec_requests DMV while load was running, and saw that there was a fair amount of latch contention on various pages, which wouldn’t not be expected if data was being loaded via a bulk API.

Examining the statements being executed, we saw that the JDBC driver uses sp_prepare followed by sp_execute for each inserted row; therefore, the operation is not a bulk insert. One can further example the Spark JDBC connector source code, it builds a batch consisting of singleton insert statements, and then executes the batch via the prep/exec model.

It’s the power of bulk insertion.

Comments closed

Reducing Backup Sizes

Published 2018-05-14 by Kevin Feasel

Ned Otter gives us some options for reducing the size of full backups:

SPARSE columns

IF a column contains mostly NULLs, then depending on the data type, you can achieve space savings by using the SPARSE property (documentation here). SPARSE columns can be used with filtered indexes to theoretically reduce storage space and increase query performance. But there are a boatload of gotchas, such as issues with query plan caching (filtered indexes), and the fact that if you use SPARSE columns, neither the table or indexes can have any form of compression (the documentation is clear about not supporting table compression, but does not mention index compression being an issue – but it is).

As the documentation clearly states, when converting a column from non-sparse to sparse, the following steps are taken:

Adds a new column to the table in the new storage size and format

For each row in the table, updates and copies the value stored in the old column to the new column

Removes the old column from the table schema

Rebuilds the table (if there is no clustered index) or rebuilds the clustered index to reclaim space used by the old column

For large tables with even a few columns that you wanted to convert to SPARSE, this process would take forever, because you must do this for each column you want to convert.

I don’t like sparse columns at all, but I do like the rest of Ned’s options.

Comments closed

Updated DIML Catalog Browser

Published 2018-05-14 by Kevin Feasel

Andy Leonard announces a new version of his Catalog Browser utility:

Catalog Browser first displays the reference mapping in the context of the environment named DEV_Person. DEV_Person is a Catalog Environment that contains a collection of Catalog Environment Variables.

Catalog Browser next displays the reference mapping in the context of the SSS Connection Manager named AdventureWorks2014.OLEDB that consumes the Reference between the DEV_Person environment and the Load_Person.dtsx SSIS package. Note that this Reference Mapping is displayed as <Property Name> –> <Environment Variable Name>, or “ConnectionString –> SourceConnectionString”. Why? Catalog Browser is displaying the Reference Mapping from the perspective of the Connection Manager property.

The third instance of Values Everywhere is shown in the Package Connection References node. Remember, a reference “connects” a package or project to an SSIS Environment Variable (learn more at SSIS Catalog Environments– Step 20 of the Stairway to Integration Services). From the perspective of the reference, the reference mapping is displayed as <Environment Variable Name> –> <Property Name>, or “SourceConnectionString –> ConnectionString”. Why? Catalog Browser is displaying the Reference Mapping from the perspective of the Reference.

Check it out.

Comments closed

Update Statistics After An Upgrade

Published 2018-05-14 by Kevin Feasel

Erin Stellato gives us some good life advice:

There are a variety of methods we use for helping customers upgrade to a new SQL Server version, and one question we often get asked is whether or not statistics need to be updated as part of the upgrade process.

tl;dr

Yes. Update statistics after an upgrade. Further, if you’re upgrading to 2012 or higher from an earlier version, you should rebuild your indexes (which will update index statistics, so then you just need to update column statistics).

Make this one of your “Not too long; did read” posts of the day.

Comments closed

Choosing Between SQL, M, And DAX

Published 2018-05-14 by Kevin Feasel

Paul Turley chooses the form of his destructor:

A Date dimension table is an essential component in most any data warehouse or reporting database so techniques to generate these tables have been around for a long time. The foundation of a Date dimension table is a table containing one row per contiguous date in a range that includes every possible transaction date or fact record. To make reporting easier, it is common practice to have multiple date dimensions in the semantic model. For example, if sales transaction facts have an Order Date and a Delivery Date, and both are used independently for reporting; there may be an Order Date dimension and a Delivery Date dimension in the model.

A common practice for building the dimension table is to just populate a single Date type column with the sequential date values. After these rows are inserted, date part functions may be used to populate additional columns by referencing the Date value in an expression. Most every language includes, for example, a MONTH() and YEAR() function to convert a date value into these date parts.

I’m hoping that Paul puts together several of these types of post, where he contrasts building something in SQL, M, and DAX so we can see which language helps most where.

Comments closed

Error Processing SSIS Task When TargetServerVersion Is SQL Server 2016

Published 2018-05-14 by Kevin Feasel

Shabnam Watson diagnoses an error condition when trying to run an Analysis Services processing task inside SQL Server Integration Services:

I ran into this problem a while ago at a client. They upgraded from Visual Studio 2013 to 2015 and the SSAS processing tasks started to error out immediately. The solution turned out to be setting the TargetSeverVersion to anything but SQL Server 2016. In this case, it was set to 2014 and that fixed the error.

Recently I ran into this post https://twitter.com/SQLKohai/status/994335086425399297 by Matt Cushing (@SQLKohai) and decided to dig in more. Initially when I tested it, all was working fine. After I installed SSDT 2015 to test, I started getting the same error in SSDT 2017. I played around with a DLL and got SSDT 2017 to work with all TargetVersionServers again. At the end I managed to break it again after I went through an uninstall and reinstall of all versions of SSDT. The reason I did the reinstall of SSDT was that I thought I might have had a broken registry entry that I was hoping the installation would fix. This did not work!

Read on for the solution and a detailed dive into the problem.

Comments closed

Non-Blocking Aggregations

Published 2018-05-14 by Kevin Feasel

Daniel Hutmacher tilts at windmills:

It’s not entirely uncommon to want to group by a computed expression in an aggregation query. The trouble is, whenever you group by a computed expression, SQL Server considers the ordering of the data to be lost, and this will turn your buttery-smooth Stream Aggregate operation into a Hash Match (aggregate) or create a corrective Sort operation, both of which are blocking.

Is there anything we can do about this? Yes, sometimes, like when those computed expressions are YEAR() and MONTH(), there is. But you should probably get your nerd on for this one.

There are many ways to solve a problem, and sometimes the best method is indirect.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Day: May 14, 2018