Author: Kevin Feasel

Increasing Refresh Parallelism in Power BI Premium

Chris Webb pushes the “go faster” button:

In this case I started the refresh from the Power BI portal so the default parallelism settings were used. The y axis on this graph shows there were six processing slots available, which means that six objects could be refreshed in parallel – and because there are nine partitions in the only table in the dataset, this in turn meant that some slots had to refresh two partitions. Overall the dataset took 33 seconds to refresh.

However, if you connect from SQL Server Management Studio to the dataset via the workspace’s XMLA Endpoint (it’s very similar to how you connect Profiler, something I blogged about here) you can construct a TMSL script to refresh these partitions with more parallelism. 

Read on to see how you can do this, as well as the net improvement.
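
The excerpt does not show the script itself, so here is a minimal sketch of what such a TMSL command might look like, built in Python so the partition list is easy to generate. It assumes the TMSL `sequence` command with its `maxParallelism` property wrapping a single `refresh` operation; the dataset, table, and partition names are hypothetical placeholders, and the parallelism you actually get depends on the capacity and the data source.

```python
import json


def build_refresh_script(database, table, partitions, max_parallelism):
    """Build a TMSL sequence command that refreshes the given partitions."""
    objects = [
        {"database": database, "table": table, "partition": name}
        for name in partitions
    ]
    return {
        "sequence": {
            # Upper bound on how many objects may be processed at once.
            "maxParallelism": max_parallelism,
            "operations": [
                {"refresh": {"type": "full", "objects": objects}}
            ],
        }
    }


# Hypothetical names: substitute the real dataset, table, and partitions.
partitions = [f"Partition {i}" for i in range(1, 10)]
script = build_refresh_script("MyDataset", "MyTable", partitions, max_parallelism=9)

# The JSON text is what you would paste into an XMLA query window in SSMS.
tmsl_text = json.dumps(script, indent=2)
print(tmsl_text)
```

With nine partitions and `maxParallelism` set to 9, each partition could in principle get its own slot rather than sharing the six default ones.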

Methods for Resolving Last Page Insert Contention

Esat Erkec shows us three techniques for resolving last page insert contention:

A primary key constraint uniquely identifies each row in a table and, by default, automatically creates a clustered index on the underlying table. Database developers use this pairing frequently in table design. If the key column is also given the identity property, we get a sequentially increasing index key. Because the clustered index keeps the table's data sorted, each newly inserted row is added to the last page of the clustered index until that page is full. When only one thread adds data to such a table, we will never experience last page insert contention, because the problem only occurs under concurrent use of the table. In high-volume insert operations, the last page of the index cannot be accessed by all threads concurrently. Threads start waiting for the last page to become accessible because another thread holds it exclusively. This bottleneck hurts SQL Server performance, and PAGELATCH_EX waits become excessive.

Read on for three techniques, though I’d swap out “use a heap” for “use a uniqueidentifier and watch Jeff Moden’s video on the topic.”
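
To see why the key pattern matters, here is a toy model (not SQL Server's storage engine) that counts how often an insert lands on the last page of a sorted index. It assumes fixed-size pages with no page splits, and it uses seeded random floats as a stand-in for uniqueidentifier values; the function name and page size are made up for illustration.

```python
import bisect
import random


def last_page_fraction(keys, rows_per_page=100):
    """Return the fraction of inserts that land on the last page of a sorted index."""
    index = []  # sorted list standing in for the clustered index
    hits = 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        # Treat the sorted list as consecutive pages of rows_per_page rows each.
        page_count = max(1, -(-len(index) // rows_per_page))
        last_page_start = (page_count - 1) * rows_per_page
        if pos >= last_page_start:
            hits += 1
        index.insert(pos, key)
    return hits / len(keys)


n = 10_000
sequential_keys = list(range(n))                 # identity-style keys
rng = random.Random(42)
random_keys = [rng.random() for _ in range(n)]   # stand-in for random GUIDs

sequential_fraction = last_page_fraction(sequential_keys)
random_fraction = last_page_fraction(random_keys)
print(f"sequential keys: {sequential_fraction:.1%} of inserts hit the last page")
print(f"random keys:     {random_fraction:.1%} of inserts hit the last page")
```

In this model every sequential insert targets the last page, while random keys spread their inserts across the whole index, which is the intuition behind swapping the identity key for a random one.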

Optimization Parameters in Oracle 19c

Kellyn Pot’Vin-Gorman enters a time warp:

As I and the dedicated CSA were working to optimize the ETL load on Oracle in Azure IaaS, I noticed that there wasn’t a significant improvement with physical VM and storage changes as expected. As I dug into the code and database design, I started to document what I’ve summarized above and realized that the database was quite frozen in time. Even though I couldn’t make changes to the code (per the customer’s request), I quickly understood why we had such limited success and why I was failing miserably as I attempted to put recommended practices in place at the parameter level for the Oracle 19c database from what they had originally.

As I thought this through, I had an epiphany: this database was doing everything in its power to be a 10g or earlier database, so why shouldn’t I optimize it like one?

Read on to see what this entails.

Power BI Cleaner Gen2

Imke Feldmann introduces a new version of the Power BI Cleaner:

Today I’m very excited to share with you my first version of a complete rework of my Power BI Cleaner tool. It is way faster than the initial version, overcomes some bugs and limitations of the old version and doesn’t require creating additional vpax files.

On top of that, I’ve created an Excel version that adds some very convenient additional features: the option to analyze thin reports and to generate scripts that delete unused measures and hide unused columns automatically.

Click through for instructions on how it all works.

A Summary of Time Series Algorithms

Gavita Regunath and Dan Lantos give an overview of time series algorithms:

Time series forecasting is a data science task that is critical to a variety of activities within any business organisation. Time series forecasting is a useful tool that can help to understand how historical data influences the future. This is done by looking at past data, defining the patterns, and producing short or long-term predictions.

Click through for an overview, as well as ten examples of algorithms you can use for handling time series data.

LAG() in SQL Server

Chad Callihan shows off one of the best window functions:

The LAG function in SQL Server allows you to work with a row of data as well as the previous row of data in a data set. When would that ever be useful? If you’re a sports fan, you’re familiar with this concept whether you realize it or not. Let’s look at an example.

LAG() is outstanding for business reports, for example when you want trailing three-month data.
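
A trailing three-month figure can be sketched with LAG() as follows. This runs the query through Python's built-in `sqlite3` module rather than SQL Server (it assumes SQLite 3.25 or later, which added window functions), and the `monthly_sales` table and its values are made up for illustration. The `LAG(...) OVER (ORDER BY ...)` syntax itself is the same in both engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly_sales (sales_month TEXT PRIMARY KEY, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?)",
    [
        ("2021-01", 100),
        ("2021-02", 120),
        ("2021-03", 90),
        ("2021-04", 150),
        ("2021-05", 130),
    ],
)

# LAG(amount, n) reads the amount from n rows earlier in month order.
# Rows without enough history get NULL, so the trailing sum is NULL there too.
query = """
SELECT
    sales_month,
    amount,
    LAG(amount, 1) OVER (ORDER BY sales_month) AS prev_amount,
    amount
        + LAG(amount, 1) OVER (ORDER BY sales_month)
        + LAG(amount, 2) OVER (ORDER BY sales_month) AS trailing_3_month
FROM monthly_sales
ORDER BY sales_month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first two months have no complete three-month window, so their trailing total comes back as NULL (`None` in Python) instead of a misleading partial sum.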

SQL Server 2019 on CentOS 7.5 Issues

Aaron Bertrand recaps some recent installation issues:

I’ve created countless Docker containers running SQL Server since I first wrote about it back in 2016, but I recently had my first foray into configuring SQL Server 2019 on a real live Linux machine.

It did not go as smoothly as I expected, so I wanted to share the solution to a particular problem I haven’t seen described elsewhere.

First, let me retrace my steps.

Click through for a summary of the issues.

Moving Artifacts between Folders in Synapse Studio

Wolfgang Strasser looks at a recent update:

Another small but very powerful usability extension in Azure Synapse Studio was added at the beginning of June: Move artifacts across folders in Synapse Studio (without extra clicks but with drag&drop)

Once again, the release notes list contained the short sentence that made me curious… hmm… that sounds nice… In one of my previous posts, I described the “old” way of moving artifacts around in Synapse Studio.

Click through for a demonstration.

Column-Level Encryption and Hashing

Eric Rouach shows off a pair of things:

Using the AdventureWorks2014 database as an example, the first script describes the process of encrypting the “CardNumber” column from the Sales.CreditCard table while keeping the data decryptable.

Our prerequisite is the creation of a Master Key, a Certificate and a Symmetric Key.

Once those are created, we may proceed to add a new column called “CardNumberEnc” (where the suffix “Enc” stands for “Encrypted”). This column has a VARBINARY(250) data type and is nullable.

Read on for an example of using column-level encryption, followed by how you’d decrypt the data. Then, Eric discusses hashing, though I disagree with the nomenclature of “encryption and make the data non-decryptable.” The reason is that encryption is, by its nature, a two-way process and necessarily requires the ability to decrypt. Hashing, meanwhile, is a one-way process without a direct means of reversal. Nomenclature aside, the examples are good and I appreciate Eric using one of the larger SHA2 hashing algorithms rather than MD5.
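
The one-way nature of hashing can be illustrated with a short sketch using Python's standard `hashlib` module. This is not Eric's script: it assumes SHA-512 as the "larger SHA2" algorithm, the card numbers are made up, and the digest is not meant to match SQL Server output byte for byte, since the string encoding may differ.

```python
import hashlib


def hash_card_number(card_number: str) -> bytes:
    """Return the SHA-512 digest of a card number (one-way, no key involved)."""
    return hashlib.sha512(card_number.encode("utf-8")).digest()


card = "1111222233334444"  # made-up value for illustration
digest = hash_card_number(card)

# The digest has a fixed size regardless of the input length.
print(len(digest))  # 64 bytes for SHA-512

# Hashing is deterministic: the same input always gives the same digest,
# so you can compare hashes without ever storing the original value.
print(hash_card_number(card) == digest)

# A one-character change produces a different digest, and there is no
# "unhash" function: the only way back is to guess inputs and re-hash them.
print(hash_card_number("1111222233334445") == digest)
```

Because card numbers come from a small, structured set of possible values, an unsalted hash like this one is open to guessing attacks, so treat the sketch as a demonstration of the concept rather than a storage recommendation.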