Kevin Feasel – Page 664

SQL Server 2019 has been a bit of a roller coaster ride. In particular, UDF inlining started as I think the most interesting addition to the product. Big brain stuff, for sure.
It has been nerfed quite a bit, with seemingly more and more restrictions added to every cumulative update. Hopefully some of these can be lifted at the feature matures, but I understand how difficult all this is.
People program absolute bloodbaths into functions.
Today, I want to look at one restriction that has a fairly simple workaround: Calling GETDATE().

Click through to see how you can replace calls to GETDATE() without too much hassle.

Comments closed

Using ggplot2 to Create a Faceted Histogram plus Curve

Published 2021-06-28 by Kevin Feasel

Sebastian Sauer builds a combo chart:

Overlaying a histogram (possibly facetted) is not something far fetched when analyzing data. Surprisingly, it appears (to the best of my knowledge) that there’s no comfortable out-of-the-box solution in ggplot2, although it can be of course achieved with some lines of code. Here’s my take.

Click through for Sebastian’s version, as well as information on the ggh4x library.

Comments closed

Increasing Refresh Parallelism in Power BI Premium

Published 2021-06-28 by Kevin Feasel

Chris Webb pushes the “go faster” button:

In this case I started the refresh from the Power BI portal so the default parallelism settings were used. The y axis on this graph shows there were six processing slots available, which means that six objects could be refreshed in parallel – and because there are nine partitions in the only table in the dataset, this in turn meant that some slots had to refresh two partitions. Overall the dataset took 33 seconds to refresh.
However, if you connect from SQL Server Management Studio to the dataset via the workspace’s XMLA Endpoint (it’s very similar to how you connect Profiler, something I blogged about here) you can construct a TMSL script to refresh these partitions with more parallelism.

Read on to see how you can do this, as well as the net improvement.

Comments closed

Methods for Resolving Last Page Insert Contention

Published 2021-06-28 by Kevin Feasel

Esat Erkec shows us three techniques for resolving last page insert contention:

Primary keys constraints uniquely identify each row in the table and automatically creates a clustered index on the underlining table. This duo is frequently used in table design by database developers. At the same time, if this column is decorated with an identity constraint thus we obtain a sequential incremental index key column. The clustered index creates a sorted data structure of the table for this reason a newly inserted row will be added at the end of the clustered index page until that page is filled. When solely one thread adds data to the above-mentioned table, we will never experience a last page insert contention because this problem will occur with concurrent usage of this table. In the high-volume insert operations, the last page of the index is not accessed by all threads concurrently. All threads start waiting for the last page to be accessible to them because the last page is locked by a thread. This bottleneck affects the SQL Server performance and the PAGELATCH_EX wait type begins to be observed too much.

Read on for three techniques, though I’d swap out “use a heap” for “use a uniqueidentifier and watch Jeff Moden’s video on the topic.”

Comments closed

Optimization Parameters in Oracle 19c

Published 2021-06-28 by Kevin Feasel

Kellyn Pot’Vin-Gorman enters a time warp:

As I and the dedicated CSA were working to optimize the ETL load on Oracle in Azure IaaS, I noticed that there wasn’t a significant improvement with physical VM and storage changes as expected. As I dug into the code and database design, I started to document what I’ve summarized above and realized that the database was quite frozen in time. Even though I couldn’t make changes to the code, (per the customer request) I was quickly understanding why we had such limited success and why I was failing miserably as I attempted to put recommended practices in place at the parameter level for the Oracle 19c database from what they had originally.
As I thought this through, I had an epiphany- This database was doing everything in its power to be a 10g or earlier database so why shouldn’t I optimize it like one?

Read on to see what this entails.

Comments closed

Power BI Cleaner Gen2

Published 2021-06-28 by Kevin Feasel

Imke Feldmann introduces a new version of the Power BI Cleaner:

Today I’m very excited to share with you my first version of a complete rework of my Power BI Cleaner tool. It is way faster the the initial version, overcomes some bugs and limitations of the old version and doesn’t require creating additional vpax files.
On top of that, I’ve created an Excel-version, that adds some very convenient additional features: The option to analyze thin reports and to generate scripts that delete unused measures and hides unused columns automatically.

Click through for instructions on how it all works.

Comments closed

A Summary of Time Series Algorithms

Published 2021-06-25 by Kevin Feasel

Gavita Regunath and Dan Lantos give an overview of time series algorithms:

Time series forecasting is a data science task that is critical to a variety of activities within any business organisation. Time series forecasting is a useful tool that can help to understand how historical data influences the future. This is done by looking at past data, defining the patterns, and producing short or long-term predictions.

Click through for an overview, as well as ten examples of algorithms you can use for handling time series data.

Comments closed

LAG() in SQL Server

Published 2021-06-25 by Kevin Feasel

Chad Callihan shows off one of the best window functions:

The LAG function in SQL Server allows you to work with a row of data as well as the previous row of data in a data set. When would that ever be useful? If you’re a sports fan, you’re familiar with this concept whether you realize it or not. Let’s look at an example.

LAG() is outstanding for business reports, such as if you want three-month trailing data.

Comments closed

Joining Lists in Python via Nested Loops

Published 2021-06-25 by Kevin Feasel

Jon Fletcher has a quick one for us:

In this blog post I will show you how to join two 2D Python lists together.
The code is in the screenshot below.

Nested loops is an easy algorithm to understand, and even here, the whole process is 10 lines of code.

Comments closed

SQL Server 2019 on CentOS 7.5 Issues

Published 2021-06-25 by Kevin Feasel

Aaron Bertrand recaps some recent installation issues:

I’ve created countless Docker containers running SQL Server since I first wrote about it back in 2016, but I recently had my first foray into configuring SQL Server 2019 on a real live Linux machine.
It did not go as smoothly as I expected, so I wanted to share the solution to a particular problem I haven’t seen described elsewhere.
First, let me retrace my steps.

Click through for a summary of the issues.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Author: Kevin Feasel

A GETDATE() Workaround when Rewriting Scalar UDFs

Using ggplot2 to Create a Faceted Histogram plus Curve

Increasing Refresh Parallelism in Power BI Premium

Methods for Resolving Last Page Insert Contention

Optimization Parameters in Oracle 19c

Power BI Cleaner Gen2

A Summary of Time Series Algorithms

LAG() in SQL Server

Joining Lists in Python via Nested Loops

SQL Server 2019 on CentOS 7.5 Issues