Press "Enter" to skip to content

Author: Kevin Feasel

Choosing Azure Data Lake Analytics Versus Azure Databricks

Ginger Grant helps us make the decision between using Azure Data Lake Analytics and Azure Databricks:

Databricks is a recent addition to Azure that is greatly influencing the technology choices people are making when determining how to process data. Prior to the introduction of Databricks to Azure in March of 2018, if you had a lot of unstructured data stored in HDFS clusters and wanted to analyze it in a scalable fashion, the choice was Data Lake and using U-SQL with Data Lake Analytics. With the introduction of Databricks, there is now a choice between Data Lake Analytics and Databricks for analyzing data.

Click through for the comparison.


Miscellany On Java In SQL Server

Niels Berglund continues a series on SQL Server 2019 and Java support:

This is the fourth post in which I look at the Java extension in SQL Server, i.e. the ability to execute Java code from inside SQL Server. The previous three posts are:
SQL Server 2019 Extensibility Framework & Java – Hello World: We looked at installing and enabling the Java extension, as well as some very basic Java code.
SQL Server 2019 Extensibility Framework & Java – Passing Data: In this post, we discussed what is required to pass data back and forth between SQL Server and Java.
SQL Server 2019 Extensibility Framework & Java – Null Values: This, the Null Values, post is a follow up to the Passing Data post, and we look at how to handle null values in data passed to Java.
This fourth post acts as a “roundup” of miscellaneous “stuff” I did not cover in the three previous posts

If you haven’t seen the first three, check them out too.
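For a sense of what these calls look like, here is a minimal sketch of the T-SQL side, which follows the same sp_execute_external_script pattern as R and Python. The package and class name are hypothetical, and the Java class itself must follow the extension's input/output conventions that Niels covers in the Passing Data post.

EXEC sp_execute_external_script
    @language = N'Java',
    @script = N'sqljava.HelloWorld',       -- hypothetical package.class to execute
    @input_data_1 = N'SELECT 1 AS col1'    -- result set handed to the Java class
WITH RESULT SETS ((col1 int));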


The State Of Database Scoped Configurations

Niko Neugebauer takes us through the current state of Database Scoped Configurations in SQL Server:

I have already blogged about the first version of the Database Scoped Configurations for SQL Server 2016, with 4 visible options plus the procedure cache cleaning option, but SQL Server 2017 followed with 5 (listed) & 9 (in practice – DISABLE_INTERLEAVED_EXECUTION_TVF, DISABLE_BATCH_MODE_ADAPTIVE_JOINS, BATCH_MODE_MEMORY_GRANT_FEEDBACK, BATCH_MODE_ADAPTIVE_JOINS are visible and functioning), and in just another year we have received a huge upgrade to the currently available 21 for SQL Server 2019.

It seems like this is the route the SQL Server team is going down, and it makes sense: your settings for Mega-DB probably shouldn’t be the same as for the tiny database in the corner. Oh, and that whole Azure SQL Database thing.
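As a quick refresher on the syntax, these settings apply per database rather than instance-wide; the MAXDOP value below is purely illustrative:

-- See the configurations (and their values) exposed on this database
SELECT name, value, value_for_secondary
FROM sys.database_scoped_configurations;

-- Change a setting for the current database only
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 4;

-- The procedure cache cleaning option mentioned above
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;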


Building A Kubernetes Cluster With Kubespray

Chris Adkin continues a series on Kubernetes clusters:

In essence, Kubespray is a collection of Ansible playbooks: YAML files that specify what actions should take place against one or more machines listed in a hosts.ini file, which resides in what is known as an inventory. Of all the infrastructure-as-code tools available at the time of writing, Ansible is the most popular and has the greatest traction. Examples of playbooks produced by Microsoft can be found on GitHub for automating tasks in Azure and deploying SQL Server availability groups on Linux. The good news for anyone into PowerShell is that PowerShell modules can be installed via Ansible and PowerShell commands can be executed via Ansible. Also, there are people already using PowerShell Desired State Configuration with Ansible. Ansible’s popularity is down to the fact that it is easy to pick up and agentless, relying instead on ssh; hence one of the steps in this post includes the creation of ssh keys. This free tutorial is highly recommended for anyone wishing to pick up Ansible.

Click through for a step-by-step tutorial.


Creating An Azure Storage Account

John Morehouse walks us through setting up an Azure Storage Account through the Azure Portal:

Azure offers a lot of features that enable IT professionals to really enhance their environment. One feature that I really like about Azure is storage accounts. Disk is relatively cheap, and this continues to hold true in the cloud: for less than $100 per month, you can get up to 5TB of storage, including redundancy to another Azure region.

Read on to learn how to set up one of these.


Diving Into OPTION(RECOMPILE)

Arthur Daniels explains some of the nuance behind OPTION(RECOMPILE) on T-SQL statements:

SQL Server will compile an execution plan specifically for the statement that the query hint is on. There are some benefits, like something called “constant folding.” To us, that just means that the execution plan might be better than a normal execution plan compiled for the current statement.
It also means that the statement itself won’t be vulnerable to parameter sniffing from other queries in cache. In fact, the statement with option recompile won’t be stored in cache.

Click through for a couple of demos as well as a discussion of positives and negatives regarding its use.
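For a feel of the syntax, here is a minimal sketch; the table and parameter names are illustrative (borrowed from AdventureWorks):

-- The hint applies only to this statement: the plan is compiled for the
-- current value of @CustomerID and is not stored in the plan cache.
DECLARE @CustomerID int = 11000;

SELECT SalesOrderID, OrderDate, TotalDue
FROM Sales.SalesOrderHeader
WHERE CustomerID = @CustomerID
OPTION (RECOMPILE);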


MLflow 0.8.1 Released

Aaron Davidson, et al., announce a new version of Databricks MLflow:

When scoring Python models as Apache Spark UDFs, users can now filter UDF outputs by selecting from an expanded set of result types. For example, specifying a result type of pyspark.sql.types.DoubleType filters the UDF output and returns the first column that contains double precision scalar values. Specifying a result type of pyspark.sql.types.ArrayType(DoubleType) returns all columns that contain double precision scalar values. The example code demonstrates result type selection using the result_type parameter, and the short example notebook illustrates a Spark model being logged and then loaded as a Spark UDF.

Read on for a pretty long list of updates.


File Formats Supported In HDFS

Manoj Pandey covers a few of the file types supported by the Hadoop Distributed File System:

HDFS, or the Hadoop Distributed File System, is the distributed file system provided by the Hadoop big data platform. The primary objective of HDFS is to store data reliably even in the presence of node failures in the cluster, which is facilitated by data replication across different racks in the cluster infrastructure. Files stored in HDFS are then used for further processing by different data processing engines like Hadoop MapReduce, Hive, Spark, Impala, and Pig.

There are a few other formats not included in this list, including RCFile (which has been superseded by both ORC and Parquet), but this hits the highlights.
