Press "Enter" to skip to content

Month: May 2023

Encrypting SQL Server Backups

Matthew McGiffen lays out the requirements:

When we talk about protecting our at-rest data, the item that we are likely to be most concerned about is the security of our backups. Backups are generally – and should be – stored off the server itself, and often we will ship copies offsite to a third party where we don’t have control over who can access the data, even if we trust that that will be well managed.

From SQL Server 2014 the product has included the ability to encrypt data while creating a backup. This feature is available in both the standard and enterprise editions of SQL Server, so it is something you can use even when TDE may not be a feature that is available to you.

Click through for a primer on the topic.

Comments closed

Against Keys in Fact Tables

Marc Lelijveld searches for keys under the lamppost:

Another blog post based on recent client experiences. Last week, I visited a client where we had extensive discussions on data model optimization. As you might know, data modeling in Power BI is one of my favorite topics, so I had an excellent day. It’s also not the first time that I blog about anything data modeling and optimization. If you haven’t read it yet, I recommend reading my previous blog on this topic.

This blog will focus on the need of keys in your tables and primarily your fact tables in your data model. I keep running into data models at customers which are flooded with keys in all tables. For each of them you should ask, do I really need this and could I save it in a different data type for further optimization. In this blog, I will further elaborate on keys in your data model, typical use cases and how these cases can be solved in different manners.

Read the whole thing. The really short version is classic Kimball-style advice: keys for dimensions, not for facts. And in Power BI, removing a unique column from a fact table can speed things up by shrinking the compressed fact table size.

Comments closed

Oracle Connectors for SSIS

Meagan Longoria talks to Oracle:

We needed to make sure we had the most current version of the drivers required to make the package work.

This turned out to be an adventure involving 3 different drivers and plans to refactor, so I’m documenting some of it here in case it helps someone else.

Read on for your options.

Comments closed

Updates to SqlPackage and DacFx

Drew Skwiers-Koballa has an update for us:

As the primary command-line interface to DacFx, SqlPackage often benefits from dependency changes in DacFx, including the Microsoft.Data.SqlClient driver. In SqlPackage 162.0.52, the SqlClient driver has been updated from v5.0.1 to v5.1.0, bringing support for TLS1.3 encrypted connections to the .NET Core builds of SqlPackage and the ServerCertificate connection string setting for validation of SQL Server’s TLS/SSL certificate. 

Many of the fixes in this release were triaged out of issues submitted on the DacFx GitHub repository and are visible in the release milestone.  The full list of SqlPackage fixes from the release, including eight deployment-related fixes, are also listed in the SqlPackage release notes. To improve our ability to service and enhance SqlPackage, SqlPackage now collects usage data, including anonymous feature usage and diagnostic data. For more information, see the documentation section on Usage data collection.

Read on for the full set of changes.

Comments closed

Migrating Multiple SSRS Reports to Power BI Paginated Reports

Olivier Van Steenlandt doesn’t do things one at a time:

A few weeks ago I released a blog post about migrating SSRS Reports to Power BI Paginated Reports. At that point in time, I wasn’t aware of any way to migrate multiple SSRS Reports in one go.

Meanwhile, I have done some research and experimented a bit. In this blog post, I will be going through 2 different ways to migrate multiple SSRS Reports at once to Power BI.

Click through for Olivier’s findings and how you can migrate en masse.

Comments closed

SandDance in Polyglot Notebooks

Matt Eland continues a series on dotnet Polyglot Notebooks:

As I’ve been doing more and more dotnet development in notebooks with Polyglot Notebooks I’ve found myself wanting more options to create rich data visualizations inside of VS Code the same way I could use Plotly for data visualization using Python and Jupyter Notebooks.

Thankfully, Microsoft SandDance exists and fills some of that gap in terms of doing rich data visualization from dotnet code.

In this article I’ll talk more about what SandDance is, show you how you can use it inside of a Polyglot Notebook in VS Code, and show you a simple way you can use it without needing a Polyglot Notebook.

Most of my experience with SandDance was with Azure Data Studio, but it’s nice to see this capability in notebooks as well.

Comments closed

Troubleshooting Issues with Full-Text Indexing

Jose Manuel Jurado Diaz digs into a customer problem:

Today, we got a new service request where our customer asks about the time spent populating the full-text. Following, I would like to share with you some lessons learned during this process, specifically working with Azure SQL Database.

For this example, we are going to use a General Purpose database with 8 vCores. Let’s get started with the creation of the table and fulltext.

Read on for a walkthrough of setting up full-text indexing and figuring out what the indexing engine is doing at any point in time.

Comments closed

Quick Tips for SQL Server Performance Troubleshooting

Matthew McGiffen has a two-parter for us. Part 1 covers useful tools for performance troubleshooters:

When you’ve got the symptoms of a database issue you can run a series of diagnostic queries to try and drill down on the problem and then start figuring out solutions. There are a number of free tools out there though that can speed up that process immensely. In this post we’re going to look at my favourites.

Part 2 takes us through the mechanics of measuring query performance:

Attendees have come up with a range of answers from “With a stopwatch” (which I like a lot) to the slightly more technical “Using Profiler”. I like the first answer just because that’s where we all start. We run something from an application and we literally time how long it takes, or we run something in SSMS and we use the counter near the bottom right of our query window to tell us how long it took. We may even run something 100 times in a loop and capture the overall time so we can take an average.

Incidentally, if you are using SET STATISTICS IO ON and are sick of the way it writes out results, Richie Rump’s Statistics Parser tool is great for converting the blob of text into something humans can easily parse.

Comments closed

Trying NTILE

Chad Callihan looks at the fourth ranking window function:

Have you ever used the NTILE function? Or have you even heard of the NTILE function? It seems to be one of the lesser known, lesser used window functions in SQL Server. I’ve never come across it in the wild but maybe there are those that use it all the time. Either way, let’s have a look at what it does and how it can be used.

Click through for a demo. I definitely use it a lot less than ROW_NUMBER(), RANK(), and DENSE_RANK(), but I have used it to some good effect in the past, mostly in cases where I’ve wanted to focus on the top X% of data for an analysis.

Comments closed

Dates and Times in R

Steven Sanderson talks dates and times. First up is an overview of how built-in date functionality works in R:

In this post, we will cover the basics of handling dates and times in R using the as.Dateas.POSIXct, and as.POSIXlt functions. We will use the example code below to explain each line in simple terms. Let’s get started!

The second post covers the Date type, as well as date plus time:

In the first part of our code, we use the as.Date() function to find the next Mother’s Day. Since we don’t need to consider specific times, we can simply use the Date class, which simplifies the process. We create a vector with two elements: startMothersDay and endMothersDay, both set to “2024-05-14”. This represents the range of Mother’s Day for the year 2024. Finally, we store the result in the variable NextMothersDay and print it to the console. Voilà! We have the next Mother’s Day date.

Note that this isn’t a post about date calculation, such as finding the Nth Sunday in the month of May

Comments closed