Kevin Feasel – Page 282

Building a Retry Mechanism for sqlcmd in Bash

Published 2023-09-28 by Kevin Feasel

Jose Manuel Jurado Diaz won’t let failure get him down:

Introduction:

Efficiently managing temporary failures and timeouts is crucial in production environments when connecting to databases. In this article, we’ll explore how to implement a retry mechanism with sqlcmd in a Bash script, dynamically increasing timeouts with each failed attempt.

Problem Statement:

Operations can fail due to network issues, overloaded servers, or other temporary problems when interacting with databases. Implementing a retry mechanism helps address these temporary issues without manual intervention.

Read on for the solution script. You could also adapt this to Powershell fairly easily, I think, though if you do go down that road, I’d recommend taking a look at Polly and PsPolly.

Comments closed

Finding Partitioned Tables in SQL Server

Published 2023-09-28 by Kevin Feasel

Andrea Allred has a script for us:

I recently needed to know which tables in my database were partitioned. I tried a bunch of queries and some got incredibly complex. I finally found one that I like:

Click through for the script and for the assumption Andrea makes (which is a reasonable one).

Comments closed

Thoughts on Cost Threshold for Parallelism

Published 2023-09-28 by Kevin Feasel

Erik Darling has some thoughts:

First, I’m not suggesting that anyone should be using the default value for Cost Threshold For Parallelism. It’s old and moldy and not a good fit for most workloads functioning on modern hardware.

My apologies to Azure SQLDB users who can’t change this setting and leave it up to Microsoft to maybe manage it for them based on ???

Some people out there really like fiddling with settings in a usually ill-informed reaction to Some Script They Found On The Internet, without reading the fine print.

Erik’s thoughts are reasonable overall. My recommendation is to use Michael J. Swart’s technique for tuning cost threshold for parallelism as a starting point, as it gives you a basis for what the net effect of your changes are.

Comments closed

Using DVC to Store Data Science Artifacts in Azure

Published 2023-09-27 by Kevin Feasel

I have a new video up:

In this video, we introduce DVC, a tool for version control management of data science and machine learning artifacts. We learn why Git isn’t the best place to store those large data files, how DVC integrates with Git, and how you can save your files in Azure Blob Storage.

Click through for the video, as well as a variety of links which helped me put it together.

Comments closed

Visualizing Data in R with ggplot2

Published 2023-09-27 by Kevin Feasel

Adrian Tam continues a series on R:

One of the most popular plotting libraries in R is not the plotting function in R base, but the ggplot2 library. People use that because it is flexible. This library also works using the philosophy of “grammar of graphics”, which is not to generate a visualization upon a function call, but to define what should be in the plot, and you can refine it further before setting it into a picture. In this post, you will learn about ggplot2 and see some examples. In particular, you will learn:

How to make use of ggplot2 to create a plot from a dataset

How to create various charts and graphics with multiple facades using ggplot2

It takes a little while to understand the grammar of graphics approach that ggplot2 takes, but once you do, you realize just how good this library is for generating static images.

Comments closed

Port Scanning for SQL Server

Published 2023-09-27 by Kevin Feasel

David Fowler performs one of the early steps of a penetration test:

Since witnessing a rather nasty cyber attack around a year ago, I’ve been thinking quite a bit about security. Do we really know how secure our SQL Servers are? Penetration testing is a great way to find out where our weaknesses and vulnerabilities are. Ideally you probably want to be getting regular pen tests conducted by external companies (although in my experience, some are better than others. I’ve known some who argue totally pointless issues and miss glaring holes which I know exist, but that’s a whole different story) but wouldn’t it be useful if we could conduct some of these tests ourselves?

In this series of posts, I’m going to try to knock together a little pen testing toolbox so that we can hopefully find some of these vulnerabilities. I’m no pen testing expert and this is never going to replace getting a professional pen tester in to test your setup but it might go some way to helping us understand some of our vulnerabilities and identify them.

Click through to see what David did, as well as an alert which helped pick out this port scanning operation.

Comments closed

SparkSQL CONCAT vs T-SQL CONCAT

Published 2023-09-27 by Kevin Feasel

Bill Fellows has a public service announcement:

The concat function is super handy in the database world but be aware that the SQL Server one is way better because it solves two problems. It combines everything into a string and it does not require NULL checking. In the before times, one had to down cast to a n/var/char type as well as check for NULL before appending strings via the plus sign.

The point of difference is so important that Bill busted out the marquee HTML tag. Which now leads me to wonder, was marquee or blink the bigger evil in the mid-to-late ’90s web?

Comments closed

Unit Testing Dynamic SQL

Published 2023-09-27 by Kevin Feasel

Jay Robinson lays out a pattern:

Dynamic SQL (aka Ad Hoc SQL) is SQL code that is generated at runtime. It’s quite common. Nearly every system I’ve supported in the past 30 years uses it to some degree, some more than others.

It can also be a particularly nasty pain point in a lot of systems. It can be a security vulnerability. It can be difficult to troubleshoot. It can be difficult to document. And it can produce some wickedly bad results.

Click through for Jay’s process as well as recommendations and an example. It’s certainly worth thinking about.

Comments closed

Shortcuts Beat Duplication in Microsoft Fabric

Published 2023-09-27 by Kevin Feasel

Reza Rad makes a recommendation:

Microsoft Fabric uses OneLake as the single logical layer for storage management across the Fabric. OneLake offers Domains and workspaces to work with Fabric Items, and amid all those structures, you may require a table or file from one Domain to be used in another. OneLake’s Shortcut feature will give you this option without duplicating the data. In this article, I will explain what Shortcuts are and their importance in building an analytics solution.

Click through for a video and an article.

Comments closed

Exporting Dynamics 365 Data into Delta Lake via Synapse Link

Published 2023-09-27 by Kevin Feasel

Jose Mendes performs a data migration:

It’s fair to say there have been some considerable changes in the Azure landscape over recent years.

This blog will show you how to configure Synapse Link to export D365 data in the Delta Lake format – an open-source data and transaction storage file format used in Lakehouse implementations.

Before you start considering using this approach, you will need to ensure you meet the following prerequisites (Microsoft documentation).

Read on for those prerequisites as well as a step-by-step guide on how to do it.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Author: Kevin Feasel

Introduction:

Problem Statement: