Curated SQL – Page 278 – A Fine Slice Of SQL Server

Cleaning Up Large System Databases

Published 2024-06-06 by Kevin Feasel

Josephine Bush doesn’t need enormous system databases:

Always set this on your SQL Servers so you don’t have this problem in the first place. This is in the SQL Server Agent settings. I remember having some agent jobs that used to serve this function that ran on a schedule, which may have been required in older versions of SQL Server.

Josephine focuses on SQL Agent history and database backup history, both of which are good places to start. If you have an older version of SQL Server or are using the package deployment model, there may be an explosion of SSIS information in msdb that you’d want to manage. Also, check whether any of the system databases use the Full recovery model; if so, ensure that the backup script you’re using for transaction log backups actually backs up system databases.

The securityadmin Role in SQL Server

Published 2024-06-06 by Kevin Feasel

Jeff Iannucci talks about a role that might as well be sysadmin:

Based on the name, you probably can guess that members of the securityadmin role can make dangerous changes to the permissions of other server principals. What many folks don’t realize is that this role is simultaneously less dangerous and more dangerous than you might think.

Allow me to explain, or better yet show you what that means.

Click through for the explanation.

Auditing a SQL Server: Discovery and Documentation

Published 2024-06-06 by Kevin Feasel

Ben Johnston begins a new series:

Inheriting a server, whether as an inexperienced user or an experienced DBA, has many challenges. It’s very helpful to evaluate the servers, document issues, and record the current configuration. It can also be beneficial to evaluate the current state of servers you have owned since they were built or even in preparation for a formal audit. The discovery and documentation phase of an audit will set you up for later detailed audits, or it may serve as the complete scope of the audit.

This is the first part of a series on evaluating and auditing SQL Server and Azure SQL Database. Auditing SQL is a very broad topic, so I have broken it down into several sections. This section will cover the major categories that should happen in a basic SQL Server discovery audit. An initial examination of your environment is primarily documentation and looking for critical issues. This includes basic server and SQL engine configuration, physical configuration items such as disk and memory, critical items such as backup state, database configuration, basic code smells, application integration, and high-level security configuration.

Read on for some of the things Ben looks at.

UNISTR() and || in Azure SQL Database

Published 2024-06-06 by Kevin Feasel

Abhiman Tiwari announces a new function and a new operator:

We are excited to announce that the UNISTR intrinsic function and ANSI SQL concatenation operator (||) are now available in public preview in Azure SQL Database. The UNISTR function allows you to escape Unicode characters, making it easier to work with international text. The ANSI SQL concatenation operator (||) provides a simple and intuitive way to combine characters or binary strings. These new features will enhance your ability to manipulate and work with text data.

Click through to learn more about both. Honestly, I’d rather stick with CONCAT() than use ||, because CONCAT() handles NULL without me having to check every operand first.
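
To make the NULL point concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in: the assumption is that || in Azure SQL Database follows the same ANSI rule, where a NULL operand makes the whole result NULL. The `scalar` helper and the sample strings are made up for illustration.

```python
import sqlite3

# SQLite stands in for Azure SQL Database here (an assumption for illustration).
conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a one-row, one-column query and return its value."""
    return conn.execute(sql).fetchone()[0]


# With no NULL operands, || behaves the way you would expect.
no_nulls = scalar("SELECT 'Curated' || ' ' || 'SQL'")

# A single NULL operand turns the entire result into NULL.
with_null = scalar("SELECT 'Curated' || NULL || 'SQL'")

# Guarding each operand by hand is the extra work CONCAT() saves in T-SQL,
# where CONCAT() treats NULL as an empty string.
guarded = scalar("SELECT 'Curated' || COALESCE(NULL, '') || 'SQL'")

print(no_nulls, with_null, guarded)
```

The middle query is the one that tends to bite: nothing fails, you just get NULL back.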

Getting the Top N Results in a PySpark Notebook

Published 2024-06-06 by Kevin Feasel

Gilbert Quevauvilliers only needs the top 1:

How to get the TopN rows using Python in Fabric Notebooks

When working with data there are sometimes weird and wonderful requirements which must be created in order to get to the desired solution.

In today’s blog post I had a situation where I wanted to get a single row with the highest duration.

Gilbert uses the Python function variant of Spark SQL. You could also write a Spark SQL query with an ORDER BY and a LIMIT clause.
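
As a rough sketch of the query-based alternative, the example below runs an ORDER BY ... LIMIT query through Python's built-in sqlite3 module rather than Spark, so it works without a cluster. The `runs` table, its columns, and the `top_n` helper are hypothetical; this is not Gilbert's code.

```python
import sqlite3

# Hypothetical sample data: one row per job run with its duration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_name TEXT, duration_seconds INTEGER)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [("load_sales", 42), ("load_customers", 310), ("load_dates", 7)],
)


def top_n(n):
    """Return the n rows with the highest duration, longest first."""
    return conn.execute(
        "SELECT run_name, duration_seconds FROM runs "
        "ORDER BY duration_seconds DESC LIMIT ?",
        (n,),
    ).fetchall()


longest = top_n(1)  # the single row with the highest duration
print(longest)
```

In a notebook, a query of the same shape could be passed to `spark.sql(...)` against a registered table or view; the sort has to come before the limit, or you get an arbitrary row rather than the top one.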

Environment Variables in SSIS

Published 2024-06-06 by Kevin Feasel

Andy Brownsword continues a series on SSIS:

Yep it’s more SSIS again this week. Here we’ll be looking at using Environment configuration within the SSIS catalog. This allows sets of parameters to be defined and used across multiple projects and packages which share common values.

This approach can either be used as a central point for configuration, or you could use multiple configurations for the same packages.

Read on for some examples of how you might use them, as well as the process to create one.

An Overview of Logistic Regression

Published 2024-06-05 by Kevin Feasel

I have a new video:

In this video, I provide a primer on logistic regression, including a demystification of the name. Is it regression? Is it classification? Find out!

I have a lot of fun with the question “Is logistic regression actually a regression technique, or is it secretly a classification technique?” I think this video is the single clearest explanation I’ve given of it, which probably says something about my prior explanations.

Dual-Write Issues and Kafka

Published 2024-06-05 by Kevin Feasel

Wade Waldron solves a common but difficult problem:

However, the dual-write problem isn’t unique to event-driven systems or Kafka. It occurs in many situations involving different technologies and architectures.

When I started building event-driven systems, I encountered the dual-write problem almost immediately. I eventually learned effective ways to solve it but tripped over some anti-patterns along the way.

I want to break down the details of the dual-write problem so you can understand how it occurs and avoid making the same mistakes I did. I’ll outline a few anti-patterns that might look promising, but don’t solve the problem. Finally, we’ll look at accepted solutions that eliminate the dual-write problem.

Read on for a few techniques that will not work (assuming you are using Apache Kafka to flow events into some external systems) and some that will.
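
The excerpt doesn't name the solutions, so treat the following as an assumption rather than a summary of Wade's post. One widely used pattern for this problem is the transactional outbox: write the state change and the event in the same local transaction, then have a separate relay publish the event afterward. The sketch uses Python's built-in sqlite3 module, and the table names, the `place_order` and `relay` functions, and the `publish` callback (a stand-in for a Kafka producer) are all hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE outbox (
        event_id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")


def place_order(order_id):
    """Write the state change and its event in one local transaction."""
    with conn:  # commits both inserts together, or rolls both back
        conn.execute("INSERT INTO orders VALUES (?, 'placed')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "OrderPlaced", "order_id": order_id}),),
        )


def relay(publish):
    """Publish pending events in order, marking each only after it is sent."""
    rows = conn.execute(
        "SELECT event_id, payload FROM outbox "
        "WHERE published = 0 ORDER BY event_id"
    ).fetchall()
    for event_id, payload in rows:
        publish(payload)  # hypothetical stand-in for a Kafka producer call
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE event_id = ?",
                (event_id,),
            )


sent = []
place_order(1)
relay(sent.append)
print(sent)
```

If the relay crashes between publishing and marking a row, that event goes out again on the next pass, so consumers need to tolerate duplicates.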

Working with XML in SQL Server

Published 2024-06-05 by Kevin Feasel

Ed Pollack talks XML:

XML is a common storage format for data, metadata, parameters, or other semi-structured data. Because of this, it often finds its way into SQL Server databases and needs to be managed alongside other data types.

Even though a relational database is not the optimal place to store and manage XML data, it is often needed due to application requirements, convenience, or a need to maintain this information in close proximity to other app data.

This article dives into a variety of common XML challenges and the functionality included in SQL Server to help make managing them as simple as possible.

Ed does a good job of walking through what you can do. My general philosophy on XML and JSON in the database is simple: if you just want a place to store some JSON or XML outputs and retrieve them exactly as they are, without performing any searches or transformations, store them as JSON/XML. If you want to use the database to search through JSON/XML records for particular attributes and values, or if you want to reshape the JSON/XML data within the database, create a proper data model for this input.

Circular Foreign Key Dependencies in Postgres

Published 2024-06-05 by Kevin Feasel

Hans-Juergen Schoenig causes and fixes a problem:

In this case, we want to store departments and employees. Every department will need a leader, and every employee will need a department. We cannot have a department without a department leader – but we cannot have an employee without a department either.

Click through to see how you can resolve this kind of paradox with Postgres.
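
One common way out of this kind of paradox is a deferrable foreign key, which delays the check until commit so both rows can be inserted in the same transaction. Whether the linked post takes exactly this route is an assumption on my part. The sketch below uses Python's built-in sqlite3 module as a stand-in for Postgres, with hypothetical table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE department (
        id INTEGER PRIMARY KEY,
        leader_id INTEGER NOT NULL
            REFERENCES employee (id) DEFERRABLE INITIALLY DEFERRED
    );
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        department_id INTEGER NOT NULL
            REFERENCES department (id) DEFERRABLE INITIALLY DEFERRED
    );
""")

# Both rows go in within one transaction. Each points at the other, and the
# foreign keys are only checked when the transaction commits.
with conn:
    conn.execute("INSERT INTO department VALUES (10, 1)")  # leader not there yet
    conn.execute("INSERT INTO employee VALUES (1, 10)")

print(conn.execute("SELECT id, leader_id FROM department").fetchall())
```

Postgres accepts the same `DEFERRABLE INITIALLY DEFERRED` clause, but it will not let the first table reference a table that doesn't exist yet, so there one of the two constraints has to be added afterward with `ALTER TABLE ... ADD CONSTRAINT`.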
