Month: June 2024

In a previous post, I showed what you need to do to set up your local PC/laptop to be able to connect to OCI. I also showed how to perform some simple queries on your Object Storage environment. Go check out that post before proceeding with the examples in this blog.

In this post, I’ll build upon my previous post by giving some Python functions to:

Check if Bucket exists

Create a Bucket

Delete a Bucket

Upload an individual file

Upload an entire directory

Read on for those examples.
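
As a rough idea of what two of those helpers might look like, here is a minimal sketch and not the code from the linked post. It assumes a client object that exposes `head_bucket` and `put_object` with the argument order used by the OCI Python SDK's `ObjectStorageClient`, and that a missing bucket surfaces as an exception carrying `status == 404`. The function names `bucket_exists` and `upload_directory` are hypothetical.

```python
import os


def bucket_exists(client, namespace, bucket_name):
    """Return True if the bucket is visible to the caller.

    Assumption: a missing bucket raises an exception with status == 404,
    which is how the OCI SDK's ServiceError reports it.
    """
    try:
        client.head_bucket(namespace, bucket_name)
        return True
    except Exception as err:
        if getattr(err, "status", None) == 404:
            return False
        raise  # anything else (auth, network) is a real problem


def upload_directory(client, namespace, bucket_name, local_dir):
    """Upload every file under local_dir; relative paths become object names."""
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # Object names use forward slashes regardless of the local OS.
            object_name = os.path.relpath(path, local_dir).replace(os.sep, "/")
            with open(path, "rb") as handle:
                client.put_object(namespace, bucket_name, object_name, handle)
            uploaded.append(object_name)
    return uploaded
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stand-in object before pointing them at a real tenancy.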

Comments closed

Test Data Generation Concepts and Challenges

Published 2024-06-11 by Kevin Feasel

Phil Factor talks about data generation:

Data generation is the science and art of providing data for database development work that is as realistic and controllable as possible. The skills of generating realistic data are an essential part of being a database developer. It is important: you need plenty of data of exactly the right type, size and verisimilitude in order to develop and test a database. It has got to look right. If you are launching a corporate, data-driven application, you’ll need it for training as well as testing and development.

Where things get really tough is when you also want to get the distribution of data right. Is a uniform distribution (e.g., outputs from a random number generator) okay? Or do you have a few large accounts that take up the majority of your data? That difference could affect query plans, and if you have nice, uniform data, you might never see some of the cardinality estimation problems until they pop up in production.
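
To make the uniform-versus-skewed point concrete, here is a small Python sketch. It is not taken from Phil's article; the function names, the 1/k weighting, and the row counts are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def generate_orders(n_orders, n_accounts, skewed, seed=42):
    """Return a list of account ids, one per generated order row."""
    rng = random.Random(seed)
    accounts = list(range(1, n_accounts + 1))
    if skewed:
        # Zipf-like weights: account k gets weight 1/k, so a handful of
        # accounts own a large share of the rows.
        weights = [1.0 / k for k in accounts]
        return rng.choices(accounts, weights=weights, k=n_orders)
    # Uniform: every account is equally likely.
    return [rng.choice(accounts) for _ in range(n_orders)]


def top_share(orders, top_n=10):
    """Fraction of all rows that belong to the top_n busiest accounts."""
    counts = Counter(orders)
    top_rows = sum(count for _, count in counts.most_common(top_n))
    return top_rows / len(orders)


uniform = generate_orders(100_000, 1_000, skewed=False)
skewed = generate_orders(100_000, 1_000, skewed=True)
print(f"top 10 accounts, uniform: {top_share(uniform):.1%}")
print(f"top 10 accounts, skewed:  {top_share(skewed):.1%}")
```

With uniform data the ten busiest accounts hold only a sliver of the rows, while with the skewed weights they hold a large fraction. That second shape is the one where a plan that works for a typical account can fall over for a big one.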


Cross-Database Ownership Chaining in SQL Server

Published 2024-06-11 by Kevin Feasel

David Seis explains and warns:

In SQL Server, an ownership chain is created when one object (like a stored procedure) accesses another object (like a table) in the same database. If both objects have the same owner, SQL Server only checks permissions for the first object, not the second. This is known as ownership chaining.

Cross-database ownership chaining extends this concept across databases. If a stored procedure in Database A accesses a table in Database B, and both objects have the same owner, SQL Server will not check permissions for the table in Database B.

For what I consider the definitive explanation of why you shouldn’t use cross-database ownership chaining (or other techniques like setting TRUSTWORTHY on), Solomon Rutzky’s article is it. Module signing is the answer and the rest are traps.
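
The rule David describes can be written down as a toy model. The Python below is only a simplification of the behavior quoted above, not how SQL Server implements it: the `DbObject` class, the `can_run` function, and the `grants` set are hypothetical, only one level of references is followed, and real-world details (such as the caller needing access to the second database) are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class DbObject:
    name: str
    owner: str
    database: str
    references: list = field(default_factory=list)


def _chain_ok(user, caller, target, grants, cross_db_chaining):
    same_owner = caller.owner == target.owner
    same_db = caller.database == target.database
    if same_owner and (same_db or cross_db_chaining):
        # Chain is unbroken: the permission check on the target is skipped.
        return True
    # Chain is broken: the user needs an explicit grant on the target.
    return (user, target.name) in grants


def can_run(user, obj, grants, cross_db_chaining=False):
    """grants is a set of (user, object_name) pairs."""
    if (user, obj.name) not in grants:
        return False  # the first object is always checked
    return all(
        _chain_ok(user, obj, ref, grants, cross_db_chaining)
        for ref in obj.references
    )


payroll = DbObject("B.dbo.Payroll", owner="alice", database="B")
proc = DbObject("A.dbo.GetPayroll", owner="alice", database="A",
                references=[payroll])
grants = {("bob", "A.dbo.GetPayroll")}

print(can_run("bob", proc, grants))                          # False
print(can_run("bob", proc, grants, cross_db_chaining=True))  # True
```

In this model, turning the option on is enough for bob to read a table in Database B that nobody ever granted him, which is the kind of surprise the warning is about.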


Exporting SQL Server Query Results to an HTML Table

Published 2024-06-11 by Kevin Feasel

Vlad Drumea builds a table:

In a previous post, I’ve covered how to import and query HTML table data in SQL Server, so I figured it makes sense to also show how to export SQL Server query results to an HTML table using PowerShell and dbatools.

This is really similar to how the HTML report from PSBlitz is generated. The differences are that PSBlitz doesn’t rely on dbatools and there’s a bit more formatting and some JavaScript involved.

Click through for a piece-by-piece breakdown of the script and Vlad’s commentary.
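
Vlad's script uses PowerShell and dbatools. For readers who want the general shape of the idea in another language, here is a minimal Python sketch that turns column names and rows into an HTML table. It is not a port of his script; `rows_to_html_table` is a hypothetical helper, and it assumes the query results are already in memory as tuples.

```python
from html import escape


def rows_to_html_table(columns, rows):
    """Render column names and row tuples as a plain HTML table string."""
    header = "".join(f"<th>{escape(str(col))}</th>" for col in columns)
    lines = ["<table>", f"<tr>{header}</tr>"]
    for row in rows:
        cells = "".join(
            # NULLs become empty cells; everything else is HTML-escaped.
            f"<td>{'' if value is None else escape(str(value))}</td>"
            for value in row
        )
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(rows_to_html_table(
    ["database_name", "size_mb", "note"],
    [("master", 7, None), ("Sales<Test>", 1024, "needs review")],
))
```

Escaping each value matters more than it looks: object names and free-text columns can contain `<` or `&`, and unescaped output will break the table or worse.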


Checking for Patterns in Text with R

Published 2024-06-10 by Kevin Feasel

Steven Sanderson looks for a sub-string:

When working with text data in R, one common task is to check if a character or substring is present within a larger string. R offers multiple ways to accomplish this, ranging from base R functions to packages like stringr and stringi. In this post, we’ll explore how to use grepl() from base R, str_detect() from stringr, and stri_detect_fixed() from stringi to achieve this.

Click through for three separate ways of doing this.
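
Steven's examples are in R. The distinction between regex matching and fixed (literal) matching carries over to other languages, so here is a short Python sketch of the same idea. The helper names are hypothetical and this is an analogue, not a translation of his code.

```python
import re


def contains_fixed(text, substring):
    """Literal match: every character in substring is taken as-is."""
    return substring in text


def contains_regex(text, pattern):
    """Regex match: characters such as $ . ( ) have special meaning."""
    return re.search(pattern, text) is not None


text = "The price is $5.00 (approx.)"

print(contains_fixed(text, "$5"))             # True
print(contains_regex(text, "$5"))             # False: "$" anchors end of string
print(contains_regex(text, re.escape("$5")))  # True once the pattern is escaped
print(contains_regex(text, r"\d+\.\d{2}"))    # True: regex earns its keep here
```

The second call is the classic trap: a search string that happens to contain regex metacharacters silently means something else unless you ask for a literal match.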


Making the Point Clear in a Visual

Published 2024-06-10 by Kevin Feasel

Amy Esselman offers advice:

Have you ever found yourself staring at a graph or slide, wondering what the creator was trying to convey? Perhaps you’ve sat through a presentation, only to be left scratching your head, unsure of what to do next. Don’t put your audience in this same uncomfortable position. Instead, connect the dots for them to make it clear what the point is and what action they should take. When you fail to explicitly state the purpose of your communication, you run the risk of the important insight being lost, or someone arriving at the wrong conclusion.

Read on for an example of comparing resource plans to actual allocations and moving from showing data to telling a story.


Third Party Vendors and Missing tempdb Space

Published 2024-06-10 by Kevin Feasel

Tanayankar Chakraborty troubleshoots a strange issue:

Issue

An issue was brought to our attention recently where an Azure SQL DB was throwing TempDB-related errors although the customer felt that the TempDB usage never came close to the value published in the official Microsoft document. Here’s the error the customer had complained about:

Error

Here is a more detailed error text:

The database ‘tempdb’ has reached its size quota. Partition or delete data, drop indexes, or consult the documentation for possible resolutions. Possible failure reasons: Problems with the query, ‘ResultSet’ property not set correctly, parameters not set correctly, or connection not established correctly.

This was an interesting problem and, admittedly, I didn’t predict the twist.


The Ugly Side of SQL Server Logins

Published 2024-06-10 by Kevin Feasel

Jeff Iannucci continues a series on security in SQL Server:

SQL Server logins are often necessary, like the sa login, but be aware they present a particular set of vulnerability issues. Their lack of multifactor authentication, their susceptibility to brute force attacks, and the dumb passwords folks give SQL Server logins make them a prime target for hackers.

Read on to learn more about these three issues. Jeff also includes a note about a built-in function that lets you compare the hashed passwords in your systems versus a specific password, with the idea that you can at least find if people are using some of the most common passwords.
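
The idea behind that kind of check is simple: hash each candidate password with the stored salt and see whether the result matches the stored hash. The Python below is a generic sketch of that idea only. The salted SHA-512 layout, the function names, and the sample data are assumptions for illustration, not SQL Server's actual hash format or the function from Jeff's post.

```python
import hashlib
import hmac


def hash_password(password, salt):
    """Assumed layout for this sketch: SHA-512 over salt + UTF-8 password."""
    return hashlib.sha512(salt + password.encode("utf-8")).digest()


def find_weak_logins(stored, candidates):
    """stored maps login -> (salt, digest). Returns {login: matched password}."""
    weak = {}
    for login, (salt, digest) in stored.items():
        for candidate in candidates:
            # Constant-time comparison of the recomputed and stored hashes.
            if hmac.compare_digest(hash_password(candidate, salt), digest):
                weak[login] = candidate
                break
    return weak


# Made-up sample data: two logins, one with a very common password.
salt = b"\x01\x02\x03\x04"
stored = {
    "sa": (salt, hash_password("Password1", salt)),
    "app_user": (salt, hash_password("k#9vT!2qLm", salt)),
}
common_passwords = ["123456", "Password1", "letmein"]

print(find_weak_logins(stored, common_passwords))  # {'sa': 'Password1'}
```

Note that the check never reverses a hash. It only confirms guesses, which is why it works for a short list of popular passwords and says nothing about the rest.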


System Views and Distributed Processing in Microsoft Fabric

Published 2024-06-10 by Kevin Feasel

Koen Verbeeck runs into an annoying error:

I have a metadata-driven ELT framework that heavily relies on dynamic SQL to generate SQL statements that load data from views into a respective fact or dimension. Such a task is well suited for generation, since the pattern to load a type 1 SCD, type 2 SCD or a fact table is always the same.

To read the metadata of the views, I use a couple of system views, such as sys.views and sys.sql_modules. At some point, I join this metadata (containing info about the various columns and their data types) against metadata of my own (for example, what is the business key of this dimension). This all works fine in Azure SQL DB or SQL Server, but in my Fabric warehouse I was greeted with the following error:

The query references an object that is not supported in distributed processing mode.

Read on to learn more about why you get this error and one workaround for it.


Thoughts on Natural Keys

Published 2024-06-10 by Kevin Feasel

Mark Seemann talks keys:

Although I live in Copenhagen and mostly walk or ride my bicycle in order to get around town, I do own an old car for getting around the rest of the country. In Denmark, cars go through mandatory official inspection every other year, and I’ve been through a few of these in my life. A few years ago, the mechanic doing the inspection informed me that my car’s chassis number was incorrect.

This did make me a bit nervous, because I’d bought the car used, and I was suddenly concerned that things weren’t really as I thought. Had I unwittingly bought a stolen car?

But the mechanic just walked over to his computer in order to correct the error. That’s when a different kind of unease hit me. When you’ve programmed for some decades, you learn to foresee various typical failure modes. Since a chassis number is an obvious candidate for a natural key, I already predicted that changing the number would prove to be either impossible, or have all sorts of cascading effects, ultimately terminating in official records no longer recognizing that the car is mine.

Mark uses this as a jumping-off point for a discussion about whether to use natural keys as primary keys or whether to include surrogate keys instead. I am generally in favor of using surrogate keys in the physical data model and creating unique indexes for natural keys. But you have to use natural keys in the logical data model because surrogate keys don’t exist at the level of the logical data model. Do read the comments, though, because there’s a great debate in there.
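
Here is a minimal sketch of the surrogate-key-plus-unique-index pattern, using Python's built-in sqlite3 module so it runs anywhere. The table names, column names, and chassis numbers are made up for illustration; this is not Mark's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE car (
    car_id         INTEGER PRIMARY KEY,   -- surrogate key
    chassis_number TEXT NOT NULL          -- natural key
);
CREATE UNIQUE INDEX ux_car_chassis_number ON car (chassis_number);

CREATE TABLE inspection (
    inspection_id INTEGER PRIMARY KEY,
    car_id        INTEGER NOT NULL REFERENCES car (car_id),
    inspected_on  TEXT NOT NULL
);
""")

# The chassis number was recorded with a typo (letter I instead of digit 1).
conn.execute(
    "INSERT INTO car (car_id, chassis_number) VALUES (1, 'XYZ0000000000000I')"
)
conn.execute(
    "INSERT INTO inspection (car_id, inspected_on) VALUES (1, '2022-05-01')"
)

# Correcting the natural key touches one row; child rows are unaffected
# because they point at the surrogate key.
conn.execute(
    "UPDATE car SET chassis_number = 'XYZ00000000000001' WHERE car_id = 1"
)

rows = conn.execute("""
    SELECT c.chassis_number, i.inspected_on
    FROM inspection AS i
    JOIN car AS c ON c.car_id = i.car_id
""").fetchall()
print(rows)  # [('XYZ00000000000001', '2022-05-01')]
```

The unique index still stops a second car from being registered with the same chassis number, so the natural key keeps doing its job as a business rule without being the thing every foreign key depends on.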

