Author: Kevin Feasel

Extracting Strings between Specific Characters in R

Steven Sanderson does a bit of tag replacement:

Hello, R enthusiasts! Today, we’re jumping into a common text processing task: extracting strings between specific characters. This is a great skill for data cleaning and manipulation, especially when working with raw text data. I’m going to show you how to achieve this using base R, the stringr package, and the stringi package. Let’s go!

Read on for examples.
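
The linked post works in R (base R, stringr, and stringi). As a rough illustration of the same idea in Python, here is a minimal sketch that pulls out whatever sits between two delimiters. The helper name `between` and the sample string are made up for this example and do not come from the post.

```python
import re

def between(text, start, end):
    """Return every substring found between the start and end delimiters."""
    # re.escape keeps delimiters such as "[" or "(" from being read as regex syntax.
    # The lazy quantifier (.*?) stops at the first closing delimiter it meets.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    return re.findall(pattern, text)

# Hypothetical sample text, chosen only to show two delimiter pairs.
sample = "Order [A123] shipped to (Columbus) on [2024-06-01]"
bracketed = between(sample, "[", "]")
parenthesized = between(sample, "(", ")")
print(bracketed)
print(parenthesized)
```

The lazy match matters here: a greedy `.*` would run from the first opening bracket to the last closing one and return a single long string.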

Failure Mode and Effect Analysis on Databases

Mika Sutinen thinks about how things could go wrong:

Failure Mode and Effect Analysis (FMEA) is a process of building more resilient systems, by identifying failure points in them. While it’s highly recommended to perform FMEA during the architecture design phase, it can be done at any time. More importantly, it should be reviewed periodically, and especially when the system architecture changes.

While you can do Failure Mode and Effect Analysis for whole systems, in this post, I will share an example on how to get started with FMEA for a database environment.

Read on for a description of the concept and some tips on how to perform one.
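
FMEA is commonly scored with a Risk Priority Number: severity times occurrence times detection, each rated from 1 to 10. Whether the linked post uses this exact scoring is an assumption on my part, and the failure modes and ratings below are invented purely for illustration. A minimal sketch of ranking database failure modes this way:

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (always caught) to 10 (rarely caught)

    @property
    def rpn(self) -> int:
        # Risk Priority Number: the conventional FMEA product of the three ratings.
        return self.severity * self.occurrence * self.detection

# Hypothetical failure modes and ratings, not taken from the linked post.
modes = [
    FailureMode("Data file disk full", 8, 4, 2),
    FailureMode("Backup job silently failing", 9, 3, 8),
    FailureMode("Failover partner unreachable", 7, 2, 3),
]

# Highest RPN first, so the riskiest item is at the top of the review list.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(f"{m.rpn:4d}  {m.name}")
```

A failure that is hard to detect can outrank one that is more frequent, which is why the silent backup failure lands at the top in this made-up example.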

Certificate Expiration Dates and TDE

Mike Lynn talks Transparent Data Encryption:

Transparent Data Encryption uses certificates in its architecture for protecting your data while at rest. One attribute of a certificate is that it has an expiration date. Certificates expire for a couple of reasons, but the main reason is to enforce security. When a website certificate expires, it forces the website owners to get a new certificate by proving they are who they say they are with a trusted third party.

SQL Server certificates that are used for TDE also have an expiration date, but these dates are only checked when you are creating a self-signed certificate using the “CREATE CERTIFICATE” T-SQL command. If you don’t supply an expiration date when creating your certificate, SQL Server will assign one that is one year into the future.

Read on to learn more about how it works with TDE. I will say that with encrypting backups, SQL Server does care about the expiration date when it comes to creating a new encrypted backup, but not when it comes to restoring a backup.

Configuring Microsoft Fabric Data Mirroring for Snowflake

Koen Verbeeck copies some data:

We have a couple of Snowflake databases and would like to have that data available in Microsoft Fabric as well. Is there an easy solution to get the data quickly in Fabric? We don’t have many technical people on staff, so writing complex ETL is not an option.

Read on for more information on how it works. Mind you, you’re probably still writing the T and some of the L after using mirroring.

Branch-Out in Microsoft Fabric

Marc Lelijveld covers a new bit of functionality in Microsoft Fabric:

Yesterday, Microsoft released a new option called “branch-out” that allows you to easily set up a new branch from an existing Fabric workspace. Obviously, this was already possible but involved a lot of manual work. With this new option, you can create your own feature branch to work in isolation before you commit your work to the central repository.

In this blog, I will dive deeper into this branch-out feature and how it works, including some things to keep in mind when working with it.

Read on to learn more about the feature.

Tips for Databricks Asset Bundles

Dustin Vannoy has a new video:

This post and video cover some specific examples people have brought up when defining their Databricks Asset Bundles. The video includes a bit of review, but for more introduction please see my first post on Databricks Asset Bundles. The GitHub repository I use will probably be first to update with new examples; however, I hope to continue to add to the examples in these posts plus additional videos.

Click through to check out those tips.

New Video: Multi-Class Classification

I have a new video:

In this video, I get past two-class classification and explain how things differ in the multi-class world.

What’s really interesting is that, in many cases, when it comes to code, the answer is “not much.” That’s because libraries like scikit-learn do a lot to smooth over differences between two-class and multi-class classification. But there are still differences that can bite you if you don’t understand how the cases differ.
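
One difference that tends to surprise people is that per-class metrics need an averaging strategy once there are more than two classes. The video may focus on other points, so treat this as a sketch of the general idea rather than a summary of it. It uses only the standard library, not scikit-learn, and the labels and the helper name `per_class_recall` are made up for illustration.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Recall for each label, treating that label as the positive class."""
    actual = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / actual[label] for label in sorted(actual)}

# Hypothetical imbalanced data: the model predicts the majority class every time.
y_true = ["cat", "cat", "cat", "cat", "dog", "bird"]
y_pred = ["cat", "cat", "cat", "cat", "cat", "cat"]

recalls = per_class_recall(y_true, y_pred)

# Macro average: every class counts equally, however rare it is.
macro_recall = sum(recalls.values()) / len(recalls)

# Micro average: every row counts equally; for single-label data this equals accuracy.
micro_recall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(recalls)
print(f"macro={macro_recall:.3f} micro={micro_recall:.3f}")
```

The two averages disagree sharply on the same predictions, so the number reported depends on a choice that simply does not come up in the same way with two classes.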

Copying an Azure SQL Database

Josephine Bush makes a copy:

It’s as simple as this for each db you want a copy of. Just run it from the master db. This works if you want to make a copy on the same server. If you want to make a copy from another server, you would have to connect via PowerShell.

Click through for the T-SQL syntax. I’ve used this before on some reasonably large databases and it can take a while for that copy command to finish, but if you’re feeling impatient, you can check the status of the job using sys.dm_operation_status.

SQL as a Language

Brent Ozar shares some thoughts:

I’m not talking just about Microsoft SQL Server specifically here, nor T-SQL. Let’s zoom out a little and think bigger picture for a second: is the SQL language itself a problem?

Sometimes when I talk to client developers, they gripe about the antiquated language.

Brent goes on to list some common complaints with SQL in general and explains why there isn’t a better solution.

I should note that he also summarizes Feasel’s Law near the end of the post:

Remember when NoSQL came out, and everybody was all “databases r doomd”? And remember what business users said when they wanted to run their reports? NoSQL persistence layers pretty quickly changed their tune, saying, “Oh, well, uh, we meant Not Only SQL, that’s what we meant,” as they struggled to quickly slap in SQL compatibility. Even MongoDB, king of NoSQL, implemented SQL support.
