Press "Enter" to skip to content

Author: Kevin Feasel

Finding Orphaned Users in SQL Server

Kenneth Fisher needs more factory workers:

When you create a user (a database principal) you have several options on what the user is associated with (usually a login/server principal), or it might not be associated with anything at all (created without a login). And a common problem is when that object you’ve associated your user with is no longer available and you’ve got an orphan.

Click through to see how Kenneth finds them all and has them working in his wallet-making factories. Kenneth is an inspiration to us all.

Comments closed

Conversion from XML to Target Collation Impossible

Chad Callihan has a mission, should you choose to accept it:

I recently ran into an error for a query that was casting XML to VARCHAR to search for a value. I have a bit of familiarity with the XML for these types of queries, so I was surprised when running the SELECT statement caused this error, which I had never seen before:

Msg 6355 “Conversion of one or more characters from XML to target collation impossible”

Click through for the solution. The real solution? NVARCHAR Everywhere.

Comments closed

Finding Duplicate Rows and Values in R

Steven Sanderson de-duplicates, starting with values:

In data analysis and programming, it’s common to encounter situations where you need to identify duplicate values within a dataset. Whether you’re a beginner or an experienced programmer, knowing how to find duplicate values is a fundamental skill. In this blog post, we will explore two different approaches to accomplish this task using base R functions and the dplyr package in R. By the end, you’ll have a clear understanding of how to detect and manage duplicate values in your own datasets.

From there, we get to see various ways to de-duplicate rows in R:

In data analysis and manipulation tasks, it’s common to encounter situations where we need to identify and handle duplicate rows in a dataset. In this blog post, we will explore three different approaches to finding duplicate rows in R: the base R method, the dplyr package, and the data.table package. We’ll compare their performance using the benchmark function and provide insights on when to use each approach. So, grab your coding gear, and let’s dive in!

Duplicate values is a relatively tricky one, with rows being much easier.

Comments closed

Multi-Source Replication in MySQL

Aisha Bukar continues a series on replication in MySQL:

MySQL’s multi-source replication allows a replica server to receive data from multiple source servers. Let’s say you have a replica server at your workplace, and there are multiple source servers in different locations, you need a way to directly receive data from these source servers to your replica server. This is where the multi-source replication technique comes into play. It allows you to efficiently gather data from various sources and consolidate it on your replica server.

Note that this is quite different from merge replication or peer-to-peer replication in SQL Server and there are some limits to its capabilities. That said, I could see this being really useful for performing ELT into a warehouse: use replication to keep the staging tables in sync and then run a job to perform transformations into facts and dimensions periodically.

Comments closed

Data Syncs between Azure SQL DB and Amazon RDS

Joey D’Antoni crosses clouds:

A while back, a client, who host user-facing databases in Azure SQL Database, had a novel problem. One of their customers, had all of their infrastructure in AWS, and wanted to be able to access my client’s data in an RDS instance. There aren’t many options for doing this–replication doesn’t work with Azure SQL Database as a publisher because there’s no SQL Agent. Managed Instance would have been messy from a network perspective, as well as cost prohibitive compared to Azure SQL DB serverless. Even using an ETL tool like Azure Data Factory would have worked, but would have required a rather large amount of dev cycles to check for changed data. Enter Azure Data Sync.

Read on to see what Azure Data Sync is and how it helps solve this problem.

Comments closed

Migrating Column-Level Encryption to Azure SQL MI

Keshav Kiran performs a migration:

One of our customers came up with a requirement where they wanted to Migrate On-prem Database to Azure SQL Managed instance. The databases had traditional column level encryption enabled.

He has restored the database on the SQL Managed instance by Backup/Restore approach. Now when he was trying to read the encrypted column on the destination database, It was showing NULL values after decryption.

Read on for the solution.

Comments closed

Viewing the Power BI Format Pane during On-Object Interaction

Gilbert Quevauvilliers is missing something:

I have enabled the new On-Object Interaction for the formatting pane in Power BI and while it is constantly improving there are times when I would like to have the good old formatting pane available.

I have also found that sometimes when you create a new visual there is no option to format it as shown below.

There’s a workaround to this, so check it out.

Comments closed

Model Diagnostics in Python

Christian Lorentzen has released a new package:

Version 1.0.0 of the new Python package for model-diagnostics was just released on PyPI. If you use (machine learning or statistical or other) models to predict a mean, median, quantile or expectile, this library offers tools to assess the calibration of your models and to compare and decompose predictive model performance scores.

This looks like a really useful package, so check it out.

Comments closed