Press "Enter" to skip to content

July 2023

TDE and Credential Rotation for Arc SQL Managed Instance

Nikita Takru announces a new public preview:

We are thrilled to announce the Public Preview of Transparent Data Encryption (TDE) and Service-Managed Credential Rotation for Arc-enabled SQL Managed Instance. With a strong focus on data security and management, this release introduces cutting-edge features that ensure your sensitive information is protected.  

Click through for more details, particularly on automating credential rotation.
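For reference, the service-managed mode automates what you would otherwise set up by hand. The classic customer-managed TDE sequence in T-SQL looks something like the sketch below; the database and certificate names are hypothetical:

```sql
-- Classic customer-managed TDE setup (hypothetical names throughout);
-- the new service-managed mode handles the keys and rotation for you.
USE master;

-- Master key in master protects the certificate.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';

-- Certificate that will protect the database encryption key.
CREATE CERTIFICATE TdeCert WITH SUBJECT = 'TDE certificate';

-- Per-database encryption key, protected by the certificate.
USE SalesDb;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TdeCert;

-- Turn encryption on; a background scan encrypts existing pages.
ALTER DATABASE SalesDb SET ENCRYPTION ON;

-- Check progress: encryption_state 3 means fully encrypted.
SELECT DB_NAME(database_id) AS db, encryption_state
FROM sys.dm_database_encryption_keys;
```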

A First-Pass Approach to Migrating Dedicated SQL Pool Schemas to Fabric

Kevin Chant gets a jump on a big problem:

To manage expectations, this post only covers database schema objects. Plus, I need to highlight the fact that this solution has some interesting quirks, some of which I highlight in this post.

Even though there are some quirks, I still want to show this solution so that others can see it working and I can highlight a few important facts. Plus, I share a template you can use to test this yourself.

The current lack of a good migration strategy is a real challenge for anyone thinking of moving from Azure Synapse Analytics to Microsoft Fabric. Serverless SQL pools and Spark pools are an easy transition, but dedicated SQL pools are a tough nut to crack.
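To make the difficulty concrete, dedicated SQL pool tables carry a WITH clause for distribution and indexing that a Fabric warehouse does not accept, so any migration has to rewrite DDL along these lines (table and column names are hypothetical):

```sql
-- Dedicated SQL pool DDL: the WITH clause has no equivalent
-- in a Fabric warehouse and must be stripped on migration.
CREATE TABLE dbo.FactSales
(
    SaleKey     BIGINT NOT NULL,
    CustomerKey INT    NOT NULL,
    Amount      DECIMAL(18, 2) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(SaleKey),
    CLUSTERED COLUMNSTORE INDEX
);

-- Roughly what the Fabric-side DDL looks like afterward.
CREATE TABLE dbo.FactSales
(
    SaleKey     BIGINT NOT NULL,
    CustomerKey INT    NOT NULL,
    Amount      DECIMAL(18, 2) NOT NULL
);
```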

CREATEROLE in Postgres

Muhammad Ali creates a role:

PostgreSQL is a robust open-source relational database management system that provides a wide range of capabilities to guarantee safe and effective data administration. One such capability is the CREATEROLE privilege, which is vital to managing users and roles in PostgreSQL. In this blog post, we will examine the nuances of the CREATEROLE privilege, its use, and how it affects user administration. So let's investigate this privilege and understand how PostgreSQL applies it.

Click through to see how this privilege works and what has changed in different versions of Postgres.
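As a rough sketch of the moving parts (role names are mine, not from the post, and the version note reflects my reading of the Postgres 16 changes):

```sql
-- A role with CREATEROLE can create and manage other (non-superuser)
-- roles without being a superuser itself.
CREATE ROLE user_admin LOGIN PASSWORD 'changeme' CREATEROLE;

-- Connected as user_admin:
CREATE ROLE app_reader LOGIN PASSWORD 'changeme';

-- In PostgreSQL 16+, user_admin automatically receives ADMIN OPTION on
-- roles it creates and can only grant membership in roles it has ADMIN
-- on; earlier versions were far more permissive about what CREATEROLE
-- holders could do to other roles. (some_other_role is hypothetical.)
GRANT app_reader TO some_other_role;
```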

Creating a Power BI Deployment Pipeline

Richard Swinbank does some deployment:

In part 4 of this series, I introduced a standard pattern for organising report files and pipelines, with a standard process for creating new reports. Repeatable patterns and processes are great candidates for automation, and in this post I’ll build a report creation process that automatically configures a new report and creates its deployment pipeline.

This video (5m31s) shows you the process in action, from creating a new report to seeing it deployed automatically to Power BI:

Click through for the video, as well as plenty of written instruction.

Finding Orphaned Users in SQL Server

Kenneth Fisher needs more factory workers:

When you create a user (a database principal), you have several options for what the user is associated with (usually a login/server principal), or it might not be associated with anything at all (created without a login). A common problem arises when the object you’ve associated your user with is no longer available: you’ve got an orphan.

Click through to see how Kenneth finds them all and has them working in his wallet-making factories. Kenneth is an inspiration to us all.
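The usual detection pattern, though not necessarily Kenneth's exact query, is to look for database users whose SID has no matching server login:

```sql
-- SQL users that should map to a login but whose SID has no match.
SELECT dp.name, dp.type_desc
FROM sys.database_principals AS dp
    LEFT JOIN sys.server_principals AS sp
        ON dp.sid = sp.sid
WHERE sp.sid IS NULL
    AND dp.type = 'S'                                -- SQL users
    AND dp.authentication_type_desc = 'INSTANCE';    -- excludes WITHOUT LOGIN users

-- Re-point an orphan at an existing login (names hypothetical):
ALTER USER OrphanedUser WITH LOGIN = MatchingLogin;
```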

Conversion from XML to Target Collation Impossible

Chad Callihan has a mission, should you choose to accept it:

I recently ran into an error for a query that was casting XML to VARCHAR to search for a value. I have a bit of familiarity with the XML for these types of queries, so I was surprised when running the SELECT statement caused this error, which I had never seen before:

Msg 6355 “Conversion of one or more characters from XML to target collation impossible”

Click through for the solution. The real solution? NVARCHAR Everywhere.
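The error arises because XML is stored as Unicode internally, so casting it to a non-Unicode VARCHAR forces a collation conversion that may not be able to represent every character. A minimal sketch, with hypothetical table and column names:

```sql
-- Fails with Msg 6355 when the XML holds characters the target
-- collation cannot represent:
SELECT CAST(OrderDetails AS VARCHAR(MAX)) FROM dbo.Orders;

-- Works: keep the Unicode by casting to NVARCHAR instead.
SELECT CAST(OrderDetails AS NVARCHAR(MAX)) FROM dbo.Orders;
```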

Finding Duplicate Rows and Values in R

Steven Sanderson de-duplicates, starting with values:

In data analysis and programming, it’s common to encounter situations where you need to identify duplicate values within a dataset. Whether you’re a beginner or an experienced programmer, knowing how to find duplicate values is a fundamental skill. In this blog post, we will explore two different approaches to accomplish this task using base R functions and the dplyr package in R. By the end, you’ll have a clear understanding of how to detect and manage duplicate values in your own datasets.

From there, we get to see various ways to de-duplicate rows in R:

In data analysis and manipulation tasks, it’s common to encounter situations where we need to identify and handle duplicate rows in a dataset. In this blog post, we will explore three different approaches to finding duplicate rows in R: the base R method, the dplyr package, and the data.table package. We’ll compare their performance using the benchmark function and provide insights on when to use each approach. So, grab your coding gear, and let’s dive in!

Finding duplicate values is the relatively tricky one; finding duplicate rows is much easier.

Data Syncs between Azure SQL DB and Amazon RDS

Joey D’Antoni crosses clouds:

A while back, a client who hosts user-facing databases in Azure SQL Database had a novel problem. One of their customers had all of their infrastructure in AWS and wanted to be able to access my client’s data in an RDS instance. There aren’t many options for doing this: replication doesn’t work with Azure SQL Database as a publisher because there’s no SQL Agent. Managed Instance would have been messy from a network perspective, as well as cost-prohibitive compared to Azure SQL DB serverless. An ETL tool like Azure Data Factory would have worked, but it would have required a rather large number of dev cycles to check for changed data. Enter Azure Data Sync.

Read on to see what Azure Data Sync is and how it helps solve this problem.

Multi-Source Replication in MySQL

Aisha Bukar continues a series on replication in MySQL:

MySQL’s multi-source replication allows a replica server to receive data from multiple source servers. Let’s say you have a replica server at your workplace and there are multiple source servers in different locations; you need a way to receive data from those source servers directly on your replica. This is where the multi-source replication technique comes into play. It allows you to efficiently gather data from various sources and consolidate it on your replica server.

Note that this is quite different from merge replication or peer-to-peer replication in SQL Server and there are some limits to its capabilities. That said, I could see this being really useful for performing ELT into a warehouse: use replication to keep the staging tables in sync and then run a job to perform transformations into facts and dimensions periodically.
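Mechanically, multi-source replication hangs off named channels on the replica. A sketch with hypothetical hosts and credentials, using MySQL 8.0.23+ syntax:

```sql
-- One CHANGE REPLICATION SOURCE ... FOR CHANNEL per source server.
CHANGE REPLICATION SOURCE TO
    SOURCE_HOST = 'source1.example.com',
    SOURCE_USER = 'repl',
    SOURCE_PASSWORD = 'changeme',
    SOURCE_AUTO_POSITION = 1    -- requires GTID-based replication
FOR CHANNEL 'source1';

CHANGE REPLICATION SOURCE TO
    SOURCE_HOST = 'source2.example.com',
    SOURCE_USER = 'repl',
    SOURCE_PASSWORD = 'changeme',
    SOURCE_AUTO_POSITION = 1
FOR CHANNEL 'source2';

-- Start and inspect each channel independently.
START REPLICA FOR CHANNEL 'source1';
START REPLICA FOR CHANNEL 'source2';
SHOW REPLICA STATUS FOR CHANNEL 'source1';
```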
