
Curated SQL Posts

Microsoft Fabric GitHub Integration Security Considerations

Kevin Chant covers a bit of security:

I know the option to work with GitHub has got a lot of people excited. Which is why I wanted to share my initial thoughts about security with you all. Because a lot of things have come to mind whilst testing this.

I want to highlight immediate implications and options before you all get too involved with testing. To make sure you test working with GitHub safely.

Plus, this post is really useful for those of you looking to test this in a regulated GitHub Enterprise environment. Because it will allow you to explain things to your GitHub administrators better, and/or forward them this post. To explain what you want to achieve.

Read on for Kevin’s thoughts on the matter.


Optimistic Locking in Postgres

Semab Tariq explains how optimistic locking works in PostgreSQL:

Concurrency control in databases ensures that multiple transactions can occur simultaneously without causing data errors. It’s essential because, without it, two people updating the same information at the same time could lead to incorrect or lost data. There are different ways to manage this, including optimistic locking and pessimistic locking. Optimistic locking assumes that conflicts are rare and only checks for them when updating data. In contrast, pessimistic locking assumes conflicts are likely and locks data early to prevent issues. Optimistic locking allows for more concurrent transactions and better performance in systems with fewer conflicts.

Read on to learn more about it, including some patterns for best use and what to avoid.
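
As a rough illustration of the idea in the excerpt, here is a minimal sketch of version-based optimistic locking. It is not taken from Semab's post: the `items` table, its `version` column, and the `update_price` helper are all hypothetical, and it uses Python's built-in `sqlite3` module as a stand-in so it runs anywhere. The same `UPDATE ... WHERE version = ...` pattern should carry over to PostgreSQL.

```python
import sqlite3


def update_price(conn, item_id, new_price, expected_version):
    """Apply the update only if the row still has the version we read earlier.

    Returns True when the write succeeded, False when another writer got there
    first (hypothetical helper, for illustration only).
    """
    cur = conn.execute(
        "UPDATE items SET price = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_price, item_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when the version check failed, i.e. a conflict occurred.
    return cur.rowcount == 1


# In-memory database with a single row at version 1 (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, price REAL, version INTEGER)"
)
conn.execute("INSERT INTO items VALUES (1, 10.0, 1)")
conn.commit()

# Two writers both read version 1, then try to write in turn.
first_ok = update_price(conn, 1, 12.0, expected_version=1)
second_ok = update_price(conn, 1, 15.0, expected_version=1)

final_price, final_version = conn.execute(
    "SELECT price, version FROM items WHERE id = 1"
).fetchone()
```

The second writer's update matches no rows because the version has moved on, so the caller can re-read the row and retry instead of silently overwriting the first change.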


AutoML in Python with TPOT

Abid Ali Awan gives us a primer on TPOT:

AutoML is a tool designed for both technical and non-technical experts. It simplifies the process of training machine learning models. All you have to do is provide it with the dataset, and in return, it will provide you with the best-performing model for your use case. You don’t have to code for long hours or experiment with various techniques; it will do everything on its own for you.

In this tutorial, we will learn about AutoML and TPOT, a Python AutoML tool for building machine learning pipelines. We will also learn to build a machine learning classifier, save the model, and use it for model inference.

Click through to see an example of how to use the library.


FabricRestClient and Long-Running Operations

Sandeep Pawar has a public service announcement:

I want to thank Michael Kovalsky for pointing out that FabricRestClient in Semantic Link supports (since v 0.7.5) Long Running Operation (LRO).

LRO support allows the client to wait for the request to process without being blocked. Without LRO support, you will get a 202 response code saying the request is being processed. You need to submit another request based on the url returned to get the result. With LRO support, FabricRestClient will wait 20s and give you the result back.

Click through to see what you’d need to do to enable it, as well as the benefit you can receive.
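
To make the "submit another request based on the url returned" step concrete, here is a minimal sketch of the polling loop a caller would otherwise write by hand. It does not use FabricRestClient or any real Fabric endpoint: `FakeResponse`, `make_fake_get`, and `wait_for_result` are hypothetical names, the URL is a placeholder, and the sketch assumes the follow-up URL arrives in a `Location` header.

```python
import time


class FakeResponse:
    """Tiny stand-in for an HTTP response object (hypothetical)."""

    def __init__(self, status_code, headers=None, body=None):
        self.status_code = status_code
        self.headers = headers or {}
        self.body = body


def wait_for_result(get, response, poll_seconds=0.0, max_polls=10):
    """Keep polling the returned URL while the service answers 202."""
    polls = 0
    while response.status_code == 202:
        if polls >= max_polls:
            raise TimeoutError("operation still pending after max_polls")
        time.sleep(poll_seconds)
        # Assumption: the URL to poll is carried in the Location header.
        response = get(response.headers["Location"])
        polls += 1
    return response


def make_fake_get(pending_polls):
    """Build a fake GET that reports 202 a few times, then 200."""
    state = {"remaining": pending_polls}

    def fake_get(url):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return FakeResponse(202, {"Location": url})
        return FakeResponse(200, body={"status": "Succeeded"})

    return fake_get


# Simulate a request that is accepted, stays pending twice, then completes.
initial = FakeResponse(202, {"Location": "https://example.invalid/operations/1"})
final = wait_for_result(make_fake_get(2), initial)
```

With LRO support in the client, this loop is the part you no longer have to write yourself.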


Rounding Options in T-SQL

Rick Dobson talks rounding:

Please compare the SQL Server round function to banker’s rounding in T-SQL for converting decimal values to integer values. I seek a framework for assessing how closely banker’s rounding results versus SQL Server Round function results match the underlying decimal values. Please provide a couple of empirical comparisons with the framework to indicate which set of rounded values are closer to the underlying decimals and by how much.

Rick talks about what banker’s rounding is and shows how its results adhere more closely to the underlying distribution. Rick does show a user-defined function that generates a rounded number, but if you’re doing this with large enough amounts of data, using CLR and the System.Math.Round() function will likely give you better performance. Incidentally, this is also why T-SQL code and .NET code that round the same decimal numbers can produce slightly different results: T-SQL’s ROUND() rounds midpoint values away from zero, whereas .NET uses banker’s rounding (round half to even) by default.
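
The difference between the two midpoint rules is easy to see on a handful of values. The sketch below is not Rick's framework: it uses Python's standard `decimal` module, with `ROUND_HALF_UP` standing in for ties-away-from-zero rounding and `ROUND_HALF_EVEN` for banker's rounding, and the sample values are made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Sample values that all sit exactly on a midpoint (illustrative only).
values = [Decimal("0.5"), Decimal("1.5"), Decimal("2.5"), Decimal("3.5")]

# Ties go away from zero: 0.5 -> 1, 1.5 -> 2, 2.5 -> 3, 3.5 -> 4.
half_up = [int(v.quantize(Decimal("1"), rounding=ROUND_HALF_UP)) for v in values]

# Ties go to the nearest even integer: 0.5 -> 0, 1.5 -> 2, 2.5 -> 2, 3.5 -> 4.
half_even = [int(v.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)) for v in values]

# Compare how far each rounded total drifts from the true total.
true_total = sum(values)
half_up_drift = sum(half_up) - true_total
half_even_drift = sum(half_even) - true_total
```

On these four values the ties-away-from-zero totals drift upward by 2, while the banker's rounding totals match the true sum, which is the kind of bias the comparison in the post is measuring.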


Restoring a MySQL Table from Filesystem Backup

Chad Callihan recovers from a missing database backup:

There may be no worse feeling than needing a database backup and not having one. It ranks right up there with running a DELETE statement and missing the WHERE clause. God help you if you suffer both of those together. If you come across that situation with a MySQL database, you might be able to recover what you need.

Read on to see how. Even so, I’d be concerned about what happens if there are foreign key constraints involved.


SQL Server AGs and Kubernetes

Andrew Pruski shakes his head:

Say we have a database that we want to migrate a copy of into Kubernetes for test/dev purposes, and we don’t want to backup/restore.

How can it be done?

Well, with cross platform availability groups! We can deploy a pod to our Kubernetes cluster, create the availability group, and then auto-seed our database!

The caveat is, this probably isn’t a good idea. But then again, when has that ever stopped anyone?


Working with grep in R

Steven Sanderson performs a pattern match:

In R, finding patterns in text is a common task, and one of the most powerful functions to do this is grep(). This function is used to search for patterns in strings, allowing you to locate elements that match a specific pattern. Today, we’ll explore how to use wildcard characters with grep() to enhance your string searching capabilities. Let’s dive in!

Read on to learn more about how to use the grep() function.


Handling Multiple Snapshots on a Database

Andy Brownsword lets things get out of hand:

Last week we looked at using Database Snapshots to help with rolling back upgrades. The snapshot maintained a point in time copy of the database which could be later restored.

We can go further – a database can have multiple snapshots.

Let’s suppose we want to take one before an upgrade, another once the upgrade is complete, and another before the start of business the following day. This would provide us multiple points to restore to.

This however makes restoring more complicated.

My recollection is that it’s not just restoration that gets more complicated, but also any database activity, to the point where too many database snapshots on a single database can have a considerable performance impact.
