Curated SQL Posts

Tips for Using Azure Backup for SQL Server

Anna Hoffman, et al., share some tips and tricks:

We recently worked with a customer that had migrated their Windows and SQL Servers to Azure and wanted to use Azure Backup for a consistent enterprise backup experience. The SQL Servers had multiple databases of varying sizes, some of which were multi-terabyte. A single Azure Backup vault was deployed using a policy that was distributed to all the SQL Servers. During the migration process, the customer observed issues with the quality of the backups and poor virtual machine performance while the backups were running. We worked through the issues by reviewing the best practices, modifying the Azure Backup configuration, and changing the virtual machine SKU. For this specific example, the customer needed to change their SKU from Standard_E8bds_v5 to Standard_E16bds_v5 to support the additional IOPS and throughput required for the backups. They used premium SSD v1 and the configuration met the IOPS and throughput requirements.

In this post, we share some of the techniques we used to identify and resolve the performance issues that were observed. 

Read on to learn more about how Azure Backup works and troubleshooting mechanisms.

Working with GraphQL in Microsoft Fabric

Stepan Resl takes us through what’s available today:

It is an alternative to REST API and enables users to fetch data from multiple sources using a single query. Compared to REST API, GraphQL is much more flexible and allows users to retrieve only the data they need, reducing the amount of data transferred between the client and server. It also uses a single endpoint, reducing the number of requests made to the server. It is a platform and programming language-independent specification, meaning it can be used with any language and on any platform.

GraphQL is defined by an API schema written in the GraphQL schema definition language. Each schema specifies the types of data that users can request or modify, and the relationships between these types. The term “resolver” is often mentioned in relation to GraphQL. It refers to a function or functions responsible for fetching data for a specific field in the schema and provides instructions for converting the GraphQL operation into data.

As a quick reminder for the data-minded: GraphQL and graph databases are orthogonal to one another.
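
To make the resolver idea concrete, here is a minimal sketch in plain Python. It is not Fabric's API for GraphQL or any real GraphQL library: the sample data, the resolver map, and the `execute` helper are all hypothetical, and the query is written as nested dictionaries rather than parsed from GraphQL syntax. It only illustrates how one request can walk the schema, call a resolver per field, and return just the fields that were asked for.

```python
# Hypothetical in-memory data standing in for real data sources.
CUSTOMERS = {1: {"id": 1, "name": "Ada", "city": "London"}}
ORDERS = [
    {"id": 10, "customer_id": 1, "total": 25.0},
    {"id": 11, "customer_id": 1, "total": 40.0},
]


def resolve_customer(_parent, args):
    """Resolver for Query.customer: look the customer up by id."""
    return CUSTOMERS[args["id"]]


def resolve_orders(parent, _args):
    """Resolver for Customer.orders: fetch orders for the parent customer."""
    return [o for o in ORDERS if o["customer_id"] == parent["id"]]


# (type name, field name) -> resolver function. Fields without an entry
# fall back to reading the value straight off the parent object.
RESOLVERS = {
    ("Query", "customer"): resolve_customer,
    ("Customer", "orders"): resolve_orders,
}

# The type returned by each object-valued field (a tiny stand-in for a schema).
FIELD_TYPES = {
    ("Query", "customer"): "Customer",
    ("Customer", "orders"): "Order",
}


def execute(type_name, parent, selection):
    """Resolve each requested field; recurse into nested selections."""
    result = {}
    for field, (args, sub_selection) in selection.items():
        resolver = RESOLVERS.get((type_name, field))
        value = resolver(parent, args) if resolver else parent[field]
        if sub_selection is not None:
            child_type = FIELD_TYPES[(type_name, field)]
            if isinstance(value, list):
                value = [execute(child_type, v, sub_selection) for v in value]
            else:
                value = execute(child_type, value, sub_selection)
        result[field] = value
    return result


# Roughly: { customer(id: 1) { name orders { total } } }
query = {
    "customer": ({"id": 1}, {
        "name": ({}, None),
        "orders": ({}, {"total": ({}, None)}),
    })
}
response = execute("Query", None, query)
print(response)
```

Note that `city` and the order ids never appear in the response because the query did not select them, which is the "retrieve only the data they need" point from the excerpt.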

Test Isolation with Kafka

Anton Belyaev builds some tests:

The experience of running Kafka in test scenarios has reached a high level of convenience thanks to the use of Testcontainers and enhanced support in Spring Boot 3.1 with the @ServiceConnection annotation. However, writing and maintaining integration tests with Kafka remains a challenge. This article describes an approach that significantly simplifies the testing process by ensuring test isolation and providing a set of tools to achieve this goal. With the successful implementation of isolation, Kafka tests can be organized in such a way that at the stage of result verification, there is full access to all messages that have arisen during the test, thereby avoiding the need for forced waiting methods such as Thread.sleep().

This method is suitable for use with Testcontainers, Embedded Kafka, or other methods of running the Kafka service (e.g., a local instance).

Click through for that approach.

The Framework Laptop and Right to Repair

Heather Joslyn summarizes an interview:

Chances are, if you’ve lived through a few innovation cycles, you’ve got too many old computers — and their cables — cluttering your house. Do you think that if you had the right to repair your devices, to swap out obsolete components for more performant ones, you wouldn’t keep piling up castoff electronics?

So does Matt Hartley, guest on this On the Road episode of The New Stack Makers, recorded at Open Source Summit North America in April.

This is a bit out of left field for Curated SQL content, but to be fair, when has that ever stopped me? I’ve owned two Framework laptops (one of which is my daily driver and the other I gave away when it stopped being my daily driver) and really like the company because of its repair-friendly ethos, making parts and schematics available—as was the norm for companies until recently. Part of owning a thing is having the ability to maintain and repair it.

Building a Full-Stack App with Kafka and Node.js

Lucia Cerchie builds an application:

A well-known debate: tabs or spaces? Sure, we could set up a Google Form to collect this data, but where’s the fun in that? Let’s settle the debate, Kafka-style. We’ll use the new confluent-kafka-javascript client (not in general availability yet) to build an app that produces the current state of the vote counts to a Kafka topic and consumes from that same topic to surface them to a JavaScript frontend. 

Why are we using this client in particular? It comes from Confluent and is intended for use with Apache Kafka® and Confluent Platform. It’s compatible with Confluent’s cloud offering as well. It builds on concepts from the two most popular Kafka JavaScript client libraries: KafkaJS and node-rdkafka. The functionality is based on node-rdkafka; however, it also provides a way to interface with the library via methods similar to those in KafkaJS due to their developer-friendly nature. There are two APIs: the first implements the functionality based on node-rdkafka; the second is a promisified API with the methods akin to those in KafkaJS. By choosing this client, we can access wide functionality and have a smooth developer experience via the dev-friendly methods.

Click through for the code and explanation. Meanwhile, tabs in my heart, spaces in my job.

A/B Testing with Survival Analysis in R

Iyar Lin combines two great flavors:

Usually, when running an A/B test, analysts assign users randomly to variants over time and measure conversion rate as the ratio between the number of conversions and the number of users in each variant. Users who just entered the test and those who have been in the test for 2 weeks get the same weight.

This can be enough for cases where a conversion either happens or not within a short time frame after assignment to a variant (e.g., finishing an onboarding flow).

There are however many instances where conversions are spread over a longer time frame. One example would be first order after visiting a site landing page. Such conversions may happen within minutes, but a large churn could also happen within days after the first visit.

Read on for the scenario, as well as a simulation. I will note that, in the digital marketing industry, there’s usually a hard cap on the number of days during which you’re able to attribute a conversion to some action, for exactly the reason Iyar mentions. H/T R-Bloggers.
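
As a rough illustration of why time in the test matters, here is a small plain-Python sketch of the Kaplan-Meier estimator, a standard survival analysis technique. It is not the R code from Iyar's post, and the five users below are made-up data. Each user has a number of days observed and a flag for whether they converted; users who have not converted yet are treated as censored rather than as failures.

```python
def kaplan_meier(durations, converted):
    """Return [(t, S(t))], where S(t) estimates the share of users
    who have still not converted after day t."""
    # Only days on which at least one conversion happened change the estimate.
    event_times = sorted({d for d, c in zip(durations, converted) if c})
    survival = 1.0
    curve = []
    for t in event_times:
        # Users still under observation (not converted, not censored) at day t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, c in zip(durations, converted) if c and d == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up data: days each user has been in the test, and whether they converted.
durations = [1, 1, 2, 3, 3]
converted = [True, False, True, False, False]

naive_rate = sum(converted) / len(converted)   # conversions / users
curve = kaplan_meier(durations, converted)
km_rate = 1 - curve[-1][1]                     # estimated conversion by the last event day

print(f"naive: {naive_rate:.3f}, Kaplan-Meier: {km_rate:.3f}")
```

With this toy data the naive ratio is 0.4, while the Kaplan-Meier estimate comes out higher (about 0.47), because the user who was only observed for one day is not counted as a permanent non-converter.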

Contrasting Data Mesh and Data Fabric

Sahil Babbar makes a comparison:

The concept of a data mesh proposes that each business domain takes charge of hosting, preparing, and delivering its own data to both its internal team and broader stakeholders. This decentralized approach empowers autonomous data teams to take full ownership and accountability for their data products and management processes.

Data fabric is a system designed to help a company manage and use its data from various storage types, like databases, tagged files, or document stores. It supports different tools and applications to easily access this data, working with technologies like Apache Kafka for real-time data streaming, ODBC for database connections, HDFS for big data storage and REST APIs for web services. It focuses on creating a unified data environment that acts as a reliable, centralized source for all organizational data. This setup ensures data is accurate, consistent, and secure, making it easy for different teams to access and manage data efficiently.

Read on to learn a bit more about the two architectures.

Removing Leading Zeroes from a String in T-SQL

Steve Stedman gets rid of leading zeroes:

When working with data in SQL Server, there may be times when you need to remove leading zeros from a string. This task can be particularly common when dealing with numerical data stored as strings, such as ZIP codes, product codes, or other formatted numbers. In this blog post, we’ll explore several methods to remove leading zeros in SQL Server.

I’m not sure I see the reason to use anything other than CAST() (or, better yet, TRY_CAST()), but Steve does show two other methods.
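
For a sense of where a cast and a string-based method can differ, here is a small Python analogue (not T-SQL, and not Steve's code; both helper names are made up). The first function behaves roughly like TRY_CAST to an integer, returning nothing when the string is not a valid integer; the second only trims leading '0' characters and leaves the rest of the string alone.

```python
def strip_by_cast(value):
    """Rough analogue of TRY_CAST(value AS INT): None if not an integer."""
    try:
        return int(value)
    except ValueError:
        return None


def strip_by_trim(value):
    """String-only approach: drop leading '0' characters,
    but keep a single '0' when the input was all zeroes."""
    trimmed = value.lstrip("0")
    if value and not trimmed:
        return "0"
    return trimmed


for code in ["00123", "000", "00A12"]:
    print(code, strip_by_cast(code), strip_by_trim(code))
```

The two agree on purely numeric input; the product-code-style value "00A12" is the case where the cast gives up and the trim still returns "A12".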

Random Walks in R with TidyDensity

Steven Sanderson goes for a walk:

A random walk is a mathematical object that describes a path consisting of a succession of random steps. It’s a cornerstone concept in fields like physics, economics, and biology. In finance, for example, the random walk hypothesis suggests that stock market prices evolve according to a random walk and thus cannot be predicted.

Read on to see how you can generate a dataset matching a random walk, as well as a comparison of techniques for generating them.
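
For readers who want the idea before the R code, a simple random walk can be sketched in a few lines of plain Python. This is not TidyDensity's implementation or API; the function name and its parameters are made up for illustration, and the step distribution (plus or minus one with equal probability) is just the simplest choice.

```python
import random


def random_walk(n_steps, start=0, seed=None):
    """Return the path of a simple random walk: each step is +1 or -1
    with equal probability, starting from `start`."""
    rng = random.Random(seed)  # local generator so a seed makes runs repeatable
    position = start
    path = [position]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))
        path.append(position)
    return path


walk = random_walk(100, seed=42)
print(walk[:10])
```

Swapping `rng.choice((-1, 1))` for a draw such as `rng.gauss(0, 1)` would give a walk with normally distributed steps instead.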

Measure-Object in PowerShell

Patrick Gruenauer counts the ways:

The Measure-Object cmdlet counts objects. But it can do even more. We can calculate the sum, the average and much more. In this blog post I show a few examples with Measure-Object. Let’s dive in.

It’s a fairly straightforward cmdlet, but it gets a lot of use: it combines something like wc in Linux with the ability to collect basic statistics on objects.