Month: August 2023

Contrasting Azure Synapse Analytics and Microsoft Fabric

Warner Chaves explains the difference:

In the modern era of data-driven decision-making, businesses rely heavily on robust and efficient data platforms to process, analyze, and derive insights from their vast amounts of data. Since 2019, Azure Synapse Analytics has been Microsoft’s main contender in this space, offering powerful capabilities to handle complex data workloads.

Now, Microsoft has announced a new data platform called Microsoft Fabric, an evolution of the data platform built with a modified philosophy. It is a similar product, but with enough differences that the two are not interchangeable, so it’s very important to understand how they compare and contrast if you’re planning a new data platform deployment. Microsoft wanted a product that was even simpler to deploy and operate and could function well outside of an Azure cloud environment as a full standalone Software as a Service offering.

In this blog post, we’ll compare Synapse Analytics and Fabric, highlighting their features, strengths, and considerations to help you make an informed decision for your organization’s data needs.

Warner has seven main areas of comparison, so click through to see how the two products stack up.

Delete Empty Folders with PowerShell

Patrick Gruenauer tidies up:

Big Data? Pain? Looking for empty folders and want to delete them? In this post I show you how to proceed to find and delete empty folders.

Open PowerShell, ISE or VS Code.

Caution: If you proceed, all empty folders will be deleted without any warning.

It is kind of funny to warn people that, if they run the script to delete all of these empty folders, they will delete all of these empty folders. But hey, better safe than sorry.
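
The general idea can be sketched in a few lines. This is not Patrick's PowerShell script, just a minimal Python illustration of the same technique; the function name `delete_empty_dirs` and the `dry_run` flag are made up for this example.

```python
import os
from pathlib import Path


def delete_empty_dirs(root, dry_run=True):
    """Find (and optionally delete) empty directories below root.

    Hypothetical helper for illustration only. Returns the list of
    directories that were, or in a dry run would be, removed.
    """
    root = Path(root)
    removed = []
    # Walk bottom-up so child folders are visited before their parents.
    for current, _dirnames, _filenames in os.walk(root, topdown=False):
        path = Path(current)
        if path == root:
            continue  # never remove the starting folder itself
        # Re-check the folder contents now, since children may be gone.
        if not any(path.iterdir()):
            if not dry_run:
                path.rmdir()  # rmdir only succeeds on empty folders
            removed.append(path)
    return removed
```

Calling `delete_empty_dirs(some_path)` only reports what it finds; passing `dry_run=False` actually deletes. Note that a dry run lists only folders that are empty right now, whereas a real run also removes parents that become empty once their empty children are gone.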

TaskFactory Activation on an Azure-SSIS Integration Runtime

Andy Leonard does some sleuthing:

I regularly help customers migrate SSIS to Azure-SSIS integration runtimes, a nifty component of Azure Data Factory. I was recently stumped by an error activating TaskFactory (Task Factory for the search engines…) on an Azure-SSIS IR node. The error was:

“The system cannot find the file specified.”

Read on to figure out where the file is and how to fix this error.

Persisting Data for SQL Server on Docker Swarm

Andrew Pruski saves the day, or at least the data:

In my last couple of blog posts (here and here) I talked about how to get SQL Server running in Docker Swarm. But there is one big (and show-stopping) issue that I have not covered. How do we persist data for SQL Server in Docker Swarm?

Docker Swarm, like Kubernetes, has no native method to persist data across nodes…so we need another option and one of the options available to us is Portworx.

So how can we use Portworx to persist SQL Server databases in the event of a node failure in Docker Swarm?

Read on to find out how.

SeamlessM4T: Multimodal Speech and Text Translation

Meta has announced a new translation model:

Today, we’re introducing SeamlessM4T, the first all-in-one multimodal and multilingual AI translation model that allows people to communicate effortlessly through speech and text across different languages. SeamlessM4T supports:

  • Speech recognition for nearly 100 languages
  • Speech-to-text translation for nearly 100 input and output languages
  • Speech-to-speech translation, supporting nearly 100 input languages and 36 (including English) output languages
  • Text-to-text translation for nearly 100 languages
  • Text-to-speech translation, supporting nearly 100 input languages and 35 (including English) output languages

The open source library is available on GitHub, and you can also get the model itself on Hugging Face. The nicest thing about all of this is that, unlike hosted translation services, you can run it entirely offline and perform the inference on local compute.

Training a Code-First Model in Azure ML

I have a new video:

In this video, we walk through the code in an Azure Machine Learning project and see how the pieces fit together.

There are a few more videos to go in this Azure ML series, and I would recommend watching the earlier ones in sequence to understand how we got here, but this one is what I’ve been building toward.

Request: Fill out the Redgate State of the Database Landscape Survey

Ryan Booz would like a few minutes of your time:

We’d like to hear what you have to say about the topology of your database landscape, and we want to give you first access to the data after the survey closes.

By taking a few minutes to answer the questions, you can help provide clarity on how our jobs as database professionals are changing and what skills will be needed in the future to successfully manage change.

Click through for the article and fill out the survey at https://rd.gt/survey. This survey is open until September 30, 2023, so there’s still a bit of time to share your thoughts. One annoying thing about the survey is that they ask you about all of the database platforms, even if you didn’t select that you actually use them. Fortunately, you can skip those questions.

Deploying Azure Resources with Terraform and GitHub Actions

Reitse Eskens sets up some new resources:

When you start out with Terraform, you’ll most likely run the code locally with terraform on your own machine. Terraform works with a so-called state file: it saves the state of the Azure deployment it left behind and compares the (new) code with the state it encounters when it runs again. Changes are resolved by changing, deleting or adding resources that don’t match the state file.

This works fine when you’re flying solo and don’t have co-workers who can change resources as well. Whenever you need to share code, the industry standard is to use a git solution, whether GitHub, GitLab, Azure DevOps or some other solution; as long as it has version control, you should be fine (provided people adhere to the correct usage of branches).

Click through for a step-by-step walkthrough, as well as an explanation of the major actors in that play.
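
To make the state-file idea concrete, here is a toy Python sketch of comparing desired configuration against recorded state. It is an assumption-laden illustration, not Terraform's actual algorithm or state format: the `plan` function, the dictionary layout, and the resource names are all hypothetical, and real Terraform also refreshes state against the live infrastructure.

```python
def plan(desired, state):
    """Compare desired resources with recorded state (toy model).

    Both arguments map a resource name to a dict of settings.
    Returns which resources would be added, changed, or deleted.
    """
    # In the code but not in the state: needs to be created.
    to_add = sorted(name for name in desired if name not in state)
    # In both, but with different settings: needs to be updated.
    to_change = sorted(
        name for name in desired
        if name in state and state[name] != desired[name]
    )
    # In the state but no longer in the code: needs to be removed.
    to_delete = sorted(name for name in state if name not in desired)
    return {"add": to_add, "change": to_change, "delete": to_delete}


# Hypothetical example data, not taken from the article.
state = {
    "resource_group": {"location": "westeurope"},
    "old_storage": {"tier": "Standard"},
}
desired = {
    "resource_group": {"location": "northeurope"},
    "sql_server": {"version": "12.0"},
}
print(plan(desired, state))
```

The sketch also hints at why a local state file becomes a problem with co-workers: if two people each hold a different copy of `state`, their plans will disagree about what needs to change.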

Load Balancing in Postgres Clusters with pg_cirrus

Muhammad Ali explains how load balancing works in Postgres:

Load balancing is a critical component of high availability clusters that optimises performance, scalability, and fault tolerance. By evenly distributing database connections across multiple servers, load balancing prevents bottlenecks, efficiently handles increased workloads and improves response time.

In this blog, we will explore how standby nodes contribute to efficient workload distribution and help achieve optimal query execution when all read/select queries are directed to them.

Read on to see how you can use pg_cirrus to perform query load balancing.
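
As a rough illustration of the routing idea described above, the following Python sketch sends SELECT statements to standby nodes in round-robin order and everything else to the primary. It does not reflect how pg_cirrus implements load balancing; the `QueryRouter` class and the node names are hypothetical, and the first-keyword check is a deliberate simplification (a real balancer must also handle cases such as `SELECT ... FOR UPDATE` or functions with side effects).

```python
import itertools


class QueryRouter:
    """Toy read/write splitter (hypothetical, for illustration only)."""

    def __init__(self, primary, standbys):
        self.primary = primary
        # Cycle through the standbys forever; None if there are none.
        self._standbys = itertools.cycle(standbys) if standbys else None

    def route(self, sql):
        """Return the node name that should run this statement."""
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        # Naive rule: only plain SELECTs count as read queries.
        if first_word == "SELECT" and self._standbys is not None:
            return next(self._standbys)
        # Writes, and anything we are unsure about, go to the primary.
        return self.primary


router = QueryRouter("primary", ["standby1", "standby2"])
for statement in ["SELECT 1", "INSERT INTO t VALUES (1)", "select 2"]:
    print(statement, "->", router.route(statement))
```

Falling back to the primary for anything that is not clearly a read is the conservative choice: a misrouted write would fail on a read-only standby, while a misrouted read merely loses some of the balancing benefit.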

SQL Standards through the Years

Brendan Tierney notes a new standard:

As of June 2023, the new standard for SQL has been released by the International Organization for Standardization. Although the SQL language has been around since the 1970s, this is the 11th release or update to the SQL standard. The first version of the standard was back in 1986 and the base-level standard for all databases was released in 1992 and referred to as SQL-92. That’s more than 30 years ago, and some databases still don’t contain all the base features in that standard, although they aim to achieve this sometime in the future.

It’s a bit wacky to me that you have to pay for a copy of the SQL standard, but then again, it’s not really intended for regular people: it’s intended for companies developing new database products so they can adhere to the 70% of the standard that they want, outright ignore 20% of the standard, and replace 10% with their own incompatible versions.
