Month: March 2023

Tips on Navigating Postgres Documentation

Laetitia Avrot dishes dirt on Postgres documentation:

I could have created a very easy post with quick tips on psql, like how to disable this horrible pager the “ancient” Postgres contributors insist on keeping on by default (BTW, it’s \pset pager off, you’re welcome, you’ll thank me later), but as I wrote an entire website on that exact topic, I thought I needed to find something else.

So here is my topic: how to use the Postgres documentation! Yes, that documentation content is great, but no, that documentation is not easy to navigate at first.

Click through for tips on the best ways to navigate through this documentation, as well as important pages and topics based on your use case and role.

Unit Testing Spark Notebooks in Synapse

Arun Sethia grabs the oscilloscope:

In this blog post, we will cover how to test and create unit test cases for Spark jobs developed using Synapse Notebook. This is an extension of my previous blog, Synapse – Choosing Between Spark Notebook vs Spark Job Definition, where we discussed selecting between Spark Notebook and Spark Job Definition. Unit testing is an automated approach that developers use to test individual self-contained code units. By verifying code behavior early, it helps to streamline coding practices for larger systems.

Arun covers three major use cases: when your code is in an external library, when it is in a separate notebook, and when it is in the same notebook.
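
As a rough illustration of the same-notebook case, here is a minimal sketch in plain Python. It assumes the notebook logic has been factored into a small function; `add_full_name`, its row format, and `run_tests` are hypothetical names, not taken from Arun's post. A real Spark test would build a SparkSession and compare DataFrames rather than lists of dictionaries.

```python
import unittest


def add_full_name(rows):
    """Return new row dicts with a 'full_name' field (hypothetical transformation)."""
    return [
        {**row, "full_name": f"{row['first_name']} {row['last_name']}".strip()}
        for row in rows
    ]


class AddFullNameTest(unittest.TestCase):
    def test_combines_first_and_last_name(self):
        rows = [{"first_name": "Ada", "last_name": "Lovelace"}]
        result = add_full_name(rows)
        self.assertEqual(result[0]["full_name"], "Ada Lovelace")

    def test_does_not_mutate_input(self):
        rows = [{"first_name": "Ada", "last_name": "Lovelace"}]
        add_full_name(rows)
        self.assertNotIn("full_name", rows[0])


def run_tests():
    """Run the suite in-process, as a notebook cell would, and report success."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddFullNameTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Calling `run_tests()` from a cell avoids `unittest.main()`, which tries to parse the host process's command-line arguments and exit, neither of which suits a notebook.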

Creating a Disaster Recovery Plan for Synapse

Freddie Santos talks HA/DR with Synapse:

Many of our customers have been asking about creating a disaster recovery plan for their Synapse Workspace. In a new blog series, we will cover the basics of disaster recovery and business continuity, discussing available options and custom solutions.

In this first post, we’ll review important concepts and questions to answer before building a disaster recovery plan, including the differences between High Availability and Disaster Recovery.

The focus in this post is on the dedicated SQL pool and Azure Data Lake Storage Gen2 (because people still think about Gen1?), though that’s the majority of what you’d need to think about: Spark pools and the serverless SQL pool really drive from the data lake. There are also Data Explorer pools, which have their own storage and HA/DR capabilities.

Content Security Policies and Posit Connect Apps

Theo Roe gets into some web security:

Heads up! We’re about to launch WASP, a Web Application Security Platform. The aim of WASP is to help you manage (well, you guessed it) the security of your Posit Connect application using Content Security Policy and Network Error Logging. More details soon, but if this interests you, please get in touch.

This blog post is aimed at those who are somewhat tech literate but not necessarily a security expert. We’re aiming to introduce the concept of Content Security Policy and teach some of the technical aspects.

This does provide a nice overview of the topic and explains the key “what” and “why” answers.
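
For a concrete sense of what a policy looks like, here is a minimal Python sketch that serializes a set of directives into a `Content-Security-Policy` header value. The helper `build_csp_header`, the example policy, and the `cdn.example.com` host are illustrative assumptions, not taken from Theo's post. The directive names themselves (`default-src`, `script-src`, `img-src`) are standard.

```python
def build_csp_header(directives):
    """Serialize a {directive: [sources]} mapping into a CSP header value."""
    parts = []
    for name, sources in directives.items():
        # Each directive is its name followed by space-separated sources.
        parts.append(" ".join([name, *sources]))
    # Directives are separated by semicolons.
    return "; ".join(parts)


# Hypothetical policy: same-origin by default, one extra script host,
# and inline data: URIs allowed for images.
policy = {
    "default-src": ["'self'"],
    "script-src": ["'self'", "https://cdn.example.com"],
    "img-src": ["'self'", "data:"],
}

header_value = build_csp_header(policy)
```

The resulting string would be sent as the value of the `Content-Security-Policy` response header. How that header gets attached depends on the server or proxy in front of the application.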

Sun Modeling and SunBeam

Shannon Bloye takes us through a new analytics systems modeling technique:

Sun Modelling was a technique initially developed and taught by Mark Whitehorn as a professor of analytics at the University of Dundee, which is where our own Terry McCann encountered the approach whilst studying for his MSc. He does a great talk on the topic in this video.

A core aim of the method is to offer a simplicity that makes it accessible to end users as well as the usual technical professionals. The approach is a high-level visual means to model data around a business process.

This feels a bit like a Kimball model but where you’re explicitly diagramming hierarchies and common slicers.

Working with Managed Private Endpoints in Synapse

Sergio Fonseca continues a series on Synapse connectivity:

When you create your Azure Synapse workspace, you can choose to associate it to an Azure Virtual Network. The Virtual Network associated with your workspace is managed by Azure Synapse. This Virtual Network is called a Managed Workspace Virtual Network or Synapse Managed VNET.

I am 100% in favor of using managed vNETs with Synapse and about 40% in favor of using Data Exfiltration Protection—it’s a lot lower because of the impact it has on your developers, though if you need it, developers will just have to deal with the added pain.

Deciding on When to Automate

Jeffrey Hicks shares some hard-earned wisdom:

I’ve been scripting and automating things since the days of DOS 3.3, beginning with batch files. It always felt like magic. I could cast a charm simply by typing a few characters on a keyboard. Naturally, my magic skills went from batch files to VBScript to PowerShell. Throughout it all, I’ve also had an internal decision tree regarding automation. Over the years, I’ve seen IT pros new to scripting and automation needlessly struggle. Often it is due to a deficiency in their decision tree. Today, I thought I’d help you nurture yours.

There’s a lot of good advice here about where the automation inflection point is, choosing the right tool, and performing research first before trying to jump into a project.

Approximate Percentiles in Azure SQL DB and MI

Balmukund Lakhani announces a feature has gone generally available:

Today, we are announcing General Availability (GA) of native implementation of APPROX_PERCENTILE in Azure SQL Database and Azure SQL Managed Instance. We announced preview of these functions in October 2022. Since then, many customers have adopted these for the applications where response time of percentile calculation was more important than the accuracy of the result.

I have extolled and will continue to extol the virtues of these two functions, APPROX_PERCENTILE_CONT and APPROX_PERCENTILE_DISC, wherever I go. They’re considerably better than the exact originals, PERCENTILE_CONT and PERCENTILE_DISC, once you start getting into the hundreds of thousands or millions of rows. They’re also available in SQL Server 2022.
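
For reference, here is a minimal Python sketch of the exact calculations that the approximate functions trade accuracy against. It is not the algorithm SQL Server uses for the approximate versions; it only mirrors the semantics of the exact continuous (interpolated) and discrete (actual value from the set) percentiles. The function names are my own.

```python
import math


def percentile_cont(values, p):
    """Exact continuous percentile: linear interpolation between neighbors."""
    if not values or not 0.0 <= p <= 1.0:
        raise ValueError("need a non-empty input and 0 <= p <= 1")
    ordered = sorted(values)
    position = p * (len(ordered) - 1)  # zero-based fractional row position
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile_disc(values, p):
    """Exact discrete percentile: smallest value whose cumulative share >= p."""
    if not values or not 0.0 <= p <= 1.0:
        raise ValueError("need a non-empty input and 0 <= p <= 1")
    ordered = sorted(values)
    index = max(math.ceil(p * len(ordered)) - 1, 0)
    return ordered[index]
```

With the four values 10, 20, 30, 40 and p = 0.5, the continuous version interpolates to 25 while the discrete version returns 20. Both require a full sort of the input, which is the cost that becomes painful at millions of rows.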

Copy-Only Backup and Next Automatic Backup

Jose Manuel Jurado Diaz diagnoses an error:

Today, we worked on a service request in which our customer got the following error message while running a manual backup: BACKUP WITH COPY_ONLY cannot be performed until after the next automatic BACKUP LOG operation [SQLSTATE 42000] (Error 41937) BACKUP DATABASE is terminating abnormally. [SQLSTATE 42000] (Error 3013).

Click through to learn when you might see this error and what you can do about it.

Disabling Classic Pipelines in Azure DevOps

Kevin Chant shares some thoughts:

In this post I want to share my thoughts about disabling classic pipelines in Azure DevOps, which I know there are mixed feelings about.

In addition, I want to raise awareness that this is now possible: towards the end of January, Microsoft announced that you can now disable creation of classic pipelines in Azure DevOps.

In other words, you can now disable the use of the GUI-based Classic Editor and the Releases features in Azure Pipelines.

I agree with Kevin here: it’s generally time to bite the bullet on infrastructure as code if you haven’t already. We talk about it in the data platform context a lot (database schemas in source control, repeatable deployment processes, maintaining config files and applying them) and it matters just as much elsewhere.
