Press "Enter" to skip to content

Category: Configuration

Finding Mutable and Immutable Properties in Microsoft Fabric Spark

Sandeep Pawar wants to make a change:

Spark properties are divided into mutable and immutable configurations based on whether they can be safely modified during runtime after the spark session is created.

Mutable properties can be changed dynamically using spark.conf.set() without requiring a restart of the Spark application – these typically include performance tuning parameters like shuffle partitions, broadcast thresholds, AQE etc.

Immutable properties, on the other hand, are global configurations that affect core spark behavior and cluster setup and these must be set before/at session initialization as they require a fresh session to take effect.

Read on to see how you can tell which is which.

Leave a Comment

Configuration Is Code

Steve Jones has a public service announcement:

I posted a note on Twitter/X with this quote: “The content updates had not previously been treated as code because they were strictly configuration information.” This is from testimony given by Crowdstrike to a US Congressional committee in trying to explain how they grounded much of the airline industry a few months ago. That was a mess of a situation, and apparently, the vendor didn’t think their configuration was part of their code.

That’s an amazing viewpoint to me. The fact that any developer or manager thinks that their configuration data isn’t a part of their code is worth testing. Yet, I see this attitude all the time, where developers, QA, managers, and more think that the code is the only thing that changes or doesn’t change, ignoring the fact that there are configuration items that affect the code and need to be managed appropriately. Certainly, if the config data were in enums rather than in a file or database they’d feel differently.

Read on for Steve’s extended thoughts. I can understand the urge to call something “just a configuration file” so that you don’t have to do as much work. But that can lead to disaster.

Comments closed

Building a GitHub Codespace Configuration for Polyglot Notebooks

Matt Eland makes some recommendations:

In order to get Polyglot Notebooks to work with GitHub Codespaces, you’ll need to match the current requirements of the Polyglot Notebooks extension and its underlying .NET Interactive kernels.

This relies on two files in your .devcontainer directory:

  • Dockerfile which describes the Docker container the Codespace will run in
  • devcontainer.json which describes how the dev container is configured in terms of extensions and ports

Read on to learn more. Also, Matt has a brand new book available on the topic of polyglot notebooks, so check that out.

Comments closed

pg_stat_statements and Public Sentiment

Andreas Scherbaum polls the audience:

For anyone who doesn’t know, I’m running a weekly interview series with people from the PostgreSQL community. It’s called “PostgreSQL Person of the Week“. One of the questions in the default set I give everyone is:

What is your favorite PostgreSQL extension?

And guess what the answer is: by far everyone’s favorite is pg_stat_statements!

Read on to learn a bit more about what the extension does, why people like it, and what other extensions interviewees prefer.

Comments closed

Tracking Configuration-Based Performance Differences in Postgres

Ryan Lambert shows off a Postgres extension:

This is my entry for PgSQL Phriday #008. It’s Saturday, so I guess this is a day late! This month’s topic, chosen by Michael from pgMustard, is on the excellent pg_stat_statements extension. When I saw Michael was the host this month I knew he’d pick a topic I would want to contribute on! Michael’s post for his own topic provides helpful queries and good reminders about changes to columns between Postgres version 12 and 13.

In this post I show one way I like using pg_stat_statements: tracking the impact of configuration changes to a specific workload. I used a contrived change to configuration to quickly make an obvious impact.

Read on for the example.

Comments closed

Identify SQL Server Configuration Drift

Garry Bargsley won’t let the cattle become pets:

As you sit and wonder about when the next Star Wars movie is going to come out, do you ever get the thought of “I wonder if all my SQL Servers are configured the same?”

Occasionally, I get a thought like that run through my mind. Or I might see something on Twitter or Blog post about something, and it sparks the question.  Not about Star Wars, but my SQL Server environment.

Today something triggered me to confirm that all of my SQL Servers had the default backup compression setting set to enabled.

Garry mentions dbachecks at the end, and it’s a really good way of performing a fairly large number of such checks easily.

Comments closed

SQL Server Configuration Settings Requiring Restarts

Randolph West enumerates configuration settings which do (or do not) require a restart of the SQL Server database engine:

SQL Server is a complex beast, with many configuration options that can range from recommended to completely avoided.

Since the release of SQL Server 2016, several options that were recommended post-install have been rolled into the default installation options and no longer need to be done, and similar changes were made with SQL Server 2017. Even so, there are configuration changes we data professionals need to make after installation, during maintenance windows, and sometimes even during operating hours, so here’s a handy list of changes that do and don’t require a restart of your operating system or SQL Server instance.

As Randolph mentions, this set is not conclusive. For example, enabling PolyBase requires a restart of the database engine. I believe enabling ML Services technically does not, though I do out of caution because the back of my mind remembers something weird about the service’s behavior if you don’t restart the database engine, but p > 0 my brain made up the whole thing.

Comments closed

Working with SQL Server Configuration Files

Jamie Wick takes us through an underrated part of the SQL Server installer:

The ability to use a parameter file (configurationfile.ini), for automating the installation of SQL Server, has been around for many years. However, each release of SQL Server has had different parameters that could be included in the file. Here are some directions on how to find or create a parameter file, along with the parameter values that are supported by each version of SQL Server.

I appreciate the fact that every installation of SQL Server generates one of these and even points it out to you as you go through the installer wizard. And Jamie has gone a step further by giving us an Excel spreadsheet with all of the available settings and their defaults.

Comments closed

Configuring Kubernetes Pod Eviction Time

Andrew Pruski is a Kubernetes slumlord:

The default time that it takes from a node being reported as not-ready to the pods being moved is 5 minutes.

This really isn’t a problem if you have multiple pods running under a single deployment. The pods on the healthy nodes will handle any requests made whilst the pod(s) on the downed node are waiting to be moved.

But what happens when you only have one pod in a deployment? Say, when you’re running SQL Server in Kubernetes? Five minutes really isn’t an acceptable time for your SQL instance to be offline.

Click through to see how to handle this scenario.

Comments closed