Press "Enter" to skip to content

Author: Kevin Feasel

DBCC SHRINKFILE and tempdb

Tom Collins answers a question:

Question: I’m trying to delete a TempDB ndf file from the TempDB file definitions. It is no longer required ,but getting an error message :

DBCC SHRINKFILE: Page xxxxxxxx could not be moved because it is a work table page.

How can I get around this problem? There is no activity on the server

Read on for the answer.

Comments closed

Notebooks versus Dataflow Gen2 in Microsoft Fabric

Gilbert Quevauvilliers takes us through a comparison:

In this blog post I am going to compare Dataflow Gen2 vs Notebook in terms of how much it costs for the workload. I will also compare usability as currently the dataflow gen2 has got a lot of built in features which makes it easier to use.

The goal of this blog post is to understand which in my opinion is cheaper and easier to use, which will then be the focus for future blog posts with regards to what I’ve learned along the way, which will hopefully assist you too.

To compare between the two workloads, I am going to be using the same source file as well as do the same transformations which will result in the same result.

Read on for a surprising difference in cost.

Comments closed

The Impact of Auto-Close on Performance

Steve Stedman explains why Auto-Close should almost never be on for your database:

When the AutoClose setting is enabled, SQL Server will shut down the database after the last user disconnects. This means that every time a new connection is made, SQL Server must go through the entire process of starting the database again. This includes reading the database file, allocating memory, and performing any necessary recovery processes. This overhead can cause a noticeable delay for users as they connect, especially if the database is large or complex.

Read on for several other factors affecting performance. I will say that the best use case for Auto-Close is when you have a dev instance—especially on a local machine—with a large number of databases and a very limited amount of RAM available. Otherwise, if this is a server, I’m turning Auto-Close off. Even today, I’d rather just buy enough RAM for my developers than flip this switch.

Comments closed

Fixing Eager Spooling

Erik Darling sends it to the moon:

Probably the most fascinating thing about Eager Index Spools to me is how often the optimizer will insert them into execution plans, often to a query’s detriment.

In a sane world, a non-loop join plan would be chosen, a missing index request would be registered that matches whatever would have been spooled into an index, and we’d all have an easier time.

Read on for a few examples of the problem and two separate ways you can fix it. Remember, kids: friends don’t let friends eagerly spool.

Comments closed

Reviewing the Data Activator in Microsoft Fabric

Ginger Grant takes us through the Data Activator:

With the GA release of Fabric in November, 2023, I am dedicating several posts to new features which you will not find in Power BI or Azure Synapse, and the latest one I want to talk about is Data Activator. Data Activator is an interesting tool to include inside of Fabric because it is not reporting or ETL, rather it is a way to manage actions when the data hits defined targets.  It is a management system for data stored in Fabric or streamed in Azure using IOT or Event Hubs. You can use Data Activator to monitor the current state or to define actions to occur when certain conditions occur in the data.  Data Activator is still in preview, but you can evaluate it now.

Read on to see how to enable it and what you can currently do with it.

Comments closed

Running Spark Jobs on Databricks with Spark Connect and .NET

Ed Elliott runs a Databricks job:

This post aims to show how we can create a .NET application, deploy it to Databricks, and then run a Databricks job that calls our .NET code, which uses Spark Connect to run a Spark job on the Databricks job cluster to write some data out to Azure storage.

In the previous post, I showed how to use the Range command to create a Spark DataFrame and then save it locally as a parquet file. In this post, we will use the Sql command, which will return a DataFrame or, in our world, a Relation. We will then pass that relation to a WriteOperation command, which will write the results of the Sql out to Azure storage.

The code is available HERE

Read on for the description of how everything works.

Comments closed

Ensembling Churn Prediction Techniques

Salman Khan gloms together multiple trained models to solve a churn prediction problem:

Historically, this domain has leaned on traditional statistical models, including logistic regression and decision trees. These methodologies sift through historical customer data to identify indicators predictive of future service discontinuation. Although these methods have demonstrated resilience over time, their adequacy is increasingly being questioned. In this regard, ensemble learning emerges as a sophisticated alternative, offering enhanced precision and reliability in identifying potential customer attrition.

Ensemble learning, in turn, distinguishes itself by simultaneously employing multiple predictive models to refine accuracy. This article, thus, aims to elucidate how ensemble learning can revolutionize the approach to churn prediction: we will explore various techniques such as Random Forest, Gradient Boosting, and Stacking, illustrating their efficacy in predicting customer churn through pragmatic examples.

Read on for an introduction to ensemble learning and some high-level tips to keep in mind when ensembling.

Comments closed

Firewalls and TLS in SQL Server on Linux

I have a new video out:

In this video, we harden our SQL Server instance in two ways: by using a firewall to limit inbound traffic, and by using a certificate to force encrypted connections to SQL Server.

This was a video I enjoyed creating. It also shows the progress of SQL Server security: go back to 2005 (pre-SP1) and even SQL authentication over TDS was unencrypted by default. They fixed it so that the authentication would use a self-signed cert but the data you’d get back from query results was unencrypted. Nowadays, encryption is easy (if you’re okay with a self-signed cert) and some future version of SQL Server will make it mandatory.

Comments closed

Redgate State of the Database Landscape Results

Louis Davidson review the results:

Every year, Redgate surveys technologists to ask a big question (through lots of little questions, naturally.) This year’s question was about their current data platform configuration and usage. Just before it was released, I read the results, and I have to say, some of the things I learned amazed me…until I thought a bit more about it.

Read on for what amazed Louis and then check out the survey results yourself.

Comments closed