
Day: November 28, 2022

MLflow 2.0 Now Available

Mike Cornell announces MLflow 2.0:

Today, we are thrilled to announce the availability of MLflow 2.0. Building upon MLflow’s strong platform foundation, MLflow 2.0 incorporates extensive user feedback to simplify data science workflows and deliver innovative, first-class tools for MLOps. Features and improvements include extensions to MLflow Recipes (formerly MLflow Pipelines) such as AutoML, hyperparameter tuning, and classification support, as well as modernized integrations with the ML ecosystem, a streamlined MLflow Tracking UI, a refresh of core APIs across MLflow’s platform components, and much more.

I like a lot of what MLflow does; it’ll be interesting to see how quickly different products adopt 2.0.


Fun with Decision Trees

Holger von Jouanne-Diedrich explains the value of decision trees, using predictive maintenance as an example:

Predictive Maintenance is one of the big revolutions happening across all major industries right now. Instead of changing parts on a regular schedule, or only after they have failed, it uses Machine Learning methods to predict when a part is going to fail.

If you want to get an introduction to this fascinating developing area, read on!

Click through for an example of how it works.
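
To give a rough feel for what a decision tree does with this kind of data, here is a minimal Python sketch of a one-level tree (a "decision stump") that picks the split with the lowest Gini impurity. The sensor readings, the labels, and the function names are all made up for illustration; this is not the code or the data from the linked post.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means every label matches)."""
    if not labels:
        return 0.0
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best single split.

    Candidate thresholds are the midpoints between consecutive distinct values.
    If no split improves on the unsplit data, the threshold is None.
    """
    distinct = sorted(set(values))
    best = (None, gini(labels))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical data: a vibration reading per part and whether the part failed.
vibration = [0.2, 0.3, 0.4, 0.5, 0.9, 1.1, 1.3, 1.4]
failed = ["ok", "ok", "ok", "ok", "fail", "fail", "fail", "fail"]

threshold, impurity = best_split(vibration, failed)


def predict(reading):
    """Classify a new reading with the one-level tree learned above.

    In this made-up data set the high-vibration side is the failing side.
    """
    return "fail" if reading > threshold else "ok"
```

A real tree repeats this split recursively on each side and usually considers several features at once, but the core idea of choosing the cleanest split is the same.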


Reading Serverless SQL Pool Data with Data Factory

Koen Verbeeck wants to read from the serverless SQL pool in Azure Synapse Analytics:

We have some data we can query using the serverless SQL pools in Azure Synapse Analytics. For this blog post, I’m querying data that is stored in Azure Cosmos DB. Read the blog post How to Store Normalized SQL Server Data into Azure Cosmos DB to learn more about how that data got there.

Suppose I now want to read the data using Azure Data Factory. You can read data from Cosmos DB directly, but let’s pretend I want to do some transformations first using my favorite language: SQL. How can we do this?

Read on to learn how.


Hyperconverged Storage and Trace Flags

David Klee has a tip for us:

We all (should) know that running SQL Server in hyperconverged virtual environments, both on-premises and in the cloud, has some interesting trade-offs. The biggest is write latency from the hyperconverged storage platform underneath the database. We find that write latency suffers compared to traditional all-flash storage, even if the hyperconverged layer is all-flash as well, due to how the hyperconverged layer handles the larger block write that the SQL Server engine drops on it.

Read on for a trace flag which could help here.


Just Enough Administration and Granting Access to SQL Server

Andrew Pruski tries out a tool:

We’ve all been there as DBAs…people requesting access to the servers that we look after to be able to view certain things.

I’ve always got, well, twitchy with giving access to servers tbh…but what if we could completely restrict what users could do via PowerShell?

Enter Just Enough Administration. With JEA we can grant remote access via PowerShell sessions to servers and limit what users can do.

Click through to see how it works.


Partitioning Data in Power BI

Paul Turley continues a series on working with large amounts of data in Power BI:

You don’t have to have massive tables to benefit from partitioning. Even tables with a few hundred thousand records can benefit from partitioning, to improve data refresh performance and to detect source data changes. There is little maintenance overhead, so the benefits usually outweigh the cost, in terms of effort and management.

Click through for Paul’s thoughts on the topic.
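
As a conceptual illustration of why partitioning helps with refresh, here is a small Python sketch that groups rows by month, fingerprints each partition, and reports only the partitions whose contents changed. It is an assumption-laden toy and not how Power BI implements partition refresh or change detection; the row layout and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def partition_by_month(rows):
    """Group rows into partitions keyed by 'YYYY-MM' taken from the date column."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["date"][:7]].append(row)
    return dict(partitions)


def fingerprint(rows):
    """Hash a partition's rows in a stable order so any change alters the digest."""
    digest = hashlib.sha256()
    for row in sorted(rows, key=lambda r: sorted(r.items())):
        digest.update(repr(sorted(row.items())).encode("utf-8"))
    return digest.hexdigest()


def partitions_to_refresh(previous_fingerprints, rows):
    """Return (changed partition keys, current fingerprints) for the source rows."""
    current = {key: fingerprint(part) for key, part in partition_by_month(rows).items()}
    changed = sorted(
        key for key, value in current.items() if previous_fingerprints.get(key) != value
    )
    return changed, current


# Hypothetical source data: the first load sees two months of rows.
rows_v1 = [
    {"date": "2022-10-03", "amount": 10},
    {"date": "2022-10-17", "amount": 25},
    {"date": "2022-11-05", "amount": 40},
]
first_changed, baseline = partitions_to_refresh({}, rows_v1)

# A later load adds one November row, so only that partition needs a refresh.
rows_v2 = rows_v1 + [{"date": "2022-11-28", "amount": 15}]
changed, latest = partitions_to_refresh(baseline, rows_v2)
```

The point of the sketch is that the October partition is left alone on the second load, which is where the refresh savings come from even on modest tables.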


Performance-Killing Pre-Emptive Waits

Sean Gallardy finds the real killer:

If you haven’t already read up on cooperative and preemptive scheduling or aren’t sure what those are, please read the intro to that first, otherwise you’ll be lost.

Much as I’ve discussed before, SQL Server uses a cooperative scheduling model. This doesn’t mean that Windows does, nor does it mean Windows will schedule whatever SQL Server schedules. In fact, much of the time there are many other threads that run before the ones from SQL Server; that’s the job of the operating system to figure out. Because SQL Server uses cooperative scheduling, there needs to be a mechanism such that, when a resource not under SQL Server’s control needs interaction, the scheduler can keep on scheduling and threads can switch in and out (in SQL Server; Windows does what Windows wants). Enter preemptive status and associated waits.

Click through for a deep dive on the topic.
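
For a rough mental model of the idea in that excerpt, here is a toy Python sketch of a cooperative scheduler built on generators. Tasks give up control voluntarily by yielding, and a task that needs something outside the scheduler's control is parked so the scheduler can keep running everything else. This is only an illustration under my own assumptions: the tick-based timing, the event names, and the function names are invented, and none of it reflects how SQL Server's scheduler is actually implemented.

```python
from collections import deque


def worker(name, steps):
    """A cooperative task: do one unit of work, then yield voluntarily."""
    for i in range(steps):
        yield ("ran", f"{name} step {i + 1}")


def worker_with_external_call(name, external_ticks):
    """A task that has to wait on something outside the scheduler's control."""
    yield ("ran", f"{name} before external call")
    # Tell the scheduler this task is leaving cooperative mode for a while.
    yield ("preemptive", external_ticks)
    yield ("ran", f"{name} after external call")


def run(tasks):
    """Round-robin over tasks, parking any task that enters a preemptive wait."""
    runnable = deque(tasks)
    waiting = []  # (remaining_ticks, task) pairs parked in a preemptive wait
    log = []
    while runnable or waiting:
        # One loop iteration is one tick. External work progresses on its own,
        # regardless of what the scheduler chooses to run.
        still_waiting = []
        for remaining, task in waiting:
            if remaining <= 1:
                runnable.append(task)  # external work finished, task is runnable again
            else:
                still_waiting.append((remaining - 1, task))
        waiting = still_waiting
        if not runnable:
            log.append("idle")
            continue
        task = runnable.popleft()
        try:
            kind, payload = next(task)
        except StopIteration:
            continue  # task finished; nothing to requeue
        if kind == "preemptive":
            waiting.append((payload, task))
        else:
            log.append(payload)
            runnable.append(task)
    return log


log = run([worker_with_external_call("A", 3), worker("B", 4)])
```

In this run, task B keeps making progress while task A sits in its external wait, which is the whole reason for having a separate preemptive status.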


RCSI and Blocking

Michael J. Swart says don’t worry, be happy:

What’s the best way to avoid most blocking issues in SQL Server? Turn on Read Committed Snapshot Isolation (RCSI). That’s it.

Do check out Erik Darling’s comment as well for one thing to keep in mind if you turn on RCSI.

The other thing to keep in mind is that, if you have WITH (NOLOCK) hints hanging around everywhere in your code, you won’t get as much of a benefit from RCSI until you remove them.
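
To show why row versioning sidesteps reader-writer blocking, here is a toy Python sketch contrasting a locking read with a snapshot-style read of a single row. It is a simplified model under my own assumptions, with a hypothetical class and method names, and not a description of SQL Server internals. In SQL Server itself, RCSI is a database-level setting (READ_COMMITTED_SNAPSHOT).

```python
class VersionedRow:
    """One row that keeps its last committed value next to any uncommitted change."""

    def __init__(self, value):
        self.committed = value
        self.pending = None        # uncommitted value held by an open writer
        self.writer_open = False

    def begin_update(self, new_value):
        """A writer changes the row but has not committed yet."""
        self.pending = new_value
        self.writer_open = True

    def commit(self):
        """The writer commits, so the pending value becomes the committed one."""
        self.committed = self.pending
        self.pending = None
        self.writer_open = False

    def read_locking(self):
        """Locking read: the reader cannot get past an open writer."""
        if self.writer_open:
            return "BLOCKED"
        return self.committed

    def read_snapshot(self):
        """Versioned read: return the last committed value without waiting."""
        return self.committed


row = VersionedRow(100)
row.begin_update(250)

during_locking = row.read_locking()    # the reader would have to wait here
during_snapshot = row.read_snapshot()  # the reader sees the last committed value

row.commit()
after_commit = row.read_snapshot()
```

The snapshot reader never sees the uncommitted 250 and never waits for the writer; it simply gets the older committed value until the commit lands.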
