Day: April 2, 2025

Fine-Tuning a DistilBERT Model for Question Answering

Muhammad Asad Iqbal Khan builds upon a simple model:

The transformers library provides a clean and well-documented interface for many popular transformer models. Not only does it make the source code easier to read and understand, it also provides a standardized way to interact with the model. You have seen in the previous post how to use a model such as DistilBERT for natural language processing tasks. In this post, you will learn how to fine-tune the model for your own purpose. This expands the use of the model from inference to training. Specifically, you will learn:

  • How to prepare the dataset for training
  • How to train a model using a helper library

DistilBERT is a major simplification of BERT, but it comes with the advantage that it’s very easy to train on modest hardware and performance is in the same realm of acceptability as the full BERT model. Switching from DistilBERT to BERT isn’t as easy as just swapping out model classes, though it’s pretty close.
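
To give a feel for the dataset preparation step, here is a minimal sketch of one piece of it: turning a character-level answer position into start and end token indices, which is the label format extractive question answering models train on. The regex tokenizer and the function names are stand-ins I made up for illustration. A real fine-tuning run would use the model's own subword tokenizer (fast tokenizers in the transformers library can return character offsets via return_offsets_mapping=True), and the linked post may prepare its data differently.

```python
import re


def tokenize_with_offsets(text):
    """Toy tokenizer (an assumption, not DistilBERT's): words and punctuation with char offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def answer_token_span(context, answer_text, answer_start):
    """Return (first_token, last_token) indices overlapping the answer, or None if no overlap."""
    answer_end = answer_start + len(answer_text)
    tokens = tokenize_with_offsets(context)
    # A token belongs to the answer if its character range overlaps the answer's range.
    covered = [i for i, (_, start, end) in enumerate(tokens)
               if start < answer_end and end > answer_start]
    if not covered:
        return None
    return covered[0], covered[-1]


context = "The transformers library provides an interface for many transformer models."
answer = "transformers library"
start_char = context.index(answer)  # character position, as stored in SQuAD-style datasets
span = answer_token_span(context, answer, start_char)
print(span)
```

The same overlap test carries over to subword tokens; only the source of the offsets changes.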

The Power of Virtual Environments in Python

I have a new video:

In this video, I explain why virtual environments are such an important concept in Python and why you should generally be using them. I also talk about virtual environments versus Docker containers and how these are not mutually exclusive.

It took me a while to understand why virtual environments make sense, and I think part of the difficulty in adapting to this mental model was that I was used to the .NET mechanism for package management: per-project library installation. Sure, there was the Global Assembly Cache (GAC) in .NET Framework and that had similar problems to installing packages in base Python installations, but we didn’t use it that often. Or at least, I’ve sublimated however many hours of pain I spent fighting the GAC to the point that I don’t remember them anymore.
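
For anyone coming from the per-project model, a virtual environment is roughly the Python equivalent: a directory that sits next to the project and holds its own interpreter configuration and packages. Here is a minimal sketch using the standard library's venv module. The .venv directory name and the create_project_env helper are just conventions I picked for the example, and I pass with_pip=False only so the sketch runs quickly without bootstrapping pip.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir):
    """Create an isolated environment in <project_dir>/.venv and return its path."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False skips installing pip; set it to True for a usable project environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_project_env(tmp)
    # pyvenv.cfg is the marker file that tells Python this directory is a virtual environment.
    has_config = (env_dir / "pyvenv.cfg").is_file()
    print(has_config)
```

From a terminal, python -m venv .venv does the same thing with pip included.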

The New Fabric CLI

Hasan Abo Shally announces a CLI:

  • The Fabric CLI is now in preview
  • It offers a developer-first, file-system-inspired way to explore and manage Microsoft Fabric
  • Use it interactively or script it into your workflows — from your terminal, in seconds
  • Built on Fabric APIs, designed for automation, and constantly evolving
  • Open source is on the horizon — with plans to empower the community to extend and shape the CLI

Give it a try. Break things. Tell us what you want next.

Click through for the full announcement. The idea here is to be the az cli for Fabric. Between this and Semantic Link Labs, it will make automating tasks in Microsoft Fabric easier.

A New Dashboard for Distributed Availability Groups

David Fowler has been busy:

This comes off the back of my last post, SQL Server Migration Using a Distributed Availability Group, which looked at using a distributed availability group (DAG) to help facilitate a SQL Server migration.

One thing that I mentioned in that post was that, although SSMS gives us a nice dashboard to check the health of our regular AGs, there’s nothing there to look at the state that the DAGs are in. The only choice that we’ve got is to tap up and compare results from a couple of DMVs on each side.

David has met that demand. Read on to see what the solution includes and how you can get your hands on it.
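
To make the manual alternative concrete, here is a rough sketch of the "compare results on each side" chore. It is not David's solution. It assumes you have already queried each side (for example, from sys.dm_hadr_database_replica_states) and shaped the rows into dictionaries. The dictionary keys, the sample values, and the compare_dag_sides function are all hypothetical, and the raw LSN difference is only an indicator that one side is behind, not a measure of how much data is outstanding.

```python
def compare_dag_sides(global_primary_rows, forwarder_rows):
    """Pair up databases from both sides of a DAG and flag any that differ.

    Each row is a dict with hypothetical keys: 'database' and 'last_hardened_lsn'.
    Returns a list of (database, status, lsn_gap) tuples.
    """
    forwarder_by_db = {row["database"]: row for row in forwarder_rows}
    report = []
    for row in global_primary_rows:
        other = forwarder_by_db.get(row["database"])
        if other is None:
            report.append((row["database"], "MISSING ON FORWARDER", None))
            continue
        gap = row["last_hardened_lsn"] - other["last_hardened_lsn"]
        report.append((row["database"], "IN SYNC" if gap == 0 else "BEHIND", gap))
    return report


# Made-up sample rows standing in for query results from each side.
primary_side = [
    {"database": "Sales", "last_hardened_lsn": 5000},
    {"database": "Audit", "last_hardened_lsn": 7200},
    {"database": "Staging", "last_hardened_lsn": 100},
]
forwarder_side = [
    {"database": "Sales", "last_hardened_lsn": 5000},
    {"database": "Audit", "last_hardened_lsn": 7000},
]
report = compare_dag_sides(primary_side, forwarder_side)
for line in report:
    print(line)
```

Doing this by hand on every check is exactly the kind of work a dashboard removes.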

Calling a Microsoft Fabric REST API via Azure Data Factory

Koen Verbeeck makes the call:

Suppose you want to call a certain Microsoft Fabric REST API endpoint from Azure Data Factory (or Synapse Pipelines). This can be done using a Web Activity, and most Fabric APIs now support service principals or managed identities. Let’s illustrate with an example. I’m going to call the REST API endpoint to create a new lakehouse. 

Click through for the instructions.
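
For a sense of what the Web Activity ends up sending, here is a minimal sketch of the equivalent HTTP request built in Python. It only constructs the request and does not send it. The workspace ID, lakehouse name, and token are placeholders, the build_create_lakehouse_request helper is my own name, and in Data Factory the bearer token would come from the activity's service principal or managed identity settings rather than being passed in by hand. Check the Fabric REST API reference for the current request body before relying on this.

```python
import json
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_create_lakehouse_request(workspace_id, display_name, access_token):
    """Build (but do not send) a POST request to create a lakehouse in a workspace."""
    url = f"{FABRIC_API}/workspaces/{workspace_id}/lakehouses"
    body = json.dumps({"displayName": display_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_create_lakehouse_request(
    workspace_id="00000000-0000-0000-0000-000000000000",
    display_name="demo_lakehouse",
    access_token="<token>",
)
print(request.get_method(), request.full_url)
```

In the Web Activity, the URL, method, and body map onto the activity's settings of the same names.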

Deploying and Using Custom Python Libraries in Microsoft Fabric

Miles Cole picks up from part one:

This is part 2 of my prior post that continues where I left off. I previously showed how you can use Resource folders in either the Notebook or Environment in Microsoft Fabric to do some pretty agile development of Python modules/libraries.

Now, how exactly can you package up your code to distribute and leverage it across multiple Workspaces or Environment items? How could we accomplish something like the below?

Read on for the answer.

Working around Errors Migrating to Azure SQL Managed Instance

Ben Johnston has an after-action report:

I was recently on a project to migrate a very transactional installation of SQL Server to Azure SQL Managed Instance (MI). SQL Managed Instance is a good stepping stone between a full, on-prem SQL instance / Azure VM and an Azure SQL Database. It has most of the functionality of a full, on-prem instance, with management of the SQL engine, backups, OS and underlying hardware done by Microsoft. It allows you to use cross database queries and run SQL Agent jobs, with fewer limitations than Azure SQL Database migrations.

The migration process isn’t completely seamless. During the migration of this system, we encountered several surprises. Hopefully, this will help you avoid, or at least be prepared for, these differences from the on-prem version. This also reinforces the importance of testing each aspect of your migration.

This is part one of a two-parter and focuses on issues during the deployment process. Ben promises a follow-up with post-deployment issues you could run into. I expect that’s where the “What is this performance?” issues will come into play.
