Category: Python

Bulk Loading Data with mssql-python

Chad Callihan loads some data:

I’ve had some projects in the past that involved using Python to load data in SQL Server. It wasn’t unbearably slow, but it seemed like a process that could be faster. For that reason, a recent SQL Server blog post about bulk loading data with Python caught my eye. I decided to test out the new mssql-python 1.4.0 mentioned in that post and see how much of an impact it would make on loading speed.

Chad saw about a 10x improvement in performance. I’ve had some similar results in production environments. The mssql-python library is a legitimate improvement over the classic ODBC driver and pyodbc.

Training, Serving, and Deploying Scikit-Learn Models via FastAPI

Abid Ali Awan serves a model:

In this article, you will learn how to train a Scikit-learn classification model, serve it with FastAPI, and deploy it to FastAPI Cloud.

Topics we will cover include:

  • How to structure a simple project and train a Scikit-learn model for inference.
  • How to build and test a FastAPI inference API locally.
  • How to deploy the API to FastAPI Cloud and prepare it for more production-ready usage.

Click through for the process.

Zero-Shot Text Classification in Python

Abid Ali Awan doesn’t have time to train:

In this article, you will learn how zero-shot text classification works and how to apply it using a pretrained transformer model.

Topics we will cover include:

  • The core idea behind zero-shot classification and how it reframes labeling as a reasoning task.
  • How to use a pretrained model to classify text without task-specific training data.
  • Practical techniques such as multi-label classification and hypothesis template tuning.

This typically works best when the classes are distinct and few in number. Once you get past several classes, the likelihood of spurious results increases considerably, and that’s when you’re back to training or fine-tuning a model on a sufficient quantity of labeled data.

Comparing Techniques for Text Featurization in Classification Problems

Iván Palomares Carrascosa tries a few things:

In this article, you will learn how Bag-of-Words, TF-IDF, and LLM-generated embeddings compare when used as text features for classification and clustering in scikit-learn.

Topics we will cover include:

  • How to generate Bag-of-Words, TF-IDF, and LLM embeddings for the same dataset.
  • How these representations compare on text classification performance and training speed.
  • How they behave differently for unsupervised document clustering.

Click through for results. Granted, the specific embedding model can alter the quality of results, but even so, I do enjoy the comparison of techniques and the reminder that neural networks aren’t the ultimate solution to everything.
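For a rough sense of how the first two representations differ, here is a minimal standard-library sketch. The toy corpus, the function names, and the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` are assumptions chosen for illustration and are not taken from the article, which works in scikit-learn.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration.
DOCS = [
    "the database is slow",
    "the query is slow",
    "python loads the data",
]


def tokenize(text):
    """Lowercase and split on whitespace; real pipelines do more."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors), one vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight raw counts by a smoothed inverse document frequency."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = [sum(1 for vec in counts if vec[i] > 0) for i in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    weighted = [[c * w for c, w in zip(vec, idf)] for vec in counts]
    return vocab, weighted


vocab, bow = bag_of_words(DOCS)
_, weighted = tf_idf(DOCS)
the_idx, python_idx = vocab.index("the"), vocab.index("python")
print(bow[2][the_idx], bow[2][python_idx])              # 1 1
print(weighted[2][the_idx] < weighted[2][python_idx])   # True
```

In the third document, "the" and "python" have the same raw count, but TF-IDF gives "python" more weight because it appears in fewer documents. Embeddings are left out here because they need a pretrained model.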

Web Scraping with Python

Jason Yousef has a script:

Below is a production-friendly pattern that:

  • Uses a requests.Session with retries, backoff, and a real User-Agent
  • Sets sane timeouts and handles common HTTP errors
  • Respects robots.txt (and tells you if scraping is disallowed)
  • Parses only mailto: links by default to avoid scraping personal data you shouldn’t
  • Handles pagination with a “Next” link when present
  • Exports to CSV
  • Can be run from the command line with arguments

Click through for the code, some explanation of how it works, and a few tips.
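Two of those bullet points, the robots.txt check and the mailto-only parsing, can be sketched with nothing but the standard library. This is not Jason's script: the user agent string, the function names, and the sample inputs are hypothetical, and the sketch works on text that has already been downloaded, so it makes no network requests.

```python
from html.parser import HTMLParser
from urllib.parse import unquote
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-scraper/0.1"  # hypothetical agent name


def build_robots(robots_txt):
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class MailtoCollector(HTMLParser):
    """Collect addresses from <a href="mailto:..."> links only."""

    def __init__(self):
        super().__init__()
        self.emails = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            # Drop any ?subject=... suffix and decode percent-escapes.
            address = unquote(href[len("mailto:"):].split("?", 1)[0])
            if address and address not in self.emails:
                self.emails.append(address)


def extract_mailto(html):
    """Return the unique mailto addresses in the order they appear."""
    collector = MailtoCollector()
    collector.feed(html)
    return collector.emails


robots = build_robots("User-agent: *\nDisallow: /private/\n")
page = '<a href="mailto:info@example.com?subject=Hi">Mail us</a> <a href="/page/2">Next</a>'
if robots.can_fetch(USER_AGENT, "https://example.com/staff"):
    print(extract_mailto(page))  # ['info@example.com']
```

Addresses that appear in plain body text are ignored on purpose, which matches the "mailto: links by default" behavior the list describes.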

Using the mssql-python Driver

Hristo Hristov tries out a driver:

Programmatic interaction with SQL Server or Azure SQL from a Python script is possible using a driver. A popular driver has been pyodbc, which can be used standalone or with a SQLAlchemy wrapper. SQLAlchemy on its own is the Python SQL toolkit and Object Relational Mapper for developers. At the end of 2025, Microsoft released v1 of its own Python SQL driver, called mssql-python. How do you get started using mssql-python for programmatic access to your SQL Server?

Click through to see how it works. Hristo points out a couple of benefits to this driver over the classic pyodbc driver, though I’m curious if there are any performance differences between the two.

The Downsides of Python

Andy Brown writes a companion piece:

Four years ago I wrote a blog on this site explaining why Python is better than C# and, arguably, most other programming languages. To redress the balance, here are 10 reasons why you might want to avoid getting caught up in Python’s oh-so-tempting coils – particularly when building large, long-lived systems.

If this sounds like an attempt to have my cake and eat it, my defense is that I follow in my work what I preach here: I use Python for ad-hoc jobs, at which it is unsurpassed. For larger systems – such as our MV website – I use C#, due to its strengths in maintainability, tooling as well as the practical consideration that my personal preference for Visual Basic is not shared by the wider team.

Some of it is opinion, some of it is annoying. I’ve grown to appreciate the spacing, though it can be really painful when copying code from somewhere and the spacing gets all messed up. My short version of Python is that it requires you to have more discipline as a developer to prevent messes from occurring, and I think that’s a negative on net. But that same aspect simultaneously makes it so much easier to prototype and rapidly solve problems, so there’s a natural trade-off here.

Fixtures in Pytest

Jason Yousef shows off a capability in Pytest:

Pytest is one of those tools that feels obvious after you’ve used it for a bit. Tests are just functions. Assertions read like normal Python. And when you need context—database sessions, config, mock data—you reach for fixtures instead of duct tape.

Read on to see how they work. Admittedly, I don’t think I’ve used fixtures in Pytest before, but now seems like a good time to try them.
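As a small illustration of the idea, here is a sketch of a fixture that hands each test its own in-memory SQLite connection. The table, the helper names, and the sample rows are invented for this example and do not come from Jason's post.

```python
import sqlite3

import pytest


def open_test_db():
    """Create an in-memory SQLite database with one sample row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES ('Ada')")
    conn.commit()
    return conn


@pytest.fixture
def db():
    """Hand each test a fresh connection, then close it afterward."""
    conn = open_test_db()
    yield conn    # the test body runs here
    conn.close()  # teardown runs whether the test passes or fails


def count_users(conn):
    """Return the number of rows in the users table."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def test_starts_with_one_user(db):
    assert count_users(db) == 1


def test_insert_is_isolated(db):
    # This insert is invisible to other tests: each one gets a new database.
    db.execute("INSERT INTO users (name) VALUES ('Grace')")
    assert count_users(db) == 2
```

Saving this as a file whose name starts with `test_` and running `pytest` against it should collect both tests. Pytest matches the `db` parameter name to the fixture, so the tests never call `open_test_db()` themselves.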

Hosting an ML Model with FastAPI

Kanwal Mehreen hosts a model:

In this article, you will learn how to package a trained machine learning model behind a clean, well-validated HTTP API using FastAPI, from training to local testing and basic production hardening.

Topics we will cover include:

  • Training, saving, and loading a scikit-learn pipeline for inference
  • Building a FastAPI app with strict input validation via Pydantic
  • Exposing, testing, and hardening a prediction endpoint with health checks

Let’s explore these techniques. 

I definitely enjoy how simple it is to use FastAPI.

A Primer on Data Analysis with Python and SQL Server

Eduardo Pivaral shows off a few examples of analysis techniques:

With the rise of cloud, automation and managed services, the role of the Database Administrator has pivoted towards Data Engineering. The focus is to maintain, secure, and cleanse data in order for data analysis and decision making by the business.

How can we start using modern data analysis tools with our current SQL Server infrastructure? Further, how can we start providing end users and decision makers with important insights about our data, without spending extra money on enterprise data analysis tools?

Click through for demonstrations of k-means clustering for discerning categorical groups of data, simple demand forecasting, and generating customer segments.
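To show what k-means does with customer data, here is a minimal pure-Python sketch of Lloyd's algorithm. It is not Eduardo's code: the customer figures are invented, and seeding from the first `k` points is a simplification that keeps the run repeatable. A real analysis would more likely use a library implementation and scale the features first.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    The first k points are the starting centroids so the run is
    repeatable; libraries use smarter seeding such as k-means++.
    """
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (orders per year, average order value).
CUSTOMERS = [(1, 20), (2, 25), (1, 22), (10, 200), (11, 210), (12, 190)]

labels, centroids = kmeans(CUSTOMERS, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The two groups separate into low-volume and high-volume customers, which is the kind of segment the article then interprets.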
