Python – Page 4 – Curated SQL

Data Conversion via Generative AI

Published 2025-04-15 by Kevin Feasel

The DM-32 is a Digital Mobile Radio (DMR) as well as an analog radio. You can follow the link to understand all that DMR represents when talking radios. I want to focus on the fact that you have to program the behaviors into a DMR radio. While the end result is identical for every DMR radio, how you get there, the programming software, is radically different for every single radio (unless you get a radio that supports open source OpenGD77, yeah, playing radio involves open source as well). Which means, if I have more than one DMR radio (I’m currently at 7, and no, I don’t have a problem, shut up) I have more than one Customer Programming Software (CPS) that is completely different from other CPS formats. Now, I like to set up my radios similarly. After all, the local repeaters, my hotspot, and the Talkgroups I want to use are all common. Since every CPS is different, you can’t just export from one and import to the next. However, I had the idea of using AI for data conversion. Let’s see how that works.

Click through for the scenario as well as Grant’s results. Grant’s results were pretty successful for a data mapping operation, though choice of model and simplicity of the input and output examples are important to generate the Python code.

Comments closed

Loading Data from Pandas into Snowflake

Published 2025-04-04 by Kevin Feasel

Anil Kumar Moka loads some data:

Loading data into Snowflake is a common need. Using Python and pandas is a common go-to solution for data professionals. Whether you’re pulling data from a relational database, wrangling a CSV file, or prototyping a new pipeline, this combination leverages pandas’ intuitive data manipulation and Snowflake’s cloud-native scalability. But let’s be real—data loading isn’t always a simple task.

Files go missing, connections drop, and type mismatches pop up when you least expect them. That’s why robust error handling isn’t just nice-to-have; it’s essential for anything you’d trust in production. In this guide, we’ll walk through the fundamentals of getting data into Snowflake, explore practical examples with pandas and SQLAlchemy, and equip you with the tools to build a dependable, real-world-ready pipeline. Let’s dive in and make your data loading process as smooth as possible!

Read on for a quick primer around data loading and some of the sanity checking we should be doing along the way.

Comments closed

The Power of Virtual Environments in Python

Published 2025-04-02 by Kevin Feasel

I have a new video:

In this video, I explain why virtual environments are such an important concept in Python and why you should generally be using them. I also talk about virtual environments versus Docker containers and how these are not mutually exclusive.

It took me a while to understand why virtual environments make sense, and I think part of the difficulty in adapting to this mental model was that I was used to the .NET mechanism for package management: per-project library installation. Sure, there was the Global Assembly Cache (GAC) in .NET Framework and that had similar problems to installing packages in base Python installations, but we didn’t use it that often. Or at least, I’ve sublimated however many hours of pain I fought the GAC to the point that I don’t remember them anymore.

Comments closed

Fine-Tuning a DistilBERT Model for Question Answering

Published 2025-04-02 by Kevin Feasel

Muhammad Asad Iqbal Khan builds upon a simple model:

The transformers library provides a clean and well-documented interface for many popular transformer models. Not only it makes the source code easier to read and understand, it also provided a standardize way to interact with the model. You have seen in the previous post how to use a model such as DistilBERT for natural language processing tasks. In this post, you will learn how to fine-tune the model for your own purpose. This expands the use of the model from inference to training. Specifically, you will learn:

How to prepare the dataset for training

How to train a model using a helper library

DistilBERT is a major simplification of BERT, but it comes with the advantage that it’s very easy to train on modest hardware and performance is in the same realm of acceptability as the full BERT model. Switching from DistilBERT to BERT isn’t as easy as just swapping out model classes, though it’s pretty close.

Comments closed

Deploying and Using Custom Python Libraries in Microsoft Fabric

Published 2025-04-02 by Kevin Feasel

Miles Cole picks up from part one:

This is part 2 of my prior post that continues where I left off. I previously showed how you can use Resource folders in either the Notebook or Environment in Microsoft Fabric to do some pretty agile development of Python modules/libraries.

Now, how exactly can you package up your code to distribute and leverage it across multiple Workspaces or Environment items? How could we acomplish something like the below?

Read on for the answer.

Comments closed

Building a Simple Microservice with Azure Functions

Published 2025-03-31 by Kevin Feasel

Temidayo Omoniyi takes us through an example of creating a microservice:

Today’s architecture is serverless intensive, with multiple microservices performing a particular task. Industries are beginning to move away from traditional monolithic applications, which have a single large codebase infrastructure handling everything, to an easier microservice approach.

Click through for a primer on serverless architecture, microservices, and how to create a simple Python app that acts as a microservice.

Comments closed

Converting CSV Files to Parquet Format

Published 2025-03-24 by Kevin Feasel

Michael Mayer does a bit of file conversion:

Conversion from CSV to Parquet in streaming mode? No problem for the two power houses Polars and DuckDB. We can even throw in some data preprocessing steps in-between, like column selection, data filters, or sorts.

This is certainly not the only way to perform the task, though it’s fast and effective.

Comments closed

Time-Saving Features in Scikit-Learn

Published 2025-03-20 by Kevin Feasel

Cornelius Yudha Wijaya describes a half-dozen functions:

For many people studying data science, Scikit-Learn is often the first machine learning library they encounter. It’s because Scikit-Learn offers various APIs that are useful for model development while still being easy for beginners to use.

As helpful as they may be, many features from Scikit-Learn are rarely explored and have untapped potential. This article will explore six lesser-known features that will save you time.

The calibration curve function, in particular, drew my attention, especially as I had written that by hand in the past.

Comments closed

Writing Data into a Microsoft Fabric Lakehouse via Notebook

Published 2025-03-12 by Kevin Feasel

Stepan Resl writes some code:

Since Lakehouse is one of the key items within Microsoft Fabric, it is important to know how to write data into it in various formats and using different tools. One of the most common tools is notebooks, as they provide great flexibility and speed for development and testing with graphical outputs. In this article, I want to focus primarily on the following types of notebooks:

PySpark

Python

Click through to see how it works in both notebook types.

Comments closed

Retrieving Microsoft Fabric Items using a Python-Only Notebook

Published 2025-03-11 by Kevin Feasel

Gilbert Quevauvilliers doesn’t need Spark for this:

This blog below explains how to use a Python only notebook to get all the Fabric items using the Fabric REST API.

NOTE: At the time of this blog post Feb 2025, Dataflow Gen2 is not included in the Fabric items, I am sure it will be there in the future.

NOTE II: This only gets the Fabric Items, which does not include the Power BI Items.

Despite the notes, Gilbert leads off with the main reason why you might want to use this: it takes up approximately 5% of the capacity units that a Spark-based notebook does to perform the same operation.

Comments closed

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

Category: Python