Python – Page 5 – Curated SQL

Local Text Summarization via DistilBart

Published 2025-03-10 by Kevin Feasel

Muhammad Asad Iqbal Khan summarizes a document:

Text summarization represents a sophisticated evolution of text generation, requiring a deep understanding of content and context. With encoder-decoder transformer models like DistilBart, you can now create summaries that capture the essence of longer text while maintaining coherence and relevance.

In this tutorial, you’ll discover how to implement text summarization using DistilBart. You’ll learn through practical, executable examples, and by the end of this guide, you’ll understand both the theoretical foundations and hands-on implementation details.

Click through for the article.
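
As a rough illustration of the kind of workflow the tutorial covers, the sketch below summarizes text locally with the Hugging Face transformers summarization pipeline. It is not the article's code: the checkpoint name sshleifer/distilbart-cnn-12-6, the word-based chunking helper, and the length limits are all assumptions chosen for illustration.

```python
from typing import List


def chunk_words(text: str, max_words: int = 400) -> List[str]:
    """Split text into chunks of at most max_words words.

    Assumption: a crude word count stands in for the model's token limit.
    """
    if max_words < 1:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize(text: str, model_name: str = "sshleifer/distilbart-cnn-12-6") -> str:
    """Summarize each chunk separately and join the partial summaries."""
    # Imported lazily so the chunking helper works without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_name)
    parts = []
    for chunk in chunk_words(text):
        # Assumed length limits; tune them for your documents.
        result = summarizer(chunk, max_length=130, min_length=30, do_sample=False)
        parts.append(result[0]["summary_text"])
    return " ".join(parts)
```

Calling summarize(long_text) downloads the checkpoint on first use and then runs locally. Chunking by words is only an approximation, so a tokenizer-based split may be a better fit for long inputs.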

Comparing Pandas to Other Libraries for Data Processing

Published 2025-03-07 by Kevin Feasel

Vidyasagar Machupalli performs a comparison:

As discussed in my previous article about data architectures emphasizing emerging trends, data processing is one of the key components in the modern data architecture. This article discusses various alternatives to the Pandas library for better performance in your data architecture.

Data processing and data analysis are crucial tasks in the field of data science and data engineering. As datasets grow larger and more complex, traditional tools like pandas can struggle with performance and scalability. This has led to the development of several alternative libraries, each designed to address specific challenges in data manipulation and analysis.

This is by no means a comprehensive test, but it does show off quite a few libraries that perform similar actions to Pandas.

Microsoft Fabric Shortcuts and Lakehouse Maintenance

Published 2025-03-07 by Kevin Feasel

Dennes Torres has a public service announcement:

I have written about lakehouse maintenance before, including maintenance across multiple lakehouses, published videos on the subject, and provided sample code for it.

However, there is one problem: maintenance should never be executed over shortcuts.

The tables require maintenance in their original place. As our solution advances, we start using shortcuts, lots of them. Our maintenance code should always skip shortcuts and perform maintenance only on the actual tables.

Click through to see how you can differentiate shortcuts from actual tables and write code to avoid shortcuts.
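
The skip-the-shortcuts idea can be sketched in a few lines. This is not Dennes's code: it assumes you already have a list of table names and a list of shortcut descriptions, and the dictionary keys "path" and "name" are a hypothetical shape chosen for illustration. OPTIMIZE and VACUUM are the standard Delta Lake maintenance commands, and the 168-hour retention is an assumed default.

```python
from typing import Dict, Iterable, List


def tables_to_maintain(all_tables: Iterable[str],
                       shortcuts: Iterable[Dict[str, str]]) -> List[str]:
    """Return the table names that are real tables rather than shortcuts.

    Assumption: each shortcut is a dict with hypothetical keys "path"
    (for example "Tables" or "Files/raw") and "name".
    """
    shortcut_tables = {
        s.get("name", "").lower()
        for s in shortcuts
        if s.get("path", "").strip("/").lower() == "tables"
    }
    return [t for t in all_tables if t.lower() not in shortcut_tables]


def maintenance_statements(table: str, retain_hours: int = 168) -> List[str]:
    """Build Delta Lake maintenance commands for a single table."""
    return [
        f"OPTIMIZE `{table}`",
        f"VACUUM `{table}` RETAIN {retain_hours} HOURS",
    ]
```

In a notebook, each generated statement could be passed to spark.sql for every name returned by tables_to_maintain. Shortcuts under a files path are ignored here because they are not tables in the first place.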

Trying out fabric-cicd

Published 2025-02-28 by Kevin Feasel

Kevin Chant tries a Python package:

In this post I want to cover my initial tests of fabric-cicd in order to provide some tips for those looking to work with this new offering.

Just so that everybody is aware, fabric-cicd is a Python library that allows you to perform CI/CD of various Microsoft Fabric items into Microsoft Fabric workspaces. At this moment in time there is a limited number of supported item types. However, that list is increasing.

Read on for the test. It currently supports a limited amount of functionality, but it looks promising.

MLOps in Python with Vetiver

Published 2025-02-27 by Kevin Feasel

Myles Mitchell deploys a Python model:

Parts 1 to 3 introduced the {vetiver} package for R and outlined its far-reaching applications in MLOps. But did you know that this package is also available in Python? In this post we will provide a brief outline of how to get your Python models into production using vetiver for Python.

Read on for the tutorial.

Migrating or Copying a Semantic Model across Microsoft Fabric Workspaces

Published 2025-02-18 by Kevin Feasel

Sandeep Pawar makes a move:

Here is a quick script to copy a semantic model from one workspace to another in the same tenant, assuming you are contributor+ in both workspaces. I tested this for a Direct Lake model, but it should work for any other semantic model. This just copies the metadata (not the data in the model), so be sure to set up other configurations (RLS members, refresh schedule, settings, etc.). Those can also be changed programmatically, thanks to Semantic Link Labs, but I will cover that in a future post.

Read on for the script, as well as an update from Sandeep on how you can do this even more easily.

Visualizing a SQL Server Kubernetes Statefulset

Published 2025-02-07 by Kevin Feasel

Andrew Pruski builds a diagram:

The other day I came across an interesting repo on GitHub, KubeDiagrams.

What this repo does is generate Kubernetes architecture diagrams from Kubernetes manifest files…nice!

Deploying applications to Kubernetes can get complicated fast…especially with stateful applications such as SQL Server.

So having the ability to easily generate diagrams is really helpful…because we all should be documenting everything, right?

Click through for instructions and a couple of gotchas Andrew ran into along the way.

Automating V-Order Optimization in Microsoft Fabric

Published 2025-02-03 by Kevin Feasel

Miles Cole writes a script:

I’ve previously blogged in detail about V-Order optimization. In this post, I want to revisit the topic and demonstrate how V-Order can be strategically enabled in a programmatic fashion.

Since V-Order provides the most benefit and consistent improvement for Direct Lake Semantic Models, why not leverage platform metadata to enable it automatically—but only for Delta tables used by these models?

This will be a short blog—let’s get straight to the concept, the source code, and then move on to more strategic use of this feature.

Click through for the process and an explanation of what’s happening in the accompanying Gist.
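
The core idea, enabling V-Order only for Delta tables that Direct Lake models actually use, can be sketched as a simple filter. This is not the code from Miles's Gist: the two input collections are hypothetical stand-ins for whatever platform metadata you query, and the table property name delta.parquet.vorder.enabled is an assumption you should confirm against the current Fabric documentation.

```python
from typing import Iterable, List, Set

# Assumed Delta table property that turns V-Order on for a table.
VORDER_PROPERTY = "delta.parquet.vorder.enabled"


def vorder_statements(lakehouse_tables: Iterable[str],
                      direct_lake_tables: Set[str]) -> List[str]:
    """Build ALTER TABLE statements only for tables used by Direct Lake models.

    Both inputs are hypothetical: in practice they would come from
    lakehouse and semantic model metadata.
    """
    used = {t.lower() for t in direct_lake_tables}
    return [
        f"ALTER TABLE `{t}` SET TBLPROPERTIES ('{VORDER_PROPERTY}' = 'true')"
        for t in sorted(lakehouse_tables)
        if t.lower() in used
    ]
```

Tables that no Direct Lake model references, such as staging tables, produce no statement and are left as they are.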

Kernel Methods in Python

Published 2025-01-30 by Kevin Feasel

Matthew Mayo does a bit of kernel work:

Kernel methods are a powerful class of machine learning algorithms that allow us to perform complex, non-linear transformations of data without explicitly computing the transformed feature space. These methods are particularly useful when dealing with high-dimensional data or when the relationship between features is non-linear.

Kernel methods rely on the concept of a kernel function, which computes the dot product of two vectors in a transformed feature space without explicitly performing the transformation. This is known as the kernel trick. The kernel trick allows us to work in high-dimensional spaces efficiently, making it possible to solve complex problems that would be computationally infeasible otherwise.

Read on for the pros and cons of kernel methods and a pair of techniques that use them.
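
The kernel trick described in the excerpt is easy to check numerically. The sketch below is an illustration rather than code from the article: it uses a homogeneous degree-2 polynomial kernel on two-dimensional inputs (an assumed example) and shows that the kernel value matches the dot product of the explicitly transformed vectors.

```python
import math
from typing import Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def poly2_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2


def poly2_features(x: Sequence[float]) -> Tuple[float, float, float]:
    """Explicit feature map for 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x, y = (1.0, 2.0), (3.0, 4.0)
implicit = poly2_kernel(x, y)                         # never builds the features
explicit = dot(poly2_features(x), poly2_features(y))  # builds them, then dots
print(implicit, explicit)  # the two values agree up to floating-point rounding
```

Here the explicit map only has three components, but for higher degrees or more input dimensions the feature space grows quickly while the kernel still costs a single dot product.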

Near Real-Time Data Plotting in Python

Published 2025-01-16 by Kevin Feasel

Hristo Hristov wants to know where the International Space Station is:

Gathering data on events as they occur in real time is a powerful and popular technique in scientific and industrial computing. If we can query an online REST API representing the position of the International Space Station (ISS), how can we visualize these data in real time? How do you plot the data points as soon as they arrive and observe changes in the station’s position immediately? Let’s look at using Python for a real-time plot of data.

Click through for the solution and plenty of explanation along the way.
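
For a sense of what such a polling-and-plotting loop might look like, here is a minimal sketch. It is not Hristo's solution: the Open Notify endpoint, the polling interval, and the rolling window size are assumptions, and matplotlib is imported lazily so the parsing helper can be used on its own.

```python
import json
from collections import deque
from typing import Deque, Tuple
from urllib.request import urlopen

# Assumed endpoint; the article may use a different API.
ISS_URL = "http://api.open-notify.org/iss-now.json"


def parse_position(payload: str) -> Tuple[float, float]:
    """Extract (longitude, latitude) from an Open Notify style JSON payload."""
    pos = json.loads(payload)["iss_position"]
    return float(pos["longitude"]), float(pos["latitude"])


def fetch_position(url: str = ISS_URL, timeout: float = 10.0) -> Tuple[float, float]:
    """Request the current position once and parse the response."""
    with urlopen(url, timeout=timeout) as response:
        return parse_position(response.read().decode("utf-8"))


def run_live_plot(polls: int = 60, interval: float = 5.0, window: int = 200) -> None:
    """Poll the API and redraw the track after every new point."""
    import matplotlib.pyplot as plt  # lazy import: only needed for plotting

    track: Deque[Tuple[float, float]] = deque(maxlen=window)  # rolling buffer
    plt.ion()
    fig, ax = plt.subplots()
    (line,) = ax.plot([], [], "o-", markersize=3)
    ax.set_xlim(-180, 180)
    ax.set_ylim(-90, 90)
    ax.set_xlabel("Longitude")
    ax.set_ylabel("Latitude")
    for _ in range(polls):
        track.append(fetch_position())
        lons, lats = zip(*track)
        line.set_data(lons, lats)
        plt.pause(interval)  # redraws the figure, then waits before the next poll
    plt.ioff()
    plt.show()
```

Calling run_live_plot() opens a window and adds a point every few seconds. The deque keeps memory bounded by discarding the oldest points once the window is full.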

Category: Python