2025-06-13 – Curated SQL

Testing Shiny Applications

Published 2025-06-13 by Kevin Feasel

You’ve created a fantastic mockup and your client is delighted. You’re ready to move to production with your application. But one question haunts you: how can you ensure that your application will remain stable and functional through modifications and evolutions?

The answer comes down to one word: testing.

Read on to learn how you can perform unit testing, integration testing, and end-to-end testing of Shiny applications in R. H/T R-Bloggers.

Handling Imbalanced Data in Python

Published 2025-06-13 by Kevin Feasel

Ivan Palomares Carrascosa gives three ways to deal with imbalanced data:

Here’s the catch: having imbalanced data usually makes analysis processes more difficult, especially for machine learning models that can easily get biased toward the majority class as a result of dealing with data with a remarkably unequal class distribution, thereby ending up becoming an almost “dummy classifier” that assigns the same class to virtually everything — in the most extreme case.

This article shows several strategies to navigate and handle imbalanced datasets using two of Python’s most stellar libraries for “all things data”: Pandas and Scikit-learn.

Click through for those ways, including sample code.
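
To make the “dummy classifier” problem from the quote concrete, here is a minimal sketch using only the Python standard library. It is not taken from Ivan’s article, which works with Pandas and Scikit-learn. The 95/5 class split, the labels, and the function names are all hypothetical, chosen only to show why accuracy alone can mislead on imbalanced data.

```python
from collections import Counter


def majority_class_predictions(labels):
    """Predict the most common label for every row, like a 'dummy classifier'."""
    majority_label, _ = Counter(labels).most_common(1)[0]
    return [majority_label] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of rows where the prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of truly positive rows that were predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical imbalanced labels: 95 majority-class rows (0), 5 minority-class rows (1).
y_true = [0] * 95 + [1] * 5
y_pred = majority_class_predictions(y_true)

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")
print(f"minority recall: {recall(y_true, y_pred, positive=1):.2f}")
```

The accuracy looks healthy while the minority class is never detected, which is the bias toward the majority class that the quote describes.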

Loading Data into Snowflake via Python

Published 2025-06-13 by Kevin Feasel

Anil Kumar Moka does a bit of data loading:

In our ongoing exploration of Snowflake data loading strategies, we’ve previously examined how to use pandas with SQLAlchemy to efficiently move data into Snowflake tables. That approach leverages pandas’ intuitive DataFrame handling and works well for many common scenarios where you’re already manipulating data in Python before loading it to Snowflake.

In this article, we’re diving deeper into the Snowflake toolbox by exploring the native Snowflake Connector for Python. While pandas offers simplicity and familiarity, the native connector provides a different set of capabilities focused on precision control and Snowflake-specific optimizations. This article explains when and how to use this more direct approach for everything from small CSV files to massive datasets that would overwhelm pandas.

Click through for the full article.

Fronting Fabric APIs with Azure API Management

Published 2025-06-13 by Kevin Feasel

Ed Lima combines expensive with expensive:

Integrating Azure API Management (APIM) with Microsoft Fabric’s API for GraphQL can significantly enhance your API’s capabilities by providing robust scalability and security features such as identity management, rate limiting, and caching. This post will guide you through the process of setting up and configuring these features.

API Management is a really neat service, though it’s rather costly. That’s my biggest complaint about it, though it is a doozy.

Custom Libraries in Microsoft Fabric Data Engineering

Published 2025-06-13 by Kevin Feasel

Gerhard Brueckl isn’t content with the defaults:

When working with Spark or data engineering in general in Microsoft Fabric, you will sooner or later come to the point where you need to reuse some of the code that you have already written in another notebook. Best practice is to put these code pieces into a central place from where they can be referenced and reused. This way you can make sure all notebooks always use the very same code and it is also easy to develop, update and test the common functions.

As Gerhard mentions, having common notebooks with utilities is fine for when you’re getting started with development, but being able to centralize functions in proper libraries can make that code a lot more useful, not just in the context of the single notebook.

I believe that this does allow for arbitrary code execution, so someone with sufficient permissions to create a notebook and import code from arbitrary locations would be able to execute that code. I think there are ways of limiting this risk (such as not allowing your Fabric hosts to connect to any remote servers other than ones you explicitly allow), but it’s something I’d have to puzzle through.
