Author: Kevin Feasel

Feature Importance in XGBoost

Ivan Palomares Carrascosa takes a look at one of my favorite plots in XGBoost:

One of the most widespread machine learning techniques is XGBoost (Extreme Gradient Boosting). An XGBoost model — or an ensemble that combines multiple models into a single predictive task, to be more precise — builds several decision trees and sequentially combines them, so that the overall prediction is progressively improved by correcting the errors made by previous trees in the pipeline.

Just like standalone decision trees, XGBoost can accommodate both regression and classification tasks. While the combination of many trees into a single composite model may obscure its interpretability at first, there are still mechanisms to help you interpret an XGBoost model. In other words, you can understand why predictions are made and how input features contributed to them.

This article takes a practical dive into XGBoost model interpretability, with a particular focus on feature importance.

Read on to learn more about how feature importance works, as well as the three different views of the data you can get.
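
If you want to poke at those three views yourself, here is a minimal Python sketch (my own synthetic example, not code from the article) that pulls the weight, gain, and cover scores from a trained booster, which I presume are the three views in question, and draws the importance plot. It assumes scikit-learn, xgboost, and matplotlib are installed.

```python
import matplotlib.pyplot as plt
import xgboost as xgb
from sklearn.datasets import make_classification

# Synthetic classification data stands in for whatever dataset the article uses.
X, y = make_classification(n_samples=500, n_features=8, random_state=42)

model = xgb.XGBClassifier(n_estimators=50, max_depth=3)
model.fit(X, y)

booster = model.get_booster()
for importance_type in ("weight", "gain", "cover"):
    # weight: how many splits use the feature
    # gain:   average loss reduction from those splits
    # cover:  average number of rows affected by those splits
    print(importance_type, booster.get_score(importance_type=importance_type))

# The plot itself; the same importance_type argument applies here.
xgb.plot_importance(booster, importance_type="gain")
plt.show()
```

The three rankings often disagree with one another, which is exactly why it pays to look at more than one of them.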

Registering Applications to Read Fabric Resources

Andy Leonard works in Microsoft Entra:

My older son, Stephen, and I have been vibe coding information-dense solutions for Fabric lately. The latest application is Fabric Navigator, which simplifies navigation between Fabric Data Factory pipelines and notebooks. While Fabric Navigator includes links to instructions about configuring Azure and Fabric security to allow read access to Fabric Data Factory pipelines and notebooks, I feel a walk-through of a minimally-viable security configuration is in order. Hence, this post.

Click through to see what you need to configure in Entra, as well as the settings you need to change in Microsoft Fabric, for this to work.

Equalizing Proxy vs Redirect Rates for OneLake Access

Elizabeth Oldag announces a pricing change:

We’re thrilled to share a major update and simplification to OneLake’s capacity utilization model that will make it even easier to manage Fabric capacity and scale your data workloads. We are reducing the consumption rate of OneLake transactions via proxy to match the rate for transactions via redirect. This means you no longer have to worry about where you are accessing your OneLake data from (via proxy or redirect); they will consume your capacity at the same low rate.

Read on to see what this means in practice.

Custom Fonts in Power BI

Ben Richardson looks at fonts:

Want your Power BI reports to look more polished and on-brand?

Fonts play a big role in how your reports are perceived – impacting clarity, trust, and style.

But Power BI doesn’t let you upload custom fonts directly. So, what can you do?

Click through for several options.

Installing SQL Server Instances via dbatools

David Seis digs into another dbatools cmdlet:

As DBAs we install SQL Server for various reasons regularly. If you could save time for each installation for more critical tasks, would you?

In this blog post, we will audit the dbatools command Install-DbaInstance. I will test, review, and evaluate the script based on a series of identical steps. Our goal is to provide insights, warnings, and recommendations to help you use this script effectively and safely. Install-DbaInstance is a powerful tool to automate the install and configuration of a new SQL Server instance. It works well in scenarios that require frequent deployments of SQL Server instances.

You can definitely automate installation of SQL Server without the cmdlet, but the dbatools team does a good job of laying out what’s possible that you might not necessarily get just from the config script that the SQL Server installer spits out (and uses when you next-next-next your way to success).

Recommendations around SUMMARIZECOLUMNS

Marco Russo and Alberto Ferrari share some thoughts:

SUMMARIZECOLUMNS is the most widely used function in Power BI queries. An important and unique feature of SUMMARIZECOLUMNS is that it determines automatically how to scan the model to produce its result. Indeed, when using SUMMARIZE, GROUPBY, ADDCOLUMNS, or any of the more basic querying functions, developers must declare the source table to perform the grouping, as well as the group-by columns and the measures to add to the result. On the other hand, SUMMARIZECOLUMNS requires only the group-by columns; there is no need to provide the source table, which is the primary ingredient of any query. SUMMARIZECOLUMNS figures out the structure of the result by itself, using a sophisticated algorithm that requires some understanding.

The pair do have a whitepaper available on their premium (paid) service, but even the free post contains a lot of detail you’ll want to check out if you use DAX.

Using JSON Arrays instead of JSON Objects for Serialization

Lukas Eder makes a recommendation:

Why, yes of course! jOOQ is in full control of your SQL statement and knows exactly what column (and data type) is at which position, because you helped jOOQ construct not only the query object model, but also the result structure. So, a much faster index access is possible, compared to the much slower column name access.

The same is true for ordinary result sets, by the way, where jOOQ always calls JDBC’s ResultSet.getString(int), for example, over ResultSet.getString(String). Not only is it faster, but also more reliable. Think about duplicate column names, e.g. when joining two tables that both contain an ID column. While JSON is not opinionated about duplicate object keys, not all JSON parsers support this, let alone Java Map types.

Read on for some insight into when you might want to choose either of the two approaches, and why Lukas went with JSON arrays instead of JSON objects for object serialization in jOOQ.
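
To see the difference in miniature, here is a generic Python illustration (not jOOQ’s actual serializer; the tables and column names are made up) of serializing the same joined rows as JSON objects versus JSON arrays:

```python
import json

# Hypothetical rows from joining BOOK to AUTHOR, where both tables have
# an ID column, so the flattened column list contains "id" twice.
columns = ["id", "title", "id", "name"]
rows = [
    (1, "1984", 7, "George Orwell"),
    (2, "Animal Farm", 7, "George Orwell"),
]

# Object form: every key is repeated per row, and dict() keeps only the
# last value for a duplicate key, so the book's id is silently lost.
as_objects = [dict(zip(columns, row)) for row in rows]

# Array form: all four positions survive, and the column-to-position
# mapping stays with the code that built the query.
as_arrays = [list(row) for row in rows]

print(json.dumps(as_objects))  # [{"id": 7, "title": "1984", "name": "George Orwell"}, ...]
print(json.dumps(as_arrays))   # [[1, "1984", 7, "George Orwell"], ...]
```

The array form is also a bit smaller on the wire, since the keys are not repeated for every row.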

Using Python Code in SSIS

Tim Mitchell shoe-horns a language in:

SQL Server Integration Services (SSIS) is a mature, proven tool for ETL orchestration and data movement. In recent years, Python has exploded in popularity as a data movement and analysis tool. Surprisingly, though, there are no native hooks for Python in SSIS. In my experience using each of these tools independently, I’d love to see an extension of SSIS to naturally host Python integrations.

Fortunately, with a bit of creativity, it is possible to invoke Python logic in SSIS packages. In this post, I’ll walk you through the tasks to merge Python and SSIS together. If you want to follow along on your own, you can clone the repo I created for this project.

Honestly, it’s not that surprising. The last time there was significant development on Integration Services was roughly 2012 (unless you include the well-intentioned but barely-functional Hadoop support they added around 2016). At that point, in the Windows world, Python was not at all a dominant programming language.
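
For what it’s worth, the pattern I would expect here (an assumption on my part, not a summary of Tim’s repo) is an Execute Process Task that shells out to python.exe, passing file paths as arguments and using the exit code to fail the package on error. A hypothetical script of that shape:

```python
# transform_stage.py -- a hypothetical script an SSIS Execute Process Task
# could call, e.g.:  python.exe transform_stage.py C:\stage\in.csv C:\stage\out.csv
import csv
import sys


def main(in_path: str, out_path: str) -> int:
    try:
        with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
            reader = csv.DictReader(src)
            writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
            writer.writeheader()
            for row in reader:
                # Placeholder transformation: trim whitespace from every field.
                writer.writerow({key: value.strip() for key, value in row.items()})
    except OSError as err:
        # A non-zero exit code is what lets the Execute Process Task fail the package.
        print(f"transform failed: {err}", file=sys.stderr)
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1], sys.argv[2]))
```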

A Primer on TMDL Security Risks in Power BI

John Kerski gives us the low-down:

The Tabular Model Definition Language (TMDL) provides a simpler way of defining Power BI Semantic Models. Unlike the JSON-based Tabular Model Scripting Language (TMSL), TMDL uses a more accessible tab-based format for specifying DAX measures, relationships, and Power Query code.

Click through for the various ways things could go wrong, as well as how to mitigate those risks.

Mind you, “security risks” is a very broad concept and is not an indictment of the product, but rather something to keep in mind as you attempt to write secure code. For example, did you know that bad guys could potentially access all of your data in your database by using a series of SELECT statements?

Thoughts on Index Rebuilds in PostgreSQL

Laurenz Albe shares some advice:

People often ask "How can I automatically rebuild my indexes regularly?" or "When should I rebuild my indexes in PostgreSQL?". That always gives me the feeling that they want to solve a problem that isn't there. But the REINDEX statement is certainly there for a reason, and sometimes it is perfectly reasonable to rebuild an index. In this article, I'll explain when it makes sense to rebuild an index and how you can get the relevant data to make that decision.

Read on to learn more.
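
If you want a concrete way to gather that relevant data, one common source (my assumption about tooling here, not necessarily the method Laurenz describes) is the pgstattuple extension. A small psycopg2 sketch, with a made-up connection string and index names:

```python
import psycopg2

DSN = "dbname=mydb user=postgres"                      # hypothetical connection string
INDEXES = ["orders_pkey", "orders_customer_id_idx"]    # hypothetical index names

# Requires: CREATE EXTENSION pgstattuple;
with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
    for index_name in INDEXES:
        cur.execute(
            "SELECT avg_leaf_density, leaf_fragmentation FROM pgstatindex(%s::regclass)",
            (index_name,),
        )
        density, fragmentation = cur.fetchone()
        # Low leaf density suggests bloat that REINDEX would reclaim; what
        # counts as "low" is the judgment call the article helps you make.
        print(f"{index_name}: leaf density {density}%, fragmentation {fragmentation}%")
```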
