
Month: August 2025

Getting beyond Pandas

Shittu Olumide recommends a few other packages:

If you’ve worked with data in Python, chances are you’ve used Pandas many times. And for good reason; it’s intuitive, flexible, and great for day-to-day analysis. But as your datasets start to grow, Pandas starts to show its limits. Maybe it’s memory issues, sluggish performance, or the fact that your machine sounds like it’s about to lift off when you try to group by a few million rows.

That’s the point where a lot of data analysts and scientists start asking the same question: what else is out there?

Read on for seven options, including six libraries and one built-in programming technique.


Animated Maps in R with gganimate

Osheen MacOscar looks at a new version of an old package:

In this blog post, we are going to use data from the {gapminder} R package, along with global spatial boundaries from ‘opendatasoft’. We are going to plot the life expectancy of each country in the Americas and animate it to see the changes from 1957 to 2007.

The {gapminder} package we are using is from the Gapminder foundation, an independent educational non-profit fighting global misconceptions. They cover issues like global warming, plastic in the oceans and life satisfaction.

There are several common gotchas that Osheen takes us through before building an animated map of the western hemisphere.


Oracle Password-Related Profile Settings

David Fitzjarrell takes a look at some settings:

Passwords expire, and, depending upon how various profiles are configured, accounts are either locked or provided a grace period during which the old password can be changed. In any recent enterprise, password verification functions are provided to police new passwords to ensure a modicum of security. Let’s dig into what Oracle provides to assist in password security.

Basic elements of password security that Oracle provides start with the profile; listed below are the associated resources:

Read on for the available options you can set on a per-profile basis.


Auto-Scale Billing for Spark in Microsoft Fabric now GA

Santhosh Kumar Ravindran announces a feature in general availability:

We’re thrilled to announce the general availability (GA) of Autoscale Billing for Apache Spark in Microsoft Fabric — a serverless billing model designed to offer greater flexibility, transparency, and cost efficiency for running Spark workloads at scale.

With this model now fully supported, Spark Jobs can run independently of your Fabric capacity and are billed on a pay-as-you-go basis — similar to how Spark works in Azure Synapse. This gives teams the freedom to scale compute as needed without impacting other workloads running on your shared Fabric capacity.

I’m of two minds here. On the one hand, there is value to having this as an option. On the other hand, one of the talking points for Microsoft Fabric is that you have one billing model. But because it’s an optional thing you can enable rather than something you must use, I’m fine with it.


Stored Procedures and Headers

Andy Brownsword lays out an argument:

Code is an ever moving target. Version control and documentation only go so far, if they even exist. Sometimes all you have is the code in front of you.

This is why I always start stored procedures with a header.

There was a time I strongly resisted this idea, but if you are diligent about keeping this up to date, it can be very useful for record-keeping, especially if your company has a tendency to switch source control systems and not keep the history between moves.
