Day: January 4, 2024

An Overview of Polars

Dylan Jones talks about a Rust-based data frame library:

Polars is a high-performance DataFrame library implemented in Rust, and can be used with Rust natively or via its Python wrapper. It is designed to handle large datasets with ease, providing a user-friendly interface for data manipulation and analysis. The library offers two modules: polars-core for the core functionality, and polars-io for input/output operations, allowing you to read and write data in various formats such as CSV, JSON, Parquet, Delta and more.

Read on to see how it works in Python compared to Pandas, as well as some speed comparisons.

A Call for Quality

Kurt Buhler sounds the clarion call:

We have a quality problem, and it’s getting worse. It creates higher costs, hurts our productivity, and threatens our capability to achieve success. The problem: too often, we prioritize quicker results and newer features over lasting quality and consistency in the data and analytics solutions that we deliver. Too often, we don’t collect the right requirements, we don’t test, we don’t automate, and we rely on hope and heroism to save the day. The result: we’re besieged by issues, fighting constant battles against an avoidable enemy that we ourselves created.

This is a long article with a lot of depth to it. I think the topic is well worth thinking about, though it’s quite a challenge.

Useful Operations in dbatools

Rod Edwards shows off some nice functionality in dbatools:

I often build solutions around the dbatools functions; the below are just some of my operational favourites. With some I’ve included the output pipe that I use most frequently, but obviously, you can view and use the output however you choose to. Clearly, dbatools has many functions to add/remove/update SQL as well, but I’m just focusing on the ‘gets’ here.

Naturally, as mentioned, it’s PowerShell, so you can programmatically use this for any of your automation needs. Marvellous.

The ever-growing list of commands can be found here: command index – dbatools. This can prove daunting to new users of the toolset, so here’s a starter for 10.

Click through for those 10.

Math Operations in T-SQL

Daniel Hutmacher builds a few functions:

As part of spending waaaaaay too much time trying to solve the 2023 Advent of Code challenges, I came across multiple instances where I had to dust off some old math that I hadn’t paid attention to since I went to school back in the ’90s.

So for my own convenience, and yours, I’ve built functions for some common math that you might perhaps encounter at some point. I found this whole experience to be a great way to familiarize myself with a lot of the new functionality in SQL Server 2022, including GENERATE_SERIES(), LEAST(), GREATEST() and more. The GitHub repo contains a SQL Server 2019 version where I’ve built drop-in versions of the 2022 functions, but they probably won’t perform as well as the built-in stuff.

Click through for demonstrations of determining whether something is a prime number, finding the greatest common divisor and least common multiple, factorization, factorials, and even a bit of combinatorics.
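
Daniel's functions are written in T-SQL, so the following is only a rough Python sketch of the same kinds of operations, not a port of his code. The names is_prime, lcm, and prime_factors are hypothetical helpers made up for illustration; gcd, factorial, and comb come from Python's standard math module.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division up to the integer square root of n."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def lcm(a: int, b: int) -> int:
    """Least common multiple, derived from the greatest common divisor."""
    if a == 0 or b == 0:
        return 0
    return abs(a * b) // math.gcd(a, b)


def prime_factors(n: int) -> list:
    """Prime factorization of n (n >= 2), smallest factors first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


if __name__ == "__main__":
    print(is_prime(97), math.gcd(12, 18), lcm(4, 6))
    print(prime_factors(360), math.factorial(5), math.comb(5, 2))
```

Trial division is fine for small inputs like these; very large numbers would call for a different primality test.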

Thoughts on Clean Code

Chad Callihan resolves to write better code:

I’ve been involved with more official development work on top of database responsibilities in the last few months, which led to the recommendation to read Clean Code by Robert C. Martin. It’s an older book published in 2008, but plenty of rules and guidelines still apply. Along with the more technical details, one area jumped out at me in Chapter 12 that can apply to anyone writing code, queries, or scripts:

Click through for a pair of salient quotations and some more thoughts from Chad.

Dynamic Search in SQL Server Stored Procedures

Erik Darling isn’t content with simple searches:

Like having a built-in type to make dynamic SQL more easily managed, it would also be nice to have some mechanism to manage dynamic searches.

Of course, what I mean by dynamic searches is when you have a variety of parameters that users can potentially search on, with none or few of them being required.

Erik provides two techniques and contrasts the two, so check it out.
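
To make the idea of a dynamic search concrete, here is a small Python sketch of one common approach: building a parameterized query from only the search parameters that were supplied. It is not necessarily either of Erik's techniques, and it uses an in-memory SQLite database as a stand-in for SQL Server. The people table, its columns, and the build_search function are all hypothetical.

```python
import sqlite3


def build_search(filters):
    """Return (sql, params) using only the filters that were supplied."""
    # Whitelist of searchable parameters mapped to fixed predicates, so user
    # input only ever reaches the query as a bound parameter value.
    allowed = {
        "first_name": "first_name = ?",
        "city": "city = ?",
        "min_age": "age >= ?",
    }
    clauses, params = [], []
    for name, predicate in allowed.items():
        value = filters.get(name)
        if value is not None:
            clauses.append(predicate)
            params.append(value)

    sql = "SELECT id, first_name, city, age FROM people"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY id"
    return sql, params


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, first_name TEXT, city TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO people (first_name, city, age) VALUES (?, ?, ?)",
    [("Ann", "Oslo", 34), ("Bob", "Oslo", 51), ("Cara", "Lima", 29)],
)

# Two of the three optional parameters are supplied here.
sql, params = build_search({"city": "Oslo", "min_age": 40})
rows = conn.execute(sql, params).fetchall()

# No parameters at all: the WHERE clause is omitted entirely.
all_sql, all_params = build_search({})
all_rows = conn.execute(all_sql, all_params).fetchall()

if __name__ == "__main__":
    print(sql, params, rows)
```

Because the query text changes with the set of supplied parameters, each combination gets its own statement rather than one catch-all predicate that covers every case.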

Foreign Key Discovery in SQL Server & Azure SQL DB

Josephine Bush walks around town with a lantern looking for a good foreign key:

There are plenty of times I’m called upon to fix data. To do this, I must know what dependencies are in the database. Foreign keys are a crucial aspect of maintaining data integrity within relational databases. They establish relationships between tables, ensuring data references remain consistent and accurate. In an Azure SQL Database, identifying and managing foreign keys is essential for maintaining a well-structured and reliable database architecture.

Click through for a primer on foreign key constraints, a few ways to find them, and some closing thoughts on working with tables containing foreign key constraints.
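
For a feel of what foreign key discovery looks like in practice, here is a Python sketch that lists every foreign key in a small database. It is an illustration only: it uses SQLite and its PRAGMA foreign_key_list command as a stand-in, whereas SQL Server and Azure SQL Database expose this metadata through catalog views such as sys.foreign_keys. The customers and orders tables and the list_foreign_keys function are hypothetical.

```python
import sqlite3

# Hypothetical two-table schema with a single foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );
    """
)


def list_foreign_keys(conn):
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    found = []
    for table in tables:
        # Each PRAGMA row is: id, seq, referenced table, from column,
        # to column, on_update, on_delete, match.
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            found.append((table, row[3], row[2], row[4]))
    return found


foreign_keys = list_foreign_keys(conn)

if __name__ == "__main__":
    for child, column, parent, parent_column in foreign_keys:
        print(f"{child}.{column} -> {parent}.{parent_column}")
```

Knowing which child tables reference a parent is what lets you order data fixes safely, for example deleting or updating child rows before the parent rows they depend on.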
