Press "Enter" to skip to content

Curated SQL Posts

Configuring a New Powershell Installation

Jeff Hicks starts over:

The other day on X, I was asked about what things I would setup or configure on a new PowerShell installation. This is something I actually have thought about and face all the time when I setup a new demo virtual machine. I had been meaning to build new tooling to meet this challenge, and the question provided the spark I needed to get off my butt and get it done.

Read on for some thoughts on the topic, as well as a lengthy script to get you going.

Comments closed

That the Division of Labor is Limited by the Extent of the Market

Eitan Blumin lets me quote one of my favorite chapter headings from Adam Smith:

You see, in this past decade or so, we’ve all noticed a trend where more and more businesses are looking for DBAs who know a bit about everything: RDBMS, NoSQL, Key-Value databases, Cache Databases, Full-Text Search Databases, Vector Databases, and more. In response, more and more data professionals prefer to become multi-disciplinary, doing a bit of everything… Even if it means becoming a part-time Data Engineer / Data Analyst / Data Scientist / DevOps / whatever. Otherwise, they fear that they won’t be able to find enough job opportunities.

But, there’s a cost to that approach. Sometimes a very high cost.

Eitan argues in favor of an approach with part-time experts and full-time jacks-of-many-trades. For smaller companies, I completely agree: a software development company with 5 total employees usually can’t afford a full-time DBA or networking specialist or virtualization specialist, but they can still run into problems that an expert would easily be able to solve. But I think once you get past a couple dozen IT people at a company, there’s enough scope for some critical specialization. In other words, the market within a company grows (hush, people who are going to throw Ronald Coase at me for that statement—it’s metaphorical!) and allows for additional division of labor and specialization.

Anyhow, Eitan has some tips around determining whether you’d rather be a jack-of-all-trades or a master of something.

Comments closed

Renaming Factor Levels in R

Steven Sanderson renames factor levels of a categorical variable:

Before we jump into renaming factor levels, let’s quickly recap what factors are and why they’re useful. Factors are used to represent categorical data in R. They store both the values of the categorical variables and their corresponding levels. Each level represents a unique category within the variable.

Click through for three methods you can use to pull this off.

Comments closed

A Primer on Pandas Queries

Vinod Chugani works with Pandas:

In the realm of data analysis, SQL stands as a mighty tool, renowned for its robust capabilities in managing and querying databases. However, Python’s pandas library brings SQL-like functionalities to the fingertips of analysts and data scientists, enabling sophisticated data manipulation and analysis without the need for a traditional SQL database. This exploration delves into applying SQL-like functions within Python to dissect and understand data, using the Ames Housing dataset as your canvas. The Ames Housing dataset, a comprehensive compilation of residential property sales in Ames, Iowa, from 2006 to 2010, serves as an ideal dataset for this exploration, offering a rich variety of features to analyze and derive insights from.

Click through for examples of how to use the query() function in conjunction with other Pandas functionality to answer questions of the data.

Comments closed

Recovering Deleted Oracle Connections in Visual Studio Code

Brendan Tierney goes on a search and rescue mission:

With the current early release, there is no way to organise your Database connections like you can in the full Oracle SQL Developer. We are told this will/might be possible in a future release but it might be later this year (or longer) before that feature will be available.

In a previous post, I showed how to import your connections from the full SQL Developer into SQL Dev VS Code. While this is a bit of a fudge, yet relatively straight forward to do, you may or may not want all those connections in your SQL Dev VS Code environment. Typically, you will use different tools, such as SQLcl, SQL*Plus, SQL Developer, etc to perform different tasks, and will only want those connections set up in one of those tools.

Click through to see what it takes to recover these.

Comments closed

Comparing TOP(1) + ORDER BY vs MAX() Performance in SQL Server

Andy Brownsword breaks out the stopwatch:

The TOP clause limits the number of results which are returned from a query, in this instance we’re focussing on a single result. In contrast, when using MAX we’re applying a function to our data to select the largest value from our data.

Let’s dive into some examples with the StackOverflow data, specifically the Votes table.

Read on for several scenarios and how the two perform. Things get a bit more complicated as you introduce other tables in joins and similar additional factors, but this gives you a good foundation for comparison.

Comments closed

Microsoft Fabric Governance & Administration: Tenant Settings

Nicky van Vroenhoven has a pair of posts on Microsoft Fabric administration, specifically around tenant settings. First up is a post on APIs:

Obviously, to use the Get Tenant Settings API you need to have at least Tenant.Read.All permissions, or have the Fabric Administrator role (or higher) in Azure.

There are a few use cases I see for getting these settings exported with this API:

  • Documentation purposes when you have multiple Fabric Administrators
  • Distributing tenant settings to users, and explaining why we (as a team of Administrators/within the Center of Excellence) made certain choices
  • Get notified of the changes in the tenant settings, without having to use Microsoft Defender or M365 Security & Compliance center like mentioned here

Nicky has a follow-up post on visual cues in the Tenant Settings page:

Today I want to talk about a new little addition Microsoft made to the Fabric Admin portal.

This change has actually been here for quite a while now, but I still think it’s worth mentioning because (1) I really like it, and (2) it’s also an important change that the community, and MVP’s in specific, has been requesting for quite some time.

Radhakrishnan Srinivasan and (members of) his team added visual cues to the Admin portal of Fabric.

Check out both posts for good information.

Comments closed

Query Compilation Timeouts and Query Store

Kendra Little diagnoses a problem:

Last November, a puzzle was really bothering me. Some queries from an application were timing out frequently after running for 30 seconds, but they were halfway invisible in the SQL Server.

I say “halfway invisible” because I could see the queries while they were running in SQL Server’s dynamic management views using free tools (sp_WhoIsActive and sp_BlitzWho).

But the queries had some odd characteristics:

Through the power of communication with other humans (eew, that idea sounds icky), Kendra was able to learn what the problem was and how you can track such issues outside of Query Store.

Comments closed

Improving Data Labels with Format Strings

Kurt Buhler has some new digs:

Format strings greatly improve the usefulness of your model, particularly when you apply custom and dynamic format strings creatively. However, there are considerations to keep in mind.

It is important that the values in a report display for users as they expect. Effective formatting of DAX measures and table columns improves your model, as values are easier to read and interpret. While formatting may seem simple at first glance, the flexibility of DAX and format strings in Power BI can create many opportunities for more creative and efficient tables or visuals.

Click through to understand why proper formatting of measures is so important on dashboards, and also congratulate Kurt on becoming an Italian. At least, I assume citizenship conveys when you do work for Marco & Alberto.

Comments closed