Author: Kevin Feasel

The Downside of UNISTR()

Published 2024-11-01 by Kevin Feasel

Since the new UNISTR function doesn’t provide new functionality, only convenience (“syntactic sugar” as some would say; see comment below), I would argue that it should not only use a more standard syntax, but also not waste the opportunity and provide more substantive convenience by handling several commonly used escape sequences. I suspect that the number of times people would use “\n” is several orders of magnitude more than the number of times people would inject emojis or other non-keyboard characters. Even better would be to incorporate common escape sequences into standard string parsing.

Read on for Solomon’s comment explaining why he is not a fan of UNISTR().
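
To make the escape-sequence point concrete, here is a minimal Python sketch of what a UNISTR-style decoder does. It assumes the two escape shapes as I understand them (a backslash plus four hex digits for a UTF-16 code unit, and a backslash, plus sign, and six hex digits for a full code point) along with a doubled backslash for a literal one. The name unistr_like is hypothetical, and leaving unrecognized sequences untouched is a choice made for this sketch, not a claim about how SQL Server behaves.

```python
import re

# Hypothetical emulation of UNISTR-style escapes; not SQL Server's implementation.
_ESCAPE = re.compile(r"\\(?:\+([0-9A-Fa-f]{6})|([0-9A-Fa-f]{4})|(\\))")


def unistr_like(text: str) -> str:
    """Decode \\xxxx, \\+xxxxxx, and doubled-backslash escapes in text."""

    def replace(match: re.Match) -> str:
        six, four, backslash = match.groups()
        if six is not None:
            return chr(int(six, 16))   # full code point, e.g. \+01F600
        if four is not None:
            return chr(int(four, 16))  # single UTF-16 code unit, e.g. \00E9
        return "\\"                    # doubled backslash becomes one backslash

    decoded = _ESCAPE.sub(replace, text)
    # Two \xxxx escapes may form a surrogate pair; a UTF-16 round trip merges
    # them into one character (and raises an error on an unpaired surrogate).
    return decoded.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


smiley = unistr_like(r"Hello \D83D\DE00")
newline_attempt = unistr_like(r"line one\nline two")
```

In this sketch, the "\n" in newline_attempt comes back as a literal backslash followed by "n", which is the gap the quote complains about: code points are covered, but the everyday escape sequences are not.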

Cost Optimization in Azure

Published 2024-11-01 by Kevin Feasel

Albert McQuiston shares some advice:

Organizations using Azure Cloud services often overspend, eventually decreasing their operational efficiency. Leveraging cost-optimization techniques can help these businesses to focus on areas requiring more capital investment.

There are a few tips around specific actions you can take to understand why you’re spending so much and how to cut it down a bit. Albert also mentions but does not share a link to the Azure pricing calculator. This is a great tool if you already know what Azure resources you need and intend to price them out. It’s a real challenge getting the number close enough to right (especially for complex services with a lot of inputs, like Azure Synapse Analytics was), but can be useful in getting in the ballpark. But I also highly recommend going through a Well-Architected Review assessment, based on Azure’s Well-Architected Framework. This framework and its associated reviews cover cost-effectiveness as a key tenet.

Generative AI Answers: Do Not Trust, Do Verify

Published 2024-11-01 by Kevin Feasel

Erik Darling speaks wisdom:

Here’s what I’ve used it for with some success:

Creating images for Beer Gut Magazine

Summarizing long documents

Writing boilerplate stuff that I’m bad at (sales and marketing drivel, abstracts, lists of topics)

But every time I ask it to do that stuff, I really have to pay attention to what it gives me back. It’s often a reasonable starting place, but sometimes it really goes off the rails.

That’s true of technical stuff, too. Here’s where I’ve had a really bad time, and if there’s anything you know deeply and intimately, you’ll find similar problems too.

Click through for Erik’s experience. That’s pretty close to my own, and is a big part of why I refer to generative AI models as being akin to drunken interns: sure, give them assignments, but you’d better double-check every part of it.

Installing PostgreSQL Offline

Published 2024-11-01 by Kevin Feasel

Semab Tariq performs an installation:

Many companies choose to store their databases in secure, closed environments: machines without internet access or outside the cloud. This is often done to maintain tight control over sensitive data and to meet strict security requirements. However, installing PostgreSQL in a restricted, offline environment can be a real challenge, as it limits access to typical installation tools.

Recently, I worked on a client project with a similar setup—a secure, offline environment without internet access—where we needed to install and configure PostgreSQL from scratch. If you’re facing the challenge of setting up PostgreSQL in a closed environment, this blog will guide you through the process step-by-step.

It turns out to be pretty straightforward, so long as you can start from a machine with internet access.

Monitoring R Models in Production with Vetiver

Published 2024-10-31 by Kevin Feasel

Myles Mitchell continues a series on Vetiver:

In those blogs, we introduced the {vetiver} package and its use as a tool for streamlined MLOps. Using the {palmerpenguins} dataset as an example, we outlined the steps of training a model using {tidymodels} then converting this into a {vetiver} model. We then demonstrated the steps of versioning our trained model and deploying it into production.

Getting your first model into production is great! But it’s really only the beginning, as you will now have to carefully monitor it over time to ensure that it continues to perform as expected on the latest data. Thankfully, {vetiver} comes with a suite of functions for this exact purpose!

Click through for the full story.

A Primer on Outlier Detection

Published 2024-10-31 by Kevin Feasel

Jayita Gulati provides an overview:

Anomaly detection means finding patterns in data that are different from normal. These unusual patterns are called anomalies or outliers. In large datasets, finding anomalies is harder. The data is big, and patterns can be complex. Regular methods may not work well because there is so much data to look through. Special techniques are needed to find these rare patterns quickly and easily. These methods help in many areas, like banking, healthcare, and security.

Let’s have a concise look at anomaly detection techniques for use on large scale datasets. This will be no-frills, and be straight to the point in order for you to follow up with additional materials where you see fit.

Outlier detection is a large and interesting space. I suppose I should shill for myself a little bit and note that I wrote a book on the topic. This post provides some quick guidance around outlier detection techniques and applications, and serves as a fine starting point for digging in further.
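
As a small taste of the space, here is a sketch of one classic univariate technique: flagging points by their modified z-score, which is built on the median absolute deviation (MAD). This is not taken from Jayita's post. The function name mad_outliers, the sample readings, and the 3.5 cutoff are illustrative assumptions, with the cutoff being a common rule of thumb rather than a universal constant.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median of the absolute deviations from the median.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half of the points equal the median, so the score is
        # undefined; this sketch simply reports no outliers in that case.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


readings = [10, 12, 11, 13, 12, 11, 95]  # made-up sample data
flagged = mad_outliers(readings)
```

Because both the center and the spread come from medians, a single extreme value like 95 does not drag the baseline toward itself the way it would with a mean and standard deviation.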

Aggregate Window Functions in SQL Server

Published 2024-10-31 by Kevin Feasel

Steve Jones does a bit of aggregation:

I looked at row_number() in a previous post. Now I want to build on that and do some counting of rows with COUNT() and the OVER clause. I’ll show how this differs a bit from a normal aggregate.

Read on to see one example of how aggregate window functions are so powerful. Partitioned aggregates with OVER have been in SQL Server since 2005, with ordering and framing support for them arriving in 2012.
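
To show the difference between a normal aggregate and a windowed one, here is a minimal sketch that runs both forms through Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here (it needs version 3.25 or later for window functions), and the orders table and its rows are made up for illustration rather than taken from Steve's post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 10), ("alice", 25), ("bob", 40)],
)

# A normal aggregate collapses each group into a single row.
grouped = conn.execute(
    "SELECT customer, COUNT(*) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()

# A windowed aggregate keeps every detail row and attaches the count to it.
windowed = conn.execute(
    "SELECT customer, amount, COUNT(*) OVER (PARTITION BY customer) "
    "FROM orders ORDER BY customer, amount"
).fetchall()

print(grouped)   # one row per customer
print(windowed)  # one row per order, each carrying its customer's count
```

The windowed query returns all three orders, so you can compare each row against its group's total without joining back to a separate aggregate query.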

Updates in .NET 9

Published 2024-10-31 by Kevin Feasel

Ajay Jajoo tells us what’s new:

One of the standout features of .NET 9 is its focus on performance. With numerous optimizations across the runtime and libraries, applications can expect faster execution times and reduced memory usage. This is particularly beneficial for high-load applications, making .NET 9 an ideal choice for cloud-based solutions.

.NET 9 brings various performance optimizations, including improvements in garbage collection and just-in-time (JIT) compilation.

If you work at all with C#, you’ll see some quality of life improvements in .NET 9. But given Microsoft’s policy around short-term and long-term support releases, many corporate environments may wait until .NET 10 to see them.

Query Processor Ran out of Internal Resources

Published 2024-10-31 by Kevin Feasel

David Fowler explains an error:

Recently I received a cry for help over Teams. The issue was that an application was throwing up the following SQL error:

The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.

I’ll be honest, that’s not one that I had seen before but it seemed pretty self-explanatory. The query was just too complex for SQL to cope with. I asked what the query was, the answer was something similar to the snippet below,

Read on to learn what the problem was, as well as David’s answer. David had a simple rewrite retaining the IN clause, though you could also rewrite this with an INNER JOIN or even an EXISTS. One of those two alternative approaches might have a better performance profile, though there are no guarantees.
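
For a rough idea of what those three shapes look like, here is a sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. David's original snippet is not reproduced in this excerpt, so the customers table, the wanted staging table, and the id list are all hypothetical, and the idea of staging the ids in a temporary table is an assumption of this sketch. It also says nothing about which form performs best on a real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"customer-{i}") for i in range(1, 11)],
)

# Stage the ids in a table instead of writing them as one long literal list.
# The primary key keeps ids unique, so the join form cannot duplicate rows.
wanted_ids = [2, 3, 5, 7]
conn.execute("CREATE TEMP TABLE wanted (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted VALUES (?)", [(i,) for i in wanted_ids])

queries = {
    "in": "SELECT c.id FROM customers AS c "
          "WHERE c.id IN (SELECT w.id FROM wanted AS w) ORDER BY c.id",
    "join": "SELECT c.id FROM customers AS c "
            "INNER JOIN wanted AS w ON w.id = c.id ORDER BY c.id",
    "exists": "SELECT c.id FROM customers AS c "
              "WHERE EXISTS (SELECT 1 FROM wanted AS w WHERE w.id = c.id) "
              "ORDER BY c.id",
}

# Run each form and collect the matching ids for comparison.
results = {
    name: [row[0] for row in conn.execute(sql)]
    for name, sql in queries.items()
}
```

All three forms return the same ids here. On a real workload, the only way to know which one has the better performance profile is to compare the plans and timings.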

T-SQL Notebooks in Microsoft Fabric

Published 2024-10-31 by Kevin Feasel

Dennes Torres tries out T-SQL notebooks:

T-SQL Notebooks is one of the new features announced during FabCon Europe.

The most distracted could miss the fact this is a new feature at all. Yes, it is. Notebooks could already support Spark SQL, but T-SQL is something new.

The main examples being announced are built with data warehouses, but let me confirm and highlight this:

T-SQL Notebooks support lakehouses as well.

There is at least one limitation: DML is not supported with lakehouses.

Saving my rant about lakehouses vs warehouses in Fabric, do read what Dennes has to say about T-SQL notebooks as they exist today.

