Press "Enter" to skip to content

Day: July 19, 2022

Mounting Data Lake Storage from a Spark Pool

Kamil Nowinski runs into some trouble:

Last weekend, I played a bit with Azure Synapse from a way of mounting Azure Data Lake Storage (ADLS) Gen2 in Synapse notebook within API in the Microsoft Spark Utilities (MSSparkUtils) package. I wanted to just do a simple test, hence I followed the documentation from Microsoft: How to use file mount/unmount API in Synapse.
Having an ADLS Account already created in a subscription – should be easy peasy, right?

Read on to understand when things might be a little more complicated than they seem. And more frustrating, once you see the cause of the problem.

Comments closed

The Importance of a Proper Datamart / Data Warehouse

Teo Lachev explains why you want a datamart (or a data warehouse) for BI solutions:

I sent a proposal for implementing a classic BI solution: Azure SQL-based datamart (not Power BI datamart please), ETL, semantic model, and reports. The client had a sticker shock. Return to sender … as other BI companies that quoted can do it for half! Upon digging, it turned out the other companies would build the semantic model (aka Power BI dataset) directly on top of the data source. On a T&M basis, of course, what else? By contrast, I give fixed-price milestone-driven proposals and I don’t get paid unless I deliver and meet written and agreed upon success criteria, but that’s a different story.

So, let me count the ways as the poet would say. It’s certainly technically possible to slap a dataset on top of the data source(s). That’s what self-service BI is all about right … until it doesn’t serve anymore

Read on for more detail.

Comments closed

Write Powershell Output to a File

Kenneth Fisher needs someplace to put all of his great ideas:

Last week I posted about a request to get a list of services. The first problem I ran into was expanding the column width, now I need to get the output off my screen and into a file to send the requestor. The cmdlet out-file is perfect for this and it’s pretty simple.

Click through to see how the Out-File cmdlet works as well as another cmdlet you may wish to use instead.

Comments closed

Creating Human-Readable Intervals from Lists in T-SQL

Daniel Hutmacher is making a list and checking it twice:

If you’ve worked with reporting, you’ve probably come across the following problem. You have a list of values, say “A, B, C, D, K, L, M, N, R, S, T, U, Z” that you want to display in a more user-friendly, condensed manner, “A-D, K-N, R-U, Z”.

Today, we’re going to look at how you can accomplish this in T-SQL, and what this has to do with window functions and gaps and islands.

Read on for a really good demonstration of how powerful window functions and STRING_AGG() can be.

Comments closed

Using the WINDOW Clause in SQL Server 2022

Hasan Savran peeks through the windows:

WINDOW Operations in SQL Server can be hard to understand.  I believe one of the main reasons for that is the long and repeated code it needs. SELECT…WINDOW Clause will help us to remove repeated code and hopefully it will make the WINDOW Operations more user-friendly.

I am really happy about this syntax change, as often times, we have to re-use the same window frame (PARTITION BY and ORDER BY) or a base frame with a minor change (add one more column to the end). This removes the kind of repetition which makes queries harder to read and introduces subtle bugs.

Comments closed

Don’t Store Files in the Database

Josh Darnell provides timeless advice:

As Deborah’s invite post suggests, this is a “that one time at that client” story. I was working at a consulting firm, and we had written an app for a particular client. Part of this application’s workflow involved users uploading images alongside some other information. These were not particularly large images in the grand scheme of things – they were taken by a microscope, and were a few kilobytes each, maybe.

However, this app had been in use for a long time. And as you might have guessed from the title of this post, each of these images was stored in a single table in the database that backed this application

Yeah, that’ll be a problem… Read on for some recommendations on how to avoid the issue. One thing I would add is FileTable, which came out in SQL Server 2012. In that case, the files are actually stored on disk but are queryable via T-SQL. It introduces its own set of problems but I do have some fond feelings about having used FileTable in the past.

Comments closed

SQL Server Container Images and RHEL/Ubuntu Versions

Amit Khandelwal announces a change:

To make this approach scalable & manageable, we will publish SQL Server container images to MCR based on the distribution’s most recent version, rather than publishing it for all versions of both RHEL and Ubuntu. Here’s an example to help you understand: 

SQL Server 2017 supports Ubuntu 18.04 as the most recent distribution; thus, going forward SQL Server 2017 container images based on the Ubuntu 18.04 image will only be published to MCR and we will not publish the SQL Server 2017 container images for Ubuntu 16.04.

This seems fine to me. They focus on image support for the most recent fully-supported version and hopefully make it a bit better. I do wonder if that will change their overall Linux policy around version support, as that could be trickier to sell.

Comments closed