May 2022 – Page 8 – Curated SQL

Returning an Empty Table in Power Query

Published 2022-05-18 by Kevin Feasel

Imke Feldmann doesn’t want you to leave empty-handed:

When your query returns an error, sometimes it could make sense to return an empty table instead of a simple error message. And ideally, this table has the same columns and types than the table that would have been returned without the error. Check out Chris Webbs article about it here: Chris Webb’s BI Blog: Handling Data Source Errors In Power Query Chris Webb’s BI Blog (crossjoin.co.uk)

Read on for a function which generates an empty replica table based on some other table.

Comments closed

Installing mssql-cli on Ubuntu 22.04

Published 2022-05-18 by Kevin Feasel

Andrew Pruski runs into a problem:

I really like the mssql-cli tool. I use it pretty much everyday however it seems like it’s not being maintained anymore and as such, there are issues when trying to install it on Ubuntu 22.04.
The issue with Ubuntu 22.04 is that it has python 3.10 installed by default which the current mssql-cli is not compatible with. I did try installing previous versions of python (3.8 and 3.9) but had no luck so kept the default version.

Click through to see what Andrew did to resolve the problem. It looks like there’s already a GitHub issue for this that was opened back in October.

Comments closed

Using the HAVING Clause in Spark

Published 2022-05-16 by Kevin Feasel

Lnadon Robinson continues the Spark Starter Guide:

Having is similar to filtering (filter(), where() or where (in a SQL clause)), but the use cases differ slightly. While filtering allows you to apply conditions on your non-aggregated columns to limit the result set, Having allows you to apply conditions on aggregate functions / columns instead.

Read on for examples in Spark SQL, both as a SQL query and Scala/Python function calls.

Comments closed

Databricks Workflows

Published 2022-05-16 by Kevin Feasel

Stacy Kerkela, et al, make an announcement:

Today we are excited to introduce Databricks Workflows, the fully-managed orchestration service that is deeply integrated with the Databricks Lakehouse Platform. Workflows enables data engineers, data scientists and analysts to build reliable data, analytics, and ML workflows on any cloud without needing to manage complex infrastructure. Finally, every user is empowered to deliver timely, accurate, and actionable insights for their business initiatives.

This looks a bit like Synapse pipelines. It’ll be interesting to see how this evovles.

Comments closed

Reviewing the ReportServer Database

Published 2022-05-16 by Kevin Feasel

Ed Pollack opens the lid:

SQL Server Reporting Services is a convenient application for generating reports quickly and efficiently. Its back-end components are a bit more confusing to an unsuspecting administrator.
This article delves into the ReportServer database, revealing the tables and data that are used to power SSRS. In addition, the ability to alter data in these tables is presented as a way to avoid time-consuming migration or data modification processes.

Do be sure to check out Ed’s warning and then dive into the database.

Comments closed

TRY-CATCH in T-SQL

Published 2022-05-16 by Kevin Feasel

Kevin Wilkie breaks out the catcher’s mitt:

In our last post, we started talking about transactions and how they can help your databases have data integrity. Let’s continue that today with our friends TRY and CATCH.
TRY is just telling SQL Server that we want to try out something. CATCH tells SQL Server what to do if it completely messes up.

Read on to see how it works. As a quick note for those of you who come at this from a software developer’s background, T-SQL’s try-catch capabilities are not as robust as what you probably are used to, especially with respect to what gets caught. Most severe error messages are not catchable, so you can’t always expect that database query to go quietly in the night.

Comments closed

Replacing Common Table Expressions in ADF Dataflows

Published 2022-05-16 by Kevin Feasel

Jeet Kainth needs an alternative:

At the time of writing, it is not possible to write a query using a CTE in the source of a dataflow. However, there are a few options to deal with this limitation:
– re-write the query using subqueries instead of CTEs
– use a stored procedure that contains the query and reference the stored proc in the source of the dataflow
– write the query as a view and reference the view in the source of the dataflow (this is my preferred method and the one I will demo here)

Jeet focuses on the third alternative. I’d lean toward the second or the third alternative, myself. Probably the second one (stored procedures) but both allow me to create an interface between ADF and the database. That way, underlying table changes will be less likely to require me to make code changes in ADF.

Comments closed

Using RESTORE DATABASE to Restore Log Files

Published 2022-05-16 by Kevin Feasel

Kevin Hill learns a new trick:

How long has this worked, and why didn’t anyone tell me?
A typical restore statement from a FULL backup and a LOG backup:

Click through to see what Kevin means and how you might be able to make your life a little easier.

Comments closed

Reviewing a SQL Server Backup File

Published 2022-05-16 by Kevin Feasel

Andy Yun digs into a SQL Server backup file:

This first came up during my onboarding with Pure Storage. Anthony Nocentino (b|t) taught me that a SQL Server backup file is a byte-for-byte copy of your data, as materialized in SQL Server MDF files (assuming no backup compression or backup encryption). And that would make sense – how else would SQL Server store a copy of your data in a backup file? It does not make sense for SQL Server to alter your data when it writes it down to a backup file (again, with NO backup compression/encryption) – that’s a waste of compute and effort.
Well, I had a conversation with someone who was unclear about that assertion. I tried some Google-fu to present some supporting materials, but could not actually find any documentation, official or otherwise, to back it up. So here we are.

Click through to dive into a backup file with Andy.

Comments closed

Reviewing the Gartner BI Magic Quadrant

Published 2022-05-16 by Kevin Feasel

Matt Allington takes a look:

I update this blog article regularly to keep track of how Microsoft is progressing on its strategic journey in building a world class BI platform. Gartner released the 2022 magic quadrant for Business Intelligence in March 2022. As expected (by me at least), Microsoft is continuing its trail blazing and has maintained a clear lead over the other contenders in both ability to execute and completeness of vision. I first wrote this blog article in 2019 to show the trend over time. Since then, I have been collecting the new Gartner magic quadrant images and I have combined them all (2015 – 2022) into an animated gif file (see below). The visualisation was built using Power BI (what else) so it is very easy to see the changing nature of the competitive environment. Note that Microsoft started behind back in 2015, and arguably hit the lead in 2018.

Also check out Matt’s video on how to translate spatial values to numbers so Power BI can plot them.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

Month: May 2022