April 2023 – Page 9 – Curated SQL

Data-Level Security in Power BI

Published 2023-04-11 by Kevin Feasel

Reza Rad explains different ways to secure data in Power BI:

Power BI supports the security of the data at the dataset level. This security means everyone can see the data they are authorized to see. There are different levels of that in Power BI, including Row-Level Security, Column-Level Security, and Object-Level Security. All these help Power BI Developers create one dataset but give users different views of the data from the same report. In this article, I’ll explain each of those methods and give some guidance on how to use them.

This serves as the opener to a series of articles on Power BI data security.

Comments closed

T-SQL and Fun Puzzles

Published 2023-04-11 by Kevin Feasel

Rob Farley puzzles it out:

Back in my uni days I remember a Prolog assignment to solve “each letter represents a number” puzzles, and my solution being slow. Years later I tried it again and it worked out just fine, but by then the due date was in the past and they weren’t prepared to change my grade.

While these kinds of things can be fun (more so when there aren’t uni grades dependent on the solution), there are also times that it can be fun to rewrite some code in a way that is more intuitive, or that feels clever in a profoundly simple way.

Rob shares links to a few examples along those lines.

Fixing the Parallelism Documentation

Published 2023-04-11 by Kevin Feasel

Erik Darling shreds the docs:

The section with the weirdest errors and omissions is right up at the top. I’m going to post a screenshot of it, because I don’t want the text to appear here in a searchable format.

That might lead people not reading thoroughly to think that I condone any of it, when I don’t.

Erik pulls no punches on this post. Hopefully the end result is that this part of the documentation improves.

Data Recovery with Crash-Consistent Snapshots

Published 2023-04-11 by Kevin Feasel

Andrew Pruski has a video for us:

A while back I posted a blog on how to recover data with crash consistent snapshots.

Snapshots are pretty handy in certain situations so I thought I’d show you them in action!

Click through for the video. Just note that there’s no audio on it, so don’t turn your speakers way up in the hopes that you can hear Andrew.

Changes to the IaaS Agent for SQL Server on Azure VMs

Published 2023-04-11 by Kevin Feasel

Aditya Badramraju has an announcement:

SQL Server on Azure Virtual Machines is powered by the SQL IaaS Agent extension which provides many features that make managing your SQL Server easy. This blog will discuss new features and changes we’ve recently released in this extension.

Click through for those changes. I was prepared, upon seeing the “Retiring Modes” section, to have a cynical response about forcing everyone into what was effectively Full mode, but that proto-take ended up being way off base and the truth is much nicer.

Reading Multi-Sheet Excel Files in R

Published 2023-04-10 by Kevin Feasel

Steven Sanderson does a bit of Excel file reading:

Reading in an Excel file with multiple sheets can be a daunting task, especially for users who are not familiar with the process. In this blog post, we will walk through a sample function that can be used to read in an Excel file with multiple sheets using the R programming language.

Click through for the process, which makes use of the lapply() function and the readxl package.

An Overview of the Kappa Architecture

Published 2023-04-10 by Kevin Feasel

Amian Patnaik provides an overview:

The Kappa Architecture, introduced by Jay Kreps, co-founder of Confluent, is designed to handle real-time data processing in a scalable and efficient manner. Unlike the traditional Lambda Architecture, which separates data processing into batch and stream processing, the Kappa Architecture promotes a single pipeline for both batch and stream processing, eliminating the need for maintaining separate processing pipelines.

What’s interesting to me is that Lambda, an architecture which was an explicit product of its time (in the sense that it was a compromise architecture trying to do two things which the limited hardware and tooling of the time couldn’t do together), is still thriving today. Kappa, meanwhile, isn’t an architectural style that people throw around a lot anymore, at least in the circles I run in.
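
To make the "single pipeline" idea a bit more concrete, here is a minimal Python sketch. It is not taken from Amian's article: the EventLog and RunningTotals classes and the (key, amount) event shape are hypothetical stand-ins, with an in-memory list playing the role of a durable log. The point it tries to show is that one job handles both live events and reprocessing, the latter by replaying the log from the start.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # hypothetical event shape: (key, amount)


class EventLog:
    """Append-only log standing in for a durable, replayable stream."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def read_from(self, offset: int) -> List[Event]:
        # Return every event at or after the given position.
        return self._events[offset:]


class RunningTotals:
    """The one processing job: consumes events and keeps per-key totals."""

    def __init__(self) -> None:
        self.totals: Dict[str, int] = defaultdict(int)
        self.offset = 0  # next log position this job has not yet read

    def consume(self, log: EventLog) -> None:
        for key, amount in log.read_from(self.offset):
            self.totals[key] += amount
            self.offset += 1


log = EventLog()
for event in [("a", 1), ("b", 5), ("a", 2)]:
    log.append(event)

# "Streaming": the live job processes whatever has arrived so far.
live = RunningTotals()
live.consume(log)
log.append(("b", 1))
live.consume(log)  # only the newly appended event is read

# "Batch": a fresh instance of the same job replays the log from offset 0.
replayed = RunningTotals()
replayed.consume(log)
```

Both instances end up with the same totals, which is the property that lets a Kappa-style design skip a separate batch code path.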

Spark ELT in Synapse Notebooks

Published 2023-04-10 by Kevin Feasel

Liliam Leme performs some data movement:

I often receive various requests from customers while working on FastTrack projects, and I have compiled some examples to help you build your solution on top of a data lake using useful tips. Most of the examples in this post use pandas, and I hope they will be helpful for you as they were for me.

Please note that all examples in this post use pyspark.

In my scenario, I exported multiple tables from SQLDB to a folder using a notebook and ran the requests in parallel.

Read on for the examples and some of the things you can do with Spark notebooks in Azure Synapse Analytics.
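
As a rough illustration of the "export multiple tables in parallel" scenario, here is a sketch using only the Python standard library. It is not Liliam's code: the TABLES dictionary, export_table, and export_all are hypothetical names, and an in-memory dictionary stands in for the database so the example can run anywhere. In a real notebook, the body of export_table would be where each table gets read from the source and written to the lake.

```python
import csv
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterable, List

# Stand-in data (assumption): header row followed by data rows per table.
TABLES = {
    "customers": [("id", "name"), (1, "Ann"), (2, "Bob")],
    "orders": [("id", "customer_id"), (10, 1), (11, 2), (12, 1)],
}


def export_table(name: str, out_dir: Path) -> Path:
    """Write one table to its own CSV file and return the file path."""
    path = out_dir / f"{name}.csv"
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(TABLES[name])
    return path


def export_all(table_names: Iterable[str], out_dir: Path, max_workers: int = 4) -> List[Path]:
    """Run the per-table exports concurrently and collect the output paths."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input names.
        return list(pool.map(lambda n: export_table(n, out_dir), table_names))


out_dir = Path(tempfile.mkdtemp())
exported = export_all(sorted(TABLES), out_dir)
```

Threads are a reasonable fit here because each export mostly waits on I/O; max_workers is the knob that caps how many requests hit the source at once.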

Against Triggers in PostgreSQL

Published 2023-04-10 by Kevin Feasel

Laetitia Avrot is not a fan of triggers:

My opinion comes from years of practicing as a production DBA, then as a database consultant. As such a professional, my opinion is biased because I am never called when it works! I’ve always been called when there are problems (big problems, usually) so that I see the worst developers can do and never the best. I try to be aware of that bias, but it’s not that easy.

I am sympathetic to Laetitia’s argument but ultimately don’t agree, at least in the general case. Some of these thoughts and alternatives are Postgres-specific, so I don’t have an opinion on those.

Ordered Columnstore Indexes in SQL Server 2022

Published 2023-04-10 by Kevin Feasel

Ed Pollack gives us the scoop on ordered columnstore indexes:

One of the more challenging technical details of columnstore indexes that regularly gets attention is the need for data to be ordered to allow for segment elimination. In a non-clustered columnstore index, data order is automatically applied based on the order of the underlying rowstore data. In a clustered columnstore index, though, data order is not enforced by any SQL Server process. This leaves managing data order to us, which may or may not be an easy task.

To assist with this challenge, SQL Server 2022 has added the ability to specify an ORDER clause when creating or rebuilding an index. This feature allows data to be automatically sorted by SQL Server as part of those insert or rebuild processes. This article dives into this feature, exploring both its usage and its limitations.

I’ve seen a couple places where ordered columnstore indexes make enough sense to use, though not as many as I had first anticipated. That might change over time, as we see additional columnstore development.
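
For a feel of why data order matters for segment elimination, here is a small Python simulation. It is a toy model rather than anything from Ed's article or from SQL Server itself: build_segments and segments_to_scan are hypothetical helpers, and segments are shrunk to 100 values each (real rowgroups hold up to 1,048,576 rows). Each segment keeps only its minimum and maximum, and a range predicate must read any segment whose range overlaps it.

```python
import random
from typing import List, Sequence, Tuple

SEGMENT_SIZE = 100  # tiny for illustration only


def build_segments(values: Sequence[int]) -> List[Tuple[int, int]]:
    """Split values into fixed-size segments and record (min, max) for each."""
    segments = []
    for start in range(0, len(values), SEGMENT_SIZE):
        chunk = values[start:start + SEGMENT_SIZE]
        segments.append((min(chunk), max(chunk)))
    return segments


def segments_to_scan(segments: Sequence[Tuple[int, int]], low: int, high: int) -> int:
    """Count segments whose [min, max] range overlaps the predicate [low, high]."""
    return sum(1 for seg_min, seg_max in segments if seg_max >= low and seg_min <= high)


random.seed(42)
shuffled = list(range(10_000))
random.shuffle(shuffled)

unordered = build_segments(shuffled)        # data loaded in arbitrary order
ordered = build_segments(sorted(shuffled))  # data sorted before loading

# Predicate equivalent to: WHERE value BETWEEN 2000 AND 2099
unordered_scans = segments_to_scan(unordered, 2_000, 2_099)
ordered_scans = segments_to_scan(ordered, 2_000, 2_099)
```

With sorted input the predicate lands in a single segment, while with shuffled input nearly every segment's min/max range straddles it, so far more segments have to be read.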
