2021-11-29 – Curated SQL

When it comes to data operations, Spark provides a tremendous advantage as a resource for data operations because it aligns with the things that make data ops valuable. It is optimized for machine learning and AI, which are used for batch processing (in real-time and at scale), and it is adept at operating within different types of environments.
Spark doesn’t completely manage these clusters of machines but instead uses a cluster manager (known as a scheduler). Most companies have traditionally used the Java Virtual Machine (JVM)-based Hadoop YARN to manage their clusters. But with the dramatic rise of Kubernetes and cloud-native computing, many organizations are moving away from YARN to Kubernetes to manage their Spark clusters. Spark on Kubernetes is even now generally available since the Apache Spark 3.1 release in March 2021.

I see some of the benefits there but am not totally sold, especially given the complexity of Kubernetes and its own lack of built-in security measures.

Comments closed

Ordered Set Functions in SQL Server

Published 2021-11-29 by Kevin Feasel

I continue a series on window functions in SQL Server:

As of SQL Server 2019, there is only one ordered set function: STRING_AGG(). I like STRING_AGG() a lot, especially because it means my days of needing to explain the STUFF() + FOR XML PATH trick to concatenate values together in SQL Server are numbered.
STRING_AGG() is interesting in that we categorize it as a window function and yet it violates my first rule of window functions: there isn’t an OVER() clause. Instead, it accepts but does not require a WITHIN GROUP() clause. Let’s see it in action.

Click through for a look at that, as well as a little hint that maybe we’ve seen ordered set functions before in a different guise.

Comments closed

FAST_FORWARD and Cursors

Published 2021-11-29 by Kevin Feasel

Joe Obbish skips past the commercials:

If you’re like me, you started your database journey by defining cursors with the default options. This went on until a senior developer or DBA kindly pointed out that you can get better performance by using the FAST_FORWARD option. Or maybe you were a real go-getter and found Aaron Bertrand’s performance benchmarking blog post on different cursor options. I admit that for many years I didn’t care to know why FAST_FORWARD sometimes made my queries faster. It had “FAST” in the name and that was good enough for me.
Recently I saw a production issue where using the right cursor options led to a 1000X performance improvement. I decided that ten years of ignorance was enough and finally did some research on different cursor options. This post contains a reproduction and discussion of the production issue.

I thought everybody knew how this works: the database streams the data tape from the supply pully to the play shaft by using a sprocket to rotate the gear in the cassette at a fixed speed. The FAST_FORWARD cursor option engages the fast forward idler in the ~~VCR~~ database and causes rotation to occur more rapidly than normal.

Comments closed

New Power BI Tab Order Icons

Published 2021-11-29 by Kevin Feasel

Meagan Longoria looks at some helpful new icons in Power BI’s Selection pane:

If you’ve visited the Tab order area of the Selection Pane in Power BI in the last couple of months, you might have noticed some new buttons.

Click through to see the three functions in action.

Comments closed

Setting CONTEXT_INFO from Power Query

Published 2021-11-29 by Kevin Feasel

Chris Webb is on a mission:

In my quest to check out every last bit of obscure Power Query functionality, this week I looked into the ContextInfo option on the Sql.Database and Sql.Databases M functions. This option allows you to set CONTEXT_INFO in SQL Server (see here for a good article explaining what this is) and here’s an example of how to use it:

Click through for that example.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Day: November 29, 2021

Managing Spark on Kubernetes instead of YARN

Ordered Set Functions in SQL Server

FAST_FORWARD and Cursors

New Power BI Tab Order Icons

Setting CONTEXT_INFO from Power Query