Kevin Feasel – Page 409

Occasionally the unthinkable can occur and the DBA can be left with a standby database that is no longer synchronizing with the primary. A plethora of “advice” will soon follow that discovery, most of it much like this:

“Well, ya gotta rebuild it.”

Of course the question to ask is “how far out of synch is the standby?” That question is key in determining how to attack this situation. Let’s go through the two most common occurrences of this and see how to address them.

Read on to see David’s advice.

Comments closed

Spark Defaults for Core Count and Memory

Published 2023-09-22 by Kevin Feasel

The Big Data in Real World team gives us the defaults:

spark.executor.cores controls the number of cores available for the executors.

[…]

spark.executor.memory controls the amount of memory allocated for each executor.

I did helpfully take out the first answer, so you’ll have to click through to the post to see the answers, as well as how cluster mode vs client mode can change things.
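If you would rather not depend on the defaults at all, both settings can be passed explicitly at submit time. Here is a minimal sketch that assembles a spark-submit command line with the two properties named above. The helper name `build_spark_submit`, the script name `my_job.py`, and the values 4 and 8g are all hypothetical and chosen purely for illustration; they are not the defaults, which the linked post covers.

```python
import shlex


def build_spark_submit(app, executor_cores, executor_memory, deploy_mode="client"):
    """Return a spark-submit argument list with explicit executor settings.

    Hypothetical helper: it only builds the argument list and never runs Spark.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unknown deploy mode: {deploy_mode}")
    return [
        "spark-submit",
        "--deploy-mode", deploy_mode,
        # Each --conf takes a single key=value pair.
        "--conf", f"spark.executor.cores={executor_cores}",
        "--conf", f"spark.executor.memory={executor_memory}",
        app,
    ]


# Illustrative values only (assumptions, not Spark's defaults).
args = build_spark_submit(
    "my_job.py", executor_cores=4, executor_memory="8g", deploy_mode="cluster"
)
command = shlex.join(args)
print(command)
```

Building the arguments as a list and joining them only for display keeps quoting safe if the list is later handed to something like `subprocess.run`.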

Comments closed

Deployment Pipelines for Microsoft Fabric

Published 2023-09-22 by Kevin Feasel

Reitse Eskens crosses a line:

It’s a bit of a challenge to keep up with all the changes, updates and all the new stuff coming out for Fabric. As I’m not really invested in the PowerBI part of the data platform (yay pie charts ;)), some things that are very common for the PowerBI community are very new to me. I have it on good authority that this blog covers a feature that is well known within PowerBI but quite new in the data engineering part. When I say that, I need to add that at the time of writing, only the PowerBI side of things is fully supported but I have very good hopes that pipelines and notebooks will be supported as well.

Supporting pie charts are fightin’ words here. Nonetheless, read on to see how deployment pipelines work in Microsoft Fabric.

2 Comments

Query Execution Concepts and SQL Server

Published 2023-09-22 by Kevin Feasel

Erik Darling answers the question, why is it so hard to figure out why my query sometimes sucks:

Sometimes people will ask me penetrating questions like “why does SQL Server choose a bad execution plan?” or “why is this query sometimes slow?”

Like many things in databases, it’s an endless spiral of multiverses (and turtles) in which many choose your own adventure games are played and, well, sometimes you get eaten by a Grue.

In this post, I’m going to talk at a high level about potential reasons for both.

Read on for a smorgasbord of factors to consider based on the steps SQL Server takes.

Comments closed

Finding SSAS Tabular Dimensions in Excel

Published 2023-09-22 by Kevin Feasel

Olivier Van Steenlandt has lost a few dimensions in the couch cushions:

A colleague reached out last week while connecting to one of our SQL Server Analysis Services models in Excel. He couldn’t find the expected Attribute folders in the model. He was looking for the following dimensions:

Of particular interest was that this colleague could not see them but Olivier could. The answer ends up being a bit surprising.

Comments closed

Grouped Scatter Plots in R

Published 2023-09-21 by Kevin Feasel

Steven Sanderson builds a scatter plot:

Data visualization is a powerful tool for gaining insights from your data. Scatter plots, in particular, are excellent for visualizing relationships between two continuous variables. But what if you want to compare multiple groups within your data? In this blog post, we’ll explore how to create engaging scatter plots by group in R. We’ll walk through the process step by step, providing several examples and explaining the code blocks in simple terms. So, whether you’re a data scientist, analyst, or just curious about R, let’s dive in and discover how to make your data come to life!

Click through for several examples of plot generation.

Comments closed

Azure Data Studio 1.46

Published 2023-09-21 by Kevin Feasel

Erin Stellato has an update for us:

We’re rolling towards conference season and the Azure Data Studio engineers have been working hard on two things:

adding new functionality

improving stability and performance within the application

Read on to see what they’ve pushed out with Azure Data Studio 1.46.

Comments closed

ORMs and Mapping Requirements

Published 2023-09-21 by Kevin Feasel

Mark Seemann is not a big fan of Entity Framework:

When I evaluate whether or not to use an ORM in situations like these, the core application logic is my main design driver. As I describe in Code That Fits in Your Head, I usually develop (vertical) feature slices one at a time, utilising an outside-in TDD process, during which I also figure out how to save or retrieve data from persistent storage.

Thus, in systems like these, storage implementation is an artefact of the software architecture. If a relational database is involved, the schema must adhere to the needs of the code; not the other way around.

To be clear, then, this article doesn’t discuss typical CRUD-heavy applications that are mostly forms over relational data, with little or no application logic. If you’re working with such a code base, an ORM might be useful. I can’t really tell, since I last worked with such systems at a time when ORMs didn’t exist.

Read on for a thoughtful argument. The only critique I have is I’d prefer stored procedures over saving SQL queries in the code.

1 Comment

Fixing Microsoft Fabric V-Order Optimization

Published 2023-09-21 by Kevin Feasel

Dennes Torres asks and answers a question:

I explained in a previous article how the Tables in a lakehouse are V-Order optimized. We noticed this configuration depends on our settings, which can be enabled or not.

One question remains: How could we check if the tables are V-Order optimized or not?

Read on for the answer, as well as a link containing more information on V-Order optimization.

Comments closed

Incremental Sort in Postgres

Published 2023-09-21 by Kevin Feasel

Umair Shahid takes us through the concept of incremental sort in PostgreSQL:

Incremental sort is a database optimization feature, introduced in PostgreSQL 13, that allows sorting to be done incrementally during the query execution process. Sorting is a common operation in database queries, often necessary when retrieving data in a specific order. PostgreSQL’s query planner uses incremental sort to improve query performance, particularly for large datasets. This feature is enabled by default in PostgreSQL 13 and above.

Read on to see how it works and some good practices which help maximize the likelihood that you can take advantage of the feature.
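To make the idea concrete, here is a small conceptual sketch of what sorting incrementally means. It is not PostgreSQL's implementation, just an illustration under one assumption: the rows already arrive ordered by a leading key (say, from an index), so only the rows within each run of equal leading-key values still need sorting by the remaining key. The function name `incremental_sort` and the sample rows are made up for this example.

```python
from itertools import groupby
from operator import itemgetter


def incremental_sort(rows, presorted_key, remaining_key):
    """Order rows by (presorted_key, remaining_key).

    Assumes rows already arrive ordered by presorted_key, so each run of
    equal presorted_key values can be sorted on its own, one small batch
    at a time, instead of sorting the whole input at once.
    """
    result = []
    for _, group in groupby(rows, key=presorted_key):
        result.extend(sorted(group, key=remaining_key))
    return result


# Sample rows, already ordered by the first column but not the second.
rows = [(1, "c"), (1, "a"), (2, "b"), (2, "a"), (3, "z")]
ordered = incremental_sort(rows, itemgetter(0), itemgetter(1))
print(ordered)
```

Because each batch is finished before the next one starts, the first rows are available early, which is one reason this approach can help queries that only need the top few results.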

Comments closed

Author: Kevin Feasel

Oracle: RMAN and Non-Synchronizing Standby Database
