Curated SQL – Page 525 – A Fine Slice Of SQL Server

Disabling Public Network Access in Synapse

Published 2023-02-10 by Kevin Feasel

If you disable public access to your Azure Synapse Workspace you will get the following error message when attempting to open Synapse Studio.

“Failed to load one or more resources due to forbidden issue, error code 403.”

Click through for more information about routing for Synapse resources and what you’d need to do in order to disable public network access entirely.

Comments closed

PGSQL Phriday 005 Recap

Published 2023-02-10 by Kevin Feasel

Ryan Lambert rounds up the latest PGSQL Phriday:

Thank you everyone who contributed to PgSQL Phriday #005! This month’s topic: “Is your data relational?” If I missed any contributions, or if new ones are published, let me know and I’ll try to update this post. These snippets are in a somewhat random order, loosely threaded together by sub-topic.

Click through for the contributions.

Comments closed

Power BI Incremental Refresh with Non-Standard Dates in Parquet Files

Published 2023-02-10 by Kevin Feasel

Shabnam Watson hits on a specific but interesting use case:

The most common scenario for setting up the out of the box incremental refresh in Power BI is to base it off of a datetime column; however, there are cases when you may want to set up incremental refresh based off of a column with a data type other than datetime. Examples are when you are working with a smart date ID (01012023 for Jan 1,2023) column or when you are working with a source system that has partitioned data using a column such as Year that has a numeric data type.

A use case for the latter scenario is when you are working with Parquet/Delta files via Azure Synapse Analytics Serverless SQL Pool. When working with larger datasets, it is typical to see the Parquet/Delta files partitioned by date ranges. Depending on how much data there is, the partitioning may be at the Year level instead of Day.

With that scenario in mind, read on to learn how you can minimize your Power BI processing time and costs when doing incremental refresh.

Comments closed

Data Protection in Azure

Published 2023-02-10 by Kevin Feasel

Deepthi Goguri enumerates the ways:

Data needs to be protected no matter where it lives, On-prem or in Azure. Data can be protected by using the encryption that Azure provides. What are the types of encryption we have in Azure?

In addition to specific encryption options, Deepthi also provides an overview of Dynamic Data masking and Ledger tables.

Comments closed

Adaptive Protection and More in Microsoft Purview

Published 2023-02-10 by Kevin Feasel

Victoria Holt summarizes some Microsoft Purview updates:

Yesterday a lot of changes were announced for Microsoft Purview at the Microsoft security event securing your data with a multilayered defense. Microsoft Purview is about managing data security risks across hybrid multi-cloud data estates that have a defense in depth strategy to mitigate risk. The recording can be watched at this link.

Read on for a summary of what went down.

Comments closed

How Power BI Chooses Colors for Legends

Published 2023-02-09 by Kevin Feasel

Allison Kennedy reveals a mystery of the universe:

I’ve just had a wonderful discovery about why Power BI sometimes seems to choose random colors in the legend.

Typically, the first item in a series will match the first color of your Power BI theme, the second item in the series will match the second color of your Power BI theme, and so on.

However, this isn’t always the case. I have noticed that sometimes when I have text category values for my legend that Power BI can assign random colors, seemingly not even part of my theme. Until recently, I just accepted this as a quirk of Power BI and carried on with my report development.

Read on for the answer.

Comments closed

Migrating SQL Server Databases to Aurora

Published 2023-02-09 by Kevin Feasel

Jamie Wick wounds me:

Our decision to move some (not all) SQL databases to AWS Aurora was partially a cost decision, but also an effort to simplify operational management by diversifying our database platforms. Now that second reason might seem counter-intuitive, but here are some details about the situation and how we came to our decision:

Read on for Jamie’s thoughts and tips to make the migration process easier.

Comments closed

Thoughts on the Data Lakehouse Concept

Published 2023-02-09 by Kevin Feasel

Teo Lachev shares some thoughts:

There has been a lot of noise surrounding a data lakehouse nowadays, so I felt the urge to chime in. In fact, the famous guy in cube, Patrick LeBlanc, gave a great presentation on this subject to our Atlanta Power BI Group and you can find the recording here (I have to admit we could have done better job with the recording quality, but we are still learning in the post-COVID era).

A lot of Teo’s thoughts relate to the data lake portion of the “lakehouse” metaphor and they’re worth reading. One small comment I’ll make is regarding the following statement:

And should you stage the data 1:1 from the source? In some cases, like the Get Data First aforementioned scenario, it might make sense. But in most cases, it would be much more efficient to stage the data in the shape you need it, which may necessitate joining multiple tables at the source (by the way, a relational server is the best place to handle joins).

The biggest benefit to getting data as “raw” as possible comes from the expectation that upstream data is ephemeral. What I mean is that, if you pull data on 2023-02-08, you’ll get some dataset. On 2023-04-08, you won’t be able to re-create that original dataset because your source will have changed, with inserts, updates, and deletes modifying the source.

Therefore, you want your copy of data into the Raw/Bronze layer to be as “opinion-free” as possible. This is because, if your data access code has a bug in it, you’ll never be able to re-create the old data. By contrast, if you have that raw data, you can start from there, re-apply your transformations, and replace your downstream results with corrected details. This is similar to the way a write-once service like Apache Kafka can benefit: you get history where there was none before.

So that’s the trade-off on the other side. Yeah, it’s typically more efficient to create queries and operate on the data before it goes into Raw, and you might need to do so for practical reasons, but there’s a risk of that code causing an irrevocable data problem and the risk goes up as you get more complicated operations.

Comments closed

Last Refresh Date Measure in Power BI

Published 2023-02-09 by Kevin Feasel

Paul Turley keeps it fresh:

Business users often need to know how fresh the data is that they see on a Power BI report. This requirement can be interpreted to mean either “when was the last date & time that the source data was updated?” or “when was the last date & time that the data model was refreshed?”

Read on for Paul’s current solution to the problem of knowing the last time this data was refreshed.

Comments closed

Azure Synapse Analytics January 2023 Updates

Published 2023-02-09 by Kevin Feasel

Ryan Majidimehr shares an overview of what the Synapse team has been working on:

ADX contains native support for detecting anomalies over multiple time series by using the function series_decompose_anomalies(). This function can analyze thousands of time series in seconds, enabling near real time monitoring solutions and workflows based on ADX. Univariate analysis is simpler, faster, and easily scalable and is applicable to most real-life scenarios. However, there are some cases where it might miss anomalies that can only be detected by analyzing multiple metrics at the same time.

For some scenarios, there might be a need for a true multivariate model, jointly analyzing multiple metrics. This can be achieved now in ADX using the new Python-based multivariate anomaly detection functions.

The themes for this month are Spark, Data Exploration (via Kusto), and data integration.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Curated SQL Posts