
Author: Kevin Feasel

Customizing Spark Settings in Microsoft Fabric Workspaces

Nikola Ilic doesn’t accept the default:

In this article, I’ll walk you through how to go from out-of-the-box default Spark configurations to a fine-tuned setup that suits your specific workloads and requirements, as well as getting you ready for the DP-700 exam.

Spark is an extremely powerful engine, but like any powerful tool, it runs best when you tune it. So, don’t always settle for default. Get dynamic—and get Spark working the way you need it to.

Click through for the explanation of functionality.

Comments closed

Backups Aren’t Enough

Kevin Hill lays out a common but very important argument:

Many IT leaders and system admins think, “We have full backups every night. We’re covered.” But when the time comes to restore, they discover:

· The backup file is corrupt.
· The storage location is inaccessible.
· The restore process takes way longer than expected.
· The recovery model wasn’t configured properly.
· The point-in-time restore doesn’t actually bring back the data they need.

At that point, it’s not a “backup strategy.” It’s a data loss incident.

The solution is to test those backups, and Kevin provides some guidance on how, as well as additional important parts of the story.


Using DATETRUNC() in SQL Server

Rajendra Gupta shows off a nice feature in SQL Server 2022:

Suppose you are a data strategist or analyst for an organization. You have been tasked with getting actionable insights from customers who want to track customer patterns at different intervals, such as hourly, daily, or weekly. To do this, you need to use several date functions such as DATEADD, DATEDIFF, DATEPART, and DATEFROMPARTS to get the required date format.

In SQL Server 2022, this got a lot easier to do using the DATETRUNC function.

Solutions using DATETRUNC() are significantly easier to read and understand than the alternative of combining DATEADD() and DATEDIFF().
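To make the readability difference concrete, here is a minimal Python sketch of the two approaches. It is an analogy rather than T-SQL, and the function names `trunc_old_style` and `date_trunc` are hypothetical. The first mimics the `DATEADD(unit, DATEDIFF(unit, 0, ts), 0)` idiom by counting whole units since date zero (1900-01-01) and adding them back; the second states the truncation directly, which is roughly what DATETRUNC() lets you do. It assumes timestamps on or after 1900-01-01 and covers only hour, day, and month.

```python
from datetime import datetime, timedelta

# SQL Server treats the integer 0 as 1900-01-01 00:00:00 when cast to datetime.
SQL_DATE_ZERO = datetime(1900, 1, 1)


def trunc_old_style(ts: datetime, unit: str) -> datetime:
    """Mimic DATEADD(unit, DATEDIFF(unit, 0, ts), 0) for ts >= 1900-01-01."""
    if unit == "month":
        # Count whole months since date zero, then rebuild the date from that count.
        months = (ts.year - SQL_DATE_ZERO.year) * 12 + (ts.month - 1)
        return datetime(SQL_DATE_ZERO.year + months // 12, months % 12 + 1, 1)
    steps = {"hour": timedelta(hours=1), "day": timedelta(days=1)}
    if unit not in steps:
        raise ValueError(f"unsupported unit: {unit}")
    # Count whole units since date zero, then add them back to date zero.
    whole_units = (ts - SQL_DATE_ZERO) // steps[unit]
    return SQL_DATE_ZERO + whole_units * steps[unit]


def date_trunc(ts: datetime, unit: str) -> datetime:
    """State the intent directly: zero out everything below the chosen unit."""
    if unit == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if unit == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")


if __name__ == "__main__":
    sample = datetime(2024, 3, 15, 13, 47, 29)
    for unit in ("hour", "day", "month"):
        print(unit, trunc_old_style(sample, unit), date_trunc(sample, unit))
```

Both functions return the same value for a given input; the second simply says what it means.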


Microsoft Fabric Extensions for VS Code

Sunitha Muthukrishna announces a new trio of VS Code extensions:

Microsoft Fabric is changing how we handle data engineering and data science. To make things easier, Microsoft added some cool extensions for Visual Studio Code (VS Code) that help you manage Fabric artifacts and build analytical applications.

By adding these Microsoft Fabric extensions to VS Code, developers can quickly create Fabric solutions and manage their data setups right from their coding environments. Here, we’ll look at these extensions and show why they’re useful.

Click through for notes on the three extensions that are available. Note that two of them are still in preview.


Proactive Monitoring in Microsoft Fabric via Activator

Someleze Diko shows off a powerful feature in Microsoft Fabric:

Driving actions from real-time organizational data is important for making informed data-driven decisions and improving overall efficiency. By leveraging data effectively, organizations can gain insights into customer behaviour, operational performance, and market trends, enabling them to respond promptly to emerging issues and opportunities.

Setting alerts on KQL queries can significantly enhance this proactive approach, especially in scenarios such as customer support. For instance, by monitoring key metrics like response times, ticket volumes, and satisfaction scores, support teams can identify patterns and anomalies that may indicate underlying problems.

This helps drive home an important mental shift around “real-time intelligence.” Ignoring my standard disdain for misuse of the term “real-time,” most people will ignore the feature because of a perfectly reasonable belief: my data doesn’t come in that frequently, so I don’t really need to process it in near-real-time. But the real-time intelligence functionality isn’t necessarily just about loading in your data and making it available to users faster. Instead, think of it as acting immediately when your data does change, especially if you have multiple sources of data loading at different times during the day.


Analyzing Microsoft Fabric Lakehouse Query Performance

Dennes Torres takes a peek at some views:

You may have already discovered the 4 special views the lakehouse has in the queryinsights schema to track query performance. I made a video about the lakehouse special tables, but since then, they evolved a lot:

  • queryinsights.exec_requests_history
  • queryinsights.exec_sessions_history
  • queryinsights.frequently_run_queries
  • queryinsights.long_running_queries

Let’s discover what these tables have to offer for us to analyze the lakehouse performance.

Click through to see what each one of these holds.


Indexing for PostgreSQL in pgNow

Ryan Booz continues a series on pgNow:

In that first article, I shared how pgNow can be a lifesaver when you need immediate performance insights, highlighting features like query tuning and current activity monitoring. The tool’s ability to take periodic snapshots of query activity and spotlight active sessions has already been a significant help for early users.

Today, I wanted to look at another area of information that pgNow can help you explore during times of performance degradation or even as part of regular database maintenance and hygiene: the Indexing tab.

Click through to see what’s in the feature and to get a free copy of the preview for pgNow.


SQL Server Book Recommendations

Erik Darling has a list:

They are organized by author, and in no particular order of importance, quality, or anything other than how they appear on my bookshelf.

I am saddened that the $20 I sent Erik did not make it in time for my glorious book on PolyBase in SQL Server 2019 to make it on his recommendations list.

Jokes aside, you could do a lot worse than starting off with the list Erik has. There are quite a few books I’d add to that list, but the idea is not to scare people away by recommending a stack of books as tall as they are.


Bulk Replacement in Power BI via TMDL

Gilbert Quevauvilliers finds and replaces:

It is great to see the advancements in Power BI with regards to TMDL.

Recently I was working on a customer’s semantic model where I was doing some optimizations in the semantic model.

One of the changes I wanted to make was to replace the Dynamic Format String for the measures.

My challenge was that there were roughly 40 measures where the Dynamic Format String needed to be updated.

I could have done this using Power BI Desktop, but that would mean making the changes 40 times.

Read on to see how Gilbert was able to make this change en masse.
