Month: December 2024

GiST Indexes and Range Queries in PostgreSQL

Lee Asher can’t be limited to a single point:

Our Part I query used the following WHERE clause:

WHERE tsrange(o.start_time, o.end_time) && tsrange(p.enter, p.leave)

The “tsrange()” functions return timestamp ranges. But overlap queries aren’t limited to timestamps; they can be constructed from integers and floating-point values too. Imagine an arbitrage database that tracks the minimum and maximum price paid for a commodity.

Read on for examples of other types of ranges, preventing range intersection, and more.
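
As a rough illustration of what an overlap test like `&&` checks, here is a minimal Python sketch. It assumes half-open bounds (lower inclusive, upper exclusive), which is the default for PostgreSQL's two-argument range constructors. The `HalfOpenRange` class and the sample prices are hypothetical and not taken from the linked article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HalfOpenRange:
    """Hypothetical stand-in for a range of the form [lower, upper)."""
    lower: float
    upper: float

    def __post_init__(self):
        # Empty or inverted ranges are out of scope for this sketch.
        if self.lower >= self.upper:
            raise ValueError("lower bound must be strictly below upper bound")

    def overlaps(self, other: "HalfOpenRange") -> bool:
        # Two half-open ranges share at least one point exactly when
        # each one starts before the other one ends.
        return self.lower < other.upper and other.lower < self.upper


# Made-up minimum and maximum prices paid for a commodity in two markets.
market_a = HalfOpenRange(10.0, 20.0)
market_b = HalfOpenRange(15.0, 25.0)
market_c = HalfOpenRange(20.0, 30.0)

a_and_b = market_a.overlaps(market_b)  # the ranges share 15.0 up to 20.0
a_and_c = market_a.overlaps(market_c)  # they only touch at an excluded bound
```

The same comparison works for integers, floats, or timestamps, since it relies only on ordering. With inclusive upper bounds the touching case would count as an overlap, so the bound convention matters.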

External Data Sharing in OneLake

Jens Vestergaard shares some info about sharing some info:

At #MSIgnite Microsoft announced a new feature in Fabric that allows people from one organization to share data with people from another organization. You might ask yourself why is this even news, and rightly so. Up until last week, professionals have had to use tools like (S)FTP clients like FileZilla, Azure Storage Explorer, WeTransfer, or similar products in order to share data. Some of these tools are in fact hard to use and/or understand for a great number of business users – they are familiar with Windows and the Office suite and not much more. This is all to be expected, as business users in general should focus on business stuff rather than IT stuff.

Read on to see how this has changed, and an update to what I consider one of the coolest products to come out of Microsoft Fabric.

A Review of the Azure AI Foundry

Tomaz Kastrun starts a new series:

Microsoft Azure offers multiple services that enable developers to build amazing AI-powered solutions. Azure AI Foundry brings these services together in a single unified experience for AI development on the Azure cloud platform.

Until now, developers needed to work with multiple tools and web portals in a single project. With Azure AI Foundry, these tasks are now simplified, and it offers the same environment for better collaboration.

Read on to see more about the Azure AI Foundry.

An Explanation of Boosting, Bagging, and Stacking Ensembles

Ivan Palomares Carrascosa disambiguates three terms:

Unity makes strength. This well-known motto perfectly captures the essence of ensemble methods: one of the most powerful machine learning (ML) approaches -with permission from deep neural networks- to effectively address complex problems predicated on complex data, by combining multiple models for addressing one predictive task. This article describes three common ways to build ensemble models: boosting, bagging, and stacking. Let’s get started!

My explanation, which makes sense for people who grew up during the 1980s: bagging is Voltron, boosting is Rocky, and stacking is three raccoons in a trench coat.

Benchmarking Power BI Local Data Import Speed

Eugene Meidinger has all the data he needs on his desktop:

The chart above shows the number of seconds it took to load X million rows of data from a given data source, according to a profiler trace and Phil Seamark’s Refresh visualizer. Parquet is a clear winner by far, with MS Access surprisingly coming in second. Sadly the 2 GB file limit stops Access from becoming the big data format of the future.

Part of the reason I wanted to do these tests is often people on Reddit will complain that their refresh is slow and their CPU is maxed out. This is almost always a sign that they are importing oodles and oodles of CSV files. I recommended trying Parquet instead of CSV, but it’s nice to have concrete proof that it’s a better file source.

Read on for the chart. Also, don’t tell his accountants about the gaming laptop. It’s 100% for work purposes, just like my desktop PC. Only work, nothing else, IRS. The high-end GPU is for AI work. And the big screen is for doing big business.

Determining Power BI Report Fields in Use

Meagan Longoria performs a search:

Have you ever wondered where a certain field is used in a report? Or maybe you need an easy way to find broken field references in a report? Certain 3rd-party tools such as Measure Killer and Power BI Helper (not updated recently) have helped us with this task in the past. But now we can perform this task with a notebook in Fabric!

This is made possible by the Semantic Link Labs Python library. Please note that PBIR format is still in preview at the time of publishing this blog post, so use it at your own risk. Also, this works only on reports published to the Power BI service. Since this notebook is not making any changes to the report, I feel it’s pretty safe to run, but do remember that it uses CUs on your Fabric capacity while you run it.

Read on to see how it works.

Sending Data from Power Automate to Microsoft Fabric

Chris Webb uses Eventstreams:

Fabric’s Real-Time Intelligence features are, for me, the most interesting things to learn about in the platform. I’m not going to pretend to be an expert in them – far from it – but they are quite easy to use and they open up some interesting possibilities for low-code/no-code people like me. The other day I was wondering if it was possible to send events and data from Power Automate to Fabric using Eventstreams and it turns out it is quite easy to do.

Read on to see just how easy it is. And there’s a good question from a reader about using other languages, such as PowerShell. Turns out the answer is yes.

First Impressions of SSMS 21

Chad Callihan has some thoughts:

Some exciting news recently as SQL Server Management Studio 21 is available in preview. Head over here to download it and experiment with the latest. I’ve worked with it a little bit so far and hope to play around more over the Thanksgiving break.

Here are my early thoughts on what I’ve seen so far.

I’ve linked to a few articles on what’s new in SSMS 21 but Chad points out a couple of things I haven’t seen from people yet.

Temp Tables in SSIS Data Sources

Andy Brownsword disappears in a flash:

When handling data we can make use of temporary tables to aid with separation or performance. However, they don’t always play nice with Integration Services packages.

If we set a source to call a procedure returning the contents of a temporary table we’ll see an error like below:

Read on for three options. It’s been a while, but I vaguely recall that you can use global temp tables (such as ##Results) and it will work, as those persist and are available to all sessions so long as there is some open session using them.
