Press "Enter" to skip to content

Author: Kevin Feasel

Power BI Exporter

Soheil Bakhshi announces a new community tool:

Apart from my website statistics showing that many people want to export data from Power BI Desktop, some of my customers asked the same question. They wanted to export the curated data from their data model within Power BI Desktop to CSV format and make the curated data available to their other platforms. While all the methods mentioned in my previous blog posts work, some users still find them complex. So I thought, we can make it better: a straightforward tool that exports the data with only two clicks. So we started building the Power BI Exporter as a micro-project.

We added some more ideas to the original idea of only exporting the data. We thought it would be good to export the data along with the table names, column names and relationships. Having that information handy, we can quickly rebuild the same data model as the one whose data we exported, but using the CSV files as the data sources. The other idea was to pack everything in a ZIP file on the fly, so we have a single ZIP file including the tables, columns, and relationships. As a result, the first version of the Power BI Exporter was born. In this post I explain how it works.

Read on to see how to use it. The download is free, though you do have to provide them with an e-mail address.

Granular Deployment of Power BI Changes with ALM Toolkit

Gilbert Quevauvilliers shows off an interesting scenario:

In this blog post I am going to demonstrate how to make a granular deployment where I will create a new column in my City table, and only deploy those changes.

What this means is that by deploying only the column change to my PPU dataset, I am only updating the column in the table.

This now saves me from having to do the following tasks:

– Time taken to refresh the PBIX file so that the data is up to date.
– Re-uploading my PBIX.
– If configured, re-creating the incremental refreshing.
– Time and effort to upload and wait for the dataset refresh.

Instead, I get quick updates to my dataset. I will not have to worry about saving my PBIX file, re-uploading the file and, if configured, re-creating the incremental refreshing. This saves me a lot of time and effort.

Click through to see those steps in action.

Temporal Table Data Retrieval Execution Plans

Hugo Kornelis digs into execution plans when retrieving data from temporal tables:

The query above will simply grab the requested data from the Products table as if it were a normal table. The corresponding history table is not used at all. To understand why this happens it is important to recall that system-versioned temporal tables always store the currently valid version of the row in the table itself; all older, no longer valid versions are stored in the accompanying history table. So the query above, which does not use any specific temporal logic, means “I don’t care about the history, I want the data as it is now”. Due to how temporal tables are designed, SQL Server only has to query the actual table for this.

Since there is nothing special or interesting about this, let’s move on to queries that do use special logic in the query to retrieve older versions of the data.

Click through for quite a bit more detail on what it looks like for less-regular queries.
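For reference, the “special logic” in question is the FOR SYSTEM_TIME clause. A minimal sketch, assuming a Products temporal table like the one in the excerpt (the column names and the timestamp here are made up):

-- Current data only: SQL Server reads just the base table
SELECT p.ProductID, p.ProductName, p.ListPrice
FROM dbo.Products AS p;

-- Point-in-time query: SQL Server has to combine the base table with
-- its history table to reconstruct the rows as they were at that moment
SELECT p.ProductID, p.ProductName, p.ListPrice
FROM dbo.Products
    FOR SYSTEM_TIME AS OF '2021-04-01T00:00:00' AS p;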

Sending Large Messages in Kafka

The Hadoop in Real World team shows how you can send large messages in Apache Kafka:

By default, the messages you can send and manage in Kafka should be less than 1 MB. To increase this limit, there are a few properties you need to change on the brokers, producers, and consumers.

Let’s say your messages can be up to 10 MB, so your Kafka producers are producing messages up to 10 MB. Your Kafka brokers and consumers should then be able to store and receive messages up to 10 MB, respectively.

Kafka Producer sends messages up to 10 MB ==> Kafka Broker allows, stores and manages messages up to 10 MB ==> Kafka Consumer receives messages up to 10 MB

Click through to see how, but also recognize that it’s usually a really bad idea to push large messages in a broker system. Even 1MB is probably going too far—I’d try to stay under 1KB if possible.
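If you do go down that road, the knobs involved are standard Kafka configuration properties. A sketch using the 10 MB figure from the excerpt (the values and where you set them are an illustration, not a recommendation):

# Broker (server.properties): accept and replicate messages up to 10 MB
message.max.bytes=10485760
replica.fetch.max.bytes=10485760

# Producer: allow requests up to 10 MB
max.request.size=10485760

# Consumer: allow per-partition fetches large enough for a 10 MB message
max.partition.fetch.bytes=10485760

There is also a per-topic max.message.bytes setting if only one topic needs to accept the larger messages.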

Synapse Studio in 5 Minutes

Kevin Chant wants 4 minutes and 58 seconds of your time:

In this post I want to do a five minute crash course about Synapse Studio. Because I have recently been asked to do this by colleagues.

In addition, I want to clear up some confusion about what you need to do before you can access Synapse Studio.

The aim of this post is for you to have a better overview of Synapse Studio within five minutes, which happens to be the estimated reading time of this post.

Click through and be sure to start the stopwatch.

sparklyr 1.6 Released

Carly Driggers announces a new release of sparklyr:

Sparklyr, an LF AI & Data Foundation Incubation Project, has released version 1.6! Sparklyr is an R Language package that lets you analyze data in Apache Spark, the well-known engine for big data processing, while using familiar tools in R. The R Language is widely used by data scientists and statisticians around the world and is known for its advanced features in statistical computing and graphics. 

Click through to see the changes.

Deploying an Azure Arc Enabled Data Services Controller

Chris Adkin continues a series:

If you have been following this series, you should have:

– a basic understanding of Terraform
– a Kubernetes cluster that you can connect to using kubectl
– a basic understanding of Kubernetes services
– a working MetalLB load balancer
– a basic understanding of how storage works in the world of Kubernetes
– a Kubernetes storage solution in the form of PX Store; alternatively, you can use any solution (for the purposes of this series) which supports persistent volumes, though to use the backup solution in part 9 of the series you will need something that supports CSI

From here, Chris explains the importance of the data controller and then deploys one.

Embracing the XML

Grant Fritchey has some advice:

While XML is, without a doubt, a giant pain in the bottom, sometimes, the best way to deal with Extended Events is to simply embrace the XML.

Now, I know, just last week, I suggested ways to avoid the XML. I will freely admit, that is my default position. If I can avoid the XML, I will certainly do it. However, there are times where just embracing the XML works out nicely. Let’s talk about it a little.

Just need to do a little victory dance here. I didn’t explicitly say “embrace the XML” but close enough…

I think the biggest problem DBAs have with XML is that they end up treating it like a dreadful task: “I need to shred XML for an extended event, but to do that, I have to learn how to query it using this quasi-language.” And so they get stuck trying to fuss with something somebody else did, moving symbols around in the hopes that they get the right incantation. By contrast, a day or two really focusing in on how XQuery and XPath work would clarify a lot and make the process much simpler.

There is a fair counter-point in asking how often you’ll use this, and if the answer is “probably never,” then poke through and just try to get it working. But I’ve got a bit of bad news: “probably never” is probably wrong.
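If you do end up shredding event XML, the pattern is fairly mechanical. A sketch against the built-in system_health session’s ring_buffer target (the sql_text action at the end is just an example of pulling out an action value; not every event carries it):

-- Grab the ring_buffer target's XML for the system_health session
DECLARE @target_data XML =
(
    SELECT CAST(xst.target_data AS XML)
    FROM sys.dm_xe_sessions AS xs
        INNER JOIN sys.dm_xe_session_targets AS xst
            ON xs.address = xst.event_session_address
    WHERE xs.name = N'system_health'
        AND xst.target_name = N'ring_buffer'
);

-- Shred one row per event, pulling attributes and an action value
SELECT
    e.value('(@timestamp)[1]', 'datetime2') AS event_time,
    e.value('(@name)[1]', 'nvarchar(128)') AS event_name,
    e.value('(action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS sql_text
FROM @target_data.nodes('/RingBufferTarget/event') AS t(e);

The same .nodes()/.value() approach applies to event_file targets once you have read them in with sys.fn_xe_file_target_read_file.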

Indexing for Physical Join Operators

Deepthi Goguri continues a series on physical join operators:

In Part 1 of decoding the physical join operators, we learned about the different types of physical operators: Nested loops, Merge joins and Hash joins. We have seen when they are useful and how to take advantage of each for the performance of our queries, and also when each needs to be avoided.

In this part, we will learn more about these operators and how indexes really help these operators perform better so the queries can execute faster.

Read on to see how to define indexes for each of the three physical operators.
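As a quick teaser (the table, column, and index names below are invented for illustration): the classic case is giving the inner side of a nested loops join an index it can seek on.

-- Without an index on OrderLines.OrderID, the optimizer tends to favor
-- a hash join with a scan of OrderLines; with this index, a nested loops
-- join can seek straight to the matching rows for each order.
CREATE NONCLUSTERED INDEX IX_OrderLines_OrderID
    ON dbo.OrderLines (OrderID)
    INCLUDE (Quantity, UnitPrice);

SELECT o.OrderID, ol.Quantity, ol.UnitPrice
FROM dbo.Orders AS o
    INNER JOIN dbo.OrderLines AS ol
        ON ol.OrderID = o.OrderID
WHERE o.OrderDate >= '2021-04-01';

Merge joins, by contrast, want both inputs sorted on the join keys, which is where clustered or covering indexes keyed on the join column come into play; hash joins are the least picky about indexing but still benefit from indexes that narrow their inputs.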
