Month: April 2020

Dataflows vs Datasets in Power BI

Reza Rad disambiguates two Power BI concepts:

I have presented about Power BI dataflows and datasets a lot, and one of the questions I always get is: What is the difference between a dataflow and a dataset? So I thought it better to explain it in a post and help everyone understand. In this post, you will learn what the differences between these two components are, when and where you use each of them, and how they work together alongside other components of Power BI.

Read on to learn where each is useful.

Comments closed

Decade Two of Hadoop

Arun Murthy takes us through decade two of Hadoop:

By the end of the first decade, we needed a fundamental rethink — not just for the public cloud, but also for on-premises. It’s also helpful to cast an eye on the various technological forces driving Hadoop’s evolution over the next decade:

– Cloud experiences fundamentally changed expectations for easy to use, self-service, on-demand, elastic consumption of software and apps as services.
– Separation of compute and storage is now practical in both public and private clouds, significantly increasing workload performance.
– Containers and Kubernetes are ubiquitous as a standard operating environment that is more flexible and agile.
– The integration of streaming, analytics and machine learning — the data lifecycle — is recognized as a prerequisite for nearly every data-driven business use case.

“Core” Hadoop (not including products in the broader Hadoop ecosystem like Spark, Kafka, etc.) hit a major stress point with migration out of data centers running direct attached storage. This is how Cloudera is working to pick up some of that lost momentum.

SQL Server Management Studio 18.5 GA

Dinakar Nethi announces that SSMS 18.5 is now generally available:

Today, we’re sharing the release of SQL Server Management Studio (SSMS) 18.5. We have some feature updates as well as important behind the scenes updates.

You can download SQL Server Management Studio 18.5 today and review SSMS Release Notes for full details.

Hugo Kornelis recommends that you update as soon as possible:

And I need all of you to update your version. Now. Yes, right now. Here’s a link to download it. I’ll wait.

Why the rush, you ask? Because hidden in between all the little (and some big) improvements and fixes, there is one true gem. One I wish Microsoft had done … oh, let’s say two decades ago?

Click through to see what has Hugo so excited.

Sharing a Dataset in Power BI

Marc Lelijveld shows how you can share a dataset in Power BI:

There are many different use cases to consider where shared datasets can be an advantage. Below I have quickly listed a few advantages, but probably you can think of many more.

– Centrally managed definitions and calculations to avoid different calculations for the same metrics and different versions of the truth.
– One central load from source to Power BI dataset, which lowers the performance impact on the source system.
– Easier to kickstart the data driven analytics experience for the business users and any other self-service analytics purposes.

Sharing here doesn’t mean giving to the broader world; it’s sharing within an organization.

PowerShell Interactive Debugging in Visual Studio Code

Jess Pomfret shows how to use the interactive debugger in Visual Studio Code to troubleshoot an issue in PowerShell code:

So I figured I’d take a look and see what was happening and how we could fix it. Now I’m going to be honest with you, my usual method of debugging involves adding Write-Host 'Hi', or piping objects to Out-GridView. I did start down this route, but the Get-DbaRegServer function calls an internal function, and things quickly got complicated.

Luckily, the PowerShell extension for VSCode includes a debugger so we can level up our game and use that to track down our issues.

Click through to see how it works.

Configuring Kubernetes Pod Eviction Time

Andrew Pruski is a Kubernetes slumlord:

The default time that it takes from a node being reported as not-ready to the pods being moved is 5 minutes.

This really isn’t a problem if you have multiple pods running under a single deployment. The pods on the healthy nodes will handle any requests made whilst the pod(s) on the downed node are waiting to be moved.

But what happens when you only have one pod in a deployment? Say, when you’re running SQL Server in Kubernetes? Five minutes really isn’t an acceptable time for your SQL instance to be offline.

Click through to see how to handle this scenario.
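
As a rough illustration of one way to shorten that window (not necessarily the approach in the linked post): the five-minute default comes from the `tolerationSeconds: 300` tolerations that Kubernetes adds to pods for the `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable` taints. The Python sketch below builds a Deployment patch fragment that overrides them. The function names and the 10-second value are assumptions chosen for illustration.

```python
import json

# Taints that Kubernetes places on a node that is not ready or cannot be reached.
NODE_TAINT_KEYS = (
    "node.kubernetes.io/not-ready",
    "node.kubernetes.io/unreachable",
)


def eviction_tolerations(seconds):
    """Return pod tolerations that let a pod stay on a failed node for `seconds`."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": seconds,
        }
        for key in NODE_TAINT_KEYS
    ]


def deployment_patch(seconds):
    """Wrap the tolerations in the path a Deployment uses for its pod template."""
    return {"spec": {"template": {"spec": {"tolerations": eviction_tolerations(seconds)}}}}


if __name__ == "__main__":
    # 10 seconds is an illustrative value, not a recommendation.
    print(json.dumps(deployment_patch(10), indent=2))
```

The printed fragment could be merged into the pod template of a single-pod Deployment, such as one running SQL Server. A very low value trades faster failover for more pod moves during brief node blips, so the right number depends on the workload.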

Using Apache Flink to Read from Apache Kafka

Preetdeep Kumar crosses the streams:

Apache Flink provides various connectors to integrate with other systems. In this article, I will share an example of consuming records from Kafka through FlinkKafkaConsumer and producing records to Kafka using FlinkKafkaProducer.

Read on for an example. I’m glad to see that integration between these two competitors (more exactly, Flink and Kafka Streams are competitors) is so easy.

Logging in R

Himanshu Gupta walks us through the log4r package:

One of the most important aspects of an application is logging, since logs provide visibility into the behavior of a running app. Hence, logs play a vital role in the maintenance and enhancement of an application.

However, most of us are already aware of the importance of logging. That’s why we add logs to our applications. But one thing that we are not aware of is that the application should never be concerned with the routing or storage of logs, i.e., it should not attempt to write to or manage logs or log files. Instead, each running process within the application writes logs to stdout. In a local environment, we can view the logs in the console, whereas in a staging/production environment, logs can be collated together in .log file(s).

Hence, in this blog post we will learn how to collect, customize, and standardize R logs using log4r. But first, let’s see what log4r is.

Read on for a demonstration of log4r and some of the settings you can choose.
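
The principle in the excerpt (the application writes log events to stdout and leaves routing and storage to the environment) is not specific to R. The sketch below shows the same idea with Python's standard `logging` module. It is an analogue, not log4r's API, and the logger name, helper name, and format string are assumptions made for illustration.

```python
import logging
import sys


def build_stdout_logger(name, level=logging.INFO):
    """Return a logger that writes formatted events to stdout only."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Do not pass events up to the root logger, which may have other handlers.
    logger.propagate = False
    # Attach the handler once, even if this function is called repeatedly.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_stdout_logger("demo-app")
    log.info("application started")
    log.debug("suppressed, because the level is INFO")
```

Locally the events appear in the console. In staging or production, the process manager or container runtime can capture stdout and collate it into files without any change to the application.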
