Kafka 2.3 and Kafka Connect Improvements

Robin Moffatt goes over improvements in Kafka Connect with the release of Apache Kafka 2.3:

A Kafka Connect cluster is made up of one or more worker processes, and the cluster distributes the work of connectors as tasks. When a connector or worker is added or removed, Kafka Connect will attempt to rebalance these tasks. Before version 2.3 of Kafka, the cluster stopped all tasks, recomputed where to run all tasks, and then started everything again. Each rebalance halted all ingest and egress work for usually short periods of time, but also sometimes for a not insignificant duration of time.

Now with KIP-415, Apache Kafka 2.3 instead uses incremental cooperative rebalancing, which rebalances only those tasks that need to be started, stopped, or moved. For more details, there are available resources that you can read, listen to, and watch, or you can hear the lead engineer on the work, Konstantine Karantasis, talk about it in person at the upcoming Kafka Summit.

Looks like some nice improvements here.
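
To make the difference concrete, here is a toy Python sketch contrasting the two strategies described in the excerpt. It is not the KIP-415 protocol (the real protocol is considerably more involved); the function names, the round-robin placement, and the greedy balancing rule are all assumptions made for illustration. Each function returns the new assignment plus the set of tasks that had to be interrupted.

```python
from typing import Dict, List, Set, Tuple

Assignment = Dict[str, List[str]]  # worker name -> task names (hypothetical model)


def eager_rebalance(current: Assignment, workers: List[str]) -> Tuple[Assignment, Set[str]]:
    """Pre-2.3 style: stop every task, then redistribute all of them."""
    tasks = sorted(t for ts in current.values() for t in ts)
    new: Assignment = {w: [] for w in workers}
    for i, task in enumerate(tasks):
        new[workers[i % len(workers)]].append(task)  # simple round-robin
    return new, set(tasks)  # every task was interrupted


def incremental_rebalance(current: Assignment, workers: List[str]) -> Tuple[Assignment, Set[str]]:
    """Toy incremental style: leave tasks in place and move only what is needed."""
    new: Assignment = {w: list(current.get(w, [])) for w in workers}
    moved: Set[str] = set()

    # Tasks on workers that left the cluster must be restarted elsewhere.
    orphaned = [t for w, ts in current.items() if w not in new for t in ts]
    for task in orphaned:
        target = min(workers, key=lambda w: len(new[w]))
        new[target].append(task)
        moved.add(task)

    # Shift tasks from the busiest worker to the idlest until loads differ by at most one.
    while True:
        busiest = max(workers, key=lambda w: len(new[w]))
        idlest = min(workers, key=lambda w: len(new[w]))
        if len(new[busiest]) - len(new[idlest]) <= 1:
            break
        task = new[busiest].pop()
        new[idlest].append(task)
        moved.add(task)
    return new, moved


if __name__ == "__main__":
    before = {"w1": ["t1", "t2", "t3"], "w2": ["t4", "t5", "t6"]}
    after_join = ["w1", "w2", "w3"]  # a third worker joins
    print(sorted(eager_rebalance(before, after_join)[1]))        # all six tasks interrupted
    print(sorted(incremental_rebalance(before, after_join)[1]))  # only the tasks that moved
```

In this model, adding a third worker to a two-worker cluster with six tasks interrupts all six tasks under the eager approach but only two under the incremental one.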

Python versus R (Again)

Alex Woodie looks at whether Python is dominating R in the data science space:

There is some evidence that Python’s popularity is hurting R usage. According to the TIOBE Index, Python is currently the third most popular language in the world, behind perennial heavyweights Java and C. From August 2018 to August 2019, Python usage surged by more than 3% to achieve a 10% rating (TIOBE’s proprietary metric that primarily measures search activity), easily the biggest gain among the 20 most popular languages.

R, by contrast, has not fared well lately on the TIOBE Index, where it dropped from 8th place in January 2018 to become the 20th most popular language today, behind Perl, Swift, and Go. At its peak in January 2018, R had a popularity rating of about 2.6%. But today it’s down to 0.8%, according to the TIOBE index.

I’ll say that rumors of R’s demise are premature.

Installing Microsoft Master Data Services

Garry Bargsley shows how you can install Master Data Services on SQL Server:

MDS Installation pre-requisites:
The first step is to add the IIS feature to the server where MDS is going to be installed

Follow these steps for more information
https://docs.microsoft.com/en-us/sql/master-data-services/master-data-services-installation-and-configuration?view=sql-server-2017#InstallIIS

Read on for full instructions.

The Importance of Interaction in Power BI

Marc Lelijveld continues a series on storytelling with Power BI:

Many times, I see reports with loads of visuals on the pages. This results in both really poor performance and an end user who has no clue what the key message of the report is. You can always ask yourself: is this visual necessary to show on this page? What does it add to this page? Is this really needed? If not, remove it! If the visual does add some value, is it needed on this page? Maybe it is only distracting the user from what the report is about.
A good approach can be to put certain visuals on a different page or hide them by default until the user interacts with the report. Within the interaction, you will have multiple options in Power BI to interact with your user.

There’s a lot more to it, so read on.

Automate VM Shutdown

Meagan Longoria has a script to shut off an Azure VM when a SQL Agent job finishes:

The runbook sets the Azure context to the appropriate subscription (especially important when you are a guest user in someone else’s tenant). Then it checks if the VM is started. If it is, it goes into a do-while loop. This task isn’t super time sensitive (it’s just to save money when the VM isn’t in use), so it’s waiting 60 seconds and then calling the child runbook to find out if my SQL Agent job is running. This makes sure that the child runbook is called at least once. If the result is that the job is not running, it stops the VM. If the job is running, the loop starts over, waiting 60 seconds before checking again. This loop is essentially polling the job status until it sees that the job is completed.

Click through for the script.
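
The polling logic described in the excerpt can be sketched in Python as follows. This is not Meagan's runbook (which is a PowerShell script behind the link); the three callables stand in for the Azure and SQL Agent calls and are hypothetical names, and the injectable `sleep` parameter is an assumption added so the loop can be exercised without waiting.

```python
import time
from typing import Callable


def stop_vm_when_job_finishes(
    vm_is_running: Callable[[], bool],   # hypothetical: checks the VM power state
    job_is_running: Callable[[], bool],  # hypothetical: stands in for the child runbook
    stop_vm: Callable[[], None],         # hypothetical: deallocates the VM
    wait_seconds: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll the job status and stop the VM once the job is done.

    Returns the number of status checks that were made.
    """
    if not vm_is_running():
        return 0  # nothing to do if the VM is already stopped

    checks = 0
    while True:  # emulates a do-while: the body always runs at least once
        sleep(wait_seconds)
        checks += 1
        if not job_is_running():
            break

    stop_vm()
    return checks
```

Because the wait comes before the first status check, the job status is always queried at least once before the VM is stopped, matching the do-while behavior described above.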

Database Health Monitor Update

Steve Stedman has an update to the Database Health Monitor:

Today I released the August 2019 release of Database Health Monitor. This is version 2.9.

My favorite new report is the Blocking By Hour of Day report which uses the existing data that is collected by the historic monitoring feature.

Click through for the change list.

The Uniqueness of Cosmos DB Unique Keys

Hasan Savran explains the scope of unique keys in Cosmos DB:

I wrote about Unique Keys and tried to explain how they work in one of my earlier posts. It’s common to use SQL Server’s Primary Key or Unique Indexes to explain Unique Keys of Azure Cosmos DB. If you have a Primary Key in a table in SQL Server, the key you defined cannot be in that table more than once. By adding a unique index or unique constraint in a table, you guarantee that no duplicate values can be in your table. The key word in both of those statements is the TABLE.

Azure CosmosDB Unique Keys do not work like Primary Key or Unique Indexes/Constraints

Read on to learn how Cosmos DB differs.
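
As a rough illustration of the scope difference, the Python sketch below models both behaviors in memory. It assumes, based on the Cosmos DB documentation rather than on the excerpt above, that a unique key is enforced within a logical partition rather than across the whole container. The class names are hypothetical and no real database client is involved.

```python
from typing import Set, Tuple


class TableScopedUniqueness:
    """SQL Server style: a value may appear only once in the whole table."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def insert(self, partition_key: str, unique_value: str) -> bool:
        # The partition key plays no part in the uniqueness check here.
        if unique_value in self._seen:
            return False
        self._seen.add(unique_value)
        return True


class PartitionScopedUniqueness:
    """Assumed Cosmos DB style: a value may appear once per logical partition."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()

    def insert(self, partition_key: str, unique_value: str) -> bool:
        key = (partition_key, unique_value)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


if __name__ == "__main__":
    table = TableScopedUniqueness()
    container = PartitionScopedUniqueness()
    for store in (table, container):
        first = store.insert("US", "a@example.com")
        second = store.insert("CA", "a@example.com")  # same value, other partition
        print(type(store).__name__, first, second)
```

Under this model, the same value inserted under two different partition keys is rejected by the table-scoped check but accepted by the partition-scoped one.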

Z-Tests vs T-Tests

Stephanie Glen has a picture which explains the difference between a Z-test and a T-test:

The following picture shows the differences between the Z Test and T Test. Not sure which one to use? Find out more here: T-Score vs. Z-Score.

Click through for the picture.

Management Studio’s Staying Power

Kendra Little explains why SQL Server Management Studio isn’t going away anytime soon:

After all, SSMS is no longer the cool new kid on the block: Microsoft has shown consistent effort to develop their new tool, Azure Data Studio (the artist formerly known as SQL Operations Studio), since November 2017. Azure Data Studio is built on the modern foundation of Microsoft’s VS Code, whereas SQL Server Management Studio is related to the legacy Visual Studio Shell.

Based on this overview, it might seem like a new SQL Server DBA or developer should primarily learn Azure Data Studio, not SSMS. And it might similarly seem like vendors should focus on developing new tooling only for Azure Data Studio.

But when you look into the details of how Azure Data Studio is being developed, it becomes clear that SSMS is still just as relevant as ever:

User base inertia is another reason, one that Kendra doesn’t mention directly. I like where Azure Data Studio is going and try to use it at least half-time. But there are a lot of people with a specific workflow they’ve developed and don’t want to change. As long as that’s a large percentage of the SQL Server population, SSMS isn’t going anywhere.

Running Power BI Desktop as a Different Account

Gilbert Quevauvilliers shows how you can run Power BI Desktop as a different domain account, even when your logged-in account is not on the domain:

This is not the greatest error message.

Fortunately, I knew that I could connect to the server, so the issue was not with connectivity but more around how I could authenticate against Analysis Services.

NOTE: Analysis Services can only authenticate against domain accounts, and that is why I got the error above.

Click through for the solution.
