Press "Enter" to skip to content

Day: September 20, 2023

Building a Flink Application in Java

Wade Waldron talks about a (free) new course:

Recently, I got my hands dirty working with Apache Flink®. The experience was a little overwhelming. I have spent years working with streaming technologies but Flink was new to me and the resources online were rarely what I needed. Thankfully, I had access to some of the best Flink experts in the business to provide me with first-class advice, but not everyone has access to an expert when they need one. 

To share what I learned, I created the Building Flink Applications in Java course on Confluent Developer. It provides you with hands-on experience in building a Flink application from the ground up. I also wrote this blog post to walk through an example of how to do dataflow programming with Flink. I hope these two resources will make the experience less overwhelming for others.

Click through for the blog post and check out the full course if you’re so inclined.

Comments closed

Working with Histogram Breaks in R

Steven Sanderson divvies out buckets for a histogram:

Histograms divide data into bins, or intervals, and then count how many data points fall into each bin. The breaks parameter in R allows you to control how these bins are defined. By specifying breaks thoughtfully, you can highlight specific patterns and nuances in your data.

Click through to see how you can use the breaks parameter in a few different ways to customize your histogram. The default breaks in R are often reasonable, but trying a few different breaks can help you get a better understanding of the actual distribution of the data.

Comments closed

An Overview of Sankey Diagrams

Simon Rowe explains what a Sankey diagram is:

In the image, you can see how Sankey used arrows to show the flow of energy with the widths of the shaded areas proportional to the amount of heat loss as it progresses through the engine’s cycle. This series of complex relationships would be difficult for a reader to understand at a glance were they simply presented in text and data tables. Making just such a sophisticated system easier to understand is the purpose of a Sankey diagram, which visually summarises the volume and direction of flows through the stages of a process or system.

Click through for several good examples, some advice on when a Sankey diagram could make sense, and the times in which you should not use this visual.

Comments closed

Azure Blob Storage Operating System Error 86

Jose Manuel Jurado Diaz 86’d that option:

Today, I worked on a service request that our customer got the following error message: Cannot open backup device ‘https://XXX.blob.core.windows.net/NNN/YYY.bak‘. Operating system error 86(The specified network password is not correct.). RESTORE HEADERONLY is terminating abnormally. (Microsoft SQL Server, Error: 3201). Following I would like to share with you some details why this issue and the activities done to resolve it. 

Read on to get a better understanding of what this error actually means and how you can fix it.

Comments closed

Azure AD (or Entra ID) Authentication for SQL Server VMs

Deepthi Goguri enables Azure Entra ID security on a SQL Server VM in Azure:

To enable the SQL Server 2022 on a virtual machine to use Azure AD, we need below things:

Deepthi then includes the list of requirements, starting with a managed identity and moving on to permissions and other configuration. It looks like a fair number of steps, but it’s not that onerous a task once you get to it.

And this also gives me a chance to rant about Microsoft marketing a bit, as they are in the process of switching the name Azure Active Directory to Azure Entra ID. Granted, Azure Active Directory isn’t Active Directory (although it does perform a very similar task in a fairly similar way). So to remove confusion that I don’t think really existed, they changed the name to Entra ID. Except that most of the Microsoft documentation still says Azure Active Directory, and we have about a decade’s worth of blog content talking about Azure Active Directory, so when you go searching for the resolution to a problem, you’ll have to search for Azure Entra ID as well as its former name, which means people will still link the product to Azure Active Directory—at least, until the point when there’s enough blog content and documentation in place to replace the large majority of those existing blog posts—and so you renamed a product for no reason. Plus, they picked an ambiguous name that people will pronounce multiple ways: is the “ent” in Entra like “enter the dungeon” or Entra like “a delicious entrée”?

But then again, considering how many pronunciations of Azure there are, maybe this is the plan…

Comments closed

A Review of DataVeil for SQL Server Users

Brian Kelley tries out a product:

My organization typically moves production data to other environments. There are a variety of use cases:

  • Testing with the amount and frequency of production data.
  • Performing analytics on said data.
  • Delivering production-like data to a third party for their use.

We do not want to move production data around. Instead, we want to deliver “production-like” data for these use cases. Sometimes, we work with multiple systems integrated with each other, and in those cases, we need the data to match up. In other instances, we need the sensitive data, such as personal identifiable information (PII), to be altered so it’s no longer sensitive, but there’s no requirement for it to be consistent across systems.

Read on for Brian’s full review. I should also note that this is most definitely a paid product.

Comments closed

Verifying a Backup in SQL Server

Chad Callihan knows your last backup is only as good as your last restore:

Is the process of testing your backups something you know you should do but never get around to? Do you find yourself assuming all is well with backups while putting out other fires? Test-DbaLastBackup, part of the beloved dbatools, can solve your dilemma.

There are many options available when using Test-DbaLastBackup. Let’s explore a few of these options and see some examples of how to use them.

Click through to learn more about this. And you could easily put together Powershell scripts to stagger your restorations over a time frame (such as, 15% of your databases each day, so that you get to 100% by the end of the week).

Comments closed