Press "Enter" to skip to content

Day: October 4, 2021

Apache Flink 1.14.0 Released

Stephan Ewen and Johannes Moser have aa round-up of the latest Apache Flink updates:

The Apache Software Foundation recently released its annual report and Apache Flink once again made it on the list of the top 5 most active projects! This remarkable activity also shows in the new 1.14.0 release. Once again, more than 200 contributors worked on over 1,000 issues. We are proud of how this community is consistently moving the project forward.

This release brings many new features and improvements in areas such as the SQL API, more connector support, checkpointing, and PyFlink. A major area of changes in this release is the integrated streaming & batch experience. We believe that, in practice, unbounded stream processing goes hand-in-hand with bounded- and batch processing tasks, because many use cases require processing historic data from various sources alongside streaming data. Examples are data exploration when developing new applications, bootstrapping state for new applications, training models to be applied in a streaming application, or re-processing data after fixes/upgrades.

Read on for the list of changes.

Comments closed

Emulating Visual Deficiencies with Edge DevTools

Chris Webb shows off a feature in Microsoft Edge:

I’m not an expert on Power BI report accessibility like Meagan Longoria but I do know how important an issue accessibility is. I also know how difficult it can be to remember to check for accessibility issues when building reports which is why, when I was watching this video on new features in Edge DevTools, I was pleased to see that Edge now makes it easy to see how your report looks like when viewed by someone with vision deficiencies.

Click through to see this extension in action.

Comments closed

Working with DACPACs

Chad Callihan has a two-parter. First up is the process of creating a DACPAC:

DAC is the abbreviation for data-tier application and is an item containing the objects of a database. When put together into a versioned file to be used for deploying in SQL Server, that package is called a DACPAC. You can point a DACPAC at an existing database to deploy changes or use a DACPAC to create a brand new database with tables, procedures, and the rest of the database objects built in.

And once you have a DACPAC, you of course need something to do with it:

What if we want to use our DACPAC to create a new database using SSMS? Our first step will be to right click on Databases and select Deploy Data-tier Application.

As a concept, DACPACs are really great: I can package together tables, database code, and reference data in one bundle and import it into a variety of products (e.g., database projects, other SQL Server instances, Azure SQL Database). The problem is that once they get beyond a certain size, DACPACs are rather unwieldy and failure-prone to create and to extract.

Comments closed

Implementing LAG() and LEAD() in Power Query

Nick Edwards wants to perform a lag on data in Power Query:

I’m sure you’ve all heard of windowing functions in SQL Server? But what if you need to replicate these windowing functions in Power Query?

Here I have a sample CarSales.csv dataset of car sales per brand, per year, per month and the number of units sold. I will use this dataset to demonstrate how we can replicate the windowing LAG function in Power Query.

Click through to see the process.

Comments closed

Automating Semantic Versioning with Azure DevOps

Dave Ruijter shows how you can use Azure DevOps to perform automatic semantic versioning:

I am a fan of using semantic versioning (a.k.a. SemVer) for data solutions, following the v1.0.0 pattern. It helps in the communication between team members and stakeholders, by limiting ambiguity and misunderstandings related to the version of your solution’s releases. With semantic versioning, the trick is to increment the version according to the changes you have made since the latest release. Manually keeping track of that is not an easy task, especially for small teams, without the capacity to have somebody dedicated to this administration task. I found a way to make this a lot easier, leaning on the Pull Request description! And as a bonus, we will create some nice release notes automatically

Click through to see what you need to have set up on your Azure DevOps subscription and a detailed walkthrough of how to set it up.

Comments closed

Using the FIRST_VALUE() Window Function

Dave Mason explains what FIRST_VALUE() does:

Last week, I found myself with a T-SQL problem. A customer has a database where multiple “widget” records might exist in a table. End users might enter the duplicate rows, or they might be inserted by other automated processes. My task was to keep the most recent widget record. But, if there were any NULL values, look in the previous “duplicate” rows and use the most recent missing value for each column.

Click through to see it in action. And following up from Dave’s last point, IGNORE NULLS did originate in Azure SQL Edge, though hopefully we see it in the next version of on-premises SQL Server.

Comments closed