Month: April 2020

Metadata Integrity Checks in ADF.ProcFwk

Paul Andrew has another update to the ADF metadata-driven processing framework:

With this release of the framework I wanted to take the opportunity to harden the database and add some more integrity (intelligence) to the metadata, things that go beyond the existing database PK/FK constraints. After all, this metadata drives everything that Azure Data Factory does/is about to do – so it needs to be correct. These new integrity checks take two main forms:

1. Establishing a minimum set of criteria within the metadata before the core Data Factory processing starts and creates an execution run.
2. Establishing a logical chain of pipeline dependencies across processing stages, then providing a set of advisory checks for areas of conflict and/or improvement.

More details on both are included against the actual stored procedure in the database changes section below.
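
To make the idea concrete, here is a minimal T-SQL sketch of the kind of pre-execution check described in the first point; the table and column names are hypothetical placeholders, not the framework's actual schema:

    -- Hypothetical metadata tables; the real framework's schema differs.
    IF NOT EXISTS (SELECT 1 FROM dbo.Pipelines WHERE Enabled = 1)
        THROW 50001, N'Integrity check failed: no enabled pipelines exist in the metadata.', 1;

    IF EXISTS
    (
        SELECT 1
        FROM dbo.Pipelines AS p
        LEFT JOIN dbo.Stages AS s
            ON s.StageId = p.StageId
        WHERE p.Enabled = 1
            AND s.StageId IS NULL
    )
        THROW 50002, N'Integrity check failed: an enabled pipeline references a missing processing stage.', 1;

Checks like these run before an execution run is created, so bad metadata fails fast rather than partway through an orchestration.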

In addition to database hardening, I’ve added a few other bits to the solution, including a PowerShell script for ADF deployments and a Data Studio Notebook to make the developer experience of implementing this code project a little nicer.

Read on to see what’s in version 1.3. Check it out on GitHub as well.

Power BI: The Key Didn’t Match Any Rows in the Table

Chris Webb troubleshoots an issue:

One of the most common errors you’ll see when working with Power Query in Power BI or Excel is this:

Expression.Error: The key didn’t match any rows in the table

It can occur with almost any data source and in a wide variety of different circumstances, and for new users of Power Query it can be very confusing. In this post I’ll explain what the error message means and when you’re likely to encounter it using a simple example.

TL;DR You’re probably getting this error because your Power Query query is trying to connect to a table or worksheet or something in your data source that has been deleted or been renamed.

Read on to understand exactly what it means and how you can fix your code if you get this error.

MERGE in Many Languages

Lukas Eder takes a look at the MERGE statement in SQL:

A few dialects support MERGE. Among the ones that jOOQ 3.13 supports, there are at least:

– Db2
– Derby
– Firebird
– H2
– HSQLDB
– Oracle
– SQL Server
– Sybase SQL Anywhere
– Teradata
– Vertica

For once, regrettably, this list does not include PostgreSQL. But even the dialects in this list do not all agree on what MERGE really is. The SQL standard specifies 3 features, each one optional:

– F312 MERGE statement
– F313 Enhanced MERGE statement
– F314 MERGE statement with DELETE branch

But instead of looking at the standards and what they require, let’s look at what the dialects offer, and how it can be emulated if something is not available.

This is a really cool overview of an area where several vendors can claim support, but that support can mean quite different things. The one caveat is that I don't know whether any of the other platforms' MERGE operators are as busted as SQL Server's in terms of bugs.
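
For anyone who has not written one recently, here is a minimal T-SQL MERGE sketch covering the update, insert, and delete branches (the delete branch being the optional F314 feature); the table names are illustrative only:

    MERGE INTO dbo.TargetProducts AS t
    USING dbo.SourceProducts AS s
        ON t.ProductId = s.ProductId
    WHEN MATCHED AND s.IsDiscontinued = 1 THEN
        DELETE
    WHEN MATCHED THEN
        UPDATE SET t.Price = s.Price
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (ProductId, Price)
        VALUES (s.ProductId, s.Price);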

Checking Login Usage

Kenneth Fisher checks a box I really like checking:

I get asked this every now and again, along with the companion When was the last time this login was used? It’s a pretty easy question to answer but there are some caveats. First of all, you need to have your system set to log both successful and failed logins. You can probably get away with successful only, but personally I want to know about a failed attempt just like I’d want to know about a successful one.

This is a thing we tend to avoid because of how many events it adds to the Security event log, but it is critical for understanding whether that person trying to log in as sa gave up or stopped because a login finally succeeded.
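
Assuming the instance is set (under Server Properties > Security) to audit both failed and successful logins, a quick way to sift the results is to search the error log; the login name here is only an example filter:

    -- Parameters: log number (0 = current), log type (1 = SQL Server error log),
    -- then up to two search strings that must both match.
    EXEC sys.xp_readerrorlog 0, 1, N'Login succeeded', N'sa';
    EXEC sys.xp_readerrorlog 0, 1, N'Login failed', N'sa';

Note that changing the login auditing setting only takes effect after a service restart.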

Stateful Functions in Apache Flink

Stephan Ewen announces Stateful Functions 2.0:

Today, we are announcing the release of Stateful Functions (StateFun) 2.0 — the first release of Stateful Functions as part of the Apache Flink project. This release marks a big milestone: Stateful Functions 2.0 is not only an API update, but the first version of an event-driven database that is built on Apache Flink.

Stateful Functions 2.0 makes it possible to combine StateFun’s powerful approach to state and composition with the elasticity, rapid scaling/scale-to-zero and rolling upgrade capabilities of FaaS implementations like AWS Lambda and modern resource orchestration frameworks like Kubernetes.

With these features, Stateful Functions 2.0 addresses two of the most cited shortcomings of many FaaS setups today: consistent state and efficient messaging between functions.

Read on to see how it works.

Reading JSON in .NET from a DataTable

Hasan Savran ran into an issue parsing JSON data from SQL Server via .NET:

FOR JSON lets you return the data in JSON format. As you might know, SQL Server can return and query JSON documents, but it doesn’t have a special data type for JSON documents. You must store the data as a string in SQL Server. You can make SQL Server work like a NoSQL database. Your web application can retrieve data as a JSON document, and you can use dynamic objects to make things flexible.

Let’s see an example first. In the following example, I retrieve data as a JSON document and send it directly to my front end as a string. JavaScript parses it and generates a grid from it. It’s very flexible because there is no schema. The front end will display whatever SQL Server returns. You can change the query and, without changing any code in the middle, your grid will display the data.

There are limits on how much JSON gets returned in the buffer at a time, so click through to see how you can rebuild the entire JSON output for a large result.
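
If you have not used it, FOR JSON turns an ordinary result set into a JSON document; a minimal example against a system catalog view:

    SELECT o.name, o.object_id, o.type_desc
    FROM sys.objects AS o
    FOR JSON PATH, ROOT('objects');

The wrinkle mentioned above is that a large FOR JSON result comes back to the client split across multiple rows of roughly 2,033 characters each, so the application has to concatenate the chunks before parsing.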

Querying an AS400 with PolyBase

Lee Markum proves you can read data from an AS400 via PolyBase:

Before I dive into that, why was I interested in this feature? What did I hope to gain? Well, first of all, there was definitely the motivation of wondering, “Can we get this to work?” Secondly, and more practically, the promise of SQL Server Data Virtualization is to make other data sources available without using a Linked Server and without the time it takes to develop an ETL process to move the data. On a related note, you can cut out the time it takes for an ETL job to actually move the data somewhere like a data warehouse or flattened tables for reporting. Third, the PolyBase feature has a built-in engine that can provide query performance not available via a Linked Server. Fourth, I wanted to provide a way for developers to query data in the AS400 without having to learn the different syntax required by the AS400 iSeries. Fifth, query writers can also join the external table to local SQL Server data.

Lee used the ODBC driver functionality; click through to see how that worked and what needed to change.
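
As a rough sketch of what the ODBC-based PolyBase setup looks like in SQL Server 2019 (the driver name, host, credential, and object names are all placeholders rather than details from Lee's post):

    -- Assumes PolyBase is installed/enabled and a database master key already exists.
    CREATE DATABASE SCOPED CREDENTIAL AS400Credential
    WITH IDENTITY = 'as400_user', SECRET = 'placeholder_password';

    CREATE EXTERNAL DATA SOURCE AS400
    WITH
    (
        LOCATION = 'odbc://my-as400-host:446',
        CONNECTION_OPTIONS = 'Driver={iSeries Access ODBC Driver}',
        PUSHDOWN = ON,
        CREDENTIAL = AS400Credential
    );

    CREATE EXTERNAL TABLE dbo.RemoteOrders
    (
        OrderId INT,
        OrderDate DATE
    )
    WITH
    (
        LOCATION = 'MYLIB.ORDERS',
        DATA_SOURCE = AS400
    );

From there, dbo.RemoteOrders can be queried and joined to local tables in ordinary T-SQL.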

String Parsing with SQLCLR

Josh Darnell would like you to give CLR a try:

The basic problem is this: given a string of arbitrary characters, return a new string containing only numbers. If there are no numbers, return NULL.

One use case for a function like this would be removing special characters from a phone number, although that’s not the specific goal of this function.

Doing string manipulation directly in SQL Server using T-SQL is somewhat infamously slow. This is stated very nicely in Aaron Bertrand’s (b | t) blog post Performance Surprises and Assumptions : STRING_SPLIT() (note he’s talking about splitting strings, the emphasis is mine to illustrate the broader point):

Throughout, my conclusion has been: STOP DOING THIS IN T-SQL. Use CLR or, better yet, pass structured parameters like DataTables from your application to table-valued parameters (TVPs) in your procedures, avoiding all the string construction and deconstruction altogether – which is really the part of the solution that causes performance problems.

For the life of me, I really don’t get why CLR is supposed to be so scary for DBAs. The best answer I have is that they matched the .NET term “unsafe” (which means unmanaged code) and DBAs, without a .NET background, interpreted that the wrong way.
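
For contrast, here is a minimal T-SQL-only sketch of the transformation Josh describes, the kind of character-by-character approach the post argues is far slower than a CLR implementation:

    CREATE OR ALTER FUNCTION dbo.DigitsOnly (@Input nvarchar(4000))
    RETURNS nvarchar(4000)
    AS
    BEGIN
        DECLARE @Output nvarchar(4000) = @Input;
        DECLARE @Pos int = PATINDEX(N'%[^0-9]%', @Output);

        -- Remove the first non-digit character until none remain.
        WHILE @Pos > 0
        BEGIN
            SET @Output = STUFF(@Output, @Pos, 1, N'');
            SET @Pos = PATINDEX(N'%[^0-9]%', @Output);
        END;

        -- Per the problem statement: if there are no numbers, return NULL.
        RETURN NULLIF(@Output, N'');
    END;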

Generating Scripts as a Notebook in SSMS 18.5

Emanuele Meazzo is excited about some new functionality in SQL Server Management Studio 18.5:

Microsoft just dropped SSMS 18.5 after almost 5 long months without any updates; this new release fixes a lot of bugs and introduces a few new features; above them all, I’m now showing you the following.

I’m sure that you’ve used the “Generate Scripts” feature in SSMS quite a few times; you can generate the code for schema and/or data for any (or all) of the objects in your DB, especially if you haven’t embraced that sweet, elusive, DevOps workflow.

Well, good news! In addition to a file, the clipboard, and the good ol’ new query window, you can now export directly to a new Notebook.

Read on to see what you have to do and what the output looks like.

All About Table Expressions

Itzik Ben-Gan has started a series on table expressions:

Perhaps this will come as a surprise to some, but I actually do find the use of the term table in common table expression very appropriate. In fact, I find the use of the term table expression appropriate. To me, the best way to describe what a CTE is in T-SQL is as a named table expression. The same applies to what T-SQL calls derived tables (the specific language construct as opposed to the general idea), views, and inline TVFs. They are all named table expressions.

If you can bear with me a bit, I’ll provide the reasoning for my view of things in this article. It occurred to me that both the naming confusion and the confusion around whether there’s a persistency aspect to table expressions can be cleared up with a better understanding of the fundamentals of our field of relational database management systems. Those fundamentals being relational theory, how SQL (the standard language) relates to it, and how the T-SQL dialect used in the SQL Server and Azure SQL Database implementations relates to both.

There’s a lot of depth in this post, so I recommend a careful reading.
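
As a quick illustration of the “named table expression” framing, the same inner query can appear as a derived table (named where it is used) or as a CTE (named up front); the Sales.Orders table here is hypothetical:

    -- Derived table: the table expression is named inline, at the point of use.
    SELECT d.custid, d.order_count
    FROM
    (
        SELECT custid, COUNT(*) AS order_count
        FROM Sales.Orders
        GROUP BY custid
    ) AS d
    WHERE d.order_count > 10;

    -- CTE: the same table expression, named before the outer query that uses it.
    WITH OrderCounts AS
    (
        SELECT custid, COUNT(*) AS order_count
        FROM Sales.Orders
        GROUP BY custid
    )
    SELECT custid, order_count
    FROM OrderCounts
    WHERE order_count > 10;

Views and inline TVFs name the same sort of table expression as well, just as reusable objects in the database rather than inline in a single query.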
