2020-04-09 – Curated SQL

Stateful Functions in Apache Flink

Published 2020-04-09 by Kevin Feasel

Stephan Ewen announces Stateful Functions 2.0:

Today, we are announcing the release of Stateful Functions (StateFun) 2.0 — the first release of Stateful Functions as part of the Apache Flink project. This release marks a big milestone: Stateful Functions 2.0 is not only an API update, but the first version of an event-driven database that is built on Apache Flink.
Stateful Functions 2.0 makes it possible to combine StateFun’s powerful approach to state and composition with the elasticity, rapid scaling/scale-to-zero and rolling upgrade capabilities of FaaS implementations like AWS Lambda and modern resource orchestration frameworks like Kubernetes.
With these features, Stateful Functions 2.0 addresses two of the most cited shortcomings of many FaaS setups today: consistent state and efficient messaging between functions.

Read on to see how it works.

Reading JSON in .NET from a DataTable

Published 2020-04-09 by Kevin Feasel

Hasan Savran ran into an issue parsing JSON data from SQL Server via .NET:

FOR JSON lets you return the data in JSON format. As you might know, SQL Server can return and query JSON documents, but it doesn’t have a special data type for JSON documents. You must store the data as a string in SQL Server. You can make SQL Server work like a NoSQL database. Your web application can retrieve data as a JSON document, and you can use dynamic objects to make things flexible.
Let’s see an example first. In the following example, I retrieve data as a JSON document and send it directly to my front end as a string. JavaScript parses it and generates a grid from it. It’s very flexible because there is no schema. The front end will display whatever SQL Server returns. You can change the query and, without changing any code in the middle, your grid will display the data.

There are limitations in how much JSON gets generated on the buffer at a time, so click through to see how you can rebuild the entire JSON output for a large file.

Querying an AS400 with PolyBase

Published 2020-04-09 by Kevin Feasel

Lee Markum proves you can read data from an AS400 via PolyBase:

Before I dive into that, why was I interested in this feature? What did I hope to gain? Well, first of all, there was definitely the motivation of wondering, “Can we get this to work?” Secondly, and more practically, the promise of SQL Server Data Virtualization is to make other data sources available without using a Linked Server and without the time it takes to develop an ETL process to move the data. On a related note, you can cut out the time it takes for an ETL job to actually move the data somewhere like a data warehouse or flattened tables for reporting. Third, the PolyBase feature has a built-in engine that can provide query performance not available via Linked Server. Fourth, I wanted to provide a way for developers to query data in the AS400 without having to learn the different syntax required by the AS400 iSeries. Fifth, query writers can also join the external table to local SQL Server data.

Lee used the ODBC driver functionality; click through to see how that worked and what needed to change.

String Parsing with SQLCLR

Published 2020-04-09 by Kevin Feasel

Josh Darnell would like you to give CLR a try:

The basic problem is this: given a string of arbitrary characters, return a new string containing only numbers. If there are no numbers, return NULL.
One use case for a function like this would be removing special characters from a phone number, although that’s not the specific goal of this function.
Doing string manipulation directly in SQL Server using T-SQL is somewhat infamously slow. This is stated very nicely in Aaron Bertrand’s blog post Performance Surprises and Assumptions : STRING_SPLIT() (note he’s talking about splitting strings; the emphasis is mine to illustrate the broader point):
Throughout, my conclusion has been: STOP DOING THIS IN T-SQL. Use CLR or, better yet, pass structured parameters like DataTables from your application to table-valued parameters (TVPs) in your procedures, avoiding all the string construction and deconstruction altogether – which is really the part of the solution that causes performance problems.

For the life of me, I really don’t get why CLR is supposed to be so scary for DBAs. The best answer I have is that the .NET term “unsafe” (which refers to unmanaged code) gets interpreted the wrong way by DBAs without a .NET background.
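
To make the quoted problem statement concrete, here is a minimal sketch of the same logic in Python. This is not the SQLCLR implementation from the linked post. The function name `digits_only` is hypothetical, `None` stands in for SQL NULL, and treating "numbers" as the ASCII digits 0 through 9 is an assumption.

```python
from typing import Optional


def digits_only(value: Optional[str]) -> Optional[str]:
    """Return only the ASCII digits in value, or None when there are none."""
    if value is None:
        # A NULL input has no digits, so the result is NULL as well.
        return None
    # Compare against "0".."9" explicitly; str.isdigit() would also accept
    # non-ASCII digits such as superscripts, which is not wanted here.
    digits = "".join(ch for ch in value if "0" <= ch <= "9")
    # An empty result means the input held no digits at all.
    return digits or None


cleaned = digits_only("(555) 123-4567")
```

The phone number case mentioned in the quote is the obvious usage: punctuation and spaces are dropped, and an input with no digits yields `None` rather than an empty string.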

Generating Scripts as a Notebook in SSMS 18.5

Published 2020-04-09 by Kevin Feasel

Emanuele Meazzo is excited about some new functionality in SQL Server Management Studio 18.5:

Microsoft just dropped SSMS 18.5 after almost 5 long months without any updates; this new release fixes a lot of bugs and introduces a few new features. Above them all is the one I’m showing you here.
I’m sure that you have used the “Generate Scripts” feature in SSMS quite a few times; you can generate the code for schema and/or data for any (or all) of the objects in your DB, especially if you haven’t embraced that sweet, elusive DevOps workflow.
Well, good news! Other than file, clipboard and the good ol’ new query window, you can now export directly to a new Notebook.

Read on to see what you have to do and what the output looks like.

All About Table Expressions

Published 2020-04-09 by Kevin Feasel

Itzik Ben-Gan has started a series on table expressions:

Perhaps this will come as a surprise to some, but I actually do find the use of the term table in common table expression very appropriate. In fact, I find the use of the term table expression appropriate. To me, the best way to describe a CTE in T-SQL is as a named table expression. The same applies to what T-SQL calls derived tables (the specific language construct as opposed to the general idea), views and inline TVFs. They are all named table expressions.
If you can bear with me a bit, I’ll provide the reasoning for my view of things in this article. It occurred to me that both the naming confusion, and the confusion around whether there’s a persistency aspect to table expressions, can be cleared up with a better understanding of the fundamentals of our field of relational database management systems. Those fundamentals are relational theory, how SQL (the standard language) relates to it, and how the T-SQL dialect used in the SQL Server and Azure SQL Database implementations relates to both.

There’s a lot of depth in this post, so I recommend a careful reading.

Dataflows vs Datasets in Power BI

Published 2020-04-09 by Kevin Feasel

Reza Rad disambiguates two Power BI concepts:

I have presented on Power BI dataflows and datasets a lot, and one of the questions I always get is: what is the difference between a dataflow and a dataset? So I thought it better to explain it in a post and help everyone understand. In this post, you will learn what the differences between these two components are, when and where you use each of them, and how they work together alongside other components of Power BI.

Read on to learn where each is useful.
