Press "Enter" to skip to content

Day: November 20, 2020

Q&A about the Lakehouse

Terry McCann posts Q&A from Simon Whiteley’s session on Lakehouse models in Spark 3.0:

“WHILE ALL THE HADOOP PROVIDERS PROMOTED THE DATALAKE PARADIGM BACK THEN, HOW THE INDUSTRY AND THE OTHER DATA LAKE PROVIDERS ARE SHIFTING TO/CONSIDERING THE LAKE HOUSE PARADIGM?“

It’s a direction that most providers are heading in, albeit under the “unified analytics” or “modern warehouse” name rather than the “lakehouse”. But most big relational engines are moving to bring in spark/big data capabilities, other lake providers are looking to expand their SQL coverage. It’s a bit of a race to who gets to the “can do both sides as well as a specialist tool” point first. Will we see other tools championing it as a “lakehouse”, or is that term now tied too closely as a “vendor-specific” term coming from Databricks? We’ll see…

Click through for some good questions and thoughtful answers.

Comments closed

Console Coloration with Powershell

Jeffrey Hicks takes us through color changes with Powershell:

I readily admit that I spend a great deal of my day at a PowerShell prompt. My day is very much run from the command-line, and has been for quite some time. This used to be a drab, gray existence. But I’ve been finding ways to liven things up. Here’s one way.

The PSScriptTools module includes a number of custom format files with alternate views. You need to make sure the module is imported before you can use any of them.

It’s interesting just how much of a quality of life improvement file type coloration is. I’ll go out of my way to use colorized shells with bash, as well as pretty much any IDE and even Notepad++.

Comments closed

Monitoring Availability Groups via DMV

Rajendra Gupta continues a series on Availability Groups:

In the previous article, Explore dynamic management views for monitoring SQL Server Always On Availability Groups, we explored the DMV’s for the Availability group for Windows Failover Clusters.

This article takes a further step and covers useful DMV related to availability replica and databases. Let’s start our journey with this article.

Click through for the article.

Comments closed

The Filter Operator

Hugo Cornelis continues a dive into execution plan operators:

The Filter operator is used to remove rows from a result set that do not satisfy a specified condition.

Its most common usage is to implement a WHERE clause, although many elements of a WHERE clause get implemented in other ways (e.g. by pushing them down into an Index Seek or Index Scan operator), and Filter operators can also be used when the optimizer transforms other query elements into any type of filtering.

The Filter operator can only verify conditions based on the data in the current row; if data from different rows needs to be compared for the filter, the optimizer generates an execution plan that first adds the data to be compared to the row, or it uses different operators.

Click through for Hugo’s normal in-depth look.

Comments closed

Msg 7390: The Requested Operation Could Not Be Performed

Jack Vamvas fixes a problem:

 I have a SQL Server Linked Server configured , pointing to an ODBC – accessing a MongoDB driver . The test connections all work OK – and no problems running an OPENQUERY select statement using the Linked Server. 

But when I attempt to run an EXECUTE AT , and attempt to INSERT the data into a #temp table – I get an error message:

Msg 7390, Level 16, State 2, Line 6
The requested operation could not be performed because OLE DB provider “MSDASQL” for linked server “my_linked_server” does
not support the required transaction interface.

Read on to see what the problem is and how you can solve it.

Comments closed

Filter Operators in a Query

Erik Darling shares some introductory thoughts on the filter operator:

When we write queries that need to filter data, we tend to want to have that filtering happen as far over to the right in a query plan as possible. Ideally, data is filtered when we access the index.

Whether it’s a seek or  a scan, or if it has a residual predicate, and if that’s all appropriate isn’t really the question.

In general, those outcomes are preferable to what happens when SQL Server is unable to do any of them for various reasons. The further over to the right in a query plan we can reduce the number of rows we need to contend with, the better.

Read on to see where Erik takes this and stay tuned for part 2.

Comments closed