Press "Enter" to skip to content

Author: Kevin Feasel

Skipping SSIS Package Validation

Andy Leonard takes us through a new feature in Visual Studio:

A new feature in version 3.9 of the Visual Studio 2019 SSIS extension is “skip package validation.” I’ve tested the functionality and it works well. Here’s how you set it up:

Click through to see how easy it is to enable. I’d imagine that this would work really well when dealing with dynamic packages where connection details aren’t available until runtime. Waiting for validation to fail to connect to a remote data source (on the UI thread!) was always annoying.

Comments closed

Estimated I/O Costs and Spills

Hugo Kornelis goes deep into the rabbit hole:

Typically, when I look at an operator that does I/O, I expected to see values in both the Estimated CPU Cost and the Estimated I/O Cost. An Index Seek locates specific data in an index. The CPU has to do some work, but most of the cost of this operator is the actual I/O. So when I look at an execution plan and I see an Index Seek that has an Estimated CPU Cost of  0.0001787 and an Estimated I/O Cost of 0.003125 (almost 20 times as much), I am not surprise. That’s an expected ratio.

Conversely, operators that don’t do any I/O should of course be estimated to have a zero I/O cost. And that is indeed the case. Operators such as Nested LoopsAssert, or Row Count Spool will always have an Estimated I/O Cost of zero.

And then there are some operators where the Estimated I/O Cost may or may not be zero.

Hugo then walks us through a case where the optimizer is promising you a bad time if you run the query.

Comments closed

T-SQL Tuesday 129 Round-Up

Tamera Clark wraps up T-SQL Tuesday #129:

Well, I don’t even know what to say, such great ideas and creativeness. The last couple of days I have internally struggled with, as Glenn wrote this ” human malware situation”. Reading all of these posts has made me smile.

Read on for twelve people’s thoughts on this month’s topic, a time capsule for SQL Server professionals.

Comments closed

Querying Power BI from Visual Studio Code

Phil Seamark shows us how to write queries against Power BI using Visual Studio Code:

It’s helpful to understand there are two main client libraries for Analysis Services. A client library is what you can add to any new Visual Studio Code Project to provide objects, methods and functions relevant for the tool you are building.

Make sure you download the NetCore (.Net Core) versions of these libraries when working with Visual Studio Code. There are .Net Framework versions of these libraries that are more suited to use with the full Visual Studio product.

Read on for links to those libraries and a thorough demonstration.

Comments closed

Using Flink Stateful Functions

Igal Shilman gives us an example of where Apache Flink’s Stateful Functions come in handy:

In this blog post, we’ll take a look at a class of use cases that is a natural fit for Flink Stateful Functions: monitoring and controlling networks of connected devices (often called the “Internet of Things” (IoT)).

IoT networks are composed of many individual, but interconnected components, which makes getting some kind of high-level insight into the status, problems, or optimization opportunities in these networks not trivial. Each individual device “sees” only its own state, which means that the status of groups of devices, or even the network as a whole, is often a complex aggregation of the individual devices’ state. Diagnosing, controlling, or optimizing these groups of devices thus requires distributed logic that analyzes the “bigger picture” and then acts upon it.

A powerful approach to implement this is using digital twins: each device has a corresponding virtual entity (i.e. the digital twin), which also captures their relationships and interactions. The digital twins track the status of their corresponding devices and send updates to other twins, representing groups (such as geographical regions) of devices. Those, in turn, handle the logic to obtain the network’s aggregated view, or this “bigger picture” we mentioned before.

Read on to see where stateful functions come into play.

Comments closed

Late-Arriving Data with Spark Streaming

Sarfaraz Hussain continues a series on Spark streaming:

The size of the State (discussed in the previous blog) will continue to increase indefinitely as we really don’t know when a bucket can be closed.

But practically a query is not going to receive data 1 week late or in that matter such late-arriving data is of no use to us.

So, to specify the information when to stop considering older buckets for the streaming query we use Watermark.

Read on to see how you can design and implement a watermark.

Comments closed

A Review of Azure Synapse Analytics

Teo Lachev looks at Azure Synapse Analytics:

There is plenty to like in Azure Synapse which is the evaluation of Azure SQL DW. If you’re tasked to implement a cloud-based data warehouse, you have a choice among three Azure SQL Server-based PaaS offerings, including Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse. In a nutshell, Azure SQL Database and Azure SQL MI are optimized for OLTP workloads. For example, they have full logging enabled and replicate each transaction across replicas. Full logging is usually a no-no for decent size DW workloads because of the massive ETL changes involved.

In addition, to achieve good performance, you’ll find yourself moving up the performance tiers and toward the price point of the lower Azure Synapse SKUs. Not to mention that unlike Azure SQL Database, Azure Synapse can be paused, such as when reports hit a semantic layer instead of DW, and this may offer additional cost cutting options.

Teo focuses primarily on SQL Pools and the more SQL-friendly side of things (ELT and Power BI) rather than Spark pools.

Comments closed

Finding Heaps in a Database

David Fowler has a quick one for us:

This is just going to be a quick post but I was asked the question the other day, how can I find all the tables in the database that don’t have a clustered index?

Tables without clustered indexes, known as heaps can give you quite a few different performance problems for various reasons. I don’t really want to go into those here, that could be a subject for another post but finding these heaps is actually very simple.

And David has a short script to show just how simple it is.

Comments closed

Generating Stored Procedure Mappings for Entity Framework Core

Erik Ejlskov Jensen takes us through stored procedure mapping with Entity Framework Core Power Tools:

In my previous post I showed how you can map and use stored procedures manually from EF Core, a process which involved quite a bit of code, and some changes to your derived DbContext.

With the latest release of EF Core Power Tools, you can opt-in to have any SQL Server stored procedures in your database made available to you.

Click through to learn how to do this.

Comments closed