Press "Enter" to skip to content

Month: May 2019

Using Power Query to Expand Out Missing Dates

Matt Allington solves a problem in Power Query:

Suppose you have data in the form of dates (not consecutive) with a value for each of the dates (see the table below left side). You need to expand the rows of the table (create the missing rows) so that you will have all the consecutive dates in the given range and each of the dates has the previous updated value (see the table below right side).

The solution has a pretty large number of steps, but each one is straightforward.
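As a rough point of comparison, here's a T-SQL sketch of the same forward-fill idea: build every date in the range, then carry the latest known value forward. The table and column names (dbo.SparseValues, ValueDate, SomeValue) are invented for illustration and aren't from Matt's post, which works entirely in Power Query.

-- Generate every date in the range, then pick up the most recent
-- known value on or before each date (a forward fill).
DECLARE @Start date = '2019-05-01', @End date = '2019-05-31';

WITH AllDates AS
(
    SELECT @Start AS TheDate
    UNION ALL
    SELECT DATEADD(DAY, 1, TheDate)
    FROM AllDates
    WHERE TheDate < @End
)
SELECT d.TheDate,
       f.SomeValue
FROM AllDates AS d
OUTER APPLY
(
    SELECT TOP (1) s.SomeValue
    FROM dbo.SparseValues AS s
    WHERE s.ValueDate <= d.TheDate
    ORDER BY s.ValueDate DESC
) AS f
OPTION (MAXRECURSION 0);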


Rewriting Expensive Updates

Erik Darling takes us through an experiment:

Let’s also say that bad query is taking part in a modification.

UPDATE u2
SET u2.Reputation *= 2
FROM dbo.Users AS u
JOIN dbo.Users AS u2
    ON CHARINDEX(u.DisplayName, u2.DisplayName) > 0
WHERE u2.Reputation >= 100000
    AND u.Id <> u2.Id;

This query will run for so long that we’ll get sick of waiting for it. It’s really holding up writing this blog post.

Erik rewrites this query a couple of times. Click through to learn what he does and why he does it.
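To give a flavor of the kind of rewrite involved (a common pattern, though not necessarily the exact one Erik lands on), you can materialize the expensive CHARINDEX join once and then drive a cheap keyed update from it:

-- Illustrative only: split the expensive join into a temp table,
-- then update by key in a second, much cheaper statement.
SELECT DISTINCT u2.Id
INTO #ToUpdate
FROM dbo.Users AS u
JOIN dbo.Users AS u2
    ON CHARINDEX(u.DisplayName, u2.DisplayName) > 0
WHERE u2.Reputation >= 100000
    AND u.Id <> u2.Id;

UPDATE u2
SET u2.Reputation *= 2
FROM dbo.Users AS u2
JOIN #ToUpdate AS t
    ON t.Id = u2.Id;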


Triggers and Multi-Record Changes

Brent Ozar points out a common problem with trigger design:

When you declare variables and set them using one row from the INSERTED or DELETED virtual table, you have no idea which row you’re going to get. Even worse, sometimes this trigger will update one row, and sometimes it won’t – because it might happen to grab a row with a reputation under 1,000!

It’s an easy mistake to make and one which can have a major impact.
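The usual fix is to stay set-based inside the trigger and join to the INSERTED virtual table instead of pulling one row into variables. A minimal sketch, with a hypothetical table, column, and threshold loosely based on Brent's example:

-- Set-based trigger sketch: dbo.Users, IsTrusted, and the 1,000-reputation
-- threshold are illustrative, not taken from Brent's post.
CREATE OR ALTER TRIGGER dbo.Users_AfterUpdate
ON dbo.Users
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Handles one updated row or ten thousand equally well.
    UPDATE u
    SET u.IsTrusted = 1
    FROM dbo.Users AS u
    JOIN inserted AS i
        ON i.Id = u.Id
    WHERE i.Reputation >= 1000;
END;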


Enabling Large Memory Pages in SQL Server

David Klee talks us through large memory pages:

SQL Server Enterprise Edition can leverage large memory pages to reduce the amount of memory pointers required for larger SQL Server deployments. Reducing the number of pointers makes the database engine more efficient, especially for SQL Servers with greater than 32GB of RAM. A normal memory block is 4KB, and many thousands of pointers are required to manage the memory underneath a larger SQL Server. Large memory pages can change the block size to 2MB, greatly reducing the number of pointers required for memory management.

Read on to see what effect this has, as well as when to use them and—more importantly—when not to use them.
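As a quick sanity check (this query is mine, not from David's post), the process-memory DMV reports large page usage directly; on Enterprise Edition, enabling large pages generally means granting the Lock Pages in Memory privilege and turning on trace flag 834, then restarting the instance.

-- A non-zero large_page_allocations_kb means the instance is using large pages.
SELECT physical_memory_in_use_kb,
       locked_page_allocations_kb,
       large_page_allocations_kb
FROM sys.dm_os_process_memory;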


Using Powershell Core in Containers

Anthony Nocentino shows us how we can run Powershell Core in containers:

Now, with that last technique, we’ve encapsulated the entire lifecycle of the execution of that script into one line of code. It’s like this script execution never happened…or did it 😉 All kidding aside, we effectively have a serverless computing platform now. Using this technique in our data centers, we can spin up a container, on any version of PowerShell on any platform, run some workload/script and when the workload finishes, the container just goes away. For this to work well, we will need something to drive that process. In an upcoming blog post, we’ll talk more about how we can automate the running of PowerShell containers in Kubernetes.
 
In this post, we covered a lot, we looked at how you can interactively run PowerShell Core in a container, how you can pass cmdlets into a container at runtime, running different versions of PowerShell Core and also how you can persistently store scripts outside of containers in volumes and run those scripts in your containers. We also looked at how you can encapsulate the whole execution of a script and the containers life cycle into one line of code. Really giving you the ability to run PowerShell Core anywhere on any platform.

Check it out for sure. Containers today are where VMs were about a decade ago: becoming more common but still a bit “out there” for administrators. It’s not a stretch to say that within a few years, containers will be as ubiquitous as VMs were by 2012, if not more so.


Optimizing Kafka Streams Apps

Bill Bejeck and Guozhang Wang give us an idea of some Kafka Streams internals:

At a high level, when you use the Streams DSL, it auto-creates the processor nodes as well as state stores if needed, and connects them to construct the processor topology. To dig a little deeper, let’s take an example and focus on stateful operators in this section.

An important observation regarding the Streams DSL is that most stateful operations are keyed operations (e.g., joins are based on record keys, and aggregations are based on grouped-by keys), and the computation for each key is independent of all the other keys. These computational patterns fall under the term data parallelism in the distributed computing world. The straightforward way to execute data parallelism at scale is to just partition the incoming data streams by key, and work on each partition independently and in parallel. Kafka Streams leans heavily on this technique in order to achieve scalability in a distributed computing environment.

They then use that info to show you how you can make your Streams apps faster.


Azure SQL Database Edge

Randolph West gives us some quick info on Azure SQL Database Edge:

I was first made aware of this edition at the MVP Summit earlier this year, and I need to clear some things up for folks who might be confused about the name, and who it’s for.

Firstly, recall that Azure means “hybrid” now, so while we might expect that it refers to cloud computing, it also takes on-premises infrastructure into account.

Secondly, this is the full SQL Server database engine running on a 64-bit ARM CPU. It could run on a Raspberry Pi, or — provided there was support for the other hardware — Android or iOS devices, however it is geared towards edge devices that gather data from IoT sensors and other data points. Think of this as one step up from the IoT devices capturing data in the field, whether it be wine-making, oil and gas, manufacturing, you name it.

Read the whole thing. I’m definitely interested in how they handle time series. With luck, it’s done well and brought over to the main product.


Parsing JSON with T-SQL

Dave Mason has a primer on JSON parsing using T-SQL:

Microsoft added support for JSON data beginning with SQL Server 2016. JSON is an open-standard file format consisting of attribute–value pairs and array data types. It is commonly used to transmit data objects for asynchronous browser–server communication. But it is also used for storing unstructured data in files or NoSQL databases such as Microsoft Azure Cosmos DB. For most of us, SQL Server’s support for JSON probably means two things: we can convert relational data to JSON and vice versa. In this post, I’ll focus on converting JSON to relational data and share what I’ve learned from a recent experience.

I've been pleasantly surprised with the way JSON support works in SQL Server. It has handled every complicated scenario I've had to deal with so far, including nesting, choosing whether the outer element is an array, numbers with or without quotes, and more.
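If you haven't played with it yet, OPENJSON with a WITH clause covers most of those scenarios. A small, self-contained example (the JSON document and column names are invented for illustration, not taken from Dave's post):

-- SQL Server 2016+: shred a JSON array into rows and columns,
-- including a nested array handled via CROSS APPLY.
DECLARE @json nvarchar(max) = N'[
    {"Id": 1, "Name": "Anna",  "Scores": [95, 87]},
    {"Id": 2, "Name": "Bjorn", "Scores": [78]}
]';

SELECT j.Id,
       j.Name,
       s.[value] AS Score
FROM OPENJSON(@json)
WITH
(
    Id     int           '$.Id',
    Name   nvarchar(50)  '$.Name',
    Scores nvarchar(max) '$.Scores' AS JSON
) AS j
CROSS APPLY OPENJSON(j.Scores) AS s;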


Trouble Installing CTP 2.5: msoledbsql.msi and msodbcsql.msi

Solomon Rutzky spent a lot of time troubleshooting a pernicious issue with SQL Server CTP 2.5 installation:

The other day, I was <sarcasm>blessed / honored / delighted</sarcasm> to waste several hours attempting to install SQL Server 2019 CTP 2.5 over and over again. Each time it would get through the first several steps of the installation process, but then encounter some condition causing it to rollback and finally end with the <sarcasm>super helpful</sarcasm> error message of:

An error occurred for a dependency of the feature causing the setup process for the feature to fail.
 
Use the following information to resolve the error, and then try the setup process again.

That might have been ok had there actually been any information that followed. But no, there was none, not even a small piece of unhelpful information.

Solomon takes us through the troubleshooting blow by blow and wraps up with a quick rundown of the solution.


SSIS 2019 Preview Released

Koen Verbeeck notes something very nice:

It is not a joke: SSIS is available for Visual Studio 2019 as a preview. Whoa, hold on. SQL Server 2019 hasn’t been released yet? But there’s already an SSIS 2019? Didn’t we have to wait months after the release of SQL Server 2017 before we had an SSIS version for Visual Studio 2017?

Yes, we did; you can read all about it here.

But times have changed, apparently. The SSIS team has caught up with the rest of the BI tools: SSIS projects are now available from the Visual Studio marketplace.

Read on to see what this means for SQL Server Data Tools.
