Month: April 2019

Using dbatools for Inventory Analysis

Andreas Schubert gives us a way to learn more about our SQL Server inventories with dbatools:

With the multitude of environments that I am operating, it’s impossible to remember every server, every database or the multiple different ways they are interacting with each other. Therefore, one of the first things I do when taking over a consulting engagement is mapping out all those different bits of information.

Since the environments usually change pretty fast, my goal is to automate this process as much as possible.

In this series of posts, I will try to show you how I am implementing this. Of course, your requirements or implementations may differ, but hopefully this blog post can give you some ideas about your tasks too.

Click through for a script. There are also some good comments.

READPAST In Action

Erik Darling shows how READPAST is no panacea:

Locking hints can be really handy in these situations, especially the READPAST hint. The documentation for it says that it allows you to skip over row level locks (that means you can’t skip over page or object level locks).

What it leaves out is that your READPAST query may also need to try to take row level shared locks.

Read on for an example as well as an alternative which ends up being better in this case.

SQL Server 2019 CTP 2.5

The SQL Server team has a new CTP out:

We’re excited to announce the monthly release of SQL Server 2019 community technology preview (CTP) 2.5. SQL Server 2019 is the first release of SQL Server to closely integrate Apache Spark™ and the Hadoop Distributed File System (HDFS) with SQL Server in a unified data platform.

This is a big one for me: lots of changes in Big Data Clusters, PolyBase on Linux, and a Java SDK. Looks like I am going to be pretty busy.

Why Optimize for Ad Hoc Workloads

Randolph West explains why optimize for ad hoc workloads should be enabled by default:

Enabling the optimize for ad hoc workloads configuration setting will reduce the amount of memory used by all query plans the first time they are executed. Instead of storing the full plan, a stub is stored in the plan cache. Once that plan executes again, only then is the full plan stored in memory. What this means is that there is a small overhead for all plans that are run more than once, on the second execution.

Read the whole argument. I don’t know that I’ve seen an instance yet where this setting was a really bad choice.
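
To make the trade-off concrete, here is a toy model of the behavior described in the quote. It is only a sketch: the `ToyPlanCache` class, the relative sizes (1 unit per stub, 100 per full plan), and the workload are all made up for illustration and say nothing about how SQL Server actually sizes or manages its plan cache.

```python
from typing import Dict


class ToyPlanCache:
    """Toy model of a plan cache; not SQL Server's implementation."""

    STUB_SIZE = 1         # hypothetical relative size of a plan stub
    FULL_PLAN_SIZE = 100  # hypothetical relative size of a full plan

    def __init__(self, optimize_for_ad_hoc: bool) -> None:
        self.optimize_for_ad_hoc = optimize_for_ad_hoc
        self.entries: Dict[str, str] = {}  # query text -> "stub" or "full"
        self.compilations = 0

    def execute(self, query: str) -> None:
        state = self.entries.get(query)
        if state == "full":
            return  # full plan already cached, so it is reused
        self.compilations += 1
        if self.optimize_for_ad_hoc and state is None:
            self.entries[query] = "stub"  # first execution: keep only a stub
        else:
            self.entries[query] = "full"  # setting off, or second execution

    def memory_used(self) -> int:
        sizes = {"stub": self.STUB_SIZE, "full": self.FULL_PLAN_SIZE}
        return sum(sizes[state] for state in self.entries.values())


def run_workload(cache: ToyPlanCache) -> ToyPlanCache:
    # 1,000 single-use ad hoc queries plus one query that runs three times.
    for i in range(1000):
        cache.execute(f"SELECT * FROM t WHERE id = {i}")
    for _ in range(3):
        cache.execute("SELECT COUNT(*) FROM t")
    return cache


off = run_workload(ToyPlanCache(optimize_for_ad_hoc=False))
on = run_workload(ToyPlanCache(optimize_for_ad_hoc=True))
print(off.memory_used(), off.compilations)  # 100100 1001
print(on.memory_used(), on.compilations)    # 1100 1002
```

In this model the setting costs one extra compilation for the repeated query and saves almost all of the memory spent on single-use plans, which is the shape of the argument in the linked post.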

Testing an Event-Driven System

Andy Chambers takes us through how to test an event-driven system:

Each distinct service has a nice, pure data model with extensive unit tests, but now with new clients (and consequently new requirements) coming thick and fast, the number of these services is rapidly increasing. The testing guardian angel who sometimes visits your thoughts during your morning commute has noticed an increase in the release of bugs that could have been prevented with better integration tests.

Finally after a few incidents in production, and with velocity slowing down due to the deployment pipeline frequently being clogged up by flaky integration tests, you start to think about what you want from your test suite. You set off looking for ideas to make really solid end-to-end tests. You wonder if it’s possible to make them fast. You think about all the things you could do with the time freed up by not having to apply manual data fixes that correct for deploying bad code.

At the end of it all, hopefully you’ll arrive here and learn about the Test Machine.

Check it out. Testing these types of systems is certainly possible, but can be a bit difficult because of the additional layers of complexity.

Monads and Monoids and Functors

Anmol Sarna explains the concept of a monad:

In functional programming, a monad is a design pattern that allows structuring programs generically while automating away boilerplate code needed by the program logic.

To simplify the above definition a bit more, we can think of monads as wrappers. You just take an object and wrap it with a monad.

Let’s just be clear on one thing: a monad is not a class or a trait, nor is it dedicated only to the Scala language. It is a concept related to functional programming.

This also includes a few examples in Scala.
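
For readers who want the "wrapper" idea without Scala, here is a minimal sketch of an option-style monad in Python. The names `Maybe`, `unit`, `flat_map`, `parse_int`, and `reciprocal` are chosen for illustration and do not come from the linked post. One simplifying assumption: `None` stands for "no value", so this version cannot wrap `None` itself.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Maybe(Generic[A]):
    """Wraps a value that may be absent (None means absent in this sketch)."""

    value: Optional[A] = None

    @staticmethod
    def unit(value: A) -> "Maybe[A]":
        # Wrap a plain value in the monad.
        return Maybe(value)

    def flat_map(self, f: Callable[[A], "Maybe[B]"]) -> "Maybe[B]":
        # Apply f only when a value is present; otherwise stay empty.
        if self.value is None:
            return Maybe()
        return f(self.value)

    def map(self, f: Callable[[A], B]) -> "Maybe[B]":
        # map is flat_map followed by re-wrapping the result.
        return self.flat_map(lambda v: Maybe.unit(f(v)))


def parse_int(text: str) -> Maybe[int]:
    try:
        return Maybe.unit(int(text))
    except ValueError:
        return Maybe()


def reciprocal(n: int) -> Maybe[float]:
    return Maybe() if n == 0 else Maybe.unit(1 / n)


print(Maybe.unit("4").flat_map(parse_int).flat_map(reciprocal))    # Maybe(value=0.25)
print(Maybe.unit("0").flat_map(parse_int).flat_map(reciprocal))    # Maybe(value=None)
print(Maybe.unit("abc").flat_map(parse_int).flat_map(reciprocal))  # Maybe(value=None)
```

The boilerplate being "automated away" here is the repeated check for a missing value: each step is written as if its input exists, and `flat_map` short-circuits the chain when it does not.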

CosmosDB Continuation Tokens

Hasan Savran walks us through the idea of a continuation token in CosmosDB:

In CosmosDB, the TOP option is required and its default value is 100. You can change the default value by sending a different value using the request header “x-ms-max-item-count”. If you have 40000 rows in your Orders table and run the same query in CosmosDB, you will get 100 rows (documents) rather than 40000 rows (documents). CosmosDB returns all kinds of metadata with the data. You can find this metadata in the response headers. One of those headers is “x-ms-continuation”, and it is what lets you retrieve the rest of the rows of your query. If you would like to get the next set of results, you can take the “x-ms-continuation” value from the response headers and attach it to your next request to get the next set of rows. The CosmosDB SDK does this automatically for you. The SDK checks for the x-ms-continuation value when you check the HasMoreResults property. If this property is true, that means CosmosDB returned a continuation token.

I have fanciful notions of SQL Server offering something similar—think of a grid built from a query. Get the first 50 rows from the result set and store that off in tempdb somewhere, using the “continuation token” (which might just be the full name in tempdb) and auto-trashing after a certain amount of time.
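
The paging loop is easy to sketch without a real service. The following is an in-memory stand-in, not the Cosmos DB SDK or REST API: `ORDERS`, `query_page`, and `fetch_all` are hypothetical names, and encoding an offset in the token is purely an assumption for the demo. Real continuation tokens are opaque and should be passed back unchanged.

```python
import base64
import json
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for an Orders container with 250 documents.
ORDERS: List[Dict[str, int]] = [{"id": i} for i in range(250)]


def query_page(
    max_item_count: int = 100, continuation: Optional[str] = None
) -> Tuple[List[Dict[str, int]], Optional[str]]:
    """Return one page plus a continuation token, or None when no rows remain."""
    offset = 0
    if continuation is not None:
        offset = json.loads(base64.b64decode(continuation))["offset"]
    page = ORDERS[offset:offset + max_item_count]
    next_offset = offset + len(page)
    token = None
    if next_offset < len(ORDERS):
        payload = json.dumps({"offset": next_offset}).encode()
        token = base64.b64encode(payload).decode()
    return page, token


def fetch_all(max_item_count: int = 100) -> List[Dict[str, int]]:
    """Keep requesting pages until the service stops returning a token."""
    results: List[Dict[str, int]] = []
    token: Optional[str] = None
    while True:
        page, token = query_page(max_item_count, token)
        results.extend(page)
        if token is None:  # plays the role of HasMoreResults being false
            return results


first_page, first_token = query_page()
print(len(first_page), first_token is not None)  # 100 True
print(len(fetch_all()))                          # 250
```

The caller never interprets the token; it only hands it back with the next request, which is what keeps the server free to change how it tracks position.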

Window Functions with IGNORE NULLs

Lukas Eder walks us through a bit of functionality I wish we had in SQL Server:

On each row, the VALUE column should either contain the actual value, or the “last_value” preceding the current row, ignoring all the nulls. Note that I specifically wrote this requirement using specific English language. We can now translate that sentence directly to SQL:

last_value (t.value) ignore nulls over (order by d.value_date)

Since we have added an ORDER BY clause to the window function, the default frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW applies, which colloquially means “all the preceding rows”. (Technically, that’s not accurate. It means all rows with values less than or equal to the value of the current row – see Kim Berg Hansen’s comment)

Only a few database products have this and SQL Server is not one of them.
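
To see what the expression computes, here is a small Python emulation of `last_value(...) ignore nulls over (order by ...)`. It is a sketch of the semantics only, with `None` playing the role of NULL, and it assumes each `value_date` is unique so that the default RANGE frame and "all preceding rows" coincide. The function name and sample data are made up for illustration.

```python
from typing import Iterable, List, Optional, Tuple

Row = Tuple[str, Optional[int]]  # (value_date, value)


def last_value_ignore_nulls(rows: Iterable[Row]) -> List[Row]:
    """For each row in date order, return the most recent non-null value so far."""
    result: List[Row] = []
    last_seen: Optional[int] = None
    for value_date, value in sorted(rows, key=lambda r: r[0]):
        if value is not None:
            last_seen = value  # a real value replaces the carried one
        result.append((value_date, last_seen))
    return result


rows = [
    ("2019-04-01", None),
    ("2019-04-02", 10),
    ("2019-04-03", None),
    ("2019-04-04", None),
    ("2019-04-05", 7),
    ("2019-04-06", None),
]
for row in last_value_ignore_nulls(rows):
    print(row)
# ('2019-04-01', None)
# ('2019-04-02', 10)
# ('2019-04-03', 10)
# ('2019-04-04', 10)
# ('2019-04-05', 7)
# ('2019-04-06', 7)
```

The first row stays null because there is no earlier non-null value to carry forward, which matches what the window function would return.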

Aggregate Pushdown with GROUP BY

Paul White takes us through several performance improvements around aggregate pushdown:

SQL Server 2016 introduced serial batch mode processing and aggregate pushdown. When pushdown is successful, aggregation is performed within the Columnstore Scan operator itself, possibly operating directly on compressed data, and taking advantage of SIMD CPU instructions.

The performance improvements possible with aggregate pushdown can be very substantial. The documentation lists some of the conditions required to achieve pushdown, but there are cases where the lack of ‘locally aggregated rows’ cannot be fully explained from those details alone.

This article covers additional factors that affect aggregate pushdown for GROUP BY queries only. Scalar aggregate pushdown (aggregation without a GROUP BY clause), filter pushdown, and expression pushdown may be covered in a future post.

Read the whole thing.
