Keep Predicates Meaningful

Gail Shaw shows that adding a valueless predicate to change a scan operation into a seek operation does not guarantee a performance improvement:

I remember a forum thread from a while back. The question was on how to get rid of the index scan that was in the query plan. Now that’s a poor question in the first place, as the scan might not be a problem, but it’s the first answer that really caught my attention.

Since the primary key is on an identity column, you can add a clause like ID > 0 to the query, then SQL will use an index seek.

Technically that’s correct. If the table has an identity column with the default properties (We’ll call it ID) and the clustered index is on that identity column, then a WHERE clause of the form WHERE ID > 0 AND <any other predicates on that table> can indeed execute with a clustered index seek (although it’s in no way guaranteed to do so). But is it a useful thing to do?

Time for a made up table and a test query.

Anything Gail writes is a must-read; this is no exception.

Visualizing Power BI DMV Queries

Chris Webb shows us how to use Power BI Desktop to visualize DMV queries:

However, running DMV queries against a Power BI Desktop model (which of course runs a local version of the same engine that powers Analysis Services Tabular and Power Pivot) and more importantly doing something useful with the information they return, isn’t straightforward. You can run DMV queries from DAX Studio but that will only give you the table of data returned; you need to copy and paste that data out to another tool to be able to analyse this data. Instead it’s possible to use Power BI Desktop’s own functionality for connecting to Analysis Services to connect to its own local data model and run DMV queries.

It looks like there are some limitations to this technique, but for quick and dirty work, it works.

Max Server Memory Matrix

Randolph West has a new tool:

After many months of confusing myself with mental arithmetic, I decided to create a straightforward list based on Jonathan’s algorithm, in 4 GB increments, for my own reference. I would like to share this with you today.

You can bask in the glory that is my Max Server Memory Matrix athttps://bornsql.ca/memory/.

It’s a nice lookup table, so thanks to Randolph for putting that together.

Partitioning Thoughts

Kendra Little has a few questions to ask before you set up sliding-window partitioning:

Map this out before you write the code. When will the jobs run, and what should happen if they fail? Should someone be engaged? What tools will they need, and when is the Service Level Agreement for when the process has to be complete? You’ll need lots of details on this to make sure your automation and documentation meet the bar.

Partitioning is an extra layer of complexity.  It can be a very useful extra layer of complexity, but this is a case where it’s best to spend an hour before you begin and walk through potential issues.  Those potential issues will come—automation jobs will fail, external configuration changes will affect your partition strategy, bad data will sneak in and fill up your supposedly-empty edge partitions.

Check Endpoint Security

Erik Darling ran into an issue with endpoint security while setting up mirroring:

This is the error text:

The ALTER DATABASE command could not be sent to the remote server instance ‘TCP://ORACLEDB.darling.com:5022’. The database mirroring configuration was not changed. Verify that the server is connected, and try again.

SUPER SLEUTH

Alright, that’s silly. I used the GUI. Instead of going to bed I’ll spend some time checking all my VM network settings. BRB.

I’m back. They were all correct. I could ping and telnet and set up linked servers and RDP. What in the name of Shub-Niggurath is going on with this thing?

These things always happen right before bed, right before the big meeting, right before lunch.  They never happen on a slow Tuesday afternoon, it seems…

SQL Server TLS 1.2 Support

Aaron Bertrand has a great explanation of how to plan for TLS 1.2 support in SQL Server:

It seems straightforward, but as of today, not all builds will enable you to rush out and convert to TLS 1.2 exclusively. Here is what I suggest for each set of builds (in addition to patching .NET Framework, SQL Server Native Client, ODBC, and JDBC on all machines)

A protocol change seems like a small thing, but it suddenly gets to be a big thing when services stop working.

H/T Matt Slocum.

Multi-Site Availability Groups

Derik Hammer discusses spanning an Availability Group across multiple sites:

In the architecture above, replica A and B are in the primary data center while replica C and D are in the disaster recovery (DR) site. Like the previous architecture, the disks are displayed as local but the most important part is that they are physically separate. SANs are wonderful systems with a lot of redundancy but they can also be a single point of failure. Keep your Availability Group disks separate.

This is a nice architectural overview.  Once the series is done, it looks like it’ll be a good resource to discuss high availability and disaster recovery with management and show the options and trade-offs.

Parameter Validation Using Pester

Rob Sewell is using Pester to write Powershell unit tests and wants to test the existence of a parameter:

Here is the start of the function. I validate the VMName parameter to ensure that there a VM with that  name does not already exist

Before Rob’s posts, I hadn’t seen Pester before, but if you write a lot of Powershell code, it looks like a nice testing framework.

Aggregates Using OVER

Kevin Feasel

2016-02-02

T-SQL

Slava Murygin shows aggregation and windowing using SUM:

As a conclusion: You CAN use “OVER” clause to do the aggregation in three following cases:
1. When data set is extremely small and fits in just one 8 Kb page;
2. When you want to hide your logic from any future developer or even yourself to make debugging and troubleshooting a nightmare;
3. When you really want to kill your SQL Server and its underlying disk system;

That conclusion’s rather pessimistic for my tastes, mostly because Slava’s trying to do the same thing with a window function that he’s doing with a GROUP BY clause and has multiple functions across different windows (including calculations).  Using SUM() OVER() is powerful when you still need the disaggregated values—for example, running totals.

HDInsight + Power BI + Spark

Reza Rad has a nice walkthrough on integrating several powerful technologies:

Power BI can connect to many data sources as you know, and Spark on Azure HDInsight is one of them. In area of working with Big Data applications you would probably hear names such as Hadoop, HDInsight, Spark, Storm, Data Lake and many other names. Spark and Hadoop are both frameworks to work with big data, they have some differences though. In this post I’ll show you how you can use Power BI (either Power BI Desktop or Power BI website) to connect to a sample of Spark that we built on an Azure HDInsight service. by completing this section you will be able to create simple spark on Azure HDInsight, and run few Python scripts from Jupyter on it to load a sample table into Spark, and finally use Power BI to connect to Spark server, load, and visualize the data.

If you’re totally unfamiliar with Spark but interested in data processing, now’s a good time to start digging into the topic.

Categories

October 2018
MTWTFSS
« Sep  
1234567
891011121314
15161718192021
22232425262728
293031