Indexes are probably the number one tool we have to improve performance. That said, there are times when we want to put that index on hold. While indexes dramatically improve read performance they do cause a slight dip in write performance. This isn’t significant most of the time but when doing a large load it can frequently be faster to get rid of the existing indexes and then put them back when you are done.
I don’t think that I’ve ever regularly disabled indexes, even during bulk loading. It’s good to know that the option exists, however.
A while back I promised I’d write about what allows SQL Server to perform two seeks rather than a seek with a residual predicate.
More recently, a post touched a bit on predicate selectivity in index design, and how missing index requests don’t factor that in when requesting indexes.
This post should tie the two together a bit. Maybe. Hopefully. We’ll see where it goes, eh?
Also apropos: missing index hints return results in alphabetical order, not in selectivity order or what would be best for queries. In other words, just because the green text in SSMS says it’s the index you want doesn’t mean it’s the index you need.
If you have more than a handful of instances, you really owe it to yourself to take a few minutes and set up your own CMS. The hard part is getting a list of all the instances, but you need to do that anyways, right?
Once you have that list pick ONE SQL Server instance to create your CMS. You want this to be something with solid up-time that is accessible to other team members, but not an instance that’s already spending CPU cycles on something critical.
The lack of CMS compatibility is one of my larger pain points with Azure Data Studio and an issue which hopefully gets fixed this year.
Azure SQL Managed Instances do not utilise windows authentication – so your two methods of authenticating applications and users are:
– SQL Authentication:This authentication method uses a username and password.
– Azure Active Directory Authentication:This authentication method uses identities managed by Azure Active Directory and is supported for managed and integrated domains. Use Active Directory authentication (integrated security) whenever possible.
Hamish also elaborates on some of the trickier bits about Azure Active Directory for someone used to on-prem AD solutions.
2 — You shouldn’t send large messages or payloads through Kafka
According to Apache Kafka, for better throughput, the max message size should be 10KB. If the messages are larger than this, it is better to check the alternatives or find a way to chop the message into smaller parts before writing to Kafka. Best practice to do so is using a message key to make sure all chopped messages will be written to the same partition.
Read the whole thing.
Recently I was delivering a Spark course. One of the exercises asked the students to split a Spark DataFrame in two, non-overlapping, parts.
One of the students came up with a creative way to do so.
He started by adding a monotonically increasing ID column to the DataFrame. Spark has a built-in function for this,
monotonically_increasing_id— you can find how to use it in the docs.
Read on to see how this didn’t quite work right, why it didn’t work as expected, and one alternative.
Write-Asciifunction from my PowerShell MS-Module module can draw custom text in the PowerShell console using ASCII fonts. This article’s cover was drawn by the
Write-Asciifunction too. The below code does the same, it writes
end of line (EOL)character or emulates the
Read on for examples and enjoy.
The idea is to loop through the rows in the Excel table and use the data on each row to populate the content controls in the template and then create a new Word document. Here’s a Flow that does this:
The steps are a bit convoluted, but they work. Chris mentions at the end why people might want to do this, and I’ll reiterate that: I’ve been in several discussions over the years where people want to embed data inside a document without manual intervention, and using tools like Reporting Services, that has not been pretty.
While learning about Docker Container, I notice that is much easier to installed on a Linux system. In Windows, Hyper-V is a requirement to install Docker, and specially if you want to use the “Windows Subsystem in Linux” WSL feature, there’s more setup to complete. So, I’m not using Hyper-V, I’m using VMware Workstation. To keeping simple, I created an Ubuntu 18.04 VM using VMWare Workstation.
You can find the Docker CE installation instructions in the following link.
If you’re using Ubuntu 18.04. make sure to install Curl, as it isn’t included in the OS.
Click through for instructions on how to set this up and join the three layers club (which is not quite the three commas club but close).
Oracle has the most sophisticated implementation, which supports both the ordered set aggregate function, and the window function version:
– Aggregate function:
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY x)
– Window function:
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY x) OVER (PARTITION BY y)
But there are ways to calculate
PERCENTILE_DISC() using a couple of window functions, so read the whole thing.