Press "Enter" to skip to content

Curated SQL Posts

SSMS 20 and Default Security

Brent Ozar notes a change:

SQL Server Management Studio 20 Preview 1 is out, and the new connection dialog has a big change:

When you click Connect, you’re likely going to get an error:

Read on for the quick-and-easy solution, which brings behavior back to the pre-SSMS 20 default, as well as the long-term solution to prevent it from being an issue at all.

This brings SSMS in line with Azure Data Studio, which has defaulted to requiring certificates for quite some time. Note that you will need to select “Trust server certificate” if you are using a self-signed cert, though self-signed certs remove one of the two benefits of using certificates in the first place. The first is that certificates allow for encrypting the Tabular Data Stream (TDS) packets SQL Server sends over the network. Self-signed certs do just as good a job of that task as certificates you get from a trusted authority.

The second use case of certificates, however, is ensuring that this is definitely the machine and service you intend to connect to. If an attacker takes over the machine and swaps out the certificate with their own, your client should panic a bit because that’s your early-warning indicator that something is wrong.
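As a quick aside, if you want to confirm whether a given connection actually ended up encrypted, a standard DMV query (this is not from Brent's post, just a general-purpose check) will tell you:

```sql
-- Check the encryption status of the current connection.
-- encrypt_option reports TRUE when the TDS stream is encrypted.
SELECT session_id,
       encrypt_option,
       auth_scheme,
       net_transport
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;
```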


Measuring Query Times in Power BI DirectQuery Mode

Chris Webb breaks out the stopwatch:

If you’re tuning a DirectQuery semantic model in Power BI, one of the most important things you need to measure is the total amount of time spent querying your data source(s). Now that the queries Power BI generates to get data from your source can be run in parallel, you can’t just sum up the durations of the individual queries sent to the source to get the end-to-end duration. The good news is that there is a new trace event available in Log Analytics (though not in Profiler at the time of writing) which solves this problem.

Read on to learn more about this event.


Measuring Write Speeds in SQL Server

Vlad Drumea performs a test:

In this post I cover a script I’ve put together for measuring storage write speeds in SQL Server, namely against database data files.

This is meant to help get an idea of how the underlying storage performs when SQL Server is writing 1GB of data to a database.

At this point, you might be asking yourself: "Why not use CrystalDiskMark instead?"
The answer is simple: you might not always be able to install or run additional software in an environment, even more so if you work with external customers or as a consultant. It's a lot simpler to ask a customer to run a script and send you the output than it is to ask them to install and run some third-party software.

Click through for the script, what it does, and how to run it, as well as a note on limitations and an example based on three drives.
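Vlad's script is the one to use, but as a rough, hypothetical sketch of the general shape such a test can take, something like the following times writing roughly 1 GB into a scratch table (the table name and row sizing are just illustrative, not taken from his post):

```sql
-- Hypothetical sketch only; click through for Vlad's actual script.
-- Times an insert of roughly 1 GB into a scratch table.
SET NOCOUNT ON;

DROP TABLE IF EXISTS dbo.WriteSpeedTest;
CREATE TABLE dbo.WriteSpeedTest
(
    id   int IDENTITY(1, 1) NOT NULL,
    fill char(8000) NOT NULL   -- ~8 KB per row, so roughly one row per data page
);

DECLARE @start datetime2(3) = SYSDATETIME();

-- 131,072 rows x 8 KB pages is approximately 1 GB of data pages.
INSERT INTO dbo.WriteSpeedTest WITH (TABLOCK) (fill)
SELECT TOP (131072) REPLICATE('x', 8000)
FROM sys.all_columns AS c1
CROSS JOIN sys.all_columns AS c2;

-- Flush dirty pages so the writes actually hit the data file.
CHECKPOINT;

DECLARE @seconds decimal(10, 3) =
    DATEDIFF(MILLISECOND, @start, SYSDATETIME()) / 1000.0;

SELECT @seconds AS elapsed_seconds,
       1024.0 / NULLIF(@seconds, 0) AS approx_mb_per_sec;

DROP TABLE dbo.WriteSpeedTest;
```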


Setting Data Frame Columns as Indexes in R

Steven Sanderson explains and does:

Before we dive into the how, let’s briefly discuss why you might want to set a column as the index in your data frame. By doing so, you essentially designate that column as the unique identifier for each row in your data. This can be particularly useful when dealing with time-series data, categorical variables, or any other column that serves as a natural identifier.

Setting a column as the index offers several advantages:

Read on to see those advantages.


Reducing Power BI Dataset Sizes with Semantic Link

Sandeep Pawar builds some really cool diagnostics:

Semantic Link v0.6 is out and it has many new exciting additions to its growing list of list_* methods. Highlighted are some of the new methods. Install the latest version and check it out.

Some of the existing methods such as list_columns() have an additional parameter extended which returns more column information such as column cardinality, size, encoding and many more column properties. This allows users to get detailed information about the dataset and the columns.

Click through to see how you can get this information not just for a single semantic model, but for all semantic models in a tenant.


Extended Events Tracing on Read Scale-Out Azure SQL MI

Kendra Little goes on a journey:

It took me more than half an hour to figure out how to start an XEvents trace on a read-scale out instance of Azure SQL Managed Instance. It’s hard to monitor read scale-out instances, so tracing is desirable! I started with a simple trace of sql_statement_completed. Hopefully this saves other folks some time.

Click through for that process, which seems a bit painful, to put it kindly.
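For reference, a plain sql_statement_completed session on an ordinary instance looks something like the sketch below (session name, actions, and target are just illustrative); the read scale-out wrinkles are exactly what Kendra's post walks through.

```sql
-- Illustrative only: a basic sql_statement_completed session.
-- The read scale-out specifics are what Kendra's post covers.
CREATE EVENT SESSION [statements_completed] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(
    ACTION (sqlserver.sql_text, sqlserver.username)
)
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY = 4096 KB, STARTUP_STATE = OFF);

ALTER EVENT SESSION [statements_completed] ON SERVER STATE = START;

-- After running a workload, read the ring buffer target.
SELECT CAST(t.target_data AS xml) AS session_data
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
    ON s.address = t.event_session_address
WHERE s.name = N'statements_completed';
```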


Strict Encryption in SSMS 20

Erin Stellato shares an update:

SSMS 20 is the first major version of SSMS that supports Strict encryption and TLS 1.3, thanks to the migration to Microsoft.Data.SqlClient (MDS) 5.1.4.  MDS is the data access library used by SSMS 19 and higher, as well as other SQL Server tools.

Read on for a quick primer on terminology, as well as what it means to force strict encryption. I’m not sure how quickly companies will jump on this, especially given the features that don’t support strict encryption yet, such as availability groups, replication, SQL Server Agent, database mail, linked servers, and PolyBase’s connector to SQL Server.


Symbolic Links and Powershell Modules

Jeff Hicks makes a connection:

I have a short tip today that you may find useful, especially if you write modules for your private use. I have a number of such modules that I have written to fill my needs. These are private modules that I don’t publish to the PowerShell Gallery. I develop and maintain these modules in C:\Scripts. This means that when I need to import the module, I have to type the full path.

Read on to see how you can use symbolic links to make this a bit smoother.


Reducing the Cost of Delete Operations in SQL Server

Ben Johnston eats the elephant:

One of the first things you learn when working with SQL Server, and other transaction-based SQL systems, is that set-based operations perform best. If you are querying data, a cursor pulling individual rows doesn’t perform as well as a single query. It doesn’t matter if that cursor is on the client side or the server side. A set-based operation is more efficient, runs faster, locks less, and is generally better than submitting multiple queries.

This is also generally true with delete statements. This post covers the exceptions to that rule. Large delete statements impacting many rows and large amounts of data (millions of rows and many gigs of data) can actually have decreased performance. With transactional systems, such as SQL Server, each transaction follows the ACID standard. Part of that standard ensures that transactional statements either complete or roll back fully – partial transactions are not allowed. For a delete statement, that means that all of the rows specified by the delete are removed from the table, or none are removed and the data rolls back to the original state. The delete and rollback behavior must be predictable and consistent or the data could be left in a contaminated, unreliable state. Performing very large deletes can present some challenges and needs to be treated differently in production systems.

Read on for the reasoning behind this, as well as several techniques you can use and how they compare.
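As one illustration of the batching idea Ben covers, here is a hedged sketch (the table name and predicate are hypothetical) that deletes in small chunks so each transaction stays cheap to log, lock, and, if necessary, roll back:

```sql
-- Hypothetical batched delete: each loop iteration is its own small
-- transaction, keeping log growth, locking, and rollback cost per batch low.
DECLARE @batch_size   int = 5000;
DECLARE @rows_deleted int = 1;

WHILE @rows_deleted > 0
BEGIN
    DELETE TOP (@batch_size)
    FROM dbo.SalesArchive              -- hypothetical table
    WHERE OrderDate < '20150101';      -- hypothetical predicate

    SET @rows_deleted = @@ROWCOUNT;

    -- Optional pause to give other workloads and log backups some room.
    WAITFOR DELAY '00:00:01';
END;
```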
