Press "Enter" to skip to content

Curated SQL Posts

Reducing Power BI Dataset Sizes with Semantic Link

Sandeep Pawar builds some really cool diagnostics:

Semantic Link v0.6 is out and it has many new exciting additions to its growing list of list_* methods. Highlighted are some of the new methods. Install the latest version and check it out.

Some of the existing methods such as list_columns() have an additional parameter extended which returns more column information such as column cardinality, size, encoding and many more column properties. This allows users to get detailed information about the dataset and the columns.

Click through to see how you can get this information not just for a single semantic model, but for all semantic models in a tenant.

Comments closed

Extended Events Tracing on Read Scale-Out Azure SQL MI

Kendra Little goes on a journey:

It took me more than half hour to figure out how to start an XEvents trace on a read-scale out instance of Azure SQL Managed Instance. It’s hard to monitor read scale-out instances, so tracing is desirable! I started with a simple trace of sql_statement_completed. Hopefully this saves other folks some time.

Click through for that process. The process seems a bit painful, to put it kindly.

Comments closed

Strict Encryption in SSMS 20

Erin Stellato shares an update:

SSMS 20 is the first major version of SSMS that supports Strict encryption and TLS 1.3, thanks to the migration to Microsoft.Data.SqlClient (MDS) 5.1.4.  MDS is the data access library used by SSMS 19 and higher, as well as other SQL Server tools.

Read on for a quick primer on terminology, as well as what it means to force strict encryption. I’m not sure how quickly companies will jump on this, especially given the features that don’t support strict encryption yet, such as availability groups, replication, SQL Server Agent, database mail, linked servers, and PolyBase’s connector to SQL Server.

Comments closed

Symbolic Links and Powershell Modules

Jeff Hicks makes a connection:

I have a short tip today that you may find useful, especially if you write modules for your private use. I have a number of such modules that I have written to fill my needs. These are private modules that I don’t publish to the PowerShell Gallery. I develop and maintain these modules in C:\Scripts. This means that when I need to import the module, I have to type the full path.

Read on to see how you can use symbolic links to make this a bit smoother.

Comments closed

Reducing the Cost of Delete Operations in SQL Server

Ben Johnston eats the elephant:

One of the first things you learn when working with SQL Server, and other transactional based SQL systems, is that set based operations perform best. If you are querying data, a cursor pulling individual rows doesn’t perform as well as a single query. It doesn’t matter if that cursor is on the client side or the server side. A set-based operation is more efficient, runs faster, locks less, and is generally better than submitting multiple queries.

This is also generally true with delete statements. This post covers the exceptions to that rule. Large delete statements impacting many rows and large amounts of data (millions of rows and many gigs of data) can actually have decreased performance. With transactional systems, such as SQL Server, each transaction follows the ACID standard. Part of that standard ensures that transactional statements either complete or roll back fully – partial transactions are not allowed. For a delete statement, that means that all of the rows specified by the delete are removed from the table, or none are removed and the data rolls back to the original state. The delete and rollback behavior must be predictable and consistent or the data could be left in a contaminated, unreliable state. Performing very large deletes can present some challenges and needs to be treated differently in production systems.

Read on for the reasoning behind this, as well as several techniques you can use and how they compare.

Comments closed

Using Apache Spark in Microsoft Fabric

Ginger Grant gives us an overview of where we can use Apache Spark in Microsoft Fabric:

If you have used Spark in Azure Synapse, prepare to be pleasantly surprised with the compute experience in Microsoft Fabric as Spark compute starts a lot faster because the underlying technology has changed. The Data Engineering and Data Science Fabric experiences include a managed Spark compute, which like previous Spark compute charges you when it is in use. The difference is the nodes are reserved for you, rather than allocated when you start the compute which results in compute starting in 30 seconds or less versus the 4 minutes of waiting it takes for Azure Synapse compute to start.  If you have different capacity needs that a default managed Spark compute will not provide, you can always create a custom pool.  Custom pools are created in a specific workspace, so you will need Administrator permissions on the workspace to create them. You can choose to make the new pool your default pool as well, so it will be what starts in the workspace.

Read on for more of Ginger’s thoughts on the matter, including how you can use Copilot in Microsoft Fabric (if you pay for it) to help generate Spark code.

Comments closed

Piecemeal Database Restoration

Chad Callihan restores an elephant one bite at a time…or something:

The larger a database grows, the more difficult it becomes to restore it in a timely manner. When a database is young, you might be able to manage full restores in seconds. But as it matures and backup sizes go from megabytes to gigabytes to terabytes, those restore times will expand as well.

If you plan ahead, it’s not always a requirement to restore the entire database if only part of the database is necessary. This is where the idea of piecemeal restores can save you time and wasted effort.

I’ve always found piecemeal database restoration more of an interesting idea than something quite practical. The problem is, if your data is so easily separable that you can restore one set and not need the other for some reasonable length of time, why are they in the same database? I understand that there are reasonable answers to this question, but I also rarely see those scenarios pop up.

Comments closed

EXISTS and NOT EXISTS in T-SQL

Erik Darling plays Descartes in his spare time:

Let’s talk about two of the most overlooked and undervalued facilities in the SQL language: EXISTS and NOT EXISTS. Perhaps they’d get more traction is they were called THERE and NOT THERE, but but it would be perilously easy to start confusing your WHERE and THERE clause.

Often besmirched by someone who read a blog post about MySQL in 1998 as “subqueries, which are slower than joins”, they’re two of the most useful things you can grasp and use regularly.

This post does a great job of explaining when you’d want to use EXISTS and NOT EXISTS.

Comments closed

Get- cmdlets in dbatools Replication

Jess Pomfret continues a series on replication in dbatools:

This post will show off all the Get- commands that are available within dbatools for replication. When you’re using PowerShell, and especially if you’re new to PowerShell, exploring the Get- commands for a certain module, or area is a great way to get started. As it says in the name, these commands get information about something, they aren’t going to change anything, which means they are pretty safe to run in your environment. Of course, I’m always going to say, you should still run these in your test environment first to make sure you understand what they are doing, and how they behave in your specific environment.

Read on to see some of what you can do with it.

Comments closed

Dynamic Subscriptions in Power BI

Reza Rad sends an e-mail:

If you ever want to have a general report for sales for all countries, and then you want every morning the report to be sent (as PDF or PowerPoint alongside the link to the report) to representatives of each country, the report with the data filtered for their country only, then Dynamic Subscription in Power BI is the feature you need to use. Previously, this was called a Data-Driven Subscription in SSRS (SQL Server Reporting Services). In this article and video, I’ll explain how to use this feature using an example and what you need to know about it.

Click through for the video and blog post.

Comments closed