Category: Tools

sp_helpExpandView

Andy Yun cleans up some nested views:

Yeah, that thing was small. But the processing was a horrific case study in T-SQL worst practices. And the architect that created it LOVED nested views (and scalar functions… and MERGE… on Standard edition).

I spent a good amount of effort trying to unravel those as part of my efforts to improve performance, and as a result, decided to create my own community tool to help with unraveling them – sp_helpExpandView.

I also occasionally deal with a group that loves nested views, and now I’m going to need Andy’s stored procedure, because tracking those dependencies on my own gets really painful.
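
To illustrate the kind of tangle sp_helpExpandView is meant to untangle, here’s a contrived sketch: three views stacked on one another, plus a hypothetical call to Andy’s procedure. All of the object names, and the parameter name, are my own invention, so check Andy’s post for the real signature.

-- Each view quietly builds on the one below it, so the top-level
-- query hides several layers of logic. All names here are made up.
CREATE VIEW dbo.vw_Orders AS
SELECT OrderID, CustomerID, OrderDate, Amount
FROM dbo.Orders;
GO
CREATE VIEW dbo.vw_OrdersWithCustomers AS
SELECT o.OrderID, o.OrderDate, o.Amount, c.CustomerName
FROM dbo.vw_Orders AS o
INNER JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID;
GO
CREATE VIEW dbo.vw_CustomerSales AS
SELECT CustomerName, SUM(Amount) AS TotalSales
FROM dbo.vw_OrdersWithCustomers
GROUP BY CustomerName;
GO

-- Point the procedure at the outermost view and let it walk the
-- dependency chain (the parameter name is a guess on my part):
EXEC dbo.sp_helpExpandView @ViewName = N'dbo.vw_CustomerSales';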

Backup Jobs and Dropped Databases

Chad Callihan reasons through a use case:

I’m a big fan of Ola Hallengren’s SQL Server maintenance scripts and would recommend that anyone working with SQL Server check them out. They have served me well over the years. As it relates to today’s blog post, maybe too well…

I recently ran into a strange situation with the DatabaseBackup stored procedure that had me scratching my head: a backup job completing successfully for a database that didn’t exist.

Confused? So was I. Let’s take a look at how it happened.

Click through for the scenario.
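
If you want a quick sanity check of your own along those lines, here’s a minimal sketch that compares a job’s database list against sys.databases to flag anything that no longer exists. The database names are made up, and STRING_SPLIT plus TRIM require SQL Server 2017 or later:

-- Databases the backup job thinks it should handle (hypothetical list).
DECLARE @Databases nvarchar(max) = N'SalesDB, StagingDB, RetiredDB';

-- Flag anything in the list that no longer exists on the instance.
SELECT TRIM(value) AS MissingDatabase
FROM STRING_SPLIT(@Databases, N',')
WHERE TRIM(value) NOT IN (SELECT name FROM sys.databases);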

A Wake for Distributed Replay

Grant Fritchey thinks about what could have been:

Honestly, sincerely, no kidding, I love Distributed Replay. Yes, I get it. Proof positive I’m an idiot. As we needed proof. To be a little fair to me, I love what Distributed Replay could have been, with a little more love. However, fact is, it’s on the deprecation list for 2022. Which means, what minimal amount of love, if any, that Microsoft was giving to it, it’s all gone, forever. Unlike the Little Engine That Could, turns out that Distributed Replay was the Little Engine That Almost Could, But Didn’t. Really Didn’t. Let’s discuss it a bit.

I liked the concept a lot, but the tooling was finicky and there were just so many built-in assumptions that tended to fall apart in real life. Grant actually got it to work outside of toy environments, which is one step further than I ever got.

Databricks Extension for VSCode at 1.0

Gerhard Brueckl shares the good news:

As you probably know from my previous posts, my colleagues at paiqo.com and I are constantly working to improve our VSCode extension for Databricks. Almost every month we silently release a new version to the VSCode gallery so you get the latest features. However, as this is a special release, I am also writing a dedicated blog post for it.

There’s a lot of cool stuff in here, so check it out.

PSWorkItems: Powershell TODOs

Jeffrey Hicks has a new module:

I spend my working days living in a PowerShell console. Over the years, I’ve developed many PowerShell modules to help me manage the chaos that is my work life. One area that always demands attention is managing my tasks and To-Dos. For several years I have been using the MyTasks module. This module stored tasks and supporting information in a set of XML files. The code in the module treated the XML files as databases. I was trying to avoid a dependency on a SQL Server Express installation with the idea that would be overkill.

In the meantime, I finally got around to finishing and publishing the MySQLite PowerShell module. This module has a set of PowerShell functions designed to simplify working with SQLite database files. This type of database has a much smaller footprint than SQL Server Express and would streamline my task management.

Click through to see how it works.

The Basics of Network Tracing

Will Aftring shows how to put together a network trace:

I know it can be tempting to spin up WireShark and jump right into looking at traces, but asking questions is just as important, if not more important than the traces themselves. 

I usually like to group these questions into two groups: technical and general. 

Note: I will be using the terms client and server to refer to the sender and receiver. The client is always the sender, the server is always the receiver. 

Read on to get an idea of why we might create a network trace, what we intend to learn from it, and then how to do it.

Tools for a Jump Box

Tracy Boggiano looks in the tool bag:

Having recently taken a new job and introduced a number of new tools to my new coworker, I thought I’d share how I set up my jump box and keep it updated, so others can benefit and I can find it later (I did put this in our internal Confluence pages, but I do have a box in Azure for presentations). It’s in alphabetical order because that’s the only way to make sense of things. I’d be curious if anybody has anything they use that I should add, so please leave comments.

Read on to see the list, including all of the Azure Data Studio and Visual Studio Code extensions Tracy likes to use.

SQL Server Client Tool Updates

Erin Stellato has a burndown list for us:

We introduced .NET Interactive Notebooks in Azure Data Studio through the .NET Interactive Notebooks extension. With multi-language support for Jupyter Notebooks you can now code in T-SQL, PowerShell, C#, JavaScript, and more. 

Never fear, folks: the .NET Interactive Notebooks extension does include the most important .NET language of all, F#. In all seriousness, F# is the type of language which was made for notebooks, especially with generative type providers.

A Change Log for SQL Server ScriptDom

Arvind Shyamsundar keeps track of changes:

Till such time that we have a detailed, fully updated change log, this blog post is being written as an unofficial change log for ScriptDom at least. Hopefully it will help readers understand when certain T-SQL grammar was added to ScriptDom, etc. I hope to keep it updated as we have later releases of DacFx and thereby, ScriptDom. If you have questions / feedback for me, please do leave a comment in this blog post and I will try to address it in due course of time.

Note that if there are no functionality changes for a given DacFx release, it will not feature in this table.

And again, this is an unofficial change log, so it is provided as-is and should not be in any way construed as an official Microsoft statement.

Click through for this unofficial log. As you can see, ScriptDom is still under active development with SQL Server 2022.

The Move to General Database Platforms

Steve Jones muses on specialization in data platforms:

It’s been a decade-plus of the Not-Only-SQL (NoSQL) movement where a large variety of specialized database platforms have been developed and sold. It seems that there are so many different platforms for data stores that you can find one for whatever specialized type of data you are working with. However, is that what people are doing to store data in their applications?

I saw this piece on the return to the general-purpose database, postulating that a lot of the NoSQL database platforms have added additional capabilities that make them less specialized and more generalized. I’ve seen some of this, just as many relational platforms have added features that compete with one of the NoSQL classes of databases. The NoSQL datastores might be adding SQL-like features because some of these platforms are too specialized, and the vendors have decided they need to cover a slightly wider set of use cases.

I see three overlapping forces in play here. First, vendors look at the Total Addressable Market for their specific technology (document, key-value, graph, whatever) versus the size of the relational database market, and they salivate over those general-purpose fat stacks of cash. That’s what Steve is getting at in the graf above.

I think the second force is that specialization is ultimately a sucker’s game when it comes to databases. By specializing in one area, you sacrifice others. A tool like Elasticsearch is outstanding as a document search engine and miserable as an aggregation engine (ask me how I know: every six months or so, another product team decides that this time, they’ll get stats aggregation in Elasticsearch to work well…and six months after people actually start to use the thing, they move all the data to someplace else that is adequately queryable). Similarly, document databases are excellent for populating details in an application but not at all excellent at aggregation or arbitrary queries connecting data together. Specialization seems like a great idea until new requirements arrive that demand advanced reporting.

The third force is that these systems are independent, and getting them to talk to each other typically involves writing a lot of ETL/ELT code or using additional third-party tools. To the extent that there are data virtualization platforms, they’re either excruciatingly slow (e.g., PolyBase) or expensive and out of date because they cache the data periodically. A corollary of the third force is that different platforms tend to use different languages, and trying to remember which of the three or four you need to access data in a given case can be a bit painful. This is part of the reason Feasel’s Law exists.

The net result of all of this is that you end up with the same piece of information in several places and build complicated systems to keep those copies aligned. Each system is (theoretically) optimized for a given use case, but more and more people end up spending their time gluing data together, ensuring it matches up across systems, or moving it between them. If you need to do all of that, then sure, do it. But if there’s a single general-purpose platform that does all of this stuff 90% as well, a large number of companies and use cases will do just fine with the single tool. And that’s why general-purpose database platforms are still so popular and why I believe they will remain popular indefinitely.
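
To make that 90% figure concrete, here’s a small sketch of one engine playing both roles: SQL Server (2016 and later) storing JSON documents and aggregating them relationally in the same query. The table and column names are invented for illustration.

-- Store schema-light JSON documents in an ordinary relational table.
CREATE TABLE dbo.Events
(
    EventID int IDENTITY(1,1) PRIMARY KEY,
    Payload nvarchar(max) NOT NULL CHECK (ISJSON(Payload) = 1)
);

INSERT dbo.Events (Payload)
VALUES (N'{"type":"click","durationMs":120}'),
       (N'{"type":"click","durationMs":340}'),
       (N'{"type":"scroll","durationMs":80}');

-- Document-style storage, relational-style aggregation, no second system.
SELECT JSON_VALUE(Payload, '$.type') AS EventType,
       AVG(CAST(JSON_VALUE(Payload, '$.durationMs') AS int)) AS AvgDurationMs
FROM dbo.Events
GROUP BY JSON_VALUE(Payload, '$.type');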

The biggest exception I see is caching, but that’s because a cache is more of a “fire-and-forget” data store. If you do it right, you don’t have any ETL/ELT to or from the cache, and if the cache dies, your system continues to work (albeit more slowly). It’s also tied to a specific application and its contents only exist temporarily, so data mismatches are (hopefully) transitory enough not to matter.
