Category: Tools

The Move to General Database Platforms

Steve Jones muses on specialization in data platforms:

It’s been a decade-plus of the Not-Only-SQL (NoSQL) movement where a large variety of specialized database platforms have been developed and sold. It seems that there are so many different platforms for data stores that you can find one for whatever specialized type of data you are working with. However, is that what people are doing to store data in their applications?

I saw this piece on the return to the general-purpose database, postulating that a lot of the NoSQL database platforms have added additional capabilities that make them less specialized and more generalized. I’ve seen some of this, just as many relational platforms have added features that compete with one of the NoSQL classes of databases. The NoSQL datastores might be adding SQL-like features because some of these platforms are too specialized, and the vendors have decided they need to cover a slightly wider set of use cases.

I see three overlapping forces in play here. First, you have vendors looking at the Total Addressable Market for their specific technology (document, key-value, graph, whatever) versus the size of the relational database market, and they salivate about getting those general-purpose fat stacks of cash. That’s what Steve is getting at in the paragraph above.

I think the second force is that specialization is ultimately a sucker’s game when it comes to databases. By specializing in one area, you ultimately sacrifice others. A tool like Elasticsearch is outstanding as a document search engine and it is miserable as an aggregation engine (ask me how I know: it’s like every six months, another product team decides that this time, they’ll get stats aggregation with Elasticsearch to work well…and six months after people actually start to use the thing, they move all the data to someplace else that is adequately queryable). Similarly, document databases are excellent for populating details in an application but are not at all excellent at aggregation or arbitrary queries connecting data together. Specialization seems like a great idea until new requirements come in which require advanced reporting.

The third force is that these systems are independent and getting them to talk to each other typically involves writing a lot of ETL/ELT code or using additional third-party tools. To the extent that there are data virtualization platforms, they’re either excruciatingly slow (e.g., PolyBase) or expensive and out of date because they cache the data periodically. A corollary of the third force is that different platforms tend to use different languages and trying to remember which of the three or four different languages you need to use to access data in this case can be a bit painful. This is part of the reason Feasel’s Law exists.

The net result of all of this is that it seems you end up with the same piece of information in several separate places and build complicated systems to keep these separate systems aligned. Each system is (theoretically) optimized for a given use case but you end up with more and more people spending their time gluing together data from disparate systems, ensuring that data in disparate systems matches up, or moving data between disparate systems. If you need to do all of this, then sure, do it. But if there’s a single general-purpose platform which does all of this stuff 90% as well, a large number of companies and use cases will do just fine with the single tool. And that’s why general-purpose database platforms are still so popular and why I believe they will remain popular indefinitely.

The biggest exception I see is caching, but that’s because it’s more of a “fire-and-forget” data storage system. If you do it right, you don’t have any ETL/ELT to or from the cache, and if the cache dies, your system continues to work (albeit slower than with the cache). It’s also tied to a specific application and only exists temporarily, so data mismatches are (hopefully) transitory enough not to matter.
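As a rough illustration of that "the system keeps working if the cache dies" property, here is a minimal Python sketch of a read path that tries a cache first and falls back to the database. Everything in it is hypothetical: `FlakyCache` is a stand-in for whatever cache you use, and `load_from_db` is a placeholder for your real query. It is one way to get this behavior, not a prescribed design.

```python
from typing import Callable, Dict, Optional


class FlakyCache:
    """Hypothetical in-memory cache that can be switched off to simulate an outage."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self.available = True

    def get(self, key: str) -> Optional[str]:
        if not self.available:
            raise ConnectionError("cache is down")
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        if not self.available:
            raise ConnectionError("cache is down")
        self._data[key] = value


def read_through(key: str, cache: FlakyCache, load_from_db: Callable[[str], str]) -> str:
    """Try the cache first; on a miss or a cache failure, read from the database."""
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
    except ConnectionError:
        # Cache outage: skip it entirely and serve from the system of record.
        return load_from_db(key)

    value = load_from_db(key)
    try:
        cache.set(key, value)  # fire-and-forget: a failed write is simply ignored
    except ConnectionError:
        pass
    return value
```

The database stays the only source of truth here, so nothing has to be copied out of the cache and losing it costs speed rather than correctness.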

Installing mssql-cli on Ubuntu 22.04

Andrew Pruski runs into a problem:

I really like the mssql-cli tool. I use it pretty much everyday however it seems like it’s not being maintained anymore and as such, there are issues when trying to install it on Ubuntu 22.04.

The issue with Ubuntu 22.04 is that it has python 3.10 installed by default which the current mssql-cli is not compatible with. I did try installing previous versions of python (3.8 and 3.9) but had no luck so kept the default version.

Click through to see what Andrew did to resolve the problem. It looks like there’s already a GitHub issue for this that was opened back in October.
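If you want to know up front whether a machine is likely to hit this, a small Python check can help. This is only a sketch based on the quoted post: the 3.10 threshold comes from Andrew's description, treating later versions as affected too is my assumption, and the function name is hypothetical. It is not Andrew's fix.

```python
import sys
from typing import Tuple

# Assumption from the quoted post: Python 3.10 is the version reported to break mssql-cli.
# Treating anything newer as affected as well is a guess, not a confirmed fact.
FIRST_REPORTED_INCOMPATIBLE: Tuple[int, int] = (3, 10)


def is_reported_incompatible(version: Tuple[int, int]) -> bool:
    """Return True if the (major, minor) version is at or past the reported break."""
    return version >= FIRST_REPORTED_INCOMPATIBLE


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    if is_reported_incompatible(current):
        print(f"Python {current[0]}.{current[1]}: mssql-cli may fail to install or run here.")
    else:
        print(f"Python {current[0]}.{current[1]}: below the version reported to cause trouble.")
```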

Modifying Multiple Lines of Text at Once in SSMS

Chad Callihan has some advice:

A quick T-SQL/SSMS trick that I know and love is the ability to add text to multiple lines at a time. I tend to use macros in Notepad++ to accomplish these types of repetitive actions but this type of work can be completed in SSMS.

Annoyingly, this doesn’t work quite the same way in VS Code or Azure Data Studio. There, Alt + Down moves the current line down one and Alt + Shift + Down copies the current line. To replicate the SSMS behavior in VS Code / Azure Data Studio, hold down Alt + Shift and use the mouse to select the lines you want.

Walking through the Azure Log Analytics User Interface

Robert Cain starts a new series on KQL:

The area in the upper half is where you enter the query you want to run. The lower half is where the results are displayed. We’ll see an example of this in action later in this post.

Just above the query area is a toolbar. The Run button will execute the query you’ve entered. Note too, you can use the keyboard command SHIFT+ENTER to run a query. I’m a keyboard guy, so this is what I use most often to run queries, which you’ll see if you take either of my KQL courses on Pluralsight (I’ve linked to them in the Conclusion of this post).

Read on for a walkthrough of the product. Robert also mentions his Pluralsight course, which I thoroughly enjoyed and used as research materials for a talk I put together.

The Basics of Azure Storage Explorer

Manvendra Singh takes us through Azure Storage Explorer:

This article will explain Azure storage explorer, its installations, and details of how to start working with this application to access Azure storage services. Azure storage provides a flexible solution to store various types of data at a massive scale in the cloud environment. If you have many storage accounts in Azure storage, then it will be difficult to manage them. Microsoft has recognized this problem and developed a desktop application Azure storage explorer to manage Azure storage accounts easily. It can be installed on Windows, Linux, and macOS operating systems.

This is a rather useful tool.

Azure ML and the Python SDK in VS Code

I continue a series on getting beyond the basics with Azure ML. First up, we get up close and personal in development:

Notebooks are great for ad hoc work or simple data analysis but we will want more robust tools if we wish to perform proper code development, testing, and deployment. This is where Visual Studio Code comes into play, particularly the Azure Machine Learning extension.

Then, I get into the Python SDK:

Over the past two posts, we have started using the Azure Machine Learning SDK for Python but I’ve only touched on the topic. In this post, we are going to dive into the topic.

Read on for more info on each.

Markdown Tools for VS Code

I highlight a pair of useful extensions for Visual Studio Code:

The first tool of choice is a big one, Yu Zhang’s Markdown All in One. This extension provides several great features. One of my favorites is its support for creating a table of contents. After opening the Command Palette (Ctrl + Shift + P), select Markdown All In One: Create Table of Contents and it creates a ToC for you based on the heading markers you already have.

Read on for several more things I like about this tool, as well as a discussion of a second useful extension.
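To make the idea of a ToC "based on the heading markers you already have" concrete, here is a small Python sketch that scans Markdown for `#`-style headings and emits a nested list of links. It is not how Markdown All in One is implemented; the function names are hypothetical and the anchor slug rule is a simplification that may not match what a given renderer generates.

```python
import re
from typing import List

# Matches ATX headings such as "## Setup Steps" (optionally with trailing #s).
HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*#*\s*$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example self-contained


def slugify(title: str) -> str:
    """Simplified anchor: lowercase, drop punctuation, turn whitespace into hyphens."""
    slug = re.sub(r"[^\w\s-]", "", title.lower())
    return re.sub(r"\s+", "-", slug.strip())


def build_toc(markdown: str) -> List[str]:
    """Return one indented list item per heading, skipping fenced code blocks."""
    toc: List[str] = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            title = match.group(2)
            indent = "  " * (level - 1)
            toc.append(f"{indent}- [{title}](#{slugify(title)})")
    return toc
```

Calling `build_toc` on a document's text returns the list lines, which you could join with newlines and paste at the top of the file.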

Vim as an IDE

Andrew Pruski shares some settings:

Disclaimer – I like VS Code and I won’t be uninstalling it anytime soon and I’m not recommending people do.

However, I feel it can be overkill for 90% of the work that I do. So I’ve been playing around with Vim to see if it will give me what I want.

What I really want is a lightweight text editor that allows me to run commands in a terminal…that’s it!

I’ve found that vim-markdown is one of those extensions Andrew mentions not having installed but being good.

Macros in Tabular Editor 3

Matt Allington notes a key feature in Tabular Editor 3:

Today I am talking about Macros in Tabular Editor 3. This is a new name for an old feature. In Tabular Editor 2, this feature is called Advanced Scripting (a term I actually prefer, but oh well). I think one reason for the name change is there are now multiple types of scripting, including the new DAX scripting feature (I covered that as a key feature I love in the article linked above).

Click through to see how it works. Tabular Editor 3 is a paid product, though the free Tabular Editor 2 is still around if your employer won’t front the cash for 3.

Troubleshooting with sp_HumanEvents

Erik Darling shows off sp_HumanEvents:

With shorter procs you can probably just collect actual execution plans and slam F5 like a tall glass of gin at 6am.

But you don’t wanna do that with the larger procs, for a few practical reasons:

– Lots of little queries run quickly, and we don’t care about those

– Navigating through lots of plans in SSMS is tedious

– There’s no differentiation when other procedures, etc. are invoked

– You introduce a lot of overhead retrieving and rendering all those plans

– The full query text might not be captured, which is a limitation in many places

Let’s save the day with sp_HumanEvents, my stored procedure that makes using Extended Events really easy.

Read on to see how this all works.
