Press "Enter" to skip to content

Curated SQL Posts

An Overview of Google BigQuery Pricing Changes

Jeremy Pries gets out the checkbook:

Google has announced that a new tiered pricing model for Google BigQuery will come into effect in July. If your organization is already a BigQuery customer or you’re considering it as a data warehouse option, here’s what you need to know about the new pricing structure.

As of July 5, 2023, BigQuery will come with three pricing tiers: Standard, Enterprise and Enterprise Plus. Customers will have the ability to mix and match these editions to suit various workloads.

Read on for what this means and ways you can hope to save some money as a result of the licensing changes. I’m slightly pessimistic about it—only slightly because I’m quite ignorant of BigQuery pricing to begin with, but more than zero pessimism because companies often don’t change licensing terms with the intent of users paying them less.

Comments closed

Updates to Always Encrypted since 2016

Aaron Bertrand notes some changes:

In 2015, during the SQL Server 2016 beta, I explored a new feature is this article, Always Encrypted. This feature finally allowed us to encrypt data at rest and on the wire, and I showed how beneficial this was and how much more secure your data could be. I also explained that, as a new feature, some limitations made it difficult to use and, sometimes, impossible to adopt.

Several major versions of SQL Server later, how has this feature evolved, and is it easier to use today?

Read on for the answer. Aaron also covers secure enclaves, a big topic for Always Encrypted users.

Comments closed

NTILE and QUALIFY in Snowflake

Kevin Wilkie continues to build a sample:

In our last post, we went over one way to get a sample of data. In the end was it right? Heck, no – at least not if we wanted a percentage of rows returned. Now, SAMPLE does work fine if you want a specific number of random rows returned each time.

But, let’s face it – sometimes we will be asked for 10% of a table – especially in this world of Machine Learning. So let’s attempt to find a way to make this happen, shall we?

QUALIFY would be a nice clause to have. Instead, if you’re in the T-SQL world, you’re probably used to creating a common table expression or subquery and then applying a WHERE clause to the outputs of that CTE/subquery.

Comments closed

Publish to Power BI Environments via ADO

Richard Swinbank deploys a report:

In the first post in this series, I built an Azure DevOps pipeline to automate steps in a Power BI development workflow. The pipeline implemented a very basic workflow – as soon as a developer committed a new report version to Git, the pipeline deployed it immediately into a Power BI workspace.

In this post I’ll be building a pipeline to support a more sophisticated workflow that enables peer review and stakeholder testing.

Click through for the step-by-step process.

Comments closed

A Summary of Full-Text Search in SQL Server

Paul Hernandez gives us a primer on full-text search in SQL Server:

Sometimes you want to perform a search using one or more keywords over one or multiple character columns in a table. Clustered, nonclustered or column stored indexes (organized in a B-Tree structure) will help you with such a task. You can of course use the LIKE operator and do wildcard text searches, but this is still inefficient. Full-text search in SQL Server and Azure SQL lets you perform full-text queries against character based-data in your tables.

Read on to learn more about the topic. I’ve used full-text search with some success once, and my failed attempts count (in that, I tried to use FTS but it wasn’t a good use case and it didn’t work) is a little bit higher. The biggest thing I found was that it struggled with very large numbers of rows–I had tried examples with 50-100 million rows and the index never finished building.

Comments closed

Performance Overhead of TDE

Matthew McGiffen answers an age-old question:

Microsoft states that enabling TDE usually has a performance overhead of 2–4%. That doesn’t sound like very much, and personally I wouldn’t let it bother me if I want to make sure my data is encrypted at rest. However, you may have heard other sources saying that it’s actually a lot more than that – and the performance impact is a high price to pay for the level of protection offered. So, what’s the truth?

It turns out the answer is a bit more complex than simply saying “x%,” though as a first approximation, I’d still say that the 2-4% is a good starting point. For what would move you off of that 2-4%, read the whole thing.

Comments closed

Choosing from a List of Values

Greg Dodd doesn’t need no steenkin’ tables:

Sometimes you have a list of values or parameters that you’d like to run a select statement over the top of to get a result. There’s a few ways of doing this, usually I see people create a Temporary Table and insert the data. But is there an easier way?

Yes, we can select from VALUES:

The result of this is called a virtual table, and as far as other parts of the query are concerned, it’s just another table.

Comments closed

Working with .NET Polyglot Notebooks

Matt Eland installs Polyglot notebooks in VSCode:

Polyglot Notebooks is a powerful new interactive notebook technology that lets you run experiments in your editor, mix together code and rich documentation, and write code in a variety of languages to accomplish your tasks.

This article will guide you through the setup process to get Polyglot Notebooks running on your machine so you can do local data science and analytics notebooks using dotnet.

In case you’re curious, my installation has the ability to create notebooks in F#, C#, HTML, JavaScript, KQL, Markdown, Mermaid, Powershell, and SQL. I’m not positive how many of those come with the extension itself and how many are additional kernels I got from installing other extensions.

Comments closed

Connecting Power BI to ADX via Private Endpoint

Dany Hoter keeps it all on the Azure backbone:

The PBI developer creating datasets and reports need to connect to the ADX cluster using Power BI desktop.

To establish such a connection, the user’s IP address should be allowed access to the private end point.

The access should be tested using Kusto Web explorer (KWE) to make sure that the cluster can be reached.

If KWE can connect , Power BI desktop should also connect successfully and a report using the cluster in Direct Query or import can be created.

That’s the goal, and Dany shows us the way to do it.

Comments closed