Press "Enter" to skip to content

Day: May 10, 2023

An Introduction to Polyglot Notebooks

Matt Eland walks us through a sample:

Polyglot Notebooks allows you to create notebooks composed of multiple cells. These cells can be either markdown cells for documentation or code cells containing code in either C#, F#, PowerShell, SQL, KQL, HTML, JavaScript, or mermaid markdown for diagramming.

This allows you to mix together rich documentation supported by little pieces of code that progressively expand upon an idea, tell a story, or otherwise provide insight or information to you as a developer.

Click through for the example. The thing I hadn’t realized—because I don’t really do this in Jupyter—is that you can share variables between languages. That’s a fairly useful feature when you want to do most of your work in one language but just happen to need a library using a different language.

Comments closed

An Overview of Google BigQuery Pricing Changes

Jeremy Pries gets out the checkbook:

Google has announced that a new tiered pricing model for Google BigQuery will come into effect in July. If your organization is already a BigQuery customer or you’re considering it as a data warehouse option, here’s what you need to know about the new pricing structure.

As of July 5, 2023, BigQuery will come with three pricing tiers: Standard, Enterprise and Enterprise Plus. Customers will have the ability to mix and match these editions to suit various workloads.

Read on for what this means and ways you can hope to save some money as a result of the licensing changes. I’m slightly pessimistic about it—only slightly because I’m quite ignorant of BigQuery pricing to begin with, but more than zero pessimism because companies often don’t change licensing terms with the intent of users paying them less.

Comments closed

Updates to Always Encrypted since 2016

Aaron Bertrand notes some changes:

In 2015, during the SQL Server 2016 beta, I explored a new feature is this article, Always Encrypted. This feature finally allowed us to encrypt data at rest and on the wire, and I showed how beneficial this was and how much more secure your data could be. I also explained that, as a new feature, some limitations made it difficult to use and, sometimes, impossible to adopt.

Several major versions of SQL Server later, how has this feature evolved, and is it easier to use today?

Read on for the answer. Aaron also covers secure enclaves, a big topic for Always Encrypted users.

Comments closed

NTILE and QUALIFY in Snowflake

Kevin Wilkie continues to build a sample:

In our last post, we went over one way to get a sample of data. In the end was it right? Heck, no – at least not if we wanted a percentage of rows returned. Now, SAMPLE does work fine if you want a specific number of random rows returned each time.

But, let’s face it – sometimes we will be asked for 10% of a table – especially in this world of Machine Learning. So let’s attempt to find a way to make this happen, shall we?

QUALIFY would be a nice clause to have. Instead, if you’re in the T-SQL world, you’re probably used to creating a common table expression or subquery and then applying a WHERE clause to the outputs of that CTE/subquery.

Comments closed

Publish to Power BI Environments via ADO

Richard Swinbank deploys a report:

In the first post in this series, I built an Azure DevOps pipeline to automate steps in a Power BI development workflow. The pipeline implemented a very basic workflow – as soon as a developer committed a new report version to Git, the pipeline deployed it immediately into a Power BI workspace.

In this post I’ll be building a pipeline to support a more sophisticated workflow that enables peer review and stakeholder testing.

Click through for the step-by-step process.

Comments closed

A Summary of Full-Text Search in SQL Server

Paul Hernandez gives us a primer on full-text search in SQL Server:

Sometimes you want to perform a search using one or more keywords over one or multiple character columns in a table. Clustered, nonclustered or column stored indexes (organized in a B-Tree structure) will help you with such a task. You can of course use the LIKE operator and do wildcard text searches, but this is still inefficient. Full-text search in SQL Server and Azure SQL lets you perform full-text queries against character based-data in your tables.

Read on to learn more about the topic. I’ve used full-text search with some success once, and my failed attempts count (in that, I tried to use FTS but it wasn’t a good use case and it didn’t work) is a little bit higher. The biggest thing I found was that it struggled with very large numbers of rows–I had tried examples with 50-100 million rows and the index never finished building.

Comments closed

Performance Overhead of TDE

Matthew McGiffen answers an age-old question:

Microsoft states that enabling TDE usually has a performance overhead of 2–4%. That doesn’t sound like very much, and personally I wouldn’t let it bother me if I want to make sure my data is encrypted at rest. However, you may have heard other sources saying that it’s actually a lot more than that – and the performance impact is a high price to pay for the level of protection offered. So, what’s the truth?

It turns out the answer is a bit more complex than simply saying “x%,” though as a first approximation, I’d still say that the 2-4% is a good starting point. For what would move you off of that 2-4%, read the whole thing.

Comments closed

Choosing from a List of Values

Greg Dodd doesn’t need no steenkin’ tables:

Sometimes you have a list of values or parameters that you’d like to run a select statement over the top of to get a result. There’s a few ways of doing this, usually I see people create a Temporary Table and insert the data. But is there an easier way?

Yes, we can select from VALUES:

The result of this is called a virtual table, and as far as other parts of the query are concerned, it’s just another table.

Comments closed