Press "Enter" to skip to content

Month: May 2023

An Overview of Azure Synapse Analytics

Kevin Chant offers a primer on Azure Synapse Analytics:

In reality, there are a lot of features within Azure Synapse Analytics where your SQL Server background can prove to be useful.

By the end of this post, you will have a good overview of Azure Synapse Analytics. In addition, where your SQL Server background can prove to be useful. Plus, there are plenty of links included in this post.

This is not the slimmest of primers, which makes sense given how broad Synapse is.

Comments closed

Which Backups are in a File?

Steve Jones didn’t keep ’em separated:

I had a question on multiple backups in a file and had to check my syntax. This post shows how to see which backups are in a file.

Note: Don’t do this. Put backups in separate files.

Still, if you didn’t follow Steve’s good advice here (or you have an edge case situation where you, for some reason, need to store multiple backups in the same file), there’s a way to check what’s in a file.

Comments closed

Formatting Visuals in Power BI Desktop

Reza Rad shows off the new visual formatter in Power BI Desktop:

Where is the Format visual option in the new Power BI Desktop? There is a simple answer to this question, which I explained in this blog and video. This change applied from March 2023 version of Power BI Desktop, and by the time you read this article, it might be at general availability and the only way to format a visual in the Power BI Desktop.

I do hope they also keep the old way of formatting visuals, as there appear to be fewer clicks involved.

Comments closed

An Introduction to Polyglot Notebooks

Matt Eland walks us through a sample:

Polyglot Notebooks allows you to create notebooks composed of multiple cells. These cells can be either markdown cells for documentation or code cells containing code in either C#, F#, PowerShell, SQL, KQL, HTML, JavaScript, or mermaid markdown for diagramming.

This allows you to mix together rich documentation supported by little pieces of code that progressively expand upon an idea, tell a story, or otherwise provide insight or information to you as a developer.

Click through for the example. The thing I hadn’t realized—because I don’t really do this in Jupyter—is that you can share variables between languages. That’s a fairly useful feature when you want to do most of your work in one language but just happen to need a library using a different language.

Comments closed

An Overview of Google BigQuery Pricing Changes

Jeremy Pries gets out the checkbook:

Google has announced that a new tiered pricing model for Google BigQuery will come into effect in July. If your organization is already a BigQuery customer or you’re considering it as a data warehouse option, here’s what you need to know about the new pricing structure.

As of July 5, 2023, BigQuery will come with three pricing tiers: Standard, Enterprise and Enterprise Plus. Customers will have the ability to mix and match these editions to suit various workloads.

Read on for what this means and ways you can hope to save some money as a result of the licensing changes. I’m slightly pessimistic about it—only slightly because I’m quite ignorant of BigQuery pricing to begin with, but more than zero pessimism because companies often don’t change licensing terms with the intent of users paying them less.

Comments closed

Updates to Always Encrypted since 2016

Aaron Bertrand notes some changes:

In 2015, during the SQL Server 2016 beta, I explored a new feature is this article, Always Encrypted. This feature finally allowed us to encrypt data at rest and on the wire, and I showed how beneficial this was and how much more secure your data could be. I also explained that, as a new feature, some limitations made it difficult to use and, sometimes, impossible to adopt.

Several major versions of SQL Server later, how has this feature evolved, and is it easier to use today?

Read on for the answer. Aaron also covers secure enclaves, a big topic for Always Encrypted users.

Comments closed

NTILE and QUALIFY in Snowflake

Kevin Wilkie continues to build a sample:

In our last post, we went over one way to get a sample of data. In the end was it right? Heck, no – at least not if we wanted a percentage of rows returned. Now, SAMPLE does work fine if you want a specific number of random rows returned each time.

But, let’s face it – sometimes we will be asked for 10% of a table – especially in this world of Machine Learning. So let’s attempt to find a way to make this happen, shall we?

QUALIFY would be a nice clause to have. Instead, if you’re in the T-SQL world, you’re probably used to creating a common table expression or subquery and then applying a WHERE clause to the outputs of that CTE/subquery.

Comments closed

Publish to Power BI Environments via ADO

Richard Swinbank deploys a report:

In the first post in this series, I built an Azure DevOps pipeline to automate steps in a Power BI development workflow. The pipeline implemented a very basic workflow – as soon as a developer committed a new report version to Git, the pipeline deployed it immediately into a Power BI workspace.

In this post I’ll be building a pipeline to support a more sophisticated workflow that enables peer review and stakeholder testing.

Click through for the step-by-step process.

Comments closed

A Summary of Full-Text Search in SQL Server

Paul Hernandez gives us a primer on full-text search in SQL Server:

Sometimes you want to perform a search using one or more keywords over one or multiple character columns in a table. Clustered, nonclustered or column stored indexes (organized in a B-Tree structure) will help you with such a task. You can of course use the LIKE operator and do wildcard text searches, but this is still inefficient. Full-text search in SQL Server and Azure SQL lets you perform full-text queries against character based-data in your tables.

Read on to learn more about the topic. I’ve used full-text search with some success once, and my failed attempts count (in that, I tried to use FTS but it wasn’t a good use case and it didn’t work) is a little bit higher. The biggest thing I found was that it struggled with very large numbers of rows–I had tried examples with 50-100 million rows and the index never finished building.

Comments closed

Performance Overhead of TDE

Matthew McGiffen answers an age-old question:

Microsoft states that enabling TDE usually has a performance overhead of 2–4%. That doesn’t sound like very much, and personally I wouldn’t let it bother me if I want to make sure my data is encrypted at rest. However, you may have heard other sources saying that it’s actually a lot more than that – and the performance impact is a high price to pay for the level of protection offered. So, what’s the truth?

It turns out the answer is a bit more complex than simply saying “x%,” though as a first approximation, I’d still say that the 2-4% is a good starting point. For what would move you off of that 2-4%, read the whole thing.

Comments closed