Press "Enter" to skip to content

Month: January 2023

Finding Skipped Identity Values in a Table

Brent Ozar minds the gap:

When someone says, “Find all the rows that have been deleted,” it’s a lot easier when the table has an Id/Identity column. Let’s take the Stack Overflow Users table:

It has Ids -1, 1, 2, 3, 4, 5 … but no 6 or 7. (Or 0.) If someone asks you to find all the Ids that got deleted or skipped, how do we do it?

Click through for two methods, one specific to SQL Server 2022 and one which works for all versions of SQL Server.

Comments closed

OData Feeds and Dynamic Data Source Errors

Chris Webb handles an error:

I’ve blogged about the “dynamic data sources” error and the Web.Contents function several times (most recently here), main post here, but never about the fact that you can encounter the same error when working with OData data sources and the OData.Feed function. More importantly, while it seems like the solution to the problem is the same for both functions this is not the case! In fact, the solution may be simpler than you might think.

Click through for an example.

Comments closed

Setting Up Transparent Data Encryption

Matthew McGiffen enables transparent data encryption on a database:

Transparent Data Encryption (TDE) is one of the easiest ways of encrypting your data at rest. In the previous posts we looked at what TDE is and how it works.

In this post we go through the steps of setting TDE up. You can set up TDE when you first create a database (which we will look at in this post), or you can apply it to an existing database (which we’ll cover in posts I’ve got coming up). In the latter case, once TDE has been enabled it will encrypt your existing data in the background. In either case the steps are the same.

Read on to see what you need to do.

Comments closed

End-to-End Testing via shinytest2

Russ Hyde begins a new series:

{shinytest2} builds upon the {shinytest} package and was written by Barret Schloerke and his colleagues at RStudio. Like puppeteer, {shinytest2} uses the Chrome DevTools Protocol to interact with the browser, which is a pretty stable basis for building a browser automation tool (the predecessor {shinytest} was built on a now-unsupported browser library called PhantomJS, so we strongly recommend migrating to {shinytest2} if you are still using {shinytest}). Test scripts are written in R and so should be accessible to R developers who are comfortable with {testthat}. There is an automated tool (described in the next post) for creating these test scripts. Also, {shinytest2} understands the architecture of shiny apps, and so it is simple to access the input and output variables that are stored by a shiny app at any given time, the inputs can be modified easily as well – to access these variables using the more general UI-based end-to-end testing tools is much more difficult.

Read on for the “why” behind this series and the next posts will get into more of the “how.”

Comments closed

Executing Multiple Notebooks in one Spark Pool with Genie

Shalu Ganotra Chadra, et al, explain what Synapse Genie is:

The Genie framework is a metadata driven utility written in Python. It is implemented using threading (ThreadPoolExecutor module) and directed acyclic graph (Networkx library). It consists of a wrapper notebook, that reads metadata of notebooks and executes them within a single Spark session. Each notebook is invoked on a thread with MSSparkutils.run() command based on the available resources in the Spark pool. The dependencies between notebooks are understood and tracked through a directed acyclic graph.

Read on for more information about how you can use it and what the setup process looks like.

Comments closed

Using Managed Identities with Azure Functions

Dennes Torres takes us through the proper use of managed identities:

Let’s talk about authentication between Azure Functions and resources used by Azure Functions and conclude with many poorly documented secrets about how to use User Assigned Managed Identity. When we build Azure functions, they usually need to authenticate against other Azure resources: Azure SQL DatabaseStorage AccountsService Bus and many more.

Each of these services have an authentication that we can call “Meh!”: Azure SQL has SQL Standard Logins, storage accounts have SAS tokens, service bus, shared access keys and so on. These are not the safest methods possible. If the key leaks, you will have a security problem because anyone with the key will be able to access the resource.

There are multiple solutions for this problem, some of them would pass through Key Vault, used to store secrets, keys and passwords. But let’s go directly to the best one: Remove the usage of keys at all. 

Read on to learn how.

Comments closed

Translation in Power BI via Cognitive Services

Leila Etaati gets lost in translation:

There is a possibility to call cognitive service for translation inside Power Query.

I will use this to translate 3000 rows of data about people arrested in Iran for protesting; This information contains city Name, Fullname and Other statements.

In this article, I will show how to call cognitive services for translation, create a proper JSON call and finally, use it inside Power Query.

Read on for the translation in Power Query, specifically from Farsi to English.

Comments closed

Common Table Expressions in MySQL

Robert Sheldon looks at the syntax for common table expressions in MySQL:

As with many relational database management systems, MySQL provides a variety of methods for combining data in a data manipulation language (DML) statement. You can join multiple tables in a single query or add subqueries that pull data in from other tables. You can also access views and temporary tables from within a statement, often along with permanent tables.

MySQL also offers another valuable tool for working with data—the common table expression (CTE). A CTE is a named result set that you define in a WITH clause. The WITH clause is associated with a single DML statement but is created outside the statement. However, only that statement can access the result set.

The syntax is very similar to that of SQL Server save for an explicit RECURSIVE clause rather implicit recursion as in T-SQL.

Comments closed

K-Fold Cross-Validation in Python

Shanthababu Pandian gives us a primer on k-fold cross-validation:

In each set (fold) training and the test would be performed precisely once during this entire process. It helps us to avoid overfitting. As we know when a model is trained using all of the data in a single shot and gives the best performance accuracy. Resisting this k-fold cross-validation helps us to build the model as a generalized one.

To achieve this K-Fold Cross Validation, we have to split the data set into three sets, Training, Testing, and Validation, with the challenge of the volume of the data.

Read on for the explanation and an example.

Comments closed