2022-01-28 – Curated SQL

Implementing Homomorphic Encryption with SEAL

Published 2022-01-28 by Kevin Feasel

Tsuyoshi Matsuzaki has a tutorial on using Microsoft SEAL:

Microsoft SEAL is a homomorphic encryption (HE) library, developed by Microsoft Research.
With homomorphic encryption (HE), the encrypted item can be used on computation without decryption. For sensitive data (such as, privacy data in healthcare), the customers can operate their own data without submitting private text to cloud service providers. (See below.)

Click through to see how it all works. Homomorphic encryption is a clever solution to an important class of data security problems and I’m happy to see walkthroughs like this be available.

Comments closed

Building Packages from Base R Files

Published 2022-01-28 by Kevin Feasel

John Nash and Arkajyoti Bhattacharjee package things up:

This article tries to explain an approach to developing alternative versions of functions which are in the distributed base of R. Our interest was in developing improvements to the nls() function and related features in R as part of a Google Summer of Code project for which Arkajyoti Bhattacharjee was the funded student. However, nls() has many tentacles involving a number of files and functions that may or may not be called as nls() is executed.
Part of the difficulty in carrying out such development of alternative versions is that one needs to be able to execute the new variants in parallel with the existing ones. A heavy-effort approach would be to have separate full sets of R code and build each system and run them separately. That is, we want to have two or more versions of R in the same computing system.

Read on for the process, some difficulties you might encounter along the way, and specific issues you might run into on Windows. H/T R-Bloggers.

Comments closed

Multivariate Time Series Anomaly Detection in Azure

Published 2022-01-28 by Kevin Feasel

Louise Han announces an update to the anomaly detection service:

We are excited to announce that we are adding more powerful capabilities in Microsoft Azure Multivariate Anomaly Detector (MVAD) today. In the latest version(v1.1-preview.1) of this API, we implemented a new , in a synchronous manner, which means you could get the anomaly detection results immediately once you call this API. This synchronous inference API is a substantial change compared with previous inference process and will be more intuitive and easier-to-use.
Also, we added a new field named ‘interpretation‘ to give more explanations on an anomaly, like which variables have huge correlation changes and cause the anomaly. These updates will support you to better leverage MVAD and get more useful information to analyze and take actions.

Click through for some more details.

Comments closed

Automating Excel Report Creation with Python

Published 2022-01-28 by Kevin Feasel

Mira Celine Klein needs to create Excel reports:

In this article you will learn how to get data from Python into an Excel file and add some formatting. Excel reports are a great way to communicate data or results, especially to people who don’t use Python. Another great advantage is that you can create automated reports: You define once what the reports should look like, and then you can create it very quickly for, for example, different subgroups of data, or data that is updated regularly.
The first part of the article describes the most important functions and actions, for example, setting column widths, changing font colors, or adding hyperlinks to other sheets. In the second part, all of these features are combined in one Excel file.

This looks a lot like programming against the Excel COM objects in Powershell but maybe a little easier.

Comments closed

Testability and Functional Code

Published 2022-01-28 by Kevin Feasel

I describe why the functional approach to writing code makes it testable:

Another important aspect of functional programming relevant to writing testable Python code is that functions should not have side effects. In other words, functions take inputs and convert them to outputs; they don’t do anything else. This approach is aspirational rather than entirely realistic—after all, saving to the database is a side effect, and most applications would be fairly boring if they offered absolutely no way to modify the data. It just happens to be the case that our outlier detection engine can be close to side effect-free because we do not create files, save to a database, or push results to some third-party service. With most applications, however, we do not tend to be so lucky.

Click through for an excerpt from the draft of an upcoming book as well as a bit of elucidation on key points. The specific language I’m talking about here is Python but the concepts apply to most languages.

Comments closed

Defining Technical Debt

Published 2022-01-28 by Kevin Feasel

Paul Andrew has thoughts on technical debt:

A few times now I’ve been asked to define technical debt. It can be an ugly term if your role is a project manager or scrum master. But, for me, as a more technically minded person I see this debt as a very normal thing that in an agile delivery team can be managed. However, before we can manage it, I’d like to use this blog post to define it (and get my own thoughts in order).

Read the whole thing. If you want a bit more on the topic, I have a post sharing my thoughts. Reading Paul’s post, I think there’s a lot of common ground in our ways of thought on this.

Comments closed

Row-Level Security and Parallelism

Published 2022-01-28 by Kevin Feasel

Jose Manuel Jurado Diaz hits on an issue with row-level security:

Today, I worked on a service request that our customer reported that running a complex query this is executing in parallel but having more than 2 vCores in Azure SQL Database this query is not using parallelism.
During the troubleshooting process we suggested multiple tips and tricks, but any of them made that Azure SQL Engine uses parallelism:

Being on-premises versus in Azure turned out to be a red herring and the solution was something maybe even more difficult to spot than triggers.

Comments closed

Physical Read Double-Counting in Query Stats

Published 2022-01-28 by Kevin Feasel

David Alcock reviews the latest SQL Server 2019 cumulative update:

Microsoft recently released Cumulative Update 15 for SQL Server 2019. It contains a bunch of fixes and some improvements, I get a bit geeky with updates like this and love to have a look through the different fixes to see “Physical reads for read-ahead reads are counted incorrectly (two times) when you run queries. Therefore, the information in sys.query_store_runtime_stats and sys.dm_exec_query_stats shows incorrect values.”

Read on to see what this means and a quick test to see if it works as expected.

Comments closed

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30
31

Day: January 28, 2022