January 2022 – Page 2

Testability and Functional Code

Published 2022-01-28 by Kevin Feasel

I describe why the functional approach to writing code makes it testable:

Another important aspect of functional programming relevant to writing testable Python code is that functions should not have side effects. In other words, functions take inputs and convert them to outputs; they don’t do anything else. This approach is aspirational rather than entirely realistic—after all, saving to the database is a side effect, and most applications would be fairly boring if they offered absolutely no way to modify the data. It just happens to be the case that our outlier detection engine can be close to side effect-free because we do not create files, save to a database, or push results to some third-party service. With most applications, however, we do not tend to be so lucky.

Click through for an excerpt from the draft of an upcoming book as well as a bit of elucidation on key points. The specific language I’m talking about here is Python but the concepts apply to most languages.

Comments closed

Defining Technical Debt

Published 2022-01-28 by Kevin Feasel

Paul Andrew has thoughts on technical debt:

A few times now I’ve been asked to define technical debt. It can be an ugly term if your role is a project manager or scrum master. But, for me, as a more technically minded person I see this debt as a very normal thing that in an agile delivery team can be managed. However, before we can manage it, I’d like to use this blog post to define it (and get my own thoughts in order).

Read the whole thing. If you want a bit more on the topic, I have a post sharing my thoughts. Reading Paul’s post, I think there’s a lot of common ground in our ways of thought on this.

Comments closed

Row-Level Security and Parallelism

Published 2022-01-28 by Kevin Feasel

Jose Manuel Jurado Diaz hits on an issue with row-level security:

Today, I worked on a service request that our customer reported that running a complex query this is executing in parallel but having more than 2 vCores in Azure SQL Database this query is not using parallelism.
During the troubleshooting process we suggested multiple tips and tricks, but any of them made that Azure SQL Engine uses parallelism:

Being on-premises versus in Azure turned out to be a red herring and the solution was something maybe even more difficult to spot than triggers.

Comments closed

Physical Read Double-Counting in Query Stats

Published 2022-01-28 by Kevin Feasel

David Alcock reviews the latest SQL Server 2019 cumulative update:

Microsoft recently released Cumulative Update 15 for SQL Server 2019. It contains a bunch of fixes and some improvements, I get a bit geeky with updates like this and love to have a look through the different fixes to see “Physical reads for read-ahead reads are counted incorrectly (two times) when you run queries. Therefore, the information in sys.query_store_runtime_stats and sys.dm_exec_query_stats shows incorrect values.”

Read on to see what this means and a quick test to see if it works as expected.

Comments closed

Finding Performance Regressions

Published 2022-01-27 by Kevin Feasel

Angus Croll embraces the power of control charts:

Technically, “performance” metrics are those relating to the responsiveness or latency of the app, including start up time.
But TV devices also tend to be more memory constrained than other devices, and as such are more liable to crash during a memory spike — so for Netflix TV we actually care about memory at least as much as performance, maybe more so.
At Netflix the term “performance” usually encompasses both performance metrics (in the strict meaning) and memory metrics, and that’s how we’re using the term here.

Click through for a primer on how they use changepoint and anomaly detection to do this.

Comments closed

Inserting Arrays into Hive Tables

Published 2022-01-27 by Kevin Feasel

The Hadoop in Real World team takes us through an interesting insertion problem:

When we attempt to insert in to the employee_depts table with the below INSERT, the insert operation will fail with TOK_FUNCTION not supported in insert/values SemanticException.

The solution is straightforward but not something I would have figured out intuitively.

Comments closed

Temporal Tables and Azure DevOps Deployments

Published 2022-01-27 by Kevin Feasel

Rayis Imayev notes a problem with Azure DevOps deployments:

Here is one thing that still doesn’t work well when you try to alter an existing temporal table and run this change through the [SqlAzureDacpacDeployment@1] DevOps task, whether this change is to add a new column or modify existing attributes within the table. Your deployment will fail with the “This deployment may encounter errors during execution because changes to … are blocked by …’s dependency in the target database” error message.

Read on to see what causes this problem and what we can do to work around it.

Comments closed

Checking a SQL Server Database’s Master Key

Published 2022-01-27 by Kevin Feasel

Steve Jones is trying to find where he left his keys:

I’ve been working with encryption in SQL Server for a long time, and have delivered quite a few presentations on the topic. Recently I was updating some code and wanted to check if a database had a master key created in it. This post shows how to do that.

Click through to see how you can create a database master key as well as how to drop it if necessary.

Comments closed

Combining Transparent Data Encryption and Instant File Initialization

Published 2022-01-27 by Kevin Feasel

Tom Collins performs a test:

Instant File Initialization (IFI) enabled on SQL Server, for data files only, leads to faster execution on certain file operations, as it bypasses the zeroing procedure while reclaiming disk space. Instead, disk content is overwritten as new data is written to the files.
But do these benefits continue when Transparent Data Encryption (TDE) is enabled ?

Click through for the answer.

Comments closed

January 2022 Updates for Azure Synapse Analytics

Published 2022-01-27 by Kevin Feasel

Saveen Reddy has an update for us:

You can now easily add data quality, data validation, and schema validation to your Synapse ETL jobs by leveraging Assert transformation in Synapse data flows. Add expectations to your data streams that will execute from the pipeline data flow activity to evaluate whether each row or column in your data meets your assertion. Tag the rows as pass or fail and add row-level details about how a constraint has been breached. This is a critical new feature to an already effective ETL framework to ensure that you are loading and processing quality data for your analytical solutions.

Read on for the full list of changes.

Comments closed

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30
31

Month: January 2022