Press "Enter" to skip to content

Author: Kevin Feasel

Pattern Matching in Scala

Kuldeepak Gupta shows off pattern matching in Scala:

Pattern Matching is a mechanism of checking a value against a Pattern. It gives a way of checking the given sequence of tokens for the presence of a specific pattern. Here, we match expressions against a pattern.

Compared to the ‘switch’ in C++, C, JAVA, there’s no fall through to the next alternative in Scala pattern matching. A Match error is thrown when no pattern matches.

This is a powerful part of functional programming.

Comments closed

Understanding the Oldest Page Wait

Tom Collins explains a database wait:

SQL Server Log truncation deletes inactive Virtual Log Files (VLF) from the SQL Server database transaction log . The Log truncation process frees  space in the logical log for reuse by the Physical transaction log. If no truncation occurs , eventually it will fill all the disk space allocated to physical log files.

SQL Server Log truncation can be delayed for a range of different reasons.A good starting point is to  query the sys.databases log_reuse_wait and log_reuse_wait_desc columns. This will supply different waits describing the reason for a delay 

Read on for more info about the OLDEST_PAGE wait.

Comments closed

Using GREATEST and LEAST in Azure SQL DB

Aaron Bertrand preps us for SQL Server 2022:

In an earlier tip, “Find MAX value from multiple columns in a SQL Server table,” Sergey Gigoyan showed us how to simulate GREATEST() and LEAST() functions, which are available in multiple database platforms but were – at least at the time – missing from Transact-SQL. These functions are now available in Azure SQL Database and Azure SQL Managed Instance, and will be coming in SQL Server 2022, so I thought it was a good time to revisit Sergey’s methods and compare.

Read on to see how the workaround compares.

Comments closed

Azure Synapse Analytics November Updates

James Serra keeps us up to date on Synapse:

Delta Lake support for serverless SQL is generally available: Azure Synapse has had preview-level support for serverless SQL pools querying the Delta Lake format. This enables BI and reporting tools to access data in Delta Lake format through standard T-SQL. With this latest update, the support is now Generally Available and can be used in production. See How to query Delta Lake files using serverless SQL pools

Click through for the full list of what James likes.

Comments closed

Lambda Expressions in Scala

Shubham Shrivastava explains how lambda expressions work in Scala:

Lambda expressions in Scala the syntax for these uses a symbol it’s an equals and a greater than and we refer to this as rocket ( => ). When we’re reading the code the idea of a lambda expression is a short literal expression that defines a function and typically these should not be overly long. so for example I could define a function for squaring values that looks something like this.

Lambda expressions are great in cases where you need to perform an operation exactly one time. If you create a separate function with its own name, there’s always a wonder in the back of a developer’s mind if this thing will get used again, and so it takes up a little bit of cognitive load. A lambda expression answers that conclusively: no, we won’t use this code again.

Comments closed

Deploying dbt on Databricks

Dave Eyler, et al, have a great announcement:

At Databricks, nothing makes us happier than making our users more productive, which is why we are delighted to announce a native adapter for dbt. It’s now easier than ever to develop robust data pipelines on Databricks using SQL.

dbt is a popular open source tool that lets a new breed of ‘analytics engineer’ build data pipelines using simple SQL. Everything is organized within directories, as plain text, making version control, deployment, and testability simple.

Click through for more information on how this works and how you can get the native adapter.

Comments closed

Determining the Right Batch Size for Deletes

Jess Pomfret breaks out the lab coat and safety goggles:

I found myself needing to clear out a large amount of data from a table this week as part of a clean up job.  In order to avoid the transaction log catching fire from a long running, massive delete, I wrote the following T-SQL to chunk through the rows that needed to be deleted in batches. The question is though, what’s the optimal batch size?

I usually go with a rule of thumb: 1K for wide tables (in terms of columns and row size) or when there are foreign key constraints, 10K for medium-width tables, and about 25K for narrow tables. But if this is an operation you run frequently, it’s worth experimenting a bit.

Comments closed

Automating Historical Partition Processing in PBI Per User

Gilbert Quevauvilliers runs into a timing issue:

I recently had a big challenge with one of my customers where due to the sheer volume of data and network connectivity speed I was hitting the 5-hour limit for processing of data into my premium per user dataset.

My solution was to change the partitions from monthly to daily. And then once I have all the daily partitions merge them back into monthly partitions.

The challenge I had was I now had to process daily partitions from 2019-01-01 to 2021-11-30. This was a LOT of partitions and I had to find a way to automate the processing of partitions.

Not only that, but I had to ensure that I did not overload the source system too!

Read on to see what Gilbert did to solve this problem.

Comments closed

Building a Pipeline for External Data Sharing

Hope Foley has data to share:

I worked with a customer recently who had a need to share CSVs for an auditing situation.  They had a lot of external customers that they needed to collect CSVs from for the audit process.  There were a lot of discussions happening on how to best do it, whether we’d pull data from their environment or have them push them into theirs.  Folks weren’t sure on that so I tried to come up with something that would work for both. 

Read on for Hope’s solution to the problem.

Comments closed