Curated SQL Posts

Triggers In Postgres

Ryan Booz explains how triggers work in Postgres from the standpoint of someone familiar with SQL Server:

They are implemented as Functions (Stored Procedures)
This caught me off guard at first. I’ve been working with and dealing with triggers in SQL Server since day 1. They are first-class citizens… objects that have their own code blocks and rules.

PostgreSQL approaches it differently. Any reusable code block, regardless of its true purpose, is a Function of varying types. Triggers are no different. Therefore, you write the logic of your trigger in a Function and then call it by adding a trigger to the DML event of a table.

Click through for a few more tips on triggers.

Wait Stats In Azure Data Studio

Paul Randal takes the Wait Stats Report in Azure Data Studio for a spin:

Note that the x-axis is percentage of all waits, not wait count. You’ll see that PREEMPTIVE_OS_FLUSHFILEBUFFERS is the top wait on my Linux instance – that’s by design and I’ll blog about that next. I’ve also submitted a GitHub change to add that wait to the list of waits filtered out by the script the extension uses.

Anyway, you can drill in to the details by clicking the ellipsis at the top-right of the graph and selecting ‘Show Details’. That’ll give all the waits and by selecting each one you can see the usual output from my waits script. To get more information on what each wait means, select the bottom cell, right-click on the URL to copy it, and paste into your favorite browser to go to my waits library. And of course, you can refresh the results via the ellipsis as well.

I like how Azure Data Studio is coming together as a full product. There’s a ways to go yet, but it’s getting there.

How’s My Database?

Daniel Janik has a Windows app for you:

There are actually about 40 things it checks for.

Current limitations are that queries with a cursor or temp table are not analyzed. There’s also a bug where the missing indexes and warnings appear on the wrong node/operator. Since the tool is using estimated plans at the moment, it may not be as accurate.

I’m planning a few new features in the next month: feeding the utility a query plan and displaying the original query. I’m also planning on adding history and the ability to execute a query from the tool. Before we get to those, we need to fix some known bugs though. I’m hoping that you (yes, you!) can help me identify other bugs to make this a great tool for the SQL community.

The product is in beta, so check it out and send Daniel some feedback.

When Differential Backups Grow Larger Than Fulls

Kenneth Fisher notes that differential backups can end up being larger than full backups of the same database:

The thing about DBA Myths is that they are generally widespread and widely believed. At least I believed this one until I posted What’s a differential Backup? and Randolph West pointed out that my belief that differential backups can’t get larger than full backups was incorrect. In fact, differential backups (like FULL backups) contain enough transaction log information to cover transactions that occur while the backup is taking place. So if the amount of data that needs to be backed up combined with transactions requires more space than just the data ….

Read on for a demonstration.

Native Math Libraries And Spark ML

Zuling Kang shares with us how we can use native math libraries in netlib-java to speed up certain machine learning algorithms in Apache Spark:

Spark’s MLlib uses the Breeze linear algebra package, which depends on netlib-java for optimized numerical processing. netlib-java is a wrapper for low-level BLAS, LAPACK, and ARPACK libraries. However, due to licensing issues with runtime proprietary binaries, neither the Cloudera distribution of Spark nor the community version of Apache Spark includes the netlib-java native proxies by default. So without manual configuration, netlib-java only uses the F2J library, a Java-based math library that is translated from Fortran77 reference source code.

To check whether you are using native math libraries in Spark ML or the Java-based F2J, use the Spark shell to load and print the implementation library of netlib-java. The following commands return information on the BLAS library and include that it is using F2J in the line, “com.github.fommil.netlib.F2jBLAS,” which is highlighted below:
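
A minimal sketch of that check, assuming a Spark shell in which netlib-java (the com.github.fommil.netlib package) is on the classpath; the value name blasImpl is only illustrative. Per the quote above, an installation without native proxies would be expected to report the F2J implementation.

```scala
// Paste into spark-shell. Assumes netlib-java is on the classpath,
// as in Spark builds whose MLlib depends on it through Breeze.
import com.github.fommil.netlib.BLAS

// getInstance() returns whichever BLAS implementation netlib-java loaded;
// the class name shows whether it is a native proxy or the Java fallback.
val blasImpl: String = BLAS.getInstance().getClass.getName
println(blasImpl)
```

If the printed name is com.github.fommil.netlib.F2jBLAS, the Java-based fallback is in use rather than a native library.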

In the examples here, you can get about a 2x difference using the native math libraries versus without, so although that’s not an order of magnitude difference, it’s still nothing to sneeze at.

Kafka Cruise Control Frontend

Naresh Kumar Vudutha announces the Kafka Cruise Control Frontend:

For those that may be unfamiliar, Cruise Control features include:

1. Kafka broker resource utilization tracking
2. The ability to query the latest replica state (offline, URP, out of sync) from brokers
3. Goal-based resource distribution
4. Anomaly detection with self-healing
5. Admin operations on Kafka (add/remove/demote brokers, rebalance cluster, run PLE)

In this post, we will take a look at the frontend for Cruise Control, which provides a bird’s-eye view of all the Kafka installations and provides a single place to manage all of them.

That’s a lot of functionality in one tool.

Case Classes In Scala

Shubham Dangare explains what case classes are in Scala:

Case classes are Scala’s way to allow pattern matching on an object without requiring a large amount of boilerplate. All you need to do is add a single case keyword modifier to each class that you want to pattern match on. Using this modifier makes the Scala compiler add some syntactic conveniences to your class, and the compiler adds a companion object (with the apply method).
It adds a factory method with the name of the class. This means that, for instance, you can write StringValue("X") to construct a StringValue object instead of using new StringValue("X").

Given how useful case classes are in Spark, it’s good to know how they operate. For more background on the topic, Alessandro Lacava has a post from a few years back describing the topic well.
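
As a small sketch of the two conveniences mentioned in the quote (the generated factory method and pattern matching), consider the following. StringValue comes from the quoted text; the Value trait, IntValue, and the describe helper are hypothetical names added only for illustration.

```scala
// A tiny hierarchy; only StringValue appears in the quoted post,
// the rest is assumed here for the sake of the example.
sealed trait Value
case class StringValue(s: String) extends Value
case class IntValue(i: Int) extends Value

object CaseClassDemo {
  // Pattern matching works because the compiler generates an
  // extractor (unapply) in each case class's companion object.
  def describe(v: Value): String = v match {
    case StringValue(s) => s"string: $s"
    case IntValue(i)    => s"int: $i"
  }

  def main(args: Array[String]): Unit = {
    // The generated apply method lets us skip `new`.
    val x = StringValue("X")
    println(describe(x))
    // Case classes also get structural equality.
    println(x == StringValue("X"))
  }
}
```

Because Value is sealed, the compiler can also warn when a match such as the one in describe misses a case.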

SSIS Error “Deserializing The Package”

Andy Leonard troubleshoots an odd error in SSIS:

Exception deserializing the package “Operation is not valid due to the current state of the object.”. (Microsoft.DataTransformationServices.VsIntegration)

As a professional consultant who has been blogging about SSIS for 12 years and authored and co-authored a dozen books related to Microsoft data technologies, my first response was:
“Whut?!”

That is a reasonable first response. Fortunately, Andy also had a second response which was more helpful in finding the root cause.

Saving An ADF Pipeline As A Template

Rayis Imayev shares with us how you can save an Azure Data Factory pipeline as a template:

Azure Data Factory (ADF) provides you with a framework for creating data transformation solutions in the Microsoft cloud environment. The recently introduced Template Gallery for ADF pipelines can speed up this development process and provide you with helpful information to create additional activity tasks in your pipelines.

We naturally want to see whether something standard can be further adjusted. That custom design is like ordering a regular pizza and then hitting the “customize” button in order to add a few toppings of our choice. It would be very impressive then to save this customized “creation” for future ordering. And Azure Data Factory has a similar option to save your custom data transformation solutions (pipelines) as templates and reuse them later.

Click through to see how you can do just that.

No Type Equivalence In M

Imke Feldmann notes an oddity in types in Power Query:

But this function will not return any matches. I also tried out a (potentially) slower version using Table.SelectRows(Types, each [Value] = x[Types]) – but still no match.

What I found particularly frustrating here was that in some cases, these lookups or filters on type-columns worked.

That behavior seems odd to me. Imke shares a link from Microsoft which explains that the behavior occurs, but the why behind it eludes me.
