Press "Enter" to skip to content

Curated SQL Posts

Comparing HBase to Cassandra

The Instaclustr team performs a comparison:

Apache HBase® and Apache Cassandra® are both open source NoSQL databases well-equipped to handle incredible amounts of data–but that’s where the similarities end. 

In this blog, discover the architectures powering these technologies, when and how to use them, and which option may prove to be the better choice for your operations.  

Click through for their overview of the two systems and recommendations on when to use which.

Comments closed

Digging into the OpenLibrary ISBN API

Robert Cain is looking for a book:

The format of the Advanced API is slightly different from the simple. Here is template.

https://openlibrary.org/api/books?bibkeys=ISBN:[ISBN Goes Here]&jscmd=data&format=json"

You will replace the [ISBN Goes Here] text with the ISBN number you want to look up. Be aware this can only be digits, you must remove any spaces, dashes, or other characters.

Let’s look at a code example of calling the API and getting all its properties.

Click through for examples and how you can use Powershell to parse the results.

Comments closed

TDE with Customer-Managed Keys in Azure SQL Database

Mirek Sztajno announces a public preview:

In this scenario, a key that is stored in a customer-owned and customer-managed Azure Key Vault (AKV) can be used for each database within a server to encrypt the database encryption key (DEK), called the TDE protector. The feature provides the ability to add keys, remove keys, and change the user-assigned managed identity (UMI) for each database. For more information on identity management, see Managed identity types in Azure.

Click through for more details on how it works and what’s currently not supported in the public preview.

Comments closed

Hiding Power BI Pages Is Not for Security

Marc Lelijveld points out that, just because the kitten covered its eyes and can’t see you don’t mean that you can’t see it:

Most of us probably know, but every now and then, I still run into cases where hiding pages in a Power BI report is used for security purposes. Although hiding pages might seem like a security feature, it is not intended to be used as one since it can be easily bypassed. Users can find hidden pages by simple URL tweaking or right-clicking on the tabs. The article provides an example of how users can use the Reports-Get Pages API to get all report pages and report sections without having to write a single line of code. The default page when publishing a report is the one that is open, even if it is hidden.

Yeah, hiding pages is really more of a convenience thing, especially if you’re making heavy use of drillthrough and other cross-page functionality.

Comments closed

KQL: Show Me

Brian Bønk shows off:

In Kusto and the services Azure Data Explorer and Synapse Data Explorer, there is one main part of the meta data queries – the .show command. The .show command preceeds the rest of the following commands for exploring the meta data in the engine:

  1. queries
  2. commands
  3. commands-and-queries
  4. journal
  5. operations
  6. ingestion failures
  7. table data statistics

Read on for examples of how the .show command can be quite useful.

Comments closed

Automating Self-Hosted Integration Runtime Deployment

Jonathan D’Aloia doesn’t want to click next-next-next:

Welcome to my blog on how to fully automate the deployment of a Self-Hosted Integration Runtime using Terraform!

The title of this blog is very much self-explanatory but I hope you find the contents useful and are able to apply this on your projects in some aspect.

Click through for a brief overview of self-hosted integration runtimes, the process to follow, and a link to the repo.

Comments closed

Using Flyway for MySQL Database Deployments

Ting Chou builds a proof of concept:

Few days ago, I asked myself a question about how can you better manage all the database changes defined in SQL scripts with DDL (Data Definition Language)? Maybe try to use version control on the SQL code and then use DevOps pipelines to achieve continuous deployment. Then, I came up with a poc solution, choux130/DevOps_In_DE/jenkins_mysql_flyway.

Click through for the repo, as well as a few TODOs.

Comments closed

Improving the Robustness of ML Model Deployment

Alexander Billington shares a few tools and tips:

Machine learning (ML) model deployment is a critical part of the MLOps lifecycle, and it can be a challenging process. In the previous blog, we explored how Azure Functions can simplify the deployment process. However, there are many other factors to consider when deploying ML models to production environments. In this blog, we’ll delve deeper into some of the essential hints and tips for more robust model deployments. We’ll look at topics such as proper model versioning and packaging, data validation, and performative code optimisations. By implementing these practices, data scientists and ML engineers can ensure their models are deployed efficiently, accurately, and with minimal downtime.

MLflow is definitely a good recommendation, as is Pydantic (which is on my to-learn list…one of these days).

Comments closed

Point-in-Time Restoration with pg_basebackup

Matt Pearson and Luke Davies restore a PostgreSQL database:

I had a conversation with another DBA about interview questions, and one interview topic that came up was using pg_basebackup to restore a database. I had a horrible realisation that I had not done a restore using pg_basebackup in PostgreSQL 15. With modern backup tools, using pg_basebackup is like using a manual screwdriver instead of an electrically-powered screwdriver; it gets the job done, but much more effort is involved.

However, sometimes pg_basebackup is the only tool available.

So, in this blog, we’ll look at PostgreSQL’s recovery options and their implications for restoring. We’ll also look at a simple restore using pg_basebackup when a user fails. I’ll be using a PG 15 database for these tests.

Click through for the process and a demonstration.

Comments closed