Day: June 11, 2024

Choosing between Data Warehouses, Lakes, and Lakehouses

Den Smyrnov talks architecture:

Historically, the two most popular approaches to storing and managing data have been the Data Warehouse and the Data Lake. The choice between them usually depends on business objectives and needs. While Data Lakes are ideal for preserving large volumes of diverse data, Data Warehouses are better suited to business intelligence and reporting. Sometimes, organizations try to have the best of both worlds and mix Data Lake and Data Warehouse architectures. This, however, can be a time-consuming and costly process.

Against this backdrop, a new hybrid approach, the Data Lakehouse, has emerged. It combines features of a Data Lake and a Data Warehouse, allowing companies to store and analyze data in the same repository and eliminating the Data Warehouse vs. Data Lake dilemma. A Data Lakehouse mixes the scalability and flexibility of a Data Lake with the ability to extract insights from data easily. Compelling as it is, this approach still has certain limitations. It should not be treated as a “one-size-fits-all” solution.

Read on for an explanation of each of these three styles, including their pros and cons.

Converting Numbers into Roman Numerals with C#

Adron Hall changes numbering schemes:

Today, I’ll dive into a fascinating challenge: converting modern numbers into their ancient Roman numeral counterparts. The task is straightforward but intricate, involving a programming challenge that takes any positive integer from 1 to 3999 and converts it into the corresponding Roman numeral.

To convert regular decimal numbers into Roman numerals, one must follow a set of rules based on the values and combinations of specific Roman numeral characters. Here’s a brief summary of the conversion process:

Click through for the translation rules and how to do this in C#. Now I’m thinking about how to do this in F# and thinking tail-call recursion. I might give that a try on my own and blog about it if I come up with something neat.
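
As a rough illustration of the idea (not Adron's C# code), here is a minimal Python sketch of the common greedy approach: walk a table of values from largest to smallest, including the six subtractive pairs such as IV and CM, and emit each symbol as many times as it fits. The names `ROMAN_SYMBOLS` and `to_roman` are my own and do not come from the linked post.

```python
# Value/symbol pairs in descending order, including the six subtractive forms.
ROMAN_SYMBOLS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert an integer from 1 to 3999 into a Roman numeral string."""
    if not 1 <= number <= 3999:
        raise ValueError("number must be between 1 and 3999")
    parts = []
    for value, symbol in ROMAN_SYMBOLS:
        # divmod tells us how many times this symbol fits and what remains.
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)
```

For instance, `to_roman(1994)` yields `MCMXCIV`: M for 1000, CM for 900, XC for 90, and IV for 4.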

Bucket Operations in Oracle Object Storage

Brendan Tierney continues a series on working with Oracle Object Storage:

In a previous post, I showed what you need to do to set up your local PC/laptop to be able to connect to OCI. I also showed how to perform some simple queries on your Object Storage environment. Go check out that post before proceeding with the examples in this blog.

In this post, I’ll build upon my previous post by giving some Python functions to:

  • Check if Bucket exists
  • Create a Bucket
  • Delete a Bucket
  • Upload an individual file
  • Upload an entire directory

Read on for those examples.
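
To give a feel for what a few of these helpers might look like, here is a hedged sketch rather than Brendan's code. It assumes `client` is an `oci.object_storage.ObjectStorageClient` from the OCI Python SDK and relies on its `head_bucket` and `put_object` calls. The helper names, the choice to treat a 404 as "bucket does not exist", and the object-naming scheme for directories are all my own assumptions.

```python
import os


def bucket_exists(client, namespace, bucket_name):
    """Return True if head_bucket succeeds, False if the service answers 404."""
    try:
        client.head_bucket(namespace, bucket_name)
        return True
    except Exception as exc:
        # The SDK's ServiceError exposes the HTTP status code as .status;
        # anything other than a 404 is re-raised for the caller to handle.
        if getattr(exc, "status", None) == 404:
            return False
        raise


def upload_file(client, namespace, bucket_name, file_path, object_name=None):
    """Upload one local file; the object name defaults to the file's base name."""
    object_name = object_name or os.path.basename(file_path)
    with open(file_path, "rb") as handle:
        client.put_object(namespace, bucket_name, object_name, handle)
    return object_name


def upload_directory(client, namespace, bucket_name, directory):
    """Upload every file under directory, using relative paths as object names."""
    uploaded = []
    for root, _dirs, files in os.walk(directory):
        for file_name in sorted(files):
            full_path = os.path.join(root, file_name)
            # Object names use forward slashes regardless of the local OS.
            relative = os.path.relpath(full_path, directory).replace(os.sep, "/")
            uploaded.append(
                upload_file(client, namespace, bucket_name, full_path, relative)
            )
    return uploaded
```

With the real SDK, the client would typically be built from a config file via `oci.config.from_file()`, and the namespace fetched with `client.get_namespace().data`. Creating and deleting buckets go through `create_bucket` and `delete_bucket` on the same client; see the linked post for working versions of those.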

Test Data Generation Concepts and Challenges

Phil Factor talks about data generation:

Data generation is the science and art of providing data for database development work that is as realistic and controllable as possible. The skills of generating realistic data are an essential part of being a database developer. It is important: you need plenty of data of exactly the right type, size and verisimilitude in order to develop and test a database. It has got to look right. If you are launching a corporate, data-driven application, you’ll need it for training as well as testing and development.

Where things get really tough is when you also want to get the distribution of data right. Is a uniform distribution (e.g., outputs from a random number generator) okay? Or do you have a few large accounts that take up the majority of your data? That difference could affect query plans, and if your test data is nice and uniform, you might never see some of the cardinality estimation problems until they pop up in production.
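
To make the uniform-versus-skewed point concrete, here is a small Python sketch using only the standard library. It is my own illustration, not something from Phil's article: the function name, the `skew` parameter, and the 1 / k**skew weighting (a Zipf-like shape) are all assumptions chosen for the example.

```python
import random
from collections import Counter


def generate_account_ids(row_count, account_count, skew=0.0, seed=42):
    """Return row_count account IDs drawn from 1..account_count.

    skew=0 gives a uniform distribution; larger values weight account k
    by 1 / k**skew, so low-numbered accounts dominate (a Zipf-like shape).
    """
    rng = random.Random(seed)
    accounts = list(range(1, account_count + 1))
    weights = [1.0 / (k ** skew) for k in accounts]
    return rng.choices(accounts, weights=weights, k=row_count)
```

Comparing `Counter(generate_account_ids(100_000, 1_000))` against the same call with `skew=1.2` shows the difference: in the uniform case every account gets roughly the same number of rows, while in the skewed case a handful of accounts own a large share of the table. The second shape is the kind that tends to expose parameter-sensitive plans.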

Cross-Database Ownership Chaining in SQL Server

David Seis explains and warns:

In SQL Server, an ownership chain is created when one object (like a stored procedure) accesses another object (like a table) in the same database. If both objects have the same owner, SQL Server only checks permissions for the first object, not the second. This is known as ownership chaining.

Cross-database ownership chaining extends this concept across databases. If a stored procedure in Database A accesses a table in Database B, and both objects have the same owner, SQL Server will not check permissions for the table in Database B.

For what I consider the definitive explanation of why you shouldn’t use cross-database ownership chaining (or other techniques like setting TRUSTWORTHY on), read Solomon Rutzky’s article. Module signing is the answer and the rest are traps.

Exporting SQL Server Query Results to an HTML Table

Vlad Drumea builds a table:

In a previous post, I’ve covered how to import and query HTML table data in SQL Server, so I figured it makes sense to also show how to export SQL Server query results to an HTML table using PowerShell and dbatools.

This is really similar to how the HTML report from PSBlitz is generated. The differences are that PSBlitz doesn’t rely on dbatools and there’s a bit more formatting and some JavaScript involved.

Click through for a piece-by-piece breakdown of the script and Vlad’s commentary.
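
Vlad's script uses PowerShell and dbatools. Purely to illustrate the core transformation in a different language, here is a minimal Python sketch that turns column names and rows into an HTML table. The function name and the decision to render NULLs as empty cells are my own assumptions; the rows would come from whatever database driver you use.

```python
from html import escape


def rows_to_html_table(columns, rows):
    """Render column names and row tuples as a plain HTML table string."""

    def cell(tag, value):
        # Treat None (SQL NULL) as an empty cell and escape everything else
        # so characters like < and & cannot break the markup.
        text = "" if value is None else escape(str(value))
        return f"<{tag}>{text}</{tag}>"

    lines = ["<table>"]
    lines.append("  <tr>" + "".join(cell("th", col) for col in columns) + "</tr>")
    for row in rows:
        lines.append("  <tr>" + "".join(cell("td", value) for value in row) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Escaping matters here because query results often contain user-supplied text, and an unescaped angle bracket in a cell would corrupt the report.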
