Press "Enter" to skip to content

Curated SQL Posts

Finding Oracle Blogs

Brendan Tierney theoretically makes my life easier:

A regular question I get asked is, “is there a list of Oracle-related Blogs?” and “what people/blogs should I follow to learn more about Oracle Database?” This typically gets asked by people in the early stages of their careers and even by those who have been around for a while.

You could ask these questions to twenty different (experienced) people, and you’d get largely the same answers with some variations. These variations would be down to their preferences on how certain people cover certain topics. This comes down to experience of following lots of people and learning over time.

Click through for a list of top Oracle blogs. I should probably check out some of them, though I don’t tend to do much with Oracle here. But if you do, there’s a starting point with 100 separate blogs to check out.

Comments closed

What’s New with OneLake Shortcuts

Miquella de Boer gives us an update:

Microsoft Fabric shortcuts enable organizations to unify their data across various domains and clouds by creating a single virtual data lake. These shortcuts act as symbolic links to data in different storage locations, simplifying access and reducing the need for multiple copies.

OneLake serves as the central hub for all analytics data. By using OneLake shortcuts, organizations can connect to existing data sources like Azure and AWS through a unified namespace, streamlining workflows and enhancing collaboration.

Click through for several feature improvements for shortcuts.

Comments closed

Time Range Generation in Data Diluvium

Adron Hall extends Data Diluvium:

Following up on my previous posts about adding humidity and temperature data generation to Data Diluvium, I’m now adding a Time Range generator. I decided this would be a nice addition to give any graphing of the data a good look. This will complete the trio of generators I needed for my TimeScale DB setup. While humidity and temperature provide the environmental data, the Time Range generator ensures we have properly spaced time points for our time-series analysis.

Click through to see how it works.

Comments closed

Checking Who Created a Table in SQL Server

Burt King reviews the logs:

I have noticed new SQL Server database objects have been created and want to know how we can track down who created these objects. What options are available in SQL Server? In this article, learn how in SQL Server to check who created a table or other objects.

Burt shows a couple of techniques for this, though I’d lean heavily on using Extended Events over a server-side trace or Profiler for the task.

Comments closed

Converting a CSV to Parquet with DuckDB and Polars in R

Michael Mayer makes a swap:

In this recent post, we have used Polars and DuckDB to convert a large CSV file to Parquet in steaming mode – and Python.

Different people have contacted me and asked: “and in R?”

Simple answer: We have DuckDB, and we have different Polars bindings. Here, we are using {polars} which is currently being overhauled into {neopandas}.

Click through for the comparison.

Comments closed

Reactable Tables with Sparklines in Shiny Apps

Osheen MacOscar continues a series:

This is the third blog in a series about the {sparkline} R package for inline data visualisations. You can read the first one about getting started with the package here and the second one about embedding them in HTML tables with the {reactable} package here.

In this blog I am taking it a step further and demonstrating how to use our sparkline reactable table in a Shiny app. Thankfully {reactable} has some helpful functions that make this super easy! I will also look at using a dynamic traffic light image in a reactable table at the end.

Click through to see how it all works.

Comments closed

Calculating a Matrix Inversion in SQL Server

Sebastiao Pereira performs matrix math in-database:

There are numerous applications to obtain a Matrix inverse for a given Matrix. Is it possible to do it using only SQL Server? Read on to learn how to build a matrix inverse calculator using a set of SQL Server custom functions.

I expect this to be extremely slow in comparison to GPU-based methods using a language like C, but this approach maximizes style points.

Comments closed

Avoid Exposing PostgreSQL Port 5432 to the Internet

Christophe Pettus shares some good advice:

Sometimes, we run into a client who has port 5432 exposed to the public Internet, usually as a convenience measure to allow remote applications to access the database without having to go through an intermediate server appllication.

Do not do this.

This is the equivalent of exposing port 1433 on a SQL Server instance to the broader internet, and is a bad idea for many of the same reasons.

Comments closed

B-Tree Indexes in PostgreSQL vs SQL Server

Lukas Fittl compares and contrasts:

When it comes to optimizing query performance, indexing is one of the most powerful tools available to database engineers. Both PostgreSQL and Microsoft SQL Server (or Azure SQL) use B-Tree indexes as their default indexing structure, but the way each system implements, maintains, and uses those indexes varies in subtle but important ways.

In this blog post, we explore key areas where PostgreSQL and SQL Server diverge: how their B-Tree indexes implementations behave under the hood and how they store and access data on disk. We’ll also benchmark the impact of deduplication of values on index size in each database system.

I love this kind of post because you hear that SQL Server has indexes and PostgreSQL has indexes (or Oracle has indexes or whatever), and thus, all of your index building knowledge in one applies to the other…right?

One thing that changes the article a bit is that the author doesn’t use page-level compression on indexes in SQL Server. I’d expect the results to change a fair amount, even if the SQL Server non-clustered indexes still ended up larger in the end than PostgreSQL indexes.

Comments closed