Curated SQL Posts

Partitioning in Oracle versus PostgreSQL

Umair Shahid continues a series on migrating from Oracle to PostgreSQL:

Table partitioning is a database design technique that divides a large table into smaller, more manageable sub-tables called partitions. Each partition holds a subset of the data based on specific criteria, such as date ranges, categories, or hash values. While partitioning makes it seem like you’re working with a single large table, behind the scenes, queries and operations are distributed across multiple partitions.

Read on to understand the differences between the two platforms. In this case, there are some pretty significant differences.
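To make the idea in that excerpt a bit more concrete, here is a minimal Python sketch of routing rows to range and hash partitions. It is only an illustration, not how Oracle or PostgreSQL implement partitioning; the `orders_*` names, the `PartitionedTable` class, and the helper functions are all hypothetical.

```python
import zlib
from datetime import date


def range_partition_name(order_date):
    """Range partitioning: one partition per calendar year (hypothetical naming)."""
    return f"orders_{order_date.year}"


def hash_partition_name(customer_id, partition_count=4):
    """Hash partitioning: spread keys across a fixed number of partitions."""
    # zlib.crc32 is deterministic across runs, unlike Python's built-in hash() for strings.
    bucket = zlib.crc32(str(customer_id).encode("utf-8")) % partition_count
    return f"orders_h{bucket}"


class PartitionedTable:
    """Looks like one table to callers, but stores rows per partition."""

    def __init__(self):
        self.partitions = {}

    def insert(self, row):
        # Each row lands in exactly one partition, chosen by its order_date.
        name = range_partition_name(row["order_date"])
        self.partitions.setdefault(name, []).append(row)

    def rows_for_year(self, year):
        # Analogous to partition pruning: only the matching partition is read.
        return list(self.partitions.get(f"orders_{year}", []))

    def all_rows(self):
        # A query without a usable filter has to visit every partition.
        return [row for rows in self.partitions.values() for row in rows]
```

Inserting rows dated in 2023 and 2024 creates two partitions, and `rows_for_year(2024)` reads only one of them, which is the main performance benefit partitioning is meant to provide.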

Reasons to Migrate from Synapse to Fabric

James Serra has a list:

Many customers ask me about the advantages of moving from Azure Synapse Analytics to Microsoft Fabric. Here’s a breakdown of the standout features that make Fabric an appealing choice:

  • Unified Environment for All Users
    Fabric serves everyone—from report writers and citizen developers to IT engineers—unlike Synapse, which primarily targets IT professionals.
  • Hands-Free Optimization
    Fabric is auto-optimized and fully integrated, allowing most features to perform well without requiring technical adjustments.

I suppose that James is too politic to give what I’d consider the top reason: because there have actually been meaningful updates to Microsoft Fabric in the past year. I’m not sure you can really say the same thing about Azure Synapse Analytics.

The tricky part about this, however, is that, to my knowledge at least, there’s no clean way to migrate dedicated SQL pools.

Configuring SQL Server Alerts in SQLMonitor

Ajay Dwivedi has an update to SQLMonitor:

If you are responsible for managing & monitoring SQL Servers, then it is mandatory to have monitoring and alerting for critical issues of your SQL Servers.

For this purpose, open source SQLMonitor now has a built-in Alert Engine. With the capability of this alert engine, alerts can be sent to Slack & Email to various teams.

Read on to see some of those capabilities and how to set things up.

A Primer on SparkSQL and PySpark

Anurag K covers the basics of PySpark:

In the era of big data, efficient data processing is critical for insights-driven decision-making. PySpark SQL, a part of Apache Spark, enables data engineers and analysts to work with structured data at massive scale. Combining SQL’s simplicity with Spark’s processing power, it opens a gateway to handling vast datasets seamlessly. This comprehensive guide walks you through PySpark SQL, from foundational concepts to advanced querying techniques, with detailed code examples. Let’s dive in and master PySpark SQL for data-driven analytics.

Click through for examples covering a variety of operations you can perform.

Vector Search Performance Optimizations in Elasticsearch

Venkata Gummadi works on vector search response times:

As data engineers, we are tasked with implementing these sophisticated solutions, ensuring organizations can derive actionable insights from vast datasets. This article explores the intricacies of vector search using Elasticsearch, focusing on effective techniques and best practices to optimize performance. By examining case studies on image retrieval for personalized marketing and text analysis for customer sentiment clustering, we demonstrate how optimizing vector search can lead to improved customer interactions and significant business growth.

Read on for a vector search primer and some guidance on how you can improve the performance of vector search queries. I’d expect that much of this can also apply to Azure AI Search and Amazon OpenSearch.

Converting SQL Audit FileTime to DateTime Format

Patrick Keisler helps a customer:

One of my customers recently wanted to rename each of the SQL audit files with the datetime stamp of when it was created. I explained to them the filename already contains a datetime stamp. While it does not look like a typical timestamp, it is based on the Windows Filetime data structure that is a 64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC). Nonetheless, they still wanted a traditional datetime stamp in the file name.

Read on to see how. I can understand the displeasure in adding redundancy to a filename, though I also understand the reasoning from the customer’s point of view: FileTime isn’t human-readable in any meaningful way.
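For a sense of the arithmetic involved, here is a minimal Python sketch of the conversion, based only on the definition quoted above (100-nanosecond intervals since January 1, 1601 UTC). It is not Patrick's method; the function name is hypothetical, and it assumes you have already pulled the integer FileTime value out of the audit file name.

```python
from datetime import datetime, timedelta, timezone

# FileTime counts from this instant, per the definition quoted above.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a Windows FileTime integer to a timezone-aware UTC datetime."""
    # One tick is 100 nanoseconds, so 10 ticks make one microsecond.
    # Python datetimes stop at microsecond precision, so the last digit is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def filetime_to_stamp(filetime):
    """Format the value as a sortable, human-readable stamp for a file name."""
    return filetime_to_datetime(filetime).strftime("%Y%m%d_%H%M%S")
```

As a sanity check, the FileTime value 116444736000000000 corresponds to the Unix epoch, 1970-01-01 00:00:00 UTC.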

Move Data between Lakehouses and Workspaces in Microsoft Fabric

Gilbert Quevauvilliers performs an exfiltration:

With the new Schemas in a Lakehouse, it now is possible to read from Lakehouse A (in Workspace A) and write to Lakehouse B (in Workspace B).

Here are more details about the Schema preview: Lakehouse schemas (Preview) – Microsoft Fabric | Microsoft Learn

This opens a whole new world of possibilities.

I also really like the fact that I can simply use the Names, and I do not need to get the actual GUIDs!

For example, I can use the following as shown below which is WorkspaceName.LakehouseName.SchemaName.TableName

Click through to see it in action.
