Press "Enter" to skip to content

Category: Data

Data Recovery in SQL Server without a Backup

Rodrigo Riberio Gomes digs in:

In more than 10 years of experience, I have dealt with cases where someone has performed incorrect operations on a table, such as updating or deleting wrong rows, in a SQL Server database that does not have full backups available. There are multiple reasons for no full backup: corrupted backups, taking too much time to restore, etc.

In this post, I want to show an alternative for these cases, an ace up one’s sleeve, that you can use to recover data. This method also provides a deep understanding of the internal workings of how your SQL Server stores data. So, in addition to learning how to recover data, you will gain more insights into the internals of SQL.

Read on to see how. Rodrigo also points out some limitations or things that would need to change if you have index compression. I consider this a very neat thing you might need to know but never want to use.

Leave a Comment

Counting NULLs in SQL Server Tables

Vlad Drumea counts that which does not exist:

I’ve seen variations of the question “how to count all NULLs in all columns of a table” pop up on reddit once every couple of months, and I figured I’d give it a shot and post here in case anyone else might need it in the future.

Click through for the script. It does require M*N scans, where M represents the average number of columns in each table and N the number of tables in the database. In other words, don’t expect an immediate response.

Leave a Comment

Data Conversion via Generative AI

Grant Fritchey rearranges some data:

The DM-32 is a Digital Mobile Radio (DMR) as well as an analog radio. You can follow the link to understand all that DMR represents when talking radios. I want to focus on the fact that you have to program the behaviors into a DMR radio. While the end result is identical for every DMR radio, how you get there, the programming software, is radically different for every single radio (unless you get a radio that supports open source OpenGD77, yeah, playing radio involves open source as well). Which means, if I have more than one DMR radio (I’m currently at 7, and no, I don’t have a problem, shut up) I have more than one Customer Programming Software (CPS) that is completely different from other CPS formats. Now, I like to set up my radios similarly. After all, the local repeaters, my hotspot, and the Talkgroups I want to use are all common. Since every CPS is different, you can’t just export from one and import to the next. However, I had the idea of using AI for data conversion. Let’s see how that works.

Click through for the scenario as well as Grant’s results. Grant’s results were pretty successful for a data mapping operation, though choice of model and simplicity of the input and output examples are important to generate the Python code.

Comments closed

What’s New with OneLake Shortcuts

Miquella de Boer gives us an update:

Microsoft Fabric shortcuts enable organizations to unify their data across various domains and clouds by creating a single virtual data lake. These shortcuts act as symbolic links to data in different storage locations, simplifying access and reducing the need for multiple copies.

OneLake serves as the central hub for all analytics data. By using OneLake shortcuts, organizations can connect to existing data sources like Azure and AWS through a unified namespace, streamlining workflows and enhancing collaboration.

Click through for several feature improvements for shortcuts.

Comments closed

Time Range Generation in Data Diluvium

Adron Hall extends Data Diluvium:

Following up on my previous posts about adding humidity and temperature data generation to Data Diluvium, I’m now adding a Time Range generator. I decided this would be a nice addition to give any graphing of the data a good look. This will complete the trio of generators I needed for my TimeScale DB setup. While humidity and temperature provide the environmental data, the Time Range generator ensures we have properly spaced time points for our time-series analysis.

Click through to see how it works.

Comments closed

Generating Realistic Data with Data Diluvium

Adron Hall wants to generate some realistic time-series data. First up is humidity data:

When working with TimeScale DB for time-series data, having realistic environmental data is crucial. I’ve found that humidity is a particularly important parameter that affects everything from agriculture to HVAC systems. 

Then comes temperature data:

Following up on my previous post about adding humidity data generation to Data Diluvium, I’m now adding temperature data generation. This completes the pair of environmental data generators I needed for my TimeScale DB setup. Temperature data is crucial for time-series analysis and works perfectly alongside the humidity data we just implemented.

Check out both of these for detailed coverage of how to generate realistic-looking simulated data.

Comments closed

The Design of DataDiluvium

Adron Hall wraps up a series on DataDiluvium. First up is some commentary on data generation:

In Parts 1 and 2, I set up the development environment and implemented the schema parsing functionality. Now, I’ll explore the data generation system and final implementation details that make DataDiluvium a complete solution.

And finally we have some additional notes and wrapup:

In my previous three posts, I covered the core functionality of DataDiluvium. In this follow-up post, I’ll explore the additional features, utilities, and implementation details that I’ve added to enhance the application’s functionality and developer experience.

Check out both posts. I do enjoy seeing people walk through and explain some of the key concepts and decisions they’ve made when developing solutions.

Comments closed

An Overview of DataDiluvium

Adron Hall has a new tool and a new blog series. The first post is a product overview:

DataDiluvium is a web-based tool available at datadiluvium.com that helps developers, database administrators, and data engineers generate realistic test data from SQL schema definitions. Whether you’re setting up a development environment, creating test scenarios, or preparing data for demonstrations, DataDiluvium streamlines the process of data generation.

The second covers some of the development precepts Adron used:

DataDiluvium is a web-based tool I’ve built designed to help developers, database administrators, and data engineers generate realistic test data based on SQL schema definitions. The tool takes SQL table definitions as input and produces sample data in various formats, making it easier to populate development and testing environments with meaningful data.

The tool is free, so if you’re looking for a sample data generator, check it out.

Comments closed

2025 Data Professional Salary Survey Results

Brent Ozar shares this year’s survey results:

We’ve been running our annual Data Professional Salary Survey for almost a decade, and I was really curious to see what the results would hold this year. How would inflation and layoffs impact the database world? Download the raw data here and slice & dice it to see what’s important to you. Here’s what I found.

Read on for Brent’s analysis and grab the data for yourself to try things out. I’ve used this dataset in the past for presentations and it usually goes over pretty well, especially because it includes quite a few real-life data quality challenges.

Comments closed

Parquet File Customization and SQL Server

Ed Pollack writes some files:

Previously, we introduced and discussed the Parquet file format and SQL Server and why it is an ideal format for storing analytic data when it does not already reside in a native analytic data store, such as a data lake, data warehouse, or an Azure managed service.

Both Python and the Parquet file format are quite flexible, allowing for significant customization to ensure that file-related tasks are as optimal as possible. Compatibility with other processes, as well as keeping file sizes and properties under control will also be introduced here.

Click through for some examples.

Comments closed