Category: Tools

Working with Parquet Files in Postgres

Craig Kerstiens announces an extension:

Today, we’re excited to release pg_parquet – an open source Postgres extension for working with Parquet files. The extension reads and writes parquet files to local disk or to S3 natively from Postgres. With pg_parquet you’re able to:

  • Export tables or queries from Postgres to Parquet files
  • Ingest data from Parquet files to Postgres
  • Inspect the schema and metadata of existing Parquet files

Code is available at: https://github.com/CrunchyData/pg_parquet/.

Read on for more background on why we built pg_parquet or jump below to get a walkthrough of working with it.

Hey, that’s my job to tell people to read on to learn more!
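For a rough idea of what those three bullet points look like in practice, here is a minimal sketch that only builds the SQL text and does not connect to anything. The statement shapes (COPY with a format 'parquet' option, plus parquet.schema and parquet.metadata functions) are as I recall them from the project README, so treat them as assumptions and check the repo before relying on them. The table name and file path are hypothetical.

```python
def quote_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def quote_ident(name: str) -> str:
    """Quote a SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def parquet_statements(table: str, path: str) -> dict:
    """Build one statement per capability in the announcement.

    Assumption: pg_parquet hooks into COPY via a (format 'parquet') option
    and exposes parquet.schema / parquet.metadata for inspection.
    """
    target = quote_literal(path)
    ident = quote_ident(table)
    return {
        "export": f"COPY {ident} TO {target} (format 'parquet');",
        "ingest": f"COPY {ident} FROM {target} (format 'parquet');",
        "schema": f"SELECT * FROM parquet.schema({target});",
        "metadata": f"SELECT * FROM parquet.metadata({target});",
    }


if __name__ == "__main__":
    # Hypothetical table and path, purely for illustration.
    for label, sql in parquet_statements("orders", "/tmp/orders.parquet").items():
        print(f"{label}: {sql}")
```

You would paste the resulting statements into psql or hand them to whatever driver you already use; the sketch deliberately stops short of executing them.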

Building a Data Detective Toolkit

Deb Melkin talks tools:

Happy T-SQL Tuesday! I wasn’t really sure I’d be able to crank something out for this one but somehow I managed to squeeze it in. Tim Mitchell is hosting and he has a great topic for us: What’s in our Data Detective toolkit?

I love this topic for so many reasons. Partly because I feel like I’m asked to look at so many projects where I’m dropped in and asked to figure things out, usually performance related but occasionally new functionality or features. But as I’m asked to do this fairly often, I may have to see if Data Detective can be my new title… hmm…

Being a Data Detective in a film noir. On the one hand, that sounds like a really neat idea. On the other hand, things usually don’t turn out so well for the detective.

Creating Profiles in Visual Studio Code and Azure Data Studio

I have a new video:

In this video, I show off a not-so-well-known capability in Visual Studio Code and Azure Data Studio: creating profiles.

Profiles are very useful in Visual Studio Code, though probably less useful for Azure Data Studio. I think the primary benefit in Azure Data Studio would be handling things like zoom levels and menu layouts when you switch from a laptop on the go to something plugged into a larger monitor.

Building a Test Data Generator for PostgreSQL

Mika Sutinen builds some data:

I recently had a project where I needed quickly to generate some realistic looking test data to PostgreSQL database. While I often like to go for ready-made solutions, this felt like a good opportunity to stretch my coding muscles and develop it myself. Moreover, this seemed like a fun puzzle to solve, and I could probably use the same solution later on elsewhere.

Click through for a description of the generator, as well as a link to Mika’s GitHub repo. Taking a quick peek at it, it does appear that you could probably use this for other data platforms like SQL Server with very limited modification.

mssql-tools 18 and Two Common Errors

Vlad Drumea covers a pair of errors you might run into with mssql-tools version 18:

In this post I cover the 0A000086 and “command not found” errors that you might encounter with the new version of SQL Server command-line tools, namely sqlcmd and bcp, for Linux.

While the latest version of SQL Server command-line tools, based on Microsoft ODBC 18, brings improvements, it also brings some gotchas that can break your automations.

Read on to learn more about each.
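As a rough sketch of how an automation script might guard against both, here is a small Python helper. It rests on two assumptions that are mine and not quoted from Vlad's post: that version 18 of the tools installs into /opt/mssql-tools18/bin (so a PATH entry pointing at the older directory no longer finds sqlcmd), and that the 0A000086 error is a certificate verification failure that shows up because ODBC 18 encrypts connections by default. The server name, login, and query are hypothetical.

```python
import os
import shutil

# Assumption: default install directory for mssql-tools version 18 on Linux.
ASSUMED_BIN_DIR = "/opt/mssql-tools18/bin"


def find_sqlcmd(bin_dir=ASSUMED_BIN_DIR):
    """Return a path to sqlcmd, checking PATH first and then bin_dir.

    Returns None when sqlcmd is in neither place, which is the situation
    that surfaces as "command not found" in a shell script.
    """
    found = shutil.which("sqlcmd")
    if found:
        return found
    candidate = os.path.join(bin_dir, "sqlcmd")
    return candidate if os.path.isfile(candidate) else None


def build_sqlcmd_args(executable, server, user, query,
                      trust_server_certificate=False):
    """Build an argument list for sqlcmd without running it.

    -C tells sqlcmd to trust the server certificate. That is reasonable for
    a dev box with a self-signed certificate and a poor default anywhere else.
    """
    args = [executable, "-S", server, "-U", user, "-Q", query]
    if trust_server_certificate:
        args.append("-C")
    return args


if __name__ == "__main__":
    exe = find_sqlcmd()
    if exe is None:
        print(f"sqlcmd not found on PATH or in {ASSUMED_BIN_DIR}")
    else:
        # Hypothetical server, login, and query.
        print(build_sqlcmd_args(exe, "localhost", "sa", "SELECT 1;", True))
```

The argument list is meant to be handed to subprocess.run, with the password supplied separately (for example through the SQLCMDPASSWORD environment variable) so that it stays off the command line.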

Missing Columns in the Extended Events Live Data Explorer

Grant Fritchey explains a UI oddity:

Let me be extremely clear up front, this is not my original work. I saw this post on DBA.StackExchange.com and I wanted to share and promote it. Nice work FevziKartal.

The rest of this post is just me replicating work already done by others. I just want to see it in action.

Read on for the example and what happens when you don’t have any events in the live data explorer.

Tips for Bringing a Streamlit App into Production

I have wrapped up another series:

In this video, I discuss some of the things you should consider as you transition a Streamlit application from development to production. We will cover four methods of bringing a Streamlit app to production and some thoughts on performance optimization.

This one doesn’t have much in the way of demos, but I do spend a lot of time at the virtual whiteboard, so it’s got that going for it.

pg_dump and the Backup Tool Debate

Gulcin Yildirim Jelinek explains the debate over whether pg_dump is a backup tool:

Recently, while writing about the vulnerability affecting pg_dump, the topic of decommissioning pg_dump came up on Twitter. Unlike the nostalgic feelings many had for Pluto, there was less reluctance to see pg_dump reclassified. In fact, some people were eager to retire it as a backup utility, and I even got a bit of pushback for still referring to pg_dump as one.

I was talking to my colleague Simona the other day, and she mentioned that everybody in Postgres circles says, “pg_dump is not a backup tool,” but perhaps it’s not always explained well why it is not.

Read on for that explanation.
