Press "Enter" to skip to content

Category: Tools

Tips for Bringing a Streamlit App into Production

I have wrapped up another series:

In this video, I discuss some of the things you should consider as you transition a Streamlit application from development to production. We will cover four methods of bringing a Streamlit app to production and some thoughts on performance optimization.

This one doesn’t have much in the way of demos, but I do spend a lot of time at the virtual whiteboard, so it’s got that going for it.

Comments closed

pg_dump and the Backup Tool Debate

Gulcin Yildirim Jelinek explains the debate around whether pg_dump is a backup tool or not:

Recently, while writing about the vulnerability affecting pg_dump, the topic of decommissioning pg_dump came up on Twitter. Unlike the nostalgic feelings many had for Pluto, there was less reluctance to see pg_dump reclassified. In fact, some people were eager to retire it as a backup utility, and I even got a bit of pushback for still referring to pg_dump as one

I was talking to my colleague Simona the other day, and she mentioned that everybody in Postgres circles says, “pg_dump is not a backup tool,” but perhaps it’s not always explained well why it is not.

Read on for that explanation.

Comments closed

Checking SQL Server Backup Status with sp_CheckBackup

Jeff Iannucci announces a new tool:

Whether you are backing up your SQL Server databases with Ola Hallengren’s scripts, Maintenance Plans, or some third-party software, your backups are your lifeline for recovering your data in case of a disaster.

So…are your databases recoverable? Do you know WHERE your most recent backups were written? Do you know WHEN they were written? Did you verify the backups for recoverability? Have you backed up any encryption certificates associated with your backups?

And if you think know the answers to these questions…are you sure? And can you confirm your Recovery Point Objective (RPO)?

Read on to learn more about the procedure and grab a copy from the GitHub repo.

Comments closed

Direct Query of Flat Files in Postgres via file_fdw

Semab Tariq uses a Postgres extension:

The file_fdw (Foreign Data Wrapper) is a PostgreSQL extension that lets you access data stored in flat files, like CSV files, as if they were regular tables in your PostgreSQL database. This is useful for integrating external data sources without needing to import the data directly into your database.

file_fdw is a contrib module, meaning it’s an additional feature included with PostgreSQL but not part of its core functionality. Contrib modules provide extra capabilities and enhancements beyond the core database system.

At first, I was going to write that the mechanism looks a lot like PolyBase in SQL Server. But in actuality, it’s more like a hybrid of PolyBase and OPENROWSET, as there’s no definition of external data sources or file formats, but there is the creation of an external (“foreign”) table.

Comments closed

SSMS Layout and Tips

Tom Zika talks about SQL Server Management Studio:

I’d like to pin my Object Explorer and Registered Servers to the right and set them to auto-hide for these reasons:

  • I spend most of my time in the Query window and I like that it’s left-aligned
  • I have to resize the Object Explorer based on the level of nesting or length of the object name, sql jobs, etc
  • The Solution Explorer in Visual Studio is also on the right side

Read on for quite a few tips. Tom has a good tip regarding tabs and I’d bring up a separate tool for consideration: Tabs Studio. I paid for a license of it out of my own pocket when I was a database engineer and absolutely loved the control it gave me over displaying SSMS tabs, including having literally dozens of them on the screen at once.

1 Comment

Execution Plan Analysis in Postgres with StatsViaExplainAnalyze

Deepak Mahto shares a tool with us:

Tuning SQL is an integral part of most developer’s or DBA’s primary responsibilities. Understanding and dissecting problematic execution plans are critical steps in the overall tuning process.

One key ingredient of SQL performance tuning, along with the runtime plan, is object statistics. Looking only at the execution plan is often insufficient for making solid recommendations; we need a fair understanding of current object statistics.As the saying goes,

Behind every optimized SQL query lies a deep understanding of the underlying object stats.

Click through to see examples of the StatsViaExplainAnalyze tool and how you can get a copy of it from GitHub.

Comments closed

Building a Series of T-SQL Statements via Regular Expression

Andy Levy now has two problems:

Earlier this year, I embarked on a bit of a project to tidy up the indexes in a sizeable database. This database has over 900 tables, and there are quite a few indexes which I have long suspected don’t need to exist for a variety of reasons. Some indexes are redundant, others don’t get used, still others could be combined into a single index. By reducing the number of indexes, we can improve write performance in the database, and the size of the database itself. But we have to do it carefully.

Read on for an example of how you can use a regular expression to take text in one form (DatabaseName.SchemaName.IndexName, for example) and convert it into a T-SQL script.

Comments closed

Contoso Data Generator v2

Marco Russo announces an updated product:

I am proud to announce the second version of the Contoso Data Generator!

In January 2022, we released the first version of an open-source project to create a sample relational database for semantic models in Power BI and Analysis Services. That version focused on creating a SQL Server database as a starting point for the semantic model.

We invested in a new version to support more scenarios and products! Yes, Power BI is our primary focus, but 90% of our work could have been helpful for other platforms and architectures, so… why not?

Read on to see how you can use this and generate as much data as you want.

Comments closed

Unit Testing a Database

Olivier Van Steenlandt builds some tests:

In the past few years, I learned much about collaborative data warehouse development and deployment automatization by using Database Projects (SSDT) and Azure DevOps (and other tools).

I had my fair share of learning curves, making mistakes, and having great learning opportunities. Lately, I started my next journey to learn about Unit Testing for data warehousing/database development.

In this data cookbook (blog post series), we will discover the wonderful world and different flavors of unit testing from a data perspective. In the coming weeks/months, new data recipes (blog posts) will be released bi-weekly.

This first post provides an overview of the topic and includes links to three tools, though SQL Test is an implementation of tSQLt. Of the three, Visual Studio tests are the best of the bunch, though they’re more integration tests than unit tests.

Comments closed

Tips to Identify Poorly-Performing Code

Rich Benner shares a few tricks:

So, you have a database or block of code. You’ve been told to “make it fast” but you’re not sure where to start. I’ve got you. We’re going to create a process to follow to ensure we can tune effectively and prove that we’ve made things faster. It’s all about having a structure when performance tuning.

Read on for Rich’s recommendations on how to identify poorly-performing code.

Comments closed