Press "Enter" to skip to content

Month: March 2021

The DBA Role Over Time

Brent Ozar has started a retrospective. The first post covers “What would you say you do here?”:

When the database goes down, the business stops. If you’re selling things online, and your web site goes down, that’s an emergency. Even if it’s not down – if it just slows down – that can be an emergency too. DBAs help prevent that problem.

When someone accidentally deletes data, the business can stop. Unfortunately, in all too many companies, there are too many folks with access to that data – and those folks are often untrained. They’re just human, and they make mistakes. Database administrators help recover from that problem.

The second post looks at the lack of progress in many areas:

When I started working with databases, we had to:

– Provision the right CPUs and memory
– Provision the right storage throughput and size
– Install the database
– Configure the database
– Protect it – which sounds easy, but that also includes designing the right high availability and disaster recovery solution
– Design the right tables, and put indexes on them based on how we wanted to query the data
– Write fast, accurate queries to load & retrieve data
– Troubleshoot the whole thing when it went wrong

Today, decades later…yeah.

We’re still doing all of that stuff. It’s just that we’re cave-people using rocks for tools instead of banging on things by hand.

I’ve got some thoughts on this but they won’t fit on this stamp. I’ll have to put them together some other day.

Comments closed

Checking on Transactional Replication Status

Steve Stedman wants to see how far behind transactional replication is:

Working with Transactional Replication with many of our SQL Server clients, one of the big questions that we are always asking is replication keeping up? Recently when deleting a few hundred million rows from a large table that is synced with transactional replication, I needed to programmatically check how the distribution was keeping up. The one measure that indicated that things were behind was the subscriber undistributed commands tab in the replication monitor. When the number of commands waiting to be applied was too large, I had to slow down my delete process.

Click through for an easy script. The really frustrating part is that transactional replication is easy but answering the same question with merge replication is a royal pain.

Comments closed

Building VBA Macros for Excel in Powershell

Mikey Bronowski has a case of developer inception:

This is part of the How to Excel with PowerShell series. Links to all the tips can be found in this post.
If you would like to learn more about the module with an interactive notebook, check this post out.

In this last post of the series, I am going to mix Excel, PowerShell and VBA. If you weren’t using PowerShell to manage Excel files before, you might have used VBA (Visual Basic for Applications) to do so. Excel is a powerful tool and even this area can be improved a little bit with PowerShell.

It’s a shame to see this series wrap up, but Mikey has provided a huge amount of content around automating Excel spreadsheet creation. If you haven’t been reading these, I definitely recommend it.

Comments closed

Tracking Azure Resources with Tags

Jess Pomfret explains the value of tags:

One of the vital parts of this learning and experimenting needs to be cleaning up after myself.  We all know the risks of leaving things running in Azure- it’s likely to drain your training budget pretty quickly.  To be fair, this is also a good lesson for real world scenarios. Getting used to turning off or scaling down resources based on need is a good way to reduce your Azure spend.

This brings me to one morning last week. I logged in to the portal and got a pop up that my credit was down to under $5, which is not what I was expecting. I started looking around and wondering what I’d left running – it isn’t always easy to spot though.

Read on to see how tags can help with this, as well as other forms of cloud governance. If you remember to set them, that is.

Comments closed

Columnstore in Standard Edition

Erik Darling looks at how powerful (or not) columnstore indexes are in SQL Server Standard Edition:

The top plan is from Standard Edition, and runs for a minute in a full serial plan. There is a non-parallel plan reason in the operator properties: MaxDOPSetToOne.

I do not have DOP set to one anywhere, that’s just the restriction kicking in. You can try it out for yourself if you have Standard Edition sitting around somewhere. I’m doing all my testing on SQL Server 2019 CU9. This is not ancient technology at the time of writing.

The bottom plan is from Enterprise/Developer Edition, where the the plan is able to run partially in parallel, and takes 28 seconds (about half the time as the serial plan).

You get what you pay for in this case.

Comments closed

Early Thoughts on Dremio

Meagan Longoria gives us a review of Dremio:

I’ve been working on a project for the last few months with a client who has chosen to implement Dremio in Azure. Dremio is a data lake engine that creates a semantic layer and supports interactive queries.

It uses Apache Arrow, Gandiva, and Parquet files under the hood. It runs on either Linux VMs or Kubernetes containers. Like most big data systems, there is at least one coordinator node and one or more executor nodes. These nodes communicate and are managed using Apache Zookeeper. Client applications connect to Dremio via ODBC, JDBC, REST APIs, or Arrow Flight. Dremio can read from storage accounts, external databases, and a few other sources.

Read on for good and bad aspects of the product.

Comments closed

knitr Options and Hooks

The folks at Jumping Rivers conclude a series:

As with many aspects of programming, when you are working by yourself you can be (slightly) more lax with styles and set-up. However, as you start working in a team, different styles can quickly become a hindrance and lead to errors.

Using {knitr} is no different. When you work on documents with different team members, it’s helpful to have a consistent set of settings. If the default for eval changes, this can easily waste time as you try to track down an error. At Jumping Rivers, we use {knitr} a lot. From our training courses, to providing feedback to clients, to constructing monthly reports on clients infrastructure. The great thing about {knitr} is it’s really easy to customise. The bad thing is that without some care, it’s really easy for every member of the team to have different default options. This proliferation of different default options, means that when we pick up someone else document, mistakes may creep in.

Read on for different options they use to keep things consistent.

Comments closed

Understanding Interpolation

Joe Celko offers up some thoughts on interpolation:

Interpolation is a mathematical technique which was popular before we had a lot of cheap computing power. The basic idea is that if you’re given a set of data and looking for a value in the same range, you can interpolate it to get a reasonable estimation for the value that is not actually in the set.

If you can find an old calculus, finance, statistics or algebra book, they had lookup tables in the back. Remember that the only computational tools students had back then were pencil and paper or a slide ruler. If you wanted to use a pencil and paper, you had to know what formula to use. If you use the slide ruler, you can only have three decimal places in your answer (yes, there were a couple of over-sized specialized slide rulers which could go as high as four or five decimal places. They were very expensive). But if your slide ruler didn’t have a particular function you were trying to compute, it was hard to get even the three decimal places.

When you try to approximate a value outside the range of your set, that’s called extrapolation. It’s a different topic and requires a slight leap of faith.

Interpolation is a key part of regression techniques. Read the whole thing.

Comments closed

Common Power BI Administration Scripts

Brett Powell continues a series:

Today’s post builds on top of Power BI Admin Scripts Part I by describing five additional PowerShell scripts that Power BI service administrators can utilize to address relatively common scenarios. Like Part I, the five new scripts have been added to my GitHub repository and I’ll only share context in the blog.

Please be sure to read through the prerequisites section of Part I and confirm you have the necessary permissions (e.g. Power BI Admin role) and software installed such as the latest Power BI Management PowerShell modules.

Check it out for five more scripts. Brett also teases a part 3, but you have to wait until after the credits sequence to see it.

Comments closed

The Value of Foreign Keys

Kevin Wilkie tempts Betteridge’s Law of Headlines:

Foreign Keys are truly columns that point to the primary key of another table. Yes, you can create a foreign key pointing to the same table – but that’s… well… special. There are reasons, but far beyond the reasons I want to go into foreign keys here.

What a foreign key does for you is extremely important – Enforce Data Consistency.

Click through to learn how.

Comments closed