Press "Enter" to skip to content

Author: Kevin Feasel

The First 30 Days as a DBA

Tracy Boggiano has some experience with new jobs:

Over the last four years, ok it seems longer than that, I’ve started four jobs. A couple just weren’t good fits. One I was at for three years. I currently just finished my first 30 days at my fourth one. Having done the first 30 days several times over the last few years, I’ve searched each time what you would do when you start that new position to take over the environment. What would you evaluate, where to start with everything, what to do first? With no luck mind you. So, I’m going to blog about my journey through this as I’ve done it several times over my career and believe it can help others as they start new positions know where to begin. Coincidentally, Aaron Bertrand (t) just blogged about his first month at Stack Overflow as a DBRE.

I can’t believe it’s been four years since Tracy and I worked together. Somebody’s been messing with my time machine, right?

Comments closed

Building a Reproducable Example with SQL Server

Erik Darling wants to help you ask questions more effectively:

What no one wants to get into when it comes to performance questions is a giant wall-of-text word problem.

You may be the most eloquent question-asker in the known universe, but having the above items is worth hundreds of millions of words.

Click through to see what Erik would like to see. This is a really good post to read if you ever use Stack Overflow (or DBA Stack Exchange) or any other method of asking a bunch of randos how to solve a problem.

Comments closed

Power BI Aggregations from Azure Data Explorer Data

Dany Hoter has some recommendations if you’re aggregating data from Azure Data Explorer into Power BI:

Every visual shown in a report in PBI, contains some form of aggregation

The question is how the aggregations are calculated and at which step in the pipe of bringing the data from the data source to the report.

In this article, I’ll be using data coming from Azure Data Explorer aka Kusto aka ADX.

Most of the content is relevant for other sources as well.

Read on for the advice, which I’d call fairly unexpected—I actually expected the recommendation to go the other way for performance reasons.

Comments closed

Stopping Azure Kubernetes Service Nodes

Andrew Pruski wants to shut the whole thing down:

A while back I wrote a post on Adjusting Pod Eviction Timings in Kubernetes. To test the changes made in that post I had to shut down nodes in an Azure Kubernetes Service cluster.

This can be done easily in the Azure portal: –

However I did a presentation recently and didn’t want to have to keep jumping into the portal from VS Code…so I wanted to be able to shut down the nodes in code.

So here’s how to use the azure-cli to shut down a node in an Azure Kubernetes Service cluster.

Read on to see how but also read Andrew’s warning / disclaimer so you don’t mess anything up in a production environment.

Comments closed

Power Query Online Memory and CPU Usage

Chris Webb wants to see how things are going:

Power Query Online is, as the name suggests, the online version of Power Query – it’s what you use when you’re developing Power BI Dataflows for example. Sometimes when you’re building a complex, slow query in the Query Editor you’ll notice a message in the status bar at the bottom of the page telling you how long the query has been running for and how much memory and CPU it’s using:

Read on to understand what the memory value indicates and for a few tips on the topic.

Comments closed

PHI De-Identification in Databricks with NLP

Amir Kermany, et al, share a set of notebooks:

John Snow Labs, the leader in Healthcare natural language processing (NLP), and Databricks are working together to help organizations process and analyze their text data at scale with a series of Solution Accelerator notebook templates for common NLP use cases. You can learn more about our partnership in our previous blog, Applying Natural Language Processing to Health Text at Scale.

To help organizations automate the removal of sensitive patient information, we built a joint Solution Accelerator for PHI removal that builds on top of the Databricks Lakehouse for Healthcare and Life Sciences. John Snow Labs provides two commercial extensions on top of the open-source Spark NLP library — both of which are useful for de-identification and anonymization tasks — that are used in this Accelerator:

This is a really interesting scenario.

Comments closed

Building Custom ggplot2 Palettes

Nicola Rennie busts out the beret and fancy palette board:

Choosing which colours to use in a plot is an important design decision. A good choice of colour palette can highlight important aspects of your data, but a poor choice can make it impossible to interpret correctly. There are numerous colour palette R packages out there that are already compatible with {ggplot2}. For example, the {RColorBrewer} or {viridis} packages are both widely used.

If you regularly make plots at work, it’s great to have them be consistent with your company’s branding. Maybe you’re already doing this manually with the scale_colour_manual() function in {ggplot2} but it’s getting a bit tedious? Or maybe you just want your plots to look a little bit prettier? This blog post will show you how to make a basic colour palette that is compatible with {ggplot2}. It assumes you have some experience with {ggplot2} – you know your geoms from your aesthetics.

Click through to see how you can build a palette and use it across multiple ggplot2 charts.

Comments closed

Determining Why Constraints Are Untrusted

Tom Zika adopts a zero-trust constraint architecture:

Okay, so you went through the effort of fixing them, but the next day your constraints are not trusted again. What gives?

If you are sure none of the DBAs or developers is doing this to spite you, the most common culprit is a BULK INSERT or bulk copy tool (bcp).

One of its parameters is -h(hints)
and one of those hints is CHECK_CONSTRAINTS

Read on to see how this can mess everything up, as well as how you can track and fix it. There are cases, particularly in extremely high-write systems, where you don’t necessarily need the constraint to exist but want it to be there for documentation purposes. In that case, the constraints are usually disabled rather than simply untrusted. The other time I see people purposefully using untrusted constraints is that old data is garbage and essentially unfixable but they want new data to be correct. Most of the time, though, constraints are untrusted because nobody noticed the problem.

Comments closed

Creating Line Charts in Excel

Amy Esselman builds a line chart:

A line chart is a simple graph that is familiar to most audiences. Lines are great for showing continuous data, such as plotting how the value of something changes over time. In this post, we will cover how to create a line chart in Excel, using a sample dataset from a community exercise: table takeaways. The information is about an annual corporate fundraiser to provide meals to those in need. You can download the file here to follow along as we build the line chart. 

It might be that I’ve spent too much time in Power BI but creating charts in Excel seems a lot harder than it needs to be. This is especially true once you throw some unused columns into the mix.

Comments closed

SSIS — RPC Server is Unavailable

Jon Morisi does some troubleshooting:

I just spent a long slog sorting out why I could not connect to my SSIS instance remotely.  I work in a very secure environment requiring network approval for any and all ports.  According to the following article, I was under the impression that a request to open incoming traffic on port 135, to a specific IP, would allow SQL Server Management Studio, on that specific IP, to connect remotely to SSIS:

https://docs.microsoft.com/en-us/sql/sql-server/install/configure-the-windows-firewall-to-allow-sql-server-access?redirectedfrom=MSDN&view=sql-server-ver16#BKMK_ssis

After opening port 135, I was receiving the error message in the title of this article:

If you find yourself in this situation, read on to see how Jon was able to solve the problem.

Comments closed