Press "Enter" to skip to content

Category: Learning

API Servers and the Importance of Learning

Steve Jones tells a story:

While talking with a client recently about their performance challenges, I was relieved to find that the database wasn’t the problem. Instead, their API server was overloaded by the number of calls taking place in their application. While the database did provide the backing for the API calls, there was a fair amount of caching. However, as they’d moved to microservices, more and more of the interaction between modules was taking place as a network call to a single server, which became overloaded.

Steve goes on to make a broader point about people freely donating their time and expertise to explain how to solve problems. The situation above is also a major pitfall of moving to microservices: everything gets several times chattier. The biggest tricks I have there are to embrace asynchronous processing via queues and to keep the messages passed back and forth as small as possible, which means getting rid of the idea of passing big lists of fully-hydrated objects around.
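
To make that concrete, here's a minimal sketch of the idea. The OrderPlaced message and the in-memory queue are stand-ins I made up for illustration; in a real system you'd publish to a broker like Kafka, RabbitMQ, or Azure Service Bus. The point is that the producer fires off a small message of identifiers instead of making a synchronous call with a fully-hydrated object graph, and the consumer rehydrates only what it needs.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class LeanMessagingSketch {

    // A fully hydrated order would drag along the customer, line items,
    // addresses, and so on. The message we enqueue carries only the
    // identifiers the consumer needs to look the rest up on its side.
    record OrderPlaced(long orderId, long customerId) { }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a real message broker.
        BlockingQueue<OrderPlaced> queue = new LinkedBlockingQueue<>();

        // Producer: enqueue and move on instead of blocking on an API call
        // to another service.
        queue.put(new OrderPlaced(42L, 7L));

        // Consumer: pick up the small message asynchronously and rehydrate
        // only the pieces it actually needs from its own store or cache.
        OrderPlaced msg = queue.take();
        System.out.printf("Processing order %d for customer %d%n",
                msg.orderId(), msg.customerId());
    }
}
```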

SQL Saturday Orlando Notes

Andy Warren reflects on hosting the only in-person SQL Saturday in the United States this year:

We held an in-person SQLSaturday here in Orlando last weekend (Oct 30th). We didn’t organize one last year, there was just too much risk and too much uncertainty, so it felt good to return to something close to normal this year, even in scaled back fashion. I’ve got a lot of notes to share about how we ran the event this year!

The journey started at the end of 2020. We wrote up our plan for 2021 knowing there were a lot of unknowns, but hoping things would improve enough to resume doing the things we used to do as a local group and that included organizing a SQLSaturday. As this year has progressed attendance at our virtual meetings dropped, as did our enthusiasm for having them. Enthusiasm matters a lot when it comes to volunteer work and while I know many of you like the virtual format, it’s just not what I want to do. That narrowed the option list to having an in-person SQLSaturday or not doing one at all, not a great range of choices.

Read on for a lot of details. I appreciate how transparent Andy has always been with respect to running events like this, and if you're thinking about hosting a SQL Saturday in 2022, definitely read Andy's post.

Also, the event was small, but it was really nice to get to see people I hadn’t seen in years, so thank you, Andy, for putting on the show.

Eliminate the DeWitt Clause

Justin Olsson and Reynold Xin throw down the gauntlet:

At Databricks, we often use the phrase “the future is open” to refer to technology; it reflects our belief that open data architecture will win out and subsume proprietary ones (we just set a new official record on TPC-DS). But “open” isn’t just about code. It’s about how we as an industry operate and foster debate. Today, many companies in tech have tried to control the narrative on their products’ performance through a legal maneuver called the DeWitt Clause, which prevents comparative benchmarking. We think this practice is bad for customers and bad for innovation, and it’s time for it to go. That’s why we are removing the DeWitt Clause from our service terms, and calling upon the rest of the industry to follow.

One example of how you can tell if you’re influential is how many legal terms are named after you, which I’m pretty sure makes Dr. DeWitt the Steve Tasker of the database industry. So put David DeWitt in the Data Platform Hall of Fame.

And good of Databricks to eliminate their DeWitt Clause. Vendors ostensibly put the clause in to prevent rigged or invalid comparisons between products, but there's a much better way to do that: publish the benchmark configuration and allow peer validation. If you put out garbage numbers (including by accident, because you didn't know the right way to do something), people are smart enough to catch that. And if a vendor isn't willing to publish the process, call for them to do it, and if they still don't, ignore the results. 100 times out of 100, that's the right way to do it…assuming that you're looking for the truth and not just trying to hide inferiorities in your product *cough* Oracle *cough*.

Thinking like an Escalation Engineer

Stacy Gray shares stories:

“You new?” asked with an amused grin.

“Yes,” I replied floating 2 inches off the ground with a huge, toothy smile.

“Which team?”

“SQL!”

“Good luck.”

I glanced at the badge.  It was blue.  My opportunity to get some secret, inside wisdom!

“I want to become a blue badge.  Do you have any advice on that?” The elevator doors opened.

“Solve your own cases,” was the reply.

Read on for stories, advice, and more.

A Primer on Kafka Streams

Bill Bejeck has an introduction to Kafka Streams:

Kafka Streams is an abstraction over Apache Kafka® producers and consumers that lets you forget about low-level details and focus on processing your Kafka data. You could of course write your own code to process your data using the vanilla Kafka clients, but the Kafka Streams equivalent will have far fewer lines, because it’s declarative rather than imperative. As a library, Kafka Streams lets you create a standalone application that can be run anywhere that can connect to a Kafka broker, whether that’s a laptop or a hefty cloud server. You just need to provide it with the host and port name of a broker. Combining Kafka Streams with Confluent Cloud grants you even more processing power with very little code investment.

Click through for a description as well as a whole series of embedded videos.
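
To give a flavor of the "declarative rather than imperative" point, here's a minimal Kafka Streams sketch. The topic names are made up for illustration, and you'd point bootstrap.servers at your own broker; the topology just reads one topic, upper-cases each value, and writes to another.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo");
        // The one connection detail you provide: a broker host and port.
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Declarative topology: read from one topic, transform, write to another.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("raw-text");
        source.mapValues(value -> value.toUpperCase())
              .to("uppercased-text");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Shut the topology down cleanly when the application exits.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```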

Attorneys General Anti-Trust Suit against Google

This isn’t entirely data-related, but it’s fascinating to read the claims against Google. Here is the full text of the suit (with mild redactions). @PatrickMcGee_ dives into the details on Twitter (link goes to ThreadReaderApp), as does @fasterthanlime.

It’s important to keep in mind that these are allegations not yet proved, and Google’s lawyers will get their chance to respond. But, because Google’s not giving me any money to shill for them, I will say that this looks bad for them. Assuming these allegations are close to accurate, there’s some pretty blatant abuse of monopoly power.

Supporting 100 Languages with Microsoft Translator

Krishna Doss Mohan and Jann Skotdal take us through the evolution of Microsoft Translator:

Today, we’re excited to announce that Microsoft Translator has added 12 new languages and dialects to the growing repertoire of Microsoft Azure Cognitive Services Translator, bringing us to a total of 103 languages!

The new languages, which are natively spoken by 84.6 million people, are Bashkir, Dhivehi, Georgian, Kyrgyz, Macedonian, Mongolian (Cyrillic), Mongolian (Traditional), Tatar, Tibetan, Turkmen, Uyghur, and Uzbek (Latin). With this release, the Translator service can translate text and documents to and from languages natively spoken by 5.66 billion people worldwide.

I’ve used the live translation service a few times. It’s a little clunky but it does work pretty well.

Scaling Limitations with Site Reliability Engineering

Tyler Treat argues that the Site Reliability Engineering paradigm doesn’t scale:

We encounter a lot of organizations talking about or attempting to implement SRE as part of our consulting at Real Kinetic. We’ve even discussed and debated ourselves, ad nauseam, how we can apply it at our own product company, Witful. There’s a brief, unassuming section in the SRE book tucked away towards the tail end of chapter 32, “The Evolving SRE Engagement Model.” Between the SLIs and SLOs, the error budgets, alerting, and strategies for handling change management, it’s probably one of the most overlooked parts of the book. It’s also, in my opinion, one of the most important.

Read on for an explanation of this chapter and how it applies to organizations trying to implement SRE.

Calculating Lead Time from Jira and GitHub

Maria Zakourdaev wants to measure agility:

Do you want to visualize your RnD team performance to drive business value? Is there anything that is slowing down your development pipeline? How agile is your team? How long are your customers waiting for the features?

There are many things that can hold you back. Backlog management, code review delays, resources provisioning, manual testing and deployment automation efficiency. In this article I will show you my method of measuring one of the metrics described in this book called LeadTime.

Read on to see how you can do this.
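
As a rough sketch of the arithmetic involved (this is not Maria's implementation, and the hard-coded timestamps stand in for values you'd pull from the Jira and GitHub APIs), lead time boils down to the elapsed time between when the work item was created and when the change was merged or deployed:

```java
import java.time.Duration;
import java.time.Instant;

public class LeadTimeSketch {
    public static void main(String[] args) {
        // In practice these would come from the Jira REST API (the issue's
        // "created" field) and the GitHub API (pull request merge time or
        // deployment time).
        Instant jiraIssueCreated = Instant.parse("2021-10-01T09:15:00Z");
        Instant gitHubPrMerged   = Instant.parse("2021-10-12T17:40:00Z");

        // Lead time: how long the customer waited from request to shipped change.
        Duration leadTime = Duration.between(jiraIssueCreated, gitHubPrMerged);

        System.out.printf("Lead time: %d days, %d hours%n",
                leadTime.toDays(), leadTime.toHoursPart());
    }
}
```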
