Author: Kevin Feasel

Use Cases for Multiple Data Lakes

James Serra explains why you might want multiple data lakes in an organization:

A question I get asked frequently by customers when discussing data lake architecture is “Should I use one data lake for all my data, or multiple lakes?” Ideally, you would use just one data lake, but I have seen many valid use cases where customers are using multiple data lakes. Here are some of those reasons:

I’d quibble with a couple of these (and given James’s intro, I’m not sure he’s fully on board with all of the reasons) but this is a good list of reasons why you might see several data lakes in an organization.

SQLCMD Variables in Database Projects

Olivier Van Steenlandt can’t live in this static world:

When I started to explore and use Database Projects, I ran into a specific situation quite fast where I was required to use SQLCMD variables. In this blog post, I will describe what they are, how you can use SQLCMD variables in Database Projects and where this might become very useful for you.

Click through for a scenario, a primer on using SQLCMD variables, and some basic details on how to use them in database projects.
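For a rough feel of what a SQLCMD variable does, here is a small Python sketch of the substitution idea: a `$(Name)` token in a script is swapped for a value supplied at deployment time. This is a toy illustration, not how sqlcmd or database projects implement it; the function name, the `StagingDatabase` variable, and the sample query are all hypothetical, and the sketch does not reproduce sqlcmd's full rules.

```python
import re

# Matches SQLCMD-style tokens such as $(StagingDatabase).
SQLCMD_TOKEN = re.compile(r"\$\((\w+)\)")


def substitute_sqlcmd_variables(script: str, variables: dict[str, str]) -> str:
    """Replace each $(Name) token with its value; fail on undefined names."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            # Hypothetical behavior for this sketch: treat a missing value as an error.
            raise KeyError(f"SQLCMD variable not defined: {name}")
        return variables[name]

    return SQLCMD_TOKEN.sub(replace, script)


# Hypothetical script and variable value, for illustration only.
script = "SELECT CustomerID FROM [$(StagingDatabase)].dbo.Customer;"
resolved = substitute_sqlcmd_variables(script, {"StagingDatabase": "Staging_Dev"})
print(resolved)
```

The same script text can then point at a different database per environment simply by passing a different value for the variable.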

“No Healthy Upstream” Error in vCenter

Denny Cherry diagnoses a problem:

Over the weekend, I was configuring our new VMware servers. I was happily working around when all of a sudden, vCenter started showing the hated “no healthy upstream” message on the vCenter website.

Thankfully, this was not the first time I’d seen this happen, and it usually occurs randomly (at least in my experience). The solution is easier than most people would think.

Click through to learn what you should do if you see that error.

Q&A on Data Engineering

Dustin Vannoy talks to the mirror:

An aspiring data engineer recently reached out to me for some guidance on pivoting into the field from a software development background. The questions they asked are similar to what others have asked me in the past, so I decided to capture my responses here. I link to prior posts and other resources when possible to try and keep the responses brief. These are informal thoughts of mine, not something I have sat down to rethink and research for new ideas beyond what is already in my head.

Dustin is one of the best people to talk to about data engineering. Click through for his advice.

Building a Shiny App in R and Python

Nicola Rennie does a language throw-down:

Shiny is an R package that makes it easier to build interactive web apps straight from R. Back in July 2022 at rstudio::conf(2022), Posit (formerly RStudio) announced the release of Shiny for Python. As someone who knows Python but hasn’t written any Python code for quite a long time, I wanted to see how the two compared. So I did the only logical thing and built a Shiny app – twice!

After building (almost) identical Shiny apps, with one built solely in R and the other solely in Python, I’ve written this blog post to take you through some of the things that are the same, and a few things that are slightly different.

Note: at the time of writing Shiny for Python is still in alpha, so if you’re reading this blog quite a while after it was first published, some things may have changed.

The code, as you’d expect, looks quite similar. I also learned about plotnine, something I’ll need to keep in mind. H/T R-Bloggers.

Finding a Scalar Function Caller

Matthew McGiffen searches for the root of the problem:

In this post we look at a method using Extended Events (XE) to identify what parent objects are calling a given SQL function and how often.

The background is that I was working with a team where we identified that a certain scalar function was being executed billions of times a day and – although lightweight for a single execution – overall it was consuming significant CPU on the server. We discussed a way of improving things but it required changing the code that called it. The problem was that the function was used in about 700 different places across the database code – both in stored procedures and views – though the views themselves would then be referenced by other stored procedures. Rather than update all the code, they’d like to first target the objects that execute the function the most times.

Read on to see how Matthew did it, as well as some caveats along the way.

Well-Architected Framework for Oracle in Azure

Kellyn Pot’vin-Gorman has a new tool for us:

This invaluable framework provides clear guidance on the recommended practices to assess, architect and migrate Oracle workloads to the Azure cloud. This should be the first place for answers to success for Oracle on Azure!

A special thanks to my teammate, Jessica Haessler, for working so hard to help me get this to the finish line, as I would have never been able to get this done on my own!

Click through for a link to the guide. There isn’t a Well-Architected Framework assessment for this yet but the WAF articles themselves have quite a bit of detail to them.

Roles and Privileges in Postgres

Ryan Booz gives us an introduction to Postgres security:

Recall that in PostgreSQL both users and groups are technically roles. These are always created at the cluster level and granted privileges to databases and other objects therein. Depending on your database background it may surprise you that roles aren’t created as a principal inside of each database. For now, just remember that roles (users and groups) are created as a cluster principal that (may) own objects in a database, and owning an object provides additional privileges, something we’ll explore later in the article.

For the purposes of this article, all example user roles will be created with password authentication. Other authentication methods are available, including GSSAPI, SSPI, Kerberos, Certificate, and others. However, setting up these alternative methods is beyond what we need to discuss object ownership and privileges.

Read the whole thing if you’re doing anything with Postgres.
