Shiny is an R package that makes it easier to build interactive web apps straight from R. Back in July 2022 at rstudio::conf(2022), Posit (formerly RStudio) announced the release of Shiny for Python. As someone who knows Python but hasn’t written any Python code for quite a long time, I wanted to see how the two compared. So I did the only logical thing and built a Shiny app – twice!
After building (almost) identical Shiny apps, with one built solely in R and the other solely in Python, I’ve written this blog post to take you through some of the things that are the same, and a few things that are slightly different.
Note: at the time of writing, Shiny for Python is still in alpha, so if you’re reading this blog quite a while after it was first published, some things may have changed.
Day: January 23, 2023
An aspiring data engineer recently reached out to me for some guidance on pivoting into the field from a software development background. The questions they asked are similar to what others have asked me in the past, so I decided to capture my responses here. I link to prior posts and other resources when possible to try and keep the responses brief. These are informal thoughts of mine, not something I have sat down to rethink and research for new ideas beyond what is already in my head.
Dustin is one of the best people to talk to about data engineering. Click through for his advice.
This invaluable framework provides clear guidance on the recommended practices to assess, architect and migrate Oracle workloads to the Azure cloud. This should be the first place for answers to success for Oracle on Azure!
A special thanks to my teammate, Jessica Haessler, for working so hard to help me get this to the finish line, as I would never have been able to get this done on my own!
Click through for a link to the guide. There isn’t a Well-Architected Framework assessment for this yet, but the WAF articles themselves go into quite a bit of detail.
In this post we look at a method using Extended Events (XE) to identify what parent objects are calling a given SQL function and how often.
The background is that I was working with a team where we identified that a certain scalar function was being executed billions of times a day and, although lightweight for a single execution, overall it was consuming significant CPU on the server. We discussed a way of improving things, but it required changing the code that called it. The problem was that the function was used in about 700 different places across the database code, in both stored procedures and views, though the views themselves would then be referenced by other stored procedures. Rather than update all the code, they wanted to first target the objects that execute the function most often.
Read on to see how Matthew did it, as well as some caveats along the way.
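The prioritization step described above (find the callers that execute the function most often) can be sketched in a few lines of Python. This is only an illustration, not Matthew's method: it assumes the Extended Events output has already been exported into simple records, and the `calling_object` field and the object names below are made up for the example.

```python
from collections import Counter

# Hypothetical sample of captured events. Each record names the object
# (stored procedure or view) that called the function. The field name
# "calling_object" and the object names are assumptions for illustration.
captured_events = [
    {"calling_object": "dbo.usp_LoadOrders"},
    {"calling_object": "dbo.vw_CustomerSummary"},
    {"calling_object": "dbo.usp_LoadOrders"},
    {"calling_object": "dbo.usp_NightlyRollup"},
    {"calling_object": "dbo.usp_LoadOrders"},
    {"calling_object": "dbo.vw_CustomerSummary"},
]


def rank_callers(events):
    """Return (object name, call count) pairs, most frequent caller first."""
    counts = Counter(event["calling_object"] for event in events)
    return counts.most_common()


ranked = rank_callers(captured_events)
for name, calls in ranked:
    print(f"{name}: {calls}")
```

With a ranking like this in hand, the objects at the top of the list are the first candidates for a code change.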
This is a topic that I’ve written about previously, but it’s so important that it warrants revisiting from time to time.
Let’s have a chat about what privileges a Power BI administrator has with respect to accessing metadata and data throughout the Power BI tenant.
Click through for the answer.
Recall that in PostgreSQL both users and groups are technically roles. These are always created at the cluster level and granted privileges to databases and other objects therein. Depending on your database background it may surprise you that roles aren’t created as a principal inside of each database. For now, just remember that roles (users and groups) are created as a cluster principal that (may) own objects in a database, and owning an object provides additional privileges, something we’ll explore later in the article.
For the purposes of this article, all example user roles will be created with password authentication. Other authentication methods are available, including GSSAPI, SSPI, Kerberos, Certificate, and others. However, setting up these alternative methods is beyond what we need to discuss object ownership and privileges.
Read the whole thing if you’re doing anything with Postgres.
There are two ways to read data inside Data Lake using the Synapse Serverless engine. In this article, we’ll look at the first method which uses OPENROWSET to query a path within the lake.
Synapse is a collection of tools with four different analytical engines (Dedicated Pool, Spark Pool, Serverless Pool, Data Explorer Pool). This gives you a lot of options for ingesting, transforming, storing, and querying your data. The article will focus on how you can use the Synapse Serverless Pool to query the data in your ADLS account.
Click through for a primer on the topic, as well as a demo video.
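To give a feel for the shape of an OPENROWSET query against the lake, here is a minimal Python sketch that only builds the T-SQL text. It is not taken from the linked article: the helper name `build_openrowset_query` is hypothetical, the storage URL is a placeholder, and it assumes the files are Parquet.

```python
def build_openrowset_query(storage_url: str, top: int = 10) -> str:
    """Build a T-SQL query that reads Parquet files under storage_url.

    Hypothetical helper for illustration; it only produces the query
    text and does not connect to Synapse.
    """
    if top < 1:
        raise ValueError("top must be a positive integer")
    # Double any single quotes so the path stays inside the SQL string literal.
    safe_url = storage_url.replace("'", "''")
    return (
        f"SELECT TOP {top} *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{safe_url}',\n"
        "    FORMAT = 'PARQUET'\n"
        ") AS [result];"
    )


# Placeholder path: substitute your own storage account, container and folder.
example_url = "https://<storage-account>.dfs.core.windows.net/<container>/<folder>/*.parquet"
query = build_openrowset_query(example_url, top=5)
print(query)
```

The printed statement can be pasted into a Serverless SQL script once the placeholders are replaced with a real path.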
When using Power BI Premium or Premium Per User, you get the option to back up the database. There can be occasions when you try to restore the backup and it fails.
The restore could fail because it can consume additional memory, which would take you over the memory limit.
Below I will explain a new option which allows this to restore successfully!
Gilbert includes a copy of the error message and one new option in the post.