Press "Enter" to skip to content

Curated SQL Posts

Rounded Corners in Power BI Error Bars

Mara Pereira trims off the edges:

If you’ve been following my posts, you probably know by now that one of the things that really annoys me in Power BI is not being able to build bar charts with rounded corners, unless you use a custom visual.

However, recently I found out about a trick to make this happen without needing to use custom visuals (special shout out to Federico Pastor who shared this trick with me!).

Read on to see how you can do this. It’s definitely not ideal, but if you absolutely need rounded corners for your bar chart, you can get them this way.

Comments closed

Poisson Hidden Markov Models in SAS

Ji Shen shows off how to perform discrete time series in SAS:

The HMM procedure in SAS Viya supports hidden Markov models (HMMs) and other models embedded with HMM. PROC HMM supports finite HMM, Poisson HMM, Gaussian HMM, Gaussian mixture HMM, the regime-switching regression model, and the regime-switching autoregression model. This post introduces Poisson HMM, the latest addition to PROC HMM in the SAS Viya 2023.03 release.

Count time series is ill-suited for most traditional time series analysis techniques, which assume that the time series values are continuously distributed. This can present unique challenges for organizations that need to model and forecast them. As a popular discrete probability distribution to handle the count time series, the Poisson distribution or the mixed Poisson distribution might not always be suitable. This is because both assume that the events occur independently of each other and at a constant rate. In time series data, however, the occurrence of an event at one point in time might be related to the occurrence of an event at another point in time, and the rates at which events occur might vary over time.

HMM is a valuable tool that can handle overdispersion and serial dependence in the data. This makes it an effective solution for modeling and forecasting count time series. We will explain how the Poisson HMM can handle count time series by modeling different states by using distinct Poisson distributions while considering the probability of transitioning between them.

Read on for an overview of Hidden Markov Models (in general and the Poisson variation in particular) and some of the challenges you can run into when performing this test.

Comments closed

An Introduction to ggflowchart

Nicola Rennie shows off a new package:

Flowcharts can be a useful way to visualise complex processes. However, I couldn’t find an easy way to create a flowchart in R. There are a few packages for either drawing basic components of flowcharts (like {grid}), packages that are great for visualising complex network data where order doesn’t really matter (like {ggnetwork} and {igraph}), but none of them gave me the control over customisation I was used to with {ggplot2}.

{ggflowchart} tries to fill that gap. The aim of {ggflowchart} is to help R users make simple, good-looking flowcharts, with as little code as possible. It computes a layout, then uses existing {ggplot2} functions to stitch together rectangles, text, and arrows.

It does remind me a bit of Mermaid, though quite early in the process. H/T R-Bloggers.

Comments closed

Databricks SQL in VSCode

Falek Miah tries out an extension:

Recently, I had the opportunity to explore the Databricks SQL extension for VSCode, and I was thoroughly impressed.

In December 2022, Databricks launched the Databricks Driver for SQLTools extension, and although it is still in preview, the features are already good and useful.

For data analysts, report developers and data engineers, having the ability to execute SQL queries against Databricks workspace objects is crucial for streamlining workflows and making data analysis activities much more efficient and quicker. The Databricks SQL extension for VSCode provides just that, with a simple and intuitive interface, this extension makes it easy to connect to Databricks workspace and run SQL queries directly from VSCode.

Click through for Falek’s thoughts. And if Databricks SQL is brand new to you, Falek also has a primer on it.

Comments closed

Variable Sharing in Polyglot Notebooks

Matt Eland performs a few swaps:

While Polyglot Notebooks certainly brings the dream of notebook development to dotnet, Polyglot is at its finest when you work with one language and then hand off data to the next language for additional processing.

In this article we’ll talk about sharing variables between kernels using Polyglot Notebooks and VS Code. We’ll explore the syntax and tooling that exists around these functionalities as well as the current limitation of sharing variables between kernels.

For simplicity, I’m going to avoid getting into SQL and KQL kernels in this article, but I plan on delving further into each of these specialized kernels in future articles.

Click through for an example using the best .NET language, as well as C#. Do read the whole thing, especially if you think about passing around discriminated unions or method-reach objects.

Comments closed

Dashboard for Azure Cost Management

Saira Shaik digs into the numbers:

I created this Dashboard to display the units consumed and the respective cost for Selected filters.

This Dashboard will be helpful for any Azure Customer, irrespective of their contract agreement with Microsoft. 

Customers can view their usage and cost by uploading the Usage files into this Power BI file. You can download .pbit file and setup instructions from Setup 

I’m actually going through a cost-reducing exercise with a client right now, so this is potentially quite helpful.

Comments closed

Automatic Refresh in Power BI

Chris Webb does some ‘splainin’:

One of the most confusing things about troubleshooting Power BI refresh problems is the way Power BI will sometimes try running a refresh again after it has failed. It means that refreshes seem to take a lot longer than you would expect and also that more queries are run against your data source than you would expect. I don’t have all the answers but I thought it would be useful to highlight a few scenarios where it does happen.

Read on for two scenarios in which you might find this.

Comments closed

Postgres Vacuum and Auto-Vacuum

Henrietta Dombrovskaya covers a couple of background processes:

Welcome to the second blog of the “magic of parameters” series. In the first entry, I covered memory parameters, and in this article. In this article will talk about PostgreSQL configuration parameters which manage the (auto)vacuum and (auto)analyze background processes.

Read on for a deeper dive into vacuuming and a bit on auto-analyze as well.

Comments closed

The Two File Formats for Polyglot Notebooks

Matt Eland chooses a file extension:

I’ve been talking more and more about Polyglot Notebooks and as people try it out, they tend to ask me one common question: should I create a .dib file or an .ipynb file? What’s the difference anyways?

In this short article I’m going to explore the .dib and .ipynb file formats and explain the difference between the two while answering the question of which one you should choose when creating your own notebooks in Polyglot Notebooks.

Read on for Matt’s thoughts. My tendency is to create them in .ipynb format for additional tooling support and potential cross-product flexibility (assuming you have the right kernels installed on your Jupyter server), though Matt explains his preference for .dib.

Comments closed

Feature Branching and Hotfixes for Azure DevOps

Vytas Suopys covers a bit of source control strategy:

Have you ever deployed a release to production only to find out a bug has escaped your testing process and now users are being severely impacted? In this post, I’ll discuss how to deploy a fix from your development Synapse Workspace into a production Synapse Workspace without adversely affecting ongoing development projects.

This example uses Azure DevOps for CICD along with a Synapse extension for Azure DevOps: Synapse Workspace Deployment. In this example, I assume Synapse is already configured for source control with Azure DevOps Git and Build and Release pipelines are already defined in Azure DevOps. Instructions on how to apply this this can be found in the Azure Synapse documentation for continuous integration and delivery.

The specific example covers Synapse, though the general principle applies no matter what you’re deploying.

Comments closed