Press "Enter" to skip to content

Day: May 15, 2023

Poisson Hidden Markov Models in SAS

Ji Shen shows off how to perform discrete time series in SAS:

The HMM procedure in SAS Viya supports hidden Markov models (HMMs) and other models embedded with HMM. PROC HMM supports finite HMM, Poisson HMM, Gaussian HMM, Gaussian mixture HMM, the regime-switching regression model, and the regime-switching autoregression model. This post introduces Poisson HMM, the latest addition to PROC HMM in the SAS Viya 2023.03 release.

Count time series is ill-suited for most traditional time series analysis techniques, which assume that the time series values are continuously distributed. This can present unique challenges for organizations that need to model and forecast them. As a popular discrete probability distribution to handle the count time series, the Poisson distribution or the mixed Poisson distribution might not always be suitable. This is because both assume that the events occur independently of each other and at a constant rate. In time series data, however, the occurrence of an event at one point in time might be related to the occurrence of an event at another point in time, and the rates at which events occur might vary over time.

HMM is a valuable tool that can handle overdispersion and serial dependence in the data. This makes it an effective solution for modeling and forecasting count time series. We will explain how the Poisson HMM can handle count time series by modeling different states by using distinct Poisson distributions while considering the probability of transitioning between them.

Read on for an overview of Hidden Markov Models (in general and the Poisson variation in particular) and some of the challenges you can run into when performing this test.

Comments closed

An Introduction to ggflowchart

Nicola Rennie shows off a new package:

Flowcharts can be a useful way to visualise complex processes. However, I couldn’t find an easy way to create a flowchart in R. There are a few packages for either drawing basic components of flowcharts (like {grid}), packages that are great for visualising complex network data where order doesn’t really matter (like {ggnetwork} and {igraph}), but none of them gave me the control over customisation I was used to with {ggplot2}.

{ggflowchart} tries to fill that gap. The aim of {ggflowchart} is to help R users make simple, good-looking flowcharts, with as little code as possible. It computes a layout, then uses existing {ggplot2} functions to stitch together rectangles, text, and arrows.

It does remind me a bit of Mermaid, though quite early in the process. H/T R-Bloggers.

Comments closed

Databricks SQL in VSCode

Falek Miah tries out an extension:

Recently, I had the opportunity to explore the Databricks SQL extension for VSCode, and I was thoroughly impressed.

In December 2022, Databricks launched the Databricks Driver for SQLTools extension, and although it is still in preview, the features are already good and useful.

For data analysts, report developers and data engineers, having the ability to execute SQL queries against Databricks workspace objects is crucial for streamlining workflows and making data analysis activities much more efficient and quicker. The Databricks SQL extension for VSCode provides just that, with a simple and intuitive interface, this extension makes it easy to connect to Databricks workspace and run SQL queries directly from VSCode.

Click through for Falek’s thoughts. And if Databricks SQL is brand new to you, Falek also has a primer on it.

Comments closed

Variable Sharing in Polyglot Notebooks

Matt Eland performs a few swaps:

While Polyglot Notebooks certainly brings the dream of notebook development to dotnet, Polyglot is at its finest when you work with one language and then hand off data to the next language for additional processing.

In this article we’ll talk about sharing variables between kernels using Polyglot Notebooks and VS Code. We’ll explore the syntax and tooling that exists around these functionalities as well as the current limitation of sharing variables between kernels.

For simplicity, I’m going to avoid getting into SQL and KQL kernels in this article, but I plan on delving further into each of these specialized kernels in future articles.

Click through for an example using the best .NET language, as well as C#. Do read the whole thing, especially if you think about passing around discriminated unions or method-reach objects.

Comments closed

Dashboard for Azure Cost Management

Saira Shaik digs into the numbers:

I created this Dashboard to display the units consumed and the respective cost for Selected filters.

This Dashboard will be helpful for any Azure Customer, irrespective of their contract agreement with Microsoft. 

Customers can view their usage and cost by uploading the Usage files into this Power BI file. You can download .pbit file and setup instructions from Setup 

I’m actually going through a cost-reducing exercise with a client right now, so this is potentially quite helpful.

Comments closed

Automatic Refresh in Power BI

Chris Webb does some ‘splainin’:

One of the most confusing things about troubleshooting Power BI refresh problems is the way Power BI will sometimes try running a refresh again after it has failed. It means that refreshes seem to take a lot longer than you would expect and also that more queries are run against your data source than you would expect. I don’t have all the answers but I thought it would be useful to highlight a few scenarios where it does happen.

Read on for two scenarios in which you might find this.

Comments closed