
Day: October 23, 2023

Creating Pareto Charts in R with qcc

Steven Sanderson builds a Pareto chart:

A Pareto chart is a type of bar chart that shows the frequency of different categories in a dataset, ordered by frequency from highest to lowest. It is often used to identify the most common problems or causes of a problem, so that resources can be focused on addressing them.

To create a Pareto chart in R, we can use the qcc package. The qcc package provides a number of functions for quality control, including the pareto.chart() function for creating Pareto charts.
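As a rough sketch of what this looks like in practice, the snippet below builds the ordered frequencies and cumulative percentages by hand, then hands the same data to pareto.chart(). The defect categories and counts are invented for illustration and do not come from the linked post; the qcc call is guarded so the rest still runs if the package is not installed.

```r
# Hypothetical defect counts by category (illustrative data only)
defects <- c(Scratch = 42, Dent = 27, Misalignment = 15,
             Discoloration = 9, Other = 7)

# Order from most to least frequent and compute cumulative percentages,
# which is the information a Pareto chart displays
ordered <- sort(defects, decreasing = TRUE)
cum_pct <- cumsum(ordered) / sum(ordered) * 100
print(data.frame(count = ordered, cum_pct = cum_pct))

# Draw the chart with qcc, if the package is available
if (requireNamespace("qcc", quietly = TRUE)) {
  qcc::pareto.chart(defects, ylab = "Defect count", main = "Defects by type")
}
```

With these made-up numbers, the top two categories account for 69% of all defects, which is the kind of concentration a Pareto chart is meant to expose.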

Manufacturing companies love Pareto charts


Several Useful R Functions

Maëlle Salmon shows off four useful R functions:

Recently I caught myself using which(grepl(...)),

animals <- c("cat", "bird", "dog", "fish")
which(grepl("i", animals))
#> [1] 2 4

when the simpler alternative is

animals <- c("cat", "bird", "dog", "fish")
grep("i", animals)
#> [1] 2 4

Read on for another example of using grep() instead of grepl(), as well as three other functions you might want to keep in mind. H/T R-Bloggers.
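A closely related idiom, which may or may not be the one the post covers, is returning the matching values rather than their positions. This is a minimal sketch using the same animals vector; the value = TRUE argument is part of base R's grep().

```r
animals <- c("cat", "bird", "dog", "fish")

# The two index-returning forms give the same result
identical(which(grepl("i", animals)), grep("i", animals))
#> [1] TRUE

# Subsetting with a logical vector from grepl() ...
by_grepl <- animals[grepl("i", animals)]

# ... versus asking grep() for the matching values directly
by_grep <- grep("i", animals, value = TRUE)
by_grep
#> [1] "bird" "fish"
```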


Exploring Poker Hands in R

Benjamin Smith sorts and deals:

Recently, I have been reading “Mathematical Statistics” by Professor Keith Knight and I noticed an interesting passage he mentions when discussing finite sample spaces:

*In some cases, it may be possible to enumerate all possible outcomes, but in general such enumeration is physically impossible; for example, enumerating all possible 5 card poker hands dealt from a deck of 52 cards would take several months under the most favourable conditions.* (Knight 2000)

While this quote is taken out of context, with the advent of modern computing this is a task which is definitely possible to do computationally!

Click through to see how you can do this in R, at least for 5-card stud. 5-card draw would have the same number of final combinations, though if you also track intermediary combinations, it would grow rather considerably.
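To get a feel for the scale involved, here is a minimal base R sketch. It is not the approach from the linked post: it simply numbers the cards 1 to 52 and assumes, for illustration, that cards 1-13 share one suit, 14-26 the next, and so on. Building the full matrix takes a few seconds and roughly 50 MB of memory.

```r
# Number of distinct 5-card hands from a 52-card deck
n_hands <- choose(52, 5)
n_hands
#> [1] 2598960

# Enumerate every hand: each column of the result is one hand
hands <- combn(52L, 5L)
dim(hands)

# Assumed encoding: suit index 0-3 derived from the card number
suits <- (hands - 1L) %/% 13L

# A hand is a flush (straight flushes included) when all five suits match
is_flush <- suits[1, ] == suits[2, ] & suits[2, ] == suits[3, ] &
  suits[3, ] == suits[4, ] & suits[4, ] == suits[5, ]
sum(is_flush)
#> [1] 5148
```

The flush count agrees with the closed-form value 4 * choose(13, 5), which is a quick sanity check on the enumeration.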


Microsoft Fabric’s Reflex as Watchdog

Tom Martens brings home a junkyard dog:

Reflex is many things besides being one of the workloads of Microsoft Fabric. Before I delve into these things in more detail in later articles (yes, maybe this is the birth of another series of articles), I want to say this: Reflex is cool. It has never been this simple to watch the data in your Power BI datasets (and this is only one of the capabilities of Reflex).

Because I need images whenever I try to understand things, I start with a simple image of Reflex: I consider Reflex a watchdog! Reflex is watching something and alerts me or someone else when something happens – a defined condition is met.

Read on for an example of how this works using a real dataset.


Postgres Performance Tuning via work_mem

Salman Ahmed explains what working memory is in Postgres and the effects of changing the work_mem value:

PostgreSQL, by default, is configured to run everywhere with minimum resource utilization. To achieve maximum performance under specific scenarios, PostgreSQL’s parameters can be tuned to enhance performance. One such parameter that can impact performance in PostgreSQL is work_mem.

In this blog we will discuss how work_mem can be used to optimize performance in PostgreSQL.

Click through for that discussion.


Debugging SQLPackage Issues in PowerShell

Jose Manuel Jurado Diaz simplifies SQLPackage output:

Handling massive SQLPackage diagnostic logs, like those spanning over 4 million rows, can be an overwhelming task when troubleshooting support cases. This article introduces a PowerShell script designed to efficiently parse through SQLPackage diagnostic logs, extract error messages, and save them to a separate file, thus simplifying the review process and enhancing the debugging experience.

Click through for a PowerShell script that can help.
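The general idea (read a log, keep only the error lines, write them to a separate file) can be sketched in a few lines of R to match the other examples on this page. This is not the author's PowerShell script, and the sample log lines below are hypothetical; real SQLPackage diagnostic output is formatted differently and would likely need a more specific pattern.

```r
# Hypothetical log lines standing in for a real diagnostic log
log_lines <- c(
  "2023-10-23 10:00:01 Information: Starting export",
  "2023-10-23 10:00:05 Error: Login failed for user 'app'",
  "2023-10-23 10:00:06 Verbose: Retrying connection",
  "2023-10-23 10:00:09 Error: Timeout expired"
)
log_path <- tempfile(fileext = ".log")
writeLines(log_lines, log_path)

# Read the log and keep only lines that mention an error (case-insensitive)
all_lines <- readLines(log_path)
error_lines <- grep("error", all_lines, ignore.case = TRUE, value = TRUE)

# Save the matches to a separate file for easier review
error_path <- tempfile(fileext = ".txt")
writeLines(error_lines, error_path)
length(error_lines)
```

For a log with millions of rows, reading the whole file into memory with readLines() may not be practical, and processing it in chunks through a file connection would be the safer choice.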


Building Diagrams in Mermaid

Michael Bourgon tries out Mermaid:

Just found out about this the past month. 

I like diagrams for my documentation, and I detest making them. I also would like to build them via script, since that’s more useful.

I used Mermaid to create a series of architectural diagrams a couple years back. It was a reasonably good experience, although you have to keep in mind that you don’t get pixel-perfect designs and certain concepts can be difficult to represent. Even so, it’s quite alright for straightforward diagrams and includes support for icon sets for a variety of cloud and on-premises environments.


Caching: In-Database and External

Adron Hall talks caches:

All aboard the Data Express! Let’s imagine our database as this massive train station. The trains are packed with information – from passengers’ details to the schedules. Every time you want to know when the next train to DevOps Land is, you have to ask the station master (the database). If too many folks keep asking the same question, the station master will get tired, slowing down the whole operation. So, what do we do? Enter: Caching!

Read on for different caching mechanisms in several major relational databases, various reasons for external caches (like Redis and memcached) to exist, and four patterns for external caching. I’ve found that database people tend not to care much about external caches, leaving that to application developers. But there can be good reasons to store high-read, low-write data in caches, reducing some of the strain on those expensive database servers.
