
Author: Kevin Feasel

Functional Data Analysis in R

Joseph Ricker gives us a gentle introduction to a not-so-gentle topic:

This plot might depict 80 measurements for a participant in a clinical trial where each data point represents the change in the level of some protein. Or it could represent any series of longitudinal data where the measurements are taken at irregular intervals. The curve looks like a time series with obvious correlations among the points, but there are not enough measurements to model the data with the usual time series methods. In a scenario like this, you might find Functional Data Analysis (FDA) to be a viable alternative to the usual multi-level, mixed model approach.

This post is meant to be a “gentle” introduction to doing FDA with R for someone who is totally new to the subject. I’ll show some “first steps” code, but most of the post will be about providing background and motivation for looking into FDA. I will also point out some of the available resources that a newcomer to FDA should find helpful.

Read on to learn more.


Power BI Announcements at Microsoft Business Applications Summit

Gilbert Quevauvilliers has a long list of Power BI announcements:

As I have done each and every year, I have gone through the Power BI announcements at the Microsoft Business Applications Summit 2021 to give an overview of them all.

This year, once again, they have announced some incredible features that are either available now or coming soon, so please read below.

There are quite a few interesting features here. One that caught my eye was automatic aggregations for DirectQuery calculations, as that reminded me of MDX pre-calculations.


Doodles about the Storage Engine

Forrest McDaniel explains via image:

Paul Randal is a SQL Server legend with loads of informative articles. But when I was a baby DBA first reading Inside the Storage Engine, I got a little stuck. It took many passes before, eventually, finally, it clicked. I wish I had a lightweight introduction, so in the practice of paying it forward…

Here’s the starting point: sometimes it’s easier to manage lots of small things (say, the 1s and 0s of data) by grouping them into larger things. It’s the same reason you don’t buy rice by the grain.

Read on for that introduction to the storage engine.


Multi-Select Slicers in Power BI

Reza Rad simplifies multiple selection:

This is a very short, simple article about how to have a multi-select slicer in Power BI. The Power BI slicer is in fact multi-select by default. However, there is a very small option that, if you set it, makes the slicer even easier to use. Let’s talk about it. If you want to learn more about Power BI, read the Power BI book from Rookie to Rock Star.

Click through to see how you can perform multi-selection by default, as well as an alternative setting.


Tips for Improving Code Performance in R

Mira Celine Klein continues a series on code performance in R:

This is the second part of our series about code performance in R. It contains a lot of approaches to reduce the time your code needs to run. It’s useful to know those ideas before starting to write new code, but it also helps to optimize existing code.

If you have already written some code you want to speed up, but don’t know which part of it is actually slow, I recommend reading the first part of this series on profiling. That article also introduces the microbenchmark package, which we are going to use to measure code performance in this article.

Let’s start with a seemingly obvious rule, which is, however, not always easy to follow.

Read on for some tips. H/T R-bloggers.


Table Design in R with mmtable2

Matt Dancho walks through a package to make tables look great in R:

I love ggplot2 for plotting. The grammar of graphics allows us to add elements to plots. Tables seem to have been forgotten when it comes to an intuitive grammar with a tidy data philosophy, until now. mmtable2 aims to be the ggplot2 for tables, leveraging the awesome GT table package.

The mmtable2 package aims to make it easy to create tables by:

1. Using a ggplot2-style syntax for a grammar of table operations.

2. Extending the amazing GT table package.

Read on for the process and a demonstration.


Uncommenting XML from C#

Joy George Kunjikkur needs to remove some XML comment tags:

Requirement 

As part of the installation, some XML fragments (e.g., <authentication>) need to be uncommented in the web.config file based on the environment. This can be done via either PowerShell or C# (.NET), as it has to be triggered from the MSI installation and never during the runtime of the application.

Alternatives

We can either do string-based detection and replacement or use the .NET XML parser. Since string parsing is complex, let us stick with the .NET library.

Read on for one way to do this.
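
To make the parser-based alternative concrete, here is a minimal sketch of the same idea. It is written in Python with the standard library's xml.dom.minidom rather than the .NET library the post uses, so it is not the author's code. The function name uncomment_element is hypothetical, and the sketch assumes each comment to be restored contains exactly one well-formed element.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def uncomment_element(xml_text, tag_name):
    """Replace each comment that wraps a <tag_name> element with the element itself.

    Hypothetical helper for illustration; assumes one well-formed element per comment.
    """
    doc = minidom.parseString(xml_text)

    def walk(node):
        # Iterate over a copy because the child list changes during replacement.
        for child in list(node.childNodes):
            if child.nodeType == child.ELEMENT_NODE:
                walk(child)
                continue
            if child.nodeType != child.COMMENT_NODE:
                continue
            body = child.data.strip()
            if not body.startswith("<"):
                continue  # an ordinary prose comment, leave it alone
            try:
                fragment = minidom.parseString(body).documentElement
            except ExpatError:
                continue  # comment text is not well-formed XML
            if fragment.tagName == tag_name:
                # Copy the parsed element into this document, then swap it in.
                node.replaceChild(doc.importNode(fragment, True), child)

    walk(doc.documentElement)
    return doc.toxml()
```

Letting the parser decide what counts as a comment avoids the edge cases a string search would have to handle by hand, such as comments that merely mention the tag name in prose.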


Surviving a Kafka Outage

Jakub Korab walks us through availability features in Kafka as well as what to expect if your brokers are unavailable:

In the case of an outage, you have to ensure that these messages can be processed eventually. Keeping unsent messages around and retrying indefinitely in the hopes that the outage will rectify itself may eventually result in your application running out of memory. This is a crucial consideration in high-throughput applications.

If business functions are performed by systems downstream of Kafka, and the sending application only acts as an ingestion point, the situation is slightly more relaxed. If Kafka is unavailable to send messages to, then no external activity has taken place. For these systems, a Kafka outage might mean that you do not accept new transactions. In such a case, it may be reasonable to return an error message and allow the external third party to retry later. Retail applications typically fall into this category.

Read the whole thing.
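
The memory concern in the excerpt can be illustrated with a small sketch. This is not code from the article and it does not use any Kafka client library: BoundedSendBuffer and its send callable are hypothetical names, and the sketch assumes a failed send raises ConnectionError. The point is only that the buffer of unsent messages has a hard cap, so that a long outage leads to rejected requests instead of unbounded memory growth.

```python
from collections import deque


class BoundedSendBuffer:
    """Hold unsent messages up to a fixed limit while the broker is unavailable.

    Hypothetical illustration; `send` stands in for a real producer call.
    """

    def __init__(self, send, max_pending):
        self._send = send              # assumed to raise ConnectionError on failure
        self._pending = deque()
        self._max_pending = max_pending

    def _try_send(self, message):
        try:
            self._send(message)
            return True
        except ConnectionError:
            return False

    def flush(self):
        """Retry queued messages in order, stopping at the first failure."""
        while self._pending:
            if not self._try_send(self._pending[0]):
                break
            self._pending.popleft()

    def submit(self, message):
        """Return True if the message was sent or queued, False if rejected."""
        self.flush()
        # Queue behind any backlog to preserve order; otherwise try sending now.
        if self._pending or not self._try_send(message):
            if len(self._pending) >= self._max_pending:
                return False           # buffer full: caller should report an error
            self._pending.append(message)
        return True
```

A False return corresponds to the case the excerpt describes, where the application returns an error and lets the external party retry later.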


Defining an Ad Hoc Query

Kathi Kellenberger explains what it means to be an ad hoc query:

Someone recently asked me which queries are ad hoc in SQL Server. An ad hoc query is a single query not included in a stored procedure and not parameterized or prepared. Depending on the server settings, SQL Server can parameterize some statements initially written as ad hoc queries. Ad hoc doesn’t mean dynamic.

Next on the list, a post hoc ergo propter hoc query. That’s where I explain to the DBAs that just because the server goes down every time I run a query, it doesn’t mean my queries caused this.
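
To make the definition in the excerpt concrete, here is a small sketch contrasting the two query shapes. It uses Python's built-in sqlite3 module as a stand-in, since it runs anywhere; it is not SQL Server and says nothing about SQL Server's plan caching or parameterization settings. The orders table and its contents are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany("INSERT INTO orders (customer) VALUES (?)", [("Ann",), ("Bob",)])

# Ad hoc shape: a standalone statement with the literal value embedded in the text.
adhoc_rows = conn.execute("SELECT id FROM orders WHERE customer = 'Ann'").fetchall()

# Parameterized shape: the statement text stays fixed and the value is bound separately.
param_sql = "SELECT id FROM orders WHERE customer = ?"
param_rows = conn.execute(param_sql, ("Ann",)).fetchall()
```

Both calls return the same rows. The difference is only in how the statement text and its values reach the database, which is the distinction the definition draws.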
