
Category: Misc Languages

A Primer on Functional Programming

Anirban Shaw gives us the skinny:

In the ever-evolving landscape of software development, there exists a paradigm that has been gaining momentum and reshaping the way we approach coding challenges: functional programming.

In this article, we delve deep into the world of functional programming, exploring its advantages, core principles, origin, and reasons behind its growing traction.

I like this as an introduction to the topic, helping explain what functional programming languages are and why they’ve become much more interesting over the past 15-20 years. Anirban hits the topic of concurrency well, showing how a functional approach with immutable data makes it easy for multiple machines to work on separate parts of the problem independently and concurrently without error. I’d also add one more bit: functional programming languages tend to be more CPU-intensive than imperative languages, so in an era of strict computational scarcity, imperative languages dominated. With strides in computer processing, we tend to be CPU-bound less often, so the trade-off of some CPU for the benefits of FP makes a lot more sense. H/T R-Bloggers.
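To make the concurrency point concrete, here is a minimal Python sketch (not taken from Anirban's article). It assumes threads standing in for separate workers, and the names `sum_of_squares` and `split` are hypothetical. Because the input is an immutable tuple and the function is pure, each worker can process its own slice without locks or coordination.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def sum_of_squares(chunk: Tuple[int, ...]) -> int:
    """Pure function: the result depends only on the input, and nothing is mutated."""
    return sum(x * x for x in chunk)


def split(data: Tuple[int, ...], parts: int) -> Tuple[Tuple[int, ...], ...]:
    """Cut an immutable tuple into roughly equal, independent slices."""
    size = (len(data) + parts - 1) // parts
    return tuple(data[i:i + size] for i in range(0, len(data), size))


data = tuple(range(1, 101))  # immutable input shared by every worker
chunks = split(data, 4)

# Each worker sees only its own slice, so no locking is needed.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(sum_of_squares, chunks))

total = sum(partials)
print(total)
```

The partial results combine with a plain `sum`, and the answer is the same no matter which worker finishes first.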


Switch Statements and Expressions in C#

Hasan Savran points out the overloaded nature of switch in C# 8 and later:

It works great, but the break and case syntax gets duplicated; the new switch syntax gets rid of the case and break statements. Here is how this example looks using the new switch syntax.

Click through for Hasan’s demo. Basically, this is the difference between a statement and an expression. C#’s switch keyword has historically been a statement: given some input, perform an action but do not return an output. An action that changes state beyond the return value is known as a side effect, and it adds some mental overhead to the way we process things, especially as your methods get more complex and you have to keep track of more things in your mind at once.

By contrast, Hasan’s second example is switch as an expression, which is more in the F# style and an example of why I like to joke about how what you’ll find in C# vNext is what you got in F# two versions ago. An expression is an operation which takes an input and returns an output without performing any actions causing side effects along the way. This makes expressions easier to diagram and conceptualize than statements, though statements offer more flexibility, especially when you do want to take radically different actions depending on some given input.
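As a rough analogy in Python (this is not Hasan's C# demo, and both function names are hypothetical), the same lookup can be written in statement style, where each branch performs an assignment, or in expression style, where the whole construct evaluates to a value.

```python
def describe_statement(code: int) -> str:
    # Statement style: every branch performs an action (an assignment),
    # and the reader has to track the variable across branches.
    label = ""
    if code == 200:
        label = "OK"
    elif code == 404:
        label = "Not Found"
    else:
        label = "Unknown"
    return label


def describe_expression(code: int) -> str:
    # Expression style: one input goes in and one value comes out,
    # with no intermediate state to follow.
    return {200: "OK", 404: "Not Found"}.get(code, "Unknown")


for code in (200, 404, 500):
    print(code, describe_statement(code), describe_expression(code))
```

Both functions return the same labels; the difference is in how much state the reader has to keep in mind.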


A Primer on Regular Expressions

Steven Sanderson provides a quick guide to regular expressions:

Regular expressions, often abbreviated as regex, are powerful tools used in programming to match and manipulate text patterns. While they might seem intimidating at first, regular expressions are incredibly useful for tasks like data validation, text parsing, and pattern matching. In this blog post, we’ll explore regular expressions in the context of R programming, breaking down the concepts step by step and providing practical examples along the way. By the end, you’ll have a solid understanding of regular expressions and be ready to apply them to your own projects.

This is an extremely powerful language which can take years (decades?) to master, especially considering that there are several regular expression syntaxes and they don’t all behave the same way. But still, I’ve found that the more familiar you are with regular expressions, the simpler certain classes of problem become.
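For a taste of the three tasks Steven mentions (validation, parsing, and pattern matching), here is a small sketch using Python's `re` module rather than R, so the syntax details may differ from what the linked post shows. The date pattern and sample text are assumptions made up for illustration.

```python
import re

# Hypothetical pattern for ISO-style dates such as 2023-07-14.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

text = "Released 2023-07-14, patched 2023-08-02."

# Parsing: pull every date out of free text as (year, month, day) tuples.
dates = DATE.findall(text)

# Validation: fullmatch succeeds only if the entire string fits the pattern.
is_valid = DATE.fullmatch("2023-07-14") is not None
is_invalid = DATE.fullmatch("2023-7-14") is None

# Manipulation: reorder the captured groups into day/month/year.
rewritten = DATE.sub(r"\g<day>/\g<month>/\g<year>", text)

print(dates)
print(is_valid, is_invalid)
print(rewritten)
```

Note that this pattern checks only the shape of a date, not whether the month or day is in range.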


Adding Count to a Grouped DataFrame in Spark

The Big Data in Real World team does some counting:

We want to group the dataset by Name and get a count to see the employee and the number of projects they are assigned to. In addition to that sub count, we also want to add a column with a total count like below.

One important thing to remember about Spark transformations is that they’re lazy: just because you ran df.groupBy(...).agg(...) doesn’t mean the new DataFrame exists yet, so until you call the show() action (or whatever), the original data is still there for the taking, which is how you can reference it again later in the chained statement.
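To show the shape of the result being described (a per-name count next to an overall total), here is a plain-Python sketch. It does not use the Spark API or the linked post's data; the rows below are hypothetical.

```python
from collections import Counter

# Hypothetical (name, project) assignments, one row per assignment.
rows = [
    ("Ann", "Apollo"),
    ("Ann", "Borealis"),
    ("Raj", "Apollo"),
    ("Raj", "Cobalt"),
    ("Raj", "Delta"),
    ("Mei", "Borealis"),
]

# Sub count: number of projects per employee.
per_name = Counter(name for name, _ in rows)

# Total count: computed from the original rows, which the grouping left untouched.
total = len(rows)

# Each output row carries both the group's count and the overall total.
result = [(name, count, total) for name, count in sorted(per_name.items())]

for row in result:
    print(row)
```

The key detail mirrors the commentary above: the total comes from the original, ungrouped data, which is still available after the grouping step.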


Analyzing Big-O Notation in Polyglot Notebooks

Matt Eland brings me back to college:

Polyglot Notebooks is a great way of running interactive code experiments mixed together with rich markdown documentation.

In this short article I want to introduce you to the #!time magic command and show you how you can easily measure the execution time of a block of code.

This can be helpful for understanding the rough performance characteristics of a block of code inside of your Polyglot Notebook.

In fact, we’ll use this to explore the programming concepts behind Big O notation and how code performance changes based on the number of items.

I like this for two reasons. First, because a visual indicator of Big-O notation is helpful for students learning about the topic. Second, because that’s not the only thing you can do with the #!time magic.
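The same experiment can be sketched outside of a notebook. The Python below is an assumption-laden stand-in for Matt's C# cells: it uses `time.perf_counter` instead of the #!time magic, and the function names are hypothetical. Each function returns its step count so the growth rate is visible even when wall-clock timings are noisy.

```python
import time


def linear_steps(n: int) -> int:
    """O(n): one unit of work per item."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def quadratic_steps(n: int) -> int:
    """O(n^2): one unit of work per pair of items."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


def timed(fn, n: int):
    """Return (steps, elapsed seconds) for a single call of fn(n)."""
    start = time.perf_counter()
    steps = fn(n)
    return steps, time.perf_counter() - start


for n in (100, 200, 400):
    lin_steps, lin_secs = timed(linear_steps, n)
    quad_steps, quad_secs = timed(quadratic_steps, n)
    print(f"n={n}: linear {lin_steps} steps ({lin_secs:.6f}s), "
          f"quadratic {quad_steps} steps ({quad_secs:.6f}s)")
```

Doubling `n` doubles the linear step count but quadruples the quadratic one, which is the pattern a timing chart makes visible.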


Importing Code into Polyglot Notebooks

Matt Eland brings some code to the party:

We’ve seen that Polyglot Notebooks allow you to mix together markdown and code (including C# code) in an interactive notebook and these notebooks allow you to share data between cells and between languages. However, frequently in programming you want to reference code that others have written without having to redefine everything yourself.

In this article we’ll explore how Polyglot Notebooks allows you to import dotnet code from stand-alone files, DLLs, and NuGet packages so your notebooks can take advantage of external code files and the same libraries that you can work with from your code in Visual Studio.

The syntax, by the way, is very similar to the F# Interactive (and the short-lived C# Interactive) tool, particularly #i and #r.


SandDance in Polyglot Notebooks

Matt Eland continues a series on dotnet Polyglot Notebooks:

As I’ve been doing more and more dotnet development in notebooks with Polyglot Notebooks I’ve found myself wanting more options to create rich data visualizations inside of VS Code the same way I could use Plotly for data visualization using Python and Jupyter Notebooks.

Thankfully, Microsoft SandDance exists and fills some of that gap in terms of doing rich data visualization from dotnet code.

In this article I’ll talk more about what SandDance is, show you how you can use it inside of a Polyglot Notebook in VS Code, and show you a simple way you can use it without needing a Polyglot Notebook.

Most of my experience with SandDance was with Azure Data Studio, but it’s nice to see this capability in notebooks as well.


Poisson Hidden Markov Models in SAS

Ji Shen shows how to model discrete (count) time series in SAS:

The HMM procedure in SAS Viya supports hidden Markov models (HMMs) and other models embedded with HMM. PROC HMM supports finite HMM, Poisson HMM, Gaussian HMM, Gaussian mixture HMM, the regime-switching regression model, and the regime-switching autoregression model. This post introduces Poisson HMM, the latest addition to PROC HMM in the SAS Viya 2023.03 release.

Count time series is ill-suited for most traditional time series analysis techniques, which assume that the time series values are continuously distributed. This can present unique challenges for organizations that need to model and forecast them. As a popular discrete probability distribution to handle the count time series, the Poisson distribution or the mixed Poisson distribution might not always be suitable. This is because both assume that the events occur independently of each other and at a constant rate. In time series data, however, the occurrence of an event at one point in time might be related to the occurrence of an event at another point in time, and the rates at which events occur might vary over time.

HMM is a valuable tool that can handle overdispersion and serial dependence in the data. This makes it an effective solution for modeling and forecasting count time series. We will explain how the Poisson HMM can handle count time series by modeling different states by using distinct Poisson distributions while considering the probability of transitioning between them.
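To illustrate why distinct Poisson distributions per state can capture overdispersion, here is a small simulation in Python. It is not PROC HMM and does not fit a model: it only generates data from a two-state Poisson HMM. The rates, the probability of staying in a state, and the function names are all hypothetical.

```python
import math
import random
from statistics import mean, pvariance


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) value with Knuth's method (fine for small rates)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate(rates, stay_prob: float, steps: int, rng: random.Random):
    """Simulate counts from a two-state Poisson HMM, starting in state 0."""
    state = 0
    counts = []
    for _ in range(steps):
        # Emission: the hidden state decides which Poisson rate applies.
        counts.append(sample_poisson(rates[state], rng))
        # Transition: stay with probability stay_prob, otherwise switch.
        if rng.random() > stay_prob:
            state = 1 - state
    return counts


rng = random.Random(42)  # seeded so the run is repeatable
counts = simulate(rates=(2.0, 15.0), stay_prob=0.95, steps=2000, rng=rng)

sample_mean = mean(counts)
sample_var = pvariance(counts)
print(f"mean={sample_mean:.2f} variance={sample_var:.2f}")
```

A single Poisson distribution has variance equal to its mean, so a variance well above the mean is the overdispersion that one constant rate cannot reproduce, and the sticky transitions are what introduce serial dependence.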

Read on for an overview of Hidden Markov Models (in general and the Poisson variation in particular) and some of the challenges you can run into when fitting this kind of model.


Variable Sharing in Polyglot Notebooks

Matt Eland performs a few swaps:

While Polyglot Notebooks certainly brings the dream of notebook development to dotnet, Polyglot is at its finest when you work with one language and then hand off data to the next language for additional processing.

In this article we’ll talk about sharing variables between kernels using Polyglot Notebooks and VS Code. We’ll explore the syntax and tooling that exists around these functionalities as well as the current limitation of sharing variables between kernels.

For simplicity, I’m going to avoid getting into SQL and KQL kernels in this article, but I plan on delving further into each of these specialized kernels in future articles.

Click through for an example using the best .NET language, as well as C#. Do read the whole thing, especially if you think about passing around discriminated unions or method-rich objects.
