
Author: Kevin Feasel

An Introduction to ML.NET

Ivan Matec gives us a walkthrough of the ML.NET library and its Model Builder component:

Before we dive into our example, let’s talk a bit about ML.NET history and its current state.

ML.NET traces its origins to a 2002 Microsoft Research project named TMSN, which stands for “text mining search and navigation.” Later it was renamed to TLC, “the learning code.” ML.NET was derived from the TLC library. Initially, it was used in internal Microsoft products.

The first publicly available version, ML.NET 1.0, was released in 2019. It included the Model Builder add-in and AutoML (Automated Machine Learning) capabilities.

The current version is 1.6.0. More details about all releases can be found on the official ML.NET release page.

ML.NET is not a bad library if you need to do some fairly simple work.

Comments closed

Working with App Secrets in .NET Core

Santosh Hari shows us how to use application secrets when building .NET Core applications:

I was writing a sample dotnetcore console application for a talk because I felt using a sample aspnet core web app was overkill. The app was connecting to a bunch of Azure cloud and 3rd party services (think Twilio API for SMS or LaunchDarkly API for Feature Flags) and I had to deal with connection strings.

Now I have a nasty habit of “accidentally” checking in connection strings and secrets into public GitHub repositories, so I wanted to do this right from the get-go.

That’s a bad habit to be in, and Santosh shows us how we can avoid doing that via use of application secrets.


Fun with Arrays in PowerShell

Robert Cain looks at how arrays operate in PowerShell:

In this article, we’ll look at the different ways to use Arrays in PowerShell. We’ll cover the basics, then move onto more advanced topics.

For all of the examples, we’ll display the code, then under it the result of our code. In this article I’ll be using PowerShell Core, 7.1.3, and VSCode. The examples should work in PowerShell 5.1 in the PowerShell ISE, although they’ve not been tested there.

Click through for a variety of tips and tricks when working with arrays.


Visualizing Data over Time with F#

Codesuji takes us through creating an interesting video:

How is this accomplished? I reach into F#’s bag of tricks to leverage Deedle, Plotly.NET, and ffmpeg in order to transform a series of data files into a singular video showing county-level drought data from 1900-2016. Together these bring static data into a dynamic representation. For reference, the Palmer Drought Severity Index (PDSI) typically ranges from -10 (dry) to 10 (wet).

Putting this all together is pretty straightforward, but I wanted to call out a couple of specific parts. For this particular example Deedle is overkill, but pairing it with Plotly.NET can often be useful in more complex situations. Plotly offers some nice customization options, which I take advantage of below. Once all the images are generated with Plotly, F# can shell out to ffmpeg to perform the video assembly. I do this in two parts, creating both an mp4 and webm file.

We’re reading datasets, parsing text files, deserializing JSON contents, building a visual for each point in time, and then creating a video out of it—all in 100 lines of code. Not bad.
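The last step in that pipeline, shelling out to ffmpeg once per output format, is easy to picture in other languages too. Here is a minimal Python sketch of the idea, not Codesuji's F# code: the function names, the frame-file pattern, and the frame rate are all hypothetical, and it assumes ffmpeg is on the PATH with the libx264 and libvpx-vp9 encoders available.

```python
import subprocess
from typing import List


def build_ffmpeg_command(frame_pattern: str, output_path: str, fps: int = 10) -> List[str]:
    """Build an ffmpeg argument list that joins numbered images into one video."""
    # Pick the video codec from the output extension (mp4 -> H.264, webm -> VP9).
    if output_path.endswith(".mp4"):
        codec = "libx264"
    elif output_path.endswith(".webm"):
        codec = "libvpx-vp9"
    else:
        raise ValueError(f"unsupported output format: {output_path}")
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-framerate", str(fps),  # how many input images to show per second
        "-i", frame_pattern,     # hypothetical pattern, e.g. frames/drought_%04d.png
        "-c:v", codec,
        "-pix_fmt", "yuv420p",   # pixel format most players can handle
        output_path,
    ]


def assemble_videos(frame_pattern: str, basename: str, fps: int = 10) -> None:
    """Run ffmpeg twice, producing both an mp4 and a webm file."""
    for extension in (".mp4", ".webm"):
        command = build_ffmpeg_command(frame_pattern, basename + extension, fps)
        subprocess.run(command, check=True)  # raises CalledProcessError on failure
```

Calling `assemble_videos("frames/drought_%04d.png", "drought")` would then write `drought.mp4` and `drought.webm`, provided the frames exist. Keeping the command builder separate from the `subprocess.run` call makes it possible to inspect the arguments without having ffmpeg installed.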


AWS RDS for SQL Server Notes and Limitations

Tom Collins summarizes places where Amazon Relational Database Service (RDS) for SQL Server differs from the box product:

Some AWS RDS SQL Server limitations

– Some ports are reserved for Amazon RDS, and you can’t use them when you create a DB instance.

– Amazon RDS for SQL Server doesn’t support importing data into the msdb database.

– You can’t rename databases on a DB instance in a SQL Server Multi-AZ deployment.

– AWS RDS doesn’t support Data Quality Services and Master Data Services on the same RDS service; you need to spin up an EC2 instance and run those services from another server.

Read on to see more limitations, notes on how security is different, and notes on feature support. Do keep in mind, though, that some of these may change over time—a few years back, the number of limitations was much greater.


Getting Started with Citus on Azure

Gauri Mahajan sets up Azure Database for PostgreSQL and picks the really expensive version:

PostgreSQL is an open-source database and one of the most popular relational databases typically used for OLTP systems. One important feature of this database is that it’s supported by a large community, and with it comes several extensions that can be applied on the PostgreSQL server to use it for a variety of different applications. Examples of such extensions are AppOS, HypoPG, OpenFTS, PostGIS, TimescaleDB (PostgreSQL for time-series), etc.

One such PostgreSQL extension is Citus, which transforms PostgreSQL into a distributed database that enables usage of Postgres in a scale-out or cluster model. With Citus, the PostgreSQL server can be used for high transaction throughputs, processing time-series or IoT data, building analytical warehouses as well as for real-time analytics. Managing the dynamic infrastructure on which PostgreSQL and the Citus extension operate can be quite challenging. Azure recently launched the Citus flavor of PostgreSQL in the form of Azure Database for PostgreSQL – Hyperscale server group. This can be compared to the likes of Azure Synapse or AWS Redshift. In this article, we will learn how to deploy the Hyperscale server group of the Azure Database for PostgreSQL and explore its configuration options.

Read on for setup instructions, as well as some of the benefits you get by using the Citus extension.


Handling Content Access Requests in Power BI

Marc Lelijveld walks us through the process of requesting (and granting) access to content in Power BI:

When we look at the Power BI ecosystem, we can identify a bunch of different artifacts: for example, dataflows, datasets, reports, dashboards and many derivatives. As I explained in the previous post, the best practice for sharing content is through a Power BI App, which includes a list of users or an Active Directory group containing multiple users. With that, the content becomes available after publishing to those who are granted access. However, it can happen that one of the users shares the link with other users who do not have access to the content. As a property of the Power BI App, you can allow users to share the app and underlying dataset with share permissions. When working with sensitive data, though, this might not be what you are looking for, as you might lose control over who has access.

Read on to see what constitutes a content access request and what you can do about them.


Compilations per Second in SQL Server

Fabiano Amorim clarifies a metric’s definition:

As you can see, the number of SQL Compilations/Sec is very high. It’s important to step back and remember the general description and guideline for this counter and understand what I mean by “high”:

Official Description: “Number of SQL compilations per second. Indicates the number of times the compile code path is entered.”

Read on for a dive into ad hoc SQL statement parameterization, how an instance can have a high compilations/sec value relative to batch requests/sec, and how that can affect performance in the long run.
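To make "high relative to batch requests/sec" concrete, here is a rough Python sketch of the arithmetic, not anything from Fabiano's post. It assumes you have taken two readings of the raw counter values, which SQL Server exposes as cumulative totals (for example in `sys.dm_os_performance_counters`), so a per-second rate is the difference between readings divided by the elapsed time. The `CounterSample` type and the sample numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One reading of two cumulative SQL Server counters (hypothetical type)."""
    seconds: float       # when the reading was taken, in seconds
    compilations: int    # raw cumulative value behind "SQL Compilations/sec"
    batch_requests: int  # raw cumulative value behind "Batch Requests/sec"


def compilation_ratio(earlier: CounterSample, later: CounterSample) -> float:
    """Return compilations per batch request over the interval between samples."""
    elapsed = later.seconds - earlier.seconds
    if elapsed <= 0:
        raise ValueError("the later sample must be taken after the earlier one")
    compiles_per_sec = (later.compilations - earlier.compilations) / elapsed
    batches_per_sec = (later.batch_requests - earlier.batch_requests) / elapsed
    if batches_per_sec == 0:
        return 0.0  # no batches ran, so there is no meaningful ratio
    return compiles_per_sec / batches_per_sec


# Made-up readings taken ten seconds apart.
first = CounterSample(seconds=0.0, compilations=1_000, batch_requests=10_000)
second = CounterSample(seconds=10.0, compilations=6_000, batch_requests=20_000)
print(compilation_ratio(first, second))  # 0.5, i.e. one compile for every two batches
```

What counts as too high depends on the workload, so the sketch returns the ratio and leaves the threshold to you.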


Explaining an ML Model with SHAP

Dan Lantos, et al., walk us through one technique for model explainability:

Interpretability has to do with how accurately a machine learning model can associate a cause (input) to an effect (output). 

Explainability on the other hand is the extent to which the internal mechanics of a machine or deep learning system can be explained in human terms. Or to put it simply, explainability is the ability to explain what is happening. 

Let’s consider a simple example illustrated below where the goal of the machine learning model is to classify an animal into its respective group. We use an image of a butterfly as input into the machine learning model. The model would classify the butterfly as either an insect, mammal, fish, reptile or bird. Typically, most complex machine learning models would provide a classification without explaining how the features contributed to the result. However, using tools that help with explainability, we can overcome this limitation. We can then understand what particular features of the butterfly contributed to it being classified as an insect. Since the butterfly has six legs, it is thus classified as an insect.

Being able to provide a rationale behind a model’s prediction would give the users (and the developers) confidence about the validity of the model’s decision.

Read on to see how you can use a library called SHAP in Python to help with this explainability.
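SHAP is built on Shapley values, which split a prediction among the features that produced it. The sketch below is not the SHAP library and not the authors' code. It is a brute-force Python calculation of exact Shapley values for a toy "insect score" whose features and weights are invented for illustration. It averages each feature's marginal contribution over every subset of the other features, which is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, List


def shapley_values(
    features: List[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley value of each feature for a set-valued scoring function."""
    n = len(features)
    result: Dict[str, float] = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of adding the feature to this coalition.
                total += weight * (value(coalition | {feature}) - value(coalition))
        result[feature] = total
    return result


def insect_score(present: FrozenSet[str]) -> float:
    """Toy model with invented weights: how insect-like is the animal?"""
    score = 0.0
    if "six_legs" in present:
        score += 0.6
    if "wings" in present:
        score += 0.2
    if "six_legs" in present and "antennae" in present:
        score += 0.2  # interaction: antennae only count alongside six legs
    return score


contributions = shapley_values(["six_legs", "wings", "antennae"], insect_score)
for name, share in contributions.items():
    print(f"{name}: {share:.2f}")
```

In this toy setup six legs get the largest share (0.70), wings get 0.20, and antennae get 0.10, because the interaction bonus is split evenly between the two features that jointly trigger it. The shares add up to the full score of 1.0, which is the property that makes Shapley values attractive for explaining a single prediction.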


A Cross-Platform Comparison of JSON in Relational Databases

Lukas Eder gives us some information on SQL/JSON and where different relational database management systems are in their JSON journeys:

Building (dogfooding) on top of our own SQL/JSON API has revealed a lot of caveats of the various SQL/JSON implementations across vendors, and to be frank, it’s been a bit of a sobering experience. Despite there now being the ISO/IEC TR 19075:6 standard (mostly driven by Oracle this time), many vendors have already implemented some kind of JSON support, and it looks different in all dialects – to the extent where writing vendor-agnostic SQL/JSON is almost impossible with hand-written native SQL. You’ll need an API like jOOQ or any other abstraction to standardise the different dialects.

Click through for the survey.
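For a small taste of the dialect problem, consider the simplest possible operation: reading one top-level key out of a JSON column as text. The Python sketch below is a hypothetical helper, not jOOQ's API, and it covers only four dialects and only this one operation. It assumes the column holds valid JSON in a type each database accepts for these expressions.

```python
import re

# One SQL template per dialect for "top-level key as text".
_TEMPLATES = {
    "sqlserver": "JSON_VALUE({column}, '$.{key}')",
    "oracle": "JSON_VALUE({column}, '$.{key}')",
    "postgresql": "{column} ->> '{key}'",
    "mysql": "{column}->>'$.{key}'",
}

# Accept plain identifiers only, so nothing unexpected is spliced into the SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def json_text_at_key(dialect: str, column: str, key: str) -> str:
    """Return a SQL expression that reads one top-level JSON key as text."""
    if dialect not in _TEMPLATES:
        raise ValueError(f"unsupported dialect: {dialect}")
    for name in (column, key):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple identifier: {name!r}")
    return _TEMPLATES[dialect].format(column=column, key=key)


for dialect in _TEMPLATES:
    print(f"{dialect}: SELECT {json_text_at_key(dialect, 'payload', 'name')} FROM events")
```

Even here the syntax splits into function calls and operators, and the differences grow quickly once you build JSON, aggregate it, or handle nulls and nested paths, which is where the survey gets interesting.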
