
Month: January 2022

Training a Model in the Azure ML Designer

I continue a series on low-code machine learning in Azure ML:

Machine learning is a lot like an action film from the 1980s: we see early on that there’s a problem, we train in a cool montage with upbeat rock music, and then we come back to the problem and defeat it with car chases and bazookas and quippy one-liners. Well, maybe that simile got away from me a little bit, but I think I’ll stick with it.

What we’ll do in this post is cover the process of training a simple model using the Azure ML designer. I won’t deviate too far from the “classic” Azure ML script, which involves using the Designer to train a model and then deploy an endpoint for consumption. And away we go!

Sometimes, when a model is running, I say to it, “I have to remind you Sully, this is my weak arm!”


Handling Categorical Data in R

The RSquared Academy blog has a two-parter on handling categorical data in R. Part 1 elaborates on kinds of categorical data and introduces a case study:

While we can rank the categories, we cannot assign a value to them. For example, in satisfaction ranking, we cannot say that like is twice as positive as dislike i.e. we are unable to say how much they differ from each other. While the order or rank of data is meaningful, the difference between two pieces of data cannot be measured/determined or are meaningless. Ordinal data provide information about relative comparisons, but not the magnitude of the differences.
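To make the ordinal idea concrete, here is a minimal sketch in Python (not the R code from the linked posts). The satisfaction scale and the responses are made up for illustration. The point is that sorting and taking a median only depend on order, so they are safe for ordinal data, while averaging the ranks would pretend the gaps between categories are equal.

```python
from statistics import median_low

# Hypothetical satisfaction scale, ordered from most negative to most positive.
LEVELS = ["dislike", "neutral", "like"]
RANK = {level: position for position, level in enumerate(LEVELS)}

# Hypothetical survey responses.
responses = ["like", "dislike", "neutral", "like", "dislike"]

# Order is meaningful, so sorting by rank is a valid operation.
ordered = sorted(responses, key=RANK.get)

# median_low always returns an observed value, so it maps back to a real
# category and relies only on the ordering, not on distances between ranks.
middle = LEVELS[median_low(RANK[r] for r in responses)]

if __name__ == "__main__":
    print(ordered)
    print(middle)
```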

Part 2 shows off ways to work with categorical data in tables:

In this section, we will explore the above ways of summarizing categorical data. We will also spend some time learning about tables as you will be using them extensively while working with categorical data. R has many packages for tabulating data and we list and explore all of them in the R scripts shared in the GitHub repository.
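As a rough illustration of what tabulating categorical data looks like, here is a small Python sketch using only the standard library (the linked posts use R packages instead). The device and satisfaction values are hypothetical sample data. It builds a one-way frequency table, proportions, and a two-way cross-tabulation.

```python
from collections import Counter

# Hypothetical survey rows: (device, satisfaction) pairs.
rows = [
    ("mobile", "like"),
    ("desktop", "dislike"),
    ("mobile", "like"),
    ("desktop", "like"),
    ("mobile", "dislike"),
]

# One-way frequency table: how often each device appears.
device_counts = Counter(device for device, _ in rows)

# Proportions: each count divided by the total number of rows.
total = sum(device_counts.values())
device_share = {device: count / total for device, count in device_counts.items()}

# Two-way table: counts for each (device, satisfaction) combination.
cross_tab = Counter(rows)

if __name__ == "__main__":
    print(device_counts)
    print(device_share)
    print(cross_tab)
```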

Click through for both guides. H/T R-Bloggers.


Date Math in PowerShell

Steve Jones adds 11 years in PowerShell:

I saw a fun post on Twitter recently asking days until retirement. I wrote this code:

DECLARE @YearsToRetire INT = 11;
SELECT DATEDIFF (DAY, GETDATE (), (DATEADD (YEAR, @YearsToRetire, GETDATE ())));

I thought that wasn’t bad, but then I wondered, how would I do this in PowerShell? I knew there had to be a way, so I googled and ran into this article.

Normally I need to take off my shoes to add that many years.
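For comparison, the same calculation can be sketched in Python. This is not the PowerShell from the linked post, just the same idea as the T-SQL above: add a number of years to a date and count the days in between. The function names are my own, and the fallback to 28 February for a leap-day start date is an assumption of this sketch.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Only reachable for 29 February when the target year is not a leap
        # year; this sketch assumes falling back to 28 February is acceptable.
        return start.replace(year=start.year + years, day=28)


def days_until(start: date, years: int) -> int:
    """Number of days between `start` and the date `years` later."""
    return (add_years(start, years) - start).days


if __name__ == "__main__":
    years_to_retire = 11
    print(days_until(date.today(), years_to_retire))
```

The answer varies by a day or so depending on how many leap days fall in the window, which is why the start date matters.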


Page Compression Success Rates

Paul Randal has a script for us:

Yesterday I was chatting with Jonathan about some of the internals of the page compression algorithms for a client discussion. When you enable page compression, rows are first just row compressed (storing values with empty bytes stripped out) until a page fills up. The Storage Engine then attempts page compression for the page. 

Click through to see what that entails and how you can see what percentage of pages successfully compress at the page level.


The Evolution of Cloud Architecture

Ben Brauer has a two-parter looking at how architecture is changing. Part 1 looks at containers and machine learning:

Let’s start by describing containers at a high level. A container is a packaging and distribution mechanism that abstracts and resolves many of the installer issues that result from ‘unique’ environments. We’ve all heard developers exclaim “well, it works on my machine,” after pushing an application to a new environment only to realize it’s broken. Containers strive to address this problem by creating a hard boundary between the infrastructure and the software stack used by an application. External dependencies are not necessarily added to the container, but all your internal dependencies (frameworks, runtimes, etc.) are there. This makes the deployment of the application to a new environment significantly more predictable as the compute environment is consistent as it’s part of the container.

Part 2 looks at serverless compute and low-code/no-code development:

Low-code (or no-code) development for applications is not a new concept. It strives to democratize development in a similar way as decades ago Visual Basic expanded the number of developers from thousands of C++ developers to hundreds of thousands of developers creating Windows-based solutions. Low-code takes this concept to non-technical professionals. Although this notion is great for productivity and usability, the maintenance and performance of these apps can be daunting to say the least. Now non-technical application authors need to learn about application management, documentation, and application deployment. Without a clear understanding of these considerations, the environment can quickly become chaotic. The good news is that platforms and tools have come a long way since Visual Basic. For example, Microsoft’s Power Apps platform provides many of the platform services needed to maintain a healthy application lifecycle and governance paradigm.

These are good concepts to know about, regardless of your particular cloud platform.


Creating Custom Objects with PSCustomObject

Robert Cain shows us one method of working with classes in PowerShell:

For this post I’ll begin a series on the use of PSCustomObject. Prior to the addition of classes in PowerShell 5.0, this was the technique needed to create your own customized objects. It still has a lot of validity today though, as you can use these techniques to extend the objects other people defined, including those already built into PowerShell.

In addition, understanding the use of PSCustomObject will give you a better understanding of the way classes work.

Click through to see how you can create an object and assign properties to it, though methods will come in the next post.


Debugging Code in Python

Adrian Tam takes us through debugging options with Python:

The purpose of a debugger is to provide you a slow motion button to control the flow of a program. It also allows you to freeze the program at a certain point in time and examine the state.

The simplest operation under a debugger is to step through the code. That is to run one line of code at a time and wait for your acknowledgment before proceeding to the next. The reason we want to run the program in a stop-and-go fashion is to allow us to check the logic and value or verify the algorithm.

For a larger program, we may not want to step through the code from the beginning as it may take a long time before we reach the line that we are interested in. Therefore, debuggers also provide a breakpoint feature that will kick in when a specific line of code is reached. From that point onward, we can step through it line by line.

This is something I definitely need to get better at when doing Python development.
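Here is a minimal sketch of the breakpoint idea using the built-in `breakpoint()` function (Python 3.7 and later), which drops into `pdb` by default. The function and its `debug` flag are hypothetical, made up for this example. The flag keeps the script non-interactive unless you ask for the debugger.

```python
def running_total(values, debug=False):
    """Sum the values, optionally pausing in the debugger on negative input."""
    total = 0
    for value in values:
        if debug and value < 0:
            # Execution freezes here. At the (Pdb) prompt, `p total` prints a
            # variable, `n` runs the next line, `s` steps into a call, and
            # `c` continues until the next breakpoint.
            breakpoint()
        total += value
    return total


if __name__ == "__main__":
    # Pass debug=True to pause at the first negative value.
    print(running_total([3, 4, -2, 5]))
```

To step through from the very first line instead, run the script as `python -m pdb script.py` and set breakpoints with `b <line number>`.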


Solutions for Matching Supply with Demand

Itzik Ben-Gan has some solutions to show:

This month, I’m going to start exploring the submitted solutions, roughly, going from the worst performing to the best performing ones. Why even bother with the poorly performing ones? Because you can still learn a lot from them; for example, by identifying anti-patterns. Indeed, the first attempt at solving this challenge for many people, including myself and Peter, is based on an interval intersection concept. It so happens that the classic predicate-based technique for identifying interval intersection has poor performance since there’s no good indexing scheme to support it. This article is dedicated to this poor performing approach. Despite the poor performance, working on the solution is an interesting exercise. It requires practicing the skill of modeling the problem in a way that lends itself to set-based treatment. It is also interesting to identify the reason for the bad performance, making it easier to avoid the anti-pattern in the future. Keep in mind, this solution is just the starting point.

Click through for a solution which is straightforward but slow.
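To give a feel for the interval intersection concept, here is a rough Python sketch of the general idea rather than the T-SQL from the article. It assumes demand and supply entries are (id, quantity) pairs matched in order: each entry becomes a running-total interval, and two intervals intersect when each one starts before the other ends. The function names and sample data are hypothetical. The nested loop compares every demand interval with every supply interval, which mirrors why this approach is straightforward but slow.

```python
from itertools import accumulate


def to_intervals(entries):
    """Turn (id, quantity) pairs into (id, start, end) running-total intervals."""
    ends = list(accumulate(quantity for _, quantity in entries))
    starts = [0] + ends[:-1]
    return [
        (entry_id, start, end)
        for (entry_id, _), start, end in zip(entries, starts, ends)
    ]


def match(demand, supply):
    """Pair demand with supply; returns (demand_id, supply_id, quantity) tuples."""
    demand_intervals = to_intervals(demand)
    supply_intervals = to_intervals(supply)
    pairings = []
    for d_id, d_start, d_end in demand_intervals:
        for s_id, s_start, s_end in supply_intervals:
            # Classic interval-intersection predicate.
            if d_start < s_end and s_start < d_end:
                # The traded quantity is the length of the overlap.
                overlap = min(d_end, s_end) - max(d_start, s_start)
                pairings.append((d_id, s_id, overlap))
    return pairings


if __name__ == "__main__":
    demand = [(1, 5), (2, 3)]
    supply = [(1000, 4), (2000, 6)]
    print(match(demand, supply))
```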
