
Author: Kevin Feasel

tidyAML Now Available in CRAN

Steven Sanderson has a package hit the big time:

I’m excited to announce that the R package {tidyAML} is now officially available on CRAN! This package is designed to make it easy for users to perform automated machine learning (AutoML) using the tidymodels ecosystem. With a simple and intuitive interface, tidyAML allows users to quickly generate high-quality machine learning models without worrying about the underlying details.

Read on to learn more about this package, as well as the broader healthyverse series of packages.

Comments closed

Building an Internal Load Balancer in Azure

Vaibhav Kumar balances the scales:

The Internal load balancer manages load for a private network with any inbound access from the public platform. As in the diagram below, the primary load balancer managing load from the internet is a public-type load balancer. But, the VMs communication to storage or database is managed through a type-internal load balancer.

Click through for a walkthrough of the process.


Recommendations for Dedicated SQL Pool Data Modeling

Bhaskar Sharma has some advice:

In this article, I will discuss how to physically model an Azure Synapse Analytics data warehouse while migrating from an existing on-premises MPP (Massive Parallel Processing) data warehouse solution like Teradata and Netezza. The approach and methodologies discussed in this article are purely based on the knowledge and insight I have gained while migrating these data warehouses to Azure Synapse dedicated SQL pool. 

Dedicated SQL pools are close enough to regular SQL Server that we make a lot of assumptions about them, some of which may be wrong.


Pivoting in Postgres with CROSSTAB

Rajendra Gupta pivots abruptly:

A pivot table is a popular tool in Microsoft Excel that shows summarized data and helps you analyze it in various ways. Pivot tables collect and organize data from different rows, columns, and tables. Pivot tables are a great way to summarise data, and a handy tool for analyzing sales revenue, products sold, sales performance, etc.

Relational database tables store data in multiple rows and columns. You can calculate data using various functions such as count, sum, and average. SQL Server provides the PIVOT and UNPIVOT functions for working with pivot tables. How do we create the pivot tables in PostgreSQL? Let’s find it out.

Read on for a demonstration.
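
For readers who want the pivot idea in miniature before clicking through, here is a rough sketch in plain Python rather than Postgres. It is not the CROSSTAB function from the linked article: the `pivot` helper, the `sales` rows, and the column names are all made up for illustration, and it fills missing cells with 0, which is a choice of this sketch.

```python
from collections import defaultdict


def pivot(rows, row_key, col_key, value_key):
    """Sum value_key into a grid keyed by row_key (rows) and col_key (columns).

    Hypothetical helper for illustration only; cells with no source rows are 0.
    """
    # Collect the distinct column labels first so every output row has the same keys.
    columns = sorted({r[col_key] for r in rows})
    totals = defaultdict(lambda: dict.fromkeys(columns, 0))
    for r in rows:
        totals[r[row_key]][r[col_key]] += r[value_key]
    return columns, dict(totals)


# Made-up sample data: one row per sale, in "long" form.
sales = [
    {"product": "widget", "quarter": "Q1", "amount": 100},
    {"product": "widget", "quarter": "Q2", "amount": 150},
    {"product": "gadget", "quarter": "Q1", "amount": 80},
    {"product": "widget", "quarter": "Q1", "amount": 25},
]

columns, table = pivot(sales, "product", "quarter", "amount")
for product, cells in table.items():
    print(product, [cells[c] for c in columns])
```

Each distinct quarter becomes a column and each product becomes a row, which is the same long-to-wide reshaping a pivot table performs.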


SQL Server 2022 CU1 and SQL Server 2019 CU19 Released

Srinivas Kandibanda and Harvey Mora have announcements:

The 1st cumulative update release for SQL Server 2022 RTM is now available for download at the Microsoft Downloads site. Please note that registration is no longer required to download Cumulative updates.

Both of these came out several months later than expected, though the big GDR that dropped yesterday seems to have cleared up the logjam.


Working with Postgres Extensions in Azure Cosmos DB

Sarah Dutkiewicz runs into an issue:

Problem: I installed PostGIS on my single-node cluster without issues. However, I scaled my cluster to 2 nodes afterwards. When I ran the query that uses ST_X and ST_Y from PostGIS, I got the following error:

ERROR:  type "public.geometry" does not exist
CONTEXT:  while executing command on private-w0.azure-cosmos-db-global-ug-demo.postgres.database.azure.com:5432

When I read the CONTEXT message, I realized by the w# reference that the worker nodes didn’t have PostGIS installed. When you scale the nodes – at least in this case, it doesn’t enable the extensions over there.

Read on to see how Sarah was able to resolve this issue.


Visualizing Moving Averages in R with healthyR.ts

Steven Sanderson shows off a useful R library:

Are you interested in visualizing time series data in a clear and concise way? The R package {healthyR.ts} provides a variety of tools for time series analysis and visualization, including the ts_ma_plot() function.

The ts_ma_plot() function is designed to help you quickly and easily create moving average plots for time series data. This function takes several arguments, including the data you want to visualize, the date column from your data, the value column from your data, and the frequency of the aggregation.

Read on to learn more about this plot and see an example of it in action.
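
As a rough idea of the calculation behind a moving average plot, here is a minimal sketch in plain Python. It is not the ts_ma_plot() function and makes no claim about how {healthyR.ts} computes its averages: the `moving_average` helper and the sample values are hypothetical, and it uses a simple trailing window.

```python
from collections import deque


def moving_average(values, window):
    """Trailing simple moving average; positions without a full window are None.

    Hypothetical helper for illustration only.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            # The oldest value is about to fall out of the window.
            total -= buf[0]
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


# Made-up monthly values.
series = [1, 2, 3, 4, 5]
print(moving_average(series, 3))
```

Keeping a running total means each new point costs constant work, regardless of the window size.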


Azure Defender for SQL Overview

Deepthi Goguri looks at an Azure security offering:

Azure Defender for SQL, once you enable it will alert you for any SQL injection attacks, brute force attacks or any breached identities trying to access the data of your database. It also provides the vulnerability assessments. Vulnerability assessments give you alerts about the configurations of your database. If your database configuration is not following the standards of Azure, you will receive the alerts in the vulnerability assessment report.

You can enable the Azure Defender at the subscription level or at the Server level or at the resource level as well. Under the recommendations in the security center in the Azure portal, check for the Remediate security configuration. This will show if the Azure defender is configured properly.

I like Azure Defender for SQL, especially the advanced threat protection element. It’s based on IP address location and has caught me in different locations as I’ve traveled.


Estimating Quantiles in Python

Christian Lorentzen digs into quantile calculation:

Applied statistics is dominated by the ubiquitous mean. For a change, this post is dedicated to quantiles. I will give my best to provide a good mix of theory and practical examples.

While the mean describes only the central tendency of a distribution or random sample, quantiles are able to describe the whole distribution. They appear in box-plots, in childrens’ weight-for-age curves, in salary survey results, in risk measures like the value-at-risk in the EU-wide solvency II framework for insurance companies, in quality control and in many more fields.

There are easy functions to calculate quantiles in R and Python; this post serves as a way of understanding the variety of quantile functions available and how they can affect results with small sample sizes.
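
To make the small-sample point concrete, here is a brief sketch using `statistics.quantiles` from the Python standard library (available since Python 3.8), which offers two interpolation methods. This is not code from the linked post, and the five-element sample is made up.

```python
import statistics

# Made-up small sample; with so few points the method choice matters.
sample = [1, 2, 4, 8, 16]

# "exclusive" (the default) treats the sample as drawn from a larger population
# whose extremes may lie beyond the observed minimum and maximum.
q_exclusive = statistics.quantiles(sample, n=4, method="exclusive")

# "inclusive" treats the observed minimum and maximum as the 0th and 100th percentiles.
q_inclusive = statistics.quantiles(sample, n=4, method="inclusive")

print(q_exclusive)  # [1.5, 4.0, 12.0]
print(q_inclusive)  # [2.0, 4.0, 8.0]
```

The median agrees under both methods, but the first and third quartiles do not, and the interquartile range moves from 6.0 to 10.5 on the same five numbers.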


Azure ML Overview

Sanil Mhatre gets us started with Azure Machine Learning:

The five-part series is designed to jump-start any IT professional’s journey in the fascinating world of Data Science with Azure Machine Learning (Azure ML). Readers don’t need prior knowledge of Data Science, Machine Learning, Statistics, or Azure to begin this adventure.

All you will need is an Azure subscription and I will show you how to get a free one that you can use to explore some of Azure’s features before I show you how to set up the Azure ML environment.

Part 1 is available now, with the other parts coming up soon. Even so, Part 1 is a big article on its own.
