Day: March 18, 2024

Plotting Training and Testing Results with tidyAML

Steven Sanderson builds a plot:

In the realm of machine learning, visualizing model predictions is essential for understanding the performance and behavior of our algorithms. When it comes to regression tasks, plotting predictions alongside actual values provides valuable insights into how well our model is capturing the underlying patterns in the data. With the plot_regression_predictions() function in tidyAML, this process becomes seamless and informative.

Read on to see how the function works and the kind of result you can expect from it.
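
As a rough illustration of what a plot of this kind is built from, the Python sketch below fits a least-squares line on a training split and collects actual and predicted values for both the training and testing rows. It does not use tidyAML (an R package) and does not reproduce plot_regression_predictions(); the data, the split, and the helper names fit_line and prediction_rows are all made up for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def prediction_rows(xs, ys, intercept, slope, split):
    """One row per observation: the points a predictions plot would draw."""
    return [
        {"split": split, "x": x, "actual": y, "predicted": intercept + slope * x}
        for x, y in zip(xs, ys)
    ]


# Hypothetical data: y is roughly 2x + 1 with a little noise.
x_all = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_all = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.2, 16.9, 19.1, 20.8]

# Simple ordered split: first 7 rows for training, last 3 for testing.
x_train, y_train = x_all[:7], y_all[:7]
x_test, y_test = x_all[7:], y_all[7:]

# Fit on the training rows only, then predict for both splits.
intercept, slope = fit_line(x_train, y_train)
rows = (
    prediction_rows(x_train, y_train, intercept, slope, "train")
    + prediction_rows(x_test, y_test, intercept, slope, "test")
)

for row in rows:
    print(
        f"{row['split']:<5} x={row['x']:>2} "
        f"actual={row['actual']:5.1f} predicted={row['predicted']:5.2f}"
    )
```

Plotting the actual and predicted columns against x, with the split distinguished by color or panel, gives the general shape of chart the post describes.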

Feature Engineering with Azure ML and Microsoft Fabric

Siliang Jiao, et al., talk architecture:

Feature engineering is the process of using domain knowledge to extract features (characteristics, properties, attributes) from raw data. The extracted features are used for training the models that can predict values for relevant business scenarios. A feature engineering system provides the tools, processes, and techniques used to perform feature engineering consistently and efficiently. 

This article elaborates on how to build a feature engineering system based on Azure Machine Learning managed feature store and Microsoft Fabric. 

Click through to see how the pieces fit together.
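
To make the definition of feature engineering concrete, here is a minimal Python sketch that turns raw order records into per-customer features. It does not use the Azure Machine Learning managed feature store or any Microsoft Fabric API; the record layout, the feature names, and the build_features helper are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw data: (customer_id, order_date, amount).
raw_orders = [
    ("c1", date(2024, 3, 1), 20.0),
    ("c1", date(2024, 3, 10), 35.0),
    ("c2", date(2024, 2, 15), 12.5),
]


def build_features(orders, as_of):
    """Aggregate raw order rows into one feature record per customer."""
    grouped = defaultdict(list)
    for customer_id, order_date, amount in orders:
        grouped[customer_id].append((order_date, amount))

    features = {}
    for customer_id, rows in grouped.items():
        amounts = [amount for _, amount in rows]
        last_order = max(order_date for order_date, _ in rows)
        features[customer_id] = {
            "order_count": len(rows),
            "total_spend": sum(amounts),
            "avg_order_value": sum(amounts) / len(amounts),
            # Recency is measured relative to an explicit as-of date.
            "days_since_last_order": (as_of - last_order).days,
        }
    return features


features = build_features(raw_orders, as_of=date(2024, 3, 18))
for customer_id, record in sorted(features.items()):
    print(customer_id, record)
```

A feature engineering system, as the quoted article describes it, is what lets a transformation like this run consistently for both training and prediction rather than being rewritten ad hoc each time.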

Copilot in Microsoft Fabric Dataflows Gen2

Reza Rad shows off a capability:

There has been a lot of hype recently about Generative AI and Copilot at Microsoft. Microsoft Fabric incorporates many of those features, and one place Copilot has been added is Dataflow Gen2 in Microsoft Fabric, which you can also think of as Power Query in Power BI Service dataflows. In this article and video, I will describe how Copilot works with Data Factory Dataflow Gen2, along with its requirements and some examples.

Click through for the video and the article. The thing that I believe will keep many people from using this is that you need a Microsoft Fabric capacity of F64 or greater to get access to Copilot. That’s a pretty hefty requirement.

Copying a Direct Lake Semantic Model between Fabric Workspaces

Kevin Chant makes a copy:

In this post I introduce scripts to improve copying a Direct Lake semantic model to another workspace using Microsoft Fabric Git integration.

I wanted to do this follow-up after my previous post about my initial tests to copy a Direct Lake semantic model to another workspace using Microsoft Fabric Git integration.

That is because I want to show how you can work with scripts locally to create the repository that contains the Direct Lake semantic model, and how to do this in a way that includes the new Tabular Model Definition Language (TMDL) semantic file format.

Read on to see how it all fits together.

Using IN and NOT IN in SQL Server

Erik Darling shares some advice:

I’ll be brief here, and let you know exactly when I’ll use IN and NOT IN rather than anything else:

  • When I have a list of literal values

That’s it. That’s all. If I have to go looking in another table for anything, I use either EXISTS or NOT EXISTS. The syntax just feels better to me, and I don’t have to worry about getting stupid errors about subqueries returning more than one value.

I’m typically a lot more flexible about using IN, though I do agree about NOT IN: that clause is usually more trouble than it’s worth.
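
One well-known reason NOT IN causes trouble is its handling of NULL, which the quoted post does not spell out here. The Python sketch below shows it using an in-memory SQLite database as a stand-in for SQL Server; that substitution is an assumption made so the example is self-contained, and the customers and orders tables are hypothetical. It also includes the literal-list case the advice endorses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- One order has no customer recorded (NULL).
    INSERT INTO orders VALUES (10, 1), (11, NULL);
""")

# IN with a list of literal values: the one case the quoted advice uses it for.
literal_in = conn.execute(
    "SELECT name FROM customers WHERE id IN (1, 3) ORDER BY id"
).fetchall()

# NOT IN against another table: because the subquery returns a NULL,
# the predicate is never true, so no customer passes the filter.
not_in = conn.execute("""
    SELECT name FROM customers
    WHERE id NOT IN (SELECT customer_id FROM orders)
    ORDER BY id
""").fetchall()

# NOT EXISTS: the NULL row simply fails to match, so customers
# without orders are returned as intended.
not_exists = conn.execute("""
    SELECT c.name FROM customers AS c
    WHERE NOT EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)
    ORDER BY c.id
""").fetchall()

print("IN (literals):", literal_in)
print("NOT IN:       ", not_in)
print("NOT EXISTS:   ", not_exists)
conn.close()
```

In this sketch the NOT IN query returns no rows at all, while NOT EXISTS returns the two customers without orders. The same three-valued logic applies in SQL Server.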

Postgres Internals: Database Clusters, Databases, and Tables

Semab Tariq begins a new series:

A database cluster is a collection of multiple databases managed by a single PostgreSQL server. It can be referred to as a data/base directory.

A database is a collection of database objects, where a database object is a data structure such as a table, view, index, extension, sequence, or function. In simple words, anything that we can create or store within a database is a database object.

Read on to learn more about how Postgres lays out database files and tablespaces.
