Press "Enter" to skip to content

Day: January 20, 2021

Gradient Descent in R

Holger von Jouanne-Diedrich lays out the basics of gradient descent:

Gradient Descent is a mathematical algorithm to optimize functions, i.e. finding their minima or maxima. In Machine Learning it is used to minimize the cost function of many learning algorithms, e.g. artificial neural networks a.k.a. deep learning. The cost function simply is the function that measures how good a set of predictions is compared to the actual values (e.g. in regression problems).

The gradient (technically the negative gradient) is the direction of steepest descent. Just imagine a skier standing on top of a hill: the direction which points into the direction of steepest descent is the gradient!

Click through for an example in R.

Comments closed

The Benefits of Table Partitioning

Brenda Bentz lays out some of the benefits of table partitioning in SQL Server:

Table partitioning in MS SQL Server is an advanced option that breaks a table into logically smaller chunk (partition) and then stores each chuck to a multiple filegroups.  Each chuck can be accessed and maintained separately. Partitioning is not visible to users; it behaves like one logical table to the end user. Partitioning was introduced in SQL 2005 as an Enterprise edition feature but starting with SQL 2016 SP1 it is available on all the editions.

Table partitioning can work to improve performance in specific circumstances, though you’ll definitely want to do a lot of testing before rolling it out. H/T Amanda White.

Comments closed

Writing Calculations on Power BI Real-Time Streaming Datasets

Reza Rad shows how we can write DAX measures against a Power BI streaming dataset:

In Power BI, you can have a dataset with Imported dataDirectQueryLive Connection, or Composite mode. You can build all of those types of Power BI datasets in the Power BI Desktop. However, there is a single type of dataset, which you can only build through the service, called the Streaming dataset.

A streaming dataset is for building reports with real-time response time. For example, if you want to build a Power BI dashboard that shows the room temperature as soon as captured by a temperature sensor. For this type of dataset, you send the data rows using Power BI REST API, which can be called using a custom C# application, or PowerShell scripts, or even from a Power Automate flow process.

Read on to see how.

Comments closed

When Statistics Get Updated

Matthew McGiffen never promised us there would be no math:

Before using values from the statistics, the Optimizer will check to see if the statistics are “stale”, i.e. the modification counter exceeds a given threshold. If it does, SQL will trigger a resampling of the statistics before going on to form an execution plan. This means that the plan will be formed against up to date statistics for the table.

Read on for some experimentation on when stats updates kick in.

1 Comment

Windows Drive Corruption Bug

Ax Sharma reports on a nasty bug in Windows:

In August 2020, October 2020, and finally this week, infosec researcher Jonas L drew attention to an NTFS vulnerability impacting Windows 10 that has not been fixed.

When exploited, this vulnerability can be triggered by a single-line command to instantly corrupt an NTFS-formatted hard drive, with Windows prompting the user to restart their computer to repair the corrupted disk records.

The researcher told BleepingComputer that the flaw became exploitable starting around Windows 10 build 1803, the Windows 10 April 2018 Update, and continues to work in the latest version.

Read the whole thing. It looks like this also affects Windows XP, which makes me figure it might affect Windows 7, 8, and 8.1 too. H/T Mark Wilkinson.

Comments closed

Normalization and Reduced Blocking

Erik Darling points out one of the many benefits of normalizing tables in a database:

Looking at the design, there are two big problems:

1. There are “order” columns that are going to get a lot of inserts and updates
2. You’re going to be storing the same customer information over and over again

The more related, but not independent, data you store in the same table, the harder it becomes to effectively index that table.

My take on this is that the old adage of “Normalize until it hurts; denormalize until it works” hasn’t been operative for the past 15 years, when the SSD era began.

Comments closed

DTUs and vCores in Azure SQL Database

Arun Sirpal gives us one take on whether to use DTUs or vCores for Azure SQL Database:

If you are building database solutions in Azure , using Azure SQL Database then you will know that you have a purchasing option decision to make. That being should you use a vCore model or DTU approach?

Arun prefers the vCore model for solid reasons. I agree with the sentiment when we’re talking about production databases. For small-scale, personal stuff, however, the bottom end of the DTU model is much less expensive.

Comments closed