Press "Enter" to skip to content

Author: Kevin Feasel

Training Data With Azure ML

Koos van Strien discusses training data sets and cross-validating results:

When choosing a train and test set, you’ll implicitly introduce a new bias: it could be that the model you just trained predicts well for this particular test set when trained on this particular training set. To reduce this bias, you could “cross-validate” your results.

Cross-validation (often abbreviated as just “cv”) splits the dataset into n folds. Each fold is used once as a test set, using all other folds together as a training set. So in our pizza example with 100 records, with 5 folds we will have 5 test runs.

This isn’t Azure ML-specific, and is good reading.
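If you want to picture the fold assignment in database terms, here’s a hypothetical sketch (the table and column names are made up, not from Koos’s post) that splits 100 records into 5 folds:

-- Hypothetical example: assign each of the 100 pizza records to one of 5 folds.
-- dbo.PizzaSales and PizzaID are stand-in names for illustration only.
SELECT
    PizzaID,
    NTILE(5) OVER (ORDER BY NEWID()) AS FoldNumber
FROM dbo.PizzaSales;
-- Each fold then serves once as the test set, with the other four folds as training data.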


Last Item In Each Group

Reza Rad shows how to get the last item in each group using Power Query:

Scenario that I want to solve as an example is this:

FactInternetSales has sales transaction information for each customer, by each product, each order date, and some other information. We want to have a grouped table by customer, which has the number of sales transactions for each customer, the total sales amount for that customer, and the first and the last sales amount for that customer. First and last are defined by the first and last order date for the transaction.

In T-SQL, this sounds like the job of window functions.  In Power BI, we write M.
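A rough sketch of that window-function approach might look like the following. This is not Reza’s M solution, and the column names (CustomerKey, SalesAmount, OrderDate) are assumptions based on the standard FactInternetSales layout:

-- Sketch only: one row per customer with the transaction count, total sales,
-- and the sales amounts on the customer's first and last order dates.
SELECT DISTINCT
    CustomerKey,
    COUNT(*) OVER (PARTITION BY CustomerKey) AS SalesCount,
    SUM(SalesAmount) OVER (PARTITION BY CustomerKey) AS TotalSales,
    FIRST_VALUE(SalesAmount) OVER (PARTITION BY CustomerKey ORDER BY OrderDate
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS FirstSalesAmount,
    LAST_VALUE(SalesAmount) OVER (PARTITION BY CustomerKey ORDER BY OrderDate
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastSalesAmount
FROM dbo.FactInternetSales;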


Confirming Checkpoints

Arun Sirpal shows how to log when checkpointing runs:

Via Configuration Manager I enabled trace flags 3502 and 3605 – both needed to get the checkpoint information and write it to the error log.

I then shut down the machine; on start-up, I looked into the error log.

EXEC XP_READERRORLOG

Notice the ‘s’ in front of the spid<number>? Well, that means the checkpoint was done via the automatic process; if you do a manual checkpoint, you won’t see this letter.

I did not know that the “s” indicated that this was an automated process.
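If you’d rather not set startup parameters through Configuration Manager, the same trace flags can be switched on from a query window; a minimal sketch (these only persist until the next restart, unlike Arun’s startup-parameter approach):

-- Sketch: enable the trace flags globally, then generate and confirm a checkpoint entry.
DBCC TRACEON (3502, -1);  -- log checkpoint start/end messages
DBCC TRACEON (3605, -1);  -- send trace output to the error log
CHECKPOINT;               -- issue a manual checkpoint (this one won't carry the 's' prefix)
EXEC xp_readerrorlog;     -- read the error log to confirm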


Generating Absurd Numbers Of Columns

Brent Ozar wants to generate 65,535 columns:

Alright, so we’ve learned that I can’t return more than 65,535 columns, AND I can only use 4,096 elements in my SELECT. I can think of several workarounds there – 65,535 / 4096 = about 16, which means I could create a few tables or CTEs and do SELECT *’s from them, thereby returning all 65,535 columns with less than 4,096 things in my SELECT. But for now, we’ll just start with 4,096 things in my SELECT:
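In miniature, the workaround Brent describes would look something like this (a toy illustration, not his actual script, with two tiny CTEs standing in for the near-4,096-column ones):

-- Each CTE stays under the 4,096-element SELECT limit;
-- the cross join and SELECT * stitch the columns back together.
WITH FirstBatch AS (SELECT 1 AS Col0001, 2 AS Col0002),   -- imagine ~4,096 columns here
     SecondBatch AS (SELECT 3 AS Col0003, 4 AS Col0004)   -- ...and ~4,096 more here
SELECT *
FROM FirstBatch
CROSS JOIN SecondBatch;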

If you think you need 65K columns returned, I refer you to Swart’s Ten Percent Rule.


Auditing Within Power BI

Adam Saxton has a video on how to use Power BI Auditing:

In this video, I look at the Power BI Auditing feature that was made available a few weeks ago. I show how to turn it on and how to search. This can be helpful with understanding who is doing what within your organization.

You can read more about Power BI Auditing by checking out the official docs: Auditing Power BI in your organization.

Auditing data access is important enough within regulated environments that the lack of this feature was probably a deal-killer until a few weeks ago.


Migrating Data To SQL Server Using Data Factory

Ginger Grant moves data from Azure Blob Storage into Azure SQL Database using Data Factory:

There are instances where data resides in Azure Blob Storage and the data is needed in a SQL database. For example, if one ran a Machine Learning experiment in Data Factory, the results would be stored in Azure Blob Storage, and for analysis purposes, it may make a lot more sense to move the data to a SQL database. Moving data around in Data Factory means writing JSON. In this example we will be using an Azure SQL DB, but it is not essential that the data be stored in Azure. An on-premises SQL Server could also be used; as long as a gateway is added for the connection, the other steps are the same. There are five different Data Factory elements required to move data from an Azure blob to a database: a pipeline for the data, a data set containing the definition for the blob, a linked service for the blob, a data set containing a definition for the SQL data, and a linked service to connect to the SQL database.

There’s a lot of JSON ahead.


Calculating DTU

John Sterrett gives us a measure for calculating DTUs in Azure SQL Database:

The whole query is below. Right now, let’s just focus on the secret sauce. The secret sauce is how the DTU percentage gets calculated.  In a nutshell, the maximum of the CPU, Data IO, and Log Write percentages determines your DTU percentage.  What does this mean to you? Your max consumer limits you. So, you could be using 1% of your IO but still be slowed down because CPU could be your max consumer resource.

That’s a rather interesting finding.  I think the next step (which may be so context-dependent that it’s not possible to generalize) might be to figure out what various workloads do to the metrics and whether there’s a way to predict, with some reasonable accuracy, the expected DTU load given an anticipated change in workload, rather than seeing the value spike and reacting to it later.
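For reference, the core of that “max of the three” calculation can be sketched against sys.dm_db_resource_stats in Azure SQL Database (John’s full query does more than this):

-- Minimal sketch: DTU percentage as the greatest of CPU, data IO, and log write percentages.
SELECT
    end_time,
    avg_cpu_percent,
    avg_data_io_percent,
    avg_log_write_percent,
    (SELECT MAX(v)
     FROM (VALUES (avg_cpu_percent),
                  (avg_data_io_percent),
                  (avg_log_write_percent)) AS x(v)) AS dtu_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;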


Releasing An Azure Page/Blob Lease

Denny Cherry has VB code to release an Azure page or blob storage lease:

Sometimes when firing up VMs or moving VMs from the page or blob store, you’ll get an error that there is still a lease on the file.  To solve this, you need to release the lease. But waiting won’t do the trick, as the leases don’t have an expiration date.

I found some VB.NET code online that, with some tweaking (and the help of Eli Weinstock-Herman and Christiaan Baes), I was able to get to release the lease.

Click through for the code.


Copying On-Prem Databases To Azure SQL Database

Kenneth Fisher walks us through migrating a database to Azure SQL Database:

It turns out it’s pretty easy (even if it takes some time). So where to start? Well, the first thing we need is a place to put our database: an Azure SQL Database server. If you don’t already have one, creating a new one is fairly easy.

First, start at portal.azure.com. Log in and follow these steps.

This is the longer, manual process.  It’s good to walk through it this way at least once before writing a PowerShell script, just to see what the script is doing.
