Press "Enter" to skip to content

Day: December 7, 2017

Estimating Used Car Prices

Kevin Jacobs wants to estimate the value of his car and shows how to set up a machine learning job to do this:

As you can see, I collected the brand (Peugeot 106), the type (1.0, 1.1, …), the color of the car (black, blue, …) the construction year of the car, the odometer of the car (which is the distance in kilometers (km) traveled with the car at this point in space and time), the ask price of the car (in Euro’s), the days until the MOT (Ministry of Transport test, a required periodical check-up of your car) and the horse power (HP) of the car. Feel free to use your own variables/units!

It’s an interesting example of how you can approach a real problem.

Comments closed

Introduction To Neural Nets

Ben Gorman has a two-part series introducing neural networks.  First, the basics behind neural networks:

We can solve both of the above issues by adding an extra layer to our perceptron model. We’ll construct a number of base models like the one above, but then we’ll feed the output of each base model as input into another perceptron. This model is in fact a vanilla neural network. Let’s see how it might work on some examples.

Then, he digs into the mathematics of backpropagation:

Our problem is one of binary classification. That means our network could have a single output node that predicts the probability that an incoming image represents stairs. However, we’ll choose to interpret the problem as a multi-class classification problem – one where our output layer has two nodes that represent “probability of stairs” and “probability of something else”. This is unnecessary, but it will give us insight into how we could extend task for more classes. In the future, we may want to classify {“stairs pattern”, “floor pattern”, “ceiling pattern”, or “something else”}.

Our measure of success might be something like accuracy rate, but to implement backpropagation (the fitting procedure) we need to choose a convenient, differentiable loss function like cross entropy. We’ll touch on this more, below.

This is definitely a series to read after you’ve gotten your coffee.

Comments closed

Aggregating Data Migration Assistant Assessments

Dustin Ryan has put together a Power BI template to collate results for a number of Azure Data Migration Assistant assessments:

Recently a customer reached out to me to help with the challenging task of understanding the assessment results of 61 SQL environments including over 500 databases  being considered for migrating to Azure SQL Database. Now there is already a great solution that exists for aggregating DMA assessment exports but it only works for assessment exports in .JSON format. The existing solution also requires that the assessment results be written to a SQL Server database.

So I built a solution that uses Power BI to parse the DMA assessment exports (.CSV format) and aggregate the data so it can be more effectively browsed and understood in a report.

Click through for a link to download that template, as well as additional resources.

Comments closed

Fetching U-SQL Job Input And Output Paths

Matthew Hicks shows how to retrieve information on U-SQL input and output paths using Powershell:

Each time you submit a U-SQL job, a job folder is created in your Azure Data Lake Store account. This folder contains useful debugging information about the job, including a file called the U-SQL algebra file. This is an XML file containing information about your job graph, the list of input and output files, and other key U-SQL job metadata.

We’ve just published a sample script that reads the U-SQL algebra file for a specified job and returns the input or output files. Give it a try!

Read on for more.

Comments closed

Restricting Login Usage

Kenneth Fisher shows how to prevent people from using those high-power application accounts:

Anyone of these would cause you to fail a security audit. All of them together? Not good.

So how do we fix it? Well, the best possible method is to not give your developers the password. Use config files containing an encrypted copy of the password and you can dramatically limit knowledge of the password. However, that isn’t necessarily a quick or easy solution (modifying the app to use a config file at all for example). So what to do in the meantime?

The simplest thing to do is to create a logon trigger to control where this account can come from. Before we start if you are going to use a logon trigger make sure you know how to log in and disable it if there are any mistakes.

The logon trigger is hardly perfect, but it does help at the margin.

Comments closed

The TREATAS Function In DAX

Matt Allington explains what the TREATAS function does:

The TREATAS function can be used to detect filters from your visual (filter context) and then apply these filters to a disconnected table in your data model.

  • It takes a source table (first parameter) and applies the values from that table to columns in a target table (second and subsequent parameters).

  • You can use a function like VALUES as the first parameter to detect the initial filter context in a visual and hence TREATAS can propagate filter context to the target table.

  • You do not need to have a physical relationship between the source table and the target table.  It therefore means that TREATAS can be used as a virtual many to many relationship.

  • You can pass multiple filters (columns) from the source table to the target table.  TREATAS can therefore can be used to apply multiple relationships (ie on more than one column) between tables.

Read on for a good example of how this works.

Comments closed