Press "Enter" to skip to content

Curated SQL Posts

Neural Nets With R And Power BI

Leila Etaati continues her series on using neural nets in Power BI:

we are going to predict the concrete strength using neural network. neural network can be used for predict a value or class, or it can be used for predicting multiple items. In this example, we are going to predict a value, that is concrete strength.

I have loaded the data in power bi first, and in “Query Editor” I am going to write some R codes. First we need to do some data transformations. As you can see in the below picture number 2,3 and 4,data is not in a same scale, we need to do some data normalization before applying any machine learning. I am going to write a code for that (Already explained the normalization in post KNN). So to write some R codes, I just click on the R transformation component (number 5).

There’s a lot going on in this demo; check it out.

Comments closed

Loan Chargeoff Templates

Ajay Jagannathan announces a couple new Cortana Intelligence Solutions Gallery templates:

For more information, read this blog: End to End Loan ChargeOff Prediction Built Using Azure HDInsight Spark Clusters and SQL Server 2016 R Service

We have published two solution templates deployable using two technology stacks for the above chargeoff scenario:-

  1. Loan Chargeoff Prediction using SQL Server 2016 R Services – Using DSVM with SQL Server 2016 and Microsoft ML, this solution template walks through how to create and clean up a set of simulated data, use 5 different models to train, select the best performant model, perform scoring using the model and save the prediction results back to SQL Server. A PowerBI report connects to the prediction table and show interactive reports with the user on the chargeoff prediction.

  2. Loan Chargeoff Prediction using HDInsight Spark Clusters – This solution demonstrates how to develop machine learning models for predicting loan chargeoff (including data processing, feature engineering, training and evaluating models), deploy the models as a web service (on the edge node) and consume the web service remotely with Microsoft R Server on Azure HDInsight Spark clusters. The final predictions is saved to a Hive table which could be visualized in Power BI.

These tend to be nice because they show you how the different pieces of the Azure stack tie together.

Comments closed

Wait Stats In Query Store

Andrejs Antjufejevs has great news if you’re using Query Store:

Starting today in Azure SQL Database and from CTP 2.0 of SQL Server 2017 wait stats per query are available in Query Store. Now you can exactly identify why and how much every plan waited for some resource. Information about wait times are persisted so you can also analyze through the history what was the problems and why queries waited for resources

This is a welcome improvement for query tuners on 2017.

Comments closed

Automatic Temporal Table Data Purging

Bert Wagner has warmed the cockles of my heart with this news:

The problem with temporal tables is that they produce a lot of data. Every row-level change stored in the temporal table’s history table quickly adds up, increasing the possibility that a low-disk space warning is going to be sent to the DBA on-call.

In the future with SQL Server 2017 CTP3, Microsoft allows us to add a retention period to our temporal tables, making purging old data in a temporal table as easy as specifying:

I’m in a situation where this will be very useful.

Comments closed

Test The DBATools Beta

Chrissy LeMaire wants you to test the beta of dbatools:

Before the official release of bagofbobbish to master and the PowerShell Gallery, we need help finding bugs. Then, we’ll need some time to resolve those bugs. Hopefully this can be done before community members show off dbatools at a few key SQLSaturdays around the world this Saturday, July 8th.

We would really appreciate it if you would download the beta from GitHub and (in a test environment) see if you can find anything that doesn’t work as expected.

If you find any bugs, please file a report on GitHub. You can also reach out to us in the Slack channel.

Currently, there aren’t any webpages for the commands listed in this post, but all commands have help, so when you need help, simply type Get-Help commandName -Examples or Get-Help commandName -Full.

Get testing.  There are a lot of new commands, so if you haven’t checked out dbatools in a while, give it a go.  Also, congrats to Rob Sewell for his newly minted MVP status.

Comments closed

Option Explicit In Biml VB

Ben Weissman now has nothing to stop him from writing bad VB code in Biml:

Previously, you had to declare any kind of variable and object type (instead of just using something like “var” in C#):

1
2
3
4
5
6
7
8
<#@template Language="VB" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<# for n as integer = 1 to 25 #>
    <Package Name="MyAutomatedBiml<#= n #>"/>
<# next#>
</Packages>
</Biml>

That was true even for the most simple cases like for n as integer = 1 to 25 instead of just for n = 1 to 25, even though it is clear that his can never be anything else in this context.

Now, you can use two new attributes in the template definition: optionexplicit and/or optionstrict

Read the whole thing if you want to write VB code in Biml.  If you want to write C# code in Biml, keep doing your thing.  If you want to write F# in Biml, the pitchfork mob is organizing over here.

Comments closed

Agorics

One of my interests about a decade ago was agorics, the study of computational markets.  Mark S. Miller and K. Eric Drexler pushed this idea in the late 1990s and collected a fair portion of the work on the topic on Drexler’s website.  A sample from the section on computation and economic order:

Trusting objects with decisions regarding resource tradeoffs will make sense only if they are led toward decisions that serve the general interest-there is no moral argument for ensuring the freedom, dignity, and autonomy of simple programs. Properly-functioning price mechanisms can provide the needed incentives.

The cost of consuming a resource is an opportunity cost-the cost of giving up alternative uses. In a market full of entities attempting to produce products that will sell for more than the cost of the needed inputs, economic theory indicates that prices typically reflect these costs.

Consider a producer, such as an object that produces services. The price of an input shows how greatly it is wanted by the rest of the system; high input prices (costs) will discourage low-value uses. The price of an output likewise shows how greatly it is wanted by the rest of the system; high output prices will encourage production. To increase (rather than destroy) value as ‘judged by the rest of the system as a whole’,a producer need only ensure that the price of its product exceeds the prices (costs) of the inputs consumed. This simple, local decision rule gains its power from the ability of market prices to summarize global information about relative values.

 

I still think it’s an interesting concept, and the rise of cloud computing has, to an extent, fulfilled this idea:  AWS spot pricing is one of the best examples I know of, where resource spot prices will change depending upon load.

Comments closed

HASSP

Drew Furgiuele wants to put SQL Server into space.  I’ve linked to the entire series thus far, which has been fun to follow.  Here’s an excerpt from his latest post:

That’s from the “Hardware and Software Requirements for Installing SQL Server” product page.  I’ve had people ask if I was using a Raspberry Pi, or some other Micro ATX PC. The answer is neither; the problem with a Pi is that it’s not a 64-bit processor. Pis use ARM architecture, and SQL Server doesn’t (yet) support ARM. Also, most Pi 3’s run at 1.2Ghz and only support 1GB of RAM. As for MicroATX form factor PCs, they’re closer to what we’d need, but they’re still heavy. Plus, you’d need a pretty substantial power supply that we couldn’t (safely) support that high up in the sky. Even if you stripped it down to bare components, it’d be pushing it.

There are companies that make small SoC solutions, but after evaluating them we determined that they were either pretty flaky or got so hot they risked bursting into flames even just booting into Windows. Instead, we found a really unique piece of hardware: the Intel Joule.

Comments closed

No Curation Today

Happy 4th of July.  Because today is a day for eating hot dogs and blowing stuff up, our normally scheduled curation is on hold.  We’ll pick up once more tomorrow.

In the meantime, stand by for a couple larger links.

Comments closed

Thinking About The Data Lake

James Serra explains at a high level what the data lake metaphor is and how it works:

The data lake introduces a new data analysis paradigm shift:

OLD WAY: Structure -> Ingest -> Analyze

NEW WAY: Ingest -> Analyze -> Structure

This allows you to avoid a lot of up-front work before you are able to analyze data.  With the old way, you have to know the questions to ask.  The new way supports situations when you don’t know the questions to ask.

This solves the two biggest reasons why many EDW projects fail:

  • Too much time spent modeling when you don’t know all of the questions your data needs to answer

  • Wasted time spent on ETL where the net effect is a star schema that doesn’t actually show value

There are some good details here.  My addition would be to reiterate the importance of a good data governance policy.

Comments closed