Plotting ML Results In R

Bernardo Lares shows off the plots he creates in R to compare ML models:

Split and compare quantiles

This parameter is the easiest to sell to the C-level guys. “Did you know that with this model, if we chop the worst 20% of leads we would have avoided 60% of the frauds and only lose 8% of our sales?” That’s what this plot will give you.

The math behind the plot might be a bit foggy for some readers so let me try and explain further: if you sort from the lowest to the highest score all your observations / people / leads, then you can literally, for instance, select the top 5 or bottom 15% or so. What we do now is split all those “ranked” rows into similar-sized-buckets to get the best bucket, the second best one, and so on. Then, if you split all the “Goods” and the “Bads” into two columns, keeping their buckets’ colours, we still have it sorted and separated, right? To conclude, if you’d say that the worst 20% cases (all from the same worst colour and bucket) were to take an action, then how many of each label would that represent on your test set? There you go!
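The bucket logic described above can be sketched in a few lines of Python. This is only an illustration, not Bernardo's R code: the function name `cumulative_share_by_bucket`, the sample scores and labels, and the convention that the lowest scores are the "worst" rows are all assumptions made for the example.

```python
def cumulative_share_by_bucket(scores, labels, n_buckets=5):
    """Rank rows by score (lowest first), cut them into near-equal buckets,
    and return the cumulative share of each label captured after each bucket."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0])
    n = len(ranked)

    # Total count of each label across the whole test set.
    totals = {}
    for _, label in ranked:
        totals[label] = totals.get(label, 0) + 1

    seen = {label: 0 for label in totals}
    result = []
    for b in range(n_buckets):
        # Integer arithmetic keeps bucket sizes within one row of each other.
        start = b * n // n_buckets
        end = (b + 1) * n // n_buckets
        for _, label in ranked[start:end]:
            seen[label] += 1
        result.append({label: seen[label] / totals[label] for label in totals})
    return result


# Hypothetical test set: low scores are assumed to be the riskiest rows.
scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]
labels = ["bad", "bad", "bad", "good", "bad",
          "good", "good", "good", "good", "good"]

shares = cumulative_share_by_bucket(scores, labels, n_buckets=5)
for i, share in enumerate(shares, start=1):
    print(f"worst {i * 20}%: bad={share['bad']:.0%}, good={share['good']:.0%}")
```

With this made-up data, dropping the worst 20% of rows would remove half of the "bad" cases and none of the "good" ones, which is the kind of statement the plot is meant to support.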

Read on to see what else he uses and how you can build it yourself.

SQL Operations Studio July Edition

Alan Yu announces a new version of SQL Operations Studio:

Highlights for this release include the following.

  • SQL Server Agent preview extension Job configuration support
  • SQL Server Profiler preview extension Improvements
  • Combine Scripts Extension
  • Wizard and Dialog Extensibility
  • Social content
  • Fix GitHub Issues

For complete updates, refer to the Release Notes.

Alan also has demos for each of these.  I still wish that they wouldn’t call their Extended Events viewer “Profiler” because that makes it harder for us to explain the difference between “good Profiler” and “bad Profiler.”

Custom Statistics Block Column Alteration DDL

Max Vernon demonstrates that custom statistics can prevent you from modifying a column:

Interestingly, if SQL Server has auto-created a stats object on a column, and you subsequently modify that column, you receive no such error. SQL Server silently drops the statistics object, and modifies the column. The auto-created stats object is *not* automatically recreated until a query is executed that needs the stats object. This difference in how auto-created stats and manually created stats are treated by the engine can make for some confusion.

Just one more thing to think about if you manually create statistics on tables.  But at least the error message is clear.

Power BI Helper 3.0

Reza Rad has a new version of Power BI Helper out:

It is a pleasure to announce the newest version of Power BI helper, version 3.0 July 2018 with the great feature of exporting model documentation. The documentation part of the insight from Power BI Helper has been always in our backlog, but haven’t had a chance to work on it. Gladly now you can export the document to an HTML file. The exported documentation at the moment, has information about all tables in the model, all measures, all columns and calculated columns in each table with possible expressions and descriptions. If you like to learn more about Power BI Helper, read this page.

Read on to see what Power BI Helper has in terms of documentation.

SQL Server Backup Restoration With LOADHISTORY

Kenneth Fisher explains what the LOADHISTORY option means when you run a RESTORE VERIFYONLY command:

So, first of all, it only works with RESTORE VERIFYONLY. RESTORE VERIFYONLY does some basic checking on a backup to make sure that it can be read and understood by SQL. Please note, it does not mean that the backup can be restored. It will check things like the checksum, available diskspace (if you specify a location), the header and that the backup set is actually complete and readable. Basically enough to see if it will start restoring, but it could still have errors later on.

As for what LOADHISTORY actually does? It causes you to write an entry to the restore history table. You can tell which record this is because the restore_type is set to a V. Really, the only benefit here (as I see it) is that you can do reporting on what backups you’ve verified.

Click through for a demo.


July 2018