Press "Enter" to skip to content

Author: Kevin Feasel

Updating A Collection Of Variables In Powershell

Klaas Vanbenberghe shows how to update multiple variables in the same way using Powershell:

We turn to two cmdlets: Get-Variable and Set-Variable. They may seem redundant as we get and set variables all the time without those. Maybe you have even never heard of these two cmdlets.
Well here they prove their usefulness:
we will set the variables using their names, and use the current value as a starting point.
An important rule to remember is this:

$ is a token to indicate we’re dealing with a variable, but it is not part of the variable name!

It’s worth reading the whole thing.

Comments closed

Neural Nets Optimizing For Imperfect

John Cook describes a paradox with neural nets:

Deep neural networks have enough parameters to overfit the data, but there are various strategies to keep this from happening. A common way to avoid overfitting is to deliberately do a mediocre job of fitting the model.

When it works well, the shortcomings of the optimization procedure yield a solution that differs from the optimal solution in a beneficial way. But the solution could fail to be useful in several ways. It might be too far from optimal, or deviate from the optimal solution in an unhelpful way, or the optimization method might accidentally do too good a job.

Conceptually, this feels a little weird but isn’t really much of a problem, as we have other analogues:  rational ignorance in economics (where we knowingly choose not to know something because the benefit is not worth the opportunity cost of learning), OPTIMIZE FOR UNKNOWN with SQL Server (where we knowingly do not use the passed-in parameter because we might get stuck in a lesser path), etc.  But the specific process here is interesting.

Comments closed

COALESCE, ISNULL, And Data Types

Shane O’Neill shows a subtle difference between ISNULL and COALESCE:

You may be asking yourself “What the…?!”.

I asked myself the same question but then I thought of a better idea and asked my Senior the question instead (he usually has better answers than I do), who proceeded to tell me that the final test would only work exactly at midnight!

….so I repeated my question to him.

It’s an interesting read, and not something you’d commonly think about.

Comments closed

Backup Compression And Encryption

Arun Sirpal shows the combined effects of backup encryption and backup compression in SQL Server 2017:

Do not forget about the certificate! Warning: The certificate used for encrypting the database encryption key has not been backed up. Imagine if you need to recover the backup and you can’t?  You will get the dreaded thumbprint error.

Msg 33111, Level 16, State 3, Line 25 Cannot find server certificate with thumbprint ‘0x78FAB5A2A5D593FD3C4E163C90B745F70AB51233’. Msg 3013, Level 16, State 1, Line 25

RESTORE DATABASE is terminating abnormally.

So make sure you respect this certificate (and the key) and back it up and re-create them on the target server for a successful restore.

In SQL Server 2016 and 2017, there’s no reason not to encrypt backups; the marginal cost is practically nil even if you’re low enough on disk space that you need to do backup compression.

Comments closed

Use Fixed Filegrowth Settings

Andy Galbraith notes that you should use fixed-increment filegrowth settings for log and data files:

As you probably already know, the key flaw to percentage-based FILEGROWTH is that over time the increment grows larger and larger, causing the actual growth itself to take longer and longer.  This is especially an issue with LOG files because they have to be zero-initialized before they can be used, causing excessive I/O and file contention while the growth is in progress.  Paul Randal (blog/@PaulRandal) describes why this is the case in this blog post.  (If you ever get a chance to see it Paul also does a fun demo in some of his classes and talks on why zero initialization is importan, using a hex editor to read the underlying contents of disk even after the old data is “deleted”)

Andy also has a script to change filegrowth to fixed-increment growth depending upon the size of the file, so check that out.

Comments closed

Imbalanced Data In R

Rathnadevi Manivannan explains how to deal with imbalanced data using R:

Imbalanced data refers to classification problems where one class outnumbers other class by a substantial proportion. Imbalanced classification occurs more frequently in binary classification than in multi-level classification. For example, extreme imbalanced data can be seen in banking or financial data where majority credit card uses are acceptable and very few credit card uses are fraudulent.

With an imbalanced dataset, the information required to make an accurate prediction about the minority class cannot be obtained using an algorithm. So, it is recommended to use balanced classification dataset.

Rathnadevi uses fraudulent transactions for his sample, but medical diagnoses is also a good example:  suppose 1 person in 10,000 has a particular disease.  You’re 99.99% right if you just say nobody has the disease, but that’s a rather unhelpful model.

Comments closed

Getting Started With Apache Mesos

Mahesh Chand Kandpal shows how to install Apache Mesos:

Follow the following instructions to install required packages and other Mesos dependencies.

# Update the packages.
$ sudo apt-get update

# Install a few utility tools.
$ sudo apt-get install -y tar wget git

# Install the latest OpenJDK.
$ sudo apt-get install -y openjdk-8-jdk

# Install other Mesos dependencies.
$ sudo apt-get -y install build-essential python-dev python-six python-virtualenv libcurl4-nss-dev libsasl2-dev libsasl2-modules maven libapr1-dev libsvn-dev zlib1g-dev

Then, Anubhav Tarar shows how to install Spark on top of Mesos:

7.now got to $SPARK_HOME/CONF

inside your spark-env.sh add following parameters

export MESOS_NATIVE_JAVA_LIBRARY= /usr/local/lib/libmesos.so
export SPARK_EXECUTOR_URI=/path/to/spark-2.2.0-bin-hadoop2.7.tgz

8. start spark shell with mesos as master

./bin/spark-shell –master mesos://127.0.0.1:5050

Mesos is a rather interesting platform, and if you’re getting interested in Hadoop and Spark, it’s worth learning about this.

Comments closed

Comparing Tree Graphs In SQL

Dmitriy Vlasov shows how to compare two trees in PL/SQL:

During the day, various changes are received by the accounting system from the design system. Production planning is based on the data from the accounting system. Conditions allow you to accept all the changes for the day and recalculate the product specification at night. However, as I wrote above, it is unclear how the yesterday state of the product differs from the today one.

I would like to see what was removed from the tree and what was added to it, as well as which part or assembly replaced another one. For example, if an intermediate node was added to the tree branch, it would be wrong to assume that all the downstream elements were removed from the old places and added to the new ones. They remained where they were, but the insert of the mediation node took place.  In addition, the element can ‘travel’ up and down only within one branch of the tree due to the specifics of the manufacturing process.

This is Oracle-specific; migrating it to another platform like SQL Server would take a bit of doing.

Comments closed

Extended Events Profiler

Marek Masko shows off the new Extended Events Profiler In SQL Server Management Studio 17.3:

XE Profiler looks promising and can be really a great feature. We can use it with no issues on any version of SQL Server which supports extended events – not only with newest SQL Server 2017. I tested it with SQL Server 2014 and it was working well. Currently, lack of configuration of new templates, and logic based on hard-coded names is the biggest concern and discomfort for the user. However Microsoft didn’t officially release yet this version of SQL Server Management Studio, so it’s hard to say what will be the final feature functionality.

I’m hoping that when the final version appears, it will be good enough to get people finally to kick the Profiler habit.

Comments closed