# Day: February 15, 2018

Colorization is a computer-assisted process of adding color to a monochrome image or movie. In the paper the authors presented an optimization-based colorization method that is based on a simple premise: neighboring pixels in space-time that have similar intensities should have similar colors.

This premise is formulated using a quadratic cost function  and as an optimization problem. In this approach an artist only needs to annotate the image with a few color scribbles, and the indicated colors are automatically propagated in both space and time to produce a fully colorized image or sequence.

In this article the optimization problem formulation and the way to solve it to obtain the automatically colored image will be described for the images only.

It’s an interesting approach.

First, we need to install ggradar and load our relevant libraries. Then, I create a quick standardization function which divides our variable by the max value of that variable in the vector. It doesn’t handle niceties like divide by 0, but we won’t have any zero values in our data frames.

The `radar_data` data frame starts out simple: build up some stats by continent. Then I call the `mutate_each_` function to call `standardize` for each variable in the `vars` set. `mutate_each_`is deprecated and I should use something different like `mutate_at`, but this does work in the current version of ggplot2 at least.

Finally, I call the `ggradar()` function. This function has a large number of parameters, but the only one you absolutely need is plot.data. I decided to change the sizes because by default, it doesn’t display well at all on Windows.

It was a lot of fun putting this series together. I think the most important part of the series was learning just how easy ggplot2 is once you sit down and think about it in a systemic manner.

Writing M in the Advanced Editor in Excel or Power BI can be a frustrating experience unless you’re the kind of masochist who loves writing code in Notepad. There are some options for writing M code outside Excel and Power BI, for example Lars Schreiber’s M extension for Notepad++ (see here for details) or the M extension for Visual Studio Code (available from the Visual Studio Marketplace here; more details on Brett Powell’s blog here), but the trouble with them is that you have to copy the code back into Excel or Power BI to run it. What many people don’t realise, however, is that it is possible to write M code and have IntelliSense, formatting, keyword highlighting and also the ability to execute your own M queries, using the Power Query SDK in Visual Studio.

The Power Query SDK (which you can download here) supports Visual Studio 2015 and 2017 and is intended for people who are writing custom Data Connectors for Power BI. To let you test your Data Connector you can create a .pq file containing M code, and this in fact allows you to run any M query you want whether you’re building a Data Connector or not.

And then, once you get comfortable with M, start learning F#.  That will allow you to laugh haughtily at those poor object-oriented sods out there.

Sit a spell and let Grandpa Andy tell yall a story about some data integratin’.

Suppose for a minute that you’ve read and taken my advice about writing small, unit-of-work SSIS packages. I wouldn’t blame you for taking this advice. It’s not only online, it’s written in a couple books (I know, I wrote that part of those books). One reason for building small, function-y SSIS packages is that it promotes code re-use. For example, SSIS packages that perform daily incremental loads can be re-used to perform monthly incremental loads to a database that serves as a data mart by simply changing a few parameters.

Change the parameter values and the monthly incremental load can load both quarterly and yearly data marts.

You want better performance out of the daily process, so you read and implement the parallel execution advice* you’ve found online. For our purposes let’s assume you’ve designed a star schema instead of one of those pesky data vaults (with their inherent many-to-many relationships and the ability to withstand isolated and independent loads and refreshes…).

You have dependencies. The dimensions must be loaded before the facts. You decide to manage parallelism by examining historical execution times. Since you load data in chronological order and use a brute-force change detection pattern, the daily dimension loads always complete before the fact loads reach the latest data. You decide to fire all packages at the same time and your daily execution time drops by half, monthly executions time drops to 40% of its former execution time, and everyone is ecstatic…

This is a great post.

In SQL Server, when we use the “WITH SCHEMABINDING” clause in the definition of an object (view or function), we bind the object to the schema of all the underlying tables and views. This means that the underlying tables and views cannot be modified in a way that would affect the definition of the schema-bound object. It also means that the underlying objects cannot be dropped. We can still modify those tables or views, as longs as we don’t affect the definition of the schema-bound object (view or function).

If we reference a view or function in a schema-bound object (view or function), then the underlying view or function must also be schema-bound. And we cannot change the collation of a database, if it contains schema-bound objects.

I’ve only used schemabinding when mandated (e.g., using row-level security or creating an indexed view), but I can see the value behind using it with normal development.

Recently one of our development teams has increased the request of importing an Excel file with 20 sheets in to 20 tables in a database from about once a quarter to multiple times a week and this past Monday was three times in one day.  I have been the lucky DBA to get these requests as of late and after Monday I was determined to fix the process.  The current procedure is to use the good ol’ Import/Export Wizard since this was a rare request. (This included a lot of point and click and possibility for manual error)  With increased requests and increased table counts I knew there had to be a better way to get this accomplished without grimacing each time I see the request.

Garry has a script which he uses, but which can be tailored for other uses pretty easily.

We setup new PsSessions using `New-PsSession`, I set `ErrorAction` to `SilentlyContinue` just in case a host isn’t available for some reason (if I was being good I’d try/catch here).

As we’re just using PS standard functionality here with `Get-Service` there’s no need to build a a new function, we can just call this directly here. By calling `Invoke-Command` against a session pointed at numerous hosts we can PowerShell handle all the connection management here and just assume the command will be ran against each host. if we were running against a lot of hosts then we would want to look into using the `-ThrottleLimit` parameter to limit the number of concurrent hosts we’re hitting. The one little trick here is using the `using` scope modifier here so PS pulls in the variable defined in our main scope (gory details on scoping here

Click through for the script, and do check out the comments, where Stuart gives a bit of advice when you’re trying to execute against a large number of servers.