Press "Enter" to skip to content

Author: Kevin Feasel

Master Data Services No Longer Uses Silverlight

Niko Neugebauer is happy about an update to Master Data Services in SQL Server 2019:

Before we continue, let me ask you one question, have you heard about Silverlight?
Or in other words, and with a kind of evil voice “DID YOU EVER INSTALLED SILVERLIGHT ON A PRODUCTION SERVER”?.
If you have worked with MDS oh yes, you did! At least in order to check if everything is configured/upgraded correctly and nothing is broke, I will do a wild guess and claim that you did! So am I … :s

Because in order to make things work in MDS correctly, one needs this old, for a very long time deprecated framework, that is supported only in deprecated browser that is called Internet Explorer v.11, and that pain-in-the-neck framework is called Silverlight and if you dare to work with any SQL Server versions before SQL Server 2019, the picture on the left will appear in front of you at the moment you will try to explore the master data in the MDS Explorer – ensuring that unless you install a totally abandoned (and obviously unnecessary product, that represents another risk on your server) is a necessary thing. That is alone is the reason for some people would use some development VM in order to work with MDS, but that is not a good excuse to include that product in SQL Server 2016 or in SQL Server 2017.

The interface still has problems, as Niko points out, but hopefully this is the first step and not the last one.

Comments closed

Looking At Databricks Cluster Pricing

Tristan Robinson takes a look at Azure Databricks pricing:

The use of databricks for data engineering or data analytics workloads is becoming more prevalent as the platform grows, and has made its way into most of our recent modern data architecture proposals – whether that be PaaS warehouses, or data science platforms.

To run any type of workload on the platform, you will need to setup a cluster to do the processing for you. While the Azure-based platform has made this relatively simple for development purposes, i.e. give it a name, select a runtime, select the type of VMs you want and away you go – for production workloads, a bit more thought needs to go into the configuration/cost.  In the following blog I’ll start by looking at the pricing in a bit more detail which will aim to provide a cost element to the cluster configuration process.

There are a few complicating factors in figuring out cluster price but rest assured that it will be costly.

Comments closed

Sorting And Aggregating Extended Events Results

Matthew McGiffen shows off some of the things you can do easily with Extended Events Profiler:

When I’m using Profiler to analyse performance issues I often save the results to a table, or upload a trace file into a table, so that I can analyse the data. Often this involves aggregating the values for particular queries so that I can see the most resource hungry.

This is by no means a difficult process, but with Extended Events (XE) it’s arguably even easier.

Click through for a demonstration.

Comments closed

Validating SSIS Packages Using T-SQL

Annie Xu shows us how to validate SSIS packages in the SSISDB catalog using T-SQL:

Recently, I need to do a data warehouse migration for a client. Since there might be some difference between the Dev environment source databases and Prod environment source databases. The migrated SSIS packages for building data warehouse might have some failures because of the changes. So the challenge is how can I validate all my DW packages (100 +) all at once.

Click through for the script.

Comments closed

Contrasting Integration Services And Pentaho Data Integration

Koen Verbeeck contrasts SQL Server Integration Services with Pentaho Data Integration:

For generating SSIS packages, you need to rely on Biml (much about that can be found on this blog or on the net), or older frameworks such as ezApi. Or you need 3rd party tools such as BimlStudio or TimeXtender. Using Biml means writing XML and .NET. Don’t get me wrong, I love Biml and I use it a lot in my SSIS projects.

But generating transformations in PDI is so much easier. First, you create a template (you create a transformation, but you leave certain fields empty, such as the source SQL statement and the destination table). Then you have another transformation reading metadata. This metadata is pushed to the template using the Metadata Injection Transformation. In this transformation, you point to the template and you map those empty fields to your metadata fields.

It’s interesting to see where each product stands out or falls flat compared to the other, and Koen’s comparison is definitely not a one-sided bout.

Comments closed

Replicating Solr Indexes

Nirmal Prabhu walks us through configuring replicated Solr instances:

Step 4: [Creating master Core]

First, we need to create a core for indexing the data. The Solr create command has the following options:

  • -c <name> — Name of the core or collection to create (required).
  • -d <confdir> — The configuration directory, useful in the SolrCloud mode.
  • -n <configName> — The configuration name. This defaults to the same name as the core or collection.
  • -p <port> — Port of a local Solr instance to send the create command to; by default the script tries to detect the port by looking for running Solr instances.
  • -s <shards> — Number of shards to split a collection into, default is 1.
  • -rf <replicas> — Number of copies of each document in the collection. The default is 1.

In this example, we will use the -c parameter for core name, -rf parameter for replication and -d parameter for the configuration directory.

Read on for step-by-step instructions.

Comments closed

Lazy Log Truncation

Paul Randal explains why Virtual Log Files might remain in status 2 even after they are cleared:

Earlier this year I was sent an interesting question about why the person was seeing lots of VLFs in the log with status = 2 (which means ‘active’) after clearing (also known as ‘truncating’) the log and log_reuse_wait_desc showed NOTHING.

I did some digging around and all I could find was an old blog post from 2013 that shows the behavior and mentions that this happens with mirroring and Availability Groups. I hadn’t heard of this behavior before but I guessed at the reason, and confirmed with the SQL Server team.

Read on for the answer.

Comments closed

Approved Powershell Verbs

Richard Siddaway on approved verbs in Powershell:

The other very useful set information are the synonyms for verbs that you shouldn’t use. For instance don’t use Append, Attach, Concatenate or Insert – use Add. Some of this information is contextual though as you shouldn’t use Pop or Out as a synonym for Exit BUT Pop is perfectly valid when removing an item off a stack (Pop-Location is the only cmdlet I know of that works in that way).

Read on for a link to the approved verbs list.

Comments closed

Going In-Depth On Powershell Arrays

Kevin Marquette has a tour de force on Powershell arrays:

When your array is a collection of string or integers (value types), sometimes you will want to update the values in the array as you enumerate them. Most of the iteration loops above use a variable in the loop that holds the value. If you update that variable, the original value in the array is not updated.

The exception to that statement is the for loop. If you are wanting to walk an array and update values inside it, then the for loop is what you are looking for.

    for ( $index = 0; $index -lt $data.count; $index++ )
    {
        $data[$index] = "Item: [{0}]" -f $data[$index]
    }

This examples takes a value by index, makes a few changes, and then uses that same index to assign it back.

This is a book chapter-length blog post full of good information.

Comments closed

Safely Dropping Databases

Bob Pusateri notes a little issue when it comes to dropping databases:

At a previous employer, we had a well-defined process when dropping databases for a client. It went something like this:

  1. Confirm in writing the databases on which servers/instances to be dropped
  2. Take a final full backup of databases
  3. Take databases offline
  4. Wait at least two weeks to make sure nothing breaks in the absence of this database
  5. Drop databases

This is a pretty good and safe method. If taking the database offline causes some unforeseen system to stop working, it can be very quickly brought back online in-place, instead of having to locate the backup and restore it. But it there’s just one problem…

Read on for that problem and its solution.

Comments closed