Press "Enter" to skip to content

Month: October 2019

Mocking Objects with R

The R-hub blog has an interesting post on creating mocks in R for unit testing:

In some of these cases, the programming concept you’re after is mocking, i.e. making a function act as if something were a certain way! In this blog post we shall offer a round-up of resources around mocking, or not mocking, when unit testing an R package.

It’s interesting watching data scientists work through the same sorts of problems which traditional developers have hit, whether that be testing, deployment, or source control management. H/T R-bloggers

Comments closed

Using Lenses and GitOps to Migrate Kafka to HDInsight

Andrew Stevenson takes us through migrating from a self-managed Kafka cluster on Lenses to HDInsight using GitOps:

Let’s dig deeper with an example. I have a Self-Managed Kafka cluster and I want to migrate to HDInsight Kafka.

First, we will concentrate on topics. I may have 1000’s of topics. How do I ensure that the configuration (the metadata) are migrated efficiently?

I could do this manually, but this is error-prone, time-consuming and importantly also lacks governance and auditing. A better approach would be to automate this, which is what we can achieve with Lenses and a GitOps approach.

Click through to see how to automate this.

Comments closed

Storing Database Deployment Metadata with JSON

Phil Factor combines a couple SQL Server features to track database deployment history:

We maintain the current record where it is easy to get to and simply add an array to hold the history information. Our only headache is that we can only hold an NVARCHAR of 3750 characters (7500 of varchar characters) because extended properties are held as SQL_Variants. They need careful handling! This means that if our JSON data is larger, we have to trim off array elements that make the JSON exceed that number.

The combination of JSON and extended properties is not one that I’ve seen before—typically, there’s a deployment log table.

Comments closed

Enhanced HA/DR Benefits for SQL Server

Amit Banerjee has just made a bunch of DBAs happy:

Starting Nov 1st, every Software Assurance customer of SQL Server will be able to use three enhanced benefits for any SQL Server release that is still supported by Microsoft:

Failover servers for high availability – Allows customers to install and run passive SQL Server instances in a separate operating system environment (OSE) or server for high availability on-premises in anticipation of a failover event. Today, Software Assurance customers have one free passive instance for either high availability or DR
Failover servers for disaster recovery NEW – Allows customers to install and run passive SQL Server instances in a separate OSE or server on-premises for disaster recovery in anticipation of a failover event
Failover servers for disaster recovery in Azure NEW – Allows customers to install and run passive SQL Server instances in a separate OSE or server for disaster recovery in Azure in anticipation of a failover event

You still need to pay if you’re using the servers (as read-only replicas, for example), but this can substantially reduce your SQL Server licensing costs.

2 Comments

Embedding SSIS Packages in Azure Data Factory Pipelines

Andy Leonard shows us how to embed an SSIS package inside Azure Data Factory pipelines:

The Azure-SSIS Team has done it again; they’ve added more cool SSIS execution functionality to Azure Data Factory!

Click through to see what has Andy excited. I think this is a big thing for ADF as well, especially in shops which dedicated a lot of time and energy into building SSIS packages for ETL work over the years.

Comments closed

Building a SQL Cluster Lab

Ryan Adams has started a series on building a Windows cluster in Hyper-V and layering SQL Server on top of it:

Before we start to build a SQL Cluster Lab, let’s look at the desired result. You will build a 3-node cluster replicating an environment that has two data centers. As a result, the first two nodes will reside in data center 1 and the third node in data center 2. We are creating this architecture because it is the most common architecture I see for Availability Groups. It is multi-subnet and can solve for both HA and DR.

You will notice the domain controller in the middle. That piece is certainly not representative of a production environment. However, we are using it in our lab for several different functions and being a router is one of them.

Part 1 is the only part which is currently up, but this looks like it will be a good one. Go buy a couple more sticks of RAM for your PC and get reading.

Comments closed

SQL Server 2019: Database Snapshots and In-Memory Filegroups

Niko Neugebauer points out a small but interesting addition to SQL Server 2019:

Database Snapshots can serve for so many great purposes, such as Quick Restores and DWH/BI/Reporting reading operations, between others.

In-Memory is still the feature that I love a lot, and while the most active Programming Feature has not seen a lot (or enough) development, because frankly we are back to “Chicken & Egg, Who was first” kind of problem, I see little but important developments and most importantly feature integration. The original implementation in Sql Server 2014 and further improvements in Sql Server 2016 & Sql Server 2017 have improved the programming surface to being useful member of the toolkit, but some of the current limitations are dreading for a number of people and projects.

Well, it seems that Microsoft has been silently working on the improvements and one of them is the support for the Database Snapshots of the In-Memory File Groups in the Sql Server 2019.

I haven’t used database snapshots in a while, but I was fond of them for testing purposes and even wrote an ugly WinForms app to let devs manage them at a prior company.

Click through for Niko’s demonstration as well as a limitation with this.

Comments closed

Changing Colors in Bash

Kellyn Pot’vin-Gorman shows how you can change the colors you see in bash shells:

I agree with him-  you never know what the projector quality will be, the lighting in the room, color-blind attendees or other factors that could impact the readability of  the demonstration when you have a black background and colored text.  I realized, as the Azure Cloud Shell is a service, we have less control of the terminal offered to us, so it was important to tell people how to update their Azure Cloud Shell to change the execution prompt to not be in color and highlight the background in white, with black text bolded for easier reading.

Click through for a sample .bashrc file as well as a bonus “How do I exit vim?” bit. The correct answer is, you never exit vim; you simply dedicate the rest of your life to it.

Comments closed

Offset and Limit with Cosmos DB

Hasan Savran takes us through the OFFSET and LIMIT clauses in Cosmos DB:

OFFSET LIMIT clause one of the latest additions to the Azure Cosmos DB. Skip/Take function was a big request from users and Cosmos DB team listened users and deliver this functionality. If you think Cosmos DB is missing a feature and if you have a new idea, you can use Feedback Forums to give feedback to Cosmos Db team.


     OFFSET LIMIT clause let you skip x number of results then take y numbers of values from the query. Count for OFFSET and Limit are integer and both are required. In other words, You must use LIMIT if you use OFFSET.

A common use for this is paging. I’d be interested to see if this shares the issues that the SQL Server version has: you may only return back 20 rows, but you’re potentially scanning N + 20 each time.

Comments closed

Kafka Consumer Delivery Semantics

Sylvester Daniel continues a series on Apache Kafka:

Enable.auto.commit
This defines how offsets are committed to Kafka — by default  enable.auto.commit is set to true. When this property is set to true, you may also want to set how frequent offsets should be committed using  auto.commit.interval.ms.

By default,auto.commit.interval.ms is set to 5,000ms (5 seconds). When enable.auto.commit is set to true, consumer delivery semantics is “At most once, and commits are async.

Click through for more information on what this means, as well as several other important settings.

Comments closed