Calculating Lifetime Value With R

Sergey Bryl shows how to calculate the lifetime value of a subscription service:

Predicting LTV is a common issue for a new, recently launched product/service/application when we don’t have a lot of historical data but want to calculate LTV as soon as possible. Even though we may have a lot of historical data on customer payments for a product that is active for years, we can’t really trust earlier stats since the churn curve and LTV can differ significantly between new customers and the current ones due to a variety of reasons.

Therefore, regardless of whether our product is new or “old”, we attract new subscribers and want to estimate what revenue they will generate during their lifetimes for business decision-making.

This topic is closely connected to the Cohort Analysis and if you are not familiar with the concept, I recommend that you read about it and look at other articles I wrote earlier on this blog.

Read the whole thing.

Writing SQL Against Elasticsearch

Guy Shilo shows how you can write SQL to query Elasticsearch:

The mappings Elastic SQL uses are:

Index = Table

Document = Row

Field = Column

This mapping is quite intuitive. Types are left out because they are obsolete in Elastic 6.0 on.

So let’s give it a try. I used the latest Elastic 6.4 for this demonstration and ran the queries from Kibana, although they can be run with curl or just a browser as well. First we will need some data. I found this article in Elastic documentation that suggests several data files ready to be loaded. I did not need all of the data so I only used the json file that contains all the works of William Shakespeare that can be downloaded here.

Feasel’s Law continues.

Watch Those Powershell Shell Versions

Emin Atac reviews a couple of 64- versus 32-bit Powershell oddities:

    • Context

My colleagues send an message with a link that points to a script located on a shared drive to help our users reinstall their software.
Our users just click on the link in their Outlook and got a message saying:
\\servername.fqdn\share\softwarename\install.ps1 cannot be loaded because running scripts is disabled on this system. For more information, see about_Execution_Policies at https:/go.microsoft.com/fwlink/?LinkID=135170.

    • Issue

Users use Outlook that is a 32-bit process. If they click on link that points to a script, it will spawn a 32-bit console and run a 32-bit powershell.exe child process.
It appears that the ExecutionPolicy isn’t defined in the 32-bit PowerShell and set to its default value: “Restricted” although it’s defined in the 64-bit Powershell.

Read on for the solution to this issue as well as a second, similar issue.

Sending SQL Server Notifications To Slack

Alessandro Alpi shows how to integrate SQL Server notifications with Slack:

Now, how can we send notifications from SQL Server in an easier way than using custom code or a Slack incoming webhook? Is there any integration or a Slack app?  Yes. And guess what? I think you’ll like it because you don’t need to write a single line of code, and you don’t need to choose between CLR, PowerShell or any other language. It’s ironic, but the integration is called “Email”.

Speaking of CLR, I’ve had success with the SqlServerSlackAPI in the past.

Running SQL Server On Cluster Shared Volumes

Sreekanth Bandarla continues a series on clustered shared volumes:

In the previous part of this series, we have seen what a cluster shared volume is and what are the advantages and other considerations to keep in mind when deploying CSVs for SQL Server workloads. In this article, I will walk though actual installation of a failover cluster Instance leveraging CSVs.

To begin with, I will walk you through my cluster setup from 20,000 foot view. I created two brand new VMs running windows server 2012 R2 and renamed them accordingly. Nothing special w.r.t disk drives at this point, Just basic VMs with a system drive(C$).

The rest of the story is over at SQLShack.

The DAX Guide

Marco Russo announces a new site:

  • What is DAX Guide? DAX Guide is a website offering a complete reference to the DAX language. Every function is presented with its complete syntax, a short description, and links to related functions and articles.

  • Is DAX Guide a tutorial to learn DAX?No, DAX Guide is not designed as a learning tool. The goal of DAX Guide is to provide a quick reference with accurate information. The only commitment is “quality first”.

  • What are some unique features of DAX Guide?DAX Guide is updated automatically through the monitoring of new versions of Microsoft products. Every DAX function comes with a compatibility matrix describing in which Microsoft products and versions the function may be available. Additional attributes highlight which functions perform a context transition, which arguments are executed within a row context, and which functions are obsolete or deprecated – in our opinion.

If that sounds interesting to you, check it out.

A Quick Look At Data Visualization Tools

Vincent Wong walks us through data visualization tools on the market today:

Highcharts

When we talk about Echarts, we will usually compare it with Highcharts. The relationship between them is a bit like the relationship between WPS and Office.

Highcharts is also a visualization library which you have to pay for it if you are going to use it. It has many advantages, for example, its documents and tutorials, JS scripts, and CSS are very detailed. It saves time and allows you to pay more attention to learning and developing. What’s more, it is very stable.

There are some good tools on this list.

It’s All ETL (Or ELT) In The End

Robin Moffatt notes that ETL (and ELT) doesn’t go away in a streaming world:

In the past we used ETL techniques purely within the data-warehousing and analytic space. But, if one considers why and what ETL is doing, it is actually a lot more applicable as a broader concept.

  • Extract: Data is available from a source system
  • Transform: We want to filter, cleanse or otherwise enrich this source data
  • Load: Make the data available to another application

There are two key concepts here:

  • Data is created by an application, and we want it to be available to other applications
  • We often want to process the data (for example, cleanse and apply business logic to it) before it is used

Thinking about many applications being built nowadays, particularly in the microservices and event-driven space, we recognize that what they do is take data from one or more systems, manipulate it and then pass it on to another application or system. For example, a fraud detection service will take data from merchant transactions, apply a fraud detection model and write the results to a store such as Elasticsearch for review by an expert. Can you spot the similarity to the above outline? Is this a microservice or ETL process?

Things like this are reason #1 why I expect data platform jobs (administrator and developer) to be around decades from now.  The set of tools expand, but the nature of the job remains similar.

Simplified Disaster Recovery With dbatools

Chrissy LeMaire shows how you can make DR a lot easier with dbatools:

When we talk about Disaster Recovery or DR, it’s often coupled with the term High Availability or HA. Here are some definitions from my graduate course on HADR.

high availability

  • Deals with minor outages, and failover solutions are automated
  • The goal is to restore full system functionality in a short time

disaster recovery

  • Deals with major outages such as natural and man-made disasters
  • Focuses on manual processes and procedures to restore systems back to their original state
  • Characterized by a phased approach to restoring the primary site

In the context of SQL Server, HA would be Availability Groups (AG), Failover Clustering (FCI), Log Shipping and more. I won’t be addressing High Availability in this post, however.

Chrissy has a demo of everything in action, including running a series of tests to ensure that your DR site actually has everything.

New(ish) VLF Status: 4

Paul Randal points out a new VLF status which can appear if you’re using an Availability Group:

At least since I started working on the SQL Server team (just after 7.0 shipped) and since then there have only been two VLF status codes:

  • 0 = the VLF is not active (i.e. it can be (re)activated and overwritten)
  • (1 = not used and no-one seems to remember what it used to mean)
  • 2 = the VLF is active because at least one log record in it is ‘required’ by SQL Server for some reason (e.g. hasn’t been backed up by a log backup or scanned by replication)

A few weeks ago I learned about a new VLF status code that was added back in SQL Server 2012 but hasn’t come to light until recently (at least I’ve never encountered it in the wild). I went back-and-forth with a friend from Microsoft (Sean Gallardy, a PFE and MCM down in Tampa) who was able to dig around in the code to figure out when it’s used.

Read on to uncover the mysteries of the VLF status of 4.

Categories

September 2018
MTWTFSS
« Aug  
 12
3456789
10111213141516
17181920212223
24252627282930