Press "Enter" to skip to content

Month: November 2020

The ETLT Pattern

Abe Dearmer bridges the gap:

Because ETL and ELT present different strengths and weaknesses, many organizations are using a hybrid “ETLT” approach to get the best of both worlds. In this guide, we’ll help you understand the “why, what, and how” of ETLT, so you can determine if it’s right for your use-case. 

The idea can certainly be useful.

Comments closed

AzureTableStor: Table Storage in R

Hong Ooi announces a new package on CRAN:

I’m pleased to announce that the AzureTableStor package, providing a simple yet powerful interface to the Azure table storage service, is now on CRAN. This is something that many people have requested since the initial release of the AzureR packages nearly two years ago.

Azure table storage is a service that stores structured NoSQL data in the cloud, providing a key/attribute store with a schemaless design. Because table storage is schemaless, it’s easy to adapt your data as the needs of your application evolve. Access to table storage data is fast and cost-effective for many types of applications, and is typically lower in cost than traditional SQL for similar volumes of data.

If that sounds like a fit for you, check out the package.

Comments closed

Improving Performance Counters with Powershell

Jeffrey Hicks has an improvement to Get-Counter in Powershell:

I wanted to tell you about another addition to the latest release of the PSScriptTools module. This is something I’ve written about before but I decided to add the function to the module. I hope you find it a much easier way to work with performance counters. And it works in Windows PowerShell and PowerShell 7.x.

Click through to see what has changed.

Comments closed

The Default Cardinality Estimator and Ascending Keys

Erik Darling compares cardinality estimators:

Look, I’m not saying there’s only one thing that the “Default” cardinality estimator does better than the “Legacy” cardinality estimator. All I’m saying is that this is one thing that I think it does better.

What’s that one thing? Ascending keys. In particular, when queries search for values that haven’t quite made it to the histogram yet because a stats update hasn’t occurred since they landed in the mix.

Read the whole thing.

Comments closed

Database DevOps and Availability Groups

Kendra Little has some considerations for us:

One question which comes up periodically from our Microsoft Data Platform customers is whether there is anything special that they need to do when implementing version control, continuous integration, and automated deployments for a production database which is a member of a SQL Server Availability Group. 

The good news is that deployments are quite straightforward. You’ve most likely already configured a listener for your applications to connect to the Availability Group and automatically find the writeable database. To do a deployment, you simply need to connect to the listener as well.

There are a few other considerations which are helpful to think about when building your approach to database DevOps, however.

Check out the article for the details.

Comments closed

Refreshing Power Query on Protected Excel Sheets

Imke Feldmann shows how to refresh Power Query results on protected sheets in Excel:

When working with Power Query in Excel you might want to refresh Power Queries on protected sheets. But this will not work by default. Using a macro to temporarily unprotect the sheet and protect it again will do the trick. But this requires the password being displayed in the VBA code. So please have in mind that this technique only works for scenarios where you want to prevent accidental changes with the password protection.

Read on for the process.

Comments closed

Finding Function Names with Overlap in R

Tomaz Kastrun has a script for us:

So the following function does just that; if goes through the list of functions for all the installed packages in particular environment and finds duplicates. Giving the R programmer the insights where to be cautious and refer to scope and namespace.

This is a cautionary tale to use full object naming, like coin::show() rather than just show().

Comments closed

An Intro to Time Series Databases

Kyle Buzzell looks at time series databases:

As the name implies, a time series database (TSDB) makes it possible to efficiently and continuously add, process, and track massive quantities of real-time data with lightning speed and precision. While other database models have been used for these kinds of workloads in the past, TSDBs utilize specific algorithms and architecture to deal with their unique needs.

In this piece, we’ll take a deeper look at time series databases, including the unique needs of the workloads they’re built for, their benefits, common use cases, and the TSDBs out there.

Click through for an overview. Time series databases are definitely a niche product, but they are really good inside that niche.

Comments closed