Author: Kevin Feasel

Pivoting Data in R

Dave Mason shows how you can pivot SQL Server data using Machine Learning Services and R:

Pivoting data in SQL Server is something that comes up infrequently for me. Whenever the need arises, I have to pause and ask myself “What is it I’m trying to do again?” Next, I go to the documentation for the T-SQL PIVOT syntax (which I’ll never commit to memory) and realize for the umpteenth time that the pivoted values have to be hard-coded. Then I ponder using dynamic T-SQL because I won’t always know the values to pivot at query design time.

If T-SQL isn’t a good hammer for PIVOT’s nail, perhaps R is. There are different packages that make summarizing and transposing data frames somewhat easy. R can even dynamically pivot unknown values at runtime. But there is a catch, which I’ll get to in a bit.

This excerpt ends on a cliffhanger, so you’ll have to read Dave’s post to learn about the catch.
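
For a sense of the approach, here is a minimal sketch using the tidyr package (Dave’s post may well use a different package; the data frame and column names here are made up for illustration):

library(tidyr)

# Long-format sample data: one row per (SalesPerson, Product) pair
sales <- data.frame(
  SalesPerson = c("Amy", "Amy", "Bob", "Bob"),
  Product     = c("Widget", "Gadget", "Widget", "Gizmo"),
  Amount      = c(100, 250, 75, 300)
)

# pivot_wider() creates one column per distinct Product value,
# with no need to hard-code the column list as T-SQL PIVOT requires
wide <- pivot_wider(sales, names_from = Product, values_from = Amount)
print(wide)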

Searching for Functions in dbatools

Andy Levy wants to find a function in dbatools:

dbatools has a lot of functions. A lot. Over 550. There is a great command index on the website, and the documentation gets updated every time a new version is published. But sometimes, it feels like you can’t find what you need unless you already know the name of the thing you’re looking for. Other times, you might start writing your own wrapper around dbatools functions, or maybe start a new function from scratch, because it seems like the functionality you need isn’t there.

Read on to see how Andy goes about finding the right function, along with a couple of examples.
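
As one possible illustration (Andy’s post is the authoritative guide here), dbatools ships with a command for searching its own command index, and plain PowerShell discovery works too:

# Search dbatools commands whose names, synopses, or tags mention backups
Find-DbaCommand -Pattern backup

# Or fall back to standard PowerShell command discovery
Get-Command -Module dbatools -Name *backup*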

Displaying a Toast with PowerShell

Jess Pomfret has some fun with dbatools and the BurntToast PowerShell module:

I have just a quick tip for you today using the BurntToast module to notify us when a backup is complete. As DBAs, there is always plenty to do, so we don’t want to have to sit and watch a long-running script to catch the moment when it finishes. Usually what happens to me is I kick off the script, move on to something else, and then totally forget about it, perhaps until someone asks if it’s done yet. Oops. Well, this tip will help avoid that.

Best used sparingly, but quite helpful when you’re working on a long-running task and need to get back to Spider Solitaire (er, work).
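
A minimal sketch of the pattern, assuming the BurntToast module is installed and using dbatools’ Backup-DbaDatabase as the stand-in long-running task (Jess’s post has the full walkthrough):

# Kick off the long-running work, then pop a toast when it finishes
Backup-DbaDatabase -SqlInstance localhost -Database AdventureWorks
New-BurntToastNotification -Text 'Backup complete', 'Time to go check the results.'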

Managing Performance on Azure SQL Managed Instances

Tim Radney has a few considerations for you if you want to start using Azure SQL Managed Instances:

Storage is a bit more difficult to plan for, because there are multiple factors to consider. You need to account for both overall storage size and I/O needs. How many GBs or TBs are needed for the SQL Server instance, and how fast does the storage need to be? How many IOPS and how much throughput is the on-premises instance using? For that, you must baseline your current workload, using perfmon to capture average and max MB/s and/or taking snapshots of sys.dm_io_virtual_file_stats to capture throughput utilization. This will give you an idea of what type of I/O and throughput you need in the new environment. Several customers I’ve worked with have missed this vital part of migration planning and have encountered performance issues due to selecting an instance level that didn’t support their workload.

Tim has a lot of good advice in here, so read the whole thing.
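
As a rough sketch of that baselining step (Tim’s article covers the full process), you can snapshot sys.dm_io_virtual_file_stats periodically and diff the cumulative counters to estimate throughput:

-- Cumulative I/O stats per database file since instance startup
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       vfs.num_of_bytes_read,
       vfs.num_of_bytes_written,
       vfs.io_stall_read_ms,
       vfs.io_stall_write_ms,
       SYSDATETIME() AS capture_time
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs;

-- Capture again after a representative interval and subtract:
-- (bytes delta / seconds elapsed) gives throughput per file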

Finding Folder Sizes with PowerShell

Jeffrey Hicks shows us a quick way to retrieve information on folder sizes:

It is simple enough to run Get-ChildItem and pipe the results to Measure-Object.

But this can often be time consuming. The other drawback, besides the typing, is that it takes extra work to format the results into something more user-friendly. And if I want to include hidden files, I have to remember to use -Force with Get-ChildItem.

Cmdlets are nice and convenient, and I always recommend to beginner or even intermediate scripters: if there is a cmdlet to use over the .NET Framework, use the cmdlet. But there are always exceptions, and as you grow in expertise, you’ll realize there are times when going directly to the .NET Framework is a better choice. That is what I have done here.

Click through for the solution.
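
For reference, a minimal sketch of both approaches (Jeffrey’s actual function is considerably more polished, with friendlier output):

$path = 'C:\Temp'

# The cmdlet approach: simple, but can be slow on large trees
(Get-ChildItem -Path $path -Recurse -Force -File |
    Measure-Object -Property Length -Sum).Sum

# Going straight to .NET: enumerate every file and sum the lengths
$total = 0
foreach ($file in [System.IO.Directory]::EnumerateFiles($path, '*', 'AllDirectories')) {
    $total += ([System.IO.FileInfo]$file).Length
}
$total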

Using Sqoop to Move Data into Hive

Jon Morisi continues a series on Sqoop:

Sqoop completes the import task by running MapReduce jobs importing the data to HDFS, and then running Hive commands (CREATE TABLE / LOAD DATA INPATH) to move the data to Hive. The default HDFS location is /user/[login]/[TABLENAME]. If you have any issues during the import, you may need to remove the HDFS directory prior to re-running, or else you will get an error.

Read on for sample calls and additional notes.
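
As a hedged sketch of what such an import can look like (the connection details below are placeholders; Jon’s post has the real sample calls):

# Import a SQL Server table into Hive; Sqoop stages the data in HDFS first
sqoop import \
  --connect "jdbc:sqlserver://sqlhost:1433;databaseName=SalesDB" \
  --username sqoop_user -P \
  --table Orders \
  --hive-import \
  --hive-table salesdb.orders

# If a failed run left the HDFS staging directory behind, remove it first
hdfs dfs -rm -r /user/sqoop_user/Orders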

Streaming Pipelines in AWS with Flink and Kinesis Data Analytics

Steffen Hausmann shows us how to put together a streaming ETL pipeline in AWS using Apache Flink and Amazon Kinesis Data Analytics:

The remainder of this post discusses how to implement streaming ETL architectures with Apache Flink and Kinesis Data Analytics. The architecture persists streaming data from one or multiple sources to different destinations and is extensible to your needs. This post does not cover additional filtering, enrichment, and aggregation transformations, although that is a natural extension for practical applications.

This post shows how to build, deploy, and operate the Flink application with Kinesis Data Analytics, without further focusing on these operational aspects. It is only relevant to know that you can create a Kinesis Data Analytics application by uploading the compiled Flink application jar file to Amazon S3 and specifying some additional configuration options with the service. You can then execute the Kinesis Data Analytics application in a fully managed environment. For more information, see Build and run streaming applications with Apache Flink and Amazon Kinesis Data Analytics for Java Applications and the Amazon Kinesis Data Analytics developer guide.

Click through for the walkthrough.
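
For flavor, here is a minimal sketch of the consuming side of such a pipeline in Java (the stream name and region are placeholders; the AWS post has the complete application):

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer;
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;

public class StreamingEtlSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // Point the consumer at the source Kinesis data stream
        Properties consumerConfig = new Properties();
        consumerConfig.setProperty(AWSConfigConstants.AWS_REGION, "us-east-1");

        env.addSource(new FlinkKinesisConsumer<>(
                "input-stream",               // placeholder stream name
                new SimpleStringSchema(),
                consumerConfig))
           .print();                          // a real job would transform and sink

        env.execute("Streaming ETL sketch");
    }
}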

Power BI and Tabular Model Relationship Types

Marco Russo takes us through the different types of relationships we might encounter in Power BI and Analysis Services Tabular models:

A relationship can be strong or weak. In a strong relationship the engine knows that the one-side of the relationship contains unique values. If the engine cannot check that the one-side of the relationship contains unique values for the key, then the relationship is weak. A relationship can be weak either because the engine cannot ensure the uniqueness of the constraint, due to technical reasons we outline later, or because the developer defined it as such.

A weak relationship is not used as part of table expansion. Power BI has allowed composite models since 2018. In a composite model, it is possible to create tables in a model containing data in both Import mode (a copy of data from the data source is preloaded and cached in memory using the VertiPaq engine) and in DirectQuery mode (the data source is only accessed at query time).

There is quite a bit of useful information in here.

Receiving Notifications when Azure Function Apps Fail

Gilbert Quevauvilliers shares how to receive notification e-mails when an Azure Function App fails:

Below are the steps to enable error notifications on Azure Function Apps.

This follows on from my previous blog post How you can store All your Power BI Audit Logs easily and indefinitely in Azure, where every day it extracts the Audit logs into Azure Blob storage. One of the key things when working with any job that runs is that I want to know when the job fails. If I do not have this and I assume that the data is always there, I could fall into a situation where there is missing data that I cannot get back.

Below, I explain how to create an alert with a notification email if an Azure Function App fails.

Read on for the step-by-step instructions.
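
Gilbert works through the Azure portal; as a rough Azure CLI equivalent (every resource name below is a placeholder, and the right failure signal to alert on depends on how your Function App surfaces errors):

# Create an action group that emails the team (placeholder names)
az monitor action-group create \
  --name ag-function-failures \
  --resource-group my-rg \
  --action email dba-team dba-team@example.com

# Alert whenever the Function App returns server errors
az monitor metrics alert create \
  --name alert-function-failures \
  --resource-group my-rg \
  --scopes "/subscriptions/<subscription-id>/resourceGroups/my-rg/providers/Microsoft.Web/sites/my-function-app" \
  --condition "total Http5xx > 0" \
  --action ag-function-failures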
