Using cdata To Created Faceted Plots

Nina Zumel shows how to use the cdata package to create faceted ggplot2 plots:

First, load the packages and data:

library("ggplot2")library("cdata")
iris <- data.frame(iris)

Now define the data-shaping transform, or control table. The control table is basically a picture that sketches out the final data shape that I want. I want to specify the x and y columns of the plot (call these the value columns of the data frame) and the column that I am faceting by (call this the key column of the data frame). And I also need to specify how the key and value columns relate to the existing columns of the original data frame.

Read on to see how you can use cdata to tie together different faceted plots.

Generating E-Mail Alerts From Perfmon

Dave Bermingham shows how to send e-mail alerts based on Perfmon counter values:

The first thing that you need to do is write a Powershell script that when run can send an email. While researching this I discovered many ways to accomplish this task, so what I’m about to show you is just one way, but feel free to experiment and use what is right for your environment.

In my lab I do not run my own SMTP server, so I had to write a script that could leverage my Gmail account. You will see in my Powershell script the password to the email account that authenticates to the SMTP server is in plain text. If you are concerned that someone may have access to your script and discover your password then you will want to encrypt your credentials. Gmail requires and SSL connection so your password should be safe on the wire, just like any other email client.

It’s an interesting use of built-in Windows functionality to perform alerting.

Baslining Modern Versions Of SQL Server

Erin Stellato goes back over an older baselining article and gives us some updates in what we should consider for more recent versions:

Last week I got an email from a community member who had read this older article of mine on baselining, and asked if there were any updates related to SQL Server 2016, SQL Server 2017, or vNext (SQL Server 2019). It was a really good question. I haven’t visited that article in a while and so I took the time to re-read it. I’m rather proud to say that what I said then still holds up today.

The fundamentals of baselining are the same as they were back in 2012 when that article was first published. What is different about today? First, there are a lot more metrics in the current release of SQL Server that you can baseline (e.g. more events in Extended Events, new DMVs, new PerfMon counters,  sp_server_diagnostics_component_results). Second, options for capturing baselines have changed. In the article I mostly talked about rolling your own scripts for baselining. If you’re looking to establish baselines for your servers you still have the option to develop your own scripts, but you also can use a third-party tool, and if you’re running SQL Server 2016+ or Azure SQL Database, you can use Query Store.

Read on for more details.

Calling Power BI REST API From Microsoft Flow

Chris Webb has started a series on calling Power BI’s REST API from Microsoft Flow.  In Part 1, he creates a custom connector:

Playing around with Microsoft Flow recently, I was reminded of the following blog post from a few months ago by Konstantinos Ioannou about using Flow to call the Power BI REST API to refresh a dataset:

https://medium.com/@Konstantinos_Ioannou/refresh-powerbi-dataset-with-microsoft-flow-73836c727c33

I was impressed by this post when I read it, but don’t think I understood quite how many exciting possibilities this technique opens up for Power BI users until I started to use it myself. The Power BI dev team are making a big investment in the API yet most Power BI users, myself included, are not developers and can’t easily write code (or PowerShell scripts) to call the API. With Flow, however, you can use the API without writing any code at all and solve a whole series of  common problems easily. In this series of blog posts I’m going to show a few examples of this.

In Part 2, Chris shows us how to automate data refreshes when source data changes:

For a while now I’ve had an idea stuck in my head: wouldn’t it be cool to build a Power BI solution where a user could enter data into an Excel workbook and then, as soon as they had done so, they could see their new data in a Power BI report? It would be really useful for planning/budgeting applications and what-if analysis. I had hoped that a DirectQuery model using the CData Excel custom connector (mentioned here) might work but the performance wasn’t good enough; using Flow with the Power BI REST API (see Part 1 of this series for details on how to get this set up) gets me closer to my goal, even if there’s still one major problem with the approach. Here’s how…

Read on for the approach as well as the major problem.

Reminder: Windows Server Still Exists

Allan Hirt reminds us that Windows Server is still an important product:

If you’ve been in hibernation, today you woke up to a world where Microsoft has embraced open source and Linux. What was once unthinkable is now happening. What is going on? Why am I even talking about this?

Since the introduction of SQL Server 2017 and the support for Linux-based deployments, I’ve had a steady stream of questions from C-levels on down to DBAs asking in essence this: “Do I need to abandon SQL Server on Windows Server and learn Linux?” I would use something stronger if this was a casual conversation, but the answer is an emphatic “NO!” SQL Server still runs just fine and is supported on Windows Server (including Windows Server 2019, which is just released). Support is not ending any time soon. Linux is just another option and there may be enhancements specific to each platform because of their differences. It’s not an “either/or” thing. So breathe, OK? If you have a use case for Linux, by all means deploy SQL Server on it.

I am on the SQL on Linux bandwagon and enjoy the path that Microsoft is forging, but Allan provides us a critical tonic in this regard.

Polling For File Existence In SQL Agent

Bill Fellows gives us an example of polling for a file change using SQL Agent:

A fun question over on StackOverflow asked about using SQL Agent with SSIS to poll for a file’s existence. As the comments indicate, there’s a non-zero startup time associated with SSIS (it must validate the metadata associated to the sources and destinations), but there is a faster, lighter weight alternative. Putting together a host of TSQL ingredients, including undocumented extended stored procedures, the following recipe could be used as a SQL Agent job step.

If you’re going to limit yourself to just SQL Server-based solutions and running on a frequent schedule isn’t enough, this is one of the better options.

Mounting HDFS As A Local Filesystem

Guy Shilo looks at two techniques for mounting HDFS as a local filesystem:

NFS Gateway is a HDFS component that enables the use to expose HDFS through NFS3 interface so that Linux machines can mount it and access it just as a local filesystem.

The manual installation is quite cumbersome and is covered here.

Cloudera manager automates the process so we will use it. If you do not already have NFS Gateway installed in your Cloudera cluster, go to HDFS -> Instances -> Add role instances and choose a host for NFS Gateway:

Guy also looks at Fuse and runs a quick test to see which is faster.

How Humio Uses Kafka

Kresten Krab describes ways that Humio uses Apache Kafka for their product:

Humio is a log analytics system built to run both on-prem and as a hosted offering. It is designed for “on-prem first” because, in many logging use cases, you need the privacy and security of managing your own logging solution. And because volume limitations can often be a problem in Hosted scenarios.

From a software provider’s point of view, fixing issues in an on-prem solution is inherently problematic, and so we have strived to make the solution simple. To realize this goal, a Humio installation consists only of a single process per node running Humio itself, being dependent on Kafka running nearby (we recommend deploying one Humio node per physical CPU so a dual-socket machine typically runs two Humio nodes).

We use Kafka for two things: buffering ingest and as a sequencer of events among the nodes of a Humio cluster.

Read on for more details and a few tips on using Kafka to its fullest.

Oddity With User Write Count In dm_db_index_usage_stats

Shaun J. Stuart looks at an oddity with the user_updates column on sys.dm_db_index_usage_stats:

This pulls both reads and writes from the sys.dm_db_index_usage_stats dynamic management view. A read is defined as either a seekscan, or lookup and a write is defined as an update. All seemed good until I noticed something strange. One of the top written to tables was, based on our naming convention, a lookup table. That seemed odd. A lookup table should have lots of reads, but only few writes. The query above showed my lookup table had almost twice as many writes as reads!

I dug around a bit and found two stored procedures that referenced that particular table. I checked them out, but nothing seemed out of the ordinary to me, so I dug a little deeper and discovered something strange: theuser_updates value of sys.dm_db_index_usage_stats can get incremented even when there is no actual update to the table!!

Read on for the explanation.

Comparing TensorFlow Versus PyTorch

Anirudh Rao compares PyTorch to TensorFlow:

For small-scale server-side deployments both frameworks are easy to wrap in e.g. a Flask web server.

For mobile and embedded deployments, TensorFlow works really well. This is more than what can be said of most other deep learning frameworks including PyTorch.

Deploying to Android or iOS does require a non-trivial amount of work in TensorFlow.

You don’t have to rewrite the entire inference portion of your model in Java or C++.

Other than performance, one of the noticeable features of TensorFlow Serving is that models can be hot-swapped easily without bringing the service down.

Read on for the full comparison.

Categories

October 2018
MTWTFSS
« Sep  
1234567
891011121314
15161718192021
22232425262728
293031