2018-10-12 – Curated SQL

I have found that performance across all devices and browsers is definitely not equal. By far the best browser I have found for viewing the apps is Google Chrome. I have also tended to find that my Ubuntu machines don’t do as well as Microsoft machines in picking up words correctly. A chat I had with someone recently suggested this might be down to the microphone drivers under Ubuntu, but that is not my area of expertise. Voice recognition was also fine on both of my BlackBerry phones (one running BB OS 10, the other running Android 7).

It is worth noting that this requires an internet connection to function: in Chrome, the voice-to-text conversion is performed in the cloud.

The other thing I have noticed is that annyang seems relatively sensitive to background noise. This isn’t so bad for functions called using specific phrases, but it does sometimes have a large effect on the multi-word splats. This is because the splats are greedy, and the background noise makes the recognition engine think that you are still talking long after you have finished, which gives the appearance of the application hanging.
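To make the greedy-splat point concrete, here is a minimal sketch in Python. It is a simplified model of how a command pattern with a splat might be matched, not annyang's actual code; the function `command_to_regex` and the sample phrases are assumptions made for illustration. It only shows the capture behaviour, not the recognition engine waiting for speech to end.

```python
import re


def command_to_regex(command: str) -> re.Pattern:
    """Build a regex from a command such as 'title *text' (hypothetical helper)."""
    parts = []
    for token in command.split():
        if token.startswith("*"):
            parts.append(r"(.+)")    # splat: swallows every remaining word
        elif token.startswith(":"):
            parts.append(r"(\S+)")   # named parameter: exactly one word
        else:
            parts.append(re.escape(token))
    return re.compile("^" + r"\s+".join(parts) + "$", re.IGNORECASE)


splat = command_to_regex("title *text")
named = command_to_regex("color :name")

clean = "title sales by region"
# Same phrase, followed by extra words picked up from background noise.
noisy = "title sales by region and then uh the"

print(splat.match(clean).group(1))   # sales by region
print(splat.match(noisy).group(1))   # sales by region and then uh the
print(named.match("color blue").group(1))       # blue
print(named.match("color blue and then uh"))    # None
```

In this model the splat happily absorbs whatever trailing words the engine reports, while the single-word parameter simply fails to match, which is one way to read why fixed phrases are less affected by noise.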

The solution is by no means perfect, but it does look quite interesting.

Comments closed

Security Improvements In Kafka And Confluent Platform

Published 2018-10-12 by Kevin Feasel

Vahid Fereydouny demonstrates a number of security improvements made to Apache Kafka 2.0 as well as Confluent Platform 5.0:

Over the past several quarters, we have made major security enhancements to Confluent Platform, which have helped many of you safeguard your business-critical applications. With the latest release, we increased the robustness of our security feature set to help with:

Using standard and central directory services like Active Directory (AD)/Lightweight Directory Access Protocol (LDAP)

Simplifying the management of access control lists (ACLs)

Proactive management and monitoring of security configurations to address the gaps as soon as possible

The following new security features are available in both Confluent Platform 5.0 and Apache Kafka 2.0:

Support for ACL-prefixed wildcards to simplify the management of access control

Kafka Connect password protection with support for externalizing secrets (to “secrets stores” such as HashiCorp Vault)
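As a rough illustration of why prefixed ACLs simplify access management, the sketch below models the matching rule in plain Python. It is a conceptual model only, not Kafka's authorizer or any Kafka client API; the `AclRule` class, the helper functions, and the principal and topic names are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    principal: str
    operation: str
    pattern_type: str  # "LITERAL" or "PREFIXED"
    name: str


def rule_matches(rule: AclRule, resource_name: str) -> bool:
    """A prefixed rule covers every resource whose name starts with rule.name."""
    if rule.pattern_type == "PREFIXED":
        return resource_name.startswith(rule.name)
    # Literal rules match the exact name, or everything when the name is "*".
    return rule.name == "*" or rule.name == resource_name


def is_allowed(rules, principal: str, operation: str, resource_name: str) -> bool:
    return any(
        r.principal == principal
        and r.operation == operation
        and rule_matches(r, resource_name)
        for r in rules
    )


# One prefixed rule stands in for a literal rule per topic (hypothetical names).
rules = [AclRule("User:orders-app", "Read", "PREFIXED", "orders-")]

print(is_allowed(rules, "User:orders-app", "Read", "orders-eu"))   # True
print(is_allowed(rules, "User:orders-app", "Read", "orders-us"))   # True
print(is_allowed(rules, "User:orders-app", "Read", "payments"))    # False
```

With literal rules alone, every new `orders-` topic would need its own entry; a single prefixed rule covers topics created later as well.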

The following security features are available only in Confluent Platform 5.0:

AD/LDAP group support

Feature access controls in Confluent Control Center

Viewing of broker configurations in Confluent Control Center, including differences in security configurations between brokers

Let’s walk through each of these enhancements in detail.

Read on for examples.

SparkSession Versus SparkContext

Published 2018-10-12 by Kevin Feasel

Abhishek Baranwal explains the differences between the SparkSession object and the SparkContext object when writing Spark code:

Prior to Spark 2.0, SparkContext was used as a channel to access all Spark functionality. The Spark driver program uses SparkContext to connect to the cluster through a resource manager.

SparkConf is required to create the SparkContext object. It stores configuration parameters such as appName (to identify your Spark driver) and the number of cores and memory size of the executors running on worker nodes.

In order to use the SQL, Hive, and Streaming APIs, separate contexts need to be created.

Read on to see where SparkSession fits in.

Simulating LAG And LEAD Prior To SQL Server 2012

Published 2018-10-12 by Kevin Feasel

Itzik Ben-Gan highlights a reader submission from his last post:

Last month I covered a Special Islands challenge. The task was to identify periods of activity for each service ID, tolerating a gap of up to an input number of seconds (@allowedgap). The caveat was that the solution had to be pre-2012 compatible, so you couldn’t use functions like LAG and LEAD, or aggregate window functions with a frame. I got a number of very interesting solutions posted in the comments by Toby Ovod-Everett, Peter Larsson, and Kamil Kosno. Make sure to go over their solutions since they’re all quite creative.

Curiously, a number of the solutions ran slower with the recommended index than without it. In this article I propose an explanation for this.

Even though all solutions were interesting, here I wanted to focus on the solution by Kamil Kosno, who’s an ETL developer with Zopa. In his solution, Kamil used a very creative technique to emulate LAG and LEAD without LAG and LEAD. You will probably find the technique handy if you need to perform LAG/LEAD-like calculations using code that is pre-2012 compatible.

Kamil’s solution was quite clever.
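For readers who have not had to do this before, here is a minimal sketch of one common way to get LAG/LEAD-like values without window functions: a pair of correlated subqueries. This is not Kamil's technique (read the article for that); it is a generic alternative, run here against an in-memory SQLite database from Python so it can be tried anywhere. The table name, column names, and sample rows are assumptions, and the approach assumes log times are unique within each service ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EventLog (serviceid INTEGER, logtime INTEGER)")
conn.executemany(
    "INSERT INTO EventLog (serviceid, logtime) VALUES (?, ?)",
    [(1, 10), (1, 12), (1, 30), (2, 5), (2, 6)],
)

# Previous value = largest earlier logtime for the same service (LAG-like).
# Next value = smallest later logtime for the same service (LEAD-like).
query = """
SELECT cur.serviceid,
       cur.logtime,
       (SELECT MAX(p.logtime)
          FROM EventLog AS p
         WHERE p.serviceid = cur.serviceid
           AND p.logtime < cur.logtime) AS prev_time,
       (SELECT MIN(n.logtime)
          FROM EventLog AS n
         WHERE n.serviceid = cur.serviceid
           AND n.logtime > cur.logtime) AS next_time
  FROM EventLog AS cur
 ORDER BY cur.serviceid, cur.logtime
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, 10, None, 12)
# (1, 12, 10, 30)
# (1, 30, 12, None)
# (2, 5, None, 6)
# (2, 6, 5, None)
```

The first and last rows of each service return NULL (None in Python), which mirrors what LAG and LEAD return by default at the edges of a partition. Correlated subqueries like these can be slow on large tables, which is part of why more creative techniques are worth a look.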

Actual Versus Estimated Rows In SSMS 18

Published 2018-10-12 by Kevin Feasel

David Alcock is happy with a feature in SQL Server Management Studio 18:

Last week Microsoft released SQL Server Management Studio 18.0 into public preview; here’s a link so you can read about the new and improved functionality it offers.

One significant change is the addition of actual vs estimated row counts onto the showplan operators in execution plans (only actual, not estimated…which kinda makes sense).

Here I’m running a very simple bit of code on some DMVs (namely exec requests and sessions) to demonstrate this addition.

Read on for the example.

Creating Fancy HTML Reports With PowerShell

Published 2018-10-12 by Kevin Feasel

Jeffery Hicks shares several tips on creating fancy HTML reports using PowerShell:

Usage is pretty straightforward. You specify one or more computers and off you go. There is a default value for the resulting HTML file, but you’ll likely want to specify your own. Because the function is generating custom HTML on the fly, I also provided options for you to provide pre and post content HTML material, just as you might with ConvertTo-HTML. I also give you an option to specify a graphics file which is displayed like a logo at the top of the report. The graphics file will be embedded in the HTML file. The CSS is also embedded in the HTML, making the entire file completely self-contained.

The one knock I have is the gradient color scheme: people with protanopia or deuteranopia will have trouble reading the free space indicator, as the colors blur into one another. Otherwise, this looks great.

In Praise Of Tabular Editor

Published 2018-10-12 by Kevin Feasel

Teo Lachev shares a positive review of Tabular Editor, a community tool for working with Tabular models:

What tool do you use for Analysis Services Tabular development? SSDT, right? What else? Here is a little secret. I almost don’t use SSDT anymore, except for limited tasks, such as importing new tables and visualizing relationships. I switched to a great community tool, Tabular Editor, and you should too if you’re frustrated with the SSDT Tabular Designer. Back in 2012 Microsoft ported the Power Pivot designer to SSDT to let BI practitioners implement Tabular models. This is why you still get weird errors that Excel has encountered some error. Microsoft haven’t made any “professional” optimizations despite all the attention that Tabular gets. As a result, developers face:

Performance issues – As your model grows in complexity, it gets progressively slower for even simple changes, such as renaming columns. The problem of course is that any change results in a commit operation to the workspace database. SSDT requires a workspace database for the Data View but it slows down all tasks even if it doesn’t have data. While the data view is useful for data analysts, I’d personally rather sacrifice it to gain development speed.
The horrible measure grid – Enough said. To Microsoft’s credit, the Tabular Explorer helps somewhat but it still doesn’t support the equivalent of the SSAS MD script editor.
No automation for repetitive tasks – It’s not unusual to create many measure variants, such as YTD, QTD. SSDT doesn’t help much automating them.

It does look interesting.

Day: October 12, 2018

Voice Control For Shiny Apps

Security Improvements In Kafka And Confluent Platform

SparkSession Versus SparkContext

Simulating LAG And LEAD Prior To SQL Server 2012

Actual Versus Estimated Rows In SSMS 18

Creating Fancy HTML Reports With PowerShell

In Praise Of Tabular Editor