Kevin Feasel – Page 1587

Interactive Graphics With ggiraph

Published 2016-10-11 by Kevin Feasel

David Smith sheds some light on the ggiraph project:

R’s ggplot2 package is a well-known tool for producing beautiful static data visualizations that you can include in a printed report. But what if you want to include a ggplot2 graphic on a webpage and provide the ability for the user to interact with the data? The ggiraph package by David Gohel (available for installation via CRAN). WIth ggiraph, you can take an existing ggplot2 bar chart, scatterplot, boxplot, map, or many other types of chart and add one or both of the following iteractions:

Display a tooltip of your choice (e.g. data values or labels) when the cursor hovers over sections of the chart
Perform an action (a javascript function you provide: jump to another page, for example) when the viewer clicks on an element of the chart

I like it.

Comments closed

Identity As A Service

Published 2016-10-11 by Kevin Feasel

Cristian Satnic argues that we should look at Identity as a Service solutions for our applications:

What exactly is Azure Active Directory B2C?

Cloud identity service with support for social accounts and app-specific (local) accounts
For enterprises and ISVs building consumer facing web, mobile & native apps
Builds on Azure Active Directory – a global identity service serving hundreds of millions of users and billions of sign-ins per day (same directory system used by Microsoft online properties – Office 365, XBox Live and so on)
Worldwide, highly-available, geo-redundant service – globally distributed directory across all of Microsoft Azure’s datacenters

I am a big fan of OAuth and making it easy for line-of-business developers to deal with authentication (lest they get harebrained ideas like rolling their own encryption algorithms).

Comments closed

Word Cloud Visual

Published 2016-10-11 by Kevin Feasel

Devin Knight shows off the word cloud custom visual in Power BI:

Key Takeaways

Great for parsing unstructured data
Utilize stop words to remove commonly used filler words like a, the, an, etc…
- You can use the default stop word that are provided and add your own that you would like to remove from the visual.
The size of the words in the visual tell you how frequently the word is used.

Cf. yesterday’s word cloud example. I’m not sure how truly valuable word clouds are for visualization purposes, but at least they’re fun to peruse.

Comments closed

Nonclustered Columnstore Indexes On Indexed Views

Published 2016-10-11 by Kevin Feasel

Niko Neugebauer notes that non-clustered columnstore indexes can now sit on top of indexed views, as of SQL Server 2016:

From the perspective of the disk access, this is where you will definitely win at least a couple of times with the amount of the disk access while processing the information, amount of memory that you will need to store and process (think hashing and sorting for the late materialisation phases), and you will pay less for the occupied storage.

Another noticeable thing was that the memory grants for the Indexed Views query was smaller compared to the query that was processing the original columnstore table FactOnlineSales.

Clustered indexes are currently not available as an option; we’ll see if that changes in the next version of SQL Server.

Comments closed

Auditing In Power BI

Published 2016-10-11 by Kevin Feasel

Ginger Grant shows off some of the auditing capabilities within Power BI:

As you can see by looking at the available Power BI options, there are a number of options to choose from. If you select the top item PowerBI activities, then everything gets selected. After doing that click outside of the menu for the menu to go away. Select a date and time range of your choosing, select specific users if you wish, then click on the Search button. Depending on how big your date range is, this may take some time to load. Once you see the results, you have the ability to filter as well.

Another day, another two dozen new Power BI features… This one’s a good one.

Comments closed

Word Clouds In Python

Published 2016-10-10 by Kevin Feasel

Allison Tharp shows how to generate a word cloud using Python:

Every week, someone on Reddit posts a “word cloud” on all of the NFL team’s subreddits. These word clouds show the most used words on that subreddit for the week (the larger the word, the more it was used). These word plots are always really fascinating to me, so I wanted to try to make some for myself. In this tutorial, we’ll be making the following word cloud from my board game stats twitter feed, @BGGStats

Looks like the implementation is fairly straightforward, so check it out.

Comments closed

Azure Data Lake Updates

Published 2016-10-10 by Kevin Feasel

Saveen Reddy points out a few updates to Azure Data Lake Store & the Azure Data Lake Analytics portal:

Use Custom Delimeters when Previewing Files

Previously, we had supported comma, colon, space, tab, ampersand, and bar delimiters. With the many different kinds of files used in Azure Data Lake Store and Azure Storage, we’ve added a “Custom” delimiter options for you to define your own delimiter.

To change the delimiter on the Azure Portal:

Open the file you want to preview using Data Explorer.
Click on Format
Under Delimiter, click the dropdown and change it to Custom
A new Custom Delimiter field will appear, type in your delimiter here
Click OK

Read on for more updates.

Comments closed

Change With Automatic Stats Update

Published 2016-10-10 by Kevin Feasel

Jack Li notes that SQL Server 2016 has changed when automatic statistics update gets called:

Old threshold: it takes 20% of row changes before auto update stats kicks (there are some tweaks for small tables, for large tables, 20% change is needed). For a table with 100 million rows, it requires 20 million row change for auto stats to kick in. For vast majority of large tables, auto stats basically doesn’t do much.

New threshold: Starting SQL 2008 R2 SP1, we introduced a trace flag 2371 to control auto update statistics better (new threshold). Under trace flag 2371, percentage of changes requires is dramatically reduced with large tables. In other words, trace flag 2371 can cause more frequent update. This new threshold is off by default and is enabled by the trace flag. But in SQL 2016, this new threshold is enabled by default for a database with compatibility level 130.

Important to know.

Comments closed

Generating JSON In Power BI

Published 2016-10-10 by Kevin Feasel

Chris Webb shows how to generate JSON in M:

Often, when calling web services from Power BI or Power Query, you’ll need to generate some JSON inside your query to send to these web services. The M language makes this relatively easy to do with the Json.FromValue() function but there’s only one example of how to use it in the help so I though it might be useful to provide some worked examples of M data types and how Json.FromValue() turns them into JSON.

First, here’s a function – that I’ve called GetJson() for the examples here – that takes a parameter of any data type, passes it to Json.FromValue() and returns the JSON representation of the input as text:

Read on for the code sample.

Comments closed

T-SQL And R Performance Comparisons

Published 2016-10-10 by Kevin Feasel

Tomaz Kastrun does several performance comparisons between various R packages and T-SQL constructs:

Couple of packages I will mention for data manipulations are plyr, dplyr and data.table and compare the execution time, simplicity and ease of writing with general T-SQL code and RevoScaleR package. For this blog post I will use R packagedplyr and T-SQL with possibilites of RevoScaleR computation functions.

My initial query will be. Available in WideWorldImportersDW database. No other alterations have been done to underlying tables (fact.sale or dimension.city).

Read on for code and conclusions. I don’t think there are any shocking conclusions: the upshot is to filter data as early as possible.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Author: Kevin Feasel

What exactly is Azure Active Directory B2C?

Key Takeaways