Visualizing In R: 3 Packages

Kristian Larsen has a quick demo of three R visualization packages, ggplot2, dygraphs, and plotly:

Another value-generating visualisation package in R is dygraphs. This package focuses on creating interactive visualisations with elegant interactive coding modules. Furthermore, the package specialises in creating visualisations for machine learning methods. The code below generates different visualisation graphs with dygraphs:

Three useful libraries to learn. Two more that might be useful are ggvis and rbokeh.

Modifying A ggplot2 Theme

Sebastian Sauer gives us an example of modifying a standard ggplot2 theme:

ggplot2 is customizable. Frankly, one can change a heap of details – not everything probably, but a lot. Of course, one can add a theme to the ggplot call, in order to change the theme. However, a more catch-it-all approach would be to change the standard theme of ggplot itself. In this post, we’ll investigate this option.

To date, I’ve only used themes others have created, but if you need to customize a theme, there’s a lot you can do here.

Automating Azure SQL Database Scaling

Arun Sirpal shows how to use Azure Logic Apps to auto-scale Azure SQL Database:

When I was presenting my Azure SQL Database session at DataRelay (used to be SQLRelay) I was asked (over coffee) about auto scaling capabilities. Quite simply there is nothing out of the box to achieve this. The idea of auto scaling would be good where you would need a burst to fulfill higher demand in terms of workload for a time duration, you know, something like “end of the day, Friday night sale” for your database.

Classically you would probably go down the PowerShell route via a runbook, but I am different.

In this case, the automation is timer-based rather than load-based.
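To make the timer-based idea concrete, here is a minimal Python sketch of the decision such a schedule encodes. It is not Arun's Logic App and it does not call Azure; the function name, the S1 and S3 tier choices, and the Friday 18:00 to 23:00 window are all assumptions for illustration.

```python
from datetime import datetime

FRIDAY = 4  # datetime.weekday(): Monday is 0, so Friday is 4


def target_service_objective(now: datetime,
                             base_tier: str = "S1",
                             burst_tier: str = "S3") -> str:
    """Pick a tier purely from the clock (timer-based, not load-based).

    The tier names and the burst window are hypothetical; a real schedule
    would use whatever sizes and hours the workload calls for.
    """
    in_burst_window = now.weekday() == FRIDAY and 18 <= now.hour < 23
    return burst_tier if in_burst_window else base_tier


if __name__ == "__main__":
    print(target_service_objective(datetime.now()))
```

A load-based approach would replace the clock check with a metric such as recent DTU or CPU usage, which is the part that is not available out of the box.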

Reuse Versus Learning

Fred Weinmann argues for the value in learning modules over build-it-yourself solutions for expanding knowledge:

When you start with a new technology, you don’t start on a green field. You’ve got lots of luggage you carry around with you (previous experience) and a perception where you want to go (project conditions, goals). However the technology you interact with may not be limited to just that.
Using, consuming and discovering tools written for a technology by someone who has been busy in that particular field for years can guide you in your own comprehension of the technology.
For example, if you were to shift to Database Administration of MSSQL servers, there is this community module called “dbatools” which covers most of the tools you will need (seriously, we spent lots of time on it to make that true). Now, no module can replace your own mastery of the topic. You will need to know how backup and restore work, how to design a new database, and how to troubleshoot inefficient queries.
No tool can save you from needing to understand the concepts and the procedures.
However, by looking up the commands, what they do, and how they do it, you can benefit from the experience of some seriously senior DBAs. For free.
The key point here is that going completely your own path may result in a bad solution, in piled-up technological debt which you didn’t see coming because you didn’t have the context yet, because you tried to map new information into your previous context, whether that fit or not.

Click through for the full argument. I’d go a step further: these modules are capital. They are the sum of knowledge built up over time, and eschewing them so you can traverse the same ground and solve the same problems is a waste of time (unless you can do it better). Build from what is there and spend that precious time solving other problems.

Analysis Of A Failed Project

Eugene Meidinger looks back at a big project which fell apart:

So the first issue was that the software was built in-house by another company in the same industry. Imagine, for example, if a large bakery had created an ERP system and another large bakery wanted to move to that system. Sounds great, right? Well, you run into two issues in that scenario.

First, a bakery is not an independent software vendor. Programming, by definition, is not their core competency. Which means that you may run into fragility or issues that you wouldn’t run into with a commercial piece of software. It also means that there isn’t going to be any documentation on migrating to the software or implementing it. Why would there be? If you built software for one company, why would you create scaffolding to move other companies onto it?

Second, not every business is the same. A lot of the fundamentals are the same, but you will run into many edge cases. We do invoices this way. They do work orders this way. We handle purchase orders this way. They handle inventory that way.

The way that I think about it is like a sea shell. It’s this intricate curve that’s grown over time, organically, to fit that creature. If you just try to fit a different snail or mollusk in that shell, it may not work out.

Read the whole thing.

Embedding Images In Power BI

Zach Conroe shows how you can embed an image in Power BI:

The good news is that there are workarounds to this challenge. We are going to reconstruct the above use case and demonstrate how to pull in images from a local database, and then use custom columns in Power Query to reformat the source data in a way we can render graphically.

Note: This same custom column technique can also be used to display images imported using a local folder as a data source. If you have Power BI Desktop installed, you can work along with this post by downloading the .pbix file from this link.

In our sample database we have a couple of tables containing images stored in a binary format, as well as a few columns of metadata for the images. The images being used here are a JPEG file type, but this technique can also be used for PNG files. We imported the data into Power BI and loaded two tables of images: Examples 1 and 2. For the first example we used three small images in the table, as shown below.

There’s a 32K size limit that Zach mentions, which can be a bit painful to work within.
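One common way to make binary image data renderable is to turn it into a base64 data URI. The Python sketch below illustrates that string format and a size check; it is not Zach's Power Query code. The helper names are hypothetical, and the exact limit (and whether it counts bytes or characters) is an assumption here, so check the post for the real figure.

```python
import base64

# Assumption: "32K" is treated as 32 * 1024 characters for illustration only.
SIZE_LIMIT = 32 * 1024


def to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URI (use image/png for PNG files)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def fits_limit(data_uri: str, limit: int = SIZE_LIMIT) -> bool:
    """Return True when the encoded string stays within the assumed limit."""
    return len(data_uri) <= limit


if __name__ == "__main__":
    uri = to_data_uri(b"abc")
    print(uri, fits_limit(uri))
```

Base64 output is roughly a third larger than the input, so an image needs to be noticeably smaller than the limit before encoding.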

Risks Of Using Resource Governor To Set Max DOP

Joe Obbish builds an example where setting MAX_DOP through Resource Governor can actively harm query performance:

I uploaded the query plan here if you want to look at it. This type of scenario can happen even without Resource Governor. For example, a compiled parallel query may be downgraded all the way to MAXDOP 1 if it can’t get enough parallel threads.

The query performs significantly worse than before, which hopefully is not a surprise. A single execution took 12860 ms of CPU time and 13078 ms of elapsed time. Nearly all of the query’s time is spent on the hash join for the index intersection, with a tempdb spill and the processing of additional rows both playing a role. The tempdb spill occurs because SQL Server expected the build side of the hash join to be reduced to 1213170 rows. The bitmap filtering does not occur so 8 million rows were sent to the build side instead.

Read the whole thing.

SQL Undercover Inspector 1.2 Released

Adrian Buckman announces a new version of SQL Undercover Inspector:

#21 Added AG Databases check to warn on databases not joined to an AG 

If you are using Availability Groups and you have this new module enabled, the Inspector will assume that all databases should be joined to an AG. Every database name for the instance is inserted into a new table called [Inspector].[AGDatabases] and the Is_AG flag is set to 1. If a database is joined to an AG, the Is_AGJoined column is set to 1 and no Advisory will be shown on the report. If a database is marked as Is_AG, the Inspector reports will continue to warn while it is not joined to an AG; if you wish to exclude a given database from the advisory condition, simply update Is_AG to 0.

Instances that are not Hadr enabled with at least one AG will automatically have Is_AG set to 0 and will be excluded from the checking even if the module is enabled.
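The flag logic described above can be sketched in a few lines of Python. This is only a reading of the release notes, not the Inspector's actual T-SQL; the class and function names are hypothetical, and the two booleans stand in for the Is_AG and Is_AGJoined columns.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AGDatabaseRow:
    """Stand-in for one row of [Inspector].[AGDatabases] (shape assumed)."""
    database_name: str
    is_ag: bool         # Is_AG: the database is expected to be in an AG
    is_ag_joined: bool  # Is_AGJoined: the database is actually joined


def databases_to_warn(rows: Iterable[AGDatabaseRow],
                      hadr_enabled_with_ag: bool = True) -> List[str]:
    """Return names that would raise the advisory under this reading."""
    if not hadr_enabled_with_ag:
        # Instances without HADR and at least one AG are excluded entirely.
        return []
    return [r.database_name for r in rows if r.is_ag and not r.is_ag_joined]


if __name__ == "__main__":
    rows = [
        AGDatabaseRow("Sales", True, True),     # joined: no advisory
        AGDatabaseRow("Staging", True, False),  # expected but not joined
        AGDatabaseRow("Scratch", False, False), # excluded via Is_AG = 0
    ]
    print(databases_to_warn(rows))
```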

Click through for a big list of changes.


October 2018