Curated SQL – Page 1363 – A Fine Slice Of SQL Server

When we do a transformation on any RDD, it gives us a new RDD. But it does not start the execution of those transformations. The execution is performed only when an action is performed on the new RDD and gives us a final result.

So once you perform any action on an RDD, Spark context gives your program to the driver.

The driver creates the DAG (directed acyclic graph) or execution plan (job) for your program. Once the DAG is created, the driver divides this DAG into a number of stages. These stages are then divided into smaller tasks and all the tasks are given to the executors for execution.
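The lazy-evaluation idea in that excerpt can be sketched without a cluster. The toy class below is plain Python and is not Spark's API or implementation; the name LazyDataset and its methods are made up for illustration. Each transformation only records a step and returns a new object, and nothing runs until the collect() action is called.

```python
class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, not executed."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, fn):
        # Transformation: return a NEW dataset with one more recorded step.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        # Transformation: likewise, only the plan grows.
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def collect(self):
        # Action: walk the recorded plan and actually process the data.
        rows = iter(self._source)
        for kind, fn in self._steps:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)


calls = []  # records every element that double() really touches


def double(x):
    calls.append(x)
    return x * 2


numbers = LazyDataset(range(5))
plan = numbers.map(double).filter(lambda x: x % 4 == 0)
calls_before_action = list(calls)  # still empty: no work has been done yet
result = plan.collect()            # the action triggers the whole chain
```

After collect() runs, calls holds every input element, while calls_before_action is still empty. Real Spark goes further and splits the plan into stages and tasks for the executors, which this sketch does not attempt.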

Click through for more details.

Tracking Down Long-Running xp_cmdshell Processes

Published 2018-04-26 by Kevin Feasel

Thomas Rushton investigates what’s taking so long with an xp_cmdshell call:

I wanted to know what he was up to, but the sql_text field only gives “xp_cmdshell”, not anything useful that might help to identify what went wrong.

So we have to go to Task Manager on the server. On the “Process Details” page, you can select which detail columns you want to see. We want to see the Command Line, as that’ll tell us if it’s some manually-launched batch job that’s failed or something else going wrong.

An alternative to using the Task Manager is to open ProcMon, part of the Sysinternals toolset. It takes a bit of getting used to, but is quite powerful once you know its ins and outs.
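If you would rather script the same lookup, here is a minimal Python sketch. It assumes the third-party psutil package is installed on the server and that xp_cmdshell launches its command as a cmd.exe child of sqlservr.exe; the function names are illustrative and do not come from Thomas's post.

```python
def snapshot_processes():
    """Return one dict per running process (needs the psutil package)."""
    import psutil  # third-party; imported here so the filter below works without it

    return [p.info for p in psutil.process_iter(["pid", "ppid", "name", "cmdline"])]


def xp_cmdshell_candidates(processes):
    """Pick out cmd.exe processes whose parent process is sqlservr.exe."""
    sql_pids = {
        p["pid"] for p in processes
        if (p["name"] or "").lower() == "sqlservr.exe"
    }
    return [
        p for p in processes
        if (p["name"] or "").lower() == "cmd.exe" and p["ppid"] in sql_pids
    ]


def describe(process):
    """Format a process as 'pid: full command line'."""
    # cmdline can be None when the process details are not accessible.
    return f'{process["pid"]}: {" ".join(process["cmdline"] or [])}'
```

Running describe() over xp_cmdshell_candidates(snapshot_processes()) from an elevated session on the server should list the same command lines that the Task Manager column shows.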

Creating Seaborn Plots With R

Published 2018-04-26 by Kevin Feasel

Abdul Majed Raja shows how to call Python from R and build plots using the Seaborn Python package:

The reticulate package provides a comprehensive set of tools for interoperability between Python and R. The package includes facilities for:

Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session.

Translation between R and Python objects (for example, between R and Pandas data frames, or between R matrices and NumPy arrays).

Flexible binding to different versions of Python including virtual environments and Conda environments.

Reticulate embeds a Python session within your R session, enabling seamless, high-performance interoperability.

The most common use of reticulate I’ve seen is running TensorFlow neural networks from R.

Getting The Current Date And Time In SQL Server

Published 2018-04-26 by Kevin Feasel

Randolph West shows a few functions which can retrieve current date and time information:

What do we mean by local date and time?

As discussed previously, SQL Server is not time zone aware, nor does it have to be. This is because the operating system that SQL Server runs on can have multiple custom regional settings depending on which user is logged into the server.

This holds true for the SQL Server service account as well, which is just another user on the operating system. When any of these functions is called, it is asking for the date and time from the operating system.

If you’re going to use DATETIME2 (which you generally should), take advantage of the precision that SYSUTCDATETIME() gives you over GETUTCDATE().
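To see what that precision difference means in practice: GETUTCDATE() returns a DATETIME, which is rounded to increments of .000, .003, or .007 seconds, while SYSUTCDATETIME() returns a DATETIME2(7) with 100-nanosecond precision. The Python sketch below approximates the DATETIME rounding by snapping a timestamp to the nearest 1/300 of a second; the function name is hypothetical, and Python's own datetime only goes down to microseconds, so it cannot show the full DATETIME2 range.

```python
from datetime import datetime, timedelta


def snap_to_datetime_precision(ts):
    """Approximate SQL Server DATETIME rounding (1/300-second ticks)."""
    ticks = round(ts.microsecond * 300 / 1_000_000)
    base = ts.replace(microsecond=0)
    # 300 ticks rolls over into the next whole second.
    return base + timedelta(microseconds=round(ticks * 1_000_000 / 300))


precise = datetime(2018, 4, 26, 12, 0, 0, 123456)
coarse = snap_to_datetime_precision(precise)

print("full precision:", precise)
print("DATETIME-like: ", coarse)
print("detail lost:   ", abs(precise - coarse))
```

The fractional part that gets thrown away is small, but it is enough to reorder or collide rows when you sort or join on timestamps.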

SQL Operations Studio April Release

Published 2018-04-26 by Kevin Feasel

Alan Yu announces the April release of SQL Operations Studio:

Highlights for this build include the following.

- Public preview release of SQL Agent extension
- Added new extensions and improved existing extensions
  - Improvements to Server Reports Extension
  - Release of SSMS Keymap extension
  - Release of AlwaysOn Insights extension
  - Release of MSSQL Instance Insights
  - Release of MSSQL Db Insights
- Added Visual Studio Code 1.21 platform source code refresh
  - Improved large and protected file support for saving Admin protected and >256M files within SQL Ops Studio
  - Integrated Terminal splitting to work with multiple open terminals at once
  - Reduced installation on-disk file count footprint for faster installs and startup times
- Continue to fix GitHub issues

There’s a lot in here.

Takeaways From Implementing Power BI Embedded

Published 2018-04-26 by Kevin Feasel

Meagan Longoria has some thoughts after a proof of concept using Power BI Embedded:

After making changes and testing your report, make sure to clear any slicer values before publishing. If you have row-level security on a field shown in a slicer and you leave values selected, the selected values will be shown to users when they view the report. For example, let’s say you have created a row-level security role that can only see Product A, but you can see everything, and you left Product A and Product B selected and deployed the report. A user who views the report next and is a member of that RLS role will see the two selected values in the slicer, even though they can’t see the data for Product B on the page. This may not be a big deal for an internal report. But now imagine this is for clients. You don’t want clients to see other clients in the list. This behavior is consistent in the Power BI web service and isn’t specific to embedding. It’s just important to remember this.

There are plenty of interesting notes here, so check it out if you’re thinking of a Power BI project.

Creating A Custom Calendar Table With Power Query

Published 2018-04-26 by Kevin Feasel

Matt Allington shows how to create a calendar table which allows users to set the start and end dates:

My approach to teaching people to use Power Query is to always use the UI where possible. I first use the UI to do the hard work, then jump in and make small changes to the code created by the UI to meet any specific variations required. Keep this concept in mind as you read this article.

I am going to use Power BI Desktop as the tool for this, but of course Power Query for Excel will work just as well and the process is identical. In fact the calendar query at the end can easily be cut and pasted between Power BI and Power Query for Excel.

Check it out for another method for building calendar tables. I tend to build them in SQL Server because that’s what I’m most familiar with, but it’s good to know a few different ways of doing this.
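As one more way of doing this, here is a minimal Python sketch of a calendar table driven by user-supplied start and end dates. It is not Matt's Power Query code; the function name and column names are assumptions chosen for illustration.

```python
from datetime import date, timedelta

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")
MONTH_NAMES = ("January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December")


def build_calendar(start, end):
    """Return one row (dict) per day from start to end, inclusive."""
    if end < start:
        raise ValueError("end date must not be before start date")
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date": current,
            "year": current.year,
            "month_number": current.month,
            "month_name": MONTH_NAMES[current.month - 1],
            "day_of_month": current.day,
            "day_name": DAY_NAMES[current.weekday()],  # weekday(): Monday is 0
        })
        current += timedelta(days=1)
    return rows


april_2018 = build_calendar(date(2018, 4, 1), date(2018, 4, 30))
for row in april_2018[:3]:
    print(row["date"], row["day_name"], row["month_name"])
```

Changing the two dates passed to build_calendar() is the equivalent of letting users set the start and end of the table.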

Azure Data Lake Alerting

Published 2018-04-25 by Kevin Feasel

Jose Lara shows how to send alerts if you hit a utilization threshold:

If you want to see the step-by-step guide to create a new Log Analytics alert, check out our recent blog post on creating Log Analytics Alerts.

For the alert signal logic, use the following values:

Use the query from the previous step
Set the sum of AUs to 50 as the threshold (you can use any number that reflects your own threshold)
Set the trigger to 0: whenever the threshold is breached
Set the period and frequency to 24 hours.
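The signal logic above amounts to a sum over a rolling window compared against a threshold. A minimal Python sketch of that check follows; it is not the Log Analytics query from Jose's post. The (finished_at, aus) tuple layout is hypothetical, and treating a breach as strictly greater than the threshold is an assumption.

```python
from datetime import datetime, timedelta


def total_aus(jobs, now, period=timedelta(hours=24)):
    """Sum the AUs of jobs that finished within the period ending at now."""
    window_start = now - period
    return sum(aus for finished_at, aus in jobs if window_start < finished_at <= now)


def should_alert(jobs, now, threshold=50):
    """True when the windowed AU total breaches the threshold."""
    return total_aus(jobs, now) > threshold


now = datetime(2018, 4, 25, 12, 0)
jobs = [
    (now - timedelta(hours=2), 30),   # inside the 24-hour window
    (now - timedelta(hours=20), 25),  # inside the 24-hour window
    (now - timedelta(hours=30), 40),  # too old, ignored
]
print("AUs in window:", total_aus(jobs, now))
print("alert:", should_alert(jobs, now))
```
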

Read the whole thing if you use Azure Data Lake Analytics; an unexpectedly large bill is a tough thing to swallow.

Creating Map Plots With ggmap

Published 2018-04-25 by Kevin Feasel

Laura Ellis shows how to use the ggmap package to create choropleth maps in R:

In the last map, it was a bit tricky to see the density of the incidents because all the graphed points were sitting on top of each other. In this scenario, we are going to make the data all one color and we are going to set the alpha variable which will make the dots transparent. This helps display the density of points plotted.

Also note, we can re-use the base map created in the first step “p” to plot the new map.

Check it out. This is an introduction to creating choropleths, making it a good start.

Running The Azure DTU Calculator On An Older Server

Published 2018-04-25 by Kevin Feasel

Jim Donahoe shows us how to get the Azure DTU calculator running on an older server without PowerShell:

I recently had to do an analysis of a client’s database workload using the Azure DTU Calculator (DTU Calculator) and thought it might be interesting to share just how I did that. I have run this tool numerous times for other clients via the PowerShell method and the Command Line method; however, this client’s environment was Windows Server 2008 R2 and SQL Server 2008 R2 SP3, so it had to be done differently.

Now, from the DTU Calculator page itself, it tells you how the process works. It essentially runs a perfmon trace for an hour with the following counters:

Processor – % Processor Time

Logical Disk – Disk Reads/sec

Logical Disk – Disk Writes/sec

Database – Log Bytes Flushed/sec

Unfortunately, my client did not have PowerShell accessible for me to use. I normally prefer the PowerShell script, but in this case I had to use the Command Line Interface. Both return the same results.
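For a sense of what an hour-long capture of those four counters looks like, here is a Python sketch that assembles a command line for typeperf, the counter-logging tool that ships with Windows. This is not the DTU Calculator's own utility and not necessarily what Jim ran. The _Total instances, the one-second interval, and the output file name are assumptions, and a named SQL Server instance exposes its counters under MSSQL$InstanceName:Databases instead.

```python
# Counter paths for a default SQL Server instance (assumed; adjust as needed).
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\LogicalDisk(_Total)\Disk Reads/sec",
    r"\LogicalDisk(_Total)\Disk Writes/sec",
    r"\SQLServer:Databases(_Total)\Log Bytes Flushed/sec",
]


def typeperf_command(output_csv, interval_seconds=1, duration_seconds=3600):
    """Build the argument list for a typeperf capture written to a CSV file."""
    samples = duration_seconds // interval_seconds
    return [
        "typeperf", *COUNTERS,
        "-si", str(interval_seconds),  # seconds between samples
        "-sc", str(samples),           # number of samples to collect
        "-f", "CSV",                   # output format
        "-o", output_csv,              # output file
    ]


command = typeperf_command("dtu_counters.csv")
print(" ".join(f'"{part}"' if " " in part else part for part in command))
```

The list can be passed to subprocess.run() on the server; it is only printed here so the sketch stays safe to run anywhere.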

Click through to see how Jim did it.
