Press "Enter" to skip to content

Author: Kevin Feasel

Resource Governor For R

Tomaz Kastrun describes using Resource Governor with Microsoft R Services:

Setting an external resource pool for the execution of R commands using sp_execute_external_script has proven extremely useful, especially in cases where you have other workloads present, when you don’t want to overdo it on data analysis and take too many resources from others (especially when running data analysis in a production environment), or when you know that your data analysis will require maximum CPU and memory for a specific period. In such cases, using and defining an external resource pool is not only useful but highly recommended.

Resource Governor is a feature that enables you to manage SQL Server workloads and their system resource consumption limits. Limits can be configured for any workload in terms of CPU, memory, and I/O consumption. Where you have many different workloads on the same SQL Server, Resource Governor helps allocate requested resources.

If you’re concerned about R soaking up all of your server’s memory, Resource Governor is a great way of limiting that risk.
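
To make that concrete, here is a minimal sketch of capping the default external resource pool, which governs sp_execute_external_script workloads, wrapped in PowerShell via Invoke-Sqlcmd. The instance name and the 50/30 caps are placeholder assumptions, not values from Tomaz’s post.

# A hedged sketch: cap the default external resource pool.  The instance name
# and percentages are assumptions; tune them to your own workload.
Import-Module SqlServer   # the older SQLPS module also works

$sql = @"
ALTER EXTERNAL RESOURCE POOL "default"
    WITH (MAX_CPU_PERCENT = 50, MAX_MEMORY_PERCENT = 30);
ALTER RESOURCE GOVERNOR RECONFIGURE;
"@

Invoke-Sqlcmd -ServerInstance 'localhost' -Query $sql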


PowerShell All The Things

Max Trinidad reports that Microsoft is making PowerShell a cross-platform language:

This is PowerShell Core only, Alpha Version 6.0.0.9, and there’s a lot of work to do. Bugs and feedback are being submitted as the community contributes to its success.

This version is also available for Windows 10 / Server 2016 and Windows 8.1 / Server 2012 R2. You can run it side by side with the current version of PowerShell.

Max has notes on how to install it on Ubuntu.  Given that Microsoft is bringing Bash to Windows and PowerShell to Linux, these are interesting times.
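
If you do run the alpha side by side with Windows PowerShell, the built-in $PSVersionTable variable tells you which one a given session is:

# PSEdition reports 'Core' for the cross-platform build and 'Desktop' for
# Windows PowerShell (very old Windows PowerShell versions may not populate it).
$PSVersionTable | Format-Table -AutoSize
$PSVersionTable.PSEdition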


T-SQL Tuesday 081 Roundup

Jason Brimhall has a roundup of T-SQL Tuesday #081:

One of the tricks to becoming and staying a top-tier data talent or professional is a perpetual cycle of learning, adapting, changing, and evolving. We must be in a continual cycle of self-evaluation and self-modification. Let’s call this by something else – we must be agile. There, I said the five-letter word. Think about it in broad strokes with your career – it is a development process with perpetual evaluation, review, and tweaks.

Now think about the invite and see how that fits with what I just said or with the, cough cough, agile flow. You start (albeit very basically) with a need for enhancement, then you plan which pieces of the enhancement you can accomplish, you then do the work (whether successful or not), then after you deliver the work you conclude with a retrospective (what went well and what needs to change). Yes! I do feel rather dirty for sneaking this on everybody like this. That said, when you think about the model and apply it in broad strokes to your career path – it has merit.

Read on to see who participated this month, along with Jason’s thoughts.


Flink: Streams Versus Batches

Kevin Jacobs has an article comparing Apache Flink to Spark Streaming:

The other type of data is the data stream. Data streams can be visualized as water flowing from a tap to a sink. This process never ends. The nice property of streams is that you can consume the stream while it is flowing. There is almost no latency involved in consuming a stream.

Apache Spark is fundamentally based on batches of data. Because of that, at least some latency is introduced for all processing jobs. Apache Flink, on the other hand, is fundamentally based on streams. Let’s take a look at some evidence for the difference in latency.

Read the whole thing.
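
If the distinction feels abstract, here is a loose PowerShell analogy (not Flink or Spark code; the log file name is a placeholder): the -Wait form consumes lines the moment they arrive, while the batch form waits for the entire read to finish first.

# Stream-ish: -Wait keeps the file open and emits each new line as it is
# appended, so every line is consumed with near-zero latency.
Get-Content -Path '.\events.log' -Wait | ForEach-Object { "consumed: $_" }

# Batch-ish: read everything up front, then process; latency is at least the full read.
$batch = Get-Content -Path '.\events.log'
$batch | ForEach-Object { "consumed: $_" }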


Introduction To PowerShell

Allison Tharp is starting to learn PowerShell:

PowerShell is developed by Microsoft, specifically for task automation and configuration management.  A shell is a user interface that gives the user access to services of an operating system.  The shell you are likely most familiar with is the Command Prompt.  PowerShell is like the Command Prompt in many ways, but it is much more powerful.

PowerShell is based on the .NET Framework and is considered a scripting language.  Microsoft also developed the Windows PowerShell ISE (Integrated Scripting Environment).  This is a GUI that lets us write scripts without having to type all of the commands into the command line.

I highly recommend learning PowerShell if you’re in the Windows administration space.  It’s not quite as useful on the development side (where you can quickly throw together a .NET app), but it’s great for scripting out those admin tasks.
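
For a first taste of the pipeline, here is the sort of one-liner admin task PowerShell is built for:

# Objects, not text, flow between commands: the five processes using the most memory.
Get-Process |
    Sort-Object WorkingSet -Descending |
    Select-Object -First 5 Name, Id,
        @{ Name = 'WorkingSetMB'; Expression = { [math]::Round($_.WorkingSet / 1MB, 1) } }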


Streaming Data To Power BI

Reza Rad shows how to hook up streaming data sources to Power BI:

As I mentioned before, download the sample project and then select the project under this path:

samples/consoleapp/getting-started-for-dotnet-real-time/PBIRealTimeStreaming

Then open the PBIRealTimeStreaming solution in Visual Studio. This project creates a data set including a datetime value and a numeric value and will pass it through the Power BI API (which will be discussed later in another post) to the Power BI service.

Real-time dashboards are great for making it look like you’re doing Very Important Things.  They can be useful in other ways too.  Read the whole thing.
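
Reza’s sample project is C#, but the streaming endpoint is plain REST, so a sketch like the following can push rows from PowerShell as well. The push URL and the two column names (ts, value) are placeholders for whatever your API streaming dataset defines:

# Placeholder push URL from the Power BI portal (API streaming dataset).
$endpoint = 'https://api.powerbi.com/beta/<workspace>/datasets/<dataset-id>/rows?key=<key>'

# One row matching an assumed two-column (datetime + numeric) dataset definition.
$row = @{
    ts    = (Get-Date).ToString('o')            # ISO 8601 timestamp
    value = Get-Random -Minimum 0 -Maximum 100  # stand-in numeric measure
}

# The push endpoint accepts a JSON array of row objects.
Invoke-RestMethod -Method Post -Uri $endpoint -ContentType 'application/json' -Body (ConvertTo-Json @($row))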


Package Processing With Biml

Ben Weissman gives us a table-driven process for SSIS package creation through Biml:

Now that we’ve seen multiple ways to use Biml to create SSIS packages that will generate and populate tables, we’ll try to bridge the gap on how to use Biml for packages that will execute stored procedures and other tasks for us. We’ll use a metadata model for that again, though it will be very minimalistic. This approach makes it very easy to maintain – we’re actually using it in exactly this way in multiple projects.

To get started, let’s create a new database “MyBiml_SimplisticMetaData” and create two tables, a view, and some sample data in there:

One area in Biml where I’ve had trouble is finding the “right” mix of SQL Server tables, T-SQL operations, and Biml (e.g., building up sets of tables for a warehouse load by storing them in a metadata table or by going through RootNode).  I tend to lean heavily toward this kind of solution, where most of my metadata lives in T-SQL, but when I do that, it always feels like there’s a “purer” way of doing it.  Regardless, one reason I like Biml is that even if I am doing it wrong (for some definition of “doing it wrong”), it’s still a huge time-saver.
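
In that table-driven spirit, here is a hedged sketch of a minimal metadata store. The database name comes from Ben’s post; the table layout is a placeholder, not his actual schema:

$sql = @"
CREATE DATABASE MyBiml_SimplisticMetaData;
GO
USE MyBiml_SimplisticMetaData;
GO
-- Placeholder metadata table: one row per stored procedure a generated package should run.
CREATE TABLE dbo.PackageTasks
(
    TaskId        int IDENTITY(1,1) PRIMARY KEY,
    PackageName   nvarchar(128) NOT NULL,
    ProcedureName nvarchar(256) NOT NULL,
    SortOrder     int NOT NULL
);
"@

Invoke-Sqlcmd -ServerInstance 'localhost' -Query $sql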


Checking For Credentials

Denny Cherry uses a try-catch block to figure out if you can authenticate automatically and prompts you otherwise:

Runbooks are very powerful tools which allow you to automate PowerShell commands that need to be run at different times.  One of the problems that I’ve run across when dealing with Azure runbooks is that there is no way to use the same script both on-premises during testing and when deploying. This is because of the way that authentication has to be handled when setting up a runbook.

The best way to handle authentication within a runbook is to store the authentication within the Azure Automation configuration as a stored credential.  The problem here is that you can’t use this credential while developing your runbook in the normal PowerShell ISE.

This is a clever idea.
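
The pattern looks roughly like this (a sketch of the idea using the AzureRM cmdlets of the day, not Denny’s exact code):

# Try for an existing Azure RM context; if none is cached, fall back to an
# interactive login so the same script works in the ISE during development.
try {
    Get-AzureRmContext -ErrorAction Stop | Out-Null
}
catch {
    Login-AzureRmAccount | Out-Null
}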


Partition Alignment In Storage

Wayne Sheffield has an article on I/O partition alignment:

What we need to do is offset the beginning of the data being stored on disk to a location more conducive to how the program is operating. This offset is known as the “Partition Alignment Offset”. To be in tune with SQL Server, this value should be a multiple of 64KB. However, you also need to consider the entire storage subsystem – the disks, controllers, and memory. Starting with Windows Server 2008, this offset is 1024KB – a nice multiple of 64KB that also works very nicely with most RAID disks/controllers. Prior to Windows Server 2008, partition alignment was not performed explicitly, so it will need to be done manually.

If you’ve migrated disks from server to server to server over the years, this is worth checking out.
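
Checking is quick - one option is to pull each partition’s starting offset via WMI and test whether it divides evenly by 64KB:

# List each partition's starting offset; anything not a multiple of 64 KB is suspect.
Get-WmiObject -Class Win32_DiskPartition |
    Select-Object Name,
        @{ Name = 'OffsetKB';    Expression = { $_.StartingOffset / 1KB } },
        @{ Name = 'Aligned64KB'; Expression = { ($_.StartingOffset % 65536) -eq 0 } }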


Explaining RBAR

Kenneth Fisher explains RBAR (row by agonizing row) with the help of an animated GIF:

So 23 milliseconds for the batch version and 850 milliseconds for RBAR. What a difference.

Now, in this case, the code for the RBAR version is also a lot more complicated. But that isn’t always the case. It also isn’t always the case that RBAR is slower. But it’s almost always a lot slower than batch.

So, while the code for RBAR is often easier to write, even though it might be physically longer, it’s probably going to be slower too.

Well-written, set-based solutions aren’t always guaranteed to be faster, but that’s one of the safest bets to make with T-SQL.
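
If you want to reproduce the gap yourself, a sketch along these lines works; it assumes a toy dbo.Numbers table with Id and Doubled columns (names are placeholders):

# Row-by-agonizing-row: a cursor touches one row per statement.
$rbar = @"
DECLARE @id int;
DECLARE c CURSOR FAST_FORWARD FOR SELECT Id FROM dbo.Numbers;
OPEN c;
FETCH NEXT FROM c INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Numbers SET Doubled = Id * 2 WHERE Id = @id;
    FETCH NEXT FROM c INTO @id;
END
CLOSE c;
DEALLOCATE c;
"@

# Set-based: the whole table in one statement.
$setBased = 'UPDATE dbo.Numbers SET Doubled = Id * 2;'

Measure-Command { Invoke-Sqlcmd -ServerInstance 'localhost' -Database 'tempdb' -Query $rbar }
Measure-Command { Invoke-Sqlcmd -ServerInstance 'localhost' -Database 'tempdb' -Query $setBased }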
