2017-10-23 – Curated SQL

Inline U-SQL Functions

Published 2017-10-23 by Kevin Feasel

Damien Widera shows us how to write inline functions in U-SQL:

Now let’s go to the new thing – undocumented usage of inline functions. My function is pretty simple and I can imagine that function you will write could be as simple as mine but your functions will probably do something more useful. To simplyfy the coding process you could use inline function in your USQL script and not have to write any code in the C# file.

The could could look like this:

DECLARE @in string = "/Data/Aircraft/2006ByMonth/{*}.tsv";
DECLARE @out string = "/Data/Aircraft/2006ByMonth/out/CSharpFunction.tsv";
DECLARE @func Func<int,int> = (s)=>{return s+1;};

Things which make languages look more like functional languages generally get a thumbs up from me.

Comments closed

Copying Azure Data Lake Databases

Published 2017-10-23 by Kevin Feasel

Yanan Cai shows how to copy Azure Data Lake databases for local debugging and development:

The concept of a database is used to group related data structures and functions together. ADLA users have databases in their production environment that contain tables, assemblies, table valued functions and other objects. Previously, when developing and tuning U-SQL queries on a local machine, developers would have to manually recreate everything in their production database. After coding they would have to identify any changes to the database and then update the production account’s database. This process took extra time and introduced errors without adding any value.

Using the Export Wizard, developers can clone the existing database environment and sample data directly to the local account. Developers can also choose to export only parts of the database to the local database. Follow below steps to export your U-SQL databases.

Click through for the step-by-step process.

Comments closed

More R Services Internals Spelunking

Published 2017-10-23 by Kevin Feasel

Niels Berglund continues his series on R Services internals:

What happens is that straight after the AuthenticateConnection you will hit the WriteAsync breakpoint twice, the same way as we see in Code Snippet 4. The first hit at WriteAsync sort of makes sense, as it ties up with the call-stack we see in Code Snippet 5. But what about the second WriteAsync (the one that causes the second package to be sent), where does that come from? To try to figure that out, we start with the call-stack for that particular WriteAsync.

Execute the code in Code Snippet 1 again, and continue to the second WriteAsync. When you hit the breakpoint do a kc again. The call-stack should now look somewhat like this (this time it is the full call-stack):

It’s interesting the kind of stuff you can find with Wireshark and a debugger.

Comments closed

Defining Memory Sets

Published 2017-10-23 by Kevin Feasel

Kendra Little has concise definitions of a few terms that we see when working with memory:

Private Bytes for a process

For a process in Windows, this shows…

How much memory is allocated
This includes page file space allocated and standby list

Click through for the full set of definitions.

Comments closed

External Memory Pressure With SQL On Linux

Published 2017-10-23 by Kevin Feasel

Anthony Nocentino explains how SQL Server on Linux reacts to memory pressure:

We can use tools like ps, top and htop to look our are virtual and physical memory allocations. We can also look in the /proc virtual file system for our process and look at the status file. In here we’ll find the point in time status of a process, and most importantly the types of memory allocations for a process. We’ll get granular data on the virtual memory allocations and also the resident set size of the process. Here are the interesting values in the status file we’re going to focus on today.

VmSize – total current virtual address space of the process
VmRSS – total amount of physical memory currently allocated to the process
VmSwap – total amount of virtual memory currently paged out to the swap file (disk)

The differences are going to be interesting for people to troubleshoot later, particularly if you look at SOS_SCHEDULER_YIELD and give a knee-jerk reaction that the problem is with CPU.

Comments closed

Powershell’s Switch Statement

Published 2017-10-23 by Kevin Feasel

Shane O’Neill shows us Powershell’s switch statement:

We send out notification emails if jobs fail as well but hey, we’re DBAs, we like a backup!

Since this used to be done manually, and since I like PowerShell, I created separate functions that can take individual log files and parse out the needed data in a nice, easy format.

My problem, and the reason for this post, was figuring out, based on the name of the file in a directory, which function to call…

Click through to see how to use the switch statement, as well as how to switch on an expression.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

Day: October 23, 2017

Inline U-SQL Functions

Copying Azure Data Lake Databases

More R Services Internals Spelunking

Defining Memory Sets

Private Bytes for a process

External Memory Pressure With SQL On Linux

Powershell’s Switch Statement