U-SQL Parallelism

Kevin Feasel

2016-06-20

U-SQL

Saveen Reddy discusses parallelism with U-SQL jobs:

Previously an Data Lake Analytics account was limited to a total of 60 units on parallelism available to 3 concurrent U-SQL jobs. And each U-SQL job was limited to only using a maximum of 20 units of parallelism. However, sometimes developers want to run a U-SQL job that uses more parallelism. For example: they might want two concurrently running jobs that each use 30 units. Today, we are removing the per-job limit. Now you can concurrently run jobs that use any amount of parallelism as long as the total for the running jobs doesn’t go beyond the maximum for your account (currently 60 units of parallelism).

If you are already running U-SQL jobs, there’s no need to worry. This change doesn’t require any action on your part. Your jobs run just as they did before. But now, if you want to you can take advantage of all the parallelism in your account.

This doesn’t change the available performance units, so there’s no billing change.

Negative Width And Height In SSMS

Manoj Pandey ran into an issue with Management Studio wanting to open a window with a negative size:

By checking the error its obvious that there is something wrong with Width or Height of SSMS Query-Editor window.

So, I went to REGEDIT (In RUN, type regedit.exe) and after navigating here n there got the location where to update this property.

Navigate to folder: HKEY_CURRENT_USER\SOFTWARE\Microsoft\SQL Server Management Studio\13.0\

Here check the MainWindow property value (image below), it was showing: 0 451 109 -120 876 1

Change it to a positive value considering the width of your SSMS editor window, I replaced -120 with 1400

I had no idea that the main window size details were kept in the Registry.

Dealing With Backup Files

Kenneth Fisher discusses backup files, especially when they mysteriously double in size:

Did you know a single backup file can contain multiple database backups? When you backup a database to a file, if that file already exists, then by default the backup will be appended to the existing file. Causing the file to increase in size. If it’s the same database (and yes you can have a single file containing backups from multiple databases) then the file size will double or more. This behavior is controlled by the INIT/NOINIT clause of the BACKUP DATABASE command. NOINIT (the default) tells SQL to append the new backup to the existing file. INIT tells SQL to overwrite the existing backup files. Note the header of the file is not initialized.

This is good information to know.

Biml And Excel

Kevin Feasel

2016-06-20

Biml

Dave Ballantyne has put together a Biml script to create complex Excel spreadsheets from data sets:

Anyway,  i’ve combined these two pain points to create a BIML routine that uses EPPlus to output multi-sheeted Excel spreadsheet reliably and fast.

At the moment its very basic , take SQL statements and output the data to an excel file, but in time i will be hoping to create some meta data to start ‘getting the crayons out’ and making them look a bit more pretty.

Check out his GitHub repo for more details.

Tools For Cortana Intelligence Suite Development

Melissa Coates has a list of tools she uses when working with Cortana Intelligence Suite:

4. Azure SDK

The Azure SDK sets up lots of libraries; the main features we are looking for from the Azure SDK right away are (a) the ability to use the Cloud Explorer within Visual Studio, and (b) the ability to create ARM template projects for automated deployment purposes. In addition to the Server Explorer we get from Visual Studio, the Cloud Explorer from the SDK gives us another way to interact with our resources in Azure.

This is a nice tools checklist to compare against what you’re using.

Indirect Checkpoint

Kendra Little notes that indirect checkpointing is now the default in SQL Server 2016:

SQL Server 2016 introduces big new features, but it also includes small improvements as well. Many of these features are described in the “It Just Runs Faster” series of blog posts by Bob Ward and Bob Dorr.

One article in this series explained that new databases created in SQL Server 2016 will use “Indirect Checkpoint” by default. Indirect checkpoint was added in SQL Server 2012, but has not previously been enabled by default for new databases. The article emphasizes this point:

Check out the comments as well.

Using Query Store

Justin Goodwin looks at Query Store in SQL Server 2016:

You can see that our problem query is incredibly easy to find in the top left window based on total duration. Also notice that in the top right Plan summary window, there is currently only one available plan for the query (plan_id 49).

We need to figure out how we can get our ‘good plan’ using Trace Flag 9481 as an available plan that we can force using the Query Store.

The cardinality estimator change in SQL Server 2014 wasn’t perfect, but when you can fix individual plans like this, it makes SQL Server much more powerful.

SSIS And NUMA

SQL Sasquatch has some SSIS package issues stemming from a lack of NUMA awareness:

So the server had plenty of free RAM.  But NUMA node 1 was in a pinch.  And SSIS spooled its buffers to disk.  Doggone it.

I guess I’d figured that notifications were sent based on server-wide memory state.  But I guess maybe memory state on each NUMA node can lead to a memory notification?

The target SQL Server instance, a beefy one, was also on this physical server.  There’s 1.5 TB of RAM on the server.  🙂

It also looks like the easiest fix is something which was deprecated in Windows Server 2012 R2.

ElasticSearch Production Checklist

Darren Perucci has a checklist for spinning up a production ElasticSearch environment:

When you are looking for a node or a cluster, it is a good idea to have a name which describes what you will need to find and separates one from another.

The default cluster name of “elasticsearch ” could allow any nodes to join the cluster, even if this was not intended. Thus, it is a good idea to give the cluster a distinct identifier instead.

There are a few things to think about in this post.

Row-Level Security

Steve Jones talks about row-level security:

I’ve wanted this feature to be easy for a long time. In fact, I’ve implemented a similar system a few times in different applications, but it’s been a cumbersome feature to meet, plus each developer needs to understand how the system works for it to work well. Even in the case where we once used views to hide our RLS, it was a performance issue.

Microsoft has made things easier with their Row Level Security feature. This was actually released in Azure in 2015, but it’s now available in SQL Server 2016 for every on premise installation as well.

Essentially for each row, there is some data value that is checked to determine if a user has access. This doesn’t mean a join. This doesn’t mean you write a lot of code. The implementation is simple, and straightforward, and I like it.

The implementation is pretty easy, but I’m concerned about the performance.  At least from my early view, this adds a good bit of performance drag on your queries.  That’s not so bad in a data warehouse environment, but in a busy transactional system, I’m not sure it can keep up.  I’d be interested in hearing other thoughts, though.

Categories

July 2018
MTWTFSS
« Jun  
 1
2345678
9101112131415
16171819202122
23242526272829
3031