Press "Enter" to skip to content

Day: December 17, 2018

Summarizing Improvements In Spark 2.4

Anmol Sarna summarizes Apache Spark 2.4 and pushes his meme game at the same time:

The next major enhancement was the addition of a lot of new built-in functions, including higher-order functions, to deal with complex data types easier.
Spark 2.4 introduced 24 new built-in functions, such as  array_unionarray_max/min, etc., and 5 higher-order functions, such as transformfilter, etc.
The entire list can be found here.
Earlier, for manipulating the complex types (e.g. array type) directly, there are two typical solutions:
1) exploding the nested structure into individual rows, and applying some functions, and then creating the structure again.
2) building a User Defined Function (UDF).
In contrast, the new built-in functions can directly manipulate complex types, and the higher-order functions can manipulate complex values with an anonymous lambda function similar to UDFs but with much better performance.

2.4 was a big release, so check this out for a great summary of the improvements it brings.

Comments closed

Building A Full-Stack App With F#

Shanglun Wang has a tutorial on building a full application using F#:

In the United States, there is a popular saying: “It’s five o’clock somewhere”.
In some parts of the world, 5:00 pm is the earliest time when it is socially acceptable to have a drink, or a traditional cup of tea.
Today, we will build an application based on this concept. We will build an application that, at any given time, searches through the various time zones, find out where it is five o’clock, and provide that information to the user.

It’s a neat app.  

Comments closed

Using Powershell As Your Default Prompt On Windows Server Core

Patrick Gruenauer shows us how to make Powershell the shell of choice when running on Windows Server Core:

Well, if you’re running a Server Core (I hope so, for domain controllers, dns, file services and more there’s no good reason to do not so), then it’s an ease to change that. The corresponding setting has to be configured in the registry. Regedit can be opened on Server Core (there are more graphical user interfaces that are shipped with core, for example notepad.exe and timedate.cpl). I’m talking about this key:

Click through for the instructions.

Comments closed

Service Account Names And SQL Agent Not Starting Up After Reboots

Hamish Watson walks us through a weird scenario:

Looking at the event logs I (eventually) found this:
Event ID 7000:
The MSSQLSERVER service failed to start due to the following error: 
The account name is invalid or does not exist, or the password is invalid for the account name specified.
Which is bizarre – as the service account had been used for months – but after each reboot the services had to be manually started. GPO and other things had been blamed but no one could actually find out why.

I’ve never used @ naming for service accounts, so until I read this I didn’t even know it was possible.  Now I’m going to forget again because apparently it’s not a good idea.

Comments closed

Mandatory And Default Powershell Parameters

Jess Pomfret shows us an example of creating mandatory parameters but also having a default set:

I came across a situation this week where I wanted to add the option of running an existing script,  for a specific server/database combination.  The script currently has no parameters and runs against all servers in the environment from a scheduled task.  I wanted to make sure that behavior didn’t change. The other requirement was that if I specified Server, Database should be a mandatory parameter and vice versa.
The final solution was to add the two parameters to a parameter set and make them both mandatory.  I also had to add a different DefaultParameterSet (thanks to Andrew for this idea), otherwise it defaulted to the defined parameter set, meaning the script always required both Server and Database parameters.

Click through for a demo.

Comments closed

Bulk Insertion With An Identity Column

Kenneth Fisher gives us a quick post on bulk insertion against tables with identity columns:

TL;DR; BULK INSERT doesn’t have an easy way to specify a column list so the columns in the insert file must match the columns in the table unless you use a format file or a staging table.
As simple as they appear on the surface identity columns are fairly complicated things. BULK INSERT, on the other hand, is pretty complicated even on the surface (at least as far as I’m concerned). Because of this, the combination can get even worse. When you do an insert into a table that has an identity column you usually just ignore that column and it’s automatically filled in. However, that’s not true with BULK INSERT. Unless you use a format file, the number of columns (and their order) in the file has to match the table.

Read the whole thing.

Comments closed

Availability Groups And Read-Only Filegroups

Allan Hirt walks us through a couple of scenarios involving databases with read-only filegroups using Always On Availability Groups:

A question came across my inbox this week which I decided to investigate: what happens if you have read only filegroups as part of your database, and you want to use Always On Availability Groups? Let’s find out.
First, I have a database with two filegroups: one read write (PRIMARY) and one read only (ROFG).

Click through for a demonstration of this, as well as a different scenario in which you might want only the read-write data on the secondary.

Comments closed

Using DISTINCT With XML Data

Dave Bland has a workaround for a limitation with processing XML in SQL Server:

Since I was working on an example for my next performance class, I decided to use an example from that session.  In the query below the qp.* part of the column list will return a column named query_plan which is an XML data type.
SELECT DISTINCT pa.,cp.,qp.* 
FROM
(SELECT plan_handle
FROM sys.dm_exec_cached_plans) cp 
CROSS APPLY sys.dm_exec_query_plan (cp.plan_handle ) qp
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) As pa
WHERE is_cache_key = 1 AND pa.attribute = ‘set_options’
However, when I add the DISTINCT keyword I get the error below.

Read on for a workaround for this.

Comments closed