Press "Enter" to skip to content

Author: Kevin Feasel

Miscellany On Java In SQL Server

Niels Berglund continues a series on SQL Server 2019 and Java support:

This post is the fourth post where I look at the Java extension in SQL Server, i.e. the ability to execute Java code from inside SQL Server. The previous three posts are:
SQL Server 2019 Extensibility Framework & Java – Hello World: We looked at installing and enabling the Java extension, as well as some very basic Java code.
SQL Server 2019 Extensibility Framework & Java – Passing Data: In this post, we discussed what is required to pass data back and forth between SQL Server and Java.
SQL Server 2019 Extensibility Framework & Java – Null Values: This Null Values post is a follow-up to the Passing Data post, and we look at how to handle null values in data passed to Java.
This fourth post acts as a “roundup” of miscellaneous “stuff” I did not cover in the three previous posts.

If you haven’t seen the first three, check them out too.
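
On the T-SQL side, the entry point is the same sp_execute_external_script procedure used for R and Python. Here is a minimal sketch, assuming a compiled class has already been deployed where the extension can find it; the package and class names are hypothetical, and the exact conventions the extension expects from the Java class have shifted between CTP builds:

    EXEC sp_execute_external_script
        @language = N'Java',
        @script = N'pkg.HelloWorld',          -- fully qualified name of a deployed Java class (hypothetical)
        @input_data_1 = N'SELECT 1 AS col1';  -- optional result set handed to the Java code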

The State Of Database Scoped Configurations

Niko Neugebauer takes us through the current state of Database Scoped Configurations in SQL Server:

I have already blogged about the first version of the Database Scoped Configurations for SQL Server 2016, with 4 visible options plus the procedure cache cleaning option, but we followed in SQL Server 2017 with 5 (listed) & 9 (in practice – DISABLE_INTERLEAVED_EXECUTION_TVF, DISABLE_BATCH_MODE_ADAPTIVE_JOINS, BATCH_MODE_MEMORY_GRANT_FEEDBACK, BATCH_MODE_ADAPTIVE_JOINS are visible and functioning), and in just another year we have received a huge upgrade to the currently available 21 for SQL Server 2019.

It seems like this is a common route the SQL Server teams are going down, and it makes sense: your settings for Mega-DB probably shouldn’t be the same as for the tiny database in the corner. Oh, and that whole Azure SQL Database thing.
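
For anyone who has not played with these yet, they are set per database rather than per instance. A quick sketch of the moving parts:

    -- Flip a setting for the current database only
    ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;

    -- The procedure cache cleaning option Niko mentions
    ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;

    -- List current values, including what a readable secondary would use
    SELECT name, value, value_for_secondary
    FROM sys.database_scoped_configurations;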

Querying SSISDB For Errors

Kevin Hill shows us two ways to get at error messages in Integration Services packages running in the SSIS Catalog:

My client makes extensive use of SSIS and deploys the packages to the Integration Services Catalog (ISC), and runs them via hundreds of jobs.
When one of the jobs fails, I have to go get the details.
Job History doesn’t have it.

I’d recommend the query route if you have to do this more than once or twice.
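
If you do go the query route, the SSISDB catalog views are where the details live. Here is a sketch in the spirit of what Kevin describes; the numeric codes are documented values (operation status 4 means failed, message_type 120 means error):

    -- Most recent failed operations in the catalog
    SELECT TOP (10) o.operation_id, o.object_name, o.created_time
    FROM SSISDB.catalog.operations AS o
    WHERE o.status = 4  -- failed
    ORDER BY o.created_time DESC;

    -- Error messages for one of those operations
    SELECT em.message_time, em.package_name, em.message_source_name, em.message
    FROM SSISDB.catalog.event_messages AS em
    WHERE em.operation_id = 12345  -- substitute an operation_id from above
      AND em.message_type = 120    -- errors only
    ORDER BY em.message_time;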

Building A Kubernetes Cluster With Kubespray

Chris Adkin continues a series on Kubernetes clusters:

In essence, Kubespray is a bunch of Ansible playbooks: YAML files that specify what actions should take place against one or more machines listed in a hosts.ini file, which resides in what is known as an inventory. Of all the infrastructure-as-code tools available at the time of writing, Ansible is the most popular and has the greatest traction. Examples of playbooks produced by Microsoft can be found on GitHub for automating tasks in Azure and deploying SQL Server availability groups on Linux. The good news for anyone into PowerShell is that PowerShell modules can be installed via Ansible and PowerShell commands can be executed via Ansible. Also, there are people already using PowerShell desired state configuration with Ansible. Ansible’s popularity is down to the fact that it is easy to pick up and agent-less because it relies on ssh, which is why one of the steps in this post includes the creation of keys for ssh. This free tutorial is highly recommended for anyone wishing to pick up Ansible.

Click through for a step-by-step tutorial.
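
To give a flavor of the inventory Chris describes, here is a minimal hosts.ini sketch; the host names and addresses are made up, and the group names have varied across Kubespray versions:

    node1 ansible_host=192.168.1.11 ip=192.168.1.11
    node2 ansible_host=192.168.1.12 ip=192.168.1.12
    node3 ansible_host=192.168.1.13 ip=192.168.1.13

    [kube-master]
    node1

    [etcd]
    node1

    [kube-node]
    node2
    node3

    [k8s-cluster:children]
    kube-master
    kube-node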

Creating An Azure Storage Account

John Morehouse walks us through setting up an Azure Storage Account through the Azure Portal:

Azure offers a lot of features that enable IT professionals to really enhance their environment. One feature that I really like about Azure is storage accounts. Disk is relatively cheap, and that continues to hold true in the cloud. For less than $100 per month, you could get up to 5TB of storage including redundancy to another Azure region.

Read on to learn how to set up one of these.

Diving Into OPTION(RECOMPILE)

Arthur Daniels explains some of the nuance behind OPTION(RECOMPILE) on T-SQL statements:

SQL Server will compile an execution plan specifically for the statement that the query hint is on. There are some benefits, like something called “constant folding.” To us, that just means that the execution plan might be better than a normal execution plan compiled for the current statement.
It also means that the statement itself won’t be vulnerable to parameter sniffing from other queries in cache. In fact, the statement with OPTION(RECOMPILE) won’t be stored in cache.

Click through for a couple of demos as well as a discussion of positives and negatives regarding its use.
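
To see the sniffing angle concretely, here is a minimal sketch using a local variable (the table and column are made up). Without the hint, the optimizer cannot see the variable’s runtime value and falls back on average-density guesses; with it, the runtime value is embedded into the plan, and that plan is not cached:

    DECLARE @cutoff datetime = '20190101';

    -- Estimate based on density, since @cutoff is unknown at compile time
    SELECT COUNT(*) FROM dbo.Orders WHERE OrderDate >= @cutoff;

    -- Compiled per execution with the runtime value of @cutoff embedded
    SELECT COUNT(*) FROM dbo.Orders WHERE OrderDate >= @cutoff
    OPTION (RECOMPILE);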

MLflow 0.8.1 Released

Aaron Davidson, et al, announce a new version of Databricks MLflow:

When scoring Python models as Apache Spark UDFs, users can now filter UDF outputs by selecting from an expanded set of result types. For example, specifying a result type of pyspark.sql.types.DoubleType filters the UDF output and returns the first column that contains double precision scalar values. Specifying a result type of pyspark.sql.types.ArrayType(DoubleType) returns all columns that contain double precision scalar values. The example code below demonstrates result type selection using the result_type parameter, and a short example notebook illustrates a Spark model being logged and then loaded as a Spark UDF.

Read on for a pretty long list of updates.

File Formats Supported In HDFS

Manoj Pandey covers a few of the file types supported by the Hadoop Distributed File System:

HDFS, or the Hadoop Distributed File System, is the distributed file system provided by the Hadoop Big Data platform. The primary objective of HDFS is to store data reliably even in the presence of node failures in the cluster. This is facilitated with the help of data replication across different racks in the cluster infrastructure. Files stored in HDFS are used for further processing by different data processing engines like Hadoop MapReduce, Hive, Spark, Impala, Pig, etc.

There are a few other formats not included in this list, including RCFile (which has been superseded by both ORC and Parquet), but this hits the highlights.
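
In Hive, at least, the format is a one-line decision at table creation time. A sketch with a made-up table definition:

    -- Same logical table, three on-disk formats
    CREATE TABLE sales_text    (id INT, amount DOUBLE) STORED AS TEXTFILE;
    CREATE TABLE sales_orc     (id INT, amount DOUBLE) STORED AS ORC;
    CREATE TABLE sales_parquet (id INT, amount DOUBLE) STORED AS PARQUET;

    -- Converting is just an insert from one format into another
    INSERT INTO sales_orc SELECT id, amount FROM sales_text;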

SQL Server’s Built-In Monitoring

Jason Brimhall has a three-part series on the types of monitoring built into SQL Server. Part one is an overview and includes the Default Trace:

The default trace by itself is something that can be turned off via a configuration option. There may be good reason to disable the default trace. Before disabling it, please consider what it captures. I will use a query to demonstrate the events and categories that are configured for capture in the default trace.

Part two looks at the system_health Extended Event session:

Beyond being a component of the black box for SQL Server, what exactly is this event session? The system_health session is much as the name implies – it is a “trace” that attempts to gather information about various events that may affect the overall health of the SQL Server instance.
The event session will trap various events related to deadlocks, waits, CLR, memory, schedulers, and reported errors. To get a better grasp of this, let’s take a look at the event session makeup based on the available metadata in the DMVs and catalog views.

Part three is the sp_server_diagnostics stored procedure:

Beyond being a component of the black box for SQL Server, what exactly is this diagnostics process? The sp_server_diagnostics procedure is much as the name implies: a “diagnostics” service that attempts to gather information about various events that may affect the overall health of the SQL Server instance.
The diagnostics process will trap various server-related health (diagnostics) information about the SQL Server instance in an effort to detect potential failures and errors. This diagnostics session/process traps information for five different categories by default. There is a sixth category of information for those special servers that happen to be running an Availability Group.

I’ve used the first two but did not know about the third. Jason goes into good depth on each, showing you the types of information you can get out of these. Read the whole thing.
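
If you want to poke at all three yourself, here is a quick sketch of one way to reach each of them:

    -- 1) Default trace: locate the file and read recent events
    SELECT TOP (100) gt.StartTime, gt.EventClass, gt.DatabaseName, gt.TextData
    FROM sys.traces AS t
    CROSS APPLY sys.fn_trace_gettable(t.path, DEFAULT) AS gt
    WHERE t.is_default = 1
    ORDER BY gt.StartTime DESC;

    -- 2) system_health: pull the ring buffer target as XML
    SELECT CAST(xet.target_data AS xml) AS session_data
    FROM sys.dm_xe_session_targets AS xet
    JOIN sys.dm_xe_sessions AS xes
        ON xes.address = xet.event_session_address
    WHERE xes.name = N'system_health'
      AND xet.target_name = N'ring_buffer';

    -- 3) sp_server_diagnostics: returns one row per health component
    EXEC sys.sp_server_diagnostics;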
