Press "Enter" to skip to content

Month: September 2020

Building an HTML Report with Failed Agent Job Info

Garry Bargsley continues a series on alerting when SQL Agent jobs fail:

Welcome to Part 2 in the series about SQL Server Agent Job Failures. In this part you are going to learn how to build an HTML report with Failed Agent Job information and send the report via e-mail to the interested team(s).

Let’s get started putting the pieces together to build our report.

Click through for the details, as well as a full code sample in PowerShell.
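Garry's post has the full script; purely as a rough sketch of the general shape (the instance name, recipients, and SMTP server below are placeholders, and the approach leans on dbatools rather than Garry's exact code):

```powershell
# Hypothetical sketch: find failed, enabled Agent jobs with dbatools, render an HTML table,
# and mail it. Instance, recipients, and SMTP server are placeholders.
$instance   = 'SQLPROD01'
$to         = 'dba-team@example.com'
$smtpServer = 'smtp.example.com'

$failedJobs = Get-DbaAgentJob -SqlInstance $instance |
    Where-Object { $_.IsEnabled -and $_.LastRunOutcome -eq 'Failed' } |
    Select-Object SqlInstance, Name, LastRunDate, LastRunOutcome

if ($failedJobs) {
    # Build a simple HTML table and send it to the interested team(s)
    $body = $failedJobs | ConvertTo-Html -Title 'Failed Agent Jobs' | Out-String
    Send-MailMessage -To $to -From 'sql-alerts@example.com' -Subject "Failed Agent jobs on $instance" `
        -Body $body -BodyAsHtml -SmtpServer $smtpServer
}
```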


Spark SQL in Delta Lake

Kundan Kumarr walks us through some of the basic SQL operations you can perform with Delta Lake in Apache Spark:

Nowadays Delta Lake is a buzzword in the Big Data world, especially among Spark developers, because it resolves lots of issues found in the Big Data domain. Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. It is evolving day by day and adds cool features in every release. On 19th June 2020, Delta Lake version 0.7.0 was released, and this is the first release on Spark 3.x. This release includes important key features that can make a Spark developer’s work easier.

One of the interesting key features in this release is the support for metastore-defined tables and SQL DDL. Now we can define Delta tables in the Hive metastore and use the table name in all SQL operations. We can run SQL DDL to create tables, insert into tables, explicitly alter the schema of the tables, and so on. So in this blog, we will learn how we can perform SQL DDL/DML/DQL operations in Delta Lake 0.7.0.

Click through for the examples.


A Review of Distributed Availability Groups

Joey D’Antoni shares some thoughts on distributed Availability Groups in SQL Server 2016 and later:

I’m writing this post because I’ve been mired in configuring a bunch of distributed availability groups for a client, and while the feature is technically solid, the lack of tooling can make it a challenge to implement. Specifically, I’m implementing these distributed AGs (please don’t use the term DAG, as you’ll piss off Allan Hirt, but more importantly it’s used in Microsoft Exchange High Availability, so it’s taken) in Azure, which adds a couple of additional changes because of the need for load balancers. You should note this feature is Enterprise Edition only, and is only available starting with SQL Server 2016.

Read on for some of the positives around distributed AGs, as well as some negatives (mostly around the lack of tooling).


Starting and Stopping Things with dbatools

Mikey Bronowski continues a series on dbatools:

OK, let’s start. While doing some maintenance tasks, there is a need to start/stop/restart SQL Server services. Using SQL Server Configuration Manager is one way, or even doing it straight from SSMS; however, working with multiple instances is easier with dbatools.

Mikey covers services, SQL Agent jobs, endpoints, traces, and Extended Events sessions, so check it out.
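As a flavor of the service cmdlets involved (a minimal sketch with hypothetical host names; see Mikey's post for the full treatment):

```powershell
# Minimal sketch of the dbatools service cmdlets; host names are placeholders.
$hosts = 'SQLPROD01', 'SQLPROD02'

# List the SQL Server-related services on each host
Get-DbaService -ComputerName $hosts |
    Select-Object ComputerName, ServiceName, ServiceType, State

# Restart only the Agent services across both hosts
Get-DbaService -ComputerName $hosts -Type Agent | Restart-DbaService

# Stop and start the database engine service on one host
# (may need -Force if dependent services such as Agent are still running)
Stop-DbaService  -ComputerName 'SQLPROD01' -Type Engine
Start-DbaService -ComputerName 'SQLPROD01' -Type Engine
```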


Automating Collection of SSIS Failure Data

Rob Sewell has a love for automation:

I have tried my best at all times to follow this advice over the last decade, and I am pretty happy that I have managed it.

– I use PowerShell (a lot!) to automate all sorts of routine tasks including migrating this blog
– I use Jupyter Notebooks to enable myself and others to automate Run Books, Training, Documentation, Demonstrations, and Incident Response. You can find my notebooks here
– I use Azure DevOps to automate infrastructure creation and changes with Terraform, as well as delivery of changes to code and unit testing.
– I use GitHub Actions to create this blog and publish the ADSNotebook module
– I use Chocolatey to install and update software
– I have used Desired State Configuration to ensure that infrastructure is as it is expected to be

At every point I am looking for a means to automate the thing I am doing, because it is almost guaranteed that at some time in the future, after you have done a thing, you will need to do it again or to do it slightly differently.

Click through to see what Rob has for us in the way of collecting SSIS log data after job failures.
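Rob's approach is worth reading in full; as a rough, hypothetical sketch of the underlying idea (querying SSISDB's catalog views for error messages from failed executions, with the instance name, lookback window, and output path made up here):

```powershell
# Hypothetical sketch: pull error messages for failed SSISDB executions from the catalog views.
# Instance name, lookback window, and output path are placeholders.
$query = @"
SELECT e.execution_id, e.folder_name, e.project_name, e.package_name,
       m.message_time, m.message
FROM   SSISDB.catalog.executions AS e
JOIN   SSISDB.catalog.event_messages AS m
       ON m.operation_id = e.execution_id
WHERE  e.status = 4                                   -- 4 = failed
  AND  m.message_type = 120                           -- 120 = error
  AND  e.start_time > DATEADD(DAY, -1, SYSDATETIMEOFFSET())
ORDER BY m.message_time;
"@

Invoke-DbaQuery -SqlInstance 'SQLPROD01' -Database SSISDB -Query $query |
    Export-Csv -Path '.\ssis-failures.csv' -NoTypeInformation
```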


Building an Azure Function to Automate CHECKDB

Arun Sirpal shows us how to build an Azure Function:

The title is a mouthful and so is this post. In the past I have linked to blog posts from Microsoft that say consistency checks for Azure SQL Database are the responsibility of Microsoft. (https://azure.microsoft.com/en-gb/blog/data-integrity-in-azure-sql-database/)

However, Paul Randal got me thinking with his thoughts on it (via his insider email), and those form the core of this post. If you desire to run DBCC CHECKDB against Azure SQL Database (which I know people do), how can you do this? There are many ways, but for this blog post, enter Azure Functions. There are many moving parts to this, but once set up and coded it is a very satisfying experience. Let’s dig in. I am NOT going to copy and paste every little element of the high-level guide from Microsoft; there is no point in that. I will show you the links you need to set up the relevant function app project, and then the tailored bits around CHECKDB form the bulk of this post.

This isn’t necessary to do, but if you want to learn how Azure Functions work, it’s a good example of working through the mechanics.
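To give a feel for the moving parts, here is a hedged sketch of what the body of a PowerShell timer-triggered function might look like; Arun's post covers the real setup, and the server, database, and app settings below are placeholders:

```powershell
# run.ps1 -- hypothetical timer-triggered Azure Function (PowerShell) running DBCC CHECKDB
# against an Azure SQL Database. Connection details come from app settings; all names are placeholders.
param($Timer)

$server   = 'myserver.database.windows.net'
$database = 'MyDatabase'

# Requires the SqlServer module to be declared in the function app's requirements.psd1
Invoke-Sqlcmd -ServerInstance $server -Database $database `
    -Username $env:SQL_USER -Password $env:SQL_PASSWORD `
    -Query 'DBCC CHECKDB WITH NO_INFOMSGS;' -QueryTimeout 3600

Write-Host "DBCC CHECKDB completed against $database on $server"
```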


Reading S3 Data into Kafka

Ramu Kakarla takes us through using Apache Camel to load Amazon S3 data into a Kafka topic:

Apache Camel is an open-source framework for message-oriented middleware with a rule-based routing and mediation engine. It provides a Java object-based implementation of the Enterprise Integration Patterns, using an application programming interface to configure routing and mediation rules.

Read on to see how this fits together.


Tumbling and Sliding Windows in Flink

Kundan Kumarr takes us through two different window types in Apache Flink:

In the previous blog, we talked about Flink’s windows operator, the heart of processing infinite streams. Generally in Flink, after specifying whether the stream is keyed or non-keyed, the next step is to define a window assigner. The window assigner defines how elements are assigned to windows. Flink provides some useful predefined window assigners like Tumbling windows, Sliding windows, Session windows, Count windows, and Global windows. We can use any of them as per our use case, or we can even create custom window assigners in Flink.

In this blog, we will learn about the first two window assigners, i.e., tumbling and sliding windows. These two window assigners assign elements to windows based on time, which can be either processing time or event time.

Click through for a description of each.


Forcing a Shrink of TempDB

Monica Rathbun has a script for people who just want to watch the world burn:

Occasionally, we must resize or realign our tempdb log file (.ldf) or data files (.mdf or .ndf) due to a growth event that forces the file size out of whack. To resize, we have three options: restart the SQL Server service, add additional files, or shrink the current file. We most likely have all been faced with runaway log files, and in an emergency situation restarting the SQL Server service may not be an option, but we still need to get our log file size smaller before we run out of disk space, for example. The process of shrinking down that file can get tricky, so I created this flow chart to help you out if you ever get into this situation.

Now it’s very important to note that many of these commands will clear your cache and will greatly impact your server performance as the cache warms back up. In addition, you should not shrink your database data or log files unless absolutely necessary; doing so can result in a corrupt tempdb.

Click through for the process, including a well-made flow chart with a large “Know what you’re doing before doing this” warning.
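As a heavily condensed, hypothetical sketch of the kind of commands that flow chart walks through (mind the warnings above; the instance, logical file name, and target size are placeholders):

```powershell
# CAUTION: clearing these caches hurts performance until they warm back up -- read Monica's post first.
# Instance, logical file name, and target size (MB) are placeholders.
Invoke-DbaQuery -SqlInstance 'SQLPROD01' -Database tempdb -Query @"
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
DBCC FREESYSTEMCACHE ('ALL');
DBCC SHRINKFILE (tempdev, 1024);   -- logical name of the tempdb data file, target size in MB
"@
```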


How SQL Server Stores XML Data

Randolph West continues a series on internal storage of data types:

The SQL Server XML data type supports both schema-defined and schema-less XML documents (typed and untyped), and partial documents. Partial documents are still considered well-defined XML, but don’t have surrounding <?xml>...</xml> tags for example. Either way, the storage engine validates the data when inserting or updating into an XML column.

The XML data type lets us store up to 2GB per column per row, but it’s prudent here to note that just because we can, it doesn’t mean we should.

Click through for the details.
