Month: June 2018

Using RAISERROR Instead Of PRINT

Randolph West recommends using RAISERROR WITH NOWAIT rather than PRINT for printing messages:

Read that last line again. It’s saying that the PRINT command needs to fill some sort of memory buffer before you see any output. Sometimes you’re lucky, and sometimes you’re not. Erland Sommarskog says the same thing in the series Error and Transaction Handling in SQL Server (Part 1: Jumpstart Error Handling | Part 2: Commands and Mechanisms | Part 3: Implementation).

If you want immediate output from your long-running process, use RAISERROR instead of PRINT, and make use of the WITH NOWAIT parameter.

Naturally I can’t leave you hanging with a 150-word blog post, most of which is quoting Erik, so I decided to do some digging. The question I posed myself was “How big is the PRINT buffer?”

I always use error level 10, as that’s the minimum level guaranteed to print to the console immediately and it does not write to error logs or stop operations.
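
To make that concrete, here’s a minimal sketch of the pattern (the loop and message text are purely illustrative, not taken from Randolph’s post): each RAISERROR call at severity 10 combined with WITH NOWAIT flushes to the Messages tab as soon as it runs, where the equivalent PRINT might sit in the buffer for a while.

DECLARE @i int = 1;
DECLARE @msg nvarchar(200);

WHILE @i <= 5
BEGIN
    -- Build a progress message; the text and loop are just for illustration
    SET @msg = N'Finished step ' + CAST(@i AS nvarchar(10))
             + N' at ' + CONVERT(nvarchar(30), SYSDATETIME(), 121);

    -- Severity 10, state 1: informational, nothing logged, nothing aborted;
    -- WITH NOWAIT pushes the message to the client immediately
    RAISERROR(@msg, 10, 1) WITH NOWAIT;

    WAITFOR DELAY '00:00:02';   -- stand-in for a slow step
    SET @i += 1;
END;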

Databricks MLflow

Matei Zaharia announces a new Databricks offering:

MLflow is inspired by existing ML platforms, but it is designed to be open in two senses:

  1. Open interface: MLflow is designed to work with any ML library, algorithm, deployment tool or language. It’s built around REST APIs and simple data formats (e.g., a model can be viewed as a lambda function) that can be used from a variety of tools, instead of only providing a small set of built-in functionality. This also makes it easy to add MLflow to your existing ML code so you can benefit from it immediately, and to share code using any ML library that others in your organization can run.
  2. Open source: We’re releasing MLflow as an open source project that users and library developers can extend. In addition, MLflow’s open format makes it very easy to share workflow steps and models across organizations if you wish to open source your code.

MLflow is still in alpha, but we believe that it already offers a useful framework to work with ML code, and we would love to hear your feedback. In this post, we’ll introduce MLflow in detail and explain its components.

Even in alpha, it looks nice.

The Basics Of Azure Stream Analytics

Chris Seferlis gives us an overview of Azure Stream Analytics:

Here’s how it works. It starts with a data source such as Event Hub, IoT Hub or Azure Blob Storage, and it uses a SQL-like query language that allows transformation on the fly. It helps you process operations like filtering, sorting, aggregating and joining the data together to make it more usable—turning data into information.

From there, when you identify the data that you want/need to use, you can then send that data downstream to be sent to a queue for triggering workflows or further processing of the data. You can also send that data to Power BI for real-time visualization. For example, let’s say you’re looking at a data quality stream and you want to pull certain key words out of Twitter to see how they’re used and watch how that’s being done. By connecting to the Twitter API, you can capture that data, stream it, and then report from it with a Power BI report.

Chris also has a video which you can watch.
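
To give a flavor of that SQL-like query language, here’s a hypothetical Stream Analytics query along the lines of the Twitter example (the input and output aliases and the field names are invented) that counts tweets per keyword over five-minute tumbling windows and sends the results to a Power BI output:

SELECT
    Keyword,
    COUNT(*) AS TweetCount
INTO
    [powerbi-output]                          -- output alias defined on the job (invented)
FROM
    [twitter-input] TIMESTAMP BY CreatedAt    -- input alias and event-time column (invented)
GROUP BY
    Keyword,
    TumblingWindow(minute, 5)                 -- re-emit the counts every five minutes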

Lookups And Conditionals In Azure Data Factory V2

Alex Whittles shows us how to use the Lookup and If Condition activities in Azure Data Factory V2:

Azure Data Factory v2 (ADFv2) has some significant improvements over v1, and we now consider ADF as a viable platform for most of our cloud based projects. But things aren’t always as straightforward as they could be. I’m sure this will improve over time, but don’t let that stop you from getting started now.

This post provides a walk through of using the ‘Lookup’ and ‘If Condition’ activities to do some basic conditional logic depending on the results of a database query.

Assumptions: You already have an ADF pipeline created. If you want to hook into SSIS then you’ll also need the SSIS Integration Runtime set up – although that isn’t needed just for the If Condition.

Read on for an example.
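
As background, the Lookup activity simply runs a query (or stored procedure) against a dataset and exposes the result to the pipeline, and the If Condition then evaluates an expression over that output. Here’s a hypothetical single-row control query of the sort you might point a Lookup at – the table, column, and activity names are all made up:

-- Hypothetical single-row control query for a Lookup activity ("First row only" enabled)
SELECT CASE WHEN COUNT(*) > 0 THEN 1 ELSE 0 END AS HasNewRows
FROM dbo.StagingOrders                      -- invented table
WHERE LoadDate >= CAST(GETDATE() AS date);  -- rows loaded today

-- The If Condition expression would then test the Lookup output, for example:
-- @equals(activity('LookupNewRows').output.firstRow.HasNewRows, 1)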

Connecting To Azure SQL Database From On-Prem

Arun Sirpal shows how to set up a linked server between an on-premises SQL Server instance and Azure SQL Database:

You may (or may not) have a requirement to set up a linked server to Azure SQL Database from a locally installed SQL Server. One reason could be to pull down some reports from an Azure SQL Database to a local file share. Whatever your reason is, hopefully you will find this blog post useful, because I ran into some complications on the way.

This is what your linked server creation screens in SSMS (SQL Server Management Studio) should look like.

Take advantage of Arun’s hard-earned experience and read his post.
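
If you’d rather script the linked server than click through the SSMS dialogs, a rough T-SQL sketch looks like the following. The server, database, and credential values are placeholders, it assumes the Microsoft OLE DB Driver (MSOLEDBSQL) is installed locally, and it uses a SQL authentication login on the Azure side:

EXEC sys.sp_addlinkedserver
    @server     = N'AzureDB',                            -- local name for the linked server
    @srvproduct = N'',
    @provider   = N'MSOLEDBSQL',                         -- assumes the OLE DB driver is installed
    @datasrc    = N'yourserver.database.windows.net',    -- placeholder Azure SQL server
    @catalog    = N'YourDatabase';                       -- placeholder database

EXEC sys.sp_addlinkedsrvlogin
    @rmtsrvname  = N'AzureDB',
    @useself     = N'FALSE',
    @locallogin  = NULL,
    @rmtuser     = N'your_sql_login',                    -- placeholder credentials
    @rmtpassword = N'your_password';

-- Quick test with a four-part name (placeholder table)
SELECT TOP (10) * FROM AzureDB.YourDatabase.dbo.SomeTable;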

Re-Sending An E-Mail From SQL Server

Adrian Buckman shows us how to re-send an e-mail that was sent through DB Mail:

Sometimes emails from SQL Server go missing, especially when you share an inbox with colleagues. On most occasions it doesn’t matter, as the job that generated the email can simply be re-run to produce the email once again, but what about those emails that contain time-specific information? We cannot simply kick off the job again, as it may be coded to evaluate right now and not provide us with the information that was sent 2 hours ago.

I was faced with exactly this issue the other day – I needed the information from the email that was sent at that given time, so I wrote a bit of code to get the job done, then decided that if I tidied this code up I could make it reusable, so here is what I came up with:

Check out Adrian’s helpful script.
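
The underlying idea – this is a simplified sketch rather than Adrian’s actual script, and the mailitem_id and profile name are placeholders – is to look the original message up in msdb and hand its pieces back to sp_send_dbmail:

DECLARE @recipients  varchar(max),
        @subject     nvarchar(255),
        @body        nvarchar(max),
        @body_format varchar(20);

-- Pull the original message out of the sent items view
SELECT  @recipients  = recipients,
        @subject     = subject,
        @body        = body,
        @body_format = body_format
FROM    msdb.dbo.sysmail_sentitems
WHERE   mailitem_id = 12345;          -- placeholder: the message to re-send

-- Send it again using the same recipients, subject, and body
EXEC msdb.dbo.sp_send_dbmail
     @profile_name = N'Default',      -- placeholder Database Mail profile
     @recipients   = @recipients,
     @subject      = @subject,
     @body         = @body,
     @body_format  = @body_format;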

How Far Along Is That CHECKDB?

Steve Stedman shows us how to check the percent complete of a DBCC CHECKDB run:

You have a DBCC CHECKDB script running, something like the following, and it may take several hours to run to confirm if there is any corruption in your SQL Server Database.

DBCC CHECKDB ('MyDatabaseName') WITH DATA_PURITY, NO_INFOMSGS;

Then someone asks you the age-old question… When will it be done?

Click through for a quick script and the answer.
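
The gist of the usual approach (a sketch; Steve’s script may differ in the details) is that sys.dm_exec_requests exposes percent_complete and an estimated completion time for DBCC commands:

-- Progress and a rough time-remaining estimate for any running DBCC command
SELECT  r.session_id,
        r.command,
        r.percent_complete,
        r.estimated_completion_time / 60000.0 AS estimated_minutes_remaining
FROM    sys.dm_exec_requests AS r
WHERE   r.command LIKE 'DBCC%';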

Using Kafka To Go From Batch To Stream

Stephane Maarek has started a series on transforming a batch process into a streaming process using Apache Kafka.  Part one introduces the topic and two of the four microservices:

Before jumping straight in, it’s very important to map out the current process and see how we can improve each component. Below are my personal assumptions:

  • When a user writes a review, it gets POSTed to a Web Service (REST Endpoint), which will store that review into some kind of database table.

  • Every 24 hours, a batch job (could be Spark) would take all the new reviews and apply a spam filter to filter fraudulent reviews from legitimate ones.

  • New valid reviews are published to another database table (which contains all the historic valid reviews).

  • Another batch job or a SQL query computes new stats for courses. Stats include all-time average rating, all-time count of reviews, 90 days average rating, and 90 days count of reviews.

  • The website displays these metrics through a REST API when the user navigates a website.

Part two finishes up the story:

In the previous section, we learned about the early concepts of Kafka Streams and used them to take a stream and split it in two based on a spam evaluation function. Now, we need to perform some stateful computations, such as aggregations and windowing, in order to compute statistics on our stream of reviews.

Thankfully we can use some pre-defined operators in the High-Level DSL that will transform a KStream into a KTable. A KTable is basically a table that gets new events every time a new element arrives in the upstream KStream. The KTable then has some level of logic to update itself. Any KTable updates can then be forwarded downstream. For a quick overview of KStream and KTable, I recommend the quickstart on the Kafka website.

This is a nice introduction to Kafka Streams using a realistic example.
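
For contrast with the streaming rewrite, the batch stats step in Stephane’s “before” picture is essentially one grouped SQL query. Here’s a hypothetical rendering in T-SQL, with invented table and column names:

-- Invented table of validated reviews; computes the four stats from the list above
SELECT  course_id,
        AVG(rating * 1.0)                                         AS all_time_avg_rating,
        COUNT(*)                                                  AS all_time_review_count,
        AVG(CASE WHEN review_date >= DATEADD(DAY, -90, GETDATE())
                 THEN rating * 1.0 END)                           AS avg_rating_90d,
        SUM(CASE WHEN review_date >= DATEADD(DAY, -90, GETDATE())
                 THEN 1 ELSE 0 END)                               AS review_count_90d
FROM    dbo.valid_reviews
GROUP BY course_id;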
