Curated SQL – Page 1403 – A Fine Slice Of SQL Server

What’s New In Ambari 2.5

Published 2017-08-09 by Kevin Feasel

Paul Codding tells us what’s coming in the next version of Ambari:

Ambari Log Search (Tech Preview) has been one of our most popular features, and in this release has seen UI, and backend refreshes based on customer feedback. Log Search is planned for GA with the next major Ambari release, Ambari 3.0 in which the UI will be simplified, and the backend will have more robust log retention and scaling capabilities.

There are some interesting changes, so read the whole thing.

Comments closed

Breaking A Database Into Smaller Files

Published 2017-08-09 by Kevin Feasel

Jana Sattainathan shows how to break a SQL Server database into smaller files, as well as giving some reasons why you might want to do that:

You are probably reading this post because you have experienced the pain yourself and I dont want to waste anymore of your time and get right to the steps involved in breaking up a huge database or a datafile

Check the space situation on your host
Get the space usage by files for the big database/datafile in question
Decide on number of files to add/location
Add multiple secondary datafiles
Distribute data from big datafile into the new datafiles using EMPTYFILE option
Shrink the big datafile and set a maximum size
Change the default filegroup

Read on for Jana’s step-by-step approach.

Comments closed

Partial Text Slicer In Power BI

Published 2017-08-09 by Kevin Feasel

Mitchell Pearson shows how to slice results based on text input:

In this installment of the Problem, Design, Solution series we are going to show you how to perform a text search using slicers in Power BI, this simulates a “LIKE” type search. In the following screenshot you can see that when “Tax” is selected all records in the table that have “Tax” anywhere in the record are returned, likewise whenever “IT” is selected from the slicer all records in the table that have IT in them are returned. Hope you enjoy this post!

Click through for the explanation, followed by a video that walks you through the process.

Comments closed

Working With UTC And Local Times

Published 2017-08-09 by Kevin Feasel

Jo Douglass shows how to use the DATETIMEOFFSET data type and AT TIME ZONE syntax to convert between UTC and local times:

Run select SysDateTimeOffset(); and you should see a date and time which mirrors your server’s current time, plus a time zone offset showing its current offset from UTC; this includes any time zone offset, plus any daylight savings time offset.

If I were to run this (from the UK) on August 15th, 2017 while my clock is showing that it’s noon exactly, I would get 2017-08-15 12:00:00.0000000 +01:00; the +01:00 offset is because the UK is offset by one hour from UTC during daylight savings. The datetime2 portion of a datetimeoffset is in local time, not UTC.

My normal operation is to store everything in UTC and let the application convert to local times. That allows you to compare dates much more easily and reduces confusion around daylight savings time.

Comments closed

Forecasting Versus Predicting

Published 2017-08-09 by Kevin Feasel

Rob Collie explains that there are two different concepts which use similar names:

Once you’ve digested the illustration at the top of this article, yeah, you’ve kind already got it.

Forecasting is when we anticipate the behavior of “Lots” of people (customers, typically) on “Long” timelines.

Predictive Analytics anticipate the behavior of One person (again, typically a customer) on a “Short” timeline.

So… Macro versus Micro.

But let’s delve just a little bit deeper, in order to “cement” the concepts.

There’s a very useful distinction here and Rob does well to flesh out the details. I highly recommend this if you’re curious about micro- versus macro-level predictions.

Comments closed

SSMS Performance Dashboard

Published 2017-08-09 by Kevin Feasel

Pedro Lopes announces that the SQL Server Performance Dashboard is now built into SQL Server Management Studio:

Back in 2007, we released the Microsoft SQL Server 2005 Performance Dashboard Reports, which were designed to provide fast insight into performance issues from some newly created system views – DMFs (Dynamic Management Views). These were updated for SQL Server 2008 and later to SQL Server 2012, and while being very helpful they had a significant drawback – required separate download and install. This meant that when needed, most probably they were not installed in a specific SQL Server, and therefore were unusable when they were needed the most.

With the new SSMS 17.2, we are releasing the Performance Dashboard embedded as a built-in Standard Report. This means that it is available for any SQL Server instance starting with SQL Server 2008, without any extra downloads or running any extra scripts. Just connect to your server in SSMS and open the Performance Dashboard.

Aside from making it built into Management Studio, they’ve also added a few helpful things to the product, so it is worth checking out.

Comments closed

Backing Up That Linux-Based Database

Published 2017-08-09 by Kevin Feasel

David Klee shows how to back up a SQL Server on Linux database over the network:

As of SQL Server 2017 RC2, we’ll want to accomplish it in a way that is transparent to SQL Server. (Depending on the RTM version whenever it is released, I might change the recommendation on this.) To do this, we’ll want to create a folder on the local file system that actually maps to a remote network share for SQL Server backups.

SSH into your server without elevated privileges at this point.

The network share is presented from a Windows server with the SMB protocol. Linux can connect to this using a compatible protocol called CIFS, or Common Internet File System. We’ll need to install the packages so we can natively connect. On Ubuntu and other Linux distros, the easiest is with the cifs-utils package. To install from the package manager is as simple as this.

Sadly, that credentials file cannot be encrypted.

Comments closed

Deep Learning Isn’t The End-All Be-All Solution

Published 2017-08-08 by Kevin Feasel

Pablo Cordero explains that deep learning solutions are not the best choice in all cases:

The second preconception I hear the most is the hype. Many yet-to-be practitioners expect deep nets to give them a mythical performance boost just because it worked in other fields. Others are inspired by impressive work in modeling and manipulating images, music, and language – three data types close to any human heart – and rush headfirst into the field by trying to train the latest GAN architecture. The hype is real in many ways. Deep learning has become an undeniable force in machine learning and an important tool in the arsenal of any data modeler. Its popularity has brought forth essential frameworks such as tensorflow and pytorch that are incredibly useful even outside deep learning. Its underdog to superstar origin story has inspired researchers to revisit other previously obscure methods like evolutionary strategies and reinforcement learning. But it’s not a panacea by any means. Aside from lunch considerations, deep learning models can be very nuanced and require careful and sometimes very expensive hyperparameter searches, tuning, and testing (much more on this later in the post). Besides, there are many cases where using deep learning just doesn’t make sense from a practical perspective and simpler models work much better.

It’s a very interesting article, pointing out that deep learning solutions work better than expected on smaller data sizes, but there are areas where it’s preferable to choose something else.

Comments closed

Using Hive LLAP On ElasticMapReduce

Published 2017-08-08 by Kevin Feasel

Jigar Mistry shows how to configure and use Hive LLAP on AWS’s ElasticMapReduce:

With many options available in the market (Presto, Spark SQL, etc.) for doing interactive SQL over data that is stored in Amazon S3 and HDFS, there are several reasons why using Hive and LLAP might be a good choice:

For those who are heavily invested in the Hive ecosystem and have external BI tools that connect to Hive over JDBC/ODBC connections, LLAP plugs in to their existing architecture without a steep learning curve.
It’s compatible with existing Hive SQL and other Hive tools, like HiveServer2, and JDBC drivers for Hive.
It has native support for security features with authentication and authorization (SQL standards-based authorization) using HiveServer2.
LLAP daemons are aware about of the columns and records that are being processed which enables you to enforce fine-grained access control.
It can use Hive’s vectorization capabilities to speed up queries, and Hive has better support for Parquet file format when vectorization is enabled.
It can take advantage of a number of Hive optimizations like merging multiple small files for query results, automatically determining the number of reducers for joins and groupbys, etc.
It’s optional and modular so it can be turned on or off depending on the compute and resource requirements of the cluster. This lets you to run other YARN applications concurrently without reserving a cluster specifically for LLAP.

Read on for more details, including the bootstrap action you need to take and how to use LLAP once you have it configured.

Comments closed

Convert SSAS Tabular Processing Scripts Into Tables

Published 2017-08-08 by Kevin Feasel

Chris Koester shows how to take an Analysis Services Tabular processing script in TMSL format and turn it into a table using OPENJSON:

The previous post looked at how to process SSAS Tabular models with TMSL. Since SQL Server adds new JSON capabilities in 2016, let’s look at how to convert TMSL JSON to a Table with OPENJSON. OPENJSON is a new function in SQL Server 2016 that, per Microsoft:

OPENJSON is a table-valued function that parses JSON text and returns objects and properties from the JSON input as rows and columns.

In short, OPENJSON converts JSON text to a table. Since TMSL is JSON, this function can convert a SSAS Tabular processing script into a table. This could be useful if you wanted to document a SSAS processing schedule.

That’s an interesting use of OPENJSON.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Curated SQL Posts