Press "Enter" to skip to content

Author: Kevin Feasel

Breaking A Database Into Smaller Files

Jana Sattainathan shows how to break a SQL Server database into smaller files, as well as giving some reasons why you might want to do that:

You are probably reading this post because you have experienced the pain yourself and I dont want to waste anymore of your time and get right to the steps involved in breaking up a huge database or a datafile

  1. Check the space situation on your host

  2. Get the space usage by files for the big database/datafile in question

  3. Decide on number of files to add/location

  4. Add multiple secondary datafiles

  5. Distribute data from big datafile into the new datafiles using EMPTYFILE option

  6. Shrink the big datafile and set a maximum size

  7. Change the default filegroup

Read on for Jana’s step-by-step approach.

Comments closed

Partial Text Slicer In Power BI

Mitchell Pearson shows how to slice results based on text input:

In this installment of the Problem, Design, Solution series we are going to show you how to perform a text search using slicers in Power BI, this simulates a “LIKE” type search. In the following screenshot you can see that when “Tax” is selected all records in the table that have “Tax” anywhere in the record are returned, likewise whenever “IT” is selected from the slicer all records in the table that have IT in them are returned. Hope you enjoy this post!

Click through for the explanation, followed by a video that walks you through the process.

Comments closed

Working With UTC And Local Times

Jo Douglass shows how to use the DATETIMEOFFSET data type and AT TIME ZONE syntax to convert between UTC and local times:

Run select SysDateTimeOffset(); and you should see a date and time which mirrors your server’s current time, plus a time zone offset showing its current offset from UTC; this includes any time zone offset, plus any daylight savings time offset.

If I were to run this (from the UK) on August 15th, 2017 while my clock is showing that it’s noon exactly, I would get 2017-08-15 12:00:00.0000000 +01:00; the +01:00 offset is because the UK is offset by one hour from UTC during daylight savings. The datetime2 portion of a datetimeoffset is in local time, not UTC.

My normal operation is to store everything in UTC and let the application convert to local times.  That allows you to compare dates much more easily and reduces confusion around daylight savings time.

Comments closed

Forecasting Versus Predicting

Rob Collie explains that there are two different concepts which use similar names:

Once you’ve digested the illustration at the top of this article, yeah, you’ve kind already got it.

  • Forecasting is when we anticipate the behavior of “Lots” of people (customers, typically) on “Long” timelines.
  • Predictive Analytics anticipate the behavior of One person (again, typically a customer) on a “Short” timeline.

So…  Macro versus Micro.

But let’s delve just a little bit deeper, in order to “cement” the concepts.

There’s a very useful distinction here and Rob does well to flesh out the details.  I highly recommend this if you’re curious about micro- versus macro-level predictions.

Comments closed

SSMS Performance Dashboard

Pedro Lopes announces that the SQL Server Performance Dashboard is now built into SQL Server Management Studio:

Back in 2007, we released the Microsoft SQL Server 2005 Performance Dashboard Reports, which were designed to provide fast insight into performance issues from some newly created system views – DMFs (Dynamic Management Views). These were updated for SQL Server 2008 and later to SQL Server 2012, and while being very helpful they had a significant drawback – required separate download and install. This meant that when needed, most probably they were not installed in a specific SQL Server, and therefore were unusable when they were needed the most.

With the new SSMS 17.2, we are releasing the Performance Dashboard embedded as a built-in Standard Report. This means that it is available for any SQL Server instance starting with SQL Server 2008, without any extra downloads or running any extra scripts. Just connect to your server in SSMS and open the Performance Dashboard.

Aside from making it built into Management Studio, they’ve also added a few helpful things to the product, so it is worth checking out.

Comments closed

Backing Up That Linux-Based Database

David Klee shows how to back up a SQL Server on Linux database over the network:

As of SQL Server 2017 RC2, we’ll want to accomplish it in a way that is transparent to SQL Server. (Depending on the RTM version whenever it is released, I might change the recommendation on this.) To do this, we’ll want to create a folder on the local file system that actually maps to a remote network share for SQL Server backups.

SSH into your server without elevated privileges at this point.

The network share is presented from a Windows server with the SMB protocol. Linux can connect to this using a compatible protocol called CIFS, or Common Internet File System. We’ll need to install the packages so we can natively connect. On Ubuntu and other Linux distros, the easiest is with the cifs-utils package. To install from the package manager is as simple as this.

Sadly, that credentials file cannot be encrypted.

Comments closed

Deep Learning Isn’t The End-All Be-All Solution

Pablo Cordero explains that deep learning solutions are not the best choice in all cases:

The second preconception I hear the most is the hype. Many yet-to-be practitioners expect deep nets to give them a mythical performance boost just because it worked in other fields. Others are inspired by impressive work in modeling and manipulating images, music, and language – three data types close to any human heart – and rush headfirst into the field by trying to train the latest GAN architecture. The hype is real in many ways. Deep learning has become an undeniable force in machine learning and an important tool in the arsenal of any data modeler. Its popularity has brought forth essential frameworks such as tensorflow and pytorch that are incredibly useful even outside deep learning. Its underdog to superstar origin story has inspired researchers to revisit other previously obscure methods like evolutionary strategies and reinforcement learning. But it’s not a panacea by any means. Aside from lunch considerations, deep learning models can be very nuanced and require careful and sometimes very expensive hyperparameter searches, tuning, and testing (much more on this later in the post). Besides, there are many cases where using deep learning just doesn’t make sense from a practical perspective and simpler models work much better.

It’s a very interesting article, pointing out that deep learning solutions work better than expected on smaller data sizes, but there are areas where it’s preferable to choose something else.

Comments closed

Using Hive LLAP On ElasticMapReduce

Jigar Mistry shows how to configure and use Hive LLAP on AWS’s ElasticMapReduce:

With many options available in the market (Presto, Spark SQL, etc.) for doing interactive SQL  over data that is stored in Amazon S3 and HDFS, there are several reasons why using Hive and LLAP might be a good choice:

  • For those who are heavily invested in the Hive ecosystem and have external BI tools that connect to Hive over JDBC/ODBC connections, LLAP plugs in to their existing architecture without a steep learning curve.

  • It’s compatible with existing Hive SQL and other Hive tools, like HiveServer2, and JDBC drivers for Hive.

  • It has native support for security features with authentication and authorization (SQL standards-based authorization) using HiveServer2.

  • LLAP daemons are aware about of the columns and records that are being processed which enables you to enforce fine-grained access control.

  • It can use Hive’s vectorization capabilities to speed up queries, and Hive has better support for Parquet file format when vectorization is enabled.

  • It can take advantage of a number of Hive optimizations like merging multiple small files for query results, automatically determining the number of reducers for joins and groupbys, etc.

  • It’s optional and modular so it can be turned on or off depending on the compute and resource requirements of the cluster. This lets you to run other YARN applications concurrently without reserving a cluster specifically for LLAP.

Read on for more details, including the bootstrap action you need to take and how to use LLAP once you have it configured.

Comments closed

Convert SSAS Tabular Processing Scripts Into Tables

Chris Koester shows how to take an Analysis Services Tabular processing script in TMSL format and turn it into a table using OPENJSON:

The previous post looked at how to process SSAS Tabular models with TMSL. Since SQL Server adds new JSON capabilities in 2016, let’s look at how to convert TMSL JSON to a Table with OPENJSON. OPENJSON is a new function in SQL Server 2016 that, per Microsoft:

OPENJSON is a table-valued function that parses JSON text and returns objects and properties from the JSON input as rows and columns.

In short, OPENJSON converts JSON text to a table. Since TMSL is JSON, this function can convert a SSAS Tabular processing script into a table. This could be useful if you wanted to document a SSAS processing schedule.

That’s an interesting use of OPENJSON.

Comments closed

Drawing Cubes With SQL Server Spatial

Slava Murygin has entered his cubism phase:

Hey, there is a time to go level up and instead of drawing Spirals, Fractals and other cool stuff I decided to go 3D!

So, the first my try will be drawing 3D cubes.
As you know, SQL is not an Object Orienting Programming language, and I can’t just simply create an Object “Cube” with certain properties. To create a Cube I need a Stored Procedure:

Click through for a touch of Picasso in your database.

Comments closed