Press "Enter" to skip to content

Month: November 2016

Basics Of Spark

Jen Underwood gives a quick explanation of Spark as well as an introduction to SparkSQL and PySpark:

Spark’s distributed data-sharing concept is called “Resilient Distributed Datasets,” or RDD. RDDs are fault-tolerant collections of objects partitioned across a cluster that can be queried in parallel and used in a variety of workload types. RDDs are created by applying operations called “transformations” with map, filter, and groupBy clauses. They can persist in memory for rapid reuse. If an RDD data does not fit in memory, Spark will overflow it to disk.

If you’re not familiar with Spark, now’s as good a time as any to learn.

Comments closed

Building Metadata With Biml

Ben Weissman provides a set of Biml scripts to load a metadata table based off of existing tables and columns:

For each member of that collection, we follow some simple rules:

– Our table’s original name is the name of the table in the staging area without our connectionname prefix
– If our tablename still includes an underscore, we will split the name and assign the table- and schemaname respectively. Otherwise, our schema will be DBO.
– Create a DELETE statement towards our metadata store
– Create an INSERT statement towards our metadata store

Admittedly, I would have seen this as a one-time process and would have just written some scripts against sys.tables and sys.columns to generate this metadata, but “one-time processes” tend to happen over and over.

Comments closed

Alert On SQL Jobs Missing Schedules

Brian Hansen wraps up a three-part series on scheduled job alerts:

The first two parts of this series addressed the general approach that I use in an SSIS script task to discover and alert on missed SQL Agent jobs. With apologies for the delay in producing this final post in the series, here I bring these approaches together and present the complete package.

To create the SSIS, start with an empty SSIS package and add a data flow task. In the task, add the following transformations.

Regardless of how you do it, knowing when jobs fail is important enough to build some infrastructure around answering this question.

Comments closed

CISL 1.4.0

Niko Neugebauer has released the latest version of his Columnstore Index Scripts Library:

Another happy release of the CISL (Columnstore Indexes Script Library) is live – this time it is 1.4.0!

This release is focusing on the addition of the Extended Events, so that a user of CISL can easily set up the events for each of the SQL Server (2012,2014,2016) or Azure SQL Database versions.

This is an open source library which I recommend if you deal with columnstore indexes in any fashion.

Comments closed

Always Encrypted And Memory-Optimized Tables

Joey D’Antoni tests whether Always Encrypted works on memory-optimized tables in SQL Server 2016:

Last week was the PASS Summit, which is the biggest confab of SQL Server professionals on the planet (and educational as ever), Denny Cherry  (b|t) and I ran into Bob Ward (b|t) of Microsoft and of 500 level internals presentations. And for the first time ever, Bob asked us a question about SQL Server—of course we didn’t know the answer of the top of our heads, but we felt obligated to research it like we’ve made Bob do so many times. Anyone, the question came up a Bob’s internals session on Hekaton (In-Memory OLTP) and whether it supported the new Always Encrypted feature in SQL Server 2016. I checked books online, but could not find a clear answer, so I fired up SSMS and setup a quick demo.

Click through for scripts and the answer.

Comments closed

The Halloween Problem

Kenneth Fisher explains the Halloween Problem:

What is The Halloween Problem?
This is a bit more complicated. Let’s say you are trying to give a 10% raise to everyone who makes less than $25k.

Couple of quick notes here. This is a common example because this in fact the problem that exposed the issue. Also, while UPDATEs are probably the easiest way to explain what’s going on, it can affect any type of write.

So back to our update statement. There are several ways this could be implemented. I’m going to use pseudo T-SQL to demonstrate a couple and explain each.

This has certain implications as you can see in the linked Paul White series.  These implications typically mean slower performance (e.g., by forcing spooling) but getting rid of a potentially nasty problem.

Comments closed

On-Prem Power BI

Koen Verbeeck looks at the preview of Power BI integration inside Reporting Services:

  • one thing that I am missing, is when you are rendering the report that there is an “edit report” button that takes you to Power BI Desktop. A bit like in PowerBI.com, where you can also go to edit mode if you have the correct permissions.

  • by the way, if you truly want to test it locally, you can download the .vhd file (the virtual hard disk) and run it in your own HyperV environment.

All in all it looks very nice for a first preview. Currently only SSAS is supported and custom visualizations are not, but I guess the SSRS team will surprise us with more features soon. Great job SSRS team!

Lots of interesting thoughts here, so check it out.

Comments closed

UDL Files To Test Connectivity

Marek Masko shows how to test a database connection without having any database tools:

UDL extension stands for Universal Data Link. These files are used by Data Link API which exposes a user interface to create and manage OLE DB connections. This functionality was introduced in Windows OS at least in Windows 95, maybe even earlier. That means you can use it on every Windows machine you work on. You no longer need to worry about additional tools.

Sometimes you need a creative solution to a policy-induced problem.

Comments closed