Press "Enter" to skip to content

Day: November 6, 2020

From JSON to SQL Server

Phil Factor has some helper functions for us when working with JSON data:

If you know the structure and contents of a JSON document, then it is possible to turn this into one or more relational tables, but even then I dare to you claim that it is easy to tap in a good OpenJSON SELECT statement to do it. If you don’t know what’s in that JSON file, then you’re faced with sweating over a text editor trying to work it all out. You long to just get the contents into a relational table and take it on from there. Even then, You’ve got several struggles before that table appears in the result pane. You must get the path to the tabular data correct, you have to work out the SQL Datatypes, and you need to list the full panoply of keys. Let’s face it: it is a chore. Hopefully, all that is in the past with these helper functions.

Click through for those functions.

Comments closed

Automating Hadoop Workflows with Spark and Oozie

Prashanth Jayaram walks us through automating a sample data transfer with tools like Sqoop, Spark, and Oozie:

In the process of building a data product one would end-up applying many resource-intensive analytical operations on a medium to large data-set in an efficient way. Apache Spark is the bet in this scenario to perform faster job execution by caching data in memory and enabling parallelism in a distributed data environments.

Components involved in Spark implementation:

1. Initialize spark session using scala program
2. Ingest data from data lake through hive queries
3. Apply business logic using scala constructs or hive queries
4. Load data into HDFS or Hive targets
5. Execute spark programs through spark submit

Read on for a sample flow.

Comments closed

DBCC CHECKDB on Large Databases

Aaron Bertrand shares some thoughts on CHECKDB:

We have a lot of data. Some of that data is stored in large databases (dozens of terabytes each). In some shops, this is an excuse to not run integrity checks. We are not one of those shops.

But we don’t run full CHECKDB operations in production; we have a set of servers dedicated to testing our restores and running checks. We follow a lot of the guidance in these articles:

CHECKDB From Every Angle: Consistency Checking Options for a VLDB

Minimizing the impact of DBCC CHECKDB : DOs and DON’Ts

Minimize performance impact of SQL Server DBCC CHECKDB

Read the whole thing, even if you aren’t dealing with 30+ TB databases.

Comments closed

Creating Users in Azure SQL Database

Kenneth Fisher takes us through a nuance in adding users to Azure SQL Database:

Awesome! I did say I preferred code didn’t I? I am noticing a slight problem though. I don’t actually have a login yet. So I look in object explorer and there is no instance level security tab. On top of that when I try to create a login with code I get the following error:

Msg 5001, Level 16, State 2, Line 1
User must be in the master database.

Read on for the whole process.

Comments closed

In-Memory OLTP and HammerDB Setup Error

Erik Darling tracks down an error:

This a short post about a sort of quirky error message I got while trying to run the TPC-C load test against SQL Server’s in-memory tables.

Error in Virtual User1: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The integer value XXXXXXXXXXXX is out of range.

Click through for the solution. Also, read down to the bottom of the post for a huge discount on Erik’s recorded training content. Erik is extremely knowledgeable and has a great way of explaining things, so take advantage of that knowledge.

Comments closed