Press "Enter" to skip to content

Month: September 2020

Multi-Tenant Database Designs

Adrian Hills walks us through four database designs for multi-tenant data storage:

In my previous blog post, I talked about some of the key considerations around designing a multi-tenant system using SQL Server. There are several ways to implement multi-tenancy, and, as is often the case, there is no single “best” way but rather a range of options that each offer different trade-offs. The approach that is right for you depends on your objectives and needs for your specific environment. It’s important to consider which of these approaches best suit your requirements and goals based on the 3 core considerations from Multi-Tenancy with SQL Server, Part 1: security, maintainability (manageability), and scalability.

The following are the 4 approaches I will cover in this blog post:
1. Single database, shared schema
2. Single database, separate schema
3. Database per tenant
4. Multiple databases, multiple tenants per database, shared schema

I’ve worked with options 1, 3, and 4. Ceteris paribus, my preference is 3. That said, I’ve worked in a situation where I migrated from 3 to 1 because there were thousands of customers, none of whom had more than a few hundred megabytes of data. Option 4 provides a good balance in that respect, as you can bunch up smaller clients and give larger clients their own databases (and sometimes even their own servers). But if you go with option 2, 3, or 4, you probably want a central data warehouse which collects data from all of those tenant schemas or databases for internal use. Read on for Adrian’s thoughts.
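
To make the contrast concrete, here is a minimal T-SQL sketch of approaches 1 and 2, with made-up table, schema, and tenant names; approach 3 is the same DDL as approach 2 minus the schema prefix, deployed into a dedicated database per tenant.

    -- Approach 1: single database, shared schema.
    -- Every table carries a TenantId column and every query filters on it.
    CREATE TABLE dbo.Orders
    (
        TenantId  int            NOT NULL,
        OrderId   int            NOT NULL,
        OrderDate datetime2(0)   NOT NULL,
        Amount    decimal(18, 2) NOT NULL,
        CONSTRAINT PK_Orders PRIMARY KEY (TenantId, OrderId)
    );

    SELECT OrderId, OrderDate, Amount
    FROM dbo.Orders
    WHERE TenantId = 42;  -- the tenant filter has to be on every access path
    GO

    -- Approach 2: single database, separate schema per tenant.
    -- Each tenant gets its own copy of the objects under its own schema.
    CREATE SCHEMA Tenant42;
    GO
    CREATE TABLE Tenant42.Orders
    (
        OrderId   int            NOT NULL PRIMARY KEY,
        OrderDate datetime2(0)   NOT NULL,
        Amount    decimal(18, 2) NOT NULL
    );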

Comments closed

Choosing Between Hive LLAP and Impala

David Dichmann walks us through the differences between Impala and Hive LLAP:

Written in C++, which is very CPU efficient, with a very fast query planner and metadata caching, Impala is optimized for low latency queries.  Because of this, Impala is an ideal engine for use with a data mart, since people working with data marts are mostly running read-only queries and not large scale writes.  

Impala also has a very efficient run-time execution framework, using code generation, process-to-process communication, massive parallelism, and metadata caching. Because of this, Impala is also great when working with ad-hoc queries, like when exploring by iteratively digging into data.  You’ll want to change your query over and over again, at a moment’s notice, and have very fast response times so you’re not waiting forever for each iteration.  

I was curious what would end up happening with Hive and Impala once old Cloudera (Impala) and Hortonworks (Hive) merged together. Looks like the answer, at least for now, is that they’re both useful in different circumstances. But I do wonder how long that lasts—it’s not impossible to sell using two separate data platform products for different steps in a warehouse implementation, but I could see architects and CIOs wanting to make things simpler and narrow down to one unless there was a particularly smooth bridge between the two.
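
As a loose illustration of that ad-hoc, iterative pattern (the table and column names below are invented), an analyst might start broad and then immediately narrow the same Impala query:

    -- First pass: a broad aggregate over a hypothetical sales table.
    SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM sales
    GROUP BY region;

    -- A minute later: drill into one region and month, reusing the same table.
    SELECT product_category, SUM(amount) AS revenue
    FROM sales
    WHERE region = 'EMEA'
      AND order_month = '2020-09'
    GROUP BY product_category
    ORDER BY revenue DESC
    LIMIT 20;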

Comments closed

SQL Server R and Python Language Extensions Now Open Source

The SQL Server team has an announcement:

Previously, we announced a Java extension. Today, we are sharing that we are open sourcing the R and Python language extensions for SQL Server for both Windows and Linux on GitHub.

These extensions are the latest examples using an evolved programming language extensibility architecture which allows integration with a new type of language extension. This new architecture gives customers the freedom to bring their own runtime and execute programs using that runtime in SQL Server, while leveraging the existing security and governance that the SQL Server programming language extensibility architecture provides.

Very interesting.
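
If you haven’t used the extensibility framework before, this is roughly what a call into one of these runtimes looks like from T-SQL, assuming the relevant language extension is installed and external scripts are enabled; the Python body here is just a minimal summary of the input frame:

    -- One-time instance configuration (requires appropriate permissions).
    EXEC sp_configure 'external scripts enabled', 1;
    RECONFIGURE;
    GO

    -- Hand a result set to Python and get a data frame back as a result set.
    EXEC sp_execute_external_script
        @language = N'Python',
        @script = N'OutputDataSet = InputDataSet.describe().reset_index()',
        @input_data_1 = N'SELECT CAST(number AS float) AS n
                          FROM master.dbo.spt_values
                          WHERE type = ''P'' AND number < 100;'
    WITH RESULT SETS ((statistic nvarchar(32), n float));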

Comments closed

Diving into the Azure Resource Mover

Dennes Torres shows off what the Azure Resource Mover can do:

If you include the need to copy a resource or set of resources, instead of only moving, the list expands a lot.

Azure already offers the resources to do this: ARM templates, automated deployments, Data Sync, Recovery Services Vault, VM replication and so on. The problem is that sometimes, to move a set of objects together, you may need to use many of these services and understand how to use them.

The solution is a new free service, still in preview, called Azure Resource Mover. This service reduces the complexity of moving resources, minimizing the number of decisions needed on how the resources will be moved. More than that, the last step, deleting the source of the move, is optional, as you will see in detail later. You can use this feature, not only to move resources, but also to copy and distribute them across many regions. During the move process, only one side (source or destination) will be active, but once you finish the move, if you decide not to delete the source, you have in fact a new deployment of the solution.

This is a fairly detailed tutorial, so check it out.

Comments closed

Problems with Power BI’s Publish to Web

Adam Saxton explains when you might not want to use the Publish to Web option in Power BI:

Some don’t realize that Power BI Publish to Web is not secure. Adam shows you that this is the case. It’s a bit scary, and there are other options for secure embedding.

For demos and other resources which are supposed to be accessible to everybody, Publish to Web works great. But if you’re deploying company dashboards, not so much.

Comments closed

Stellar Repair: A Review

Grant Fritchey reviews a product which attempts to repair corrupted SQL Server databases:

Let’s start with the most important piece of information you need: it works.

The software itself is really simple to use and just does what you need, repairs your corrupted SQL Server instance. On that alone, I can recommend the tool.

However, there are a few gotchas I ran into along the way. Mostly, little stuff. It’s the kind of thing that a little polish in the UI and some cleanup of the language could help with. Don’t get me wrong, I’m happy with this software. It worked. It’s just how it works that we should talk about.

Click through for Grant’s full review.
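
As a side note rather than part of Grant’s review: before handing a database to any repair tool, it’s worth confirming and scoping the corruption with DBCC CHECKDB (the database name below is a placeholder), since its output tells you whether restoring from backup or a built-in repair is still an option.

    -- Report corruption without attempting any repair.
    DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;

    -- Built-in last resort (can lose data); needs single-user mode.
    -- ALTER DATABASE YourDatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    -- DBCC CHECKDB (N'YourDatabase', REPAIR_ALLOW_DATA_LOSS);
    -- ALTER DATABASE YourDatabase SET MULTI_USER;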

Comments closed

Moving a Virtual Machine with the Azure Resource Mover

Kathi Kellenberger tries out the Azure Resource Mover:

Another task I may want to perform is to move a VM to another region. I found a set of steps involving Azure Recovery Services Vault that seems a bit complex. Fortunately, I recently heard about a new, much easier way to move VMs and other resources called Azure Resource Mover (in preview). It was announced today.

Read on to see how this works. I like the idea a lot, especially for those times when you accidentally create resources in different regions and only realize it when it’s time to tie everything together.

Comments closed

Azure Synapse Analytics Sample Datasets and Scripts

James Serra shows us where to find samples for Azure Synapse Analytics:

Datasets: A bunch of datasets that, when added, will show up under Data -> Linked -> Azure Blob Storage. You can then choose an action (via “…” next to any of the containers in the dataset) and choose New SQL script -> Select TOP 100 rows to examine the data, as well as choose “New notebook” to load the data into a Spark dataframe. Any dataset you add is a linked service to files in a blob storage container using SAS authentication. You can also create an external table in a SQL on-demand pool or SQL provisioned pool to each dataset via an action (via “…” next to “External tables” under the database, then New SQL script -> New external table) and then query it or insert the data into a SQL provisioned database.

Click through to learn more, as well as a few other things you can do with Synapse Analytics.
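
For a sense of what the generated “New SQL script” looks like, querying one of those linked datasets from a SQL on-demand (serverless) pool is roughly the following, with a placeholder storage account and path; the “New external table” action wraps the same storage location in CREATE EXTERNAL DATA SOURCE / FILE FORMAT / TABLE statements instead.

    -- Ad-hoc exploration directly over Parquet files in the linked storage container.
    SELECT TOP 100 *
    FROM OPENROWSET(
             BULK 'https://yourstorageaccount.blob.core.windows.net/sample-data/nyctaxi/*.parquet',
             FORMAT = 'PARQUET'
         ) AS nyc;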

Comments closed

Automatic Soft NUMA in SQL Server

Ameena Lalani walks us through NUMA and automatic soft NUMA in SQL Server:

Modern processors have multiple cores per socket. Each socket is represented, usually, as a single NUMA node. The SQL Server database engine partitions various internal structures and partitions service threads per NUMA node. With processors containing 10 or more cores per socket, using software NUMA to split hardware NUMA nodes generally increases scalability and performance. Prior to SQL Server 2014 (12.x) SP2, software-based NUMA (soft-NUMA) required you to edit the registry to add a node configuration affinity mask, and was configured at the host level, rather than per instance. Starting with SQL Server 2014 (12.x) SP2 and SQL Server 2016 (13.x), soft-NUMA is configured automatically at the database-instance level when the SQL Server Database Engine service starts. Please read this documentation and this documentation for more understanding.

Read on for more info.
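
A couple of quick queries against standard DMVs (not from Ameena’s post) show how an instance actually came up, and the setting itself can be toggled at the server level, taking effect after a restart:

    -- Is soft-NUMA off, set automatically, or configured manually?
    SELECT softnuma_configuration, softnuma_configuration_desc,
           socket_count, cores_per_socket, numa_node_count
    FROM sys.dm_os_sys_info;

    -- How schedulers ended up distributed across the (soft-)NUMA nodes.
    SELECT node_id, memory_node_id, online_scheduler_count, node_state_desc
    FROM sys.dm_os_nodes;

    -- Turn automatic soft-NUMA off if you need to (takes effect after a restart).
    -- ALTER SERVER CONFIGURATION SET SOFTNUMA OFF;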

Comments closed

Operational Database Security in Cloudera Data Platform

Liliana Kadar, et al, walk us through some of the database security and auditing features in Cloudera Data Platform:

Database object-level security is available through the centralized authorization framework of Apache Ranger. 

Both fine-grained access control of database objects and access to metadata are provided. Protected database objects include: database, table, column, view, and User Defined Functions (UDFs).

Fine-grained access control for special administrative operations that can be performed on OpDBMS is also supported. 

Click through for the full story.
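
Most of this is defined as policies in the Ranger admin UI or its REST API, but with the Ranger plugin enabled for Hive, plain GRANT and REVOKE statements are captured as Ranger policies too. A small, hedged example with invented database, table, and user names:

    USE sales_db;

    -- Table-level read access for an analyst, recorded as a Ranger policy.
    GRANT SELECT ON TABLE orders TO USER analyst1;

    -- Take it away again when it is no longer needed.
    REVOKE SELECT ON TABLE orders FROM USER analyst1;

    -- Column masking and row-level filters are defined as Ranger policies
    -- (UI or REST API) rather than through SQL statements.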

Comments closed