Press "Enter" to skip to content

Author: Kevin Feasel

Finding Unused SQL Server Tables

Nisarg Upadhyay shares a few techniques to look for whether a table is currently in use:

In this article, I am going to demonstrate how we can identify the unused tables of a given SQL Database. As a database administrator, we must maintain the tables and objects of the SQL Database. In my organization, when we add a new column or change the data type of a column, we export the data of the existing table in the backup table. Often, we forgot to review and maintain those backup tables. So as a solution, we decided to create a SQL Job that populates the list of unused tables and email the list to the DBA Team for review.

The best way to decommission any table is to rename it first, and if it does not break the functionality of the application, we can drop the table and its dependencies. We decided that after a review of the unused table completes; we will rename the tables and later drop them.

Click through for the techniques and a script which checks tables. It does look fairly reasonable, with the concern being if you restart the SQL Server instance or look at a table which is only accessed at a particular time of year.

Comments closed

Using Stopwords and Stoplists with Full-Text Search

Haroon Ashraf walks us through stoplists and stopwords in SQL Server Full-Text Search:

First, let’s clarify the essence of Stopwords and Stoplist. Then we’ll proceed to use them to improve Full-Text Search.

A Stoplist

A stoplist, as the name implies, is a list of stopwords. When associated with Full-Text Search, the Stoplist can filter out meaningless words or terms, thus improving search results.

A Stopword

A stopword is a word that has a minor role in Full-Text Search, despite being important grammatically. Therefore, a stopword is not essential from the Full-Text Search perspective.

According to Microsoft documentation, a stopword can be a word with some meaning in a specific language, or it may be some token with no linguistic value. In both cases, it is useless for the Full-Text Search.

Read on to see examples and how to build your own stoplists.

Comments closed

Finding Loops in an Active Directory Hierarchy

Jana Sattainathan builds a recursive solution to an interesting problem:

Here is the example he gave me:

I am looking for a way to detect recursion loops in a specified AD Group.  Take this example of groups and members. I started with Group1, and found Group2, with group member Group2. Group2 has a group member Group3. Enumerating Group3 finds Group1, and loop!

He also said: Some groups are nested many levels down, and a single top level group can have more than one recursive loop. The script should be able to not only identify Group1>Group3>Group1, but also Group1>Group8>Group5>Group15>Group1.

This is a good use case for recursion.

Comments closed

Public Preview of SQL Server on Azure Arc

Sasha Nosov gives us an update on Azure Arc:

The preview includes the following features:

– Use Azure Portal to register and track the global inventory of your SQL instances across different hosting infrastructures. You can register an individual SQL instance or register a set of servers at scale using the same auto-generated script.

– Use Azure Security Center to produce a comprehensive report of vulnerabilities in SQL servers and get advanced, real time security alerts for threats to SQL servers and the OS.

– Investigate threats in SQL Servers using Azure Sentinel 

– Periodically check the health of the SQL Server configurations and provide comprehensive reports and remediation recommendations using the power of Azure Log analytics.

Click through for more information and documentation.

Comments closed

Multi-Tenant Database Designs

Adrian Hills walks us through four database designs for multi-tenant data storage:

In my previous blog post, I talked about some of the key considerations around designing a multi-tenant system using SQL Server. There are several ways to implement multi-tenancy, and, as is often the case, there is no single “best” way but rather a range of options that each offer different trade-offs. The approach that is right for you depends on your objectives and needs for your specific environment. It’s important to consider which of these approaches best suit your requirements and goals based on the 3 core considerations from Multi-Tenancy with SQL Server, Part 1: security, maintainability (manageability), and scalability.

The following are the 4 approaches I will cover in this blog post:
1. Single database, shared schema
2. Single database, separate schema
3. Database per tenant
4. Multiple databases, multiple tenants per database, shared schema

I’ve worked with options 1, 3, and 4. Read on for Adrian’s thoughts. Ceteris paribus, my preference is 3. That said, I’ve worked in a situation where I migrated from 3 to 1 because there were thousands of customers, none of whom had more than hundreds of megabytes worth of data. 4 provides a good balance in that fashion, where you can bunch up smaller clients and give larger clients their own databases (and sometimes even servers). But if you’re going with options 2, 3, and 4, you probably want a central data warehouse which collects data across all four for internal use.

Comments closed

Choosing Between Hive LLAP and Impala

David Dichmann walks us through the differences between Impala and Hive LLAP:

Written in C++, which is very CPU efficient, with a very fast query planner and metadata caching, Impala is optimized for low latency queries.  Because of this, Impala is an ideal engine for use with a data mart, since people working with data marts are mostly running read-only queries and not large scale writes.  

Impala also has a very efficient run-time execution framework, using code generation, process-to-process communication, massive parallelism, and metadata caching. Because of this, Impala is also great when working with ad-hoc queries, like when exploring by iteratively digging into data.  You’ll want to change your query over and over again, at a moment’s notice, and have very fast response times so you’re not waiting forever for each iteration.  

I was curious what would end up happening with Hive and Impala once old Cloudera (Impala) and Hortonworks (Hive) merged together. Looks like the answer, at least for now, is that they’re both useful in different circumstances. But I do wonder how long that lasts—it’s not impossible to sell using two separate data platform products for different steps in a warehouse implementation, but I could see architects and CIOs wanting to make things simpler and narrow down to one unless there was a particularly smooth bridge between the two.

Comments closed

SQL Server R and Python Language Extensions Now Open Source

The SQL Server team has an announcement:

Previously, we announced a Java extensionToday, we are sharing that we are open sourcing the R and Python language extensions for SQL Server for both Windows and Linux on GitHub.

These extensions are the latest examples using an evolved programming language extensibility architecture which allows integration with a new type of language extension. This new architecture gives customers the freedom to bring their own runtime and execute programs using that runtime in SQL Server, while leveraging the existing security and governance that the SQL Server programming language extensibility architecture provides.

Very interesting.

Comments closed

Diving into the Azure Resource Mover

Dennes Torres shows off what the Azure Resource Mover can do:

If you include the need to copy a resource or set of resources, instead of only moving, the list expands a lot.

Azure already offers the resources to do this: ARM templates, automated deployments, Data Sync, Recovery Services Vault, VM replication and so on. The problem is that sometimes, to move a set of objects together, you may need to use many of these services and understand how to use them.

The solution is a new free service, still in preview, called Azure Resource Mover. This service reduces the complexity of moving resources, minimizing the number of decisions needed on how the resources will be moved. More than that, the last step, deleting the source of the move, is optional, as you will see in detail later. You can use this feature, not only to move resources, but also to copy and distribute them across many regions. During the move process, only one side (source or destination) will be active, but once you finish the move, if you decide not to delete the source, you have in fact a new deployment of the solution.

This is a fairly detailed tutorial, so check it out.

Comments closed

Problems with Power BI’s Publish to Web

Adam Saxton explains when you might not want to use the Publish to Web option in Power BI:

Some don’t realize that Power BI Publish to Web is not secure. Adam shows you that this is the case. It’s a bit scary and there are other options to have secure embedding.

For demos and other resources which are supposed to be accessible to everybody, Publish to Web works great. But if you’re deploying company dashboards, not so much.

Comments closed