
Author: Kevin Feasel

CIS Security Checks with dbachecks

Tracy Boggiano shows how to perform a security check based on CIS requirements:

Well, back at the end of 2019, I finished writing most of the checks related to the CIS (Center for Internet Security) requirements. I have yet to write a blog post on how to use them. So, here is how to go about using them; it’s mostly code, so it should be pretty simple to implement. I’ve mentioned this several times over the past year in presenting on dbatools.

So first you need to have dbachecks. So, let’s start with the basics just in case you haven’t heard of dbachecks. dbachecks is a PowerShell module that checks the configuration of your SQL Server against various tests that have been predefined. By default, it exports the data to JSON, and we will be opening Power BI to display the data because, well, that is pretty. So, go download a copy of Power BI from the Microsoft website and let’s install dbachecks first.

Read on to see what you need, the steps for this process, and what the results look like.


Adding Methods to a PSCustomObject

Robert Cain builds on a prior post:

In the previous installment of this series, I covered the various ways to create objects using the PSCustomObject. We saw how to create it using the New-Object cmdlet, then how to add your custom properties to it using the Add-Member cmdlet.

In this post we’ll learn how to add our own methods to our objects using script blocks. Before we go on, just a quick reminder on vocabulary.

Click through for that reminder, as well as implementation details.


Choosing between String Data Types in SQL

Greg Larsen compares CHAR, VARCHAR, and VARCHAR(MAX):

In every database, there are different kinds of data that need to be stored. Some data is strictly numeric, while other data consists of only letters or a combination of letters, numbers, and even special symbols. Whether it is just stored in memory or on disk, each piece of data requires a data type. Picking the correct data type depends on the characteristics of the data being stored. This article explains the differences between CHAR, VARCHAR, and VARCHAR(MAX).

Click through for Greg’s explanation. My official rule of thumb is as follows:

  • If you have a fixed-length code which you display to customers, use NCHAR.
  • If you have a fixed-length code which you only use internally and you know that the code will never include characters outside of your SQL Server installation’s code page and you know that the code page will never change…probably still use NCHAR, though if you twist my arm enough I’d say fine, CHAR.
  • Otherwise, use NVARCHAR.

Three decades ago, the choice was a lot trickier given performance differences between the two. Today? Unless you’re hunting for microseconds I don’t think you’ll see a practical difference. And if you are hunting for microseconds, you probably want more than rules of thumb.
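To make the code page concern in that rule of thumb concrete, here is a minimal Python sketch. It is not SQL Server code: the helper names `as_char` and `as_nchar` are hypothetical, `cp1252` stands in for whatever code page an installation uses, and replacing unmappable characters with `?` is an assumption about typical behavior rather than a guarantee. It only imitates fixed-length padding and the loss of characters that fall outside a single-byte code page.

```python
def as_char(value: str, n: int, code_page: str = "cp1252") -> bytes:
    """Imitate a CHAR(n) column: one byte per character in a single-byte
    code page, padded with spaces to exactly n bytes.
    Characters outside the code page are replaced with '?' (assumption)."""
    encoded = value.encode(code_page, errors="replace")[:n]
    return encoded.ljust(n, b" ")


def as_nchar(value: str, n: int) -> bytes:
    """Imitate an NCHAR(n) column: UTF-16 little-endian, padded with
    spaces to n characters (2 bytes each for the characters used here)."""
    return value[:n].ljust(n).encode("utf-16-le")


code = "\u03a91"  # Greek capital omega followed by the digit 1

char_bytes = as_char(code, 4)
nchar_bytes = as_nchar(code, 4)

print(char_bytes)                       # the omega does not survive cp1252
print(nchar_bytes.decode("utf-16-le"))  # the omega round-trips intact
print(len(char_bytes), len(nchar_bytes))
```

The fixed-length Unicode version costs twice the bytes in this toy model, but it keeps the character that the single-byte version silently loses, which is the trade the rule of thumb is making.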


Understanding the Transaction Log

Paul Randal has a new series:

With this post, I’m starting an occasional series on the transaction log and how it works and should be managed, and I’ll touch on all the problems above over its course. In this post, I’ll explain what logging is and why it’s required.

Basic Terminology Around Logging

When I’m talking about any mechanism in SQL Server, I find there’s a chicken-and-egg problem where I need to use a word or phrase before I’ve explained it. To avoid that problem in this series, I’m going to start by explaining some terminology that needs to be used when discussing logging, and I’ll expand on many of these terms as the series progresses.

This post starts off with some of the basics and it’s always good to get the occasional refresher on the basics.


Testing Change Data Capture

Jeff Iannucci needs to test Change Data Capture:

I’m not sure how many of you use Change Data Capture (CDC) on your instances, but I’ve had to support it for a while now and I thought I’d share a little script I use fairly frequently to help troubleshoot the capture and cleanup, as well as some advice for resolving issues.

Click through for the script, as well as some additional notes on CDC.


Bucketing Data in Hive

Chitra Sapkal explains why bucketing in Hive can be so powerful:

When a column has a high cardinality, we can’t perform partitioning on it. A very high number of partitions will generate too many Hadoop files, which would increase the load on the node. That’s because the node will have to keep the metadata of every partition, and that would affect the performance of that node.

In simple words, you can use bucketing if you need to run queries on columns with so many distinct values that creating partitions becomes impractical.

Click through to see how bucketing works and examples of how you can use it to make queries faster.
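The idea behind bucketing can be sketched in a few lines of Python: hash the column value and take it modulo a fixed bucket count, so the number of files stays constant no matter how many distinct values the column has. This is an illustration only. Hive uses its own hash function, so `zlib.crc32` is a stand-in, and the names `NUM_BUCKETS`, `bucket_for`, `bucketize`, and the `user_id` column are all hypothetical.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count, fixed when the table is defined


def bucket_for(value: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a column value to a bucket number with a deterministic hash.
    crc32 is a stand-in here, not the hash Hive actually uses."""
    return zlib.crc32(value.encode("utf-8")) % num_buckets


def bucketize(rows, key, num_buckets=NUM_BUCKETS):
    """Group rows into buckets based on the hash of one column."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_for(str(row[key]), num_buckets)].append(row)
    return buckets


# A high-cardinality column: 1,000 distinct user ids.
rows = [{"user_id": f"user-{i}", "amount": i} for i in range(1000)]
buckets = bucketize(rows, "user_id")

distinct_values = len({row["user_id"] for row in rows})
print(f"partitioning would create {distinct_values} partitions")
print(f"bucketing creates at most {NUM_BUCKETS} buckets: "
      f"{ {b: len(r) for b, r in sorted(buckets.items())} }")
```

Because the same value always hashes to the same bucket, a lookup on one `user_id` only needs to read one bucket instead of scanning everything.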


Graph Analysis with NetworkX

Tori Tompkins introduces us to a Python package:

NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex graphs. It’s a really cool package that contains heaps of graph algorithms for all different uses. In this tutorial, I will cover how to create a graph from an edge list and different ways we can query it.

Unsure what a graph is exactly? Check out my Data Science Moments video which introduces graphs and their uses in 5 minutes:

Click through for that video, as well as a way to load, process, and display graph data.
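As a small taste of what building a graph from an edge list looks like, here is a minimal sketch. The edge list is made up for illustration and is not from Tori’s post; the calls used (`Graph`, `add_edges_from`, `neighbors`, `degree`, `shortest_path_length`) are standard NetworkX functions.

```python
import networkx as nx

# Hypothetical edge list: each tuple is an undirected connection.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D"), ("D", "E")]

# Build an undirected graph; nodes are created implicitly from the edges.
G = nx.Graph()
G.add_edges_from(edges)

# A few basic queries.
node_count = G.number_of_nodes()
edge_count = G.number_of_edges()
neighbors_of_d = sorted(G.neighbors("D"))
degrees = dict(G.degree())
hops_b_to_e = nx.shortest_path_length(G, "B", "E")

print(node_count, edge_count)
print(neighbors_of_d)
print(degrees)
print(hops_b_to_e)
```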


The DIFFERENCE() and SOUNDEX() Functions

Hadi Fadlallah looks at two methods of string distance:

Soundex is a phonetic algorithm developed by Robert C. Russell and Margaret King Odell in the early 1900s. This algorithm is used to index names as they are pronounced in English. The main goal of such an algorithm is to encode homophones to the same representation to be matched even if there are some slight spelling differences. As an example, consider the names “Smith” and “Smyth”, or “Mohamad” and “Mouhammad”. Soundex mainly encodes consonants and only encodes a vowel if it is the first letter of the name.

Being one of the most popular phonetic algorithms, Soundex was implemented in multiple database engines such as Oracle, SQL Server, MySQL, SQLite, and PostgreSQL.

These two methods are not perfect and they do really limit you to one word (or small word grouping), but they are useful.
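For a feel of how the encoding works, here is a minimal Python sketch of the commonly described American Soundex rules. It is not the SQL Server implementation, and database engines can differ in edge cases (for example, in how they treat H and W), so treat the function `soundex` below as an illustration rather than a drop-in equivalent of `SOUNDEX()`.

```python
# Consonant groups and their Soundex digits.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a 4-character code: the first letter plus up to three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")  # code of the previous coded letter
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code:
            # Adjacent letters with the same code are encoded only once.
            if code != prev:
                result += code
            prev = code
        elif ch not in "HW":
            # Vowels (and Y) are not encoded, but they separate repeats.
            prev = ""
        # H and W are skipped without separating repeats.
        if len(result) == 4:
            break
    return result.ljust(4, "0")  # pad short codes with zeros


for a, b in (("Smith", "Smyth"), ("Mohamad", "Mouhammad")):
    print(a, soundex(a), b, soundex(b))
```

Both pairs from the quoted passage come out with matching codes in this sketch, which is exactly the "same representation despite spelling differences" behavior described above.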


Bidirectional Transactional Replication and Server Names

Mousa Janini points out a requirement of bidirectional transactional replication:

The steps to create bi-directional replication are simple, and similar to the steps for configuring transactional replication, with an extra step to enable the @loopback_detection parameter of sp_addsubscription to ensure that changes are only sent to the Subscriber and do not result in the change being sent back to the Publisher.

The most common issue with bi-directional replication is when loopback detection does not work as expected, which results in data conflicts and primary key violations.

Read on to see what causes this problem and what you can do to solve it.
