Press "Enter" to skip to content

Author: Kevin Feasel

Analyzing Windows Documents with SQL Server Semantic Search

Haroon Ashraf begins a series on semantic search over files:

The article examines storing unstructured data using native windows directory structure and managing it through the SQL Server database. Subsequently, we demonstrate how to perform comparative analysis with the help of Semantic Search in order to get valuable information out of the data. Additionally, you are going to get some hands-on experience of storing and linking documents on Windows File System and using Semantic Search.

It is of great importance for both database beginners and professionals to know how to apply file storage strategies such as File Table to searching technologies like Semantic Search for unstructured data analysis.

The first part of the series is all about setting up FileTable.

Comments closed

Discovering Provider-Specific Commands in Powershell

Jeffrey Hicks performs some Powershell research:

I’ve been diving into PowerShell help lately while preparing my next Pluralsight course. One of the sad things I have discovered is the loss of provider-aware help. As you may know, some commands have parameters that only exist when using a specific PSDrive.  For example, the -File parameter for Get-ChildItem only works in the file system. If you change to the CERT: PSDrive that parameter isn’t available. But you do get a new parameter like -CodeSigningCert. It used to be that you could use help to discover these differences. Get-Help has a -Path parameter to assist you.

Sadly, at least for me as a PowerShell teacher, this no longer works, especially if you have updated help recently in Windows PowerShell. Microsoft is working on some updates to how Help is delivered but for now we might need alternatives.

Read on for alternatives.

Comments closed

Python and Power BI Desktop

David Eldersveld continues a series on Python as an external tool in Power BI. Part 2 is about defining a proper JSON formatted file:

To see a new tool in the Power BI Desktop ribbon, you need to define a JSON file and place it in a specific folder on your workstation. The featured external tools Tabular Editor, DAX Studio, and ALM Toolkit have installers that take care of this step. Since you do not have a dedicated installer to place this file in the required directory, you need to manually define your own.

As long as the “enhanced metadata format” for the data model is enabled, and the JSON in your file is accurate, you should see your new tool in your ribbon after you re-open Power BI Desktop.

Part 3 shows off virtual environments in Python as well as how to connect to the Tabular Object Model:

The Tabular Object Model (TOM) library for .NET opens Power BI’s data model to external tools. Two pieces are required to allow Python to interface with .NET:
1) Pythonnet package for .NET CLR (pip install pythonnet)
2) Python-SSAS module (ssas_api.py placed in the same folder as the main script you’d like to run)

The python-ssas (ssas_api.py) Python module that facilitates the TOM connection is all the work of Josh Dimarsky–originally for querying and processing Analysis Services. I simply repurposed it for use with Power BI Desktop or the XMLA endpoint in Power BI Premium and extended it with some relevant examples. Everything relies on Josh’s Python module, which has functions to connect to TOM, run DAX queries, etc.

This does look pretty handy.

Comments closed

Marking a Table in DAX as a Date Table

Alberto Ferrari explains why you have to mark a date table as such in Power BI:

Even though this probably does not surprise you, in fact it should. The measure – as it is written – should not work. The reason why it works is because the relationship between Sales and Date is using the Sales[Order Date] column. Sales[Order Date] has a Date data type. If we use an integer to relate the two tables instead of using a Date column, the formula stops working. We now change the relationship using the integer column Sales[OrderDateKey] instead of the Sales[Order Date].

Read the whole thing.

Comments closed

Assuming a Role with AWS Powershell Tools

Sheldon Hull solves a problem:

I’ve had some issues in the past working with AWS.Tools PowerShell SDK and correctly assuming credentials.

By default, most of the time it was easier to use a dedicated IAM credential setup for the purpose.

However, as I’ve wanted to run some scripts across multiple accounts, the need to simplify by assuming a role has been more important.

It’s also a better practice than having to manage multiple key rotations in all accounts.

Read on to see how far Sheldon has been able to take this, but also how much more work is left to do.

Comments closed

Data Privacy in Confluent Platform

David Millman shows off the Privitar Kafka Connector:

The initial message structure, in the left column above, is a simple JSON document with five fields. The middle column contains the list of rules that must be applied, defining the policy. On the right is a sample output message generated as a result of the policy being applied to the initial message.

In the Privitar Policy Manager, a user maps the individual fields to the appropriate rule, as shown in the screenshot below. A rule is applied to each of the fields and the schema is read as a single table structure, named testfile. These rules can be applied for every instance of the schema.

Read on for more.

Comments closed

SQL Server Management Studio 18.6 Now GA

Drew Skwiers-Koballa announces SQL Server Management Studio version 18.6:

The 18.6 release is the second major release of SSMS in 2020 and packs several high impact changes, including a fix for crashes in database diagrams. Key fixes include:  

– Save to XEL file error fix. 
– Bacpac file import error fix. 
– Database diagrams crash fix. 
– Addressed sources of three common application hangs. 

Lots of bugfixes in here, but there are a few new things as well.

Comments closed

Backing Up PostgreSQL

Valerie Parham-Thompson gives us a few methods for backing up PostgreSQL databases:

There are at least four ways to back up a Postgres database: SQL dump, filesystem snapshots, continuous archiving, and third-party tools.

For each, notes on the recovery point objective (or RPO, which is a measure of how up-to-date your application and business needs will require the data to be) and recovery time objective (or RTO, which is a measure of how quickly the restore needs to be completed after an outage begins) are provided. You should weigh these indicators against your business requirements for the data in question.

Read on to learn more about each of these processes.

Comments closed

Downgrading SQL Server on Linux

Sreekanth Bandarla wants to roll back cumulative updates on Linux:

Of course you can get this info from SQL or several other ways in Linux. Okay, now we know we got SQL Server 2019 CU5 running on this server to work with. Let’s just assume CU5 broke something in my database and I want to go back to CU4. How do I do that?

Click through to see how to do this for Red Hat (or any system using yum). Debian-based don’t have a downgrade option, but you can use apt-get install mssql-server=[version number] instead.

Comments closed

Columnstore Index Maintenance

Ed Pollack continues a series on columnstore indexes:

Like with standard B-tree indexes, a columnstore index may be the target of a rebuild or reorganize operation. The similarities end here, as the function of each is significantly different and worth considering carefully prior to using either.

There are two challenges addressed by columnstore index maintenance:

1. Residual open rowgroups or open deltastores after write operations complete.
2. An abundance of undersized rowgroups that accumulate over time

Read on for the full story.

Comments closed