Press "Enter" to skip to content

Author: Kevin Feasel

Fixing Screen Repainting Issues in SSMS

Greg Low has a workaround for an annoying problem:

Once again, I’m seeing lots of customers reporting screen repainting issues in SQL Server Management Studio (SSMS). It mostly seems to affect version 18 but I’ve also seen it in version 17. And it’s most prevalent on Windows 10.

The typical issue is that you click on another open tab, and the contents of the tab don't repaint. You are still seeing the previous tab. If you click into the tab, you start to see bits from both tabs.

Click through to see the fix. I’ve seen this issue pop up, though I don’t remember seeing it on the latest version of SSMS 18…though now that I say that, I’m guaranteed to have the problem hit me today.


Managed Identity with Azure Functions

Taiob Ali shows how you can securely handle the credentials which your Azure Function apps need:

With the announcement of PowerShell support in Azure Functions, it has become easier for data professionals to use functions to manage cloud resources such as Azure SQL Database and Managed Instances. A common challenge when using functions is how to manage the credentials in function code for authenticating to databases. Keeping the credentials secure is an important task. Ideally, the credentials should never appear in the code or in the source control.

Managed Identity can solve this problem, as Azure SQL Database and Managed Instance both support Azure AD authentication. You can read more about Managed Identity here.

In this article, I will show how to set up an Azure Function App to use Managed Identity to authenticate functions against Azure SQL Database.

The example connects to Azure SQL DB, but this is a general-purpose solution.
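
As an illustration of the pattern, here is a minimal Python sketch (the article itself uses PowerShell). It assumes the azure-identity and pyodbc packages and uses placeholder server and database names: the function's managed identity requests a token for Azure SQL and hands it to the ODBC driver, so no credential appears in code or source control.

import struct
import pyodbc
from azure.identity import ManagedIdentityCredential

# Inside the Function App, the platform-managed identity issues the token;
# nothing sensitive is stored in code, config, or source control.
token = ManagedIdentityCredential().get_token("https://database.windows.net/.default")

# Pack the token the way the SQL Server ODBC driver expects it:
# a little-endian length prefix followed by the UTF-16-LE token bytes.
token_bytes = token.token.encode("utf-16-le")
token_struct = struct.pack(f"<I{len(token_bytes)}s", len(token_bytes), token_bytes)

SQL_COPT_SS_ACCESS_TOKEN = 1256  # driver-specific pre-connect attribute

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=mydb;",  # placeholder names
    attrs_before={SQL_COPT_SS_ACCESS_TOKEN: token_struct},
)
print(conn.execute("SELECT SUSER_SNAME();").fetchval())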


Financial Statements in Power BI

Joseph Yeates has started a series on creating financial statements in Power BI:

This post is part 1 in my series on creating financial statements in Power BI! I’m starting with creating an Income Statement. The source data and Power BI file used in the example below can be found here.

I loaded the source data into the Power BI report. It consisted of three tables:

– Fact table: contains the dollar amounts of transactions
– GL table: contains the categorization of transactions
– Calendar table: contains date information for the data model
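
Purely to make the shape of that model concrete, here is a tiny hypothetical sketch of the same star schema in pandas; all of the column names and values are invented, and Joseph builds this in Power BI rather than in code.

import pandas as pd

# Miniature stand-ins for the three tables (invented columns and values).
fact = pd.DataFrame({"date": ["2020-01-15", "2020-01-20"],
                     "account": [4000, 5000],
                     "amount": [1200.0, -450.0]})
gl = pd.DataFrame({"account": [4000, 5000],
                   "category": ["Revenue", "Operating Expenses"]})
calendar = pd.DataFrame({"date": ["2020-01-15", "2020-01-20"],
                         "month": ["2020-01", "2020-01"]})

# The pandas equivalent of the model's relationships: each fact row picks up
# its GL category and calendar attributes, then rolls up into statement lines.
income = (fact.merge(gl, on="account")
              .merge(calendar, on="date")
              .groupby(["month", "category"])["amount"]
              .sum())
print(income)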

This will be interesting to watch, especially because this kind of task is generally handled in a tool like Reporting Services instead of Power BI.


Deploying a Big Data Cluster to a Multi-Node kubeadm Cluster

Mohammad Darab shows how we can deploy a SQL Server Big Data Cluster on a multi-node kubeadm cluster:

There are a few assumptions before we get started:

1. You have at least 3 virtual machines running with the minimum hardware requirements
2. All your virtual machines are running Ubuntu Server 16.04 and have OpenSSH installed
3. All the virtual machines have static IPs and are on the same subnet
4. All the virtual machines are updated and have been rebooted

Mohammad shows how to set up the cluster, configure Kubernetes, and then install Big Data Clusters. Definitely worth the read if you’re interested in building a Big Data Cluster on-premises.
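
Not part of Mohammad's walkthrough, but as a quick sanity check before the Big Data Cluster installation step, something like the following sketch (using the official Python kubernetes client against the kubeconfig that kubeadm generates) confirms that every node has joined and is Ready:

from kubernetes import client, config

# Load the kubeconfig that kubeadm produced on the master node.
config.load_kube_config()

# A three-node cluster should show three entries, all Ready=True.
for node in client.CoreV1Api().list_node().items:
    ready = next(
        (c.status for c in node.status.conditions if c.type == "Ready"),
        "Unknown",
    )
    print(f"{node.metadata.name}: Ready={ready}")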


Spark is Not ACID Compliant

Kundan Kumarr explains how it is that Apache Spark is not ACID compliant:

Atomicity states that the Spark DataFrame writer should either write the full data or nothing at all to the data source. Consistency, on the other hand, ensures that the data is always in a valid state.

As the Spark documentation makes clear, when saving a DataFrame to a data source in overwrite mode, the existing data will be deleted before the new data is written. In the case of a job failure, the original data will be lost or corrupted and no new data will be written.

Click through for an explanation of these two along with a demo, and then an explanation of how Spark Datasets don’t follow the Isolation or Durability properties either. I don’t think any of this is earth-shattering to people, but it is a good reminder that Spark doesn’t fit all use cases.
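
You can see the atomicity gap for yourself with a small PySpark sketch like this one (the path and the artificial failure are mine, not from the post): the overwrite deletes the existing files first, the job then fails, and the original data is gone.

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import LongType

spark = SparkSession.builder.master("local[2]").appName("acid-demo").getOrCreate()
path = "/tmp/acid_demo"

# Write the "original" data.
spark.range(5).write.mode("overwrite").parquet(path)

@udf(LongType())
def boom(x):
    raise ValueError("simulated task failure mid-write")

try:
    # Overwrite mode deletes the existing files before this write fails.
    spark.range(5).withColumn("v", boom("id")).write.mode("overwrite").parquet(path)
except Exception as e:
    print("job failed:", type(e).__name__)

# The original five rows are no longer there; depending on how far the failed
# job got, this read may error out or return partial data.
try:
    print(spark.read.parquet(path).count())
except Exception as e:
    print("original data lost:", type(e).__name__)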


Updates in Confluent Platform 5.4

Tim Berglund takes us through what has changed in Confluent Platform 5.4:

Role-Based Access Control (RBAC)

Back in July, we announced the preview for RBAC as part of the Confluent Platform 5.3 release. After gathering feedback and learning from everyone who tried it out, we are now pleased to announce the availability of RBAC in Confluent Platform 5.4. You can now make use of this feature in production environments with Confluent’s full support.

RBAC offers a centralized security implementation for enabling access to resources across the entire Confluent Platform with just the right level of granularity. You can control the permissions you grant to users and groups on specific platform resources, starting at the cluster level and moving all the way down to individual topics, consumer groups, or even individual connectors. You do this by assigning users or groups to roles. This gets you out of the game of managing the individual permissions of a huge number of principals—a real problem for large enterprise deployments.

RBAC delivers comprehensive authorization enforced via all user interfaces (Confluent Control Center UI, CLI, and APIs), and across all Confluent Platform components (Control Center, Schema Registry, REST Proxy, MQTT Proxy, Kafka Connect, and KSQL). Given the distributed architecture not only of Apache Kafka but also of other platform components like Connect and KSQL, having a single framework to centrally manage and enforce security authorizations across all the components is, in a word, essential for managing security at scale.

Click through for several more features and where you can try it out, either on-premises or in a major cloud host.
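
To make the role-binding indirection concrete, here is a purely hypothetical toy model in Python; none of these class or role names come from Confluent's actual API, which you manage through its own CLI and interfaces. The point is only that permissions hang off a handful of roles rather than off every individual principal.

from dataclasses import dataclass

# Hypothetical toy model of RBAC, not Confluent's API: a binding ties a
# principal to a role on one scoped resource.
@dataclass(frozen=True)
class RoleBinding:
    principal: str  # e.g. "User:alice" or "Group:analysts"
    role: str       # e.g. "DeveloperRead"
    resource: str   # e.g. "Topic:orders" or "Cluster:kafka-prod"

ROLE_PERMISSIONS = {
    "DeveloperRead": {"read"},
    "DeveloperWrite": {"read", "write"},
}

bindings = [
    RoleBinding("Group:analysts", "DeveloperRead", "Topic:orders"),
    RoleBinding("User:etl-svc", "DeveloperWrite", "Topic:orders"),
]

def allowed(identities, action, resource):
    # A principal is allowed if any of its identities (user or group) holds
    # a role on the resource that grants the action.
    return any(
        b.resource == resource and action in ROLE_PERMISSIONS.get(b.role, set())
        for b in bindings
        if b.principal in identities
    )

print(allowed({"User:alice", "Group:analysts"}, "read", "Topic:orders"))   # True
print(allowed({"User:alice", "Group:analysts"}, "write", "Topic:orders"))  # False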


Explaining Black Box Models with LIME

Holger von Jouanne-Diedrich takes us through the intuition of LIME:

There is a hot new area of research devoted to making black-box models interpretable, called Explainable Artificial Intelligence (XAI). If you want to gain some intuition about one such approach, called LIME, read on!

Before we dive right in, it is important to point out when and why you would need interpretability of an AI. While it might be a desirable goal in itself, it is not necessary in many fields, at least not for users of an AI: with text translation, character recognition, and speech recognition, it is not that important why these systems do what they do but simply that they work.

In other areas, like medical applications (determining whether tissue is malignant), financial applications (granting a loan to a customer) or applications in the criminal-justice system (gauging the risk of recidivism) it is of the utmost importance (and sometimes even required by law) to know why the machine arrived at its conclusions.

One approach to making AI models explainable is called LIME, for Local Interpretable Model-Agnostic Explanations. There is already a lot in this name!

LIME is not trivial to use and it can be very slow, but it is a great way to visualize models.
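
Holger's post works in R; as a minimal counterpart, the Python lime package follows the same recipe. The sketch below (assuming the lime and scikit-learn packages) perturbs one observation, queries the black-box model, and fits a local linear surrogate whose weights become the explanation.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

iris = load_iris()
model = RandomForestClassifier(random_state=42).fit(iris.data, iris.target)

explainer = LimeTabularExplainer(
    iris.data,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    discretize_continuous=True,
)

# Explain a single prediction: LIME samples around the row, scores the
# samples with the black-box model, and reports the surrogate's weights.
exp = explainer.explain_instance(iris.data[0], model.predict_proba, num_features=4)
print(exp.as_list())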


Improving Join Performance on ADF Data Flows

Mark Kromer has a few tips on improving ADF data flow join performance:

When you include literal values in your join conditions, Spark may see that as a requirement to perform a full cartesian product first, then filter out the joined values. But you can avoid this Spark-induced cartesian product and improve the performance of your joins and data flows if you (1) have column values from both sides of your join condition and (2) avoid using literal conditions to represent the results of one side of your join condition.

In other words, avoid this for your join condition: source1@movieId == '1'. Instead, implement that with a dummy derived column.

There are several good tips in this post.
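
Because ADF data flows execute on Spark, the same idea is easy to sketch in PySpark (my own toy tables, not Mark's example): a literal on one side of the join condition pushes Spark toward a cartesian product, while a dummy derived column turns it back into an ordinary column-to-column join.

from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

spark = SparkSession.builder.master("local[2]").getOrCreate()
movies = spark.createDataFrame([(1, "Alien"), (2, "Aliens")], ["movieId", "title"])
ratings = spark.createDataFrame([(1, 5), (2, 4)], ["movieId", "rating"])

# Anti-pattern: the condition only references one side of the join, so Spark
# plans a cartesian product and filters afterwards.
# bad = ratings.join(movies, movies["movieId"] == lit(1))

# Instead, push the literal into a dummy derived column so the condition
# compares columns from both sides and Spark can plan an ordinary join.
movies_keyed = movies.withColumn("filterId", lit(1))
good = ratings.join(movies_keyed, ratings["movieId"] == movies_keyed["filterId"])
good.show()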
