Kevin Feasel – Page 841

Connecting Confluent and Databricks on Azure

Published 2021-02-08 by Kevin Feasel

Angela Chu, et al, take us through a streaming data ingestion process:

How do you process IoT data, change data capture (CDC) data, or streaming data from sensors, applications, and sources in real time? Apache Kafka^® and Azure Databricks are widely adopted technologies in the industry, but they require specific skills and expertise to run. Leveraging Confluent Cloud and Azure Databricks as fully managed services in Microsoft Azure, you can implement new real-time data pipelines with less effort and without the need to upgrade your datacenter (or set up a new one).
This blog post demonstrates how to configure Azure Databricks to interact with Confluent Cloud so that you can ingest, process, store, make real-time predictions and gain business insights from your data.

Click through for a detailed demonstration.

Comments closed

Secure Cluster Connectivity in Azure Databricks

Published 2021-02-08 by Kevin Feasel

Abhinav Garg and Premal Shah have an announcement:

We’re excited to announce the general availability of Secure Cluster Connectivity (also commonly known as No Public IP) on Azure Databricks. This release applies to Microsoft Azure Public Cloud and Azure Government regions, in both Standard and Premium pricing tiers. Hundreds of our global customers including large financial services, healthcare and retail organizations have already adopted the capability to enable secure and reliable deployments of the Azure Databricks unified data platform. It allows them to securely process company and customer data in private Azure Virtual Networks, thus satisfying a major requirement of their enterprise governance policies.

Read on fore more detail about how this works.

Comments closed

Windows Authentication Across Domains

Published 2021-02-08 by Kevin Feasel

Daniel Hutmacher shows three methods for connecting to a remote SQL Server instance on a different domain:

A jump box is a virtual desktop on the client’s domain that you can connect to using Remote Desktop. You’d obviously have to ask somebody for access to one, and you’d have to set up your development environment from scratch. This may not be a big issue if you’re in SSMS all of the time, but when you need the Power BI Desktop, Excel or even Visual Studio, this setup can take some time (not to mention asking for local admin credentials on the jump box).

A fourth option is to run the executable with runas and /netonly, like:

runas /user:domain\username ssms.exe /netonly

Comments closed

Testing Power BI Report Server Datasources

Published 2021-02-08 by Kevin Feasel

Aaron Nelson has some cmdlets for us:

I finally did it. I created a function I’ve been wanting to be able to use for *years*. Test-RsRestItemDataSource is here.
I can’t tell you how many times I’ve started to work on a report I was told was working, only to find the connection info was invalid. This wastes valuable time, especially when you’ve already made changes to the report.
Other times, I’ve been asked to figure out why a bunch of subscriptions weren’t working, only to find out it was a simple connection issue. I’ve always wanted a simple PowerShell command to check the credentials of a bunch of reports before I touch anything.
Turns out, it wasn’t that hard to build at all. I only wish I had built it years ago.

Click through to see an example of this, as well as two more cmdlets.

Comments closed

Memory Grant Internals

Published 2021-02-08 by Kevin Feasel

Deepthi Goguri has two posts for us. First up, we learn about queries spilling to disk:

During the build time, the memory is allocated based on the cardinality estimates and the estimates are based on the input size and the probe. The buckets are created in the memory and we place the rows in those respective buckets. If the grant is exceeded then some of the buckets will be send to the disk. As some of the buckets are already in memory and some in disk, the initial probe of the data using the inner set. The probe rows will be scanned if the row hash to the in memory bucket the match is done for those rows. If the rows match to the on disk bucket, the probe row will be return to the disk along with the outer side bucket. So, because of this we have more disk writes to tempdb at the probe time. This is a build side spill.

Then, Deepthi wraps up this series with a bit of balance:

If this final part of the series, lets talk about how we balance these memory grants. If lot of your queries are requesting for more memory than they required, then that causes concurrency issues. Memory grants are allocated to a any query even before they are executed based on the cardinality estimates and based on the memory consuming iterators. Once the query memory is allocated to a query, that memory is only released once the query execution completes even if the query actually uses a part of the allocated query. If more memory is requested by the queries than they need, then we are wasting the memory. What happens if the queries only receive less memory grants than they need, then we there is a performance impact on the running queries.

Read on for ways to fix excessive or insufficient memory grant problems.

Comments closed

Running Out of Identity Values

Published 2021-02-08 by Kevin Feasel

Chad Callihan simulates integer overflow:

If we try to add one more record, we’ll get an error:
Msg 8115, Level 16, State 1, Line 14
Arithmetic overflow error converting IDENTITY to data type int.

Read on for two solutions to the problem.

Comments closed

Sunflower Plots in R

Published 2021-02-05 by Kevin Feasel

Kenneth Tay takes a look at a sunflower plot:

A sunflower plot is a type of scatterplot which tries to reduce overplotting. When there are multiple points that have the same (x, y) values, sunflower plots plot just one point there, but has little edges (or “petals”) coming out from the point to indicate how many points are really there.

My first thought on it is that it’s too busy and doesn’t do its job of portraying a mass of data points very well. When you have just a few observations, then yeah, it’s not too bad. But once you have any reasonable amount of density on the plot, it’s better to use jitter and transparency (as Kenneth points out). H/T R-bloggers

Comments closed

Counting Open Lockers in R

Published 2021-02-05 by Kevin Feasel

Holger von Jouanne-Diedrich solves a riddle:

We are standing in front of 100 lockers arranged side by side, all of which are closed. One man has a bunch of keys with all 100 keys and will pass the lockers exactly a hundred times, opening or closing some of them.
On the first pass, he opens all the lockers. On the second pass, the man will go to every other locker and change its state. That means: If it is closed, it will be opened. If it is already open, it will be closed. In this case, he closes lockers 2, 4, 6… 98 and 100, because all doors were open before.
On the third pass, he changes the state of every third locker – that is, 3, 6, 9, … 96, 99. Closed doors are opened, open doors closed. In the fourth pass, every fourth locker is changed, at the fifth every fifth – and so on. At the last, the 100th, the man finally only changes the state of door number 100.
The question is: How many of the 100 compartments are open after the 100th pass?

Click through for one solution in R.

Comments closed

Injecting a Tracer Token into Each Transactional Replication Publication

Published 2021-02-05 by Kevin Feasel

Steve Stedman has a script for us:

When working with SQL Server transactional replication, one of the common checks to see if things look healthy is to insert a tracer token into the replication data stream and see how long it takes to get to the subscribers.
This is a process that can be very tedious if you have multiple publications to check, as you need to go to the replication monitor, click on the publisher, then go to tracer tokens tab and insert a tracer token. Then wait to see the result. If you need to do this more than a couple times, you may be waiting for a while.

As a result, Steve has created a way to simplify this. Click through for that script.

Comments closed

Scrolling Credits in Video Meetings

Published 2021-02-05 by Kevin Feasel

Rob Farley continues a thread:

The basic concept is that once you have a greenscreen application background in place, you can easily show text or drawings or whatever on the screen, superimposed over your webcam feed. I show how to do this in this post: http://blogs.lobsterpot.com.au/2021/01/30/presentation-trickery-online-glassboard-like-lightboard-but-using-just-free-software/. Scott Hanselman also made a video about it at https://youtu.be/-oaikJCR6ec, and used the idea to make scrolling credits, which is a really neat idea that everyone loves.

Click through to see how it works.

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

Author: Kevin Feasel