Kevin Feasel – Page 615

Apache Flink and Delta Lake

Published 2022-02-11 by Kevin Feasel

Max Fisher and Dylan Gessner use Flink to load data into Delta Lake:

As with all parts of our platform, we are constantly raising the bar and adding new features to enhance developers’ abilities to build the applications that will make their Lakehouse a reality. Building real-time applications on Databricks is no exception. Features like asynchronous checkpointing, session windows, and Delta Live Tables allow organizations to build even more powerful, real-time pipelines on Databricks using Delta Lake as the foundation for all the data that flows through the Lakehouse.
However, for organizations that leverage Flink for real-time transformations, it might appear that they are unable to take advantage of some of the great Delta Lake and Databricks features, but that is not the case. In this blog we will explore how Flink developers can build pipelines to integrate their Flink applications into the broader Lakehouse architecture.

Click through for two methods of doing so.

Comments closed

Formatting Options in Power BI

Published 2022-02-11 by Kevin Feasel

Allison Kennedy shares a few formatting preferences for Power BI:

Today’s post is inspired by @PlentyL in the Power BI Community. I find myself repeating many formatting options across Power BI reports, so thought I’d compile some of my ‘defaults’ here.

Click through for guidance on column charts displaying time series data, as well as slicers.

Comments closed

Troubleshooting Networking Issues with SNITrace

Published 2022-02-11 by Kevin Feasel

Bob Dorr digs into SNITrace:

For my specific issue I was attempting to debug why the SQLDriverConnectW errored with TCP 10054 because it was failing the pre-login handshake. For that I first needed to understand and capture the flow of the handshake activities and this is where SNITrace is helpful.

Read on to see what it does and how it works, as well as some nice Wireshark screen shots.

Comments closed

Incremental Refresh in Power BI Desktop

Published 2022-02-11 by Kevin Feasel

Soheil Bakhshi starts off a series on incremental refresh in Power BI:

Incremental refresh, or in short, IR, refers to loading the data incrementally, which has been around in the world of ETL for data warehousing for a long time. Let us discuss incremental refresh (or incremental data loading) in a simple language to better understand how it works.

Read on for the explanation as well as how you can implement it in Power BI Desktop. There are a lot of instructions here but they include publication and testing as well as development.
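
To make the idea concrete, here is a minimal Python sketch of incremental loading. It is not Soheil's implementation: the function name and the tuple layout are hypothetical, and the half-open window simply mirrors the RangeStart and RangeEnd parameters that Power BI uses to filter the query. Rows inside the window are reloaded from the source and everything older is left alone.

```python
from datetime import datetime

def incremental_refresh(existing, source, range_start, range_end):
    """Replace only the rows inside [range_start, range_end); keep the rest.

    `existing` and `source` are lists of (timestamp, value) tuples.
    All names here are hypothetical and for illustration only.
    """
    # Keep previously loaded rows that fall outside the refresh window.
    kept = [row for row in existing if not (range_start <= row[0] < range_end)]
    # Reload only the rows inside the window from the source.
    fresh = [row for row in source if range_start <= row[0] < range_end]
    return sorted(kept + fresh)

existing = [(datetime(2022, 1, 1), 10), (datetime(2022, 2, 1), 20)]
source = [
    (datetime(2022, 1, 1), 999),  # changed at the source, but outside the window
    (datetime(2022, 2, 1), 25),
    (datetime(2022, 2, 5), 30),
]
result = incremental_refresh(
    existing, source, datetime(2022, 2, 1), datetime(2022, 3, 1)
)
```

Note that the January row keeps its old value even though the source changed: that is the trade-off of refreshing only a recent window.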

Comments closed

Solutions for Matching Supply with Demand

Published 2022-02-11 by Kevin Feasel

Itzik Ben-Gan continues reviewing solutions to a tricky problem:

Last month I covered a solution based on interval intersections, using a classic predicate-based interval intersection test. I’ll refer to that solution as classic intersections. The classic interval intersections approach results in a plan with quadratic scaling (N^2). I demonstrated its poor performance against sample inputs ranging from 100K to 400K rows. It took the solution 931 seconds to complete against the 400K-row input! This month I’ll start by briefly reminding you of last month’s solution and why it scales and performs so badly. I’ll then introduce an approach based on a revision to the interval intersection test. This approach was used by Luca, Kamil, and possibly also Daniel, and it enables a solution with much better performance and scaling. I’ll refer to that solution as revised intersections.

Read on for one class of solution which performed quite well.
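
For a feel of why the classic approach scales quadratically, here is a rough Python sketch, assuming the usual framing of the puzzle in which each demand and supply quantity becomes a running-total interval. This is my illustration rather than Itzik's T-SQL, and the function names are hypothetical. Every demand interval is tested against every supply interval with the classic predicate: two intervals overlap when each starts before the other ends.

```python
from itertools import accumulate

def to_intervals(quantities):
    """Turn quantities into half-open running-total intervals [start, end)."""
    ends = list(accumulate(quantities))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))

def match(demand, supply):
    """Pair demand and supply entries using the classic intersection test.

    Compares every demand interval with every supply interval, hence
    the quadratic number of comparisons.
    """
    supply_intervals = to_intervals(supply)
    pairs = []
    for d_id, (d_start, d_end) in enumerate(to_intervals(demand)):
        for s_id, (s_start, s_end) in enumerate(supply_intervals):
            # Classic predicate: each interval starts before the other ends.
            if d_start < s_end and s_start < d_end:
                traded = min(d_end, s_end) - max(d_start, s_start)
                pairs.append((d_id, s_id, traded))
    return pairs

# Demands of 5 and 3 units matched against supplies of 2 and 6 units.
pairs = match([5, 3], [2, 6])
```

Each tuple is (demand index, supply index, traded quantity). With N rows on each side, the nested loops perform N^2 predicate checks, which is the behavior the revised approach avoids.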

Comments closed

Dynamic SQL No-Go

Published 2022-02-11 by Kevin Feasel

Kenneth Fisher can’t go in dynamic SQL and neither can you:

This is one of those things that when I look back on it seems really obvious. Note: If at the end of this it isn’t overly obvious to you that’s ok too. I do a lot of dynamic SQL and GO is one of my favorite commands.

Read on to understand why. I was going to “One minor clarification…” Kenneth about it being an SSMS command (implying that it’s not available elsewhere) but he successfully parried the attack en passant.
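
The underlying reason is that GO is a batch separator interpreted by client tools such as SSMS and sqlcmd, not a T-SQL statement the engine understands, so it has no meaning inside a dynamic SQL string. As a simplified sketch of what a client does with it (the function name is hypothetical, and this ignores "GO <count>" as well as GO inside comments or string literals), a Python script might split on GO lines and send each batch separately:

```python
import re

# A line containing only GO, in any case, with optional spaces or tabs.
GO_LINE = re.compile(r"^[ \t]*GO[ \t]*$", re.IGNORECASE | re.MULTILINE)

def split_batches(script):
    """Split a script into batches, roughly the way a client tool would.

    Simplification: ignores "GO <count>" and GO inside comments or strings.
    """
    return [batch.strip() for batch in GO_LINE.split(script) if batch.strip()]

script = "CREATE TABLE t (id int);\nGO\nINSERT INTO t VALUES (1);\ngo\n"
batches = split_batches(script)
# Each entry in `batches` would then be executed as its own call to the server.
```

In dynamic SQL, the equivalent is to run each batch in its own EXEC or sp_executesql call rather than embedding GO in the string.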

1 Comment

Database Schema Types

Published 2022-02-11 by Kevin Feasel

Steve Jones explains schema types:

OLTP/Relational
The type of schema that many of us work with is the standard OLTP or relational model. We have lots of transaction tables, most should have a PK, some of which have PKs. The schema expands to meet different needs and can have lots of entities.

It may just be the time of morning but “Galaxy schema” sounds dumb specifically because the Kimball style of star schema implicitly includes what the galaxy schema shows. Dimensions are conformed, which means they apply across facts, which implies that there may be multiple facts in the schema design. This means that galaxy schemas are necessarily star schemas. For the sake of education, we tend to focus on one fact table but a star schema with two fact tables is still a star schema.

Anyhow, that’s my minor rant of the day. It’s not Steve’s fault somebody misunderstood the concept of star schemas and began promulgating this unnecessary term.

Comments closed

Slide Sharing and Story-Telling

Published 2022-02-10 by Kevin Feasel

Elizabeth Ricks answers a question:

Often a graph that makes sense during a live presentation loses meaning when distributed as a PowerPoint later. How can we retain context when transitioning between audiences without having to rework the entire presentation?

Read on for Elizabeth’s answer.

Comments closed

Identifying R Functions and Packages in GitHub Gists

Published 2022-02-10 by Kevin Feasel

Bryan Shalloway looks at gists:

A problem I bumped into was that most of Chelsea’s gists don’t actually have .R or .Rmd extensions so my approach skipped most of her snippets. I wanted to parse my own gists but ran into a related problem that most of my github gist code snippets are saved as .md files.
In this post I…
1. create a function to extract code chunks from simple .md files
2. parse the functions and packages in my code using funspotr.

Click through to see the code in action.
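
Bryan's code is in R, so the following is only a rough Python analogue of the first step, pulling fenced code chunks out of a simple .md file. The function name is hypothetical, and it assumes chunks are delimited by lines starting with three backticks with no nesting.

```python
FENCE = "`" * 3  # built this way to avoid a literal fence inside this example

def extract_chunks(markdown_text):
    """Return the bodies of fenced code chunks found in simple markdown."""
    chunks, current = [], None
    for line in markdown_text.splitlines():
        if line.strip().startswith(FENCE):
            if current is None:
                current = []  # opening fence: start collecting lines
            else:
                chunks.append("\n".join(current))  # closing fence: store chunk
                current = None
        elif current is not None:
            current.append(line)
    return chunks

doc = "\n".join([
    "Some notes.",
    FENCE + "r",
    "library(dplyr)",
    "mtcars %>% count(cyl)",
    FENCE,
    "More notes.",
])
chunks = extract_chunks(doc)
```

The extracted text could then be handed to a parser such as funspotr to list the functions and packages it uses.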

Comments closed

Who You Gonna Call? Upgrade Edition

Published 2022-02-10 by Kevin Feasel

Kenneth Fisher pulls out the company directory:

This month the topic we are blogging about is Upgrade Strategies. Or, how do we look at SQL Server upgrades. In my case I want to talk about the absolute hardest part of any upgrade at my company.
I should point out that I work for a large company with a lot of moving parts. Over the course of my tenure here I’ve helped to support hundreds to thousands of SQL Server instances. And at least for us, the technical part of an upgrade isn’t too bad. Where we almost always run into problems is Who do we contact?

Read on for Kenneth’s thoughts on the topic.

Comments closed

Author: Kevin Feasel