I’m pretty much going to leave the code as-is from the previous post, but I will move things about a bit and add a
DataFrame class. Also, instead of passing the session id and client around, I’m going to wrap them in the
SparkSession so that we can pass a single object and also use it to construct the
DataFrame, so we don’t even have to worry about passing it around.
The first thing is to take all of that gRPC connection stuff and shove it into
SparkSession so it is hidden from the callers.
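To make that shape concrete, here is a minimal Python sketch. It is not the code from the post. It only illustrates the idea of a SparkSession that owns the session id and client and hands itself to each DataFrame it builds. FakeConnectClient, execute_plan, the plan strings, and the address are all hypothetical stand-ins for a real gRPC client and a real Spark Connect plan.

```python
import uuid


class FakeConnectClient:
    """Hypothetical stand-in for a generated gRPC client stub."""

    def __init__(self, address: str):
        self.address = address

    def execute_plan(self, session_id: str, plan: str) -> str:
        # A real client would send the plan over gRPC; this one just echoes it.
        return f"[{session_id}] executed: {plan}"


class SparkSession:
    """Owns the session id and client so callers never handle them directly."""

    def __init__(self, address: str):
        # Connection setup lives here, hidden from the callers.
        self._client = FakeConnectClient(address)
        self._session_id = str(uuid.uuid4())

    def sql(self, query: str) -> "DataFrame":
        # The session passes itself to the DataFrame it constructs.
        return DataFrame(self, f"sql({query!r})")

    def _execute(self, plan: str) -> str:
        # Single place where the session id and client are combined.
        return self._client.execute_plan(self._session_id, plan)


class DataFrame:
    """Holds a plan plus the session needed to run it."""

    def __init__(self, session: SparkSession, plan: str):
        self._session = session
        self._plan = plan

    def show(self) -> str:
        return self._session._execute(self._plan)


# Callers only ever touch the session and the DataFrame.
spark = SparkSession("localhost:15002")  # example address, an assumption
print(spark.sql("SELECT 1").show())
```

The point of the design is that the session id and client appear in exactly one class, so adding new DataFrame operations never means threading extra arguments through every call.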
Read on for the end state that Ed is headed toward and how to get closer to that state.