Ed Elliott continues a series on Spark Connect:
I’m pretty much going to leave the code as-is from the previous post but will move things about a bit and add a SparkSession and a DataFrame class. Also, instead of passing the session id and client around, I’m going to wrap them in the SparkSession so that we can just pass a single object and also use it to construct the DataFrame, so we don’t even have to worry about passing it around.

The first thing is to take all of that gRPC connection stuff and shove it into SparkSession so it is hidden from the callers:
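The excerpt stops before the code, so here is a minimal sketch of the idea in Python. It is not the code from Ed's post. Every name in it (FakeSparkConnectClient, execute_plan, _execute, the string "plan") is a hypothetical stand-in, and no real gRPC channel is opened. It only illustrates the shape described above: the session owns the client and session id, and a DataFrame holds a reference to the session rather than receiving those pieces separately.

```python
import uuid


class FakeSparkConnectClient:
    """Hypothetical stand-in for a generated gRPC stub; it only records calls."""

    def __init__(self, address):
        self.address = address
        self.requests = []  # (session_id, plan) pairs, kept for inspection

    def execute_plan(self, session_id, plan):
        # A real client would send a request over gRPC here (assumption).
        self.requests.append((session_id, plan))
        return f"executed {plan!r} for session {session_id}"


class SparkSession:
    """Owns the connection details so callers never see them."""

    def __init__(self, address):
        self._client = FakeSparkConnectClient(address)
        self._session_id = str(uuid.uuid4())

    def range(self, end):
        # The session constructs the DataFrame and hands itself over.
        return DataFrame(self, f"range(0, {end})")

    def _execute(self, plan):
        # The only place where client and session id are combined.
        return self._client.execute_plan(self._session_id, plan)


class DataFrame:
    """Holds a plan plus the session that can run it."""

    def __init__(self, session, plan):
        self._session = session
        self._plan = plan

    def show(self):
        return self._session._execute(self._plan)


if __name__ == "__main__":
    spark = SparkSession("sc://localhost:15002")  # address is illustrative only
    df = spark.range(5)
    print(df.show())
```

With this layout, calling code passes around a single SparkSession (or just a DataFrame) and never handles the client or session id directly.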
Read on for the end state that Ed is headed toward and how to get closer to that state.