Ed Elliott continues a series on Spark Connect:
I’m pretty much going to leave the code as-is from the previous post but will move things about a bit and add a SparkSession and a DataFrame class. Also, instead of passing the session id and client around, I’m going to wrap them in the SparkSession so that we can just pass a single object and also use it to construct the DataFrame, so we don’t even have to worry about passing it around.
The first thing is to take all of that gRPC connection stuff and shove it into SparkSession so it is hidden from the callers:
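As a rough illustration of that wrapping idea (this is not Ed's actual code, and Python is used here only for brevity), the sketch below hides a client and a session id inside a SparkSession, which then hands itself to each DataFrame it creates. Everything in it is an assumption made for the example: StubConnectClient is a hypothetical stand-in for a real gRPC stub and only records requests, and the method names execute_plan, sql, and collect, along with the dictionary-shaped plan, are illustrative rather than taken from the post or from Spark Connect's generated code.

```python
import uuid


class StubConnectClient:
    """Hypothetical stand-in for a generated gRPC stub.

    It records each request instead of sending it over the wire, so the
    sketch runs without a Spark Connect server.
    """

    def __init__(self, address):
        self.address = address
        self.requests = []

    def execute_plan(self, session_id, plan):
        # A real client would build a protobuf request and stream responses.
        self.requests.append((session_id, plan))
        return [{"plan": plan}]


class DataFrame:
    """Holds a plan plus the session needed to run it."""

    def __init__(self, session, plan):
        self._session = session
        self._plan = plan

    def collect(self):
        # The DataFrame never touches the client or session id directly.
        return self._session._execute(self._plan)


class SparkSession:
    """Owns the connection details so callers pass around one object."""

    def __init__(self, address, client_factory=StubConnectClient):
        self._client = client_factory(address)
        self._session_id = str(uuid.uuid4())

    def sql(self, query):
        # The session constructs the DataFrame and hands itself over.
        return DataFrame(self, {"sql": query})

    def _execute(self, plan):
        return self._client.execute_plan(self._session_id, plan)


# Illustrative address; no connection is made by the stub.
spark = SparkSession("sc://localhost:15002")
df = spark.sql("SELECT 1")
rows = df.collect()
```

The caller only ever sees SparkSession and DataFrame; swapping the stub for a real client would, under these assumptions, only require passing a different client_factory.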
Read on for the end state that Ed is headed toward and how to get closer to that state.